SDF Chatter
  • Communities
  • Create Post
  • Create Community
  • heart
    Support Lemmy
  • search
    Search
  • Login
  • Sign Up
RSS Bot@lemmy.bestiver.seMB to Lobste.rs@lemmy.bestiver.seEnglish · 5 days ago

SPFresh: Incremental In-Place Update for Billion-Scale Vector Search

arxiv.org

external-link
message-square
0
fedilink
1
external-link

SPFresh: Incremental In-Place Update for Billion-Scale Vector Search

arxiv.org

RSS Bot@lemmy.bestiver.seMB to Lobste.rs@lemmy.bestiver.seEnglish · 5 days ago
message-square
0
fedilink
Approximate Nearest Neighbor Search (ANNS) is now widely used in various applications, ranging from information retrieval, question answering, and recommendation, to search for similar high-dimensional vectors. As the amount of vector data grows continuously, it becomes important to support updates to vector index, the enabling technique that allows for efficient and accurate ANNS on vectors. Because of the curse of high dimensionality, it is often costly to identify the right neighbors of a single new vector, a necessary process for index update. To amortize update costs, existing systems maintain a secondary index to accumulate updates, which are merged by the main index by global rebuilding the entire index periodically. However, this approach has high fluctuations of search latency and accuracy, not even to mention that it requires substantial resources and is extremely time-consuming for rebuilds. We introduce SPFresh, a system that supports in-place vector updates. At the heart of SPFresh is LIRE, a lightweight incremental rebalancing protocol to split vector partitions and reassign vectors in the nearby partitions to adapt to data distribution shift. LIRE achieves low-overhead vector updates by only reassigning vectors at the boundary between partitions, where in a high-quality vector index the amount of such vectors are deemed small. With LIRE, SPFresh provides superior query latency and accuracy to solutions based on global rebuild, with only 1% of DRAM and less than 10% cores needed at the peak compared to the state-of-the-art, in a billion scale vector index with 1% of daily vector update rate.

Comments

alert-triangle
You must log in or register to comment.

Lobste.rs@lemmy.bestiver.se

lobsters@lemmy.bestiver.se

Subscribe from Remote Instance

You are not logged in. However you can subscribe from another Fediverse account, for example Lemmy or Mastodon. To do this, paste the following into the search field of your instance: !lobsters@lemmy.bestiver.se
lock
Community locked: only moderators can create posts. You can still comment on posts.

RSS Feed of lobste.rs

Visibility: Public
globe

This community can be federated to other instances and be posted/commented in by their users.

  • 1 user / day
  • 1 user / week
  • 1 user / month
  • 1 user / 6 months
  • 0 local subscribers
  • 159 subscribers
  • 151 Posts
  • 0 Comments
  • Modlog
  • mods:
  • patrick@lemmy.bestiver.se
  • RSS Bot@lemmy.bestiver.se
  • BE: 0.19.8
  • Modlog
  • Instances
  • Docs
  • Code
  • join-lemmy.org