Saving and loading in-memory btrees to disk #820
Pull request overview
This PR extends BfTreeProvider persistence to support saving and loading indices that were originally backed by in-memory (:memory:) BfTrees, including both non-quantized and PQ-quantized variants.
Changes:
- Added `snapshot_to_disk(...)` helpers on vector/quant/neighbor providers to persist in-memory BfTrees to disk.
- Extended persisted `SavedParams` with an `is_memory` flag and updated load/save flows to round-trip in-memory indices.
- Added async tests covering save/load for in-memory indices with and without PQ quantization.
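
To make the snapshot idea concrete, here is a minimal, std-only sketch of persisting an in-memory ordered map to a file and reading it back. It is not the PR's code: `MemTree`, `snapshot_to_disk_sketch`, `load_from_disk_sketch`, and the record layout are all assumptions made for illustration, and the real `snapshot_to_disk(...)` helpers presumably delegate to the bf-tree library instead.

```rust
use std::collections::BTreeMap;
use std::fs::File;
use std::io::{self, BufReader, BufWriter, Read, Write};
use std::path::Path;

/// Hypothetical stand-in for an in-memory tree: ordered u32 ids mapped to raw bytes.
type MemTree = BTreeMap<u32, Vec<u8>>;

/// Write a record count, then each record as: key (u32 LE), length (u64 LE), bytes.
fn snapshot_to_disk_sketch(tree: &MemTree, path: &Path) -> io::Result<()> {
    let mut out = BufWriter::new(File::create(path)?);
    out.write_all(&(tree.len() as u64).to_le_bytes())?;
    for (key, value) in tree {
        out.write_all(&key.to_le_bytes())?;
        out.write_all(&(value.len() as u64).to_le_bytes())?;
        out.write_all(value)?;
    }
    out.flush()
}

/// Read one little-endian u64 from the input.
fn read_u64<R: Read>(input: &mut R) -> io::Result<u64> {
    let mut buf = [0u8; 8];
    input.read_exact(&mut buf)?;
    Ok(u64::from_le_bytes(buf))
}

/// Rebuild the in-memory map from a file written by `snapshot_to_disk_sketch`.
fn load_from_disk_sketch(path: &Path) -> io::Result<MemTree> {
    let mut input = BufReader::new(File::open(path)?);
    let mut tree = MemTree::new();
    let count = read_u64(&mut input)?;
    for _ in 0..count {
        let mut key = [0u8; 4];
        input.read_exact(&mut key)?;
        let len = read_u64(&mut input)? as usize;
        let mut value = vec![0u8; len];
        input.read_exact(&mut value)?;
        tree.insert(u32::from_le_bytes(key), value);
    }
    Ok(tree)
}

fn main() -> io::Result<()> {
    let mut tree = MemTree::new();
    tree.insert(7, vec![1, 2, 3]);
    tree.insert(42, Vec::new());

    // Round-trip through a temporary file and check nothing was lost.
    let path = std::env::temp_dir().join("bftree_snapshot_sketch.bin");
    snapshot_to_disk_sketch(&tree, &path)?;
    let restored = load_from_disk_sketch(&path)?;
    assert_eq!(tree, restored);
    std::fs::remove_file(&path)?;
    Ok(())
}
```

Writing the record count first lets the loader distinguish a complete snapshot from a truncated one: a short file surfaces as an I/O error rather than a silently smaller map.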
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `diskann-providers/src/model/graph/provider/async_/bf_tree/vector_provider.rs` | Adds a convenience method to snapshot in-memory vector BfTree to disk. |
| `diskann-providers/src/model/graph/provider/async_/bf_tree/quant_vector_provider.rs` | Adds a convenience method to snapshot in-memory quant-vector BfTree to disk. |
| `diskann-providers/src/model/graph/provider/async_/bf_tree/neighbor_provider.rs` | Adds a convenience method to snapshot in-memory neighbor-list BfTree to disk. |
| `diskann-providers/src/model/graph/provider/async_/bf_tree/provider.rs` | Persists `is_memory`, adjusts save/load logic for in-memory backends, and adds new tests. |
This looks fine to me. The repeated pattern of loading and saving all the various providers could maybe be factored into two helper functions? I'm not sure how hard that is, so feel free to do what is reasonable.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

Coverage diff:

| | main | #820 | +/- |
|---|---|---|---|
| Coverage | 90.64% | 88.96% | -1.69% |
| Files | 432 | 442 | +10 |
| Lines | 79629 | 81906 | +2277 |
| Hits | 72182 | 72870 | +688 |
| Misses | 7447 | 9036 | +1589 |

Flags with carried forward coverage won't be shown.
Good suggestion. For loading, I created a helper function and I'm calling it in 5 places. For saving, I was able to refactor the code with the help of traits, but the resulting code was more complicated and longer than the original. So, for saving, perhaps we can leave the current solution.
This PR allows saving and loading in-memory (`:memory:`) indices. Previously, only on-disk indices could be saved and loaded.

It also adds an `is_memory` field to the `SavedParams` struct so that we know whether the original bf-tree index was in-memory or on-disk.

We also add two tests that check whether in-memory saving/loading works in both PQ and non-PQ settings.
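
As a rough illustration of the `is_memory` round trip, the sketch below persists a reduced parameter struct as text and uses the flag on load to choose a backend location. Everything here is hypothetical: `SavedParamsSketch`, its `dim` field, the `key=value` format, and `backend_location` are invented for the example, and how the real loader acts on the flag is an assumption. Only the `is_memory` name and the `:memory:` marker come from the PR.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced stand-in for the persisted parameters.
#[derive(Debug, Clone, PartialEq)]
struct SavedParamsSketch {
    dim: usize,
    /// True when the original index was backed by an in-memory tree.
    is_memory: bool,
}

impl SavedParamsSketch {
    /// Serialize as simple `key=value` lines (format chosen for this sketch only).
    fn to_text(&self) -> String {
        format!("dim={}\nis_memory={}\n", self.dim, self.is_memory)
    }

    /// Parse the text form; returns None if a field is missing or malformed.
    fn from_text(text: &str) -> Option<Self> {
        let fields: HashMap<&str, &str> = text
            .lines()
            .filter_map(|line| line.split_once('='))
            .collect();
        Some(Self {
            dim: fields.get("dim")?.parse().ok()?,
            is_memory: fields.get("is_memory")?.parse().ok()?,
        })
    }
}

/// Assumed loader behavior: reopen in memory if the original index was in memory,
/// otherwise reopen the on-disk file at `disk_path`.
fn backend_location(params: &SavedParamsSketch, disk_path: &str) -> String {
    if params.is_memory {
        ":memory:".to_string()
    } else {
        disk_path.to_string()
    }
}

fn main() {
    let saved = SavedParamsSketch { dim: 128, is_memory: true };
    let restored = SavedParamsSketch::from_text(&saved.to_text()).expect("well-formed text");
    assert_eq!(saved, restored);
    println!("{}", backend_location(&restored, "index.bftree"));
}
```

Making the flag a required field means parameter files written without it fail to parse in this sketch; whether the real `SavedParams` tolerates older files is not stated in the PR.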