In-Mem 2.0 #1206
Pull request overview
This PR introduces “In-Mem 2.0” by adding a new diskann-inmem crate implementing a concurrent-safe in-memory provider (EBR + slot tags + blob store), along with integration/stress testing and benchmark-runner plumbing to exercise it and compare against checked-in baselines.
Changes:
- Add new `diskann-inmem` crate (epoch registry, slot tags, store, ID translation, distance layers) plus unit + integration stress tests and baseline regression harness.
- Integrate inmem2 into `diskann-benchmark` behind a new `inmem2` feature and extend `diskann-benchmark-runner` formatting/UX helpers used by the integration binary.
- Wire the new crate into the workspace (`Cargo.toml`, `Cargo.lock`) and CI feature matrix.
Reviewed changes
Copilot reviewed 48 out of 49 changed files in this pull request and generated 16 comments.
| File | Description |
|---|---|
| rfcs/01206-inmem2.md | New RFC documenting the concurrency motivation and EBR/tag protocol design. |
| diskann-vector/src/distance/implementations.rs | Adjusts SIMD dispatch hook to use run2_inline. |
| diskann-inmem/src/test/sequencer.rs | Adds a test sequencing helper (condvar + state machine). |
| diskann-inmem/src/test/mod.rs | Introduces diskann-inmem unit-test module structure. |
| diskann-inmem/src/test/epoch.rs | Adds directed stress tests for the epoch Registry. |
| diskann-inmem/src/tag.rs | Implements atomic slot tags + protocol documentation and tests. |
| diskann-inmem/src/sharded.rs | Adds sharded external↔internal ID mapping utility with tests. |
| diskann-inmem/src/num.rs | Adds Bytes/Align strong types with tests. |
| diskann-inmem/src/lib.rs | New crate root wiring modules, exports, and internal helper macro. |
| diskann-inmem/src/layers/mod.rs | Introduces distance-layer traits (Layer, Search, QueryDistance, etc.). |
| diskann-inmem/src/layers/full.rs | Implements full-precision distance layer + specialization and tests. |
| diskann-inmem/src/integration/store.rs | Public (integration-test-only) wrapper around internal store::Store. |
| diskann-inmem/src/integration/mod.rs | Integration-test-only module exports. |
| diskann-inmem/src/integration/counters.rs | Integration-test-only counter snapshot type for reporting. |
| diskann-inmem/src/freelist.rs | Adds a freelist helper for finding/recycling internal slot IDs (with tests). |
| diskann-inmem/src/counters.rs | Adds counters/no-op counters depending on integration-test feature. |
| diskann-inmem/src/buffer.rs | Adds raw buffer + RawSlice abstraction for miri-friendly storage (with tests). |
| diskann-inmem/integration/support/tolerance.rs | Defines an “empty tolerance” input type for runner regression checks. |
| diskann-inmem/integration/support/mod.rs | Integration support module wiring. |
| diskann-inmem/integration/support/io.rs | Adds dataset loading + conversion helper for integration tests. |
| diskann-inmem/integration/support/check.rs | Adds structured baseline mismatch aggregation/printing utilities. |
| diskann-inmem/integration/store.rs | Adds store concurrency stress benchmark binary implementation. |
| diskann-inmem/integration/main.rs | Adds integration-test runner binary entrypoint and regression harness. |
| diskann-inmem/integration/jsons/store-stress.json | Adds example stress-test job JSON. |
| diskann-inmem/integration/jsons/store-stress-test.json | Adds a smaller stress-test job JSON for CI/unit testing. |
| diskann-inmem/integration/jsons/integration.json | Adds integration-test job definitions (YFCC 10k A/B runs). |
| diskann-inmem/integration/jsons/integration-baseline.json | Adds checked-in baseline outputs for regression comparison. |
| diskann-inmem/integration/jsons/checks.json | Adds checker config using empty tolerances. |
| diskann-inmem/integration/index/tests.rs | Adds index-level insert + knn execution helpers and result DTOs. |
| diskann-inmem/integration/index/runner.rs | Adds runner glue: deserialize inputs, build provider, run regression checks. |
| diskann-inmem/integration/index/object.rs | Adds trait-object interface for an index + counter/metric DTOs. |
| diskann-inmem/integration/index/mod.rs | Wires index integration modules and registration. |
| diskann-inmem/DEV.md | Documents how to run full test suite with integration-test enabled. |
| diskann-inmem/Cargo.toml | New crate manifest, deps, bin target, and integration-test feature. |
| diskann-inmem/.clippy.toml | Clippy config allowing unwrap/expect/panic in tests for this crate. |
| diskann-benchmark/src/index/mod.rs | Registers new inmem2 benchmarks behind feature flag. |
| diskann-benchmark/Cargo.toml | Adds optional dependency and feature flag for diskann-inmem. |
| diskann-benchmark-runner/src/utils/fmt.rs | Adds KeyValue formatter utility and tests. |
| diskann-benchmark-runner/src/files.rs | Implements Display for InputFile to improve formatting. |
| diskann-benchmark-runner/src/app.rs | Minor formatting fix for mismatch reporting output. |
| Cargo.toml | Adds diskann-inmem to workspace + dependency table; adds profile.samply. |
| Cargo.lock | Locks new diskann-inmem package and dependencies into the workspace lockfile. |
| .github/workflows/ci.yml | Enables integration-test and inmem2 in CI feature set. |
Codecov Report
@@ Coverage Diff @@
## main #1206 +/- ##
==========================================
+ Coverage 89.64% 89.82% +0.17%
==========================================
Files 503 518 +15
Lines 95761 98963 +3202
==========================================
+ Hits 85848 88895 +3047
- Misses 9913 10068 +155
Introduce a second in-memory provider with the intention of replacing the current provider.
Why
The RFC outlines much of the motivation. In short, the goal here is to:
The concurrency argument comes from the epoch-based-reclamation (EBR) protection scheme for internal slots. See the RFC for more details.
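For readers unfamiliar with EBR, the general idea can be illustrated with a deliberately simplified, lock-based model. This is only a sketch of the textbook technique, not the `Registry` in `epoch.rs`: the names `ToyRegistry`, `enter`, `exit`, `retire`, and `reclaim` are hypothetical, and a single `Mutex` stands in for whatever atomics the real implementation uses. It also assumes that a retired slot has already been unpublished, so readers that register after the retirement cannot reach it.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

/// State of the toy registry (hypothetical; not the PR's data layout).
#[derive(Default)]
struct Inner {
    /// Global epoch, bumped on every retirement.
    epoch: u64,
    /// Epoch -> number of readers that registered during that epoch.
    active: BTreeMap<u64, usize>,
    /// Retired slots, each paired with the epoch in which it was retired.
    retired: Vec<(usize, u64)>,
}

/// A lock-based model of epoch-based reclamation.
#[derive(Default)]
pub struct ToyRegistry {
    inner: Mutex<Inner>,
}

impl ToyRegistry {
    /// Register a reader and return the epoch it observed.
    pub fn enter(&self) -> u64 {
        let mut guard = self.inner.lock().unwrap();
        let epoch = guard.epoch;
        *guard.active.entry(epoch).or_insert(0) += 1;
        epoch
    }

    /// Deregister a reader that previously observed `epoch`.
    pub fn exit(&self, epoch: u64) {
        let mut guard = self.inner.lock().unwrap();
        let inner = &mut *guard;
        if let Some(count) = inner.active.get_mut(&epoch) {
            *count -= 1;
            if *count == 0 {
                inner.active.remove(&epoch);
            }
        }
    }

    /// Retire a slot: record the current epoch, then advance it so that
    /// later readers are distinguishable from earlier ones.
    pub fn retire(&self, slot: usize) {
        let mut guard = self.inner.lock().unwrap();
        let epoch = guard.epoch;
        guard.retired.push((slot, epoch));
        guard.epoch += 1;
    }

    /// Return every retired slot that no active reader could still observe,
    /// i.e. slots retired strictly before the oldest active reader's epoch.
    pub fn reclaim(&self) -> Vec<usize> {
        let mut guard = self.inner.lock().unwrap();
        let inner = &mut *guard;
        let oldest = inner.active.keys().next().copied().unwrap_or(u64::MAX);
        let (free, keep): (Vec<_>, Vec<_>) = inner
            .retired
            .drain(..)
            .partition(|&(_, retired_at)| retired_at < oldest);
        inner.retired = keep;
        free.into_iter().map(|(slot, _)| slot).collect()
    }
}
```

In this model a slot retired while a reader is registered stays in the retired list until that reader calls `exit`; only then does `reclaim` hand the slot back for reuse.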
Known Follow-up Items
- … `Store`, but will need some thought on how to add the reranking layer. This shouldn't be an architectural blocker, though.
- … `max_fp_vecs` feature of our current PQ implementation, which reads some full-precision and some quantized vectors during prune. Like with quantization in general, I think this is not a fundamental issue.
Suggested Reviewing Order
The majority of this PR is in a new `diskann-inmem` crate. To facilitate testing, this crate has an "integration-test" feature, which enables the code in `diskann-inmem/src/integration`. This code is an unstable public reexport of internal types meant only for consumption in the `diskann-inmem/integration` integration test binary.
The integration test binary is powered by `diskann-benchmark-runner`.
diskann-inmem
Independent Low-Level Utilities
- `num.rs`: Strong type utilities for byte and alignment representations.
- `buffer.rs`: A Miri-compliant version of `AlignedMemoryVectorStore`. This type allows vectors/neighbors to be stored in a single larger allocation. The use of `RawSlice` allows slots within the `Buffer` to be inspected and manipulated without forcing reference materialization (which is important to prevent aliasing).
- `neighbors.rs`: The new version of `SimpleNeighborVectorProviderAsync`. This reuses the sharded-lock idea, but provides additional utility, including the ability to perform read-modify-write operations on adjacency lists.
- `counters.rs`: Event counters. When the "integration-test" feature is not enabled, counters become a no-op. These are enabled for testing to monitor changes.
- `sharded.rs`: An external-to-internal ID translation utility. The main trick with this struct is to provide utilities like `Sharded::occupied_entry`, which locks and returns an `external`/`internal` mapping. The proxy `Entry` struct is important as it verifies that such a mapping exists and provides an infallible way of deleting the mapping. This is chained with higher level operations (e.g. `Provider::delete`) to delete both the ID-mapping and the internal data-slot in lock-step.
Concurrency Protocol
The concurrency protocol is built upon three main layers:
- `tag.rs`: An atomic slot tag for controlling access to data.
- `epoch.rs`: The central registry where readers register and deregister. This is the crux of this PR and probably the most important file.
- `store.rs`: A binary blob store built on top of `epoch.rs` to provide the safe concurrent store for data. This provides the following operations:
  - … `Reader`).
The `Store` in `store.rs` has some help from `freelist.rs` to accelerate locating available slots internally.
Testing:
`epoch.rs` has unit tests with injectable delays to set up known pathological orderings. The sequencing is helped by `test/sequencer.rs`. In addition, `test/epoch.rs` includes a direct stress test for the `Registry`. This is particularly helpful when run under Miri, which has the ability to detect some race conditions.
A larger concurrency stress test lives in the integration-test binary. This directly tests `store.rs` by spinning up readers, writers, and retirers that hammer a single `Store`. Data is read and written into the store in a knowable pattern, allowing readers to detect torn reads, which would imply a race condition. For this PR, I ran the following stress test file
```json
{
  "search_directories": [],
  "output_directory": null,
  "jobs": [
    {
      "type": "store-stress",
      "content": {
        "capacity": 8192,
        "duration_secs": 600,
        "entry_bytes": 256,
        "low_watermark": 4096,
        "max_ops": 50000000000,
        "readers": 32,
        "retirers": 16,
        "seed": 11935966405698895599,
        "writers": 32
      }
    }
  ]
}
```
with the command
The generated output was
While not a proof of correctness, this is a pretty decent stress test.
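The "knowable pattern" idea used for torn-read detection can be sketched as follows. This is an assumed illustration, not the pattern used by the stress test in `integration/store.rs`: the helpers `fill`, `check`, and `byte_at`, the 8-byte stamp header, and the multiplicative constant are all hypothetical. A writer stamps every byte of an entry from a single value, so a reader that sees bytes from two different writes can tell.

```rust
/// Number of leading bytes that hold the writer's stamp (assumed layout).
pub const HEADER: usize = 8;

/// Payload byte at offset `i` for a given stamp: the top byte of a
/// multiplicative hash, so nearby stamps give different payloads.
fn byte_at(stamp: u64, i: usize) -> u8 {
    (stamp
        .wrapping_add(i as u64)
        .wrapping_mul(0x9E37_79B9_7F4A_7C15)
        >> 56) as u8
}

/// Writer side: store the stamp, then derive every payload byte from it.
pub fn fill(entry: &mut [u8], stamp: u64) {
    assert!(entry.len() >= HEADER, "entry too small for the stamp header");
    entry[..HEADER].copy_from_slice(&stamp.to_le_bytes());
    for (i, byte) in entry[HEADER..].iter_mut().enumerate() {
        *byte = byte_at(stamp, i);
    }
}

/// Reader side: return the stamp if the whole entry is consistent with it,
/// or `None` if the entry mixes bytes from different writes (a torn read).
pub fn check(entry: &[u8]) -> Option<u64> {
    let header: [u8; HEADER] = entry.get(..HEADER)?.try_into().ok()?;
    let stamp = u64::from_le_bytes(header);
    entry[HEADER..]
        .iter()
        .enumerate()
        .all(|(i, &byte)| byte == byte_at(stamp, i))
        .then_some(stamp)
}
```

A reader thread would call `check` on each entry it copies out and treat `None` as evidence of a race. The check is probabilistic in general, since two stamps can agree on some payload bytes, which is one reason a long run over many operations is useful.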
Providers
The implementation of the data provider is split into two logical pieces. The first lives in `layers/` and is focused on computing distances. With this approach, I am trying to avoid the need to replicate `Accessor`s and `Strategy`s for each future quantization type. `layers/full.rs` is the full-precision implementation. One thing to note is the use of the `FullPrecision` marker trait, from which the implementation of the whole `layers` API is derived for `layers::Full`. This allows users to include just a `T: FullPrecision` trait bound and really simplifies the generics upstream.
Within `provider.rs`, my goal is to minimize the use of generics as much as possible. In particular for search, I use a trait object for `ExpandBeam`. When coupled with the `layers::QueryDistance` API, we can create implementations where the distance function is inlined and the number of prefetch instructions can be tailored to the data length. Fully optimizing this is still a work-in-progress.
Another thing to call out in `provider.rs` is the care needed for data insertion and deletion. Since external/internal ID translation is supported, we need to ensure that the translation table stays in sync with the internal store. On insert, if we allocate an internal slot only to find the external ID already exists, we need to abort the operation rather than publish the internal slot. Similarly on delete, we first need to establish whether the external/internal ID mapping exists. If so, then we can try to retire the slot. If slot retiring fails (it shouldn't, but bugs can happen), we must not commit the ID mapping deletion and must instead return an error.
Testing
Like the concurrency stress tests, testing uses the `integration-test` binary. Here, the 10k YFCC dataset is used for non-trivial runs. Using the A/B functionality in `diskann-benchmark-runner`, we can compare against checked-in baselines. This allows us to capture rich metrics for recall, number of operations, etc. and update easily. The main logic for checking and reporting baseline mismatches is in `integration/support/check.rs`. The goal is to summarize all such mismatches for presentation to provide the highest signal possible.
diskann-benchmark
Integration into the benchmarks is straightforward. I elected to put everything in a single file to minimize disruption. I'm trying an approach of using `diskann_benchmark_runner::Input::from_raw` to separate out the deserialization types from the actual inputs, allowing richer types (e.g., a full `diskann_benchmark_core::streaming::bigann::RunBook`) to be loaded.
Also note how relatively simple the streaming benchmark integration is. Since the new inmem provider supports ID translation and internal slot allocation, it does not need the same level of hand-holding that the current inmem provider needs.
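As a closing illustration of the insert/delete ordering described for `provider.rs` under Providers, here is a minimal sketch. It assumes a single `Mutex`-protected `HashMap` in place of the sharded table and a `Vec` of optional blobs in place of the EBR-protected store. `ToyProvider`, `Error`, `allocate`, and `release` are hypothetical names, not the crate's API. The point is only the ordering: the ID mapping is committed last on insert and removed last on delete.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
pub enum Error {
    Full,
    DuplicateId,
    UnknownId,
    RetireFailed,
}

/// Toy provider: external IDs map to internal slots (hypothetical layout).
pub struct ToyProvider {
    ids: Mutex<HashMap<u64, usize>>,
    slots: Mutex<Vec<Option<Vec<u8>>>>,
}

impl ToyProvider {
    pub fn new(capacity: usize) -> Self {
        Self {
            ids: Mutex::new(HashMap::new()),
            slots: Mutex::new(vec![None; capacity]),
        }
    }

    /// Claim the first free internal slot and write `data` into it.
    fn allocate(&self, data: &[u8]) -> Result<usize, Error> {
        let mut slots = self.slots.lock().unwrap();
        let slot = slots.iter().position(|s| s.is_none()).ok_or(Error::Full)?;
        slots[slot] = Some(data.to_vec());
        Ok(slot)
    }

    /// Give a slot back; fails if the slot is not currently occupied.
    fn release(&self, slot: usize) -> Result<(), Error> {
        let mut slots = self.slots.lock().unwrap();
        match slots.get_mut(slot) {
            Some(entry) if entry.is_some() => {
                *entry = None;
                Ok(())
            }
            _ => Err(Error::RetireFailed),
        }
    }

    /// Insert: allocate first, then publish the mapping only if the
    /// external ID is new. Otherwise abort and release the slot.
    pub fn insert(&self, external: u64, data: &[u8]) -> Result<usize, Error> {
        let slot = self.allocate(data)?;
        let mut ids = self.ids.lock().unwrap();
        if ids.contains_key(&external) {
            self.release(slot)?;
            return Err(Error::DuplicateId);
        }
        ids.insert(external, slot);
        Ok(slot)
    }

    /// Delete: confirm the mapping exists, retire the slot, and only then
    /// remove the mapping. A failed retire leaves the mapping untouched.
    pub fn delete(&self, external: u64) -> Result<(), Error> {
        let mut ids = self.ids.lock().unwrap();
        let slot = *ids.get(&external).ok_or(Error::UnknownId)?;
        self.release(slot)?;
        ids.remove(&external);
        Ok(())
    }

    /// Look up the data stored for an external ID, if any.
    pub fn get(&self, external: u64) -> Option<Vec<u8>> {
        let slot = *self.ids.lock().unwrap().get(&external)?;
        self.slots.lock().unwrap()[slot].clone()
    }
}
```

Holding the `ids` lock across the slot release plays the role that the description assigns to the locked `Entry` proxy: no other thread can observe a mapping whose slot has already been retired, or a retired slot whose mapping is still live.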