In-Mem 2.0 #1206

Open
hildebrandmw wants to merge 48 commits into main from mhildebr/inmem2

@hildebrandmw hildebrandmw commented Jun 27, 2026

Contributor

Introduce a second in-memory provider with the intention of replacing the current provider.

Why

The RFC outlines much of the motivation. In short, the goal here is to:

  • Make the provider safe under concurrent inserts/search/deletes.
  • Support proper external/internal ID translation.
  • Improve test coverage.
  • Do so with minimal performance overhead.

The concurrency argument comes from the epoch-based-reclamation (EBR) protection scheme for internal slots. See the RFC for more details.

Known Follow-up Items

  • Perf Parity: Performance is generally good, but needs more tuning to reach full parity with our current inmem index.
  • Quantization: This initial prototype lacks quantization support. Adding quantization is relatively straightforward in the primary Store, but adding the reranking layer will need some thought. This shouldn't be an architectural blocker, though.
  • Hybrid PQ: A larger open question is how to support the max_fp_vecs feature of our current PQ implementation, which reads some full-precision and some quantized vectors during prune. As with quantization in general, I think this is not a fundamental issue.
  • Support for non-uniform sized items in slots: For this, I'm mainly thinking of multi-vectors where the number of vectors within each multi-vector can vary. In the context of multi-vectors, fast element access is less important than for traditional vectors as distance computations take considerably longer.

Suggested Reviewing Order

The majority of this PR is in a new diskann-inmem crate. To facilitate testing, this crate has an "integration-test" feature, which enables the code in diskann-inmem/src/integration. This code is an unstable public reexport of internal types, meant only for consumption by the integration-test binary in diskann-inmem/integration.

The integration test binary is powered by diskann-benchmark-runner.

diskann-inmem

Independent Low-Level Utilities

  • num.rs: Strong type utilities for byte and alignment representations.
  • buffer.rs: A miri-compliant version of AlignedMemoryVectorStore. This type allows vectors/neighbors to be stored in a single larger allocation. The use of RawSlice allows slots within the Buffer to be inspected and manipulated without forcing reference materialization (which is important to prevent aliasing).
  • neighbors.rs: The new version of SimpleNeighborVectorProviderAsync. This reuses the sharded-lock idea, but provides additional utility, including the ability to perform read-modify-write operations on adjacency lists.
  • counters.rs: Event counters. When the "integration-test" feature is not enabled, counters become a no-op. These are enabled for testing to monitor changes.
  • sharded.rs: An external-to-internal ID translation utility. The main trick with this struct is to provide utilities like Sharded::occupied_entry, which locks and returns an external/internal mapping. The proxy Entry struct is important as it verifies that such a mapping exists and provides an infallible way of deleting the mapping. This is chained with higher level operations (e.g. Provider::delete) to delete both the ID-mapping and the internal data-slot in lock-step.
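The occupied-entry idea described for sharded.rs can be illustrated with a small sketch. This is not the crate's code: the `Sharded` and `OccupiedEntry` types below, their fields, and the modulo sharding rule are assumptions made for illustration, built only on `std::sync::Mutex` and `HashMap`. The point it shows is that the entry holds the shard lock and was checked at construction, so `remove` cannot fail.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, MutexGuard};

/// Hypothetical sharded external -> internal ID map (not the crate's actual API).
pub struct Sharded {
    shards: Vec<Mutex<HashMap<u64, u32>>>,
}

/// Proof that a mapping exists. The shard stays locked while this value lives.
pub struct OccupiedEntry<'a> {
    guard: MutexGuard<'a, HashMap<u64, u32>>,
    external: u64,
    internal: u32,
}

impl Sharded {
    pub fn new(num_shards: usize) -> Self {
        assert!(num_shards > 0);
        Self {
            shards: (0..num_shards).map(|_| Mutex::new(HashMap::new())).collect(),
        }
    }

    /// Lock the shard that owns `external` (simple modulo sharding, an assumption).
    fn shard(&self, external: u64) -> MutexGuard<'_, HashMap<u64, u32>> {
        let index = (external % self.shards.len() as u64) as usize;
        self.shards[index].lock().unwrap()
    }

    /// Insert a mapping. Returns `false` and changes nothing if the ID already exists.
    pub fn insert(&self, external: u64, internal: u32) -> bool {
        let mut shard = self.shard(external);
        if shard.contains_key(&external) {
            return false;
        }
        shard.insert(external, internal);
        true
    }

    /// Lock the shard and return an entry only if the mapping exists.
    pub fn occupied_entry(&self, external: u64) -> Option<OccupiedEntry<'_>> {
        let guard = self.shard(external);
        let internal = *guard.get(&external)?;
        Some(OccupiedEntry { guard, external, internal })
    }
}

impl OccupiedEntry<'_> {
    pub fn internal(&self) -> u32 {
        self.internal
    }

    /// Infallible: existence was verified at construction and the lock is still held.
    pub fn remove(mut self) -> u32 {
        self.guard.remove(&self.external);
        self.internal
    }
}
```

A caller that wants to delete in lock-step would obtain the entry, retire the internal slot while the entry is alive, and call `remove` only after the retirement succeeds.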

Concurrency Protocol

The concurrency protocol is built upon three main layers:

  • tag.rs: An atomic slot tag for controlling access to data.
  • epoch.rs: The central registry where readers register and deregister. This is the crux of this PR and probably the most important file.
  • store.rs: A binary blob store built on top of epoch.rs to provide the safe concurrent store for data. This provides the following operations:
    • Storage of binary data in "slots".
    • Reading of data in slots (provided by Reader).
    • Tracking of the valid/invalid state of slots.
    • Finding available slots into which new data can be inserted.
    • Safe retirement of slots and eventual reclamation.
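To make the reader registration and deferred reclamation idea concrete, here is a deliberately simplified epoch registry. It is a sketch under assumptions, not the protocol in epoch.rs: the `Registry` type, its method names, the fixed reader table, and the use of `SeqCst` everywhere are all illustrative choices. It assumes the caller makes a slot unreachable before calling `retire`.

```rust
use std::sync::atomic::{AtomicU64, Ordering::SeqCst};

/// Marker for a reader slot that is not currently inside a read section.
const INACTIVE: u64 = u64::MAX;

/// Hypothetical, simplified epoch registry (not the crate's actual type).
pub struct Registry {
    global: AtomicU64,
    /// Epoch each reader is pinned at, or `INACTIVE`.
    readers: Vec<AtomicU64>,
}

impl Registry {
    pub fn new(max_readers: usize) -> Self {
        Self {
            global: AtomicU64::new(0),
            readers: (0..max_readers).map(|_| AtomicU64::new(INACTIVE)).collect(),
        }
    }

    /// Register a reader: publish the current epoch in the reader's slot.
    pub fn enter(&self, reader: usize) -> u64 {
        loop {
            let epoch = self.global.load(SeqCst);
            self.readers[reader].store(epoch, SeqCst);
            // Re-check so a concurrent `retire` cannot overlook this reader.
            if self.global.load(SeqCst) == epoch {
                return epoch;
            }
        }
    }

    /// Deregister a reader.
    pub fn exit(&self, reader: usize) {
        self.readers[reader].store(INACTIVE, SeqCst);
    }

    /// Advance the global epoch and return the epoch the item was retired in.
    /// The caller must already have made the item unreachable for new readers.
    pub fn retire(&self) -> u64 {
        self.global.fetch_add(1, SeqCst)
    }

    /// An item retired in `retired_at` may be reclaimed once no active reader
    /// is pinned at that epoch or an earlier one.
    pub fn can_reclaim(&self, retired_at: u64) -> bool {
        self.readers.iter().all(|slot| {
            let pinned = slot.load(SeqCst);
            pinned == INACTIVE || pinned > retired_at
        })
    }
}
```

In this sketch a reader that entered before the retirement blocks reclamation until it exits, while readers that enter afterwards are pinned at a newer epoch and do not.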

The Store in store.rs has some help from freelist.rs to accelerate locating available slots internally.

Testing: epoch.rs has unit tests with injectable delays to set up known pathological orderings. The sequencing is helped by test/sequencer.rs. In addition, test/epoch.rs includes a direct stress test for the Registry. This is particularly helpful when run under Miri, which has the ability to detect some race conditions.
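Forcing a known ordering between threads can be done with a condition variable and a step counter. The sketch below is only an illustration of that technique; the `Sequencer` type and the step names are assumptions and do not reflect the contents of test/sequencer.rs.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical test sequencer: threads block until a given step is reached.
pub struct Sequencer {
    step: Mutex<usize>,
    changed: Condvar,
}

impl Sequencer {
    pub fn new() -> Self {
        Self { step: Mutex::new(0), changed: Condvar::new() }
    }

    /// Block until the step counter is at least `target`.
    pub fn wait_for(&self, target: usize) {
        let mut step = self.step.lock().unwrap();
        while *step < target {
            step = self.changed.wait(step).unwrap();
        }
    }

    /// Move to the next step and wake every waiting thread.
    pub fn advance(&self) {
        let mut step = self.step.lock().unwrap();
        *step += 1;
        self.changed.notify_all();
    }
}

/// Force the order "register, retire, read" regardless of thread scheduling.
pub fn forced_interleaving() -> Vec<&'static str> {
    let seq = Arc::new(Sequencer::new());
    let log: Arc<Mutex<Vec<&'static str>>> = Arc::new(Mutex::new(Vec::new()));

    let retirer = {
        let (seq, log) = (Arc::clone(&seq), Arc::clone(&log));
        thread::spawn(move || {
            seq.wait_for(1); // wait until the reader has registered
            log.lock().unwrap().push("retirer: retire slot");
            seq.advance(); // step 2
        })
    };
    let reader = {
        let (seq, log) = (Arc::clone(&seq), Arc::clone(&log));
        thread::spawn(move || {
            log.lock().unwrap().push("reader: register");
            seq.advance(); // step 1
            seq.wait_for(2); // wait until the slot has been retired
            log.lock().unwrap().push("reader: read slot");
        })
    };

    retirer.join().unwrap();
    reader.join().unwrap();
    let events = log.lock().unwrap().clone();
    events
}
```

A test built this way can pin down one pathological ordering at a time and then assert on the state that ordering produces.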

A larger concurrency stress test lives in the integration-test binary. This directly tests store.rs by spinning up readers, writers, and retirers that hammer a single Store. Data is written into and read from the store in a knowable pattern, so readers can detect torn reads, which would imply a race condition. For this PR, I ran the following stress test file:

```json
{
  "search_directories": [
  ],
  "output_directory": null,
  "jobs": [
    {
      "type": "store-stress",
      "content": {
        "capacity": 8192,
        "duration_secs": 600,
        "entry_bytes": 256,
        "low_watermark": 4096,
        "max_ops": 50000000000,
        "readers": 32,
        "retirers": 16,
        "seed": 11935966405698895599,
        "writers": 32
      }
    }
  ]
}
```

with the command

```sh
cargo run --package diskann-inmem \
  --bin integration-test \
  --features integration-test \
  --release -- \
  run --input-file stress.json --output-file temp.json
```

The generated output was

```text
readers:       32
writers:       32
retirers:      16
capacity:      8192
entry_bytes:   256
low_watermark: 4096
duration_secs: 600
max_ops:       50000000000
seed:          11935966405698895599
elapsed_secs:  600.001182962
reads:         110206469888
acquires_ok:   248086767
acquires_fail: 619145
retires_ok:    248082573
retires_fail:  243312644
reclaims:      108847566
transitions:   42006520
peak_live:     8131
```

While not a proof of correctness, this is a pretty decent stress test.
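The "knowable pattern" approach to detecting torn reads can be sketched as follows. The pattern used here (each byte is a seed plus its position, wrapping) and the function names are assumptions for illustration; the actual stress test may encode entries differently. The idea is that any entry assembled from two different writes fails the self-consistency check.

```rust
/// Fill an entry so that every byte is a function of (seed, position).
pub fn write_pattern(entry: &mut [u8], seed: u8) {
    for (i, byte) in entry.iter_mut().enumerate() {
        *byte = seed.wrapping_add(i as u8);
    }
}

/// Recover the seed from the first byte and verify every other byte against it.
/// Returns the seed, or the index of the first inconsistent byte
/// (index 0 for an empty entry).
pub fn check_pattern(entry: &[u8]) -> Result<u8, usize> {
    let seed = *entry.first().ok_or(0usize)?;
    for (i, &byte) in entry.iter().enumerate() {
        if byte != seed.wrapping_add(i as u8) {
            return Err(i);
        }
    }
    Ok(seed)
}

/// Simulate a torn read: the first half comes from one write, the rest from another.
pub fn simulate_torn_read(len: usize, old_seed: u8, new_seed: u8) -> Vec<u8> {
    let mut old = vec![0u8; len];
    let mut new = vec![0u8; len];
    write_pattern(&mut old, old_seed);
    write_pattern(&mut new, new_seed);
    let half = len / 2;
    old[half..].copy_from_slice(&new[half..]);
    old
}
```

A reader thread in a stress test would call something like `check_pattern` on every entry it copies out and count any `Err` as a detected race.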

Providers

The implementation of the data provider is split into two logical pieces. The first lives in layers/ and is focused on computing distances. With this approach, I am trying to avoid the need to replicate Accessors and Strategys for each future quantization type. layers/full.rs is the full-precision implementation. One thing to note is the use of the FullPrecision marker trait, from which the implementation of the whole layers API is derived for layers::Full. This allows users to include just a T: FullPrecision trait bound and really simplifies the generics upstream.
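The marker-trait pattern can be illustrated with a small sketch. Only the names `FullPrecision` and `Full` come from the description above; the `Layer` trait shown here, its single `squared_l2` method, and the `Into<f32>` supertrait are assumptions chosen to keep the example short, not the crate's definitions.

```rust
use std::marker::PhantomData;

/// Marker trait: opting in here is all an element type has to do.
/// (The supertraits are an assumption for this sketch.)
pub trait FullPrecision: Copy + Into<f32> {}
impl FullPrecision for f32 {}
impl FullPrecision for u8 {}
impl FullPrecision for i8 {}

/// Hypothetical, much-reduced stand-in for the layers API.
pub trait Layer {
    type Element;
    fn squared_l2(&self, a: &[Self::Element], b: &[Self::Element]) -> f32;
}

/// Full-precision layer, generic over the element type.
pub struct Full<T>(PhantomData<T>);

impl<T> Full<T> {
    pub fn new() -> Self {
        Full(PhantomData)
    }
}

/// One blanket implementation derives the layer API for every marked type.
impl<T: FullPrecision> Layer for Full<T> {
    type Element = T;

    fn squared_l2(&self, a: &[T], b: &[T]) -> f32 {
        a.iter()
            .zip(b)
            .map(|(&x, &y)| {
                let (x, y): (f32, f32) = (x.into(), y.into());
                (x - y) * (x - y)
            })
            .sum()
    }
}

/// Upstream code needs only the single `T: FullPrecision` bound.
pub fn nearest<T: FullPrecision>(query: &[T], points: &[Vec<T>]) -> Option<usize> {
    let layer = Full::<T>::new();
    points
        .iter()
        .enumerate()
        .map(|(i, point)| (i, layer.squared_l2(query, point)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

The benefit shows up in `nearest`: it never names `Layer` or `Full` in its bounds, so adding methods to the layer API does not ripple into caller signatures.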

Within provider.rs, my goal is to minimize the use of generics. In particular for search, I use a trait object for ExpandBeam. When coupled with the layers::QueryDistance API, we can create implementations where the distance function is inlined and the number of prefetch instructions can be tailored to the data length. Fully optimizing this is still a work-in-progress.
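The combination of a trait object at the search boundary with a generic, inlinable distance inside can be sketched like this. The trait names `ExpandBeam` and `QueryDistance` are borrowed from the description, but their signatures here, the `Expander` and `SquaredL2` types, and the `closest` function are hypothetical and exist only to show the shape of the technique.

```rust
/// Hypothetical distance interface: a query is captured once, then applied to candidates.
pub trait QueryDistance {
    fn distance(&self, candidate: &[f32]) -> f32;
}

pub struct SquaredL2<'q> {
    pub query: &'q [f32],
}

impl QueryDistance for SquaredL2<'_> {
    #[inline]
    fn distance(&self, candidate: &[f32]) -> f32 {
        self.query
            .iter()
            .zip(candidate)
            .map(|(q, c)| (q - c) * (q - c))
            .sum()
    }
}

/// Object-safe trait: the search loop sees only this, with no generic parameters.
pub trait ExpandBeam {
    fn expand(&self, data: &[Vec<f32>], neighbors: &[usize], out: &mut Vec<(usize, f32)>);
}

/// Generic implementation: inside `expand` the distance call is statically dispatched.
pub struct Expander<D> {
    pub distance: D,
}

impl<D: QueryDistance> ExpandBeam for Expander<D> {
    fn expand(&self, data: &[Vec<f32>], neighbors: &[usize], out: &mut Vec<(usize, f32)>) {
        for &id in neighbors {
            out.push((id, self.distance.distance(&data[id])));
        }
    }
}

/// Compiled once, whatever distance type sits behind the trait object.
pub fn closest(beam: &dyn ExpandBeam, data: &[Vec<f32>], neighbors: &[usize]) -> Option<usize> {
    let mut scored = Vec::new();
    beam.expand(data, neighbors, &mut scored);
    scored
        .into_iter()
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(id, _)| id)
}
```

Dynamic dispatch happens once per `expand` call rather than once per distance computation, which is what leaves room for the per-type inlining and prefetch tuning mentioned above.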

Another thing to call out in provider.rs is the care needed for data insertion and deletion. Since external/internal ID translation is supported, we need to ensure that the translation table stays in sync with the internal store. On insert, if we allocate an internal slot only to find the external ID already exists, we need to abort the operation rather than publish the internal slot. Similarly on delete, we first need to establish whether the external/internal ID mapping exists. If so, then we can try to retire the slot. If retiring the slot fails (it shouldn't, but bugs can happen), we must not commit the ID mapping deletion and must instead return an error.
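The insert and delete rules above can be sketched with a toy provider. Everything here is an assumption made for illustration: a single `Mutex` stands in for the sharded map and the concurrent store, and the `Provider`, `Slots`, and `Error` types are hypothetical. What it preserves is the ordering: allocate, check the ID, then either publish or abort; and on delete, retire first and drop the mapping only on success.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
pub enum Error {
    DuplicateId,
    UnknownId,
    StoreFull,
    RetireFailed,
}

/// Toy stand-in for the internal slot store.
struct Slots {
    data: Vec<Option<Vec<f32>>>,
}

impl Slots {
    fn acquire(&mut self, vector: Vec<f32>) -> Option<u32> {
        let slot = self.data.iter().position(|s| s.is_none())?;
        self.data[slot] = Some(vector);
        Some(slot as u32)
    }

    /// Returns `false` if the slot was not occupied.
    fn retire(&mut self, slot: u32) -> bool {
        self.data
            .get_mut(slot as usize)
            .and_then(|s| s.take())
            .is_some()
    }
}

struct Inner {
    ids: HashMap<u64, u32>,
    slots: Slots,
}

/// Hypothetical provider; one lock replaces the real concurrency machinery.
pub struct Provider {
    inner: Mutex<Inner>,
}

impl Provider {
    pub fn new(capacity: usize) -> Self {
        let slots = Slots { data: vec![None; capacity] };
        Self { inner: Mutex::new(Inner { ids: HashMap::new(), slots }) }
    }

    pub fn insert(&self, external: u64, vector: Vec<f32>) -> Result<u32, Error> {
        let mut inner = self.inner.lock().unwrap();
        let slot = inner.slots.acquire(vector).ok_or(Error::StoreFull)?;
        if inner.ids.contains_key(&external) {
            // Abort: give the slot back instead of publishing it.
            inner.slots.retire(slot);
            return Err(Error::DuplicateId);
        }
        inner.ids.insert(external, slot);
        Ok(slot)
    }

    pub fn delete(&self, external: u64) -> Result<(), Error> {
        let mut inner = self.inner.lock().unwrap();
        let slot = *inner.ids.get(&external).ok_or(Error::UnknownId)?;
        if !inner.slots.retire(slot) {
            // The mapping is deliberately left in place.
            return Err(Error::RetireFailed);
        }
        inner.ids.remove(&external);
        Ok(())
    }

    pub fn live_slots(&self) -> usize {
        let inner = self.inner.lock().unwrap();
        inner.slots.data.iter().filter(|s| s.is_some()).count()
    }
}
```

After a rejected duplicate insert the number of live slots is unchanged, which is the invariant the abort path exists to protect.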

Testing

Like the concurrency stress tests, testing uses the integration-test binary. Here, the 10k YFCC dataset is used for non-trivial runs. Using the A/B functionality in diskann-benchmark-runner, we can compare against checked-in baselines. This allows us to capture rich metrics for recall, number of operations, etc., and to update the baselines easily. The main logic for checking and reporting baseline mismatches is in integration/support/check.rs. The goal is to summarize all such mismatches in a single report to provide the highest signal possible.
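Summarizing every mismatch, rather than stopping at the first, might look like the sketch below. The `Mismatch` type, the `check` and `report` functions, the metric names in the usage note, and the `tolerance` parameter are all hypothetical; this is not the logic in integration/support/check.rs.

```rust
use std::fmt;

/// One metric that disagrees with the checked-in baseline.
#[derive(Debug, PartialEq)]
pub struct Mismatch {
    pub metric: String,
    pub baseline: f64,
    /// `None` when the metric is missing from the new run.
    pub actual: Option<f64>,
}

impl fmt::Display for Mismatch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self.actual {
            Some(actual) => write!(f, "{}: baseline {}, got {}", self.metric, self.baseline, actual),
            None => write!(f, "{}: baseline {}, missing in run", self.metric, self.baseline),
        }
    }
}

/// Compare every baseline metric and collect all mismatches.
pub fn check(baseline: &[(&str, f64)], actual: &[(&str, f64)], tolerance: f64) -> Vec<Mismatch> {
    baseline
        .iter()
        .filter_map(|&(name, expected)| {
            let found = actual.iter().find(|(n, _)| *n == name).map(|&(_, v)| v);
            match found {
                Some(v) if (v - expected).abs() <= tolerance => None,
                other => Some(Mismatch {
                    metric: name.to_string(),
                    baseline: expected,
                    actual: other,
                }),
            }
        })
        .collect()
}

/// Render all mismatches as one report, one line per metric.
pub fn report(mismatches: &[Mismatch]) -> String {
    mismatches
        .iter()
        .map(|m| m.to_string())
        .collect::<Vec<_>>()
        .join("\n")
}
```

With a baseline of `recall`, `distance_computations`, and `hops`, a run that changes one metric and omits another yields a two-line report instead of a failure on the first difference.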

diskann-benchmark

Integration into the benchmarks is straightforward. I elected to put everything in a single file to minimize disruption. I'm trying an approach of using diskann_benchmark_runner::Input::from_raw to separate out the deserialization types from the actual inputs, allowing richer types (e.g., a full diskann_benchmark_core::streaming::bigann::RunBook) to be loaded.

Also note how relatively simple the streaming benchmark integration is. Since the new inmem provider supports ID translation and internal slot allocation, it does not need the same level of hand-holding that the current inmem provider needs.

@harsha-simhadri harsha-simhadri linked an issue Jun 30, 2026 that may be closed by this pull request
Comment thread Cargo.toml Outdated
hildebrandmw added a commit that referenced this pull request Jul 2, 2026
When working on #1206, I ran into a situation where I parsed and stored
the runbook during input validation rather than deferring until the
benchmark is actually run.

Making it cloneable allows reuse of such a pre-parsed runbook.
@hildebrandmw hildebrandmw changed the title from "inmem 2.0" to "In-Mem 2.0" Jul 2, 2026
@hildebrandmw hildebrandmw marked this pull request as ready for review July 2, 2026 22:45
@hildebrandmw hildebrandmw requested review from a team and Copilot July 2, 2026 22:45

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces “In-Mem 2.0” by adding a new diskann-inmem crate implementing a concurrent-safe in-memory provider (EBR + slot tags + blob store), along with integration/stress testing and benchmark-runner plumbing to exercise it and compare against checked-in baselines.

Changes:

  • Add new diskann-inmem crate (epoch registry, slot tags, store, ID translation, distance layers) plus unit + integration stress tests and baseline regression harness.
  • Integrate inmem2 into diskann-benchmark behind a new inmem2 feature and extend diskann-benchmark-runner formatting/UX helpers used by the integration binary.
  • Wire the new crate into the workspace (Cargo.toml, Cargo.lock) and CI feature matrix.

Reviewed changes

Copilot reviewed 48 out of 49 changed files in this pull request and generated 16 comments.

| File | Description |
| --- | --- |
| rfcs/01206-inmem2.md | New RFC documenting the concurrency motivation and EBR/tag protocol design. |
| diskann-vector/src/distance/implementations.rs | Adjusts SIMD dispatch hook to use run2_inline. |
| diskann-inmem/src/test/sequencer.rs | Adds a test sequencing helper (condvar + state machine). |
| diskann-inmem/src/test/mod.rs | Introduces diskann-inmem unit-test module structure. |
| diskann-inmem/src/test/epoch.rs | Adds directed stress tests for the epoch Registry. |
| diskann-inmem/src/tag.rs | Implements atomic slot tags + protocol documentation and tests. |
| diskann-inmem/src/sharded.rs | Adds sharded external↔internal ID mapping utility with tests. |
| diskann-inmem/src/num.rs | Adds Bytes/Align strong types with tests. |
| diskann-inmem/src/lib.rs | New crate root wiring modules, exports, and internal helper macro. |
| diskann-inmem/src/layers/mod.rs | Introduces distance-layer traits (Layer, Search, QueryDistance, etc.). |
| diskann-inmem/src/layers/full.rs | Implements full-precision distance layer + specialization and tests. |
| diskann-inmem/src/integration/store.rs | Public (integration-test-only) wrapper around internal store::Store. |
| diskann-inmem/src/integration/mod.rs | Integration-test-only module exports. |
| diskann-inmem/src/integration/counters.rs | Integration-test-only counter snapshot type for reporting. |
| diskann-inmem/src/freelist.rs | Adds a freelist helper for finding/recycling internal slot IDs (with tests). |
| diskann-inmem/src/counters.rs | Adds counters/no-op counters depending on integration-test feature. |
| diskann-inmem/src/buffer.rs | Adds raw buffer + RawSlice abstraction for miri-friendly storage (with tests). |
| diskann-inmem/integration/support/tolerance.rs | Defines an “empty tolerance” input type for runner regression checks. |
| diskann-inmem/integration/support/mod.rs | Integration support module wiring. |
| diskann-inmem/integration/support/io.rs | Adds dataset loading + conversion helper for integration tests. |
| diskann-inmem/integration/support/check.rs | Adds structured baseline mismatch aggregation/printing utilities. |
| diskann-inmem/integration/store.rs | Adds store concurrency stress benchmark binary implementation. |
| diskann-inmem/integration/main.rs | Adds integration-test runner binary entrypoint and regression harness. |
| diskann-inmem/integration/jsons/store-stress.json | Adds example stress-test job JSON. |
| diskann-inmem/integration/jsons/store-stress-test.json | Adds a smaller stress-test job JSON for CI/unit testing. |
| diskann-inmem/integration/jsons/integration.json | Adds integration-test job definitions (YFCC 10k A/B runs). |
| diskann-inmem/integration/jsons/integration-baseline.json | Adds checked-in baseline outputs for regression comparison. |
| diskann-inmem/integration/jsons/checks.json | Adds checker config using empty tolerances. |
| diskann-inmem/integration/index/tests.rs | Adds index-level insert + knn execution helpers and result DTOs. |
| diskann-inmem/integration/index/runner.rs | Adds runner glue: deserialize inputs, build provider, run regression checks. |
| diskann-inmem/integration/index/object.rs | Adds trait-object interface for an index + counter/metric DTOs. |
| diskann-inmem/integration/index/mod.rs | Wires index integration modules and registration. |
| diskann-inmem/DEV.md | Documents how to run full test suite with integration-test enabled. |
| diskann-inmem/Cargo.toml | New crate manifest, deps, bin target, and integration-test feature. |
| diskann-inmem/.clippy.toml | Clippy config allowing unwrap/expect/panic in tests for this crate. |
| diskann-benchmark/src/index/mod.rs | Registers new inmem2 benchmarks behind feature flag. |
| diskann-benchmark/Cargo.toml | Adds optional dependency and feature flag for diskann-inmem. |
| diskann-benchmark-runner/src/utils/fmt.rs | Adds KeyValue formatter utility and tests. |
| diskann-benchmark-runner/src/files.rs | Implements Display for InputFile to improve formatting. |
| diskann-benchmark-runner/src/app.rs | Minor formatting fix for mismatch reporting output. |
| Cargo.toml | Adds diskann-inmem to workspace + dependency table; adds profile.samply. |
| Cargo.lock | Locks new diskann-inmem package and dependencies into the workspace lockfile. |
| .github/workflows/ci.yml | Enables integration-test and inmem2 in CI feature set. |

Comment thread diskann-inmem/src/layers/full.rs
Comment thread diskann-inmem/src/layers/full.rs
Comment thread diskann-inmem/src/layers/mod.rs Outdated
Comment thread diskann-inmem/src/layers/mod.rs Outdated
Comment thread diskann-inmem/src/layers/mod.rs Outdated
Comment thread rfcs/01206-inmem2.md
Comment thread rfcs/01206-inmem2.md Outdated
Comment thread rfcs/01206-inmem2.md Outdated
Comment thread rfcs/01206-inmem2.md
Comment thread rfcs/01206-inmem2.md Outdated
@codecov-commenter

codecov-commenter commented Jul 2, 2026

Codecov Report

❌ Patch coverage is 95.31981% with 150 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.82%. Comparing base (ae25484) to head (a4ee3b2).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| diskann-inmem/src/provider.rs | 88.02% | 69 Missing ⚠️ |
| diskann-inmem/src/store.rs | 96.08% | 16 Missing ⚠️ |
| diskann-inmem/src/epoch.rs | 96.01% | 13 Missing ⚠️ |
| diskann-inmem/src/buffer.rs | 94.71% | 12 Missing ⚠️ |
| diskann-inmem/src/layers/full.rs | 96.25% | 11 Missing ⚠️ |
| diskann-inmem/src/test/epoch.rs | 94.81% | 10 Missing ⚠️ |
| diskann-inmem/src/test/sequencer.rs | 87.50% | 6 Missing ⚠️ |
| diskann-inmem/src/layers/mod.rs | 50.00% | 5 Missing ⚠️ |
| diskann-inmem/src/freelist.rs | 97.96% | 4 Missing ⚠️ |
| diskann-benchmark-runner/src/files.rs | 0.00% | 3 Missing ⚠️ |

... and 1 more
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1206      +/-   ##
==========================================
+ Coverage   89.64%   89.82%   +0.17%     
==========================================
  Files         503      518      +15     
  Lines       95761    98963    +3202     
==========================================
+ Hits        85848    88895    +3047     
- Misses       9913    10068     +155     
```

| Flag | Coverage Δ |
| --- | --- |
| miri | 89.82% <95.31%> (+0.17%) ⬆️ |
| unittests | 89.50% <95.31%> (+0.18%) ⬆️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| diskann-benchmark-runner/src/app.rs | 83.97% <100.00%> (ø) |
| diskann-benchmark-runner/src/utils/fmt.rs | 98.80% <100.00%> (+0.40%) ⬆️ |
| diskann-benchmark/src/index/mod.rs | 100.00% <ø> (ø) |
| diskann-inmem/src/counters.rs | 100.00% <100.00%> (ø) |
| diskann-inmem/src/lib.rs | 100.00% <100.00%> (ø) |
| diskann-inmem/src/neighbors.rs | 100.00% <100.00%> (ø) |
| diskann-inmem/src/num.rs | 100.00% <100.00%> (ø) |
| diskann-inmem/src/tag.rs | 100.00% <100.00%> (ø) |
| diskann-vector/src/distance/implementations.rs | 95.06% <100.00%> (ø) |
| diskann-inmem/src/sharded.rs | 99.46% <99.46%> (ø) |

... and 10 more

... and 2 files with indirect coverage changes

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

in-mem maintenance: Delete in-mem v1 after in-mem v2 lands.

4 participants