In-Mem 2.0 by hildebrandmw · Pull Request #1206 · microsoft/DiskANN

hildebrandmw · 2026-06-27T00:02:11Z

Introduce a second in-memory provider with the intention of replacing the current provider.

Why

The RFC outlines much of the motivation. In short, the goal here is to:

- Make the provider safe under concurrent inserts, searches, and deletes.
- Support proper external/internal ID translation.
- Improve test coverage.
- Do so with minimal performance overhead.

The concurrency argument comes from the epoch-based-reclamation (EBR) protection scheme for internal slots. See the RFC for more details.

Known Follow-up Items

- Perf Parity: This generally has pretty good performance, but needs a little more tuning to bring it fully on par with our current inmem index.
- Quantization: This initial prototype lacks quantization support. Adding quantization is relatively straightforward in the primary Store, but the reranking layer will need some thought. This shouldn't be an architectural blocker, though.
- Hybrid PQ: A larger open question is how to support the max_fp_vecs feature of our current PQ implementation, which reads some full-precision and some quantized vectors during prune. As with quantization in general, I think this is not a fundamental issue.
- Support for non-uniform sized items in slots: For this, I'm mainly thinking of multi-vectors, where the number of vectors within each multi-vector can vary. In the context of multi-vectors, fast element access is less important than for traditional vectors because distance computations take considerably longer.

Suggested Reviewing Order

The majority of this PR is in a new diskann-inmem crate. To facilitate testing, this crate has an "integration-test" feature, which enables the code in diskann-inmem/src/integration. This code is an unstable public reexport of internal types meant only for consumption in the diskann-inmem/integration integration test binary.

The integration test binary is powered by diskann-benchmark-runner.

`diskann-inmem`

Independent Low-Level Utilities

- num.rs: Strong type utilities for byte and alignment representations.
- buffer.rs: A miri-compliant version of AlignedMemoryVectorStore. This type allows vectors/neighbors to be stored in a single larger allocation. The use of RawSlice allows slots within the Buffer to be inspected and manipulated without forcing reference materialization (which is important to prevent aliasing).
- neighbors.rs: The new version of SimpleNeighborVectorProviderAsync. This reuses the sharded-lock idea, but provides additional utility, including the ability to perform read-modify-write operations on adjacency lists.
- counters.rs: Event counters. When the "integration-test" feature is not enabled, counters become a no-op. They are enabled for testing so that tests can monitor changes.
- sharded.rs: An external-to-internal ID translation utility. The main trick with this struct is to provide utilities like Sharded::occupied_entry, which locks and returns an external/internal mapping. The proxy Entry struct is important because it verifies that such a mapping exists and provides an infallible way of deleting the mapping. This is chained with higher-level operations (e.g. Provider::delete) to delete both the ID mapping and the internal data slot in lock-step.
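
To make the occupied-entry idea concrete, here is a minimal sketch. It is not the code in sharded.rs: the names ShardedMap and OccupiedEntry, the u64/u32 ID types, and the use of std Mutex and HashMap are all assumptions chosen for illustration. It only shows the shape of the trick: the entry holds the shard lock, proves the mapping exists, and can remove it without a fallible second lookup.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::{Mutex, MutexGuard};

/// Hypothetical sharded external -> internal ID map (illustration only).
pub struct ShardedMap {
    shards: Vec<Mutex<HashMap<u64, u32>>>,
}

/// Proof that a mapping exists. Holds the shard lock for as long as it lives.
pub struct OccupiedEntry<'a> {
    guard: MutexGuard<'a, HashMap<u64, u32>>,
    external: u64,
    internal: u32,
}

impl ShardedMap {
    pub fn new(num_shards: usize) -> Self {
        assert!(num_shards > 0);
        Self {
            shards: (0..num_shards).map(|_| Mutex::new(HashMap::new())).collect(),
        }
    }

    /// Pick the shard responsible for `external` by hashing the ID.
    fn shard(&self, external: u64) -> &Mutex<HashMap<u64, u32>> {
        let mut hasher = DefaultHasher::new();
        external.hash(&mut hasher);
        &self.shards[(hasher.finish() as usize) % self.shards.len()]
    }

    /// Insert a mapping. Returns false if `external` is already mapped.
    pub fn insert(&self, external: u64, internal: u32) -> bool {
        let mut guard = self.shard(external).lock().unwrap();
        if guard.contains_key(&external) {
            return false;
        }
        guard.insert(external, internal);
        true
    }

    /// Lock the shard and return an entry only if the mapping exists.
    pub fn occupied_entry(&self, external: u64) -> Option<OccupiedEntry<'_>> {
        let guard = self.shard(external).lock().unwrap();
        let internal = *guard.get(&external)?;
        Some(OccupiedEntry { guard, external, internal })
    }
}

impl OccupiedEntry<'_> {
    pub fn internal(&self) -> u32 {
        self.internal
    }

    /// Remove the mapping. The lock is still held, so the entry is known to exist.
    pub fn remove(mut self) -> u32 {
        self.guard.remove(&self.external);
        self.internal
    }
}
```

A caller can inspect `entry.internal()`, attempt the slot-level operation, and call `entry.remove()` only on success. Dropping the entry instead leaves the mapping untouched.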

Concurrency Protocol

The concurrency protocol is built upon three main layers:

- tag.rs: An atomic slot tag for controlling access to data.
- epoch.rs: The central registry where readers register and deregister. This is the crux of this PR and probably the most important file.
- store.rs: A binary blob store built on top of epoch.rs to provide the safe concurrent store for data. This provides the following operations:
  - Storage of binary data in "slots".
  - Reading of data in slots (provided by Reader).
  - Tracking of the valid/invalid state of slots.
  - Finding available slots into which new data can be inserted.
  - Safe retirement of slots and eventual reclamation.

The Store in store.rs has some help from freelist.rs to accelerate locating available slots internally.
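
For readers unfamiliar with epoch-based reclamation, the following is a simplified sketch of the general idea only. It is not the Registry in epoch.rs and it omits the slot tags entirely. The type names, the fixed reader table, the INACTIVE sentinel, and the use of SeqCst everywhere are assumptions made to keep the example small. It also assumes each reader index is used by at most one thread at a time and that a slot is made unreachable before retire is called.

```rust
use std::sync::atomic::{AtomicU64, Ordering::SeqCst};

/// Sentinel stored in a reader slot while that reader is not pinned.
const INACTIVE: u64 = u64::MAX;

/// Hypothetical epoch registry (illustration only).
pub struct Registry {
    global: AtomicU64,
    readers: Vec<AtomicU64>,
}

/// Keeps one reader pinned. Dropping the guard deregisters the reader.
pub struct Guard<'a> {
    slot: &'a AtomicU64,
}

impl Registry {
    pub fn new(max_readers: usize) -> Self {
        Self {
            global: AtomicU64::new(0),
            readers: (0..max_readers).map(|_| AtomicU64::new(INACTIVE)).collect(),
        }
    }

    /// Register reader `reader` at the current epoch.
    pub fn pin(&self, reader: usize) -> Guard<'_> {
        let slot = &self.readers[reader];
        loop {
            let epoch = self.global.load(SeqCst);
            slot.store(epoch, SeqCst);
            // Re-check: if the epoch advanced while we were publishing, a
            // reclaimer may have scanned before our store became visible.
            if self.global.load(SeqCst) == epoch {
                return Guard { slot };
            }
        }
    }

    /// Advance the epoch after unlinking something. Returns the epoch at
    /// which it was retired.
    pub fn retire(&self) -> u64 {
        self.global.fetch_add(1, SeqCst)
    }

    /// Something retired at `retired_at` may be reclaimed once every pinned
    /// reader registered at a strictly later epoch.
    pub fn can_reclaim(&self, retired_at: u64) -> bool {
        self.readers.iter().all(|r| r.load(SeqCst) > retired_at)
    }
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.slot.store(INACTIVE, SeqCst);
    }
}
```

A reader pinned before the retirement blocks reclamation until its guard is dropped, while readers that pin afterwards never delay it.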

Testing: epoch.rs has unit tests with injectable delays to set up known pathological orderings. The sequencing is helped by test/sequencer.rs. In addition, test/epoch.rs includes a direct stress test for the Registry. This is particularly helpful when run under Miri, which has the ability to detect some race conditions.

A larger concurrency stress test lives in the integration-test binary. This directly tests store.rs by spinning up readers, writers, and retirers that hammer a single Store. Data is written into and read from the store in a knowable pattern, allowing readers to detect torn reads, which would imply a race condition. For this PR, I ran the following stress test file:

{
  "search_directories": [
  ],
  "output_directory": null,
  "jobs": [
    {
      "type": "store-stress",
      "content": {
        "capacity": 8192,
        "duration_secs": 600,
        "entry_bytes": 256,
        "low_watermark": 4096,
        "max_ops": 50000000000,
        "readers": 32,
        "retirers": 16,
        "seed": 11935966405698895599,
        "writers": 32
      }
    }
  ]
}

with the command

cargo run --package diskann-inmem \
  --bin integration-test \
  --features integration-test \
  --release -- \
   run --input-file stress.json --output-file temp.json

The generated output was

readers:       32
writers:       32
retirers:      16
capacity:      8192
entry_bytes:   256
low_watermark: 4096
duration_secs: 600
max_ops:       50000000000
seed:          11935966405698895599
elapsed_secs:  600.001182962
reads:         110206469888
acquires_ok:   248086767
acquires_fail: 619145
retires_ok:    248082573
retires_fail:  243312644
reclaims:      108847566
transitions:   42006520
peak_live:     8131

While not a proof of correctness, this is a pretty decent stress test.
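
The "knowable pattern" used for torn-read detection can be illustrated with a small sketch. The functions fill and check below are hypothetical and the stress test may use a different pattern. The point is only that every byte of an entry is derived from a single stamp, so a reader that observes bytes from two different writes sees an inconsistency.

```rust
/// Fill `entry` so that every byte is determined by `stamp` and its offset.
pub fn fill(entry: &mut [u8], stamp: u8) {
    for (i, byte) in entry.iter_mut().enumerate() {
        *byte = stamp.wrapping_add(i as u8);
    }
}

/// Return the stamp if the whole entry is consistent with it.
/// Returns `None` for a torn (mixed) entry or an empty slice.
pub fn check(entry: &[u8]) -> Option<u8> {
    let stamp = *entry.first()?;
    entry
        .iter()
        .enumerate()
        .all(|(i, &byte)| byte == stamp.wrapping_add(i as u8))
        .then_some(stamp)
}
```

Overwriting only the first half of a filled entry with a different stamp simulates a torn read, and check reports it as `None`.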

Providers

The implementation of the data provider is split into two logical pieces. The first lives in layers/ and is focused on computing distances. With this approach, I am trying to avoid the need to replicate Accessor and Strategy implementations for each future quantization type. layers/full.rs is the full-precision implementation. One thing to note is the use of the FullPrecision marker trait, from which the implementation of the whole layers API is derived for layers::Full. This allows users to include just a T: FullPrecision trait bound and really simplifies the generics upstream.
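
The marker-trait pattern can be sketched as follows. Only the names FullPrecision and Full come from the description above. The Layer trait shown here, its single distance method, and the squared-L2 metric are stand-ins invented for the example and should not be read as the crate's API.

```rust
use std::marker::PhantomData;

/// Stand-in for the layers API (hypothetical, reduced to one method).
pub trait Layer {
    type Element;
    fn distance(&self, a: &[Self::Element], b: &[Self::Element]) -> f32;
}

/// Marker trait: element types that can be compared at full precision.
pub trait FullPrecision: Copy + Into<f32> {}
impl FullPrecision for f32 {}
impl FullPrecision for u8 {}
impl FullPrecision for i8 {}

/// Full-precision layer, generic over the element type.
pub struct Full<T>(PhantomData<T>);

impl<T> Full<T> {
    pub fn new() -> Self {
        Self(PhantomData)
    }
}

/// One blanket impl derives the whole layer API from the marker trait.
impl<T: FullPrecision> Layer for Full<T> {
    type Element = T;

    // Squared Euclidean distance, chosen only for illustration.
    fn distance(&self, a: &[T], b: &[T]) -> f32 {
        a.iter()
            .zip(b)
            .map(|(&x, &y)| {
                let (x, y): (f32, f32) = (x.into(), y.into());
                (x - y) * (x - y)
            })
            .sum()
    }
}

/// Upstream code needs only the single `T: FullPrecision` bound.
pub fn nearest<T: FullPrecision>(query: &[T], candidates: &[Vec<T>]) -> Option<usize> {
    let layer = Full::<T>::new();
    candidates
        .iter()
        .enumerate()
        .map(|(i, c)| (i, layer.distance(query, c)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

The function nearest never names Layer in its bounds, which is the simplification the marker trait buys.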

Within provider.rs, my goal is to minimize the use of generics as much as possible. In particular, for search, I use a trait object for ExpandBeam. When coupled with the layers::QueryDistance API, we can create implementations where the distance function is inlined and the number of prefetch instructions can be tailored to the data length. Fully optimizing this is still a work in progress.
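
As a rough sketch of how a trait object and an inlined distance function can coexist: the generic parameter lives on a private implementation type, and callers only ever see the trait object. The ExpandBeam signature, the Expander struct, and make_expander below are assumptions for illustration, and prefetching is not shown.

```rust
/// Hypothetical object-safe interface seen by the search loop.
pub trait ExpandBeam {
    fn expand(&self, query: &[f32], neighbors: &[u32], out: &mut Vec<(u32, f32)>);
}

/// Private implementation, monomorphized per distance function `F`.
struct Expander<'a, F> {
    data: &'a [f32],
    dim: usize,
    distance: F,
}

impl<F: Fn(&[f32], &[f32]) -> f32> ExpandBeam for Expander<'_, F> {
    fn expand(&self, query: &[f32], neighbors: &[u32], out: &mut Vec<(u32, f32)>) {
        for &id in neighbors {
            let start = id as usize * self.dim;
            let vector = &self.data[start..start + self.dim];
            // `F` is a concrete type here, so this call can be inlined.
            out.push((id, (self.distance)(query, vector)));
        }
    }
}

#[inline]
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// One dynamic dispatch per `expand` call; the inner loop is static.
pub fn make_expander<'a>(data: &'a [f32], dim: usize) -> Box<dyn ExpandBeam + 'a> {
    Box::new(Expander { data, dim, distance: squared_l2 })
}
```

The cost of dynamic dispatch is paid once per beam expansion rather than once per distance computation, which is the trade-off this kind of design aims for.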

Another thing to call out in provider.rs is the care needed for data insertion and deletion. Since external/internal ID translation is supported, we need to ensure that the translation table stays in sync with the internal store. On insert, if we allocate an internal slot only to find that the external ID already exists, we need to abort the operation rather than publish the internal slot. Similarly, on delete, we first need to establish whether the external/internal ID mapping exists. If so, we can try to retire the slot. If slot retirement fails (it shouldn't, but bugs can happen), we must not commit the ID mapping deletion and must instead return an error.
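
The ordering constraints above can be sketched in a single-threaded toy. The Provider, Error, and slot representation below are hypothetical and far simpler than provider.rs (no locking, no epochs). The sketch only shows when the slot and the ID mapping are each committed.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Error {
    DuplicateId,
    UnknownId,
    Full,
    RetireFailed,
}

/// Toy provider: an ID table plus a fixed number of data slots.
pub struct Provider {
    ids: HashMap<u64, usize>,
    slots: Vec<Option<Vec<f32>>>,
}

impl Provider {
    pub fn new(capacity: usize) -> Self {
        Self { ids: HashMap::new(), slots: vec![None; capacity] }
    }

    pub fn insert(&mut self, external: u64, data: Vec<f32>) -> Result<usize, Error> {
        // Step 1: find an internal slot, but do not publish anything yet.
        let slot = self.slots.iter().position(Option::is_none).ok_or(Error::Full)?;
        // Step 2: publish the slot only if the external ID is new.
        match self.ids.entry(external) {
            // Abort: the slot was never written, so it stays available.
            Entry::Occupied(_) => Err(Error::DuplicateId),
            Entry::Vacant(vacant) => {
                self.slots[slot] = Some(data);
                vacant.insert(slot);
                Ok(slot)
            }
        }
    }

    pub fn delete(&mut self, external: u64) -> Result<(), Error> {
        // Step 1: establish that the mapping exists.
        let entry = match self.ids.entry(external) {
            Entry::Occupied(entry) => entry,
            Entry::Vacant(_) => return Err(Error::UnknownId),
        };
        // Step 2: retire the slot. On failure, the mapping is left in place.
        let slot = *entry.get();
        if self.slots[slot].take().is_none() {
            return Err(Error::RetireFailed);
        }
        // Step 3: only now commit the deletion of the mapping.
        entry.remove();
        Ok(())
    }
}
```

In both paths the ID table is modified last, so a failure at any earlier step leaves the table and the store consistent with each other.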

Testing

Like the concurrency stress tests, provider testing uses the integration-test binary. Here, the 10k YFCC dataset is used for non-trivial runs. Using the A/B functionality in diskann-benchmark-runner, we can compare against checked-in baselines. This allows us to capture rich metrics for recall, number of operations, etc., and to update the baselines easily. The main logic for checking and reporting baseline mismatches is in integration/support/check.rs. The goal is to collect all such mismatches into a single summary so that the report carries the highest signal possible.
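
The "collect every mismatch, then report" approach might look roughly like the sketch below. It does not reflect the contents of integration/support/check.rs. The Mismatch struct, the flat metric map, and the exact-equality comparison (in the spirit of an empty tolerance) are assumptions for illustration.

```rust
use std::collections::BTreeMap;

/// One metric that differs from the checked-in baseline (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Mismatch {
    pub metric: String,
    pub expected: f64,
    /// `None` means the metric was missing from the run entirely.
    pub actual: Option<f64>,
}

/// Compare a run against its baseline and return every mismatch,
/// rather than stopping at the first one.
pub fn check(baseline: &BTreeMap<String, f64>, run: &BTreeMap<String, f64>) -> Vec<Mismatch> {
    baseline
        .iter()
        .filter_map(|(metric, &expected)| {
            let actual = run.get(metric).copied();
            // Exact comparison: no tolerance is applied in this sketch.
            (actual != Some(expected)).then(|| Mismatch {
                metric: metric.clone(),
                expected,
                actual,
            })
        })
        .collect()
}
```

Returning the full list lets a caller print one consolidated report per job instead of failing on the first differing metric.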

`diskann-benchmark`

Integration into the benchmarks is straightforward. I elected to put everything in a single file to minimize disruption. I'm trying an approach of using diskann_benchmark_runner::Input::from_raw to separate out the deserialization types from the actual inputs, allowing richer types (e.g., a full diskann_benchmark_core::streaming::bigann::RunBook) to be loaded.

Also note how relatively simple the streaming benchmark integration is. Since the new inmem provider supports ID translation and internal slot allocation, it does not need the same level of hand-holding that the current inmem provider needs.

When working on #1206, I ran into a situation where I parsed and stored the runbook during input validation rather than deferring parsing until the benchmark is actually run. Making the runbook cloneable allows reuse of such a pre-parsed runbook.

Copilot

Pull request overview

This PR introduces “In-Mem 2.0” by adding a new diskann-inmem crate implementing a concurrent-safe in-memory provider (EBR + slot tags + blob store), along with integration/stress testing and benchmark-runner plumbing to exercise it and compare against checked-in baselines.

Changes:

Add new diskann-inmem crate (epoch registry, slot tags, store, ID translation, distance layers) plus unit + integration stress tests and baseline regression harness.
Integrate inmem2 into diskann-benchmark behind a new inmem2 feature and extend diskann-benchmark-runner formatting/UX helpers used by the integration binary.
Wire the new crate into the workspace (Cargo.toml, Cargo.lock) and CI feature matrix.

Reviewed changes

Copilot reviewed 48 out of 49 changed files in this pull request and generated 16 comments.


File	Description
rfcs/01206-inmem2.md	New RFC documenting the concurrency motivation and EBR/tag protocol design.
diskann-vector/src/distance/implementations.rs	Adjusts SIMD dispatch hook to use `run2_inline`.
diskann-inmem/src/test/sequencer.rs	Adds a test sequencing helper (condvar + state machine).
diskann-inmem/src/test/mod.rs	Introduces `diskann-inmem` unit-test module structure.
diskann-inmem/src/test/epoch.rs	Adds directed stress tests for the epoch `Registry`.
diskann-inmem/src/tag.rs	Implements atomic slot tags + protocol documentation and tests.
diskann-inmem/src/sharded.rs	Adds sharded external↔internal ID mapping utility with tests.
diskann-inmem/src/num.rs	Adds `Bytes`/`Align` strong types with tests.
diskann-inmem/src/lib.rs	New crate root wiring modules, exports, and internal helper macro.
diskann-inmem/src/layers/mod.rs	Introduces distance-layer traits (`Layer`, `Search`, `QueryDistance`, etc.).
diskann-inmem/src/layers/full.rs	Implements full-precision distance layer + specialization and tests.
diskann-inmem/src/integration/store.rs	Public (integration-test-only) wrapper around internal `store::Store`.
diskann-inmem/src/integration/mod.rs	Integration-test-only module exports.
diskann-inmem/src/integration/counters.rs	Integration-test-only counter snapshot type for reporting.
diskann-inmem/src/freelist.rs	Adds a freelist helper for finding/recycling internal slot IDs (with tests).
diskann-inmem/src/counters.rs	Adds counters/no-op counters depending on `integration-test` feature.
diskann-inmem/src/buffer.rs	Adds raw buffer + `RawSlice` abstraction for miri-friendly storage (with tests).
diskann-inmem/integration/support/tolerance.rs	Defines an “empty tolerance” input type for runner regression checks.
diskann-inmem/integration/support/mod.rs	Integration support module wiring.
diskann-inmem/integration/support/io.rs	Adds dataset loading + conversion helper for integration tests.
diskann-inmem/integration/support/check.rs	Adds structured baseline mismatch aggregation/printing utilities.
diskann-inmem/integration/store.rs	Adds store concurrency stress benchmark binary implementation.
diskann-inmem/integration/main.rs	Adds integration-test runner binary entrypoint and regression harness.
diskann-inmem/integration/jsons/store-stress.json	Adds example stress-test job JSON.
diskann-inmem/integration/jsons/store-stress-test.json	Adds a smaller stress-test job JSON for CI/unit testing.
diskann-inmem/integration/jsons/integration.json	Adds integration-test job definitions (YFCC 10k A/B runs).
diskann-inmem/integration/jsons/integration-baseline.json	Adds checked-in baseline outputs for regression comparison.
diskann-inmem/integration/jsons/checks.json	Adds checker config using empty tolerances.
diskann-inmem/integration/index/tests.rs	Adds index-level insert + knn execution helpers and result DTOs.
diskann-inmem/integration/index/runner.rs	Adds runner glue: deserialize inputs, build provider, run regression checks.
diskann-inmem/integration/index/object.rs	Adds trait-object interface for an index + counter/metric DTOs.
diskann-inmem/integration/index/mod.rs	Wires index integration modules and registration.
diskann-inmem/DEV.md	Documents how to run full test suite with `integration-test` enabled.
diskann-inmem/Cargo.toml	New crate manifest, deps, bin target, and `integration-test` feature.
diskann-inmem/.clippy.toml	Clippy config allowing unwrap/expect/panic in tests for this crate.
diskann-benchmark/src/index/mod.rs	Registers new `inmem2` benchmarks behind feature flag.
diskann-benchmark/Cargo.toml	Adds optional dependency and feature flag for `diskann-inmem`.
diskann-benchmark-runner/src/utils/fmt.rs	Adds `KeyValue` formatter utility and tests.
diskann-benchmark-runner/src/files.rs	Implements `Display` for `InputFile` to improve formatting.
diskann-benchmark-runner/src/app.rs	Minor formatting fix for mismatch reporting output.
Cargo.toml	Adds `diskann-inmem` to workspace + dependency table; adds `profile.samply`.
Cargo.lock	Locks new `diskann-inmem` package and dependencies into the workspace lockfile.
.github/workflows/ci.yml	Enables `integration-test` and `inmem2` in CI feature set.


codecov-commenter · 2026-07-02T23:00:26Z

Codecov Report

❌ Patch coverage is 95.31981% with 150 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.82%. Comparing base (ae25484) to head (a4ee3b2).

Files with missing lines	Patch %	Lines
diskann-inmem/src/provider.rs	88.02%	69 Missing ⚠️
diskann-inmem/src/store.rs	96.08%	16 Missing ⚠️
diskann-inmem/src/epoch.rs	96.01%	13 Missing ⚠️
diskann-inmem/src/buffer.rs	94.71%	12 Missing ⚠️
diskann-inmem/src/layers/full.rs	96.25%	11 Missing ⚠️
diskann-inmem/src/test/epoch.rs	94.81%	10 Missing ⚠️
diskann-inmem/src/test/sequencer.rs	87.50%	6 Missing ⚠️
diskann-inmem/src/layers/mod.rs	50.00%	5 Missing ⚠️
diskann-inmem/src/freelist.rs	97.96%	4 Missing ⚠️
diskann-benchmark-runner/src/files.rs	0.00%	3 Missing ⚠️
... and 1 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1206      +/-   ##
==========================================
+ Coverage   89.64%   89.82%   +0.17%     
==========================================
  Files         503      518      +15     
  Lines       95761    98963    +3202     
==========================================
+ Hits        85848    88895    +3047     
- Misses       9913    10068     +155

Flag	Coverage Δ
miri	`89.82% <95.31%> (+0.17%)`	⬆️
unittests	`89.50% <95.31%> (+0.18%)`	⬆️


Files with missing lines	Coverage Δ
diskann-benchmark-runner/src/app.rs	`83.97% <100.00%> (ø)`
diskann-benchmark-runner/src/utils/fmt.rs	`98.80% <100.00%> (+0.40%)`	⬆️
diskann-benchmark/src/index/mod.rs	`100.00% <ø> (ø)`
diskann-inmem/src/counters.rs	`100.00% <100.00%> (ø)`
diskann-inmem/src/lib.rs	`100.00% <100.00%> (ø)`
diskann-inmem/src/neighbors.rs	`100.00% <100.00%> (ø)`
diskann-inmem/src/num.rs	`100.00% <100.00%> (ø)`
diskann-inmem/src/tag.rs	`100.00% <100.00%> (ø)`
diskann-vector/src/distance/implementations.rs	`95.06% <100.00%> (ø)`
diskann-inmem/src/sharded.rs	`99.46% <99.46%> (ø)`
... and 10 more

... and 2 files with indirect coverage changes


Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Mark Hildebrand added 30 commits June 17, 2026 17:08

This is a lot of work.

a9c5b50

Checkpoint.

5ecae48

Checkpoint.

1a367cb

More stuff.

3203d32

Checkopint.

6ab25c0

Checkpoint.

f60cfe1

Checkpoint.

8782b6b

Checkpoint.

ab22b90

ID translation.

b3a2a2a

Streaming works (kinda).

1f9f20e

Checkpoint.

c4e20b3

Getting closer.

2f03e0d

Checkpoint.

89209ef

Reorganize.

70f03b9

Fix up compile errors.

02cb059

Checkpoint.

340d3a5

Checkpoing before vibes.

135e743

Checkpoint.

690d1de

Checkpoint.

969156c

Checkpoint.

a1b2319

Checkpoint.

b52d9aa

More progress.

5efb805

Unit testing is almost there ...

d5a24f4

The end is in sight.

b822031

Checkpoint.

e030eec

Test coverage looking good.

6170d3b

Clippy!.

f1d6dfd

Clippy!

2d022db

Merge remote-tracking branch 'origin/main' into mhildebr/inmem2

239d36e

We're getting there!

366aee2

Mark Hildebrand added 2 commits June 26, 2026 16:57

Checkpoint.

03ac022

Move machines.

4207c1d

harsha-simhadri linked an issue Jun 30, 2026 that may be closed by this pull request

in-mem maintenance: Delete in-mem v1 after in-mem v2 lands. #712

Open

Mark Hildebrand added 4 commits June 30, 2026 17:22

Checkpoint.

23e386a

Minor tweaks.

af57bb6

Add RFC.

b91419b

Upload RFC.

9d5f143

hildebrandmw mentioned this pull request Jul 2, 2026

Add Clone to Runbook. #1226

Merged

JordanMaples reviewed Jul 2, 2026


Comment thread Cargo.toml Outdated

Mark Hildebrand added 9 commits July 2, 2026 13:54

Restore stress tests.

9a10379

Merge remote-tracking branch 'origin/main' into mhildebr/inmem2

5971698

More cleanups.

f1a7121

Add tests for KeyValue.

b3a00b7

Tests for KeyValue.

ab92dae

Merge remote-tracking branch 'origin/main' into mhildebr/inmem2

159bea8

Cleanup and reduce false-sharing in stress test.

3b59cf1

Fix-up Cargo.toml

2678f36

Make compatible with Aarch64.

db9d1b9

hildebrandmw changed the title ~~inmem 2.0~~ In-Mem 2.0 Jul 2, 2026

Mark Hildebrand added 2 commits July 2, 2026 15:43

Not syncing with git?

43d3fcf

Remove extra whitespace.

ce7a321

hildebrandmw marked this pull request as ready for review July 2, 2026 22:45

hildebrandmw requested review from a team and Copilot July 2, 2026 22:45

Copilot started reviewing on behalf of hildebrandmw July 2, 2026 22:45 View session

Copilot AI reviewed Jul 2, 2026


Typos.

a4ee3b2

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

In-Mem 2.0 #1206
hildebrandmw wants to merge 48 commits into main from mhildebr/inmem2
4 participants