Skip to content

bftree lazy neighbor init#1215

Open
JordanMaples wants to merge 2 commits into
mainfrom
jordanmaples/bftree_lazy_neighbor_init
Open

bftree lazy neighbor init#1215
JordanMaples wants to merge 2 commits into
mainfrom
jordanmaples/bftree_lazy_neighbor_init

Conversation

@JordanMaples

Copy link
Copy Markdown
Contributor

What

Make bf-tree neighbor lists initialize lazily instead of eagerly materializing
an empty list for every id at construction.

Two changes in diskann-bftree:

  1. neighbors.rs — On neighbor read, treat a bf-tree NotFound result as an
    empty adjacency list rather than an error. (Deleted / InvalidKey still error.)
  2. provider.rs — Drop the O(max_points) eager loop in new_empty that
    wrote an empty list for every id up front.

Why

The eager loop was a stop-gap for a missing "exists" query. But bf-tree already
distinguishes Found / NotFound / Deleted natively, so mapping NotFound to
an empty list (identical to the existing Found(0) case) lets neighbor lists be
created on first write.

This makes construction O(num_start_points) instead of O(max_points),
eliminating a billion-insert startup wall for large, larger-than-memory datasets,
and removes the artificial coupling that required capacity to be fully
materialized up front.

Correctness

The consolidation paths that motivated the stop-gap (consolidate_vector /
on_neighbors) read neighbor lists during delete; a NotFound→empty result is
identical to the eager-init outcome for an unwritten id. These remain covered by
the delete-and-search tests, which exercise inplace_delete end to end.

JordanMaples and others added 2 commits June 30, 2026 10:30
Treat a bf-tree `NotFound` on neighbor read as an empty adjacency list
rather than an error, and drop the O(max_points) eager initialization
that wrote an empty list for every id at construction.

The eager loop was a stop-gap for a missing "exists" query, but bf-tree
already distinguishes `Found` / `NotFound` / `Deleted` natively. Mapping
`NotFound` to an empty list (identical to the existing `Found(0)` case)
lets neighbor lists be created lazily on first write. This makes
construction O(num_start_points) instead of O(max_points) — eliminating
a billion-insert startup wall for large, larger-than-memory datasets —
and removes the artificial coupling that required capacity to be fully
materialized up front.

The consolidate paths that motivated the stop-gap remain covered by the
delete-and-search tests, which exercise inplace_delete end to end.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add coverage for the NotFound -> empty mapping introduced by lazy
neighbor-list initialization: a serial sweep over a wide range of
never-written ids (including sparse written ids and ids beyond any
write) and a concurrent multi-thread hammer. Both assert that absent
ids yield an empty adjacency list with no errors or panics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@JordanMaples JordanMaples requested review from a team and Copilot June 30, 2026 21:28
@JordanMaples JordanMaples changed the title Jordanmaples/bftree lazy neighbor init bftree lazy neighbor init Jun 30, 2026
@codecov-commenter

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.82%. Comparing base (7718073) to head (ffa53a9).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1215      +/-   ##
==========================================
- Coverage   90.85%   89.82%   -1.04%     
==========================================
  Files         489      489              
  Lines       93575    93618      +43     
==========================================
- Hits        85020    84093     -927     
- Misses       8555     9525     +970     
Flag Coverage Δ
miri 89.82% <100.00%> (-1.04%) ⬇️
unittests 89.48% <100.00%> (-1.34%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
diskann-bftree/src/neighbors.rs 96.95% <100.00%> (+0.27%) ⬆️
diskann-bftree/src/provider.rs 90.43% <100.00%> (-0.03%) ⬇️

... and 40 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@harsha-simhadri

Copy link
Copy Markdown
Contributor

nice! can you post perf and recall results for this change?

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Comment on lines +87 to +90
/// `neighbors` is cleared first upon each invocation. A vector with no
/// stored neighbor list yet (bf-tree `NotFound`) yields an empty list rather
/// than an error, so neighbor lists are created lazily on first write.
/// One data copy is involved which copies the data from bf-tree to `neighbors`
Comment on lines +488 to +498
// Sweep a wide id range, including the written ids and the gaps between
// and beyond them. The buffer is reused across calls to confirm it is
// cleared on every read.
for id in 0u32..200_000 {
neighbor_provider.get_neighbors(id, &mut result).unwrap();
if written.contains(&id) {
assert_eq!(&[1, 2, 3], &*result, "written id {id} lost its list");
} else {
assert!(result.is_empty(), "unwritten id {id} should be empty");
}
}
Comment on lines +520 to +525
// Overlapping id ranges across threads to maximize contention on
// the shared bf-tree read path for absent keys.
for id in (t * 1_000)..(t * 1_000 + 10_000) {
neighbor_provider.get_neighbors(id, &mut result).unwrap();
assert!(result.is_empty());
}
@JordanMaples

Copy link
Copy Markdown
Contributor Author

@harsha-simhadri here's a copilot summary of 3 A/B runs for this lazy init change.

Ran the wikipedia-100K streaming benchmark (full-precision bf-tree, 100 insert+search stages, 100K vectors, L=200, k=100, 8 threads), 3 trials per branch, all release builds.

┌──────────────────────────────────┬──────────────┬──────────────┬────────┐
│ Metric (mean of 3, steady-state) │ A: lazy-init │ B: baseline  │ Δ      │
├──────────────────────────────────┼──────────────┼──────────────┼────────┤
│ Recall (all stages)              │ 0.9765       │ 0.9765       │ −0.00% │
├──────────────────────────────────┼──────────────┼──────────────┼────────┤
│ Recall (steady-state)            │ 0.9712       │ 0.9712       │ −0.00% │
├──────────────────────────────────┼──────────────┼──────────────┼────────┤
│ QPS (steady-state)               │ 703 ±46      │ 730 ±56      │ −3.7%  │
├──────────────────────────────────┼──────────────┼──────────────┼────────┤
│ Insert/op                        │ 537 ±36 ms   │ 523 ±31 ms   │ +2.8%  │
├──────────────────────────────────┼──────────────┼──────────────┼────────┤
│ Insert p99 latency               │ 7353 ±442 µs │ 7055 ±252 µs │ +4.2%  │
└──────────────────────────────────┴──────────────┴──────────────┴────────┘

Conclusion: Lazy neighbor init is result-neutral — recall is identical to baseline on every metric. The ~3–4% timing lean toward baseline is within run-to-run noise: per-trial ranges overlap on all timing metrics (e.g. QPS: A-best 718 vs B-worst 698; insert/op: A-best 521ms vs B-worst 540ms), and the trial spread is as large as the mean difference. No statistically confident perf regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants