
Switch to device permute in Vamana #2214

Open
lowener wants to merge 4 commits into rapidsai:main from lowener:26.08-vamana-permute

Conversation

@lowener
Contributor

@lowener lowener commented Jun 3, 2026

No description provided.

Signed-off-by: Mickael Ide <mide@nvidia.com>
@lowener lowener self-assigned this Jun 3, 2026
@lowener lowener requested a review from a team as a code owner June 3, 2026 16:29
@lowener lowener added the improvement (Improves an existing functionality), C++, and non-breaking (Introduces a non-breaking change) labels Jun 3, 2026
@lowener lowener moved this to In Progress in Unstructured Data Processing Jun 3, 2026
@coderabbitai

coderabbitai Bot commented Jun 3, 2026


📝 Walkthrough

Summary by CodeRabbit

  • Refactor
    • Randomized insertion order is now generated on the GPU and batched insert operations avoid host/device round-trips, improving index-building performance, memory efficiency, and scalability for large datasets.

Walkthrough

Vamana insert-order permutation generation moved from CPU to GPU: create_insert_permutation now accepts raft::resources, a device vector view and optional seed, generates keys on-device and sorts with Thrust to produce a device-resident permutation; batched_insert_vamana uses that device permutation and device-to-device copies during batch inserts.

Changes

Vamana Permutation Migration to GPU

| Layer / File(s) | Summary |
|---|---|
| GPU permutation function and dependencies (cpp/src/neighbors/detail/vamana/vamana_build.cuh) | Headers added for Thrust policy and GPU RNG. create_insert_permutation reimplemented to initialize a device index sequence, generate random float keys on-device (optional seed) via raft::random::uniform, and compute the permutation by thrust::sort_by_key into the provided raft::device_vector_view; signature changed to (raft::resources const&, raft::device_vector_view<IdxT, uint32_t>, uint64_t seed). |
| Batched insertion integration and batch loop data flow (cpp/src/neighbors/detail/vamana/vamana_build.cuh) | batched_insert_vamana now allocates insert_order as a device vector and populates it via create_insert_permutation(res, insert_order.view()). Per-batch query_ids staging copies slices directly from device insert_order into device query_ids using raft::copy (device-to-device), replacing the previous host-mediated slice copy. |
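
For readers unfamiliar with the random-key shuffle summarized above, here is a minimal host-side sketch of the same idea. It is not the PR's code: `random_key_permutation` is a hypothetical name, `std::mt19937_64` stands in for the RAFT generator, and `std::sort` with a key comparator stands in for `thrust::sort_by_key`. It only illustrates why sorting an index sequence by random keys yields a permutation.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical host-side stand-in for the device routine: returns a random
// permutation of [0, n) by sorting indices by random float keys.
std::vector<uint32_t> random_key_permutation(uint32_t n, uint64_t seed)
{
  std::vector<uint32_t> order(n);
  std::iota(order.begin(), order.end(), 0u);  // index sequence 0..n-1

  // One random float key per index (CPU generator used here for illustration).
  std::mt19937_64 rng(seed);
  std::uniform_real_distribution<float> dist(0.0f, 1.0f);
  std::vector<float> keys(n);
  for (auto& k : keys) { k = dist(rng); }

  // Sort the indices by their keys; indices with equal keys end up in an
  // unspecified relative order, but the result is still a permutation.
  std::sort(order.begin(), order.end(),
            [&keys](uint32_t a, uint32_t b) { return keys[a] < keys[b]; });
  return order;
}
```

Every index appears exactly once in the output regardless of the key values, so key collisions affect only how uniform the shuffle is, not its validity.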

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | No pull request description was provided by the author, making it impossible to evaluate whether it is related to the changeset. | Add a pull request description explaining the motivation, implementation details, and benefits of switching to device-side permutation in Vamana. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Switch to device permute in Vamana' directly describes the main change: moving the permutation operation from host to device in the Vamana algorithm implementation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/src/neighbors/detail/vamana/vamana_build.cuh`:
- Line 62: The fallback seed uses non-thread-safe std::rand() when constructing
raft::random::RngState (the expression raft::random::RngState rng(seed != 0 ?
seed : static_cast<uint64_t>(std::rand()))); replace the std::rand() call with a
thread-safe source (for example, use std::random_device() or a time-based value
such as std::chrono::high_resolution_clock::now().time_since_epoch().count()) so
the fallback expression becomes thread-safe; ensure the replacement is cast to
uint64_t and preserves the existing conditional logic that uses seed when
non-zero.
- Around line 52-55: The function signature for create_insert_permutation has a
syntax error: the parameter list is missing a comma between the insert_order and
seed parameters. Edit the create_insert_permutation declaration to insert a
comma between raft::device_vector_view<IdxT, uint32_t> insert_order and uint64_t
seed = 0 so the compiler sees two distinct parameters (reference the
create_insert_permutation function name and the insert_order and seed parameter
names).
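
One way the seed fallback from the first comment could look is sketched below. `resolve_seed` is a hypothetical helper, not something in the PR or in RAFT; it keeps the "use the seed when non-zero" rule and swaps `std::rand()` for a locally constructed `std::random_device`. Note that `std::rand()` without a prior `std::srand()` call behaves as if seeded with 1, so the original fallback would also produce the same value at the same point in every run.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper (not part of the PR): keep a caller-supplied non-zero
// seed, otherwise draw one from std::random_device instead of std::rand().
inline uint64_t resolve_seed(uint64_t seed)
{
  if (seed != 0) { return seed; }
  std::random_device rd;  // local object, so no shared global RNG state
  // random_device returns unsigned int values; combine two draws into 64 bits.
  const uint64_t hi = static_cast<uint64_t>(rd());
  const uint64_t lo = static_cast<uint64_t>(rd());
  return (hi << 32) | lo;
}
```

The result would then be passed to the generator state in place of the `std::rand()` expression, for example `raft::random::RngState rng(resolve_seed(seed))`, assuming the constructor usage shown in the comment above.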

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: bcebf6b2-2208-4281-aa12-ab0fc4936104

📥 Commits

Reviewing files that changed from the base of the PR and between 0c3d007 and 166309a.

📒 Files selected for processing (1)
  • cpp/src/neighbors/detail/vamana/vamana_build.cuh

Comment thread cpp/src/neighbors/detail/vamana/vamana_build.cuh Outdated
Comment thread cpp/src/neighbors/detail/vamana/vamana_build.cuh Outdated
lowener added 2 commits June 3, 2026 19:11
Signed-off-by: Mickael Ide <mide@nvidia.com>
Signed-off-by: Mickael Ide <mide@nvidia.com>

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/neighbors/detail/vamana/vamana_build.cuh (1)

185-187: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pass a real seed here, or drop the third argument.

Line 187 is still passing N positionally, but create_insert_permutation now derives the length from insert_order and treats the third parameter as seed. That means same-sized builds get the same insertion order, which is almost certainly not what this refactor intended.

Suggested fix
-  create_insert_permutation(res, insert_order.view(), static_cast<uint32_t>(N));
+  create_insert_permutation(res, insert_order.view());

If you want reproducible-but-configurable behavior, thread an actual seed from index_params instead of reusing N.
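
The pitfall is easy to reproduce on the host. The sketch below uses hypothetical stand-ins (`fill_insert_order`, `build_order_with_size_as_seed`) that only mirror the shape of the new signature: the length comes from the output buffer and the trailing argument is a seed. `std::shuffle` replaces the device key sort purely for brevity.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in with the same shape as the new signature: the length
// is taken from the output buffer and the trailing argument is a seed.
void fill_insert_order(std::vector<uint32_t>& order, uint64_t seed = 0)
{
  std::iota(order.begin(), order.end(), 0u);
  std::mt19937_64 rng(seed);
  std::shuffle(order.begin(), order.end(), rng);
}

// Mimics the stale call site: the dataset size is passed where the seed goes.
std::vector<uint32_t> build_order_with_size_as_seed(uint32_t n)
{
  std::vector<uint32_t> order(n);
  fill_insert_order(order, static_cast<uint32_t>(n));  // compiles, but n is the seed
  return order;
}
```

Two calls to `build_order_with_size_as_seed(1000)` return the same order, because the integer converts silently to the seed type and nothing in the call site signals the mistake.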

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/neighbors/detail/vamana/vamana_build.cuh` around lines 185 - 187, the
call that creates the insertion permutation passes N as the third (seed)
argument to create_insert_permutation, so equal-sized builds get identical
seeds. Change the call in the block that creates insert_order so it either omits
the third argument, as in create_insert_permutation(res, insert_order.view()),
or passes a real seed value (e.g., thread a seed from index_params or another
config field), as in create_insert_permutation(res, insert_order.view(), seed).
Ensure insert_order is still allocated with
raft::make_device_vector<IdxT, uint32_t>(res, N) so its length matches the
dataset size.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: def4ad9d-bf5e-4b72-94cc-3691d1698dfe

📥 Commits

Reviewing files that changed from the base of the PR and between 3357349 and 457f4c4.

📒 Files selected for processing (1)
  • cpp/src/neighbors/detail/vamana/vamana_build.cuh

Signed-off-by: Mickael Ide <mide@nvidia.com>

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/neighbors/detail/vamana/vamana_build.cuh (1)

52-66: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

create_insert_permutation now always uses seed = 0, making insert order deterministic across builds

create_insert_permutation defaults uint64_t seed = 0 and initializes raft::random::RngState rng(seed), while batched_insert_vamana calls it as create_insert_permutation(res, insert_order.view()) (no seed override). Please confirm this deterministic behavior is intentional; if not, plumb a seed through the API or use a non-deterministic (but thread-safe) seed source.

Secondary note: the permutation keys are float in [0, 1]; for very large N, collisions can reduce randomness/uniformity (still yields a valid permutation).
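
To put a rough number on the collision concern, a birthday-style estimate is enough. The sketch below is an illustration only: `expected_colliding_pairs` is a hypothetical helper, and the key-space size `m` is an assumption. Treating the float keys as roughly 2^24 equally likely values is a guess at their granularity; the actual RAFT generator may produce more or fewer distinct values.

```cpp
#include <cstdint>

// Birthday-style estimate: expected number of colliding key pairs when n keys
// are drawn independently and uniformly from m equally likely values.
// Both the uniformity and the value of m are assumptions of this sketch.
double expected_colliding_pairs(uint64_t n, double m)
{
  const double nd = static_cast<double>(n);
  return nd * (nd - 1.0) / (2.0 * m);
}
```

Under that assumption, one million keys over 2^24 values give about 29,800 expected colliding pairs, while the same count over 2^64 values gives effectively none. That is the motivation for wider integer keys, although the sort yields a valid permutation in either case.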

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/neighbors/detail/vamana/vamana_build.cuh` around lines 52 - 66,
create_insert_permutation defaults to seed = 0, which makes insert orders
deterministic. If that is not intended, keep the existing seed parameter and
change callers such as batched_insert_vamana to pass a configured or
runtime-generated seed from a thread-safe source. Optionally replace the float
keys with a wider integer key (e.g., uint64_t) generated from
raft::random::RngState to reduce collisions for large N, and ensure
thrust::sort_by_key uses the new key type when shuffling insert_order.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4df9963e-5c27-4817-b9dd-a13e9ad34679

📥 Commits

Reviewing files that changed from the base of the PR and between 457f4c4 and e8e1904.

📒 Files selected for processing (1)
  • cpp/src/neighbors/detail/vamana/vamana_build.cuh
