-
Notifications
You must be signed in to change notification settings - Fork 376
Add RFC Process and RFC 0001 — Multi-Vector Distance Functions #731
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Adds an RFC framework to the DiskANN repository and introduces the first RFC describing a proposed high-performance Chamfer distance implementation for ColBERT-style multi-vector representations.
Changes:
- Introduces
rfcs/with an RFC process guide and an index. - Adds an RFC template (
0000-template.md) to standardize future proposals. - Adds RFC 0001 proposing “query-transposed tiling” for faster multi-vector Chamfer distance.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
rfcs/README.md |
Documents the RFC process, lifecycle, and index of RFCs. |
rfcs/0000-template.md |
Provides a reusable RFC template and metadata format. |
rfcs/0001-multi-vector-distance-functions.md |
Proposes the multi-vector Chamfer distance approach, API shape, and benchmarks. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| type MultiVectorRef<'a> = MatRef<'a, Standard<f32>>; | ||
| ``` | ||
|
|
||
| `Standard<f32>` provides contiguous row-major storage with `as_slice()` for BLAS compatibility and zero-copy views via `MatRef`. `QueryMatRef` (a newtype over `MatRef`) distinguishes query from document matrices for asymmetric distance functions. |
Copilot
AI
Feb 6, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Standard<f32>/Mat in diskann_quantization::multi_vector don't expose an as_slice() API for the full matrix; the current API provides row slices via rows()/get_row() and views via as_view(). This statement is likely inaccurate—please update the RFC to describe the actual access patterns/APIs (or point to the correct method if one exists).
| `Standard<f32>` provides contiguous row-major storage with `as_slice()` for BLAS compatibility and zero-copy views via `MatRef`. `QueryMatRef` (a newtype over `MatRef`) distinguishes query from document matrices for asymmetric distance functions. | |
| `Standard<f32>` provides contiguous row-major storage; the current API exposes row slices via `rows()`/`get_row()` and matrix views via `as_view()`/`MatRef`, which can be used to integrate with BLAS or other kernels. `QueryMatRef` (a newtype over `MatRef`) distinguishes query from document matrices for asymmetric distance functions. |
| // 2. Transpose query into SIMD-friendly block layout (once per query) | ||
| let transposed_query = TransposedMultiVector::from_view(query.as_ref()); | ||
|
|
Copilot
AI
Feb 6, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example calls query.as_ref(), but Mat<...> in diskann_quantization::multi_vector uses as_view() to produce a MatRef (there’s no as_ref() method). As written, the snippet won’t compile—please update to the actual view API.
| // Populate per-query-vector max similarities | ||
| max_sim.evaluate(&transposed_query, doc.as_ref()); | ||
|
|
Copilot
AI
Feb 6, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
max_sim.evaluate(&transposed_query, doc.as_ref()); has two issues in the example: (1) doc.as_ref() isn’t an API on the multi-vector Mat types (use the real view method, e.g. as_view(), or pass an existing MatRef), and (2) the proposed evaluate returns a Result but it’s ignored here. Please adjust the snippet to be compilable and show intended error handling.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #731 +/- ##
=======================================
Coverage 89.01% 89.01%
=======================================
Files 428 428
Lines 78294 78294
=======================================
Hits 69692 69692
Misses 8602 8602 🚀 New features to boost your workflow:
|
|
@microsoft-github-policy-service agree company="Microsoft" |
|
|
||
| ## When to Write an RFC | ||
|
|
||
| An RFC is required for changes that affect the public API or architecture of the project, or whenever an important design decision needs review from maintainers and stakeholders. Examples include: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we are not at the stage where all changes to the public API need an RFC. Obviously, big changes need more thorough consensus, but this project is still developing so rapidly that, for example, adding a variant to a public enum in an internally supporting crate that should not have external users technically changes the public API but, in my opinion, should not require an RFC. Requiring such will bog us down in bureaucracy.
|
|
||
| ## How to Submit an RFC | ||
|
|
||
| 1. Copy [0000-template.md](0000-template.md) to `NNNN-short-title.md` (use the next available number). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can use PR numbers instead of total ordering? I'm worried that ensuring all RFCs are sequentially ordered otherwise would be a giant pain.
| 2. Fill in all sections. Remove instructional comments. | ||
| 3. Open a pull request with the RFC file. The PR description should summarize the proposal. | ||
| 4. Discuss in the PR. Update the RFC based on feedback. | ||
| 5. Once accepted, the RFC is merged and the status is updated to **Accepted**. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merging an RFC should be considered acceptance. This way, there's no need for additional book-keeping.
| | # | Title | Status | Author | Resolved Date | | ||
| |---|-------|--------|--------|---------------| | ||
| | [0000](0000-template.md) | RFC Template | - | - | - | | ||
| | [0001](0001-multi-vector-distance-functions.md) | Multi-Vector Distance Functions | InReview | Suryansh Gupta | - | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote to not have an index like this. Without tooling, I think it's highly unlikely that it will be updated.
| 2. Achieve 2x+ speedup over baseline SIMD through memory layout optimization | ||
| 3. Maintain compatibility with DiskANN's `DistanceFunctionMut` trait | ||
| 4. Provide a clean API that enables standalone distance function usage without full index integration | ||
| 5. Achieve performance within 10–20% of `faer` SGEMM-based Chamfer computation, when both our implementation and `faer` are restricted to AVX2 (no AVX-512 on either side) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use AVX-512 if available? We have support, and it could likely really benefit dense operations like this.
|
|
||
| ### Alternative Approaches | ||
|
|
||
| The experimental crate explored six approaches total. The query-transposed tiling approach was selected as the proposal, but the alternatives remain available and may be better for specific workloads. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing not covered here is how this behaves under multi-threading. it's possible that single threaded tests may look fine, but if effort is not applied to restrict working set sizes to the size of the L1/L2 as much as possible, then multiple-threads could stomp on one-another's usage of L3 when running multiple computations in parallel.
| | dim=384, doc=32, query=8 | 8,412 | 1.41x | **1.65x** | 1.30x | 1.24x | | ||
| | dim=384, doc=64, query=16 | 38,162 | 1.30x | 1.47x | **1.70x** | 1.66x | | ||
| | dim=384, doc=128, query=32 | 171,431 | 1.53x | 1.94x | 2.04x | **2.16x** | | ||
| | **Average** | — | **1.38x** | **1.79x** | **1.93x** | **1.59x** | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also see situations where there are, say, 900+ documents?
| ### Speedup vs SIMD Baseline (Median, Lower Latency = Better) | ||
|
|
||
| | Configuration | SIMD (µs) | transposed_simd | transposed_tiling | query_transposed_tiling | sgemm | | ||
| |--------------|-----------|-----------------|-------------------|------------------------|-------| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nit - can this table be adjusted to it's readable without rendering the markdown?
Summary
Introduces the
rfcs/directory with the RFC process and the first RFC proposing high-performance Chamfer distance for multi-vector representations in DiskANN.Changes
New Files
rfcs/README.mdrfcs/0000-template.mdrfcs/0001-multi-vector-distance-functions.mdRFC 0001 — Multi-Vector Distance Functions
Proposes a query-transposed tiling approach for Chamfer distance computation:
MaxSimscratch buffer across distance calls (zero hot-path allocation)Builds on existing types from
diskann-quantization(Mat,MatRef,MaxSim,Chamfer) and proposesMaxSimimplementingDistanceFunctionMutfromdiskann-vectorfor ecosystem integration.Experimental benchmarks available in PR #730.
Review Focus
This PR is the RFC itself — please review the proposal and leave comments directly on the RFC file:
MaxSimimplementingDistanceFunctionMutdirectlyTransposedMultiVectortype design andReprtrait integration