maxsim

ColBERT-style late-interaction MaxSim scoring for multi-vector retrieval — pure Rust, zero dependencies, builds anywhere offline.

Late interaction in 60 seconds

Most dense retrievers pool a whole query or document into a single vector and compare them with one dot product. That early compression is fast but lossy: token-level structure — which word matched which — is gone before scoring even begins. Late interaction keeps one embedding per token and defers the comparison to scoring time, recovering that fidelity.

The scoring rule is MaxSim. Given a query Q and document D, each represented as a bag of per-token vectors:

score(Q, D) = Σ_i  max_j  sim(q_i, d_j)

For each query token q_i, find its single best-matching document token (the max over the document's tokens), then sum those maxima across the query. Every query term independently "finds its evidence" anywhere in the document, and the score measures how completely the document covers the query, term by term. Because it is a sum over query tokens, the score is unbounded above and only comparable across documents for a fixed query — which is exactly what ranking needs.

This per-token matching is why late-interaction reranking is consistently more faithful than cosine over pooled embeddings: a single pooled vector cannot represent "this passage nails term A but is weak on term B," whereas MaxSim can. The approach was introduced by ColBERT (Khattab & Zaharia, SIGIR 2020) and later extended to images by ColPali (Faysse et al., 2024), which scores document-image patch embeddings against query tokens with the same MaxSim aggregate.

Usage

use maxsim::{max_sim, max_sim_normalized, l2_normalize_tokens, rank};

// Each token is a vector; a query or document is a sequence of token vectors.
let query = vec![vec![1.0, 0.0], vec![0.0, 1.0]];

let docs = vec![
    vec![vec![1.0, 0.0]],                  // doc 0: covers one query token
    vec![vec![1.0, 0.0], vec![0.0, 1.0]],  // doc 1: covers both
];

// Score a single document.
let s = max_sim(&query, &docs[1]);

// Rank a whole corpus — returns (doc_index, score), best first.
let ranking = rank(&query, &docs);
assert_eq!(ranking[0].0, 1);

// Fast path for pre-normalized embeddings: cosine reduces to a dot product,
// so normalize once and reuse.
let qn = l2_normalize_tokens(&query);
let dn: Vec<_> = docs.iter().map(|d| l2_normalize_tokens(d)).collect();
let s_fast = max_sim_normalized(&qn, &dn[1]);

API

Function	Purpose
`cosine(a, b)`	Cosine similarity between two `&[f32]`; safe on zero-norm and length mismatch.
`max_sim(query, doc)`	MaxSim using cosine as the per-pair similarity.
`max_sim_normalized(query, doc)`	MaxSim using the dot product — the fast path for L2-normalized inputs.
`l2_normalize_tokens(tokens)`	L2-normalize each token vector for use with `max_sim_normalized`.
`rank(query, docs)`	Rank documents by descending MaxSim; returns `Vec<(index, score)>`.

Fast-path tradeoff. When embeddings are already L2-normalized (the usual ColBERT setup), cosine(a, b) == dot(a, b), so max_sim_normalized skips recomputing norms for every (query_token, doc_token) pair. The caller owns that invariant: feed it un-normalized vectors and you get a raw dot-product aggregate, not a cosine MaxSim. Normalize once with l2_normalize_tokens and reuse the result across many scorings.

Edge-case policy (never panics)

Empty query or empty document → score 0.0.
Mismatched per-token dimensions → that pair contributes 0.0 (treated as orthogonal), so one malformed token can't poison the whole aggregate.
Zero-norm token → cosine 0.0 against everything (no division by zero).
Empty corpus → rank returns an empty Vec; ties preserve input order (stable sort).

Build and test

cargo build
cargo test                 # unit tests + doctests
cargo run --example rerank # toy reranking demo

No network and no dependencies are required — the crate is pure std.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
examples		examples
src		src
.gitignore		.gitignore
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

maxsim

Late interaction in 60 seconds

Usage

API

Edge-case policy (never panics)

Build and test

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

maxsim

Late interaction in 60 seconds

Usage

API

Edge-case policy (never panics)

Build and test

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages