Batch rerank HTTP requests by query #2109
Greptile Summary
This PR optimizes the HTTP reranker path in
| Filename | Overview |
|---|---|
| nemo_retriever/src/nemo_retriever/rerank/rerank.py | Replaces per-row HTTP calls with grouped batching by query; adds an explicit RuntimeError guard for score-count mismatches and a warning for unhashable queries. Logic and alignment are correct. |
| nemo_retriever/tests/test_nemotron_rerank_v2.py | Adds four targeted tests for batching, multi-query dispatch, score-count mismatch, and unhashable-query fallback; existing sort-descending test updated to match the new batched response shape. |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["_rerank_batch(batch_df)"] --> B{rerank_invoke_url set?}
    B -- No --> C[Local model: score_pairs]
    B -- Yes --> D["Build groups dict keyed by query"]
    D --> E{Query is hashable?}
    E -- No --> F["Warn and use unique fallback key"]
    E -- Yes --> G["Use query as key"]
    F --> H["groups.setdefault: accumulate indices and docs"]
    G --> H
    H --> I["For each group: _rerank_via_endpoint(query, all_docs)"]
    I --> J{score count matches doc count?}
    J -- No --> K["raise RuntimeError: score alignment is broken"]
    J -- Yes --> L["Assign scores back to original row positions"]
    L --> M["Attach score column to DataFrame"]
    C --> M
    M --> N{sort_results?}
    N -- Yes --> O[Sort descending]
    N -- No --> P[Return unchanged order]
```
Reviews (2): Last reviewed commit: "Address greptile"
Description
Groups reranker HTTP requests by query so each request sends all candidate passages together.
Raises bo767 evaluation throughput from 1.28 to 1.79 queries per second (roughly a 40% improvement).
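For readers who want the grouping idea in isolation, here is a minimal sketch of it, not the code from `rerank.py`. It assumes plain Python lists instead of a DataFrame. The names `rerank_grouped` and `score_fn` are hypothetical; `score_fn` stands in for whatever issues the HTTP request (the flowchart's `_rerank_via_endpoint`), and its `(query, docs)` call shape is an assumption.

```python
from __future__ import annotations

import logging
from typing import Any, Callable, Sequence

logger = logging.getLogger(__name__)


def rerank_grouped(
    queries: Sequence[Any],
    docs: Sequence[str],
    score_fn: Callable[[Any, list[str]], Sequence[float]],
) -> list[float]:
    """Score each (query, doc) row using one score_fn call per distinct query.

    score_fn is a hypothetical stand-in for the HTTP reranker call.
    """
    if len(queries) != len(docs):
        raise ValueError("queries and docs must have the same length")

    # key -> (query, original row indices, docs for that query)
    groups: dict[Any, tuple[Any, list[int], list[str]]] = {}
    for idx, (query, doc) in enumerate(zip(queries, docs)):
        try:
            hash(query)
            key = ("query", query)
        except TypeError:
            # Unhashable query: fall back to a per-row key so the row is
            # still scored, just without batching.
            logger.warning("Unhashable query at row %d; sending it alone", idx)
            key = ("row", idx)
        _, indices, group_docs = groups.setdefault(key, (query, [], []))
        indices.append(idx)
        group_docs.append(doc)

    scores: list[float] = [0.0] * len(docs)
    for query, indices, group_docs in groups.values():
        group_scores = score_fn(query, group_docs)
        if len(group_scores) != len(group_docs):
            # Without this guard, scores would silently land on wrong rows.
            raise RuntimeError(
                f"Expected {len(group_docs)} scores, got {len(group_scores)}; "
                "score alignment is broken"
            )
        # Write scores back to the rows they came from.
        for row_idx, score in zip(indices, group_scores):
            scores[row_idx] = float(score)
    return scores
```

With three rows whose queries are `"a"`, `"b"`, `"a"`, this sketch calls `score_fn` twice rather than three times. The saving grows with the number of candidate passages per query, which is presumably where the throughput gain above comes from.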
Checklist