Cat 7.b: query latency distribution (YCSB percentiles) #33
Adds sme/categories/latency.py with LatencyCollector that wraps adapter calls and captures wall-clock timing, computing YCSB-standard percentiles (p50, p95, p99, p99.9, max) via numpy. The [latency] optional extra reserves an HdrHistogram upgrade path when higher-fidelity tail measurement is needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
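A minimal sketch of the collector described above, under stated assumptions: `SketchCollector`, its `report()` method, and the `percentile()` helper are hypothetical names, not the PR's code. To stay dependency-free, the sketch swaps `np.percentile` for a pure-Python linear interpolation, which is the default method `np.percentile` uses.

```python
import time
from dataclasses import dataclass, field


def percentile(sorted_ms, p):
    """Linear-interpolation percentile over an already sorted list.

    Stand-in for np.percentile's default (linear) method.
    """
    if not sorted_ms:
        raise ValueError("no samples recorded")
    rank = (len(sorted_ms) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(sorted_ms) - 1)
    return sorted_ms[lo] + (sorted_ms[hi] - sorted_ms[lo]) * (rank - lo)


@dataclass
class SketchCollector:  # hypothetical name, not the PR's LatencyCollector
    samples_ms: list = field(default_factory=list)

    def timed_call(self, fn, *args):
        """Run fn(*args), record wall-clock milliseconds, return both."""
        start = time.perf_counter()
        result = fn(*args)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        self.samples_ms.append(elapsed_ms)
        return result, elapsed_ms

    def report(self):
        """Return the YCSB-style percentiles plus max as a dict."""
        data = sorted(self.samples_ms)
        points = (("p50", 50), ("p95", 95), ("p99", 99), ("p99.9", 99.9))
        out = {name: percentile(data, p) for name, p in points}
        out["max"] = data[-1]
        return out
```

Calling `report()` with no samples raises `ValueError` in this sketch; how the PR handles the empty case is not shown here.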
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a new Category 7.b latency distribution utility and accompanying tests, plus an optional dependency stanza for a latency-related extra.
Changes:
- Introduces
LatencyCollector/LatencyReportfor recording and reporting query latency percentiles. - Adds pytest coverage for collector behavior, dict conversion, and report formatting.
- Adds a
latencyoptional dependency extra inpyproject.toml.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/test_latency.py | Adds unit tests for latency collection, percentile reporting, dict serialization, and formatting output. |
| sme/categories/latency.py | Implements latency collection/reporting and a human-readable summary formatter. |
| pyproject.toml | Adds a latency optional-deps extra and includes it in the all extra. |
    Uses numpy for percentile calculations; optionally upgrades to
    HdrHistogram when the [latency] extra is installed.

    import logging
    from dataclasses import dataclass, field

    import numpy as np

    log = logging.getLogger(__name__)

    def test_timed_call():
        c = LatencyCollector()
        def slow_fn(x):
            time.sleep(0.01)
            return x * 2
        result, latency = c.timed_call(slow_fn, 5)
        assert result == 10
        assert latency >= 10.0

    latency = [
        "hdrhistogram>=0.10",
    ]

YCSB percentiles (p50/p95/p99/p99.9) are exactly the right shape for query-latency — aggregate means hide the tail, and the tail is where memory systems differentiate. The Three small things before merge:
One framing question for the spec angle: Cat 7.b (latency) and Cat 7 (cost-per-correct, #34) are both landing in the same category. Is the intended Cat 7 readout a single combined section or two side-by-side? The answer changes what shape the headline metric becomes: "cost-per-correct at p95 latency" vs. two independent columns.
…ic latency test

Back LatencyCollector with HdrHistogram when the [latency] extra is installed (guarded optional import); fall back to np.percentile on the base install. HdrHistogram gives memory use that stays fixed as N grows and lossless cross-run merging, fulfilling the promise in the module docstring that was previously unmet. Record microseconds for sub-ms precision. Remove the unused logging import.

Replace the flaky time.sleep(0.01) timed_call test with a monkeypatched perf_counter clock. Percentile assertions now tolerate HdrHistogram quantization on the HDR path while staying exact on the numpy fallback; add HDR-only merge/backend tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
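The two mechanisms named in this commit, a guarded optional import and a deterministic clock for the timing test, might look roughly like the sketch below. It is an illustration only: `BACKEND`, `timed_call_us`, and the test function are hypothetical names, the `hdrh.histogram` import path is assumed to be what the hdrhistogram package provides, and `unittest.mock` stands in for pytest's `monkeypatch` fixture so the example runs without pytest.

```python
import time
from unittest import mock

# Guarded optional import: use the HDR backend only if the extra is installed.
# Assumption: the hdrhistogram package exposes hdrh.histogram.HdrHistogram.
try:
    from hdrh.histogram import HdrHistogram
except ImportError:
    HdrHistogram = None

# Hypothetical flag a collector could expose to report which path is active.
BACKEND = "hdr" if HdrHistogram is not None else "numpy"


def timed_call_us(fn, *args):
    """Run fn(*args) and return (result, elapsed whole microseconds).

    time.perf_counter is looked up at call time, so a patched clock is seen.
    """
    start = time.perf_counter()
    result = fn(*args)
    elapsed_us = round((time.perf_counter() - start) * 1_000_000)
    return result, elapsed_us


def test_timed_call_is_deterministic():
    # Two scripted clock readings, 12.5 ms apart: no sleeping, no flakiness.
    with mock.patch.object(time, "perf_counter", side_effect=[10.0, 10.0125]):
        result, elapsed_us = timed_call_us(lambda x: x * 2, 5)
    assert result == 10
    assert elapsed_us == 12500
```

Scripting the clock makes the assertion exact regardless of scheduler jitter, which a real `time.sleep(0.01)` cannot guarantee.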
Pushed eb79499:
CI green on 3.10/3.11/3.12 (the numpy fallback path; the extra isn't installed in CI). On the Cat 7 framing question (also on #34): going with two side-by-side readouts. Latency (this PR) and cost (#34) stay independent so each lands on its own. 🫏
Verified — HdrHistogram backend behind the One downstream methodology question, not a blocker: the HDR path and the numpy path return percentile values that differ by HDR quantization bounds. For published cross-system readings, is one canonical, or does the spec require |
Summary
Closes #25.
Adds Cat 7.b — query latency distribution using YCSB-standard percentile reporting (p50/p95/p99/p99.9/max).
- sme/categories/latency.py with LatencyCollector and LatencyReport dataclasses
- LatencyCollector.timed_call(fn, *args) wraps any callable with wall-clock measurement
- format_latency_report() renders human-readable output

Design decisions:
- [latency] optional extra added to pyproject.toml for future HdrHistogram upgrade path

Test plan
- test_latency.py: collector recording, percentile computation, timed_call wrapper, empty report edge case, format output

🫏 Generated with Claude Code