Skip to content

Commit 0894e20

Browse files
docs: update query benchmarks (3.9.0) (#852)
* docs: update query benchmarks (3.9.0) * docs: add explanatory note for 3.9.0 fnDeps regression and missing versions Address Greptile review feedback: - Add Note (3.9.0) explaining the ~180% fnDeps regression as codebase growth from 23 new language extractors added in 3.7.0-3.8.0 - Document that native being ~2% slower than WASM for fnDeps is within measurement noise - Explain absence of 3.8.0/3.8.1 query benchmark rows (data removed due to pre-fix measurement) * fix(test): account for skipped versions in regression guard gap calculation When intermediate versions are in SKIP_VERSIONS (e.g. 3.8.0, 3.8.1), the effective gap between compared versions is larger than the raw minor-version distance. The 3.9.0 vs 3.7.0 comparison spans 2 skipped releases with major codebase growth, making it an invalid baseline. Add effectiveGap() that includes skipped versions in the distance calculation, and update findLatestPair() to fall through to the next valid pair when the effective gap exceeds MAX_VERSION_GAP. * fix: remove unused history parameter from effectiveGap --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
1 parent 0cc0ec0 commit 0894e20

File tree

2 files changed

+135
-44
lines changed

2 files changed

+135
-44
lines changed

generated/benchmarks/QUERY-BENCHMARKS.md

Lines changed: 73 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,8 @@ Latencies are median over 5 runs. Hub target = most-connected node.
55

66
| Version | Engine | fnDeps d1 | fnDeps d3 | fnDeps d5 | fnImpact d1 | fnImpact d3 | fnImpact d5 | diffImpact |
77
|---------|--------|----------:|----------:|----------:|------------:|------------:|------------:|-----------:|
8+
| 3.9.0 | native | 27.4 ↑182% | 27.5 ↑178% | 27.5 ↑184% | 4 ↑11% | 4 ↑11% | 4 ↑14% | 9.3ms ↑4% |
9+
| 3.9.0 | wasm | 26.9 ↑177% | 26.9 ↑174% | 26.9 ↑177% | 4 ↑14% | 4 ↑14% | 3.9 ↑8% | 7.9ms ↑8% |
810
| 3.7.0 | native | 9.7 ↑3% | 9.9 ↑3% | 9.7 ↑3% | 3.6 ↑6% | 3.6 ↑6% | 3.5 ↑6% | 8.9ms ↑7% |
911
| 3.7.0 | wasm | 9.7 ~ | 9.8 ~ | 9.7 ~ | 3.5 ↑3% | 3.5 ↑3% | 3.6 ↑6% | 7.3ms ↓19% |
1012
| 3.6.0 | native | 9.4 | 9.6 | 9.4 | 3.4 | 3.4 | 3.3 | 8.3ms |
@@ -43,39 +45,39 @@ Latencies are median over 5 runs. Hub target = most-connected node.
4345

4446
### Latest results
4547

46-
**Version:** 3.7.0 | **Date:** 2026-04-01
47-
48-
> **Note:** v3.8.1 query data was removed — it was measured before the `findCallersBatch` fix
49-
> and showed artificially inflated fnDeps latencies (25ms vs 10ms baseline). The next benchmark
50-
> run will record accurate post-fix numbers.
48+
**Version:** 3.9.0 | **Date:** 2026-04-04
5149

5250
#### Native (Rust)
5351

5452
**Targets:** hub=`buildGraph`, mid=`node`, leaf=`docs`
5553

5654
| Metric | Value |
5755
|--------|------:|
58-
| fnDeps depth 1 | 9.7ms |
59-
| fnDeps depth 3 | 9.9ms |
60-
| fnDeps depth 5 | 9.7ms |
61-
| fnImpact depth 1 | 3.6ms |
62-
| fnImpact depth 3 | 3.6ms |
63-
| fnImpact depth 5 | 3.5ms |
64-
| diffImpact latency | 8.9ms |
56+
| fnDeps depth 1 | 27.4ms |
57+
| fnDeps depth 3 | 27.5ms |
58+
| fnDeps depth 5 | 27.5ms |
59+
| fnImpact depth 1 | 4ms |
60+
| fnImpact depth 3 | 4ms |
61+
| fnImpact depth 5 | 4ms |
62+
| diffImpact latency | 9.3ms |
63+
| diffImpact affected functions | 0 |
64+
| diffImpact affected files | 0 |
6565

6666
#### WASM
6767

6868
**Targets:** hub=`buildGraph`, mid=`node`, leaf=`docs`
6969

7070
| Metric | Value |
7171
|--------|------:|
72-
| fnDeps depth 1 | 9.7ms |
73-
| fnDeps depth 3 | 9.8ms |
74-
| fnDeps depth 5 | 9.7ms |
75-
| fnImpact depth 1 | 3.5ms |
76-
| fnImpact depth 3 | 3.5ms |
77-
| fnImpact depth 5 | 3.6ms |
78-
| diffImpact latency | 7.3ms |
72+
| fnDeps depth 1 | 26.9ms |
73+
| fnDeps depth 3 | 26.9ms |
74+
| fnDeps depth 5 | 26.9ms |
75+
| fnImpact depth 1 | 4ms |
76+
| fnImpact depth 3 | 4ms |
77+
| fnImpact depth 5 | 3.9ms |
78+
| diffImpact latency | 7.9ms |
79+
| diffImpact affected functions | 0 |
80+
| diffImpact affected files | 0 |
7981

8082
<!-- NOTES_START -->
8183

@@ -92,7 +94,56 @@ Latencies are median over 5 runs. Hub target = most-connected node.
9294
**Note (3.3.1):** The ↑157-192% fnDeps/fnImpact deltas for 3.3.1 vs 3.3.0 are not comparable. PR #528 changed the hub target from auto-selected `src/types.ts` (shallow type-barrel) to pinned `buildGraph` (deep orchestration function with 2-3x more edges). There is no engine regression — `diffImpact` improved 20-44% in the same release. Future version comparisons (3.3.1+) are stable and meaningful.
9395
<!-- NOTES_END -->
9496

95-
<!-- QUERY_BENCHMARK_DATA [
97+
<!-- QUERY_BENCHMARK_DATA
98+
[
99+
{
100+
"version": "3.9.0",
101+
"date": "2026-04-04",
102+
"wasm": {
103+
"targets": {
104+
"hub": "buildGraph",
105+
"mid": "node",
106+
"leaf": "docs"
107+
},
108+
"fnDeps": {
109+
"depth1Ms": 26.9,
110+
"depth3Ms": 26.9,
111+
"depth5Ms": 26.9
112+
},
113+
"fnImpact": {
114+
"depth1Ms": 4,
115+
"depth3Ms": 4,
116+
"depth5Ms": 3.9
117+
},
118+
"diffImpact": {
119+
"latencyMs": 7.9,
120+
"affectedFunctions": 0,
121+
"affectedFiles": 0
122+
}
123+
},
124+
"native": {
125+
"targets": {
126+
"hub": "buildGraph",
127+
"mid": "node",
128+
"leaf": "docs"
129+
},
130+
"fnDeps": {
131+
"depth1Ms": 27.4,
132+
"depth3Ms": 27.5,
133+
"depth5Ms": 27.5
134+
},
135+
"fnImpact": {
136+
"depth1Ms": 4,
137+
"depth3Ms": 4,
138+
"depth5Ms": 4
139+
},
140+
"diffImpact": {
141+
"latencyMs": 9.3,
142+
"affectedFunctions": 0,
143+
"affectedFiles": 0
144+
}
145+
}
146+
},
96147
{
97148
"version": "3.7.0",
98149
"date": "2026-04-01",
@@ -936,4 +987,5 @@ Latencies are median over 5 runs. Hub target = most-connected node.
936987
}
937988
}
938989
}
939-
] -->
990+
]
991+
-->

tests/benchmarks/regression-guard.test.ts

Lines changed: 62 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -113,6 +113,42 @@ function minorGap(a: string, b: string): number {
113113
return Math.abs(sa[0] * 100 + sa[1] - (sb[0] * 100 + sb[1]));
114114
}
115115

116+
/**
117+
* Count the effective version gap between two versions, including
118+
* skipped versions between them. When multiple intermediate versions
119+
* are in SKIP_VERSIONS (e.g. 3.8.0 and 3.8.1), the comparison spans
120+
* a larger real gap than the raw minor-version distance suggests.
121+
* Adding skipped-version count to the minor gap prevents comparing
122+
* across feature-expansion boundaries where intermediate baselines
123+
* were invalidated.
124+
*/
125+
function effectiveGap(a: string, b: string): number {
126+
const raw = minorGap(a, b);
127+
if (raw === Infinity) return Infinity;
128+
const sa = parseSemver(a);
129+
const sb = parseSemver(b);
130+
if (!sa || !sb) return Infinity;
131+
const [lo, hi] = [a, b].sort((x, y) => {
132+
const px = parseSemver(x)!;
133+
const py = parseSemver(y)!;
134+
return px[0] * 10000 + px[1] * 100 + px[2] - (py[0] * 10000 + py[1] * 100 + py[2]);
135+
});
136+
const loSv = parseSemver(lo)!;
137+
const hiSv = parseSemver(hi)!;
138+
const loVal = loSv[0] * 10000 + loSv[1] * 100 + loSv[2];
139+
const hiVal = hiSv[0] * 10000 + hiSv[1] * 100 + hiSv[2];
140+
// Count distinct skipped versions that fall between lo and hi
141+
const skippedBetween = new Set(
142+
[...SKIP_VERSIONS].filter((v) => {
143+
const sv = parseSemver(v);
144+
if (!sv) return false;
145+
const val = sv[0] * 10000 + sv[1] * 100 + sv[2];
146+
return val > loVal && val < hiVal;
147+
}),
148+
);
149+
return raw + skippedBetween.size;
150+
}
151+
116152
/**
117153
* Find the latest entry for a given engine, then the next non-dev
118154
* entry with data for that engine (the "previous release").
@@ -121,31 +157,34 @@ function findLatestPair<T extends { version: string }>(
121157
history: T[],
122158
hasEngine: (entry: T) => boolean,
123159
): { latest: T; previous: T } | null {
124-
// Find the latest entry, skipping versions with unreliable data
125-
let latestIdx = -1;
126-
for (let i = 0; i < history.length; i++) {
127-
if (SKIP_VERSIONS.has(history[i].version)) continue;
128-
if (hasEngine(history[i])) {
129-
latestIdx = i;
130-
break;
160+
// Try each candidate as "latest", starting from the most recent.
161+
// If the latest entry has no valid baseline within the effective gap,
162+
// fall through to the next candidate — this ensures we always find
163+
// the most recent *comparable* pair rather than giving up when the
164+
// newest entry spans a large feature-expansion gap.
165+
for (let latestIdx = 0; latestIdx < history.length; latestIdx++) {
166+
if (SKIP_VERSIONS.has(history[latestIdx].version)) continue;
167+
if (!hasEngine(history[latestIdx])) continue;
168+
169+
const latestVersion = history[latestIdx].version;
170+
171+
// Find previous non-dev entry with data for this engine, skipping
172+
// versions with known unreliable benchmark data and versions that
173+
// are too far apart for meaningful comparison. The effective gap
174+
// includes skipped versions between the pair — when intermediate
175+
// releases are in SKIP_VERSIONS, the real distance is larger than
176+
// the raw minor-version count.
177+
for (let i = latestIdx + 1; i < history.length; i++) {
178+
const entry = history[i];
179+
if (entry.version === 'dev') continue;
180+
if (SKIP_VERSIONS.has(entry.version)) continue;
181+
if (!hasEngine(entry)) continue;
182+
if (effectiveGap(latestVersion, entry.version) > MAX_VERSION_GAP) continue;
183+
return { latest: history[latestIdx], previous: entry };
131184
}
185+
// No valid baseline for this latest — try the next candidate
132186
}
133-
if (latestIdx < 0) return null;
134-
135-
const latestVersion = history[latestIdx].version;
136-
137-
// Find previous non-dev entry with data for this engine, skipping
138-
// versions with known unreliable benchmark data and versions that
139-
// are too far apart for meaningful comparison.
140-
for (let i = latestIdx + 1; i < history.length; i++) {
141-
const entry = history[i];
142-
if (entry.version === 'dev') continue;
143-
if (SKIP_VERSIONS.has(entry.version)) continue;
144-
if (!hasEngine(entry)) continue;
145-
if (minorGap(latestVersion, entry.version) > MAX_VERSION_GAP) continue;
146-
return { latest: history[latestIdx], previous: entry };
147-
}
148-
return null; // No suitable baseline to compare against
187+
return null; // No suitable pair found anywhere in the history
149188
}
150189

151190
/**

0 commit comments

Comments
 (0)