
ci: compact benchmark PR comments — summary table + artifact#170

Open
hankbobtheresearchoor wants to merge 2 commits into Layr-Labs:master from hankbobtheresearchoor:fix/compact-benchmark-comment

@hankbobtheresearchoor
Contributor

Benchmark comments currently dump the full raw markdown (scenario configs, per-segment latency tables, assertion details) into every PR, bloating the conversation and burying real review discussion.

Change: Replace the verbose comment with a compact summary table + upload the full results as a workflow artifact.

Before: ~3,000-character markdown dump per PR
After: One compact summary like:

## ⚡ E2E Benchmarks
Runner: macos-15 (M1 Virtual) | 2026-05-15 22:28 UTC

| Scenario | Req | OK | Thrpt | P50 e2e | P95 e2e | |
|---|---|---|---|---|---|---|
| 1-provider-streaming | 30 | 30 | 1.7/s | 838ms | 5.054s | ✅ |
| 7-provider-multi-model | 50 | 50 | 0.9/s | 234ms | 28.849s | ✅ |
| ... | | | | | | |

[📋 Full results](https://github.com/Layr-Labs/d-inference/actions/runs/...)

Full benchmark markdown is uploaded as a workflow artifact (30-day retention) accessible from the run page.
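
For reviewers who want to picture how the compact comment could be produced, here is a minimal Python sketch that renders the table shown above from a list of per-scenario results. The `ScenarioResult` record, its field names, the `render_summary` helper, and the ❌ marker for failed scenarios are all hypothetical illustrations, not code from this PR; the actual workflow may build the comment differently.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    # Hypothetical record; field names are illustrative, not taken from the PR.
    name: str
    requests: int
    ok: int
    throughput_per_s: float
    p50_e2e_ms: float
    p95_e2e_ms: float
    passed: bool


def format_latency(ms: float) -> str:
    """Render a latency as '838ms' below one second, otherwise as '5.054s'."""
    if round(ms) < 1000:
        return f"{ms:.0f}ms"
    return f"{ms / 1000:.3f}s"


def render_summary(results, runner, timestamp, full_results_url):
    """Build the compact markdown comment: header, one row per scenario, link."""
    lines = [
        "## ⚡ E2E Benchmarks",
        f"Runner: {runner} | {timestamp}",
        "",
        "| Scenario | Req | OK | Thrpt | P50 e2e | P95 e2e | |",
        "|---|---|---|---|---|---|---|",
    ]
    for r in results:
        # Assumption: a failed scenario is marked with a cross instead of a check.
        status = "✅" if r.passed else "❌"
        lines.append(
            f"| {r.name} | {r.requests} | {r.ok} | {r.throughput_per_s:.1f}/s "
            f"| {format_latency(r.p50_e2e_ms)} | {format_latency(r.p95_e2e_ms)} "
            f"| {status} |"
        )
    lines += ["", f"[📋 Full results]({full_results_url})"]
    return "\n".join(lines)
```

Keeping the rendering in one pure function means the comment body can be checked in a unit test without running the benchmarks or calling the GitHub API.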


vercel Bot commented May 17, 2026

@hankbobtheresearchoor is attempting to deploy a commit to the EigenLabs Team on Vercel.

A member of the Team first needs to authorize it.

hankbobtheresearchoor force-pushed the fix/compact-benchmark-comment branch 5 times, most recently from 4037d93 to 9243fe7 on May 17, 2026 23:51
hankbobtheresearchoor force-pushed the fix/compact-benchmark-comment branch from 9243fe7 to eea2ba0 on May 18, 2026 00:09
- Parse FULL latency decomposition (all segments, not just total_e2e)
- Generate benchmark-results.csv with one row per scenario
- CSV columns: sha, short_sha, timestamp, runner, scenario, config (providers/users/concurrency/streaming), aggregate metrics, per-segment P50/P95/mean/max, assertion status + details
- Upload both CSV (text/csv) and markdown (text/markdown) to GCS
- PR comment now links to both formats
- CSV enables long-term cross-run analysis and synthesis
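
The one-row-per-scenario CSV described in the list above could be written roughly as follows. This is a sketch under assumptions: the column names are a hypothetical subset loosely modeled on the commit message (the real file also carries per-segment statistics and assertion details), and `results_to_csv` is an illustrative helper, not a function from this PR.

```python
import csv
import io

# Hypothetical subset of columns; names are illustrative, not the PR's schema.
COLUMNS = [
    "sha", "short_sha", "timestamp", "runner", "scenario",
    "providers", "streaming", "requests", "ok",
    "total_e2e_p50_ms", "total_e2e_p95_ms", "assertion_status",
]


def results_to_csv(run_meta: dict, scenarios: list) -> str:
    """Return CSV text with a header and one row per scenario.

    Run-level metadata (sha, timestamp, runner) is repeated on every row so
    that files from many runs can be concatenated for cross-run analysis.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=COLUMNS,
        extrasaction="ignore",  # silently drop keys that are not in COLUMNS
        lineterminator="\n",
    )
    writer.writeheader()
    for scenario in scenarios:
        row = {**run_meta, **scenario}
        # Assumption: the short SHA is the first 7 characters of the full SHA.
        row["short_sha"] = run_meta["sha"][:7]
        writer.writerow(row)  # missing columns are written as empty strings
    return buf.getvalue()
```

Repeating the run metadata on each row costs a few bytes but keeps every row self-describing, which is what makes long-term aggregation across runs straightforward.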
hankbobtheresearchoor force-pushed the fix/compact-benchmark-comment branch from add8029 to 7c5fcb5 on May 18, 2026 23:54