Ci/metrics history #114
Merged
…formance history updates
0407389 to 63fb264
…s) for all running steps
…scovery and metadata handling
…etadata JSON files
…ify testing scenarios
…r PR merges and main branch pushes
stratika approved these changes on May 12, 2026
Performance metrics history pipeline
Adds end-to-end collection of benchmark metrics from every CI run into a persistent `docs/perf-history.jsonl` file committed back to `main`.

What changed
`scripts/write_metrics_sidecar.py` (new)
- Helper script that writes a `*.meta.json` sidecar from `KEY=VALUE` shell arguments. Coerces JSON-native types (booleans, numbers) automatically.

`scripts/process_metrics.py` (rewritten)
- … `rglob("*.json")`
- … `null` (not `0`) in the JSONL output
- … `"benchmark"` and `"metrics"` objects for forward compatibility

`.github/workflows/build-and-run.yml`
- Each `./llama-tornado` step now sets `JAVA_TOOL_OPTIONS` to emit a metrics JSON and calls `write_metrics_sidecar.py` to write the matching sidecar (same stem, same directory)
- … `publish-performance-history`
- `publish-performance-history` only runs on pushes to `main` (not PRs or forks), avoiding duplicate entries and push-permission errors

Result
One JSONL row per benchmark step per merge to `main`, covering all backends, models, quantizations, and inference configurations currently in CI.
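For reviewers who want a concrete picture of the sidecar helper, here is a minimal sketch of how `KEY=VALUE` arguments could be coerced and written next to a metrics file. It is not the code from `scripts/write_metrics_sidecar.py`: the function names (`coerce`, `parse_pairs`, `write_sidecar`) are hypothetical, and reading "same stem" as `run1.json` → `run1.meta.json` is an assumption.

```python
import json
import math
from pathlib import Path


def coerce(value: str):
    """Return a bool/int/float when the text parses as one, else the raw string."""
    try:
        parsed = json.loads(value)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return value
    if isinstance(parsed, (bool, int)):
        return parsed
    # Skip NaN/Infinity: json.loads accepts them, but they are not valid JSON output.
    if isinstance(parsed, float) and math.isfinite(parsed):
        return parsed
    return value  # null, quoted strings, arrays, objects stay as plain text


def parse_pairs(args):
    """Turn ["backend=opencl", "threads=8"] into a dict with coerced values."""
    meta = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {arg!r}")
        meta[key] = coerce(value)
    return meta


def write_sidecar(metrics_path, args):
    """Write <stem>.meta.json in the same directory as the metrics file (assumed naming)."""
    metrics_path = Path(metrics_path)
    sidecar = metrics_path.with_name(metrics_path.stem + ".meta.json")
    sidecar.write_text(json.dumps(parse_pairs(args), indent=2) + "\n", encoding="utf-8")
    return sidecar
```

Parsing each value with `json.loads` and falling back to the raw string keeps quantization labels such as `Q8_0` as text, while `true` and `8` become a boolean and an integer.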
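The row-building side can be sketched in the same spirit: discover metrics files recursively with `rglob("*.json")`, pair each with a `.meta.json` sidecar, and emit one JSONL row with nested `"benchmark"` and `"metrics"` objects where missing values become `null` rather than `0`. The metric names in `METRIC_KEYS`, the helper names, and the sidecar naming are assumptions for illustration; the PR description does not give the real schema.

```python
import json
from pathlib import Path

# Hypothetical metric names; the actual keys are not listed in the PR description.
METRIC_KEYS = ("tokens_per_second", "time_to_first_token_ms")


def build_row(metrics_file: Path):
    """Combine one metrics JSON object and its optional sidecar into a row."""
    metrics = json.loads(metrics_file.read_text(encoding="utf-8"))
    sidecar = metrics_file.with_name(metrics_file.stem + ".meta.json")
    meta = json.loads(sidecar.read_text(encoding="utf-8")) if sidecar.exists() else {}
    return {
        "benchmark": meta,
        # dict.get returns None for an absent key; json.dumps writes None as null.
        "metrics": {key: metrics.get(key) for key in METRIC_KEYS},
    }


def collect_rows(root):
    """Walk root recursively and build a row for every non-sidecar JSON file."""
    rows = []
    for path in sorted(Path(root).rglob("*.json")):
        if path.name.endswith(".meta.json"):
            continue  # sidecars are read together with their metrics file
        rows.append(build_row(path))
    return rows


def append_history(rows, history_path):
    """Append one JSON object per line to the history file."""
    with open(history_path, "a", encoding="utf-8") as fh:
        for row in rows:
            fh.write(json.dumps(row, sort_keys=True) + "\n")
```

Writing `null` for an absent metric keeps "not measured" distinguishable from a genuine zero when the history is later plotted or aggregated.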