Harden CI measurement baseline artifact downloads #668
Merged
schickling-assistant merged 1 commit on May 20, 2026
CI Measurements
partial - advisory gate - readiness
No non-zero actionable measurement impact detected.
Unchanged / 0-impact measurements (8)
These rows had compatible baseline data, but their semantic impact rounded to 0.00x because the movement was below the configured budget, below the noise floor, or inside the robust noise band.
Diagnostic / ungated measurements (22)
All measurements
Source-of-truth JSON
{
"schemaVersion": 1,
"title": "CI Measurements",
"status": "partial",
"gate": "advisory",
"readiness": "partial (8/23 enabled observations gateable)",
"commit": {
"shortSha": "8188525",
"sha": "818852525b9673969c4f6e2081a82f175eb8e95b"
},
"run": {
"id": "26190197600",
"attempt": "1",
"url": "https://github.com/overengineeringstudio/effect-utils/actions/runs/26190197600"
},
"baseline": null,
"protocol": "devenv-perf-warm-median-v2",
"chart": {
"meaning": "semantic-impact",
"zeroImpactMeaning": "no actionable PR impact after budgets, noise floor, and robust evidence checks",
"svg": "https://raw.githubusercontent.com/overengineeringstudio/effect-utils/ci-measurement-assets/ci-measurements/pr-668/818852525b9673969c4f6e2081a82f175eb8e95b/run-26190197600-attempt-1/ci-measurements.svg",
"lightPng": "https://raw.githubusercontent.com/overengineeringstudio/effect-utils/ci-measurement-assets/ci-measurements/pr-668/818852525b9673969c4f6e2081a82f175eb8e95b/run-26190197600-attempt-1/ci-measurements.png",
"darkPng": "https://raw.githubusercontent.com/overengineeringstudio/effect-utils/ci-measurement-assets/ci-measurements/pr-668/818852525b9673969c4f6e2081a82f175eb8e95b/run-26190197600-attempt-1/ci-measurements-dark.png"
},
"measurements": [
{
"id": "source.lines",
"label": "Genie CI workflow helpers lines",
"group": "source / effect-utils / genie / ci-workflow / source / ci",
"status": "pass",
"direction": "regressed",
"gateable": false,
"gateReason": "disabled",
"confidence": "diagnostic",
"comparisonMode": "budget",
"unit": "lines",
"baseline": 4432,
"current": 6633,
"delta": 2201,
"ratio": 1.496615523465704,
"semanticImpactScore": null,
"semanticImpactKind": "diagnostic",
"baselineSources": 1,
"currentSamples": 7,
"pairedSamples": 0,
"evidenceDeltaLower": 1757.8,
"evidenceDeltaUpper": 2644.2,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"scope": "genie_ci_workflow"
}
},
{
"id": "source.lines",
"label": "Genie runtime lines",
"group": "source / effect-utils / packages / genie / source / genie",
"status": "pass",
"direction": "regressed",
"gateable": false,
"gateReason": "disabled",
"confidence": "diagnostic",
"comparisonMode": "budget",
"unit": "lines",
"baseline": 18624,
"current": 18722,
"delta": 98,
"ratio": 1.005262027491409,
"semanticImpactScore": null,
"semanticImpactKind": "diagnostic",
"baselineSources": 1,
"currentSamples": 61,
"pairedSamples": 0,
"evidenceDeltaLower": -1764.4,
"evidenceDeltaUpper": 1960.4,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"scope": "genie_runtime"
}
},
{
"id": "genie_check_direct",
"label": "Genie check direct",
"group": "devenv / genie",
"status": "pass",
"direction": "unchanged",
"gateable": true,
"gateReason": "eligible",
"confidence": "within_budget",
"comparisonMode": "paired",
"unit": "seconds",
"baseline": 9.463,
"current": 9.252,
"delta": -0.21099999999999852,
"ratio": 0.9777026313008561,
"semanticImpactScore": 0,
"semanticImpactKind": "neutral",
"baselineSources": 5,
"currentSamples": 5,
"pairedSamples": 5,
"evidenceDeltaLower": -0.293,
"evidenceDeltaUpper": -0.038,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"probe": "genie_check_direct",
"probeLabel": "Genie check direct",
"status": 0,
"sampleCount": 11,
"warmupCount": 1,
"measuredSampleCount": 5,
"pairedSampleCount": 5,
"pairedOrderProtocol": "balanced-seeded-alternating-v1",
"pairedOrderSeed": "26190197600-1-c2867d97629668b0e2a3d8fe4fd25ab608445a7a",
"measurementProtocol": "devenv-perf-warm-median-v2",
"aggregation": "median",
"phase": "warm",
"devenvRev": "2cf62a010000b70f15c78a72761fad7c9e6fb47a",
"otelServiceName": "devenv-perf-ci"
}
},
{
"id": "task_genie_run",
"label": "Genie run task",
"group": "devenv / genie",
"status": "pass",
"direction": "unchanged",
"gateable": true,
"gateReason": "eligible",
"confidence": "noise_floor",
"comparisonMode": "paired",
"unit": "seconds",
"baseline": 1.507,
"current": 1.415,
"delta": -0.09199999999999986,
"ratio": 0.9389515593895157,
"semanticImpactScore": 0,
"semanticImpactKind": "neutral",
"baselineSources": 5,
"currentSamples": 5,
"pairedSamples": 5,
"evidenceDeltaLower": -0.015,
"evidenceDeltaUpper": -0.009,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"probe": "task_genie_run",
"probeLabel": "Genie run task",
"status": 0,
"sampleCount": 11,
"warmupCount": 1,
"measuredSampleCount": 5,
"pairedSampleCount": 5,
"pairedOrderProtocol": "balanced-seeded-alternating-v1",
"pairedOrderSeed": "26190197600-1-c2867d97629668b0e2a3d8fe4fd25ab608445a7a",
"measurementProtocol": "devenv-perf-warm-median-v2",
"aggregation": "median",
"phase": "warm",
"devenvRev": "2cf62a010000b70f15c78a72761fad7c9e6fb47a",
"otelServiceName": "devenv-perf-ci"
}
},
{
"id": "shell_eval_warm",
"label": "Warm shell eval",
"group": "devenv / devenv shell",
"status": "pass",
"direction": "unchanged",
"gateable": true,
"gateReason": "eligible",
"confidence": "noise_floor",
"comparisonMode": "paired",
"unit": "seconds",
"baseline": 6.011,
"current": 5.965,
"delta": -0.04600000000000026,
"ratio": 0.9923473631675261,
"semanticImpactScore": 0,
"semanticImpactKind": "neutral",
"baselineSources": 5,
"currentSamples": 5,
"pairedSamples": 5,
"evidenceDeltaLower": -0.164,
"evidenceDeltaUpper": -0.057,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"probe": "shell_eval_warm",
"probeLabel": "Warm shell eval",
"status": 0,
"sampleCount": 11,
"warmupCount": 1,
"measuredSampleCount": 5,
"pairedSampleCount": 5,
"pairedOrderProtocol": "balanced-seeded-alternating-v1",
"pairedOrderSeed": "26190197600-1-c2867d97629668b0e2a3d8fe4fd25ab608445a7a",
"measurementProtocol": "devenv-perf-warm-median-v2",
"aggregation": "median",
"phase": "warm",
"devenvRev": "2cf62a010000b70f15c78a72761fad7c9e6fb47a",
"otelServiceName": "devenv-perf-ci"
}
},
{
"id": "task_check_quick_warm",
"label": "Warm cached check:quick",
"group": "devenv / quality gates / check:quick",
"status": "pass",
"direction": "unchanged",
"gateable": true,
"gateReason": "eligible",
"confidence": "noise_floor",
"comparisonMode": "paired",
"unit": "seconds",
"baseline": 3.565,
"current": 3.523,
"delta": -0.041999999999999815,
"ratio": 0.9882187938288921,
"semanticImpactScore": 0,
"semanticImpactKind": "neutral",
"baselineSources": 5,
"currentSamples": 5,
"pairedSamples": 5,
"evidenceDeltaLower": -0.225,
"evidenceDeltaUpper": 0.113,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"workload": "cached-no-op",
"taskCacheMode": "warm",
"probe": "task_check_quick_warm",
"probeLabel": "Warm cached check:quick",
"status": 0,
"sampleCount": 11,
"warmupCount": 1,
"measuredSampleCount": 5,
"pairedSampleCount": 5,
"pairedOrderProtocol": "balanced-seeded-alternating-v1",
"pairedOrderSeed": "26190197600-1-c2867d97629668b0e2a3d8fe4fd25ab608445a7a",
"measurementProtocol": "devenv-perf-warm-median-v2",
"aggregation": "median",
"phase": "warm",
"devenvRev": "2cf62a010000b70f15c78a72761fad7c9e6fb47a",
"otelServiceName": "devenv-perf-ci"
}
},
{
"id": "task_check_quick_forced",
"label": "Forced check:quick",
"group": "devenv / quality gates / check:quick",
"status": "pass",
"direction": "unchanged",
"gateable": true,
"gateReason": "eligible",
"confidence": "noise_floor",
"comparisonMode": "paired",
"unit": "seconds",
"baseline": 8.281,
"current": 8.319,
"delta": 0.038000000000000256,
"ratio": 1.004588817775631,
"semanticImpactScore": 0,
"semanticImpactKind": "neutral",
"baselineSources": 3,
"currentSamples": 3,
"pairedSamples": 3,
"evidenceDeltaLower": -0.086,
"evidenceDeltaUpper": -0.014,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"workload": "forced-task-cache",
"taskCacheMode": "refresh",
"probe": "task_check_quick_forced",
"probeLabel": "Forced check:quick",
"status": 0,
"sampleCount": 6,
"warmupCount": 0,
"measuredSampleCount": 3,
"pairedSampleCount": 3,
"pairedOrderProtocol": "balanced-seeded-alternating-v1",
"pairedOrderSeed": "26190197600-1-c2867d97629668b0e2a3d8fe4fd25ab608445a7a",
"measurementProtocol": "devenv-perf-warm-median-v2",
"aggregation": "median",
"phase": "warm",
"devenvRev": "2cf62a010000b70f15c78a72761fad7c9e6fb47a",
"otelServiceName": "devenv-perf-ci"
}
},
{
"id": "task_pnpm_install",
"label": "pnpm install task",
"group": "devenv / workspace setup",
"status": "pass",
"direction": "unchanged",
"gateable": true,
"gateReason": "eligible",
"confidence": "noise_floor",
"comparisonMode": "paired",
"unit": "seconds",
"baseline": 0.709,
"current": 0.691,
"delta": -0.018000000000000016,
"ratio": 0.9746121297602256,
"semanticImpactScore": 0,
"semanticImpactKind": "neutral",
"baselineSources": 5,
"currentSamples": 5,
"pairedSamples": 5,
"evidenceDeltaLower": -0.031,
"evidenceDeltaUpper": 0.009,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"probe": "task_pnpm_install",
"probeLabel": "pnpm install task",
"status": 0,
"sampleCount": 11,
"warmupCount": 1,
"measuredSampleCount": 5,
"pairedSampleCount": 5,
"pairedOrderProtocol": "balanced-seeded-alternating-v1",
"pairedOrderSeed": "26190197600-1-c2867d97629668b0e2a3d8fe4fd25ab608445a7a",
"measurementProtocol": "devenv-perf-warm-median-v2",
"aggregation": "median",
"phase": "warm",
"devenvRev": "2cf62a010000b70f15c78a72761fad7c9e6fb47a",
"otelServiceName": "devenv-perf-ci"
}
},
{
"id": "tasks_list",
"label": "devenv tasks list",
"group": "devenv / devenv cli",
"status": "pass",
"direction": "unchanged",
"gateable": true,
"gateReason": "eligible",
"confidence": "noise_floor",
"comparisonMode": "paired",
"unit": "seconds",
"baseline": 0.053,
"current": 0.054,
"delta": 0.0010000000000000009,
"ratio": 1.0188679245283019,
"semanticImpactScore": 0,
"semanticImpactKind": "neutral",
"baselineSources": 9,
"currentSamples": 9,
"pairedSamples": 9,
"evidenceDeltaLower": 0,
"evidenceDeltaUpper": 0.002,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"probe": "tasks_list",
"probeLabel": "devenv tasks list",
"status": 0,
"sampleCount": 19,
"warmupCount": 1,
"measuredSampleCount": 9,
"pairedSampleCount": 9,
"pairedOrderProtocol": "balanced-seeded-alternating-v1",
"pairedOrderSeed": "26190197600-1-c2867d97629668b0e2a3d8fe4fd25ab608445a7a",
"measurementProtocol": "devenv-perf-warm-median-v2",
"aggregation": "median",
"phase": "warm",
"devenvRev": "2cf62a010000b70f15c78a72761fad7c9e6fb47a",
"otelServiceName": "devenv-perf-ci"
}
},
{
"id": "processes_help",
"label": "devenv processes --help",
"group": "devenv / devenv cli",
"status": "pass",
"direction": "unchanged",
"gateable": true,
"gateReason": "eligible",
"confidence": "noise_floor",
"comparisonMode": "paired",
"unit": "seconds",
"baseline": 0.022,
"current": 0.022,
"delta": 0,
"ratio": 1,
"semanticImpactScore": 0,
"semanticImpactKind": "neutral",
"baselineSources": 9,
"currentSamples": 9,
"pairedSamples": 9,
"evidenceDeltaLower": -0.001,
"evidenceDeltaUpper": 0,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"probe": "processes_help",
"probeLabel": "devenv processes --help",
"status": 0,
"sampleCount": 19,
"warmupCount": 1,
"measuredSampleCount": 9,
"pairedSampleCount": 9,
"pairedOrderProtocol": "balanced-seeded-alternating-v1",
"pairedOrderSeed": "26190197600-1-c2867d97629668b0e2a3d8fe4fd25ab608445a7a",
"measurementProtocol": "devenv-perf-warm-median-v2",
"aggregation": "median",
"phase": "warm",
"devenvRev": "2cf62a010000b70f15c78a72761fad7c9e6fb47a",
"otelServiceName": "devenv-perf-ci"
}
},
{
"id": "source.files",
"label": "Genie CI workflow helpers files",
"group": "source / effect-utils / genie / ci-workflow / source / ci",
"status": "pass",
"direction": "unchanged",
"gateable": false,
"gateReason": "disabled",
"confidence": "diagnostic",
"comparisonMode": "budget",
"unit": "count",
"baseline": 7,
"current": 7,
"delta": 0,
"ratio": 1,
"semanticImpactScore": null,
"semanticImpactKind": "diagnostic",
"baselineSources": 1,
"currentSamples": 7,
"pairedSamples": 0,
"evidenceDeltaLower": -1,
"evidenceDeltaUpper": 1,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"scope": "genie_ci_workflow"
}
},
{
"id": "source.files",
"label": "Genie runtime files",
"group": "source / effect-utils / packages / genie / source / genie",
"status": "pass",
"direction": "unchanged",
"gateable": false,
"gateReason": "disabled",
"confidence": "diagnostic",
"comparisonMode": "budget",
"unit": "count",
"baseline": 61,
"current": 61,
"delta": 0,
"ratio": 1,
"semanticImpactScore": null,
"semanticImpactKind": "diagnostic",
"baselineSources": 1,
"currentSamples": 61,
"pairedSamples": 0,
"evidenceDeltaLower": -6.1000000000000005,
"evidenceDeltaUpper": 6.1000000000000005,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"scope": "genie_runtime"
}
},
{
"id": "source.files",
"label": "Nix workspace tools files",
"group": "source / effect-utils / nix / workspace-tools / source / nix",
"status": "pass",
"direction": "unchanged",
"gateable": false,
"gateReason": "disabled",
"confidence": "diagnostic",
"comparisonMode": "budget",
"unit": "count",
"baseline": 13,
"current": 13,
"delta": 0,
"ratio": 1,
"semanticImpactScore": null,
"semanticImpactKind": "diagnostic",
"baselineSources": 1,
"currentSamples": 13,
"pairedSamples": 0,
"evidenceDeltaLower": -1.3,
"evidenceDeltaUpper": 1.3,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"scope": "nix_workspace_tools"
}
},
{
"id": "source.lines",
"label": "Nix workspace tools lines",
"group": "source / effect-utils / nix / workspace-tools / source / nix",
"status": "pass",
"direction": "unchanged",
"gateable": false,
"gateReason": "disabled",
"confidence": "diagnostic",
"comparisonMode": "budget",
"unit": "lines",
"baseline": 3237,
"current": 3237,
"delta": 0,
"ratio": 1,
"semanticImpactScore": null,
"semanticImpactKind": "diagnostic",
"baselineSources": 1,
"currentSamples": 13,
"pairedSamples": 0,
"evidenceDeltaLower": -323.70000000000005,
"evidenceDeltaUpper": 323.70000000000005,
"pairedEvidenceQuantile": 0.25,
"dimensions": {
"scope": "nix_workspace_tools"
}
},
{
"id": "nix.closure.bucket.nar_size",
"label": "Nix sources closure size",
"group": "nix / closures / packages / genie / buckets / nix-sources / nix closure buckets",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 0,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "nix-sources"
}
},
{
"id": "nix.closure.bucket.nar_size",
"label": "Nix sources closure size",
"group": "nix / closures / packages / megarepo / buckets / nix-sources / nix closure buckets",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 0,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "nix-sources"
}
},
{
"id": "nix.closure.bucket.nar_size",
"label": "Nix sources closure size",
"group": "nix / closures / packages / oxlint-npm / buckets / nix-sources / nix closure buckets",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 0,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "nix-sources"
}
},
{
"id": "nix.closure.bucket.nar_size",
"label": "Node / pnpm closure size",
"group": "nix / closures / packages / genie / buckets / node / nix closure buckets",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 0,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "node"
}
},
{
"id": "nix.closure.bucket.nar_size",
"label": "Node / pnpm closure size",
"group": "nix / closures / packages / megarepo / buckets / node / nix closure buckets",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 0,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "node"
}
},
{
"id": "nix.closure.bucket.nar_size",
"label": "Node / pnpm closure size",
"group": "nix / closures / packages / oxlint-npm / buckets / node / nix closure buckets",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 0,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "node"
}
},
{
"id": "nix.closure.bucket.nar_size",
"label": "Rust closure size",
"group": "nix / closures / packages / genie / buckets / rust / nix closure buckets",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 0,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "rust"
}
},
{
"id": "nix.closure.bucket.nar_size",
"label": "Rust closure size",
"group": "nix / closures / packages / megarepo / buckets / rust / nix closure buckets",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 0,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "rust"
}
},
{
"id": "nix.closure.bucket.nar_size",
"label": "Rust closure size",
"group": "nix / closures / packages / oxlint-npm / buckets / rust / nix closure buckets",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 0,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "rust"
}
},
{
"id": "shell_eval_traced",
"label": "Shell eval with OTEL trace",
"group": "devenv / devenv shell",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "historical",
"unit": "seconds",
"baseline": null,
"current": 96.086,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"probe": "shell_eval_traced",
"probeLabel": "Shell eval with OTEL trace",
"status": 0,
"sampleCount": 2,
"warmupCount": 0,
"measuredSampleCount": 1,
"pairedSampleCount": 1,
"pairedOrderProtocol": "balanced-seeded-alternating-v1",
"pairedOrderSeed": "26190197600-1-c2867d97629668b0e2a3d8fe4fd25ab608445a7a",
"measurementProtocol": "devenv-perf-warm-median-v2",
"aggregation": "median",
"phase": "warm",
"devenvRev": "2cf62a010000b70f15c78a72761fad7c9e6fb47a",
"otelServiceName": "devenv-perf-ci"
}
},
{
"id": "nix.closure.path_count",
"label": "Total closure path count",
"group": "nix / closures / packages / genie / total / path-count / nix closure",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "count",
"baseline": null,
"current": 80,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "total"
}
},
{
"id": "nix.closure.path_count",
"label": "Total closure path count",
"group": "nix / closures / packages / megarepo / total / path-count / nix closure",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "count",
"baseline": null,
"current": 5,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "total"
}
},
{
"id": "nix.closure.path_count",
"label": "Total closure path count",
"group": "nix / closures / packages / oxlint-npm / total / path-count / nix closure",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "count",
"baseline": null,
"current": 8,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "total"
}
},
{
"id": "nix.closure.nar_size",
"label": "Total closure size",
"group": "nix / closures / packages / genie / total / nar-size / nix closure",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 533018624,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "total"
}
},
{
"id": "nix.closure.nar_size",
"label": "Total closure size",
"group": "nix / closures / packages / megarepo / total / nar-size / nix closure",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 148820792,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "total"
}
},
{
"id": "nix.closure.nar_size",
"label": "Total closure size",
"group": "nix / closures / packages / oxlint-npm / total / nar-size / nix closure",
"status": "missing_baseline",
"direction": "unknown",
"gateable": false,
"gateReason": "missing_baseline",
"confidence": "missing_baseline",
"comparisonMode": "budget",
"unit": "bytes",
"baseline": null,
"current": 161363816,
"delta": null,
"ratio": null,
"semanticImpactScore": null,
"semanticImpactKind": null,
"baselineSources": 0,
"currentSamples": 1,
"pairedSamples": null,
"evidenceDeltaLower": null,
"evidenceDeltaUpper": null,
"pairedEvidenceQuantile": null,
"dimensions": {
"bucket": "total"
}
}
]
}
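The `delta` and `ratio` fields in the rows above appear to follow `delta = current - baseline` and `ratio = current / baseline`. That relationship is inferred from the reported numbers, not from the helper's source. A minimal sketch that recomputes both for two of the rows (the function name is hypothetical):

```shell
# Recompute delta and ratio for one measurement row.
# Assumption: delta = current - baseline, ratio = current / baseline.
# Rows with a null baseline (status "missing_baseline") are not handled here.
measurement_delta_ratio() {
  awk -v b="$1" -v c="$2" 'BEGIN { printf "delta=%.3f ratio=%.4f\n", c - b, c / b }'
}

# "Genie CI workflow helpers lines": baseline 4432, current 6633
measurement_delta_ratio 4432 6633
# "Genie check direct": baseline 9.463 s, current 9.252 s
measurement_delta_ratio 9.463 9.252
```

Both results match the reported values after rounding (2201 and about 1.4966; -0.211 and about 0.9777).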
Problem
The CI measurement report helper used `gh run download` for historical baseline artifacts. In the dotfiles production PR this made the report spend several minutes in a single baseline download step, so the intended timeout did not give a tight bound.
Goal
Make baseline artifact downloads bounded and reusable across repos, so CI measurement comments can be trusted to finish promptly even when a historical artifact endpoint is slow.
Decisions
The helper now keeps GitHub CLI for run/artifact discovery, but downloads the selected artifact archive directly through the GitHub artifact API with `curl --connect-timeout` and `--max-time`, then extracts it with `unzip`. This avoids the opaque `gh run download` transfer path while preserving the existing seed-run, candidate-history, and required-observation logic.
Verification
- `devenv tasks run genie:run --mode before --no-tui --show-output`
- `devenv tasks run ts:check lint:check:genie --mode before --no-tui --show-output`
- `source-shape` artifact: downloaded `actions/artifacts/<id>/zip` with curl, unzipped it, and verified `current/dotfiles/measurements.json` with 8 observations.
Complexity
Adds no new abstraction. It swaps the download implementation to standard `curl`/`unzip` tooling with explicit network timeouts.
Concerns
The helper now depends on `curl` and `unzip`; if absent, it resolves them via Nix and skips baseline download only if resolution fails.
Friction & Bottlenecks
The production dotfiles report showed `gh run download` taking about seven minutes for one baseline artifact candidate despite the intended 120-second bound. This PR addresses that bottleneck in the shared helper.
Follow-ups
Repin dotfiles to this PR and re-run the production measurement report to verify the bounded path end to end.
References
Follow-up to #658.
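For readers who want a concrete picture of the bounded path described under Decisions and Concerns, here is a minimal sketch. It is not the helper's actual code. The function names, the `CONNECT_TIMEOUT`/`MAX_TIME`/`GH_TOKEN` variables, the 10-second connect default, and the `nix shell` lookup are all assumptions made for illustration. Only the 120-second overall bound comes from the description above.

```shell
# Sketch only: names, defaults, and the Nix lookup are illustrative assumptions.

# Hypothetical Nix fallback: ask `nix shell` where the tool lives.
nix_resolve() {
  command -v nix >/dev/null 2>&1 || return 1
  nix shell "nixpkgs#$1" --command sh -c 'command -v "$1"' sh "$1" 2>/dev/null
}

# Print a usable path for a tool, preferring whatever is already on PATH.
resolve_tool() {
  command -v "$1" 2>/dev/null || nix_resolve "$1"
}

# Download one artifact archive with hard network bounds, then extract it.
# Returns non-zero (so the caller can skip the baseline) on any failure.
download_artifact_zip() {
  local url="$1" dest_dir="$2"
  local curl_bin unzip_bin archive
  curl_bin="$(resolve_tool curl)" || { echo "skipping baseline download: curl unavailable" >&2; return 1; }
  unzip_bin="$(resolve_tool unzip)" || { echo "skipping baseline download: unzip unavailable" >&2; return 1; }
  archive="$(mktemp)" || return 1

  # --connect-timeout bounds connection setup; --max-time bounds the whole transfer.
  # --fail turns HTTP errors into a non-zero exit; --location follows redirects.
  set -- --fail --silent --show-error --location \
    --connect-timeout "${CONNECT_TIMEOUT:-10}" --max-time "${MAX_TIME:-120}"
  # Only send credentials when a token is provided (assumed env var name).
  [ -n "${GH_TOKEN:-}" ] && set -- "$@" --header "Authorization: Bearer $GH_TOKEN"

  if ! "$curl_bin" "$@" --output "$archive" "$url"; then
    rm -f "$archive"
    return 1
  fi
  mkdir -p "$dest_dir" && "$unzip_bin" -q -o "$archive" -d "$dest_dir"
  local status=$?
  rm -f "$archive"
  return "$status"
}
```

A call might look like `download_artifact_zip "https://api.github.com/repos/OWNER/REPO/actions/artifacts/ARTIFACT_ID/zip" ./baseline`, where `OWNER`, `REPO`, and `ARTIFACT_ID` are placeholders. Because `--max-time` covers the whole transfer, a slow artifact endpoint fails the download after the configured bound instead of stalling the report.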
Posted on behalf of @schickling