ci(rag-gate): skip green-with-warning when the eval key is rejected #69
Merged
The gate already skips gracefully when no eval key is set, but a key that IS set yet rejected by Anthropic (expired/revoked/wrong value) fell through to the full eval and hard-failed red on the first 401, turning the weekly scheduled run red with no actionable PR (a secret-ops gap, not a regression).

Add a zero-token auth preflight (GET /v1/models) inside the keycheck step: a 401/403 is treated like a missing key (has_key=false → skip + ::warning:: "rotate ATTUNE_CI_EVAL_KEY"). Transient codes (429/5xx/network → 000) fall through so the real eval still surfaces them. No downstream steps change; they continue to gate on has_key. Verified GET /v1/models returns 401 on a bogus key (no token spend).

Refs specs/rag-gate-accuracy-baseline (the corpus rebaseline itself merged in #51; this hardens the gate against the invalid-key red runs seen since).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
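
For readers who want to see the shape of the preflight, here is a minimal sketch of the status-code policy described above. It is not the workflow's actual script. The function names (`probe_status`, `classify_status`, `record_keycheck`), the timeout, the warning wording, and the request headers are assumptions for illustration. Only the endpoint, the `has_key` output, and the 401/403 versus transient split come from this PR. The sketch also assumes the existing missing-key check has already run, so the key is non-empty.

```shell
#!/bin/sh
# Sketch only: helper names, timeout, and headers are hypothetical.

# Send a zero-token request and print the HTTP status code.
# curl's %{http_code} is 000 when no HTTP response arrives (DNS, TLS, timeout).
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 15 \
    -H "x-api-key: $1" \
    -H "anthropic-version: 2023-06-01" \
    "https://api.anthropic.com/v1/models" 2>/dev/null || true
}

# Map a status code to the has_key value the downstream steps gate on.
# 401/403: key rejected, so treat it like a missing key.
# Anything else (200, 429, 5xx, 000): let the real eval run and surface it.
classify_status() {
  case "$1" in
    401|403) echo "false" ;;
    *)       echo "true" ;;
  esac
}

# Print the warning on stdout, where Actions parses workflow commands, and
# append has_key to the given output file ($GITHUB_OUTPUT in a real run).
record_keycheck() {
  status=$1
  out_file=$2
  has_key=$(classify_status "$status")
  if [ "$has_key" = "false" ]; then
    echo "::warning::Eval key rejected (HTTP $status); rotate ATTUNE_CI_EVAL_KEY"
  fi
  echo "has_key=$has_key" >> "$out_file"
}
```

Inside the keycheck step this could be wired up as `record_keycheck "$(probe_status "$ATTUNE_CI_EVAL_KEY")" "$GITHUB_OUTPUT"`. Keeping the classification in its own function means the policy can be checked without touching the network.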
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Problem
The weekly scheduled RAG Gate has been failing red (e.g. run on 2026-06-15) with:
The corpus rebaseline fix already merged (#51), so this is not a corpus/code regression: ATTUNE_CI_EVAL_KEY is set but rejected (expired/revoked/wrong value). The gate skips gracefully when a key is absent, but a present-but-rejected key fell through to the full eval and hard-failed on the first call, producing a recurring red scheduled run with no actionable PR.
Fix
Add a zero-token auth preflight (GET /v1/models) inside the existing keycheck step. A 401/403 is treated exactly like a missing key: has_key=false → skip + a ::warning:: telling the maintainer to rotate the secret. Transient codes (429/5xx/network → 000) fall through so a genuine eval still runs and surfaces them.
No downstream steps change; they continue to gate on has_key.
Verified: GET /v1/models returns 401 on a bogus key, zero token cost.
Note for the maintainer
This stops the red; it does not make the gate run. To restore live drift detection, rotate ATTUNE_CI_EVAL_KEY (repo → Settings → Secrets → Actions). The full-25 re-baseline (make eval) and q39 template regen remain a separate key-gated handoff under specs/rag-gate-accuracy-baseline.
🤖 Generated with Claude Code