ci(rag-gate): skip green-with-warning when the eval key is rejected #69

Merged
silversurfer562 merged 1 commit into main from chore/rag-gate-invalid-key-skip
Jun 17, 2026

@silversurfer562

Member

Problem

The weekly scheduled RAG Gate has been failing red (e.g. run on 2026-06-15) with:

[1/10] answer q2|sonnet ... FAIL (401 - authentication_error: 'invalid x-api-key')
Process completed with exit code 2

The corpus rebaseline fix has already merged (#51), so this is not a corpus or code regression: ATTUNE_CI_EVAL_KEY is set but rejected (expired, revoked, or wrong value). The gate skips gracefully when the key is absent, but a present-but-rejected key fell through to the full eval and hard-failed on the first call, producing a recurring red scheduled run with no actionable PR.

Fix

Add a zero-token auth preflight (GET /v1/models) inside the existing keycheck step. A 401/403 is treated exactly like a missing key: has_key=false → skip + a ::warning:: telling the maintainer to rotate the secret. Transient codes (429/5xx/network → 000) fall through so a genuine eval still runs and surfaces them.

  • No downstream steps change — they still gate on has_key.
  • Verified locally: GET /v1/models returns 401 on a bogus key, zero token cost.
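
For illustration, the preflight decision described above could be sketched roughly as follows. This is not the merged workflow code. The function names (`classify_status`, `probe_status`, `keycheck`), the use of `curl`, the `api.anthropic.com` host, the `anthropic-version` header value, and the 10-second timeout are all assumptions made for the sketch. Only the `GET /v1/models` call and the 401/403-versus-everything-else split come from this PR.

```shell
# Hypothetical sketch of the keycheck decision; all names are illustrative.

# Map the preflight HTTP status to a has_key value.
# 401/403 -> key is rejected, treat it like a missing key.
# Anything else (200, 429, 5xx, 000 for network errors) -> let the eval run.
classify_status() {
  case "$1" in
    401|403) echo "false" ;;
    *)       echo "true" ;;
  esac
}

# Send a zero-token request and print only the HTTP status code.
# With -w '%{http_code}', curl prints 000 when no response was received;
# '|| true' keeps a network failure from aborting the step.
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    -H "x-api-key: $1" \
    -H "anthropic-version: 2023-06-01" \
    https://api.anthropic.com/v1/models || true
}

# Decide has_key for a key value; an empty key is "false" without probing.
keycheck() {
  key="$1"
  if [ -z "$key" ]; then
    echo "false"
    return
  fi
  classify_status "$(probe_status "$key")"
}
```

In a workflow step this might be called as `keycheck "$ATTUNE_CI_EVAL_KEY"`. Only 401 and 403 map to a skip, so rate limits and outages still reach the real eval and are reported there rather than being hidden behind a green skip.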

Note for the maintainer

This stops the red; it does not make the gate run. To restore live drift detection, rotate ATTUNE_CI_EVAL_KEY (repo → Settings → Secrets → Actions). The full-25 re-baseline (make eval) and q39 template regen remain a separate key-gated handoff under specs/rag-gate-accuracy-baseline.

🤖 Generated with Claude Code

The gate already skips gracefully when no eval key is set, but a key that
IS set yet rejected by Anthropic (expired/revoked/wrong value) fell through
to the full eval and hard-failed red on the first 401 — turning the weekly
scheduled run red with no actionable PR (a secret-ops gap, not a regression).

Add a zero-token auth preflight (GET /v1/models) inside the keycheck step:
a 401/403 is treated like a missing key (has_key=false → skip + ::warning::
"rotate ATTUNE_CI_EVAL_KEY"). Transient codes (429/5xx/network → 000) fall
through so the real eval still surfaces them. No downstream steps change —
they continue to gate on has_key.
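
A rough sketch of how the step might publish that result, assuming the standard GitHub Actions `GITHUB_OUTPUT` file and `::warning::` workflow command. The helper name `emit_has_key` and the exact warning text are hypothetical, not taken from the commit.

```shell
# Hypothetical helper: record the keycheck result for later steps.
# GITHUB_OUTPUT is the file GitHub Actions provides for step outputs.
emit_has_key() {
  has_key="$1"
  echo "has_key=${has_key}" >> "$GITHUB_OUTPUT"
  if [ "$has_key" = "false" ]; then
    # Workflow command: rendered as a warning annotation on the run.
    # It does not change the exit status, so the job stays green.
    echo "::warning::Eval key missing or rejected; rotate ATTUNE_CI_EVAL_KEY"
  fi
}
```

Downstream steps would then keep a condition along the lines of `if: steps.keycheck.outputs.has_key == 'true'` (the step id `keycheck` is assumed here), which is why none of them need to change.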

Verified GET /v1/models returns 401 on a bogus key (no token spend).

Refs specs/rag-gate-accuracy-baseline (the corpus rebaseline itself merged
in #51; this hardens the gate against the invalid-key red runs seen since).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 17, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.

@silversurfer562 silversurfer562 merged commit a771c6c into main Jun 17, 2026
13 checks passed