Skip to content

Validate quality-gate severity-marker regex against anthropics/claude-code-action actual output #183

@cbeaulieu-gt

Description

@cbeaulieu-gt

Summary

The quality-gate's severity-marker grep was authored against assumed marker formats but never validated against the actual output of anthropics/claude-code-action. If the upstream prompt template phrases findings differently, the gate misses every Critical/High finding silently.

Current pattern

BLOCKER_HITS=$(printf '%s' "$BODY" | grep -E -c '🔴 Critical|Critical \(BLOCKING\)|🟡 High-Priority|MAJOR|BLOCKING' || true)

What we don't know

  • Whether anthropics/claude-code-action defaults emit any of these exact strings
  • Whether the action's prompt has changed since the gate was written
  • Whether plain prose like **Critical:**, Severity: Critical, or ⚠️ Critical (different emoji) gets used in real reviews
  • Whether casing varies (major vs MAJOR)
  • Whether language drift in subsequent versions of the action will silently break the gate

Why it matters

The commit message for #179 acknowledges the regex is "intentionally fragile." That's an acceptable design choice if we know the patterns actually fire on real findings. Right now, we don't.

Acceptance criteria

  • Inspect the current default review-prompt template in anthropics/claude-code-action (read the action source / prompt file) and document which exact severity strings it emits

  • Sample at least 5 recent real claude-pr-review comments across this repo (and any other consumer) and grep them with the current pattern; record hit/miss per comment

  • If the pattern misses any real Critical/High findings, expand the regex to cover them (favor over-matching to under-matching for v1)

  • Pin the validated upstream version with a comment in pr-review/action.yml:

    # Severity-marker pattern validated against
    # anthropics/claude-code-action@<sha-or-tag> as of <YYYY-MM-DD>.
    # If the upstream prompt format changes, re-validate.
    
  • Add a fixture-based test that runs the regex against captured real review-comment bodies

Related

Long-term fix is a structured-output contract instead of grep — tracked as a separate follow-up.

Refs #176 #179

🤖 Generated by Claude Code on behalf of @cbeaulieu-gt

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions