feat(triage+eval): hard-disqualifier filter + comp floor gate + Block C why-this-fits #333

Merged
mitwilli-create merged 1 commit into main from feat/triage-eval-quality-gate-2026-05-29-claude-10f709f7 on May 29, 2026
@mitwilli-create
Owner

Summary

Closes the recurring dropped ask documented at data/spec-triage-eval-quality-gate-2026-05-29.md. Three independent fixes:

  1. NEW lib/triage-hard-disqualifier-filter.mjs — deterministic post-triage regex gate; overrides Haiku verdict to SKIP when JD trips hard-disqualifier rules (Python-production-engineering, former PM/EM, mandatory leetcode, etc.) with archetype-aware exemptions.
  2. NEW lib/comp-floor-gate.mjs — post-eval comp extraction; demotes rows below floor (sourced from config/profile.yml::comp_floor, falling back to hardcoded $160K/$180K/$216K/$220K matching batch/triage-prompt.md).
  3. NEW ## Block C — Why This Fits Mitchell in eval reports — 3-4 sentences, every claim cites a specific corpus location ([cv.md: <section>], [article-digest: <entry>], [second-brain: <doc>], [hm-intel: <field>]), downscored if thin corpus support.
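
A minimal sketch of how a deterministic post-triage gate of this kind could work. It is not the contents of lib/triage-hard-disqualifier-filter.mjs: the rule ids, the regexes, the `decision` field, and the `applyHardDisqualifiersSketch` name are all illustrative assumptions. The `A2a` archetype id appears in this PR's commit message, but what it denotes is assumed here.

```javascript
// Hypothetical rule table: ids and patterns are illustrative, not the shipped rules.
const RULES = [
  {
    id: 'mandatory-leetcode',
    pattern: /\b(leetcode|live coding (round|interview))\b/i,
    exemptArchetypes: [],
  },
  {
    id: 'python-production-engineering',
    pattern: /\bproduction[- ]grade python\b/i,
    exemptArchetypes: ['A2a'], // assumed meaning: archetype that tolerates this rule
  },
];

// Takes the LLM verdict and returns it unchanged unless a rule matches the JD text.
// A matching rule forces decision = 'SKIP' and records which rule fired.
function applyHardDisqualifiersSketch(verdict, jdText, archetype) {
  for (const rule of RULES) {
    if (rule.exemptArchetypes.includes(archetype)) continue;
    if (rule.pattern.test(jdText)) {
      return { ...verdict, decision: 'SKIP', overriddenBy: rule.id };
    }
  }
  return verdict;
}

// Usage: the regex match wins regardless of what the model scored.
const haikuVerdict = { decision: 'ADVANCE', score: 4.2 };
console.log(
  applyHardDisqualifiersSketch(haikuVerdict, 'Loop includes a LeetCode round.', 'B1')
);
```

The point of the design is that the model's verdict is an input rather than the final word: a hard rule cannot be softened by a high score.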

Plus override mechanism (data/triage-overrides.json), env kill switches (DISABLE_HARD_DISQUALIFIER_FILTER, DISABLE_COMP_FLOOR_GATE), backfill audit script (scripts/audit-current-queue-against-new-gates.mjs), 3 new test files (35 pass + 1 expected skip), AGENTS.md bug-class entry for llm-judge-soft-enforcement-of-hard-rules, CLAUDE.md Session Notes.

Pre-fix snapshot (current queue, 2026-05-29)

=== Gap A — score distribution ===
Total ranked rows: 21
scores <4.0: 0
scores 4.0-4.4: 15
scores >=4.5: 6
min: 4 median: 4.3 max: 4.65

grep -l 'Block C' reports/*.md → 28 reports, but all matches are prose mentions inside body text, not section headers. Zero reports contain ## Block C — Why This Fits Mitchell as a section.

grep -E 'comp_floor|comp floor|salary_floor' config/profile.yml → file doesn't exist on Mitchell's filesystem (gitignored at .gitignore:192). Same for modes/_profile.md (.gitignore:194). System runs off batch/triage-prompt.md:25 defaults.

Post-fix backfill audit (deterministic, $0)

=== Backfill audit — apply-now-queue × new gates ===
Spec: data/spec-triage-eval-quality-gate-2026-05-29.md
Total ranked rows scanned: 21

  🛑 Would SKIP via hard-disqualifier:  0
  ⬇️  Would DEMOTE for comp-below-floor: 2
  ❓ Comp UNKNOWN (no auto-demote):     8
  ✅ Would PASS all gates:               11

--- Comp-below-floor DEMOTE candidates (review for override) ---
  #2049 Ramp (4.1) — AI Operations Specialist — Agentic Workflows  [sf floor $216k, found $195k]
  #2286 Arize AI (4.05) — AI Product Manager  [sf floor $216k, found $170k]

The 2 DEMOTE candidates validate Mitchell's concern that rows with below-floor compensation were reaching the queue. Both will be demoted on the next batch eval cycle unless added to data/triage-overrides.json::overrides.
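
The comparison behind audit lines like the two above can be sketched as follows. This is an illustration under stated assumptions, not lib/comp-floor-gate.mjs: the function names are hypothetical, the single regex covers only two of the formats a real extractor would need, and taking the top of a salary range is an assumed policy. Only the sf floor of $216k is confirmed by the audit output; the other location pairings are inferred from the order in which the default values are listed.

```javascript
// Default floors in USD. Only sf = 216000 is confirmed by the audit output above;
// the remaining pairings are an assumption based on listing order.
const DEFAULT_FLOORS = { remote: 160000, seattle: 180000, sf: 216000, nyc: 220000 };

// Finds dollar figures written like "$195k" or "$195,000" and returns the largest,
// or null when no figure is present.
function extractCompUsd(text) {
  const pattern = /\$(\d{2,3})(?:,(\d{3})|k)\b/gi;
  const values = [...text.matchAll(pattern)].map((m) =>
    m[2] !== undefined ? Number(m[1] + m[2]) : Number(m[1]) * 1000
  );
  return values.length > 0 ? Math.max(...values) : null;
}

// Classifies a report as PASS, BELOW_FLOOR, or UNKNOWN for a given location key.
function gateCompFloorSketch(reportText, location, floors = DEFAULT_FLOORS) {
  const floor = floors[location];
  const found = extractCompUsd(reportText);
  if (found === null || floor === undefined) {
    return { status: 'UNKNOWN', floor, found };
  }
  return { status: found < floor ? 'BELOW_FLOOR' : 'PASS', floor, found };
}

// Usage with hypothetical report text.
console.log(gateCompFloorSketch('Base salary: $195k', 'sf'));
```

Returning UNKNOWN instead of guessing mirrors the audit's "no auto-demote" bucket: a missing figure is left for human review.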

Hard-disqualifier shows 0 against the current queue because the rule patterns are tuned for full JD text, while the audit fell back to report bodies, which contain only JD excerpts (in the CV Match section). The rules will fire on fresh JDs at triage time. The backfill audit is therefore conservative: it can miss disqualifiers (false negatives) but does not flag rows incorrectly (false positives).

Files changed (13 total)

| Kind | File | Why |
| --- | --- | --- |
| NEW lib | lib/triage-hard-disqualifier-filter.mjs | 6-rule deterministic post-pass + override + env kill |
| NEW lib | lib/comp-floor-gate.mjs | comp extraction + location floor + override + env kill |
| NEW test | tests/triage-hard-disqualifier.test.mjs | 17 tests |
| NEW test | tests/comp-floor-gate.test.mjs | 17 tests |
| NEW test | tests/eval-report-block-c-invariant.test.mjs | INVARIANT; skips until first post-PR fresh report |
| NEW data | data/triage-overrides.json | empty { "overrides": [] }; append-only audit trail |
| NEW script | scripts/audit-current-queue-against-new-gates.mjs | backfill audit; deterministic / $0 |
| MOD | triage.mjs | wire applyHardDisqualifiers between Haiku return + writeAdvance |
| MOD | batch-runner-batches.mjs | wire gateCompFloor after report write + location extractor |
| MOD | batch/batch-prompt.md | add Block C — Why This Fits Mitchell rubric + output template |
| MOD | config/profile.example.yml | add comp_floor: documentation block under compensation: |
| MOD | AGENTS.md | new bug class llm-judge-soft-enforcement-of-hard-rules |
| MOD | CLAUDE.md | Session Notes for 2026-05-29 ship |

Override mechanism

data/triage-overrides.json is the append-only override file consumed by both gates:

{
  "overrides": [
    { "rowNum": 2049, "gate": "comp-floor", "reason": "pre-IPO equity heavy", "addedAt": "2026-05-29" }
  ]
}

Each entry: { rowNum, gate: 'hard-disqualifier' | 'comp-floor', reason, addedAt, status?: 'active'|'revoked' }. Never delete entries — flip status: revoked to revert. Tests assert gate-specific override scoping.
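
A sketch of a gate-scoped lookup over that file shape. The `hasActiveOverride` name is hypothetical, and treating a missing `status` as active is an assumption drawn from the example entry above, which omits the field.

```javascript
// Returns true only when an entry matches both the row and the specific gate,
// and has not been revoked. A missing status is treated as 'active' (assumption).
function hasActiveOverride(overridesFile, rowNum, gate) {
  const entries = Array.isArray(overridesFile?.overrides) ? overridesFile.overrides : [];
  return entries.some(
    (entry) =>
      entry.rowNum === rowNum &&
      entry.gate === gate &&
      (entry.status ?? 'active') === 'active'
  );
}

// Usage: an override for the comp-floor gate does not leak into the other gate.
const overridesFile = {
  overrides: [
    { rowNum: 2049, gate: 'comp-floor', reason: 'pre-IPO equity heavy', addedAt: '2026-05-29' },
  ],
};
console.log(hasActiveOverride(overridesFile, 2049, 'comp-floor'));
console.log(hasActiveOverride(overridesFile, 2049, 'hard-disqualifier'));
```

Matching on `gate` as well as `rowNum` is what keeps an override for one gate from silently waiving the other.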

Kill switches (rollback paths)

# Disable hard-disqualifier filter (Haiku verdict wins)
launchctl setenv DISABLE_HARD_DISQUALIFIER_FILTER true

# Disable comp floor gate (no demotion)
launchctl setenv DISABLE_COMP_FLOOR_GATE true

# Hard rollback if PR already on main
git revert <merge-commit-sha> && git push origin main
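
For reference, a gate might read these variables roughly as below. The helper names are hypothetical, and accepting only the string "true" (case-insensitive) is an assumption; the commands above show no other value. Processes that were already running when the variable was set would typically need a restart to pick it up.

```javascript
// Assumption: only the literal string "true" (any case, surrounding spaces ignored)
// disables a gate; unset or any other value leaves it enabled.
function isGateDisabled(envVarName, env = process.env) {
  return String(env[envVarName] ?? '').trim().toLowerCase() === 'true';
}

// Wraps a gate so that a set kill switch passes the input through unchanged.
function runGateUnlessDisabled(envVarName, gateFn, input, env = process.env) {
  return isGateDisabled(envVarName, env) ? input : gateFn(input);
}

// Usage with an explicit env object instead of the real process environment.
const demote = (row) => ({ ...row, status: 'Discarded' });
console.log(
  runGateUnlessDisabled('DISABLE_COMP_FLOOR_GATE', demote, { rowNum: 1 }, {
    DISABLE_COMP_FLOOR_GATE: 'true',
  })
);
```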

Test plan

  • node --test tests/triage-hard-disqualifier.test.mjs — 17/17 pass
  • node --test tests/comp-floor-gate.test.mjs — 17/17 pass
  • node --test tests/eval-report-block-c-invariant.test.mjs — 1/1 pass + 1/1 expected skip (no covered reports yet)
  • node test-all.mjs --quick — 76 pass, 0 fail, 2 unrelated warnings
  • node --check clean on all 5 modified/new source files
  • node scripts/audit-current-queue-against-new-gates.mjs — captured output above; Mitchell to curate overrides
  • /deploy-verify Phases 6-9 green (post-merge, on Mitchell's invocation)
  • First post-PR fresh ingest cycle produces ## Block C — Why This Fits Mitchell (verifies the invariant test will fire correctly)

🤖 Generated with Claude Code

… C why-this-fits

Closes the 2026-05-29 recurring dropped ask documented at
data/spec-triage-eval-quality-gate-2026-05-29.md. Three independent gates fix
three modes of triage/eval drift Mitchell has surfaced repeatedly:

1. NEW lib/triage-hard-disqualifier-filter.mjs — deterministic regex post-pass
   for 6 hard-SKIP rules (mandatory leetcode/systems-design, Python production
   engineering as primary screen with A2a exemption, former PM, former EM,
   10+yr leadership, pure cloud/devops/mlops). Wired into triage.mjs after
   Haiku returns, before writeAdvance.

2. NEW lib/comp-floor-gate.mjs — post-eval comp extraction (4 regex variants)
   + location-aware floor (remote/seattle/sf/nyc). BELOW_FLOOR → score 0.0,
   status Discarded, DEMOTED audit note. Wired into batch-runner-batches.mjs.
   Layered floor resolver: config/profile.yml::comp_floor → hardcoded
   defaults matching batch/triage-prompt.md.

3. NEW Block C — Why This Fits Mitchell in eval reports. batch/batch-prompt.md
   updated with rubric + output template. 3-4 sentences, ≥3 corpus citations
   ([cv.md], [article-digest], [second-brain], [hm-intel]), downscore Overall
   by 0.3 if citations thin. Voice rules per lib/ground-prompt.mjs.
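
A sketch of how the Block C requirements could be checked mechanically. The `checkBlockC` name and return shape are hypothetical, and this is not the invariant test shipped in tests/eval-report-block-c-invariant.test.mjs. It distinguishes a real section heading from a prose mention and counts citations of the four corpus tags inside the section, treating fewer than 3 as thin.

```javascript
// The heading must start a line; a prose mention of "Block C" does not count.
const BLOCK_C_HEADING = /^## Block C — Why This Fits Mitchell\s*$/m;
// Matches tags such as [cv.md: Experience] or a bare [hm-intel].
const CITATION = /\[(cv\.md|article-digest|second-brain|hm-intel)(?::[^\]]*)?\]/g;

function checkBlockC(reportMarkdown) {
  const heading = BLOCK_C_HEADING.exec(reportMarkdown);
  if (!heading) return { hasSection: false, citations: 0, thin: true };

  // The section body runs from the heading to the next "## " heading or end of text.
  const rest = reportMarkdown.slice(heading.index + heading[0].length);
  const nextHeading = rest.search(/^## /m);
  const body = nextHeading === -1 ? rest : rest.slice(0, nextHeading);

  const citations = (body.match(CITATION) || []).length;
  return { hasSection: true, citations, thin: citations < 3 };
}

// Usage with a hypothetical report fragment.
const sample = [
  '## Block C — Why This Fits Mitchell',
  'Claim one [cv.md: Experience]. Claim two [article-digest: entry].',
  'Claim three [hm-intel: field].',
  '## Block D',
].join('\n');
console.log(checkBlockC(sample));
```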

Plus override mechanism (data/triage-overrides.json), env kill switches
(DISABLE_HARD_DISQUALIFIER_FILTER, DISABLE_COMP_FLOOR_GATE), backfill audit
script (scripts/audit-current-queue-against-new-gates.mjs), 3 new test files
(35 pass + 1 expected skip), AGENTS.md bug-class entry
(llm-judge-soft-enforcement-of-hard-rules), CLAUDE.md Session Notes.

Backfill audit against current 21-row queue surfaces 2 comp-below-floor
DEMOTE candidates (#2049 Ramp $195K SF + #2286 Arize $170K SF) that the
prior soft-enforcement triage let through.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mitwilli-create force-pushed the feat/triage-eval-quality-gate-2026-05-29-claude-10f709f7 branch from 29beaab to 4e05ee8 on May 29, 2026 15:22
mitwilli-create merged commit b47aafa into main on May 29, 2026
8 checks passed