feat(triage+eval): hard-disqualifier filter + comp floor gate + Block C why-this-fits #333
Merged
mitwilli-create merged 1 commit on May 29, 2026
… C why-this-fits

Closes the 2026-05-29 recurring dropped ask documented at data/spec-triage-eval-quality-gate-2026-05-29.md. Three independent gates fix three modes of triage/eval drift Mitchell has surfaced repeatedly:

1. NEW lib/triage-hard-disqualifier-filter.mjs: deterministic regex post-pass for 6 hard-SKIP rules (mandatory leetcode/systems-design, Python production engineering as primary screen with A2a exemption, former PM, former EM, 10+yr leadership, pure cloud/devops/mlops). Wired into triage.mjs after Haiku returns, before writeAdvance.
2. NEW lib/comp-floor-gate.mjs: post-eval comp extraction (4 regex variants) + location-aware floor (remote/seattle/sf/nyc). BELOW_FLOOR → score 0.0, status Discarded, DEMOTED audit note. Wired into batch-runner-batches.mjs. Layered floor resolver: config/profile.yml::comp_floor → hardcoded defaults matching batch/triage-prompt.md.
3. NEW "Block C — Why This Fits Mitchell" section in eval reports. batch/batch-prompt.md updated with rubric + output template. 3-4 sentences, ≥3 corpus citations ([cv.md], [article-digest], [second-brain], [hm-intel]), downscore Overall by 0.3 if citations thin. Voice rules per lib/ground-prompt.mjs.

Plus override mechanism (data/triage-overrides.json), env kill switches (DISABLE_HARD_DISQUALIFIER_FILTER, DISABLE_COMP_FLOOR_GATE), backfill audit script (scripts/audit-current-queue-against-new-gates.mjs), 3 new test files (35 pass + 1 expected skip), AGENTS.md bug-class entry (llm-judge-soft-enforcement-of-hard-rules), CLAUDE.md Session Notes.

Backfill audit against current 21-row queue surfaces 2 comp-below-floor DEMOTE candidates (#2049 Ramp $195K SF + #2286 Arize $170K SF) that the prior soft-enforcement triage let through.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
29beaab to 4e05ee8
Summary
Closes the recurring dropped ask documented at `data/spec-triage-eval-quality-gate-2026-05-29.md`. Three independent fixes:

- `lib/triage-hard-disqualifier-filter.mjs`: deterministic post-triage regex gate; overrides the Haiku verdict to SKIP when the JD trips hard-disqualifier rules (Python-production-engineering, former PM/EM, mandatory leetcode, etc.), with archetype-aware exemptions.
- `lib/comp-floor-gate.mjs`: post-eval comp extraction; demotes rows below the floor (sourced from `config/profile.yml::comp_floor`, falling back to hardcoded $160K/$180K/$216K/$220K matching `batch/triage-prompt.md`).
- `## Block C — Why This Fits Mitchell` in eval reports: 3-4 sentences, every claim cites a specific corpus location (`[cv.md: <section>]`, `[article-digest: <entry>]`, `[second-brain: <doc>]`, `[hm-intel: <field>]`), downscored if corpus support is thin.

Plus an override mechanism (`data/triage-overrides.json`), env kill switches (`DISABLE_HARD_DISQUALIFIER_FILTER`, `DISABLE_COMP_FLOOR_GATE`), a backfill audit script (`scripts/audit-current-queue-against-new-gates.mjs`), 3 new test files (35 pass + 1 expected skip), an AGENTS.md bug-class entry for `llm-judge-soft-enforcement-of-hard-rules`, and CLAUDE.md Session Notes.

Pre-fix snapshot (current queue, 2026-05-29)
- `grep -l 'Block C' reports/*.md` → 28 reports, but all matches are prose mentions inside body text, not section headers. Zero reports contain `## Block C — Why This Fits Mitchell` as a section.
- `grep -E 'comp_floor|comp floor|salary_floor' config/profile.yml` → file doesn't exist on Mitchell's filesystem (gitignored at `.gitignore:192`). Same for `modes/_profile.md` (`.gitignore:194`). System runs off `batch/triage-prompt.md:25` defaults.

Post-fix backfill audit (deterministic, $0)
The 2 DEMOTE candidates validate Mitchell's concern that rows with comp below the floor were reaching the queue. Both will be demoted on the next batch eval cycle unless added to `data/triage-overrides.json::overrides`.

Hard-disqualifier shows 0 against the current queue because rule patterns are tuned for full JD text; the audit fell back to report bodies (which only contain JD excerpts in the CV Match section). Rules will fire on fresh JDs at triage time. The audit is conservative: in backfill it can produce false negatives, not false positives.
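To make the "rules fire on full JD text" point concrete, here is a minimal sketch of a deterministic post-pass over the LLM verdict. It is not the contents of `lib/triage-hard-disqualifier-filter.mjs`: the rule ids, the regex patterns, the `'ADVANCE'` verdict string, the `'B1'` archetype label, and the return shape are all illustrative assumptions. Only the function name `applyHardDisqualifiers`, the SKIP override, and the A2a exemption come from this PR.

```javascript
// Hypothetical rule table. The PR names six hard-SKIP rules but does not show
// their patterns, so every regex and rule id below is an assumption.
const HARD_RULES = [
  { id: 'mandatory-leetcode', pattern: /\b(leetcode|live coding (round|interview))\b/i },
  { id: 'former-pm', pattern: /\b(former|prior|previous) (product manager|PM)\b/i },
  {
    id: 'python-production-primary',
    pattern: /\bproduction[- ](grade |level )?python\b/i,
    exemptArchetypes: ['A2a'], // the PR mentions an A2a exemption for this rule
  },
];

// Post-pass over the LLM verdict: any rule hit forces SKIP regardless of what
// the model returned; with no hits the model verdict passes through untouched.
function applyHardDisqualifiers(jdText, llmVerdict, archetype) {
  const hits = HARD_RULES.filter(
    (rule) =>
      rule.pattern.test(jdText) &&
      !(rule.exemptArchetypes || []).includes(archetype)
  ).map((rule) => rule.id);

  return hits.length > 0
    ? { verdict: 'SKIP', overridden: llmVerdict !== 'SKIP', hits }
    : { verdict: llmVerdict, overridden: false, hits };
}
```

Under these assumed patterns, `applyHardDisqualifiers('Requires production Python experience.', 'ADVANCE', 'B1')` yields a SKIP, while the same text with archetype `'A2a'` passes the model verdict through. A report body that only quotes a JD excerpt may never contain the trigger phrase, which is consistent with the backfill showing 0 hits.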
Files changed (13 total)
- `lib/triage-hard-disqualifier-filter.mjs`
- `lib/comp-floor-gate.mjs`
- `tests/triage-hard-disqualifier.test.mjs`
- `tests/comp-floor-gate.test.mjs`
- `tests/eval-report-block-c-invariant.test.mjs`
- `data/triage-overrides.json`: `{ "overrides": [] }`; append-only audit trail
- `scripts/audit-current-queue-against-new-gates.mjs`
- `triage.mjs`: `applyHardDisqualifiers` between Haiku return + `writeAdvance`
- `batch-runner-batches.mjs`: `gateCompFloor` after report write + location extractor
- `batch/batch-prompt.md`
- `config/profile.example.yml`: `comp_floor:` documentation block under `compensation:`
- `AGENTS.md`: `llm-judge-soft-enforcement-of-hard-rules`
- `CLAUDE.md`

Override mechanism
`data/triage-overrides.json` is the append-only override file consumed by both gates:

    {
      "overrides": [
        { "rowNum": 2049, "gate": "comp-floor", "reason": "pre-IPO equity heavy", "addedAt": "2026-05-29" }
      ]
    }

Each entry: `{ rowNum, gate: 'hard-disqualifier' | 'comp-floor', reason, addedAt, status?: 'active'|'revoked' }`. Never delete entries; flip `status: revoked` to revert. Tests assert gate-specific override scoping.

Kill switches (rollback paths)
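A minimal sketch of how the comp-floor gate might honor both escape hatches, the `DISABLE_COMP_FLOOR_GATE` kill switch and the override file. Several things here are assumptions and not taken from `lib/comp-floor-gate.mjs`: the pairing of the four floors with the four locations (it simply follows the order in which each list is written in this PR), the single extraction regex (the real gate is described as having four variants), the `row` shape, the `'PASS'`/`'DEMOTE'` action strings, and treating any non-empty env value as "disabled".

```javascript
// Assumed pairing of floors to locations; the PR lists both sets but does not
// say which floor belongs to which location.
const DEFAULT_FLOORS = { remote: 160000, seattle: 180000, sf: 216000, nyc: 220000 };

// One illustrative pattern: matches "$195K" or "$195,000" and returns USD.
// For a range such as "$170K-$210K" it returns the first (lower) figure.
function extractComp(text) {
  const m = /\$\s?(\d{2,3})(?:,(\d{3})|\s?[kK])\b/.exec(text);
  if (!m) return null;
  return m[2] ? Number(m[1] + m[2]) : Number(m[1]) * 1000;
}

// An override applies only to its own row and its own gate, and only while it
// has not been revoked. A missing status is treated as 'active'.
function isOverridden(overrides, rowNum, gate) {
  return overrides.some(
    (o) => o.rowNum === rowNum && o.gate === gate && (o.status ?? 'active') === 'active'
  );
}

function gateCompFloor(row, { overrides = [], floors = DEFAULT_FLOORS, env = process.env } = {}) {
  // Kill switch: any non-empty value disables the gate (assumed semantics).
  if (env.DISABLE_COMP_FLOOR_GATE) return { action: 'PASS', reason: 'gate disabled' };
  if (isOverridden(overrides, row.rowNum, 'comp-floor')) {
    return { action: 'PASS', reason: 'override' };
  }
  const comp = extractComp(row.compText);
  const floor = floors[row.location];
  // Without a parsed figure or a known location there is nothing to compare.
  if (comp === null || floor === undefined) return { action: 'PASS', reason: 'no comp or floor' };
  return comp < floor
    ? { action: 'DEMOTE', verdict: 'BELOW_FLOOR', score: 0.0, status: 'Discarded' }
    : { action: 'PASS', reason: 'meets floor' };
}
```

With the example entry above loaded as `overrides`, a row `{ rowNum: 2049, compText: '$195K', location: 'sf' }` passes; with that entry's `status` flipped to `'revoked'`, or with an entry scoped to `'hard-disqualifier'` instead, the same row is demoted under the assumed SF floor.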
Test plan
- `node --test tests/triage-hard-disqualifier.test.mjs`: 17/17 pass
- `node --test tests/comp-floor-gate.test.mjs`: 17/17 pass
- `node --test tests/eval-report-block-c-invariant.test.mjs`: 1/1 pass + 1/1 expected skip (no covered reports yet)
- `node test-all.mjs --quick`: 76 pass, 0 fail, 2 unrelated warnings
- `node --check` clean on all 5 modified/new source files
- `node scripts/audit-current-queue-against-new-gates.mjs`: captured output above; Mitchell to curate overrides
- `## Block C — Why This Fits Mitchell` (verifies the invariant test will fire correctly)

🤖 Generated with Claude Code
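For reference, the Block C invariant exercised in the test plan could be checked along these lines. This is a sketch under stated assumptions, not the logic in `tests/eval-report-block-c-invariant.test.mjs`: the function name `checkBlockC`, the citation regex, and the return shape are hypothetical. The header string and the "at least 3 citations" threshold come from this PR.

```javascript
// Exact section header the invariant looks for (taken from the PR text).
const BLOCK_C_HEADER = '## Block C — Why This Fits Mitchell';

// Assumed citation shape: [cv.md], [cv.md: <section>], [hm-intel: <field>], ...
const CITATION = /\[(cv\.md|article-digest|second-brain|hm-intel)(:[^\]]*)?\]/g;

function checkBlockC(reportMarkdown) {
  const lines = reportMarkdown.split('\n');
  // Only a line that IS the header counts; prose mentions of "Block C" do not.
  const start = lines.findIndex((line) => line.trim() === BLOCK_C_HEADER);
  if (start === -1) return { hasSection: false, citations: 0, thin: true };

  // The section body runs until the next "## " heading or the end of the file.
  const rest = lines.slice(start + 1);
  const end = rest.findIndex((line) => line.startsWith('## '));
  const body = (end === -1 ? rest : rest.slice(0, end)).join('\n');

  const citations = (body.match(CITATION) || []).length;
  return { hasSection: true, citations, thin: citations < 3 };
}
```

Matching on the full header line, not on the substring "Block C", is what separates a real section from the 28 prose-only mentions found in the pre-fix snapshot.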