docs: align prompt-injection thresholds in CLAUDE.md and ARCHITECTURE.md to security.ts (v1.6.4.0 catch-up) #1290
Closed
brycealan wants to merge 1 commit into garrytan:main from
…h-up)
CLAUDE.md:290 and ARCHITECTURE.md:159 were missed when WARN was bumped 0.60 → 0.75 in d75402b (v1.6.4.0, "cut Haiku classifier FP from 44% to 23%, gate now enforced", garrytan#1135). browse/src/security.ts:37 has WARN: 0.75 and BROWSER.md:743 was updated alongside that commit; CLAUDE.md and ARCHITECTURE.md still read 0.60.
Also adds the SOLO_CONTENT_BLOCK: 0.92 entry to CLAUDE.md (already in security.ts:50 and BROWSER.md:745, missing from CLAUDE.md's threshold table).
No code change. No behavior change. Pure doc-vs-code alignment.
Verification:
    $ grep -n "WARN" browse/src/security.ts CLAUDE.md ARCHITECTURE.md BROWSER.md
    browse/src/security.ts:37: WARN: 0.75,
    CLAUDE.md:290: - `WARN: 0.75` ...
    ARCHITECTURE.md:159: ...>= `WARN` (0.75)...
    BROWSER.md:743: - `WARN: 0.75` ...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Owner
Thanks @brycealan, your fix shipped in v1.30.0.0 (#1391) with credit in the CHANGELOG. Closing since it's already on main. Appreciate the contribution.
Summary
CLAUDE.md and ARCHITECTURE.md were missed when `WARN` was bumped 0.60 → 0.75 in d75402b (v1.6.4.0, #1135). `browse/src/security.ts:37` has `WARN: 0.75` and `BROWSER.md:743` was updated alongside that commit; `CLAUDE.md:290` and `ARCHITECTURE.md:159` still read `0.60`.
This PR brings the two stale docs in line with the source-of-truth and the sister doc that was already updated. Also adds the `SOLO_CONTENT_BLOCK: 0.92` entry to CLAUDE.md (already in `security.ts:50` and `BROWSER.md:745`, missing from CLAUDE.md's threshold table).
No code change. No behavior change. Pure doc-vs-code alignment.
What's actually true
`browse/src/security.ts:35-51` is authoritative:
The v1.6.4.0 commit message stated the change directly:
Why this matters
- The stale `0.60` is wrong by 20% (0.15 below the enforced `0.75`) on a security-critical number.
- The `SOLO_CONTENT_BLOCK: 0.92` entry is the floor that prevents `testsavant/deberta` from solo-firing on phishing-flavored benign content. Operators who don't know it exists can't reason about why a high single-layer score didn't `BLOCK`.
Verification
All four sources now agree.
Files changed
- `CLAUDE.md`: fix `WARN: 0.60` → `0.75` (line 290), add `SOLO_CONTENT_BLOCK: 0.92` row with the FP-floor rationale.
- `ARCHITECTURE.md`: fix inline `WARN (0.60)` → `(0.75)` (line 159).
Test plan
- `grep "0\.60\|0\.75"` across `CLAUDE.md`, `ARCHITECTURE.md`, `BROWSER.md`, and `security.ts` shows all four files agree on `0.75`.
How this was found
Surfaced by a multi-artifact audit that fused CLAUDE.md, ARCHITECTURE.md, BROWSER.md, and the security source. The drift is invisible from any single file — each looks self-consistent — but emerges when you cross-check the three docs against the code.
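A cross-check of this kind could be automated along the following lines. This is a minimal TypeScript sketch under stated assumptions, not code from the repository: `Sources`, `firstValueAfter`, and `findDrift` are hypothetical names, and the sample lines are abbreviated from the grep output quoted in this PR, with the two stale docs shown as they read before the fix.

```typescript
// Hypothetical drift check; none of these names exist in the repository.
// Maps a file name to its text. A real check would read the files from disk.
type Sources = Record<string, string>;

// Return the first decimal that follows `name` on a line, e.g.
// "WARN: 0.75," or ">= `WARN` (0.75)". Assumes `name` is a plain
// identifier, so it is safe to embed in a regular expression.
function firstValueAfter(line: string, name: string): number | null {
  const match = new RegExp(`\\b${name}\\b\\D*?(\\d+\\.\\d+)`).exec(line);
  return match ? Number(match[1]) : null;
}

// Compare every non-authoritative file against the value found in `authority`.
function findDrift(sources: Sources, name: string, authority: string): string[] {
  const read = (file: string): number[] =>
    (sources[file] || "")
      .split("\n")
      .map((line) => firstValueAfter(line, name))
      .filter((v): v is number => v !== null);

  const authorityValues = read(authority);
  if (authorityValues.length === 0) {
    return [`${authority}: no ${name} value found`];
  }
  const expected = authorityValues[0];

  const problems: string[] = [];
  for (const file of Object.keys(sources)) {
    if (file === authority) continue;
    const values = read(file);
    if (values.length === 0) problems.push(`${file}: ${name} missing`);
    for (const v of values) {
      if (v !== expected) {
        problems.push(`${file}: ${name} is ${v}, expected ${expected}`);
      }
    }
  }
  return problems;
}

// State of the four files before this PR (lines abbreviated).
const beforeFix: Sources = {
  "browse/src/security.ts": "  WARN: 0.75,",
  "CLAUDE.md": "- `WARN: 0.60` ...",
  "ARCHITECTURE.md": "...>= `WARN` (0.60)...",
  "BROWSER.md": "- `WARN: 0.75` ...",
};

console.log(findDrift(beforeFix, "WARN", "browse/src/security.ts"));
```

Run against the pre-fix state, the sketch flags `CLAUDE.md` and `ARCHITECTURE.md` and leaves `BROWSER.md` alone, which is the drift this PR corrects. Treating `security.ts` as the single authority keeps the comparison one-directional, so a doc can never "outvote" the code.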
🤖 Generated with Claude Code