docs(promo): data_exfiltration follow-up dev.to article draft (5/26 target) #335
Open
Hashevolution wants to merge 1 commit into
…arget)

Article: "Why output-stage PII masking is the wrong protective surface for data exfiltration in RAG"

Fulfills the commitment made to Ali Afana in the 6th-turn LinkedIn DM (2026-05-19) and fills the 5/26 publication slot vacated by E1 Show HN deferral (HN new-account restriction received 2026-05-19).

Article structure (~1,960 words):
- TL;DR: the output filter runs after the LLM has already seen the confidential context; three classes of leak (paraphrase, inference, cross-turn) can no longer be stopped at that stage. Retrieval-stage ABAC is the load-bearing protective surface; the output filter remains as defense-in-depth.
- The seductive default: typical RAG-with-RBAC pattern with PII mask as the only gate. Why it's tempting (simple pipeline, surgical control, well-established vocabulary).
- Three failure modes the output filter can't catch: creative paraphrasing, inference, cross-turn/context-window persistence.
- The structural fix: gate at retrieval, before the prompt is built, so the model never sees what it shouldn't.
- A working three-stage realization: PolicyEngine.can_retrieve (load-bearing) + can_walk (graph traversal) + can_emit (defense-in-depth). Code references to core/policy_engine.py and core/retrieval_engine.py with verbatim snippets.
- Where it matters most: catalog poisoning. The injection-fixtures schema v1.1 catalog_context field encodes this case directly.
- Credit: Ali Afana 6-turn DM exchange, paraphrased with permission; Ali's exact 5th-turn wording reframed the article's priority ordering.
- Three open questions (PII vs confidential meaning boundary, cross-document inference, trace-stage authorization).
- Code references with deep links to main-branch artifacts.

Target publish date: 2026-05-26 (5/26 slot, originally E1 Show HN, reassigned per launch-tracker PR #334).

Frontmatter set to published: false; Joe reviews and toggles to true when ready to publish on dev.to. Tags: rag, llm, security, ai (dev.to allows 4).
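As a rough illustration of the three-stage idea described in the commit message, the sketch below gates documents at retrieval, at graph traversal, and again at emission. Only the method names `can_retrieve`, `can_walk`, and `can_emit` come from the PR text; the class `SketchPolicy`, the `User` and `Doc` types, the method signatures, and the label-based rule are all assumptions made for this example and are not the contents of `core/policy_engine.py`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    name: str
    clearances: frozenset  # labels this user may read (assumed attribute model)


@dataclass
class Doc:
    doc_id: str
    text: str
    label: str  # the attribute the sketch policy checks
    links: list = field(default_factory=list)  # ids of related documents


class SketchPolicy:
    """Hypothetical stand-in, NOT the real PolicyEngine API."""

    def can_retrieve(self, user, doc):
        # Load-bearing gate: decided before any prompt is built.
        return doc.label in user.clearances

    def can_walk(self, user, src, dst):
        # Traversal gate: follow an edge only if both endpoints are readable.
        return self.can_retrieve(user, src) and self.can_retrieve(user, dst)

    def can_emit(self, user, context_docs):
        # Defense-in-depth: re-check every document behind the answer.
        return all(self.can_retrieve(user, d) for d in context_docs)


def build_context(policy, user, hits, corpus):
    """Return only the documents that may be placed in the prompt."""
    allowed = [d for d in hits if policy.can_retrieve(user, d)]
    context = list(allowed)
    for doc in allowed:
        for linked_id in doc.links:
            linked = corpus[linked_id]
            if linked not in context and policy.can_walk(user, doc, linked):
                context.append(linked)
    return context


corpus = {
    "a": Doc("a", "Office hours are 9 to 5.", "public", links=["b"]),
    "b": Doc("b", "Acquisition target under review.", "confidential"),
}
analyst = User("analyst", frozenset({"public"}))
policy = SketchPolicy()
context = build_context(policy, analyst, list(corpus.values()), corpus)
print([d.doc_id for d in context])  # ['a']
```

In this sketch the confidential document is rejected twice before prompt construction (once as a direct hit, once as a linked neighbour), so the emit check only has to confirm what the earlier stages already enforced.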
Summary
Drafts the `data_exfiltration` follow-up dev.to article promised to Ali in the 6th-turn LinkedIn DM (2026-05-19). Also fills the 5/26 publication slot vacated by E1 Show HN deferral (HN new-account restriction notice received 2026-05-19, deferring Show HN → 2026-06-16 per tracker PR #334).

Article: "Why output-stage PII masking is the wrong protective surface for data exfiltration in RAG", ~1,960 words, a medium-length dev.to article.
Argument: output-stage PII mask runs after the LLM has already seen the confidential context, so three classes of leak (paraphrase, inference, cross-turn persistence) are no longer stoppable at that stage. Retrieval-stage ABAC is the load-bearing protective surface; output filter remains as defense-in-depth.
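The argument can be made concrete with a small sketch. It assumes a regex-based output mask for one PII shape (US SSN-like strings) and a label-based retrieval gate; the function names `mask_output` and `retrieval_gate`, the document dictionaries, and the sample texts are hypothetical and chosen only for illustration. The paraphrased answer is written by hand to stand in for what a model might produce after seeing the confidential context.

```python
import re

# Assumed PII shape for this sketch: strings that look like US SSNs.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_output(text):
    """Output-stage filter: redact SSN-like strings in the model's answer."""
    return SSN_PATTERN.sub("[REDACTED]", text)


def retrieval_gate(docs, clearances):
    """Retrieval-stage filter: drop documents the caller may not read."""
    return [d for d in docs if d["label"] in clearances]


docs = [
    {"label": "public", "text": "The clinic opens at 9am."},
    {"label": "confidential",
     "text": "Patient SSN 123-45-6789, diagnosis: diabetes."},
]

# A literal leak matches the pattern, so the output mask catches it.
literal = mask_output("Their SSN is 123-45-6789.")

# A paraphrase of the confidential meaning matches nothing and passes through.
paraphrase = "The patient is being treated for a blood-sugar condition."
masked_paraphrase = mask_output(paraphrase)

# Gating at retrieval keeps the confidential document out of the prompt,
# so there is nothing for the model to paraphrase in the first place.
prompt_docs = retrieval_gate(docs, clearances={"public"})

print(literal)
print(masked_paraphrase == paraphrase)
print([d["text"] for d in prompt_docs])
```

The mask still earns its place as defense-in-depth for literal matches, but in this sketch only the retrieval gate addresses the paraphrase case.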
Code references cited (all verified against `main` HEAD):
- `core/policy_engine.py`: `PolicyEngine.can_retrieve` / `can_walk` / `can_emit`
- `core/retrieval_engine.py` `hybrid_search`: actual ABAC call site at line ~106
- `docs/ARCHITECTURE.md` §3 (Principle 3 + new Principle 8)
- `reports/promo-assets/injection-fixtures-schema-v0.md` `catalog_context` (v1.1)

Ali credit: 6-turn LinkedIn DM exchange named, dev.to profile linked, 5th-turn paraphrase quoted (with a note that the paraphrasing was done with permission).
Verification
- Code references verified (`grep` confirmed `PolicyEngine.can_retrieve` at `core/policy_engine.py:138`, `_policy.can_retrieve` call site at `core/retrieval_engine.py:106-112`)
- Frontmatter set to `published: false` so Joe reviews and toggles before dev.to publish
- Tags: `rag, llm, security, ai` (4, the dev.to maximum)

Out of scope
- Joe sets `published: true` and submits on 2026-05-26 (or earlier if review is fast and the afterglow window from K1+K4 is still favorable)

Behavior note
No `core/` files touched. No bench numbers required. Module-size gate (20 KB / core) N/A. No trust boundaries modified.

Generated by Claude Code