docs(promo): data_exfiltration follow-up dev.to article draft (5/26 target) #335

Open
Hashevolution wants to merge 1 commit into main from docs/v0.3-devto-data-exfiltration
Conversation

@Hashevolution
Owner

Summary

Drafts the data_exfiltration follow-up dev.to article promised to Ali in the 6th-turn LinkedIn DM (2026-05-19). It also fills the 5/26 publication slot vacated by the E1 Show HN deferral (an HN new-account restriction notice arrived on 2026-05-19, so Show HN moves to 2026-06-16 per tracker PR #334).

Article: "Why output-stage PII masking is the wrong protective surface for data exfiltration in RAG" — ~1,960 words, dev.to medium-length article.

Argument: an output-stage PII mask runs after the LLM has already seen the confidential context, so three classes of leak (paraphrase, inference, cross-turn persistence) can no longer be stopped at that stage. Retrieval-stage ABAC is the load-bearing protective surface; the output filter remains as defense-in-depth.
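To make the argument concrete, here is a minimal sketch. It is not code from this repository: the regex mask, the `Doc` type, the `min_clearance` attribute, and the example strings are all hypothetical. It shows a paraphrased answer passing through an output-stage mask unchanged, while a retrieval-stage gate keeps the confidential document out of the prompt in the first place.

```python
import re
from dataclasses import dataclass

# Hypothetical output-stage mask: redacts SSN-shaped strings only.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_output(text: str) -> str:
    """Redact pattern-shaped PII from model output."""
    return SSN.sub("[REDACTED]", text)


@dataclass(frozen=True)
class Doc:
    text: str
    min_clearance: int  # hypothetical ABAC attribute, not from the repo


def gate_at_retrieval(docs: list[Doc], user_clearance: int) -> list[Doc]:
    """Drop documents the caller may not read, before the prompt is built."""
    return [d for d in docs if user_clearance >= d.min_clearance]


docs = [
    Doc("Cafeteria hours are 8am to 3pm.", min_clearance=0),
    Doc("Acquisition of Initech closes in Q3 at 40M.", min_clearance=3),
]

# The mask works on literal patterns ...
assert mask_output("SSN 123-45-6789") == "SSN [REDACTED]"

# ... but a paraphrase of the confidential fact has no pattern to match.
paraphrased = "The company is buying Initech later this year for roughly forty million."
assert mask_output(paraphrased) == paraphrased  # leaks unmasked

# Gating at retrieval means the model never sees the confidential document.
visible = gate_at_retrieval(docs, user_clearance=1)
assert [d.text for d in visible] == ["Cafeteria hours are 8am to 3pm."]
```

The mask still earns its place for literal patterns, which is why the argument keeps it as defense-in-depth rather than dropping it.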

Code references cited (all verified against main HEAD):

  • core/policy_engine.py: PolicyEngine.can_retrieve / can_walk / can_emit
  • core/retrieval_engine.py hybrid_search — actual ABAC call site at line ~106
  • docs/ARCHITECTURE.md §3 (Principle 3 + new Principle 8)
  • reports/promo-assets/injection-fixtures-schema-v0.md catalog_context (v1.1)
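For readers without the repository open, the three-stage shape these references point at might look roughly like the sketch below. Only the names `PolicyEngine`, `can_retrieve`, `can_walk`, `can_emit`, and `hybrid_search` come from this PR; the signatures, the `Subject` and `Resource` types, and the role-subset rule are assumptions made for illustration and may differ from what is on main.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    roles: frozenset  # caller attributes (assumed shape)


@dataclass(frozen=True)
class Resource:
    resource_id: str
    required_roles: frozenset = frozenset()  # empty set means public


class PolicyEngine:
    """Hypothetical stand-in; only the three method names come from the PR."""

    def _allowed(self, subject: Subject, resource: Resource) -> bool:
        # Allow only if the subject holds every role the resource requires.
        return resource.required_roles <= subject.roles

    def can_retrieve(self, subject: Subject, resource: Resource) -> bool:
        # Load-bearing gate: runs before the prompt is built.
        return self._allowed(subject, resource)

    def can_walk(self, subject: Subject, source: Resource, target: Resource) -> bool:
        # Graph traversal: both ends of the edge must be readable.
        return self._allowed(subject, source) and self._allowed(subject, target)

    def can_emit(self, subject: Subject, resource: Resource) -> bool:
        # Defense-in-depth: re-check before content leaves the system.
        return self._allowed(subject, resource)


def hybrid_search(policy: PolicyEngine, subject: Subject, candidates: list) -> list:
    """Sketch of the call site: filter candidates before any prompt exists."""
    return [r for r in candidates if policy.can_retrieve(subject, r)]


engine = PolicyEngine()
analyst = Subject(roles=frozenset({"analyst"}))
faq = Resource("faq")
deal_memo = Resource("deal-memo", frozenset({"legal"}))

print([r.resource_id for r in hybrid_search(engine, analyst, [faq, deal_memo])])
# prints ['faq']
```

In this sketch all three checks share one rule; a real engine would presumably apply different attributes at each stage.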

Ali credit: 6-turn LinkedIn DM exchange named, dev.to profile linked, 5th-turn paraphrase quoted (with note that paraphrasing was done with permission).

Verification

  • Word count: 1,960 (within 1,500-2,000 target for dev.to medium-length)
  • Code-fence balance: 4 balanced 3-backtick python fences (29/37, 93/107, 122/143, 147/157)
  • All cited code paths exist on main (grep confirmed PolicyEngine.can_retrieve at core/policy_engine.py:138, _policy.can_retrieve call site at core/retrieval_engine.py:106-112)
  • Frontmatter: published: false so Joe reviews and toggles before dev.to publish
  • Tags: rag, llm, security, ai (4 — dev.to maximum)
  • No marketing adjectives; technical claim with explicit limits in "Open questions" section
  • Honest disclosure footer matches the pattern of prior dev.to articles
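The fence-balance, frontmatter, and tag-count checks above could be scripted along the following lines. This is a sketch under stated assumptions, not the tooling actually used for this PR: the function names and the sample draft are hypothetical, and the frontmatter is assumed to be simple `key: value` lines between `---` markers with comma-separated tags.

```python
FENCE = "`" * 3  # built indirectly so the example can sit inside a fenced block
MAX_TAGS = 4     # dev.to maximum, as stated in the checklist


def fence_pairs(lines: list[str]) -> list[tuple[int, int]]:
    """Return 1-indexed (open, close) fence line pairs; raise if one is unclosed."""
    pairs, open_line = [], None
    for n, line in enumerate(lines, start=1):
        if line.startswith(FENCE):
            if open_line is None:
                open_line = n
            else:
                pairs.append((open_line, n))
                open_line = None
    if open_line is not None:
        raise ValueError(f"unclosed fence opened at line {open_line}")
    return pairs


def front_matter(lines: list[str]) -> dict[str, str]:
    """Parse `key: value` lines between the leading pair of '---' markers."""
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    raise ValueError("front matter is never closed")


# Hypothetical sample draft, not the article in this PR.
draft = [
    "---",
    "title: Example draft",
    "published: false",
    "tags: rag, llm, security, ai",
    "---",
    "Intro paragraph.",
    FENCE + "python",
    "print('hi')",
    FENCE,
    "Closing paragraph.",
]

meta = front_matter(draft)
tags = [t.strip() for t in meta["tags"].split(",")]

assert fence_pairs(draft) == [(7, 9)]
assert meta["published"] == "false"
assert len(tags) <= MAX_TAGS
```

Reporting the open/close line pairs, rather than only a count, makes an unbalanced fence easy to locate in the draft.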

Out of scope

  • Actual dev.to publish — Joe reviews + toggles published: true + submits on 2026-05-26 (or earlier if review is fast and afterglow window from K1+K4 is still favorable)
  • Cross-channel propagation — deliberately NOT cross-posted to LinkedIn/X within 24h (channel separation discipline)
  • launch-tracker URL recording — separate micro-PR after publish
  • HN karma-building (Joe's 3-week plan starting 5/20)

Behavior note

No core/ files touched. No bench numbers required. Module-size gate (20 KB / core) N/A. No trust boundaries modified.


Generated by Claude Code

…arget)

Article: "Why output-stage PII masking is the wrong protective surface
for data exfiltration in RAG"

Fulfills the commitment made to Ali Afana in the 6th-turn LinkedIn DM
(2026-05-19) and fills the 5/26 publication slot vacated by E1 Show HN
deferral (HN new-account restriction received 2026-05-19).

Article structure (~1,960 words):
- TL;DR: output filter runs after the LLM has already seen the
  confidential context; three classes of leak (paraphrase, inference,
  cross-turn) can no longer be stopped at that stage. Retrieval-stage
  ABAC is the load-bearing protective surface; output filter remains
  as defense-in-depth.
- The seductive default: typical RAG-with-RBAC pattern with PII mask
  as the only gate. Why it's tempting (simple pipeline, surgical
  control, well-established vocabulary).
- Three failure modes the output filter can't catch — creative
  paraphrasing, inference, cross-turn/context-window persistence.
- The structural fix: gate at retrieval, before the prompt is built,
  so the model never sees what it shouldn't.
- A working three-stage realization: PolicyEngine.can_retrieve
  (load-bearing) + can_walk (graph traversal) + can_emit (defense-
  in-depth). Code references to core/policy_engine.py and
  core/retrieval_engine.py with verbatim snippets.
- Where it matters most: catalog poisoning — the injection-fixtures
  schema v1.1 catalog_context field encodes this case directly.
- Credit: Ali Afana 6-turn DM exchange, paraphrased with permission;
  Ali's exact 5th-turn wording reframed the article's priority
  ordering.
- Three open questions (PII vs confidential meaning boundary,
  cross-document inference, trace-stage authorization).
- Code references with deep links to main-branch artifacts.

Target publish date: 2026-05-26 (5/26 slot, originally E1 Show HN,
reassigned per launch-tracker PR #334).

Frontmatter set to published: false — Joe reviews and toggles to true
when ready to publish on dev.to.

Tags: rag, llm, security, ai (dev.to allows 4).