feat(governance): pre-LLM privacy gate for the R2 mechanical regime by sammy995 · Pull Request #7 · SantanderAI/mech-gov-framework

sammy995 · 2026-06-23T12:50:05Z

Description

Adds a pre-LLM privacy gate to the R2 mechanical regime. It reversibly tokenizes direct identifiers (EMAIL, PHONE, SSN, PAN, IBAN, IP) so the model never sees raw personal data, and mechanically DEFERs (PRIV_0) a case when residual identifiers exceed a configurable budget or detection fails (fail-closed). This adds a data-minimization dimension (GDPR Art. 5(1)(c) / OWASP LLM06).
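
For reviewers who want a concrete picture of reversible tokenization, here is a minimal sketch. It is not the code in this PR: the two patterns, the `<LABEL_n>` token format, and the names `tokenize_sketch` / `detokenize_sketch` are assumptions made for illustration only.

```python
from __future__ import annotations

import re

# Hypothetical patterns for illustration; the PR's RegexRecognizer may differ.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def tokenize_sketch(text: str) -> tuple[str, dict[str, str]]:
    """Replace each identifier with a placeholder and return the reverse map."""
    token_map: dict[str, str] = {}  # token -> raw value, kept in memory only
    seen: dict[str, str] = {}       # raw value -> token, so repeats share a token
    for label, pattern in PATTERNS.items():
        def substitute(match: re.Match[str], label: str = label) -> str:
            raw = match.group(0)
            if raw not in seen:
                token = f"<{label}_{len(token_map) + 1}>"
                seen[raw] = token
                token_map[token] = raw
            return seen[raw]

        text = pattern.sub(substitute, text)
    return text, token_map


def detokenize_sketch(text: str, token_map: dict[str, str]) -> str:
    """Restore the raw values from the in-memory map."""
    for token, raw in token_map.items():
        text = text.replace(token, raw)
    return text


original = "Contact jane@example.com, SSN 123-45-6789, again jane@example.com"
masked, mapping = tokenize_sketch(original)
```

The model would only ever receive `masked`; `mapping` stays with the caller so the output can be restored afterwards.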

The new primitive mirrors the existing hard_gates / i6q shape (config dataclass + result dataclass + pure functions) and slots into the documented pipeline as a new stage:

hard_gates → privacy → E3_commit → CEFL → I6Q → ambiguity_gate → E3_reveal
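
A rough sketch of that shape (config dataclass, result dataclass, pure function), including the residual budget and the fail-closed path. Every name below except the gate id PRIV_0 is hypothetical, and the PR's actual fields and signatures may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class SketchConfig:
    enabled: bool = True
    residual_budget: int = 0  # hypothetical name for the configurable budget


@dataclass(frozen=True)
class SketchResult:
    deferred: bool
    gate_id: Optional[str]
    entities_found: int
    residual_pii: int


def gate_sketch(
    text: str,
    mask: Callable[[str], tuple[str, int]],
    count_residual: Callable[[str], int],
    config: SketchConfig = SketchConfig(),
) -> SketchResult:
    """Pure function: same inputs always give the same result, no I/O."""
    if not config.enabled:
        return SketchResult(False, None, 0, 0)
    try:
        masked, found = mask(text)
        residual = count_residual(masked)
    except Exception:
        # Fail-closed: if detection itself breaks, defer rather than pass raw text on.
        return SketchResult(True, "PRIV_0", 0, 0)
    if residual > config.residual_budget:
        # Too many identifiers survived masking: defer mechanically.
        return SketchResult(True, "PRIV_0", found, residual)
    return SketchResult(False, None, found, residual)
```

Passing the masking and residual-counting steps in as callables keeps the gate itself free of regex details, which is one way to make the fail-closed branch easy to test offline.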

R2 decisions are driven by risk_score / regulatory_flags, not identities, so tokenization does not affect outcomes. The reversible token map stays in-memory and is never written to DecisionResult / to_dict() or logs — only integer counts (privacy_entities_found, privacy_residual_pii) go into metadata.
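
To illustrate the serialization-safety property, a small sketch under stated assumptions: `SketchDecision` stands in for DecisionResult, `record_privacy_counts` is a hypothetical helper, and the outcome value "APPROVE" is a placeholder. Only the two metadata key names come from this PR.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class SketchDecision:
    outcome: str
    metadata: dict[str, int] = field(default_factory=dict)

    def to_dict(self) -> dict[str, object]:
        # Only the outcome and integer metadata are serialized.
        return {"outcome": self.outcome, "metadata": dict(self.metadata)}


def record_privacy_counts(
    decision: SketchDecision, token_map: dict[str, str], residual: int
) -> None:
    """Copy counts into metadata; the token map itself is never stored."""
    decision.metadata["privacy_entities_found"] = len(token_map)
    decision.metadata["privacy_residual_pii"] = residual


token_map = {"<EMAIL_1>": "jane@example.com"}  # lives only in this scope
decision = SketchDecision(outcome="APPROVE")   # placeholder outcome value
record_privacy_counts(decision, token_map, residual=0)
serialized = json.dumps(decision.to_dict())
```

A test in this style can assert that no raw identifier appears anywhere in the serialized output.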

This is PR 1 of 2. The follow-up adds a Privacy Leakage Rate metric (alongside CDL/DIU), synthetic narrative data that exercises the gate end-to-end, an optional NER recognizer behind a [privacy-ner] extra, and R1/R3 coverage via an LLMInterface wrapper.

What's included

governance/primitives/privacy_gate.py — PrivacyConfig, PrivacyResult, RegexRecognizer, pluggable PiiRecognizer, scan_and_tokenize, detokenize, privacy_gate.
R2 wiring (pre-LLM stage; off via PrivacyConfig(enabled=False)).
tests/test_privacy_gate.py — 19 offline tests (recognizer, reversible tokenization, residual fail-safe, fail-closed, R2 integration, serialization-safety).
examples/privacy_demo.py (offline) and a CHANGELOG entry.
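
As a sketch of what "pluggable" could look like, the snippet below uses a structural `Protocol` so any object with a matching `find` method can act as a recognizer. The protocol name, the `find` signature, and the IP pattern are assumptions, not the PR's PiiRecognizer interface. The sample address is from the RFC 5737 documentation range.

```python
from __future__ import annotations

import re
from typing import Protocol


class RecognizerSketch(Protocol):
    def find(self, text: str) -> list[tuple[str, int, int]]:
        """Return (label, start, end) spans for each identifier found."""
        ...


class IpRecognizerSketch:
    # Loose IPv4 shape check; octet ranges are not validated in this sketch.
    _pattern = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

    def find(self, text: str) -> list[tuple[str, int, int]]:
        return [("IP", m.start(), m.end()) for m in self._pattern.finditer(text)]


def count_entities(text: str, recognizers: list[RecognizerSketch]) -> int:
    """Total number of spans reported across all plugged-in recognizers."""
    return sum(len(r.find(text)) for r in recognizers)


sample = "login from 192.0.2.1"
spans = IpRecognizerSketch().find(sample)
```

An optional NER-based recognizer could then be added to the list without changing the gate, provided it returns spans in the same form.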

Related issue

Closes #6

Type of change

New feature (non-breaking change that adds functionality)

Vendor-neutral core

This PR keeps the core vendor-neutral (no cloud SDK; stdlib only)

Checklist

I have signed the CLA (the CLA Assistant bot will prompt external contributors)
My commit messages follow Conventional Commits
ruff check . and black --check . pass
mypy src passes
pytest passes (tests run offline with the mock provider)
I have added/updated tests where relevant
I have updated documentation / CHANGELOG where relevant
No secrets, API keys, internal URLs, or proprietary content are included

Open items for maintainers

Gate id PRIV_0 vs the K0_x taxonomy — happy to rename.
PrivacyConfig.enabled defaults to True (adds two integer metadata keys; no behavior change on the current synthetic dataset, which has no free-text PII). Can default to opt-in if preferred.
New-file copyright header mirrors the repo's existing style.

github-actions · 2026-06-23T12:50:21Z

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

sammy995 · 2026-06-23T13:35:12Z

I have read the CLA Document and I hereby sign the CLA

sammy995 added 7 commits June 23, 2026 18:01

feat(governance): add regex PII recognizer for privacy gate

af1cab1

feat(governance): reversible PII tokenization for privacy gate

0d4011a

feat(governance): residual fail-safe + fail-closed privacy gate entry…

2cb0054

…point

feat(governance): run privacy gate before the LLM in R2

0eccb1c

docs(changelog): record the R2 privacy gate

a6b8e7d

test(governance): use RFC 5737 documentation IP in privacy gate tests

1056c58

docs(examples): add offline privacy gate demo

edd3a99

sammy995 requested a review from a team as a code owner June 23, 2026 12:50

github-actions Bot added a commit that referenced this pull request Jun 23, 2026

@sammy995 has signed the CLA in #7

420b4ec

Merge branch 'main' into feat/privacy-gate

c6fd9ec

sammy995 requested a review from a team as a code owner June 25, 2026 03:40

sammy995 wants to merge 8 commits into SantanderAI:main from sammy995:feat/privacy-gate

1 participant
