fix(security): mask unaliased PII expressions (D2 #125 follow-up)#126

Merged
brownjuly2003-code merged 1 commit into main from fix/masking-unaliased-pii
Jun 30, 2026

@brownjuly2003-code

Owner

What

Closes the unaliased-expression PII leak flagged as a known follow-up in
#125 — the last D2 masking bypass shape.

SELECT upper(email) FROM users_enriched      -- output column: upper(email)
SELECT email || '' FROM users_enriched        -- output column: (email || '')

An unaliased expression has no alias_or_name, so _projection_source_columns
skipped it entirely; the result column kept DuckDB's rendered name
(upper(email), (email || '')), which never matched a masking rule field, so
the PII was returned cleartext with no X-PII-Masked signal. Reproduced
against live DuckDB: was_masked=False.
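
The failure mode can be illustrated with a minimal sketch of name-only matching. The function name `mask_by_name`, the `PII_FIELDS` set, and the `"***"` placeholder are hypothetical stand-ins for illustration, not the repository's masking code; only the result-column name `upper(email)` comes from the description above.

```python
PII_FIELDS = {"email"}  # hypothetical masking rule fields


def mask_by_name(rows, pii_fields=PII_FIELDS):
    """Mask a value only when its result key exactly equals a rule field."""
    was_masked = False
    masked_rows = []
    for row in rows:
        out = {}
        for key, value in row.items():
            if key in pii_fields:
                out[key] = "***"
                was_masked = True
            else:
                out[key] = value
        masked_rows.append(out)
    return masked_rows, was_masked


# Row shaped like the result of: SELECT upper(email) FROM users_enriched
rows = [{"upper(email)": "ALICE@EXAMPLE.COM"}]
masked, was_masked = mask_by_name(rows)
```

Because the key `upper(email)` is not equal to the rule field `email`, the value passes through untouched and `was_masked` stays `False`, which matches the leak described above.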

A name-based fix is impossible because sqlglot's rendering does not reproduce
DuckDB's column naming — UPPER(email) vs upper(email), email || '' vs
(email || '') (case and parenthesisation differ).

Fix

Align projections positionally to the real result keys (projection order ==
result-column order), which mask_query_results already has from the rows. Each
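projection is then keyed by its true DuckDB output name:

  • Aliased or bare projections keep the deep-lineage union shallow
    resolution, including the #125 SELECT *-blinded star leaf.
  • An unaliased expression is masked by the columns it references
    (upper(email) -> email).
  • A top-level SELECT *, a parse failure, or a count mismatch still falls
    back to name-matching.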

This completes the D2 masking surface: aliased renames, subquery/CTE renames,
SELECT *-blinded inner renames (#125), and now unaliased expressions are all
masked.
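
The positional alignment can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `align_projections`, `mask_rows`, the `"***"` placeholder, and the representation of each projection as a set of referenced source columns are all hypothetical. Only the idea (projection order == result-column order, with a name-matching fallback on count mismatch) is taken from the description.

```python
def align_projections(projection_sources, result_keys):
    """Key each projection's source columns by its real result-column name.

    projection_sources: one set of referenced source columns per projection,
    in projection order (hypothetical representation).
    Returns None on a count mismatch so the caller can fall back.
    """
    if len(projection_sources) != len(result_keys):
        return None
    return dict(zip(result_keys, projection_sources))


def mask_rows(rows, projection_sources, pii_fields):
    """Mask every result column whose projection references a PII field."""
    if not rows:
        return rows, False
    result_keys = list(rows[0].keys())
    aligned = align_projections(projection_sources, result_keys)
    if aligned is None:
        # Fallback: treat each result key as its own source (name matching).
        aligned = {key: {key} for key in result_keys}
    to_mask = {key for key, sources in aligned.items() if sources & pii_fields}
    masked = [
        {key: ("***" if key in to_mask else value) for key, value in row.items()}
        for row in rows
    ]
    return masked, bool(to_mask)


# Shaped like: SELECT upper(email), user_id FROM users_enriched
rows = [{"upper(email)": "ALICE@EXAMPLE.COM", "user_id": 7}]
masked, was_masked = mask_rows(rows, [{"email"}, {"user_id"}], {"email"})
```

In this sketch the `upper(email)` column is masked because its projection references `email`, while `user_id` is left untouched because each projection is keyed to its own result column.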

Verification

  • Counterfactual (stash the fix, keep the tests): the unaliased test fails
    on post-#125 code (SELECT upper(email) -> masked=False, cleartext) and
    passes with the fix.
  • No over-masking: a new invariant test pins that a directly-named non-PII
    column alongside a PII one (SELECT email, user_id) keeps user_id untouched
    — positional alignment keys each projection to its own column.
  • Full unit suite green; ruff + ruff format clean; mypy --strict clean.
    The leak was independently confirmed closed against live DuckDB; top-level
    SELECT * name-match and all prior D2 shapes still pass.
  • masking.py mutation score is unaffected (the mutation harness covers the
    masking primitives, not the lineage/projection resolver).

🤖 Generated with Claude Code

Closes the unaliased-expression PII leak flagged as a known follow-up in #125.
An unaliased expression over PII — `SELECT upper(email) FROM users_enriched`
(output column `upper(email)`), `SELECT email || '' ...` (`(email || '')`) — has
no `alias_or_name`, so `_projection_source_columns` skipped it; the result
column kept DuckDB's rendered name, which never matched a rule field, and the
PII was returned cleartext with no X-PII-Masked signal (reproduced against live
DuckDB: was_masked=False). A name-based fix is impossible because sqlglot's
rendering does not reproduce DuckDB's column naming (UPPER(email) vs
upper(email); case and parenthesisation differ).

Align projections positionally to the real result keys (projection order ==
result-column order), which mask_query_results already has from the rows, so
each projection is keyed by its true output name: aliased/bare projections keep
the deep-lineage union shallow resolution (incl. the #125 SELECT*-blinded star
leaf), an unaliased expression is masked by the columns it references
(upper(email) -> email), and a top-level SELECT * / parse failure / count
mismatch still falls back to name-matching. Completes the D2 masking surface.

Counterfactual: the unaliased test fails on post-#125 code (cleartext) and
passes with the fix; a new invariant test pins that a directly-named non-PII
column (SELECT email, user_id) is not over-masked. Full unit suite 1513 passed;
ruff/format/mypy --strict clean. masking.py mutation score unaffected (the
harness covers the masking primitives, not the projection resolver).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

DORA Metrics

  • Window: last 30 days
  • Branch: main
  • Deployment frequency: 137 total / 31.97 per week
  • Lead time for changes: avg 0.31h / median 0.0h
  • Change failure rate: 79.56% (109/137)
  • MTTR: 0.25h across 3 incident(s)

@brownjuly2003-code brownjuly2003-code merged commit 2d46916 into main Jun 30, 2026
23 checks passed
@brownjuly2003-code brownjuly2003-code deleted the fix/masking-unaliased-pii branch June 30, 2026 19:58