Skip to content

fix(eap): Generate minimal NULL-free SQL for map-backed attribute filters#8035

Merged
phacops merged 11 commits into
masterfrom
fix/eap-in-not-in-analyzer-column-mismatch
Jun 16, 2026
Merged

fix(eap): Generate minimal NULL-free SQL for map-backed attribute filters#8035
phacops merged 11 commits into
masterfrom
fix/eap-in-not-in-analyzer-column-mismatch

Conversation

@phacops

@phacops phacops commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

EAP IN / NOT IN filters on optional attributes still fail under the new ClickHouse analyzer, even after #8032:

Code: 10. DB::Exception: Not found column sumOrNullIf(... in(if(...,
arrayElement(...), _CAST(NULL_Nullable(String), 'Nullable(String)')), __set)) ... in block.
There are only columns: ... in(if(..., arrayElement(...), NULL_Nullable(Nothing)), __set) ...

#8032 fixed the isNull(...) occurrence of the existence if(mapContains, arrayElement, NULL), but the in(...) left operand itself still diverges: the analyzer keeps the CAST when naming the projection column but folds it to a bare NULL_Nullable(Nothing) in the computed block, so the extrapolation aggregate column no longer matches its projection name. It's the IN-operand coercion, independent of whether the NULL is bare or typed.

Fix

Build per-key filters without the NULL-laden existence if(...). Since isNull(if(mapContains, arrayElement, NULL)) is just not(mapContains), each comparison uses the raw arrayElement (no NULL) guarded by mapContains:

IN:          and(exists, in(value, set))
NOT IN:      not(and(exists, in(value, set)))
EQUALS:      and(exists, equals(value, v));      = null  -> not(exists)
NOT_EQUALS:  not(and(exists, equals(value, v))); != null -> exists
LIKE:        and(exists, like(value, v))
NOT_LIKE:    not(and(exists, like(value, v)))

mapContains distinguishes an absent key from a stored empty value ('' / 0, which arrayElement also returns for a missing key). Coalesced attributes (transaction, environment, ...) are generated as a NULL-free multiIf over the per-key mapContains (first-present fallthrough), never coalesce(if(..., NULL), ...).

Minimal output

The mapContains guard is only required when the compared literal could equal the column default ('' / 0) — the only value an absent key reads as. For EQUALS / NOT_EQUALS / IN / NOT_IN against a non-default literal it is dropped, so the incident query collapses to its minimal form:

not(in(arrayElement(attributes_string,'sentry.status'), ['ok','cancelled','unknown']))

The guard is kept for comparisons against the default (= '', IN ['', ...]), for LIKE / NOT_LIKE (a pattern may match the default), and for = null / != null.

Verification

  • Unit (TestAnalyzerSafeFilters): pruned forms carry no mapContains / NULL / if(...); guarded forms (default literal, LIKE, null) keep mapContains; coalesced renders as multiIf, not coalesce.
  • E2E (TestEmptyVsAbsentComparison): one row with attr='', one with attr absent — every operator classifies them identically (the empty-vs-absent edge).
  • Local CH is 25.3 (enable_analyzer=1 by default), so the full test_endpoint_time_series / test_endpoint_trace_item_table suites run under the analyzer and pass; CI also runs test_distributed + the 25.3/25.8 matrix.

⚠️ I could not reproduce the original Not found column failure locally — even with the analyzer on over a Distributed table — because it needs the exact prod query (a prepared __set, the full multi-aggregate context) and likely prod scale. The fix removes the exact construct the error points at and is regression-clean under the analyzer, but confirm on staging before re-enabling the analyzer in production; keep the rollback until then.

Fixes SNUBA-B62
Fixes SNUBA-B6C
Fixes SNUBA-A13

… name it

The previous typed-NULL fix (#8032) made the isNull(...) occurrence of the
existence if(mapContains, arrayElement, NULL) consistent, but EAP IN / NOT IN
filters on optional attributes still died under the new ClickHouse analyzer:

  Code: 10. DB::Exception: Not found column sumOrNullIf(... in(if(...,
  arrayElement(...), _CAST(NULL_Nullable(String), 'Nullable(String)')), __set))
  ... in block. There are only columns: ... in(if(..., arrayElement(...),
  NULL_Nullable(Nothing)), __set) ...

The remaining divergence is the in(...) left operand itself: the analyzer keeps
the explicit CAST when it names the projection column but folds it to a bare
NULL_Nullable(Nothing) when it computes the block, so the aggregate column no
longer matches its projection name. This happens regardless of whether the NULL
is bare or typed, because it is the IN-operand coercion that diverges.

Stop feeding the existence if(...) into in(...) entirely. For IN/NOT_IN on
map-backed attributes, compare the raw arrayElement(...) value (which the
existence pass leaves untouched, so no NULL constant survives inside in(...))
and guard it with an explicit mapContains:

  IN:      and(mapContains, in(arrayElement, set))
  NOT IN:  not(and(mapContains, in(arrayElement, set)))

mapContains is what distinguishes an absent key from a stored empty value
('' / 0, which arrayElement also returns for a missing key). This is equivalent
to the old isNull(if(mapContains, arrayElement, NULL)) handling for every value
list these filters accept: the typed array fields only carry scalars, so the
set never contains NULL and the old has(set, NULL) branch was always constant
false. The predicate now contains no NULL literal at all.

Non-IN comparisons are unchanged, as are array/normalized/boolean keys (which
carry no SubscriptableReference and so never produced the foldable if(...)).

Verified semantically against a local ClickHouse via the existing
OP_IN-in-conditional-aggregation e2e tests; the analyzer column-naming itself
must still be replayed with enable_analyzer=1 before re-enabling the analyzer in
production.

Fixes SNUBA-B62
Fixes SNUBA-B6C
Fixes SNUBA-A13
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Agent transcript: https://claudescope.sentry.dev/share/cWD_0FRYO1tqYclqE9vE0yfuaJS-O8RTXHO4feo9IzI
@phacops phacops marked this pull request as ready for review June 15, 2026 20:17
@phacops phacops requested review from a team as code owners June 15, 2026 20:17
Comment thread snuba/web/rpc/common/common.py Outdated
Comment thread snuba/web/rpc/common/common.py Outdated
phacops and others added 3 commits June 15, 2026 15:14
…rayElement)

Generalize the IN/NOT_IN analyzer-safe rewrite to the other per-key
comparison operators. The legacy idiom wrapped every map lookup in
if(mapContains, arrayElement, NULL) and detected an absent key via
isNull(...), which drags a foldable NULL constant into the WHERE /
aggregate-condition predicate — the same shape that breaks the new ClickHouse
analyzer for IN.

isNull(if(mapContains, arrayElement, NULL)) is just not(mapContains), so each
operator is rebuilt from a (value, exists) pair via the shared
_map_backed_operands helper:

  EQUALS:      and(exists, equals(value, v));   = null  -> not(exists)
  NOT_EQUALS:  not(and(exists, equals(value, v))); != null -> exists
  LIKE:        and(exists, like(value, v))
  NOT_LIKE:    not(and(exists, like(value, v)))

exists = mapContains is the only thing that separates an absent key from a
stored empty value (arrayElement reads both as '' / 0), so the empty-vs-absent
distinction is preserved. The predicates now carry no NULL literal.

Coalesced attributes are excluded (_use_map_backed_operands): coalesce relies
on each branch being NULL for a missing key to fall through to the next, which
the if(..., NULL) wrapper provides but a bare arrayElement does not — they keep
the legacy isNull-based handling. Normalized columns / booleans / arrays were
already on the legacy path (no SubscriptableReference).

Adds AST tests (TestComparisonAnalyzerSafe) asserting the new shapes carry no
NULL literal / SubscriptableReference / foldable if(...), and end-to-end
empty-vs-absent tests (TestEmptyVsAbsentComparison) pinning that each operator
classifies a stored empty value and an absent key identically before and after.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…esce

The previous commit excluded coalesced attributes from the analyzer-safe
rewrite, leaving common filter attributes (transaction, environment,
http.method, release, user.email, ...) on the legacy
coalesce(if(mapContains, ..., NULL), ...) path — still feeding a foldable NULL
into the predicate.

Instead of rewriting the generated coalesce, generate the NULL-free form
directly from the AttributeKey for the filter path: a multiIf over the per-key
mapContains, with the existence OR as the guard.

  value  = multiIf(mapContains(k1), arrayElement(k1), ...,
                   mapContains(k_{n-1}), arrayElement(k_{n-1}), arrayElement(kn))
  exists = mapContains(k1) OR ... OR mapContains(kn)

This preserves the first-present-key fallthrough (the multiIf else value is only
reached when every key is absent, where exists is already false) and the
empty-vs-absent distinction, with no NULL anywhere. _map_backed_operands now
takes the AttributeKey and reuses _generate_subscriptable_reference per
coalesced key, so it never builds a coalesce object. The legacy
coalesce(if(..., NULL)) form remains only in SELECT (via
attribute_key_to_expression), where NULL output is wanted.

Coverage: TestComparisonAnalyzerSafe asserts the coalesced shape is a multiIf
with no coalesce / NULL / SubscriptableReference / if(...). The existing
test_coalesce_attributes (writes deprecated ai.model_id, filters on canonical
gen_ai.response.model) exercises the fallthrough end-to-end through this path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
No behavior change. Trim the repeated analyzer-bug explanation to one canonical
description in _map_backed_operands (other helpers reference it), shorten inline
comments, and merge the two near-duplicate AST test classes
(TestInNotInAnalyzerSafe + TestComparisonAnalyzerSafe) into a single
TestAnalyzerSafeFilters sharing one set of walk/assert helpers. Drops the
redundant existence-pass no-op test (its guarantee — no SubscriptableReference /
if(...) in the predicate — is already covered by _assert_clean).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@phacops phacops changed the title fix(eap): Keep existence if(...) out of IN/NOT_IN so the analyzer can name it fix(eap): Build map-backed attribute filters NULL-free for the new analyzer Jun 15, 2026
…e default

The existence guard is only needed to stop an absent key — which arrayElement
reads as the column default ('' / 0) — from matching. For EQUALS / NOT_EQUALS /
IN / NOT_IN against a literal that isn't the default, that can't happen, so the
guard is dead weight. Emit the bare comparison instead, giving the simplest
analyzer-safe SQL:

  sentry.status NOT IN ['ok','cancelled','unknown']
    -> not(in(arrayElement(attributes_string,'sentry.status'), ['ok','cancelled','unknown']))

The guard is kept when the literal IS the default (`= ''`, `= 0`, `IN ['', ...]`)
so absent keys are still excluded, and always for LIKE/NOT_LIKE (a pattern may
match the default) and the `= null` / `!= null` cases. The empty-vs-absent e2e
tests still pass unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@phacops phacops changed the title fix(eap): Build map-backed attribute filters NULL-free for the new analyzer fix(eap): Generate minimal NULL-free SQL for map-backed attribute filters Jun 15, 2026
_map_backed_operands now writes arrayElement / mapContains directly from the
AttributeKey (using PROTO_TYPE_TO_ATTRIBUTE_COLUMN), instead of generating a
SubscriptableReference via _generate_subscriptable_reference and then rewriting
it to arrayElement. Drops the _subscriptable_references_to_array_element helper
and the get_field_existence_expression round-trip for this path.

Identical generated SQL (pure internal cleanup) — the renders and all unit/e2e
tests are unchanged. As a side effect the operands are alias-free by
construction (we never attach an alias), rather than building-with-alias then
stripping; conditions don't need aliases and an alias here would collide with
the SELECT clause's existence if(...) for the same attribute.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@phacops phacops enabled auto-merge (squash) June 16, 2026 00:08
Comment thread snuba/web/rpc/common/common.py
Map-backed filters now build arrayElement(attributes_string, key) directly
instead of a SubscriptableReference, but _apply_virtual_columns only rewrote
SubscriptableReference reads into the value-mapping transform. A filter on a
virtual column (e.g. device.class) was therefore left comparing the raw public
alias key against the unmapped literal, matching nothing and returning zero
rows. Recognize the arrayElement form too so virtual-column filters compare
against the mapped value, as they did before the map-backed rewrite.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Agent transcript: https://claudescope.sentry.dev/share/P8gdU4fKq0U5g2jvhHBcfZ4HxsStWD1OJ1lagpQ_ldo
Comment thread snuba/web/rpc/common/common.py
@phacops phacops disabled auto-merge June 16, 2026 03:55
_comparison_can_match_column_default is always called for map-backed
OP_IN/OP_NOT_IN, and it routes through _scalar_value, which raised
NotImplementedError on val_null (a null filter value or a null array
element). Translation crashed where the non-map-backed path builds in(k, NULL)
fine. Treat val_null as None in _scalar_value so the guard sees a non-default
scalar and SQL is built consistently.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread snuba/web/rpc/common/common.py
Map-backed filters build a mapContains(attributes_string, key) existence guard
from the request attribute name. For a virtual column that key is the
to_column_name, which is absent in storage, so LIKE / NOT_LIKE, null checks, and
default-value guards kept the existence check on the virtual key and matched no
(or every) row. _apply_virtual_columns now rewrites the mapContains form too,
mapping the guard to the backing from_column_name's existence — alongside the
arrayElement value rewrite.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 8d9b451. Configure here.

Comment thread snuba/web/rpc/common/common.py
Comment thread snuba/web/rpc/common/common.py
@phacops phacops merged commit e722ded into master Jun 16, 2026
66 checks passed
@phacops phacops deleted the fix/eap-in-not-in-analyzer-column-mismatch branch June 16, 2026 04:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants