fix limited-distinct identity contract #2174

Merged
shangyian merged 11 commits into DataJunction:main from shangyian:reduce-pushdown
May 25, 2026
Conversation

@shangyian
Collaborator

@shangyian shangyian commented May 24, 2026

Summary

A metric defined as COUNT(DISTINCT id) carries a derived expression like count(DISTINCT id_distinct_HASH), which references the hashed component identity. However, the measures SQL output for that metric projects only the bare id column, so external consumers that evaluate the derived expression against the measures output cannot resolve the column.

PR #1931 introduced a grain_alias field on MetricComponent. For plain-column DISTINCT it was set to the bare column name; for complex DISTINCT (e.g., CASE/IF expressions) it stayed equal to the hashed component name. The measures SQL then projected plain-DISTINCT components under grain_alias as an optimization: the bare column is already in the projection as a GROUP BY grain key, so there was no need to emit a redundant aliased copy.
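The mismatch described above can be sketched in a few lines. This is only an illustration, not DataJunction's code: `ComponentSketch`, `build_component`, `derived_expression`, and the plain-column regex are hypothetical stand-ins, and `HASH` is the same placeholder used in the summary.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentSketch:
    """Hypothetical stand-in for MetricComponent (only the fields discussed here)."""
    name: str         # hashed component identity, e.g. "id_distinct_HASH"
    expression: str   # the argument of COUNT(DISTINCT ...)
    grain_alias: str  # column name the measures SQL projected under


# Assumption: a "plain column" is a single bare identifier.
_PLAIN_COLUMN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_component(name: str, expression: str) -> ComponentSketch:
    """Pick grain_alias the way the summary describes the behavior after PR #1931."""
    is_plain = bool(_PLAIN_COLUMN.match(expression))
    # Plain column: alias is the bare column. Complex: alias stays the hashed name.
    grain_alias = expression if is_plain else name
    return ComponentSketch(name=name, expression=expression, grain_alias=grain_alias)


def derived_expression(component: ComponentSketch) -> str:
    """The derived expression always references the hashed component name."""
    return f"count(DISTINCT {component.name})"


plain = build_component("id_distinct_HASH", "id")
complex_ = build_component("flag_distinct_HASH", "CASE WHEN ok THEN id END")

# For the plain case the derived expression names a column that the
# measures output (projected under grain_alias) does not contain.
mismatch = plain.name != plain.grain_alias
```

In this sketch `mismatch` is true only for the plain-column case, which is the case the PR addresses.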

To fix this, we add the register_limited_component helper function:

  • For plain-column DISTINCT, it emits an extra <bare> AS <component.name> projection so that the combiner rewriters stay consistent.
  • For complex DISTINCT, the behavior is unchanged.
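A rough sketch of the projection rule in the two bullets above follows. The real helper's signature and internals are not shown in this PR description, so the function name `register_limited_component_sketch`, its parameters, and the string-based projection list are all assumptions made for illustration. The complex branch assumes the expression is projected under the component name, which follows from grain_alias being equal to that name.

```python
import re

# Assumption: a "plain column" is a single bare identifier.
_PLAIN_COLUMN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def register_limited_component_sketch(
    projections: list[str],
    component_name: str,
    expression: str,
) -> str:
    """Add the projections for one DISTINCT component (hypothetical shape).

    Returns the output column name that a derived expression can reference.
    """
    if _PLAIN_COLUMN.match(expression):
        # The bare column stays in the projection as a GROUP BY grain key.
        if expression not in projections:
            projections.append(expression)
        # The fix: also emit `<bare> AS <component.name>` so the hashed
        # identity used by the derived expression resolves.
        projections.append(f"{expression} AS {component_name}")
    else:
        # Complex DISTINCT: unchanged, projected under the component name.
        projections.append(f"{expression} AS {component_name}")
    return component_name


plain_projections = ["id"]
plain_output = register_limited_component_sketch(
    plain_projections, "id_distinct_HASH", "id"
)

complex_projections: list[str] = []
complex_output = register_limited_component_sketch(
    complex_projections, "flag_distinct_HASH", "CASE WHEN ok THEN id END"
)
```

With this rule, the plain case yields both `id` and `id AS id_distinct_HASH`, so consumers of either name can resolve their column.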

This also adds backward compatibility for materializations created before the fix.
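The description does not say how the backward compatibility is implemented. One plausible shape, shown purely as an assumption, is a lookup that prefers the hashed component column and falls back to the bare column when reading a materialization that predates the fix. The function `resolve_component_column` and its parameters are hypothetical.

```python
def resolve_component_column(
    component_name: str,
    grain_alias: str,
    available_columns: set[str],
) -> str:
    """Choose which materialized column to read for a DISTINCT component.

    Hypothetical fallback: newer materializations expose the hashed
    component name; older ones only have the bare column (grain_alias).
    """
    if component_name in available_columns:
        return component_name
    if grain_alias in available_columns:
        return grain_alias
    raise KeyError(
        f"neither {component_name!r} nor {grain_alias!r} found in materialization"
    )


# Materialization created after the fix: hashed column is present.
new_column = resolve_component_column(
    "id_distinct_HASH", "id", {"id", "id_distinct_HASH", "country"}
)

# Materialization created before the fix: only the bare column exists.
old_column = resolve_component_column("id_distinct_HASH", "id", {"id", "country"})
```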

Test Plan

  • PR has an associated issue: #
  • make check passes
  • make test shows 100% unit test coverage

Deployment Plan

@netlify

netlify Bot commented May 24, 2026

Deploy Preview for thriving-cassata-78ae72 canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 3ab19cb |
| 🔍 Latest deploy log | https://app.netlify.com/projects/thriving-cassata-78ae72/deploys/6a13ddaa7771630008e51762 |

@shangyian shangyian changed the title from "Reduce aggressive pushdown" to "fix limited-distinct identity contract" May 25, 2026
@shangyian shangyian marked this pull request as ready for review May 25, 2026 05:53
@shangyian shangyian merged commit 429949b into DataJunction:main May 25, 2026
21 checks passed
shangyian added a commit that referenced this pull request May 25, 2026