280 - Making Payload in Satellites and Reference Satellites optional #458

Open
polat-deniz wants to merge 6 commits into main from 280-sat-payload-optional

@polat-deniz (Collaborator) commented May 18, 2026

Description

feat: optional src_payload for Satellites and Reference Satellites

Makes src_payload (as well as src_hashdiff) optional across all satellite macros (sat_v0, sat_v1, ref_sat_v0, ref_sat_v1) for all 9 supported database adapters. This introduces two new ways to use satellites alongside the existing classic approach.


Three ways to use satellites

| Mode | src_payload | src_hashdiff | Behavior |
| --- | --- | --- | --- |
| A: Classic (unchanged) | ≥ 1 column | set | Hashdiff-based deduplication and insert |
| B: Single-Attribute (new) | exactly 1 column | not set | Deduplication compares the payload column directly; no hashdiff column in the output |
| C: No-Payload / Tracking (new) | empty / not set | not set | No deduplication; a record is only inserted when a new hashkey appears |

Validation: if src_payload has more than one column and src_hashdiff is not set, dbt will throw a clear compiler error.
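
The mode selection and the validation rule above can be sketched in a few lines. This Python snippet is an illustration only, not the package's Jinja implementation: the function name `satellite_mode` and the single-letter mode labels are hypothetical, and treating "hashdiff set, payload empty" as Mode A is an assumption, since the table does not cover that combination.

```python
from typing import Optional, Sequence


def satellite_mode(src_payload: Optional[Sequence[str]] = None,
                   src_hashdiff: Optional[str] = None) -> str:
    """Return 'A', 'B' or 'C' for a combination of payload and hashdiff."""
    payload = list(src_payload or [])

    if src_hashdiff:
        # Mode A: classic, hashdiff-based deduplication.
        # Assumption: an empty payload with a hashdiff also counts as classic.
        return "A"
    if len(payload) == 1:
        # Mode B: the single payload column is compared directly.
        return "B"
    if not payload:
        # Mode C: tracking only, no deduplication column at all.
        return "C"
    # More than one payload column without a hashdiff is rejected.
    raise ValueError(
        "src_hashdiff is required when src_payload has more than one column"
    )
```

For example, `satellite_mode(["status"])` yields `"B"`, while `satellite_mode(["name", "address"])` raises, mirroring the compiler error described above.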


What changed

Top-level dispatchers

macros/tables/sat_v0.sql, sat_v1.sql, ref_sat_v0.sql, ref_sat_v1.sql

  • src_payload and src_hashdiff / hashdiff are no longer required parameters
  • A validation check was added for the case where multiple payload columns are used without a hashdiff

Dialect implementations

9 adapters × 4 macro types = 36 files

  • The deduplication column is now derived from the hashdiff (Mode A), the single payload column (Mode B), or not used at all (Mode C)
  • The deduplication CTE is skipped entirely in Mode C
  • The hashdiff column is left out of SELECT lists and duplicate checks when it is not provided
  • Mode B uses the payload column directly in window function comparisons
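
As a rough illustration of the points above, the derivation of the deduplication column and the conditional SELECT list might look like the following Python sketch. The helper names `dedup_column` and `select_columns` and the column order are hypothetical; the actual logic lives in the adapter-specific Jinja macros and differs per dialect. The sketch assumes the multi-column validation has already run in the top-level dispatcher.

```python
from typing import List, Optional, Sequence


def dedup_column(src_payload: Sequence[str],
                 src_hashdiff: Optional[str] = None) -> Optional[str]:
    """Pick the column that change detection compares, or None in Mode C."""
    if src_hashdiff:
        return src_hashdiff          # Mode A: compare hashdiffs
    if len(src_payload) == 1:
        return src_payload[0]        # Mode B: compare the payload column itself
    return None                      # Mode C: nothing to compare


def select_columns(parent_hashkey: str, src_ldts: str, src_rsrc: str,
                   src_payload: Sequence[str],
                   src_hashdiff: Optional[str] = None) -> List[str]:
    """Build the SELECT list; the hashdiff is left out when it is not provided."""
    columns = [parent_hashkey]
    if src_hashdiff:
        columns.append(src_hashdiff)
    columns += [src_ldts, src_rsrc]
    columns += list(src_payload)
    return columns
```

When `dedup_column` returns `None`, a caller would skip the deduplication CTE entirely, which corresponds to Mode C.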

Backwards compatibility

All existing models that already provide src_payload and src_hashdiff continue to work without any changes.

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

Tested with the 280-sat-payload-optional branch of datavault4dbt together with the 280-sat-payload-optional branch of datavault4dbt-automatic-tests, to make sure that all special cases are covered. The tests were run in the CI/CD pipeline.

@polat-deniz added the testing label ("To trigger the automated test workflow as internal User.") May 18, 2026
@remoteworkflow

dbt test combined result: ❌


Details

RESULTS for Synapse:
❌ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Postgres:
❌ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for BigQuery:
❌ dbt-core-tests
❌ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Redshift:
❌ dbt-core-tests
❌ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Snowflake:
❌ dbt-core-tests
❌ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Exasol:
❌ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Fabric:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Oracle:
❌ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Databricks:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for SQL Server:
❌ dbt-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Trino:
❌ dbt-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


Link to workflow summary: https://github.com/ScalefreeCOM/datavault4dbt-ci-cd/actions/runs/26023273407

@remoteworkflow (Bot) removed the testing label ("To trigger the automated test workflow as internal User.") May 18, 2026
@polat-deniz added the testing label ("To trigger the automated test workflow as internal User.") May 18, 2026
@polat-deniz requested review from tkiehn and tkirschke May 18, 2026 15:32
@remoteworkflow

dbt test combined result: ❌


Details

RESULTS for Synapse:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Postgres:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for BigQuery:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Redshift:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Snowflake:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Exasol:
❌ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Fabric:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Oracle:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Databricks:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for SQL Server:
❌ dbt-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Trino:
✅ dbt-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


Link to workflow summary: https://github.com/ScalefreeCOM/datavault4dbt-ci-cd/actions/runs/26043144400

@remoteworkflow (Bot) removed the testing label ("To trigger the automated test workflow as internal User.") May 18, 2026
@tkirschke (Member) left a comment

Hi @polat-deniz, I did not look through all adapters, but I have a few general change requests that should be applied to all adapters accordingly :)

Comment thread macros/tables/bigquery/ref_sat_v0.sql Outdated
WHERE {{ src_ldts }} != {{ datavault4dbt.string_to_timestamp(timestamp_format, end_of_all_times) }}
)
{%- if is_incremental() %}
WHERE {{ src_ldts }} != {{ datavault4dbt.string_to_timestamp(timestamp_format, end_of_all_times) }}
Member

@polat-deniz this is unnecessary as written. In the old version, the WHERE != end_of_all_times filter exists only to protect the HWM (without the filter, max(ldts) would always be end_of_all_times).

With your change, the filter would also apply when incremental=true and disable_hwm=true, and it could potentially filter out the ghost record (which would already be present after the initial load anyway, but I would avoid a WHERE like this). To keep the ghost record from being inserted again, the classic satellite deduplication is sufficient: the error_key already exists in the sat, the hashdiff has not changed, so there is no insert.

So, in my opinion, please remove this change again from all satellites.
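
The role of the end_of_all_times filter in the high-water-mark (HWM) subquery can be illustrated with a small Python sketch. It is not taken from the macros: the helper names are hypothetical, timestamps stand in for whole rows, and the sentinel value is an assumed example (the real value comes from the project's end_of_all_times and timestamp_format settings).

```python
from datetime import datetime

# Assumed sentinel for illustration only.
END_OF_ALL_TIMES = datetime(8888, 12, 31, 23, 59, 59)


def high_water_mark(target_ldts, exclude_sentinel=True):
    """MAX(ldts) over the target, optionally ignoring the ghost record timestamp."""
    candidates = [
        ts for ts in target_ldts
        if not (exclude_sentinel and ts == END_OF_ALL_TIMES)
    ]
    return max(candidates) if candidates else None


def rows_to_load(source_ldts, hwm):
    """Keep only source rows that are newer than the high-water mark."""
    return [ts for ts in source_ldts if hwm is None or ts > hwm]


# Target already contains two loads plus a ghost record at end_of_all_times.
target = [datetime(2026, 5, 1), datetime(2026, 5, 2), END_OF_ALL_TIMES]
source = [datetime(2026, 5, 2), datetime(2026, 5, 3)]

filtered_hwm = high_water_mark(target)                            # last real load
unfiltered_hwm = high_water_mark(target, exclude_sentinel=False)  # the sentinel
```

With the filter, `rows_to_load(source, filtered_hwm)` keeps the load from May 3; without it, the sentinel becomes the HWM and no source row ever qualifies. That is the only job of the filter, which is why it belongs in the MAX(ldts) subquery and not on the source side.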

Comment thread macros/tables/bigquery/ref_sat_v0.sql Outdated
),
{%- endif %}

{%- set last_cte = 'source_data' -%}
Member

@polat-deniz this is set here, which is a good idea in principle, but then it would also have to be used in line 97; otherwise it is defined here unnecessarily, since it is overwritten again in line 104.
This will be similar for all other satellites.

Comment thread macros/tables/bigquery/sat_v0.sql Outdated
MAX({{ src_ldts }}) FROM {{ this }}
WHERE {{ src_ldts }} != {{ datavault4dbt.string_to_timestamp(timestamp_format, end_of_all_times) }}
)
AND {{ src_ldts }} != {{ datavault4dbt.string_to_timestamp(timestamp_format, end_of_all_times) }}
Member

See the comment above.

Comment thread macros/tables/databricks/ref_sat_v0.sql Outdated

{%- set ref_key = datavault4dbt.escape_column_names(ref_key) -%}
{%- if has_hashdiff -%}
{%- set ns.src_hashdiff = datavault4dbt.escape_column_names(ns.src_hashdiff) -%}
Member

Indent the body of the if block.

Comment thread macros/tables/databricks/ref_sat_v0.sql Outdated
WHERE {{ src_ldts }} != {{ datavault4dbt.string_to_timestamp(timestamp_format, end_of_all_times) }}
)
{%- if is_incremental() %}
WHERE {{ src_ldts }} != {{ datavault4dbt.string_to_timestamp(timestamp_format, end_of_all_times) }}
Member

same here

Comment thread macros/tables/databricks/sat_v0.sql Outdated
{%- set src_rsrc = datavault4dbt.escape_column_names(src_rsrc) -%}
{%- set parent_hashkey = datavault4dbt.escape_column_names(parent_hashkey) -%}
{%- if has_hashdiff -%}
{%- set src_hashdiff = datavault4dbt.escape_column_names(src_hashdiff) -%}
Member

indent

Comment thread macros/tables/databricks/sat_v0.sql Outdated
MAX({{ src_ldts }}) FROM {{ this }}
WHERE {{ src_ldts }} != {{ datavault4dbt.string_to_timestamp(timestamp_format, end_of_all_times) }}
)
AND {{ src_ldts }} != {{ datavault4dbt.string_to_timestamp(timestamp_format, end_of_all_times) }}
Member

Remove this.

@polat-deniz added the testing label ("To trigger the automated test workflow as internal User.") May 22, 2026
@remoteworkflow

dbt test combined result: ❌


Details

RESULTS for Synapse:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Postgres:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for BigQuery:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Redshift:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Snowflake:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Exasol:
❌ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Fabric:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Oracle:
❌ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Databricks:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for SQL Server:
❌ dbt-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Trino:
✅ dbt-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


Link to workflow summary: https://github.com/ScalefreeCOM/datavault4dbt-ci-cd/actions/runs/26281641264

@remoteworkflow (Bot) removed the testing label ("To trigger the automated test workflow as internal User.") May 22, 2026
@polat-deniz force-pushed the 280-sat-payload-optional branch from 84c26d2 to eec45ff May 26, 2026 15:02
@polat-deniz added the testing label ("To trigger the automated test workflow as internal User.") May 26, 2026
@polat-deniz
Collaborator Author

Hey @tkirschke,

I addressed the requested changes and applied them across the affected adapters:

  • removed the additional source-side end_of_all_times filters; the filter now remains only in the HWM MAX(ldts) subquery
  • restored the standard is_incremental() and not disable_hwm handling for source_data where applicable
  • adjusted last_cte so it is set based on whether a payload exists and then used consistently downstream
  • cleaned up the indentation in the has_hashdiff blocks
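
The last_cte adjustment follows a common pattern for chaining optional CTEs: track the name of the most recent CTE and have every later step read from it. A minimal Python sketch of that pattern, with hypothetical CTE names (`source_data`, `deduplicated_rows`) and a placeholder `stage` relation that are not taken from the macros:

```python
def build_query(has_payload: bool) -> str:
    """Chain CTEs so that each step reads from whatever came last."""
    ctes = ["source_data AS (SELECT * FROM stage)"]
    last_cte = "source_data"

    if has_payload:
        # The deduplication step only exists when there is something to compare.
        ctes.append(f"deduplicated_rows AS (SELECT * FROM {last_cte})")
        last_cte = "deduplicated_rows"

    # The final SELECT never hard-codes a CTE name; it always uses last_cte.
    return "WITH " + ", ".join(ctes) + f" SELECT * FROM {last_cte}"
```

Because the final SELECT only refers to `last_cte`, skipping the deduplication step in the no-payload case needs no further branching downstream.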

@remoteworkflow

dbt test combined result: ❌


Details

RESULTS for Synapse:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Postgres:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for BigQuery:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Redshift:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Snowflake:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Exasol:
❌ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Fabric:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Oracle:
✅ dbt-core-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Databricks:
✅ dbt-core-tests
✅ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for SQL Server:
❌ dbt-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


RESULTS for Trino:
✅ dbt-tests
⚠️ dbt-fusion-tests
⚠️ dbt-macro-tests
⚠️ tech-tests


Link to workflow summary: https://github.com/ScalefreeCOM/datavault4dbt-ci-cd/actions/runs/26458084442

@remoteworkflow (Bot) removed the testing label ("To trigger the automated test workflow as internal User.") May 26, 2026
2 participants