fix(dv2): ClickHouse customer MDM views admit all source conventions (audit #M1) #108
Merged
…(audit #12)

The five ClickHouse bv_customer_mdm__<branch> views filtered hub rows by a hard-coded record_source = '1c__<branch>', silently dropping every OLTP- and X5-promoted customer (record_source pg_ops__/x5__) from the MDM result. The PostgreSQL port fixed this in #99 (split_part); the ClickHouse half was left behind as a CH/PG split-brain (audit_28_06_26 #M1), and the postgres_oltp README already documented the splitByString admission the views did not implement.

Swap the hub admission filter to splitByString('__', record_source)[2] = '<branch>' (the ClickHouse equivalent of the PG split_part fix), so a customer promoted under any source convention is integrated. View DDL only - customer_hk = md5(business_key) is identical across loaders, so no data migration.

Add tests/unit/test_dv2_business_vault_ddl.py: the CH business-vault views had no parse or coverage at all. Pins that every view parses under sqlglot's ClickHouse dialect and that all five MDM views admit by branch, never by '1c__' again.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What

The five ClickHouse `bv_customer_mdm__<branch>` views filtered hub rows by a hard-coded `record_source = '1c__<branch>'`, silently dropping every OLTP- and X5-promoted customer (record_source `pg_ops__`/`x5__`) from the MDM result.

The PostgreSQL port fixed this in #99 (`split_part`); the ClickHouse half was left behind, a CH/PG split-brain (audit_28_06_26 #M1). The `postgres_oltp` README already documented the `splitByString('__', record_source)[2]` admission that the views did not actually implement.
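
To make the difference between the two admission predicates concrete, here is a minimal Python sketch that emulates them outside ClickHouse. The helper names (`branch_of`, `admits_old`, `admits_new`) and the sample `record_source` values are hypothetical and not taken from the repository. `branch_of` mirrors `splitByString('__', record_source)[2]` and returns an empty string when there is no second element, on the assumption that ClickHouse yields the type's default value for an out-of-range array index.

```python
def branch_of(record_source: str) -> str:
    """Emulate splitByString('__', record_source)[2] (ClickHouse arrays are 1-based)."""
    parts = record_source.split("__")
    # Assumption: an out-of-range index yields the default value (empty string).
    return parts[1] if len(parts) > 1 else ""


def admits_old(record_source: str, branch: str) -> bool:
    """Old filter: record_source = '1c__<branch>'."""
    return record_source == f"1c__{branch}"


def admits_new(record_source: str, branch: str) -> bool:
    """New filter: splitByString('__', record_source)[2] = '<branch>'."""
    return branch_of(record_source) == branch


# Illustrative record_source values, one per source convention named above,
# plus one row from a different branch that neither filter should admit.
sources = ["1c__msk", "pg_ops__msk", "x5__msk", "1c__spb"]

old_admitted = [s for s in sources if admits_old(s, "msk")]
new_admitted = [s for s in sources if admits_new(s, "msk")]

print(old_admitted)  # ['1c__msk']
print(new_admitted)  # ['1c__msk', 'pg_ops__msk', 'x5__msk']
```

The single underscore inside `pg_ops` does not interfere, because the split is on the two-character separator `__`.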
Change

- … `bv_customer_mdm__{msk,spb,ekb,dxb,ala}.sql`): swap the hub admission filter to `splitByString('__', record_source)[2] = '<branch>'` (the ClickHouse equivalent of the PG `split_part` fix). View DDL only: `customer_hk = md5(business_key)` is identical across loaders, so no data migration. Per-branch PII jurisdiction (RBAC on the view) is preserved.
- `tests/unit/test_dv2_business_vault_ddl.py`: the CH business-vault views had no parse or coverage at all. Pins that every view parses under sqlglot's ClickHouse dialect and that all five MDM views admit by branch, never by `1c__` again, symmetric to the PG regression test.

Verification

- … DDL); sqlglot parses all 7 CH business-vault views.
- … `clickhouse local`, no Docker): seeded CUST-A via `1c__msk` and CUST-B via `pg_ops__msk`; the fixed view returns both (2 rows), the old hard-coded filter returns only CUST-A (1 row). The counterfactual confirms the fix admits the OLTP-promoted customer the bug dropped. This is the ClickHouse mirror of the live-PG proof in #99 (fix(dv2): port customer MDM views to PostgreSQL with branch-agnostic hub filter (audit #12)).
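
The counterfactual can also be replayed without a database. The sketch below is a Python stand-in for the `clickhouse local` run, not the actual verification script: the hub is a plain list of dicts, the field names and the helpers `hub_row`, `fixed_view` and `old_view` are hypothetical, and `customer_hk` is assumed to be the hex digest of `md5(business_key)`.

```python
import hashlib


def customer_hk(business_key: str) -> str:
    # customer_hk = md5(business_key): depends only on the key, not on the loader.
    return hashlib.md5(business_key.encode("utf-8")).hexdigest()


def hub_row(business_key: str, record_source: str) -> dict:
    return {
        "customer_hk": customer_hk(business_key),
        "business_key": business_key,
        "record_source": record_source,
    }


def fixed_view(hub: list, branch: str) -> list:
    # New admission: second '__'-separated element equals the branch.
    # The [1:2] slice is empty when there is no second element, so such rows are skipped.
    return [r for r in hub if r["record_source"].split("__")[1:2] == [branch]]


def old_view(hub: list, branch: str) -> list:
    # Old admission: hard-coded 1C prefix.
    return [r for r in hub if r["record_source"] == f"1c__{branch}"]


# Same seed as the verification above: CUST-A via 1C, CUST-B via OLTP.
hub = [hub_row("CUST-A", "1c__msk"), hub_row("CUST-B", "pg_ops__msk")]

fixed = fixed_view(hub, "msk")
old = old_view(hub, "msk")

print([r["business_key"] for r in fixed])  # ['CUST-A', 'CUST-B']
print([r["business_key"] for r in old])    # ['CUST-A']

# The hash key is the same whichever loader promoted the customer,
# which is why a view-only change needs no data migration.
same_hk = hub_row("CUST-B", "1c__msk")["customer_hk"] == hub_row("CUST-B", "pg_ops__msk")["customer_hk"]
print(same_hk)  # True
```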