Handle inherited tables with different schemas#3936
Handle inherited tables with different schemas#3936lukejudd-lux wants to merge 4 commits intoPeerDB-io:mainfrom
Conversation
heavycrystal
left a comment
There was a problem hiding this comment.
The code itself seems fine, but I'm confused as to what problem this solves. As you noted, Postgres doesn't allow child tables to have a different type for the same column. And since the destination table itself is still using the type of the parent for this column, if the child table is BIGINT instead of TEXT, things can break in a different way. So it's just pushing the error to a different place
|
@heavycrystal after looking a little closer its not that they are different types at source Its that the inherited table has more columns than the parent (as can happen under relkind 'r'). So the columns are out of alignment at parsing time. I've updated the test to address the real issue, the code changes still address the actual problem. |
ef4ed4e to
902c0b9
Compare
|
@heavycrystal after more testing I have added another commit to this change When replicating tables that use PostgreSQL table inheritance (rare I know sorry), child-specific columns were incorrectly detected as schema changes on the parent table. This triggered spurious Root cause: Fix: When processing a RELATION message from a child table, query IMPACTFor anybody with inherited table setups... I've tested this locally and it seems to keep everything in sync. Ultimately, this query should return the same thing on source and destination (with the same schemas if you mirror the whole thing) Sorry Sai...
|
|
@lukejudd-lux child tables having mismatched schemas from the parent table from inheritance is something we've hit before. It is a niche feature that is tricky to support reliably during CDC. As an example, it's possible for a column to be added and then dropped almost immediately during CDC. This should still create the column on the destination table and populate a few rows with the data. Your change makes it such that if |
|
@heavycrystal yeh I can see what you are saying The attributes table can never be WAL consistent. In the example you provide, the current approach would not even include this column (and any data) at the destination.
I can see tradeoffs any way I look at it. In our setup the likelihood of this happening is close to zero, DDL is always intentional in production. Previously we have set up streaming query replication on this specific parent table and a CDC mirror on the inherited tables, which seems to avoid this issue entirely. |
Summary
Fixes Postgres CDC tuple decoding failures and cross-child data corruption when syncing old-style
INHERITStables where children have additional columns beyond the parent's schema.Problem
When a child table's
RelationMessagearrived,processMessagemutatedmsg.RelationIDto the parent's ID, causing the child's RelationMessage to be stored under the parent's key inrelationMessageMapping. This had two consequences:stripe_paymentwith 20 columns includingfk_stripe_payment) had its tuple decoded using the parent's or another child's RelationMessage (e.g. 18 columns), causing columns to be decoded with wrong types.Solution
msg.RelationID— store each child's RelationMessage under its own relation ID. Use the parent's ID only for the "do we care?" check and table/schema lookups.processInsertMessage/processUpdateMessage/processDeleteMessage, use the child's actual relation ID (actualRelID) for RelationMessage lookup so column types match the WAL tuple. The parent's ID is used only for table name and destination routing.Testing
TestInheritedTableWithExtraColumns: parentpayment(4 columns), childstripe_payment(6 columns includingfk_stripe_paymentTEXT andstripe_account_idTEXT). Inserts from child with Stripe charge IDch_3T1KxCFUtwYrZPVC0M6OeTpy— verifies extra columns are decoded as text, not misinterpreted due to positional mismatch.TestMultipleInheritedChildrenNoContamination: two children (stripe_payment,paypal_payment) with different extra columns. Verifies inserts from both are decoded using their own RelationMessages without cross-child contamination.Fixes #3935