Fix schema restore failure with partitioned tables and add --data-only flag #775
Open
blakewatters wants to merge 1 commit into xataio:main
Summary

- Fix `--init`/`--snapshot-tables` failing on databases with partitioned tables (PostgreSQL 17/18)
- Add `--data-only` flag to allow data-only snapshots when schema is pre-populated on the target

Bug: Schema restore treats partition errors as fatal
When restoring a schema from a source database that uses declarative partitioning, pgstream's `pg_dump`-based schema restore fails with errors containing `"already a partition"`.

Root cause: PostgreSQL's `pg_dump -Fp` output for partitioned tables includes both:

1. `CREATE TABLE child PARTITION OF parent ...`, which attaches the child to the parent
2. `ALTER TABLE parent ATTACH PARTITION child ...`, which fails because the child is already attached from step 1

The error message `"already a partition"` is not matched by the existing ignorable error pattern, which only checks for `"already exists"`. These errors fall through to the default case in `parseErrorLine()` and are classified as critical, causing the entire restore to fail.

This is the same class of harmless, idempotent error as `"already exists"`: the partition is correctly attached, and the redundant `ATTACH PARTITION` fails only because the work is already done.

Fix: Add `strings.Contains(line, "already a partition")` to the ignorable error patterns in `parseErrorLine()`, classified as `ErrRelationAlreadyExists`.
Feature: `--data-only` flag for `--snapshot-tables`

Currently, `--snapshot-tables` always triggers a full schema + data snapshot. There is no way to perform a data-only initial snapshot when the schema already exists on the target.

This is needed for workflows where the schema is restored separately (e.g., via `pg_dump --schema-only | psql`) to work around the partition bug above, or when the target schema is managed independently.

Change: Add a `--data-only` boolean flag to `pgstream run`. When combined with `--snapshot-tables`, it sets `source.postgres.snapshot.mode` to `"data"` instead of `"full"`, skipping the schema restore phase entirely.

Without `--data-only`, behavior is unchanged.
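The flag-to-mode mapping can be illustrated with a small sketch. It assumes nothing about how pgstream actually wires its CLI: the standard library `flag` package is used here purely for self-containment, `snapshotMode` is a hypothetical helper, and the argument list is hard-coded for the example. The mode strings `"data"` and `"full"` and the config key are taken from the description above.

```go
package main

import (
	"flag"
	"fmt"
)

// snapshotMode is a hypothetical helper: it returns the value that
// would be written to source.postgres.snapshot.mode.
func snapshotMode(dataOnly bool) string {
	if dataOnly {
		return "data" // skip the schema restore phase
	}
	return "full" // default: schema + data
}

func main() {
	// Stand-in for the run command's flag set (stdlib flag, by assumption).
	fs := flag.NewFlagSet("run", flag.ContinueOnError)
	dataOnly := fs.Bool("data-only", false, "snapshot data only; skip schema restore")

	// Hard-coded arguments for illustration.
	if err := fs.Parse([]string{"--data-only"}); err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Println("source.postgres.snapshot.mode =", snapshotMode(*dataOnly))
}
```

Because the flag defaults to `false`, omitting it leaves the mode at `"full"`, which matches the unchanged default behavior.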
Files Changed

- `internal/postgres/pg_restore.go`: `"already a partition"` added to ignorable error patterns
- `internal/postgres/pg_restore_test.go`
- `cmd/root_cmd.go`: `--data-only` flag on run command
- `cmd/run_cmd.go`: `--data-only` sets snapshot mode to `"data"`

How We Found This

We're using pgstream to replicate an 838 GB Cloud SQL database (~2,100 tables, including heavily partitioned tables with 200+ partitions each) to a local PostgreSQL 18 instance on a GCP VM. Both `--init` and `--snapshot-tables` failed in the schema restore phase due to the partition error. The data restore and CDC replication work correctly; only the schema restore is affected.

Testing
All existing tests pass. New test cases added:

- `TestParsePgRestoreOutputErrs/partition already attached error`: verifies `"already a partition"` is classified as ignorable
- `TestParseErrorLine/already a partition`: verifies `parseErrorLine` returns `ErrRelationAlreadyExists` for partition errors