[cdc] Fix PostgreSQL timestamp(3) incorrectly mapped to BIGINT in PostgresRecordParser #8222
Open
q8webmaster wants to merge 4 commits into
Problem
When syncing a PostgreSQL table that contains a `timestamp(n)` column with `n <= 3` (millisecond precision), the Paimon CDC job enters a continuous restart loop, crashing within the first minute with an exception. That exception fires on the very first CDC record, so the job never makes useful progress.
Root cause
There are two code paths that derive a Paimon `DataType` for a PostgreSQL column:

| Code path | Input | Derived type |
|---|---|---|
| JDBC (`retrieveSchema`) | `DatabaseMetaData` type `"timestamp"`, scale `3` | `TIMESTAMP(3)` ✅ |
| `PostgresRecordParser.extractFieldType` (CDC record parsing) | `int64` / `io.debezium.time.Timestamp` | `BIGINT` ❌ |

PostgreSQL `timestamp(n)` with `n <= 3` is encoded by Debezium using the `io.debezium.time.Timestamp` logical type (epoch-millis, int64). The `int64` case in `extractFieldType` only handles `MicroTimestamp` and `MicroTime`, so `io.debezium.time.Timestamp` silently falls through to `return DataTypes.BIGINT()`.

The table is created with `TIMESTAMP(3)` from the JDBC path. When the first CDC record arrives, `extractFieldType` returns `BIGINT`, `parseSchemaChange` emits an `UpdateColumnType` (`TIMESTAMP(3)` → `BIGINT`) event, `canConvert` returns `EXCEPTION`, and the job crashes. Because this happens on the first record after every restart, the job loops continuously.

Fix
Add the missing `Timestamp.SCHEMA_NAME` branch to the `int64` case in `extractFieldType`, returning `TIMESTAMP(3)`. `io.debezium.time.Timestamp` represents epoch-milliseconds, so `TIMESTAMP(3)` is the correct Paimon type, matching the JDBC path exactly. With both paths in agreement, `parseSchemaChange` sees no field type difference and no schema evolution fires.

Prior art
PR #6239 fixed a structurally identical bug for `DECIMAL` (`bytes` / `org.apache.kafka.connect.data.Decimal` falling through to `BYTES`). This PR applies the same pattern to `Timestamp`.

Changes
- `PostgresRecordParser.java`: add `Timestamp.SCHEMA_NAME` branch to `int64` case in `extractFieldType`
- `PostgresRecordParserTest.java`: three unit tests covering `io.debezium.time.Timestamp` → `TIMESTAMP(3)`, `MicroTimestamp` → `TIMESTAMP(6)` (regression), and plain `int64` → `BIGINT` (regression)
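
As a rough illustration of the three cases listed above, the sketch below models the `int64` branch with plain strings instead of the real Paimon `DataType` and Debezium classes. The class `Int64MappingSketch`, its method names, and the `withFix` flag are hypothetical and exist only for this example. The logical type names are written out as string literals and are assumed to match the Debezium `SCHEMA_NAME` constants. `MicroTime` is left out for brevity.

```java
/** Standalone sketch; it does not use the real Paimon or Debezium classes. */
final class Int64MappingSketch {

    // Assumed to equal the Debezium SCHEMA_NAME constants.
    static final String DEBEZIUM_TIMESTAMP = "io.debezium.time.Timestamp";
    static final String DEBEZIUM_MICRO_TIMESTAMP = "io.debezium.time.MicroTimestamp";

    /**
     * Hypothetical stand-in for the int64 case of extractFieldType.
     * Returns the derived type as a string; withFix toggles the new branch.
     */
    static String mapInt64(String logicalName, boolean withFix) {
        if (DEBEZIUM_MICRO_TIMESTAMP.equals(logicalName)) {
            return "TIMESTAMP(6)"; // epoch-micros
        }
        if (withFix && DEBEZIUM_TIMESTAMP.equals(logicalName)) {
            return "TIMESTAMP(3)"; // epoch-millis, the branch this PR adds
        }
        return "BIGINT"; // plain int64, or an unhandled logical type
    }

    /** Hypothetical stand-in for the JDBC path: type "timestamp" plus scale. */
    static String mapJdbcTimestamp(int scale) {
        return "TIMESTAMP(" + scale + ")";
    }

    /** True when both paths derive the same type, so no type change is detected. */
    static boolean pathsAgree(String logicalName, int jdbcScale, boolean withFix) {
        return mapJdbcTimestamp(jdbcScale).equals(mapInt64(logicalName, withFix));
    }

    public static void main(String[] args) {
        // Without the new branch the two paths disagree for timestamp(3).
        System.out.println("before fix, paths agree: "
                + pathsAgree(DEBEZIUM_TIMESTAMP, 3, false));
        // With the new branch both paths derive TIMESTAMP(3).
        System.out.println("after fix, paths agree: "
                + pathsAgree(DEBEZIUM_TIMESTAMP, 3, true));
        // Regression cases: MicroTimestamp and plain int64 are unchanged.
        System.out.println(mapInt64(DEBEZIUM_MICRO_TIMESTAMP, true));
        System.out.println(mapInt64(null, true));
    }
}
```

Passing `withFix = false` reproduces the fall-through to `BIGINT` described under Root cause, which is the disagreement that triggers the `UpdateColumnType` event.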