[cdc] Fix PostgreSQL timestamp(3) incorrectly mapped to BIGINT in PostgresRecordParser #8222

Open
q8webmaster wants to merge 4 commits into apache:master from q8webmaster:fix/postgres-cdc-timestamp-type-mapping
q8webmaster (Contributor) commented Jun 12, 2026

Problem

When syncing a PostgreSQL table that contains a timestamp(n) column with n <= 3 (millisecond precision), the Paimon CDC job enters a continuous restart loop, crashing within the first minute with:

java.lang.UnsupportedOperationException:
  Cannot convert field <col> from type TIMESTAMP(3) NOT NULL to BIGINT NOT NULL

The exception fires on the very first CDC record, so the job never makes useful progress.

Root cause

There are two code paths that derive a Paimon DataType for a PostgreSQL column:

| Path | Input | Result |
| --- | --- | --- |
| JDBC (startup / table creation via retrieveSchema) | DatabaseMetaData type "timestamp", scale 3 | TIMESTAMP(3) |
| PostgresRecordParser.extractFieldType (CDC record parsing) | Debezium int64 / io.debezium.time.Timestamp | BIGINT |

PostgreSQL timestamp(n) with n <= 3 is encoded by Debezium using the io.debezium.time.Timestamp logical type (epoch-millis, int64). The extractFieldType int64 case only handles MicroTimestamp and MicroTime, so io.debezium.time.Timestamp silently falls through to return DataTypes.BIGINT().

The table is created with TIMESTAMP(3) from the JDBC path. When the first CDC record arrives, extractFieldType returns BIGINT, so parseSchemaChange emits an UpdateColumnType(TIMESTAMP(3) → BIGINT) event. canConvert returns EXCEPTION for that pair, and the job crashes. Because the first record after each restart triggers the same failure, the job restarts in a loop.
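
The failure sequence can be modelled in a few lines. This is a simplified, self-contained sketch and not Paimon's actual code: `TypeMismatchSketch`, `typeFromJdbc`, `typeFromInt64BeforeFix`, and `checkConvertible` are hypothetical stand-ins, types are represented as plain strings, and the conversion check is assumed to reject any differing pair (the real canConvert is more permissive for other type pairs).

```java
class TypeMismatchSketch {
    // Values of the real Debezium SCHEMA_NAME constants for these logical types.
    static final String DEBEZIUM_TIMESTAMP = "io.debezium.time.Timestamp";
    static final String DEBEZIUM_MICRO_TIMESTAMP = "io.debezium.time.MicroTimestamp";

    // Hypothetical stand-in for the JDBC path: "timestamp" with scale n maps to TIMESTAMP(n).
    static String typeFromJdbc(String typeName, int scale) {
        if ("timestamp".equals(typeName)) {
            return "TIMESTAMP(" + scale + ")";
        }
        throw new IllegalArgumentException("Type not modelled in this sketch: " + typeName);
    }

    // Hypothetical stand-in for the int64 case before the fix:
    // io.debezium.time.Timestamp has no branch of its own and falls through to BIGINT.
    static String typeFromInt64BeforeFix(String logicalName) {
        if (DEBEZIUM_MICRO_TIMESTAMP.equals(logicalName)) {
            return "TIMESTAMP(6)";
        }
        return "BIGINT";
    }

    // Simplification (assumption): any differing pair is treated as unconvertible.
    static void checkConvertible(String field, String from, String to) {
        if (!from.equals(to)) {
            throw new UnsupportedOperationException(
                    "Cannot convert field " + field + " from type " + from + " to " + to);
        }
    }

    public static void main(String[] args) {
        String tableType = typeFromJdbc("timestamp", 3);                 // table creation
        String recordType = typeFromInt64BeforeFix(DEBEZIUM_TIMESTAMP);  // first CDC record
        try {
            checkConvertible("created_at", tableType, recordType);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The two paths disagree for the same column, so the check fails on the first record regardless of the data it carries.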

Fix

Add the missing branch to the int64 case in extractFieldType:

} else if (Timestamp.SCHEMA_NAME.equals(field.name())) {
    return DataTypes.TIMESTAMP(3);
}

io.debezium.time.Timestamp represents epoch-milliseconds, so TIMESTAMP(3) is the correct Paimon type — matching the JDBC path exactly. With both paths in agreement, parseSchemaChange sees no field type difference and no schema evolution fires.
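
To illustrate why millisecond precision is the natural fit, here is a small sketch using only `java.time`. The class name `EpochMillisSketch` and the sample value are made up for illustration, and the value is interpreted as UTC, which is an assumption about how the epoch-millis should be read.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

class EpochMillisSketch {
    // Interprets an int64 epoch-millisecond value as a UTC wall-clock timestamp.
    static LocalDateTime fromEpochMillis(long epochMillis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        long raw = 1_700_000_000_123L; // sample value, not taken from the PR
        LocalDateTime ts = fromEpochMillis(raw);
        System.out.println(ts); // prints 2023-11-14T22:13:20.123
        // No digits exist below the millisecond, so three fractional digits lose nothing.
        System.out.println(ts.getNano() % 1_000_000 == 0); // prints true
    }
}
```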

Prior art

PR #6239 fixed a structurally identical bug for DECIMAL (bytes / org.apache.kafka.connect.data.Decimal falling through to BYTES). This PR applies the same pattern to Timestamp.

Changes

  • PostgresRecordParser.java: add Timestamp.SCHEMA_NAME branch to int64 case in extractFieldType
  • PostgresRecordParserTest.java: three unit tests covering io.debezium.time.Timestamp → TIMESTAMP(3), MicroTimestamp → TIMESTAMP(6) (regression), and plain int64 → BIGINT (regression)
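
The three cases covered by the tests can be sketched against a simplified model of the fixed mapping. This is not the code in PostgresRecordParserTest.java: `FixedInt64MappingSketch`, `int64Type`, and `expect` are hypothetical names, types are plain strings, and the MicroTime branch is left out for brevity.

```java
class FixedInt64MappingSketch {
    // Simplified model of the int64 case after the fix.
    // logicalName is the Debezium schema name, or null for a plain int64 field.
    static String int64Type(String logicalName) {
        if ("io.debezium.time.MicroTimestamp".equals(logicalName)) {
            return "TIMESTAMP(6)";
        } else if ("io.debezium.time.Timestamp".equals(logicalName)) {
            return "TIMESTAMP(3)"; // the branch this PR adds
        }
        return "BIGINT";
    }

    // Minimal assertion helper so the sketch runs without a test framework.
    static void expect(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        expect("TIMESTAMP(3)", int64Type("io.debezium.time.Timestamp"));      // new behaviour
        expect("TIMESTAMP(6)", int64Type("io.debezium.time.MicroTimestamp")); // regression
        expect("BIGINT", int64Type(null));                                    // regression
        System.out.println("all three cases pass");
    }
}
```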
