feat(catalog): support column-level alter table #370

Open
TheR1sing3un wants to merge 3 commits into apache:main from TheR1sing3un:pr/column-alter-table

Conversation

@TheR1sing3un

Member

Purpose

Linked issue: close #368

Implement column-level alter table on both FileSystemCatalog and RESTCatalog, and align the SchemaChange JSON wire format with Java Paimon.

Brief change log

  • TableSchema::apply_changes: implement add / rename / drop column, update column type / nullability / comment / position, and update table comment, mirroring Java SchemaManager.generateTableSchema (field-id allocation, FIRST/AFTER/BEFORE/LAST moves, primary/partition-key guards). Operates on top-level columns; reuses ColumnAlreadyExist / ColumnNotExist. The method keeps its single-argument signature — the catalog fills in the table name on column errors.
  • Align SchemaChange JSON with Java Paimon: internally tagged by action, with fieldNames arrays, comment / newDataType / keepNullability / newNullability / newComment fields, referenceFieldName move anchors, and FIRST/AFTER/BEFORE/LAST move types.
  • Add AlterTableRequest + RESTApi::alter_table; implement RESTCatalog::alter_table.
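The FIRST/AFTER/BEFORE/LAST position updates listed above can be illustrated with a minimal sketch over a plain list of top-level column names. The `Move` enum, `apply_move`, and `index_of` are hypothetical names for illustration only and are not the types or functions added by this PR; the real implementation operates on full field definitions.

```rust
/// Hypothetical move target, mirroring the FIRST/AFTER/BEFORE/LAST move types.
#[derive(Debug, Clone, PartialEq)]
enum Move {
    First,
    After(String),
    Before(String),
    Last,
}

/// Returns the index of `name` in `fields`, or an error message if it is absent.
fn index_of(fields: &[String], name: &str) -> Result<usize, String> {
    fields
        .iter()
        .position(|f| f.as_str() == name)
        .ok_or_else(|| format!("column not found: {name}"))
}

/// Moves `column` inside `fields` (top-level column names only).
/// All checks run before any mutation, so a failed move leaves `fields` unchanged.
fn apply_move(fields: &mut Vec<String>, column: &str, mv: &Move) -> Result<(), String> {
    let from = index_of(fields, column)?;
    if let Move::After(anchor) | Move::Before(anchor) = mv {
        if anchor.as_str() == column {
            return Err(format!("cannot move {column} relative to itself"));
        }
        index_of(fields, anchor)?;
    }
    // Remove first, then resolve the anchor against the shortened list.
    let moved = fields.remove(from);
    let to = match mv {
        Move::First => 0,
        Move::Last => fields.len(),
        Move::After(anchor) => index_of(fields, anchor)? + 1,
        Move::Before(anchor) => index_of(fields, anchor)?,
    };
    fields.insert(to, moved);
    Ok(())
}
```

Resolving the anchor index after removing the moved column avoids off-by-one errors when the column sits before its anchor.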

Tests

  • New unit tests in spec::schema_change (Java-format (de)serialization), spec::schema, and catalog::filesystem (add+move, rename syncing PK refs, drop rejecting PK columns, type/nullability/comment updates, reposition, and the ColumnAlreadyExist / ColumnNotExist / TableNotExist error paths).
  • Existing paimon-datafusion alter-table tests pass.

API and Format

Changes the SchemaChange JSON wire format. This is effectively a fix: the previous format was incompatible with Java Paimon, and REST alter_table was never implemented, so no working interaction is broken. No public Rust API signature change (apply_changes keeps its arity).

Documentation

None required.

Out of scope (follow-ups)

Nested struct field paths; UpdateColumnType cast-compatibility validation; Java's dropPrimaryKey / updateColumnDefaultValue.

@TheR1sing3un

Member Author

Pushed a fix for the integration job: the existing test_alter_table_add_column (in crates/integrations/datafusion/tests/sql_context_tests.rs) asserted that ALTER TABLE ... ADD COLUMN fails because AddColumn was unsupported. Since this PR implements it, the test now asserts the statement succeeds and the new column is appended to the schema. Verified locally: cargo test -p paimon-datafusion --test sql_context_tests test_alter_table passes.

TheR1sing3un force-pushed the pr/column-alter-table branch from 8d214bf to 44fa702 on June 10, 2026 02:56
Comment thread crates/paimon/src/spec/schema.rs Outdated
Comment thread crates/paimon/src/spec/schema.rs
Comment thread crates/paimon/src/spec/schema.rs
Comment thread crates/paimon/src/spec/schema.rs
Comment thread crates/paimon/src/spec/schema.rs Outdated
TheR1sing3un requested a review from JingsongLi on June 11, 2026 03:25
Comment thread crates/paimon/src/spec/schema.rs Outdated
Comment thread crates/paimon/src/spec/schema.rs
TheR1sing3un requested a review from JingsongLi on June 11, 2026 09:39
@JingsongLi

JingsongLi commented Jun 13, 2026

Contributor

Cross-PR note with #340: if merge-engine=aggregation lands before or together with this alter-table work, RenameColumn needs to update aggregation field-scoped options as well.

Right now the rename path rewrites primary_keys, bucket-key, and sequence.field, but it does not rename keys such as fields.<col>.aggregate-function or fields.<col>.list-agg-delimiter. After renaming an aggregation value column, those options would still point at the old column name. Since the final validation here reruns blob / partial-update / first-row checks but not aggregation validation, the stale option can be persisted and the renamed column may silently fall back to the default aggregator/delimiter at read time.

This is not visible against current main by itself, but it should be handled if this PR is intended to compose with #340.

@TheR1sing3un

Member Author

Thanks, confirmed. The rename path here only ports case 1 of Java's applyRenameColumnsToOptions (primary_keys / bucket-key / sequence.field); it doesn't rewrite the field-scoped fields.<col>.* keys (Java case 2: aggregate-function / ignore-retract / distinct / list-agg-delimiter, and case 3: sequence-group / nested-key). Worth noting this is already latent for partial-update tables on main, not only when composing with #340. To keep this PR scoped to the alter-table feature as reviewed, I'll track the field-scoped option rewriting as a follow-up: #383.
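For reference, the key rewriting discussed in this thread could look roughly like the sketch below. It is not the code from this PR or from #383: `rename_field_scoped_options` is a hypothetical helper, it rewrites only option keys of the form `fields.<col>.<suffix>`, and it leaves option values (for example column lists inside sequence-group values) untouched. It also assumes column names contain no dots.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: rewrites keys `fields.<old>.<suffix>` to
/// `fields.<new>.<suffix>` and copies every other option unchanged.
/// Values are never modified in this sketch.
fn rename_field_scoped_options(
    options: &BTreeMap<String, String>,
    old: &str,
    new: &str,
) -> BTreeMap<String, String> {
    // The trailing dot keeps `fields.price.` from matching `fields.price_total.`.
    let old_prefix = format!("fields.{old}.");
    options
        .iter()
        .map(|(key, value)| {
            let new_key = match key.strip_prefix(old_prefix.as_str()) {
                Some(suffix) => format!("fields.{new}.{suffix}"),
                None => key.clone(),
            };
            (new_key, value.clone())
        })
        .collect()
}
```

Renaming `price` to `amount` would turn `fields.price.aggregate-function` into `fields.amount.aggregate-function`, while `fields.price_total.aggregate-function` and unrelated keys such as `bucket` stay as they are.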
@JingsongLi

Implement all column-level SchemaChange variants in
TableSchema::apply_changes (add/rename/drop column, update column
type/nullability/comment/position, update table comment), so
FileSystemCatalog::alter_table and RESTCatalog::alter_table can evolve
table schemas. Changes operate on top-level columns and reuse the
existing ColumnAlreadyExist/ColumnNotExist errors; apply_changes keeps
its single-argument signature (the catalog fills in the table name).

Align the SchemaChange JSON wire format with Java Paimon: internally
tagged by "action", with fieldNames arrays, comment / newDataType /
keepNullability / newNullability / newComment fields, referenceFieldName
move anchors, and FIRST/AFTER/BEFORE/LAST move types. Add
AlterTableRequest and RESTApi::alter_table, and implement
RESTCatalog::alter_table so the REST client can alter tables against a
Paimon REST server.

Address review comments on column-level alter table:
- AddColumn: reject NOT NULL types; reassign nested field IDs from the
  table-wide highest field ID (mirrors Java ReassignFieldId); make
  current_highest_field_id nested-aware
- DropColumn: reject dropping the last field
- UpdateColumnType: reject partition-key and primary-key columns
- UpdateColumnNullability: reject making a primary-key column nullable
- RenameColumn: reject partition columns; propagate renames into
  bucket-key and sequence.field options
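The nested-aware field ID handling in the AddColumn item can be sketched as follows. This is a simplified illustration, not the PR's code: `DataType`, `Field`, `highest_field_id`, and `reassign_field_ids` are hypothetical stand-ins that omit names, nullability, and most types, and the `-1` result for an empty schema is an assumption.

```rust
/// Hypothetical, reduced type model: just enough shapes to show nesting.
enum DataType {
    Int,
    Array(Box<DataType>),
    Row(Vec<Field>),
}

/// Hypothetical field: names and other attributes are omitted for brevity.
struct Field {
    id: i32,
    data_type: DataType,
}

/// Highest field ID across top-level and nested fields (-1 if there are none).
fn highest_field_id(fields: &[Field]) -> i32 {
    fields
        .iter()
        .map(|f| f.id.max(highest_in_type(&f.data_type)))
        .max()
        .unwrap_or(-1)
}

fn highest_in_type(data_type: &DataType) -> i32 {
    match data_type {
        DataType::Int => -1,
        DataType::Array(element) => highest_in_type(element),
        DataType::Row(fields) => highest_field_id(fields),
    }
}

/// Assigns fresh IDs to `field` and to every nested field, continuing from `next_id`.
fn reassign_field_ids(field: &mut Field, next_id: &mut i32) {
    *next_id += 1;
    field.id = *next_id;
    reassign_in_type(&mut field.data_type, next_id);
}

fn reassign_in_type(data_type: &mut DataType, next_id: &mut i32) {
    match data_type {
        DataType::Int => {}
        DataType::Array(element) => reassign_in_type(element, next_id),
        DataType::Row(fields) => {
            for nested in fields.iter_mut() {
                reassign_field_ids(nested, next_id);
            }
        }
    }
}
```

Starting the counter from the table-wide maximum, including nested fields, keeps a newly added struct column from reusing an ID that already exists deeper in the schema.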
…n alter

UpdateColumnType now rejects conversions that are not supported Paimon
casts (ported from Java DataTypeCasts implicit/explicit rules) or not
executable by the arrow read path, honoring disable-explicit-type-casting.
Type changes involving BLOB columns are rejected, and converting nullable
columns to NOT NULL is rejected by default per
alter-column-null-to-not-null.disabled.

apply_changes also re-runs the create-time final-schema validations
(BLOB fields and partial-update options) before persisting.
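The two nullability guards described in these commits (no nullable primary-key columns, and no nullable to NOT NULL conversion unless the option allows it) can be sketched as below. `check_nullability_change` is a hypothetical function, errors are plain strings rather than the crate's error type, and treating the option value as a case-insensitive `true`/`false` string with a default of `true` is an assumption based on the "rejected by default" wording.

```rust
use std::collections::HashMap;

/// Hypothetical guard for a nullability update on one top-level column.
fn check_nullability_change(
    column: &str,
    currently_nullable: bool,
    new_nullable: bool,
    primary_keys: &[&str],
    options: &HashMap<String, String>,
) -> Result<(), String> {
    // Primary-key columns must stay NOT NULL.
    if new_nullable && primary_keys.contains(&column) {
        return Err(format!("cannot make primary-key column {column} nullable"));
    }
    // Nullable -> NOT NULL is rejected unless the option explicitly disables the guard.
    if currently_nullable && !new_nullable {
        let disabled = options
            .get("alter-column-null-to-not-null.disabled")
            .map(|v| v.eq_ignore_ascii_case("true"))
            .unwrap_or(true); // assumed default: conversion is disabled
        if disabled {
            return Err(format!("cannot change nullable column {column} to NOT NULL"));
        }
    }
    Ok(())
}
```

With an empty option map, changing a nullable column to NOT NULL fails; setting `alter-column-null-to-not-null.disabled` to `false` lets the same change through in this sketch.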
TheR1sing3un force-pushed the pr/column-alter-table branch from 6973290 to 12d63cf on June 15, 2026 04:43
Development

Successfully merging this pull request may close these issues.

Support column-level alter table in FileSystemCatalog and RESTCatalog

2 participants