[coordinator] Refactor SchemaUpdate to delegate schema changes to Schema.Builder #2416

Prajwal-banakar · 2026-01-20T12:24:43Z

Purpose

Linked issue: close #2344

The purpose of this change is to refactor the SchemaUpdate class to delegate all schema membership management (columns, primary keys, and auto-increment fields) directly to Schema.Builder. Currently, SchemaUpdate manually maintains these members, which is error-prone and can lead to broken schemas during evolution.

Brief change log

Refactored SchemaUpdate: Removed local lists for columns, primary keys, and auto-increment fields.

Builder Delegation: Modified SchemaUpdate to directly maintain a Schema.Builder instance, ensuring a single source of truth for schema building and validation logic.

Enhanced Schema.Builder: Updated Schema.Builder#fromSchema to correctly adopt all members from an existing schema, including column IDs and metadata.

Added Helper API: Added Schema.Builder#getColumn(String columnName) to allow checking for existing columns within the builder's state.
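
To illustrate the delegation idea in the change log, here is a minimal sketch. It is not the Fluss code: `DelegationSketch`, `Builder`, `Update`, and the `Optional<String>` return type of `getColumn` are hypothetical stand-ins, and columns are reduced to plain names. The point is only that the updater keeps no column list of its own and asks the builder whether a column already exists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/** Hypothetical, simplified stand-ins; not the actual Fluss classes. */
final class DelegationSketch {

    /** Simplified builder that owns the column list (the single source of truth). */
    static final class Builder {
        private final List<String> columns = new ArrayList<>();

        Builder column(String name) {
            if (columns.contains(name)) {
                throw new IllegalArgumentException("Column already exists: " + name);
            }
            columns.add(name);
            return this;
        }

        /** Lookup helper, loosely analogous to the getColumn helper in the change log. */
        Optional<String> getColumn(String name) {
            return columns.stream().filter(name::equals).findFirst();
        }

        List<String> build() {
            return Collections.unmodifiableList(new ArrayList<>(columns));
        }
    }

    /** Updater that forwards every change to the builder instead of keeping local lists. */
    static final class Update {
        private final Builder builder = new Builder();

        Update(List<String> existingColumns) {
            existingColumns.forEach(builder::column);
        }

        Update addColumn(String name) {
            // Existence is checked against the builder's state, not a local copy.
            if (builder.getColumn(name).isPresent()) {
                throw new IllegalArgumentException("Column already exists: " + name);
            }
            builder.column(name);
            return this;
        }

        List<String> result() {
            return builder.build();
        }
    }

    public static void main(String[] args) {
        List<String> updated =
                new Update(Arrays.asList("id", "name")).addColumn("age").result();
        System.out.println(updated); // prints [id, name, age]
    }
}
```

Because validation lives only in the builder, the updater cannot drift out of sync with it.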

Tests

Verified the fix by running TableSchemaTest and SchemaUpdateTest to ensure the delegation logic handles schema evolution correctly without IllegalStateException.

Performed a full test suite execution for the fluss-common module: mvn test -pl fluss-common.

Results: 1,491 tests run, 0 failures.

API and Format

This change adds a public helper method getColumn(String columnName) to the Schema.Builder API.

It does not affect the storage format.

Documentation

This change does not introduce a new feature; it is a refactoring of existing internal logic.

wuchong

@Prajwal-banakar thanks for the contribution. The new code looks much clearer now. I left only a minor comment.

@loserwang1024 could you have another look?

wuchong · 2026-01-21T10:03:49Z

fluss-common/src/main/java/org/apache/fluss/metadata/Schema.java

+            // 1. Clear current builder state
+            this.columns.clear();
+            this.autoIncrementColumnNames.clear();
+            this.primaryKey = null;


I think it's better to make sure that fromSchema is the first API called on the builder, rather than silently dropping previously added columns. Otherwise, problems are hard to debug. You can validate that all the members are empty.
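
A minimal sketch of the suggested validation, under stated assumptions: the field names `columns`, `autoIncrementColumnNames`, and `primaryKey` come from the diff above, while `FromSchemaCheckSketch`, the string-based columns, and the two-argument `fromSchema` signature are hypothetical simplifications. A plain `throw` stands in for a `checkState` precondition helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified builder; only the three field names mirror the diff. */
final class FromSchemaCheckSketch {

    static final class Builder {
        private final List<String> columns = new ArrayList<>();
        private final List<String> autoIncrementColumnNames = new ArrayList<>();
        private List<String> primaryKey; // null until a primary key is set

        Builder column(String name) {
            columns.add(name);
            return this;
        }

        /** Adopts an existing schema, but only if nothing has been set on the builder yet. */
        Builder fromSchema(List<String> schemaColumns, List<String> schemaPrimaryKey) {
            if (!columns.isEmpty() || !autoIncrementColumnNames.isEmpty() || primaryKey != null) {
                // Fail loudly instead of clearing the existing state.
                throw new IllegalStateException(
                        "fromSchema must be the first call on the builder, "
                                + "but the builder already has state.");
            }
            columns.addAll(schemaColumns);
            primaryKey = new ArrayList<>(schemaPrimaryKey);
            return this;
        }

        List<String> columns() {
            return new ArrayList<>(columns);
        }
    }

    public static void main(String[] args) {
        Builder fresh = new Builder().fromSchema(Arrays.asList("id", "name"), Arrays.asList("id"));
        System.out.println(fresh.columns());

        try {
            new Builder().column("extra").fromSchema(Arrays.asList("id"), Arrays.asList("id"));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Throwing here turns a silent loss of the earlier `column(...)` call into an immediate, debuggable failure.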

wuchong · 2026-01-21T10:11:30Z

fluss-server/src/main/java/org/apache/fluss/server/coordinator/SchemaUpdate.java

 /** Schema update. */
 public class SchemaUpdate {

-    /** Apply schema changes to the given table info and return the updated schema. */


Keep the original javadoc?

Prajwal-banakar · 2026-01-21T18:43:38Z

Hi @wuchong I have addressed the feedback:

Added checkState validation in Schema.Builder#fromSchema.
Restored the original Javadoc in SchemaUpdate.

I am seeing different failures in the CI (CommitRemoteLogManifestITCase in the first run and TableChangeWatcherTest in the second). I've verified these tests locally and they pass consistently. It seems the CI environment is experiencing some flakiness unrelated to this refactor.
All core metadata tests in fluss-common and fluss-server passed perfectly. Could you please trigger a manual re-run or review the logic?

loserwang1024 · 2026-01-22T04:04:10Z

fluss-common/src/main/java/org/apache/fluss/metadata/Schema.java

-            }
-            this.highestFieldId = new AtomicInteger(schema.highestFieldId);
+            // Check that the builder is empty before adopting from an existing schema
+            checkState(


It's a very useful check. Maybe we can move it to fromColumns, so that both methods share the same check.

loserwang1024

LGTM

Prajwal-banakar · 2026-01-22T06:45:12Z

Hi @loserwang1024 and @wuchong,

I was about to move the checkState validation to fromColumns as suggested, but I noticed that the strict "empty builder" rule causes failures in existing integration tests, specifically FlussAdminITCase#testAlterTableColumn.

It seems some tests use a pattern where they call .column(...) to set up a template and then call .fromColumns(...) to load the remaining fields. If we enforce that the builder must be empty at the start of fromColumns, these existing tests will break.

I have two thoughts on how to proceed and wanted to get your preference:

1. Strict for IDs only: Only throw the IllegalStateException if the columns being adopted already have assigned IDs (initialization mode). This protects the schema's "source of truth" while allowing tests to append template columns.

2. Keep it in fromSchema: Revert to keeping the check only in fromSchema(...). This ensures that when loading a full schema object, the builder is fresh, but doesn't restrict the more flexible fromColumns API.

What do you think is the best approach for the project's long-term architecture?
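
For concreteness, the "Strict for IDs only" idea could look roughly like the sketch below. Everything here is an assumption made for illustration: `FromColumnsCheckSketch`, the `Column` class, the `UNKNOWN_ID` sentinel for "no ID assigned yet", and the choice to reject inputs that mix assigned and unassigned IDs. It is not the Fluss implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of an ID-conditional empty-builder check; not the Fluss code. */
final class FromColumnsCheckSketch {

    /** Hypothetical sentinel meaning "this column has no ID assigned yet". */
    static final int UNKNOWN_ID = -1;

    static final class Column {
        final String name;
        final int id;

        Column(String name, int id) {
            this.name = name;
            this.id = id;
        }
    }

    static final class Builder {
        private final List<Column> columns = new ArrayList<>();
        private int highestFieldId = -1;

        /** Adds a template column and assigns it the next free ID. */
        Builder column(String name) {
            columns.add(new Column(name, ++highestFieldId));
            return this;
        }

        Builder fromColumns(List<Column> input) {
            boolean anyAssigned = input.stream().anyMatch(c -> c.id != UNKNOWN_ID);
            boolean allAssigned = input.stream().allMatch(c -> c.id != UNKNOWN_ID);
            if (anyAssigned) {
                // Initialization mode: adopting existing IDs requires a fresh builder.
                if (!allAssigned) {
                    throw new IllegalArgumentException("Mixed assigned and unassigned column IDs.");
                }
                if (!columns.isEmpty()) {
                    throw new IllegalStateException(
                            "Columns with assigned IDs can only be adopted by an empty builder.");
                }
                for (Column c : input) {
                    columns.add(c);
                    highestFieldId = Math.max(highestFieldId, c.id);
                }
            } else {
                // Append mode: columns without IDs get fresh IDs after any template columns.
                for (Column c : input) {
                    columns.add(new Column(c.name, ++highestFieldId));
                }
            }
            return this;
        }

        List<Column> columns() {
            return new ArrayList<>(columns);
        }
    }

    public static void main(String[] args) {
        Builder b = new Builder().column("template")
                .fromColumns(Arrays.asList(new Column("a", UNKNOWN_ID), new Column("b", UNKNOWN_ID)));
        for (Column c : b.columns()) {
            System.out.println(c.name + " -> " + c.id);
        }
    }
}
```

In this sketch the template-then-append pattern keeps working, while adopting pre-assigned IDs into a non-empty builder fails fast.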

loserwang1024 · 2026-01-22T08:11:15Z

@Prajwal-banakar Agree with you.

Prajwal-banakar · 2026-01-22T11:55:16Z

Hi @loserwang1024 and @wuchong ,
I have finalized the implementation of Option 1. The validation in fromColumns(...) now enforces the empty-builder rule only when adopting columns with assigned IDs (initialization mode).
Verification:
Verified locally with a full test run. Both fluss-common and fluss-server resulted in BUILD SUCCESS. I also verified that FlussAdminITCase now passes with this refined logic.
Ready for final review.

wuchong · 2026-01-23T07:30:44Z

Hi @Prajwal-banakar @loserwang1024, I think we still need to keep the checkState in fromSchema(..); otherwise, autoIncrementColumnNames and primaryKey are silently overridden. I added a commit to fix this. I will merge this PR once CI passes.

wuchong · 2026-01-23T07:31:45Z

@Prajwal-banakar, also, a quick note: please avoid squashing commits while responding to review comments. Keeping the changes in separate commits makes it much easier to track what has been updated since the last review.

When commits are squashed, it becomes difficult to distinguish new changes from previous ones, forcing reviewers to review all the changes again. This significantly increases review time and may slow down the overall merge process.

The committer will squash the commits and improve the commit message before merging the PR, so don’t worry about having multiple "fix" commits in the PR.

Prajwal-banakar force-pushed the schemaupdate-delegation branch from 5da4435 to f10a940 on January 20, 2026 12:48

Prajwal-banakar changed the title ~~[common] Refactor SchemaUpdate to delegate schema changes to Schema.Builder~~ [coordinator] Refactor SchemaUpdate to delegate schema changes to Schema.Builder Jan 20, 2026

wuchong reviewed Jan 21, 2026

Refactor SchemaUpdate to delegate schema changes to Schema.Builder

08590d8

Prajwal-banakar force-pushed the schemaupdate-delegation branch from f10a940 to 08590d8 on January 21, 2026 17:19

loserwang1024 reviewed Jan 22, 2026

Prajwal-banakar force-pushed the schemaupdate-delegation branch from 8d6ff7c to 1b59edd on January 22, 2026 05:26

loserwang1024 approved these changes Jan 22, 2026

Rerunning CI to bypass flaky test

5806e63

Prajwal-banakar force-pushed the schemaupdate-delegation branch from 1b59edd to 5806e63 on January 22, 2026 10:09

Enforce empty state requirement for Schema.Builder#fromSchema method

cba6acb

loserwang1024 merged commit 2185afb into apache:main Jan 23, 2026
6 checks passed
