Skip to content

[kv] Fix non-target columns not being nulled during partial update on first insert#2952

Open
binary-signal wants to merge 3 commits into apache:main from binary-signal:main


Conversation

@binary-signal
Contributor

Purpose

Linked issue: close #2843

This pull request fixes a bug where partialUpdate on a Primary Key table stored every column of the client row when the write was a first insert (no row yet exists for that key). Non-target columns therefore kept their values from the client row instead of being set to null.

Brief change log

The root cause was isolated to KvTablet.processUpsert(), which bypassed the RowMerger when oldValueBytes == null (indicating a first insert) and passed the raw row directly to applyInsert(). As a result, PartialUpdater.updateRow(null, partialValue) was never invoked.

The following changes were implemented to resolve this:

  • fluss-server/.../rowmerger/RowMerger.java: Added a default mergeInsert(BinaryValue newValue) method that returns newValue unchanged, so every existing merger keeps its current first-insert behavior.
  • fluss-server/.../rowmerger/DefaultRowMerger.java: Overrode mergeInsert within the inner PartialUpdateRowMerger class to call partialUpdater.updateRow(null, newValue), which nulls out non-target columns on the first insert.
  • fluss-server/.../kv/KvTablet.java: Updated processUpsert() to call currentMerger.mergeInsert(currentValue) before applyInsert(), so the partial-update merger also runs on the first-insert path.
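The shape of the change can be illustrated with a small standalone sketch. It is not the Fluss code: rows are modeled as Object[] rather than BinaryValue, the target columns are a plain boolean mask, and the class names MergeInsertSketch and PartialUpdateMerger are hypothetical stand-ins chosen for illustration. Only the idea is taken from the description above: a default mergeInsert that returns the row unchanged, and a partial-update override that merges against a null old row.

```java
import java.util.Arrays;

public class MergeInsertSketch {

    // Hypothetical stand-in for a row merger; rows are simplified to Object[].
    interface RowMerger {
        Object[] merge(Object[] oldRow, Object[] newRow);

        // Default first-insert behavior: store the new row unchanged.
        default Object[] mergeInsert(Object[] newRow) {
            return newRow;
        }
    }

    // Hypothetical partial-update merger: only target columns are taken from the new row.
    static final class PartialUpdateMerger implements RowMerger {
        private final boolean[] isTarget; // assumed to include the primary key columns

        PartialUpdateMerger(boolean[] isTarget) {
            this.isTarget = isTarget;
        }

        @Override
        public Object[] merge(Object[] oldRow, Object[] newRow) {
            Object[] out = new Object[newRow.length];
            for (int i = 0; i < newRow.length; i++) {
                if (isTarget[i]) {
                    out[i] = newRow[i];
                } else {
                    // Non-target column: keep the stored value, or null if no row exists yet.
                    out[i] = (oldRow == null) ? null : oldRow[i];
                }
            }
            return out;
        }

        // First insert: merge against a missing old row so non-target columns become null.
        @Override
        public Object[] mergeInsert(Object[] newRow) {
            return merge(null, newRow);
        }
    }

    // Simplified upsert path: the merger is consulted even when there is no old row.
    static Object[] processUpsert(RowMerger merger, Object[] oldRow, Object[] newRow) {
        return (oldRow == null) ? merger.mergeInsert(newRow) : merger.merge(oldRow, newRow);
    }

    public static void main(String[] args) {
        // Columns: id (key, target), name (target), note (non-target).
        RowMerger merger = new PartialUpdateMerger(new boolean[] {true, true, false});
        Object[] stored = processUpsert(merger, null, new Object[] {1, "a", "x"});
        System.out.println(Arrays.toString(stored)); // prints [1, a, null]
    }
}
```

In this sketch, a merger that does not override mergeInsert still stores the client row as-is, which mirrors the backward-compatibility intent of the default method.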

Tests

Added two new unit tests to fluss-server/.../kv/KvTabletTest.java:

  • testPartialUpdateFirstInsertNullsNonTargetColumns: Verifies that non-target columns are set to null on a first insert, even if the client row contains values for them.
  • testPartialUpdateFirstInsertThenUpdate: Verifies the full lifecycle by ensuring that an initial insert with a partial update nulls non-target columns, and a subsequent partial update correctly retains the previously stored values.

API and Format

No. This is an internal data processing logic fix. No public APIs or storage formats are modified.

Documentation

No. This is a bug fix correcting existing behavior. No documentation updates are required.

Development

Successfully merging this pull request may close these issues.

Invalid partial update for PK table
