[kv] Fix non-target columns not being nulled during partial update on first insert #2952
Open
binary-signal wants to merge 3 commits into apache:main
fix invalid partial update for pk table gh issue apache#2843
fix formatting
Purpose
Linked issue: close #2843
This pull request fixes a bug where partialUpdate on a Primary Key table stored all columns during a first insert (where no row exists yet for that key). As a result, non-target columns kept their values from the client row instead of being set to null.
Brief change log
The root cause was isolated to KvTablet.processUpsert(), which bypassed the RowMerger when oldValueBytes == null (indicating a first insert) and passed the raw row directly to applyInsert(). As a result, PartialUpdater.updateRow(null, partialValue) was never invoked.
The following changes resolve this:
- fluss-server/.../rowmerger/RowMerger.java: Added a default mergeInsert(BinaryValue newValue) method that returns newValue unchanged, so all existing mergers keep their current behavior.
- fluss-server/.../rowmerger/DefaultRowMerger.java: Overrode mergeInsert in the inner PartialUpdateRowMerger class to call partialUpdater.updateRow(null, newValue). This nulls out non-target columns on the first insert.
- fluss-server/.../kv/KvTablet.java: Updated processUpsert() to call currentMerger.mergeInsert(currentValue) before applyInsert(), so partial-update semantics also apply on the first insert.
Tests
Added two new unit tests to fluss-server/.../kv/KvTabletTest.java:
- testPartialUpdateFirstInsertNullsNonTargetColumns: Verifies that non-target columns are set to null on a first insert, even if the client row contains values for them.
- testPartialUpdateFirstInsertThenUpdate: Verifies the full lifecycle: a first insert with a partial update nulls non-target columns, and a subsequent partial update retains the previously stored values.
API and Format
No. This is an internal data processing logic fix. No public APIs or storage formats are modified.
Documentation
No. This is a bug fix correcting existing behavior. No documentation updates are required.
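For reviewers, the control flow described in the brief change log can be sketched roughly as follows. This is a simplified, self-contained illustration, not the actual patch: rows are modeled as plain Object[] instead of BinaryValue, and PartialUpdateSketch, PartialUpdateMerger, and the static processUpsert helper are hypothetical stand-ins for the real Fluss classes. Only the names RowMerger, mergeInsert, and processUpsert come from the description above.

```java
import java.util.Arrays;

public class PartialUpdateSketch {

    /** Simplified stand-in for the server-side merger; rows are plain Object[] here. */
    interface RowMerger {
        Object[] merge(Object[] oldRow, Object[] newRow);

        /** First-insert hook: by default the new row is stored unchanged. */
        default Object[] mergeInsert(Object[] newRow) {
            return newRow;
        }
    }

    /** Hypothetical partial-update merger: only target column indexes are written. */
    static final class PartialUpdateMerger implements RowMerger {
        private final boolean[] isTarget;

        PartialUpdateMerger(int columnCount, int... targetColumns) {
            isTarget = new boolean[columnCount];
            for (int c : targetColumns) {
                isTarget[c] = true;
            }
        }

        @Override
        public Object[] merge(Object[] oldRow, Object[] newRow) {
            Object[] out = new Object[isTarget.length];
            for (int i = 0; i < out.length; i++) {
                if (isTarget[i]) {
                    out[i] = newRow[i];                      // target column: take the client value
                } else {
                    out[i] = oldRow == null ? null : oldRow[i]; // non-target: keep old value, or null
                }
            }
            return out;
        }

        /** On first insert there is no old row, so non-target columns become null. */
        @Override
        public Object[] mergeInsert(Object[] newRow) {
            return merge(null, newRow);
        }
    }

    /** Mirrors the upsert branch: first inserts are routed through mergeInsert. */
    static Object[] processUpsert(RowMerger merger, Object[] oldRow, Object[] newRow) {
        return oldRow == null ? merger.mergeInsert(newRow) : merger.merge(oldRow, newRow);
    }

    public static void main(String[] args) {
        // First insert targeting columns 0 and 1; the client also sent a value for column 2.
        RowMerger firstWrite = new PartialUpdateMerger(3, 0, 1);
        Object[] stored = processUpsert(firstWrite, null, new Object[] {1, "a", "client-value"});
        System.out.println(Arrays.toString(stored)); // [1, a, null]

        // Later partial update targeting columns 0 and 2; column 1 is retained.
        RowMerger secondWrite = new PartialUpdateMerger(3, 0, 2);
        stored = processUpsert(secondWrite, stored, new Object[] {1, null, "c"});
        System.out.println(Arrays.toString(stored)); // [1, a, c]
    }
}
```

The two calls in main correspond loosely to the scenarios the new tests cover: a first insert that nulls non-target columns, followed by a partial update that keeps previously stored values.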