fix: Carry over last change to next fragment to fix max_batch_size boundary bug#3764
… one change

When a transaction has exactly `max_batch_size` changes, the commit fragment was getting the same `last_log_offset` as the preceding data fragment. This caused `ShapeLogCollector` to drop it as a duplicate. The other undesirable side effect of allowing an empty fragment with the commit is that the last change from the previous fragment **is not** marked with `last? = true`. This commit fixes both problems by retaining the last change from the current fragment and holding it until the next fragment. That way, even if the next converted message is a Commit, the transaction fragment created from it will have one change, and its log offset will be greater than the previous fragment's `last_log_offset`.
Add end-to-end tests verifying that transactions with exactly `max_batch_size` (100) changes are fully received by clients. These tests exercise the full pipeline (`ReplicationClient` -> `ShapeLogCollector` -> HTTP API -> Client) and confirm the fix for the bug where commit-only fragments at exact batch boundaries were silently dropped due to duplicate offsets.

Add the `ELECTRIC_EXPERIMENTAL_MAX_BATCH_SIZE` env var to allow configuring the maximum number of changes buffered before flushing a transaction fragment. This is primarily useful for testing batch-boundary behavior. The default remains 100 to match existing behavior.

Add a new integration test that validates that transactions following a batch-boundary transaction are not incorrectly skipped by `ShapeLogCollector`.
Summary
Fixes #3726.
Fixes the bug where transactions with a number of changes that is a multiple of `max_batch_size` (default: 100) had their commit fragment silently dropped by `ShapeLogCollector`.

Root Cause
When a transaction has exactly N * `max_batch_size` changes (e.g. 100, 200, 300, ...):

1. The buffered changes are flushed by `maybe_flush/1`, returning a fragment with `last_log_offset = (final_lsn, X)`.
2. `maybe_flush` creates a new empty fragment but did NOT reset `state.last_log_offset`.
3. When the Commit message arrives, the resulting commit-only fragment carries the same `last_log_offset = (final_lsn, X)`.
4. `ShapeLogCollector.handle_txn_fragment/2` drops it as a duplicate (offset <= last_processed_offset).

A secondary issue: when a fragment is flushed at exactly `max_batch_size`, the last change in that fragment is not marked with `last? = true`. That is incorrect when the flushed change happens to be the final change of the transaction, since `last? = true` should be set on the very last change of the entire transaction.
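The drop described in step 4 can be modeled with a small sketch. This is a Python stand-in, not Electric's Elixir code: `Collector`, `handle_txn_fragment`, and the `(lsn, index)` offset tuples are illustrative names for the behavior of `ShapeLogCollector`'s duplicate check.

```python
# Illustrative model of the duplicate-offset drop (not Electric's actual code).
MAX_BATCH_SIZE = 100

class Collector:
    """Stand-in for ShapeLogCollector's last-processed-offset check."""

    def __init__(self):
        self.last_processed_offset = (0, -1)

    def handle_txn_fragment(self, last_log_offset, changes):
        # Fragments whose offset does not advance are treated as duplicates.
        if last_log_offset <= self.last_processed_offset:
            return "dropped"
        self.last_processed_offset = last_log_offset
        return "processed"

final_lsn = 7
collector = Collector()

# Data fragment: exactly max_batch_size changes, ending at (final_lsn, 99).
data_offset = (final_lsn, MAX_BATCH_SIZE - 1)
print(collector.handle_txn_fragment(data_offset, changes=100))    # processed

# Commit fragment: empty, so it reuses the stale offset and is dropped.
commit_offset = (final_lsn, MAX_BATCH_SIZE - 1)
print(collector.handle_txn_fragment(commit_offset, changes=0))    # dropped
```

Because the empty commit fragment never advances the offset, the collector cannot distinguish it from a replayed duplicate, which is why the fix below ensures the commit fragment is never empty.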
When flushing a fragment at the `max_batch_size` boundary, retain the last change from the flushed fragment and carry it over to the next fragment. This guarantees that even if the next replication message is a Commit, the resulting fragment will have at least one change with a `log_offset` greater than the previous fragment's `last_log_offset`.

This approach also fixes the `last?` marking issue, since the carried-over change will be correctly annotated when its fragment is finalized.
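The carry-over idea can be sketched as follows. Again this is a hedged Python model with hypothetical names (`fragments`, `is_final`); the real logic lives in Electric's Elixir fragment builder around `maybe_flush/1`.

```python
# Sketch: hold back the last buffered change when flushing at the batch
# boundary, so the next fragment (even a commit-only one) is never empty.
MAX_BATCH_SIZE = 100

def fragments(changes, max_batch_size=MAX_BATCH_SIZE):
    """Yield (changes, is_final) fragments, carrying one change across
    each batch boundary so every fragment advances the log offset."""
    buffer = []
    for change in changes:
        buffer.append(change)
        if len(buffer) == max_batch_size:
            # Flush all but the last change; carry it into the next fragment.
            carried = buffer.pop()
            yield buffer, False
            buffer = [carried]
    # On Commit, the final fragment always holds at least the carried
    # change, so its last_log_offset exceeds the previous fragment's,
    # and that change is the one that gets last? = true.
    yield buffer, True

frags = list(fragments(range(100)))
# Two fragments: 99 changes flushed, then 1 carried change marked final.
print([(len(c), final) for c, final in frags])   # [(99, False), (1, True)]
```

With this scheme a 100-change transaction produces a 99-change fragment followed by a 1-change final fragment, rather than a 100-change fragment followed by an empty (droppable) commit fragment.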