streaming committer for across-batch pipelining#18
Merged
Conversation
Replaces the populate-then-commit two-phase loop in `KafkaBatchCommitter` with a single streaming loop that continuously absorbs queue items into per-partition pending state and commits each partition's contiguous-done prefix. Eliminates the queue-grows-unbounded pathology when one batch stalls on a slow handler. Fast partitions now commit independently of slow ones across batch boundaries, not just within a single batch. `commit_batch_size` is still global; `commit_batch_timeout_sec` now anchors on first-task arrival rather than ticking through populate blocks. `commit_all`, `close`, and rebalance flushing are unchanged externally — flush events drive the loop the same way they always did. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Replaces the populate-then-commit two-phase loop in
KafkaBatchCommitterwith a single streaming loop that continuously absorbs queue items into per-partition pending state and commits each partition's contiguous-done prefix. Eliminates the queue-grows-unbounded pathology when one batch stalls on a slow handler. Fast partitions now commit independently of slow ones across batch boundaries, not just within a single batch.commit_batch_sizeis still global;commit_batch_timeout_secnow anchors on first-task arrival rather than ticking through populate blocks.commit_all,close, and rebalance flushing are unchanged externally — flush events drive the loop the same way they always did.