fix: stabilize BFT block application and sync #769
Codecov Report
✅ All modified and coverable lines are covered by tests.
📝 Walkthrough

The validator consensus loop now pre-applies blocks on a scratch blockchain before embedding them into signed proposals. This pre-apply pattern applies across normal proposal construction, self-propose speculative N+1 paths, and peer-propose speculative N+1 paths; pre-apply failures abort proposal building. Peer-propose additionally validates that speculative blocks have matching validator addresses.

SyncNeeded handlers are clarified with explicit "triggering block sync" logging and async trigger calls.

In the block receive layer, gossipsub and direct NewBlock requests add stale-block guards to skip already-applied heights. NewBlock apply gains epoch boundary bookkeeping to match gossipsub apply behavior.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
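The pre-apply and stale-block guard ideas from the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Block`, `Chain`, `pre_apply`, `build_proposal`, and `is_stale` are hypothetical names, and the validity rules are reduced to a height and parent-hash check.

```rust
// Hypothetical, simplified types; the real sentrix types are not shown on this page.
#[derive(Clone, Debug, PartialEq)]
struct Block {
    height: u64,
    parent_hash: u64,
    hash: u64,
}

#[derive(Clone, Debug)]
struct Chain {
    height: u64,
    tip_hash: u64,
}

impl Chain {
    /// Apply a block; state changes only if the block extends the current tip.
    fn add_block(&mut self, block: &Block) -> Result<(), String> {
        if block.height != self.height + 1 {
            return Err(format!(
                "expected height {}, got {}",
                self.height + 1,
                block.height
            ));
        }
        if block.parent_hash != self.tip_hash {
            return Err("parent hash mismatch".to_string());
        }
        self.height = block.height;
        self.tip_hash = block.hash;
        Ok(())
    }
}

/// Pre-apply on a scratch clone, so the live chain is never mutated here.
fn pre_apply(live: &Chain, candidate: &Block) -> Result<(), String> {
    let mut scratch = live.clone();
    scratch.add_block(candidate)
}

/// Build a proposal only if the candidate pre-applies cleanly;
/// a pre-apply failure aborts proposal building (returns None).
fn build_proposal(live: &Chain, candidate: Block) -> Option<Block> {
    pre_apply(live, &candidate).ok().map(|_| candidate)
}

/// Stale-block guard: skip blocks at or below the already-applied height.
fn is_stale(live: &Chain, incoming: &Block) -> bool {
    incoming.height <= live.height
}
```

Cloning the whole chain is only reasonable for a toy model like this one; how the scratch blockchain is obtained in the actual validator loop is not visible here.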
🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@bin/sentrix/src/main.rs`:
- Around line 2888-2902: The peer-finalization branch is incorrectly gating
processing on blk.validator != wallet.address, causing peer proposals to be
skipped and forcing followers to wait for NewBlock/sync; remove that
local-validator gate so peer proposals proceed to the normal finalization path
(allow bc.add_block to run for proposed_block), and instead move any local-only
speculative validation into the speculative build/stash code (e.g., where
create_block_voyager stamps proposer and where speculative builds are handled).
Ensure you also remove or avoid calling lp2p_clone.trigger_sync() and break out
of the finalization flow for legitimate peer proposals so the block is applied
immediately.
- Around line 2415-2418: The speculative pre-build miss handler currently sets
speculative_proposal = None and then break; which aborts the enclosing BFT
action loop and prevents the already-committed block (added via
bc.add_block(blk)) from reaching the persistence/broadcast path; change the
behavior at both occurrences so that after detecting let Some(block) = block
else { ... } you clear speculative_proposal but do not break the outer loop —
use continue (or otherwise return to the top of the enclosing loop) so
processing proceeds to the finalized-block persistence/broadcast logic; update
both places referenced (the block unwrap paths around speculative_proposal) to
only clear speculative_proposal and continue rather than break.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 462ad508-56c1-4c6a-984b-3c8061485c16
📒 Files selected for processing (2)
bin/sentrix/src/main.rs
crates/sentrix-network/src/libp2p_node.rs
    let Some(block) = block else {
        speculative_proposal = None;
        break;
    };
Don't let speculative N+1 failure abort finalized-block dissemination.
At both sites, height N has already been committed via bc.add_block(blk), but this break exits the enclosing BFT action loop before the finalized block reaches the persistence/broadcast path starting at Line 2471 and Line 3048. A speculative pre-build miss should only clear speculative_proposal; otherwise an optional optimization failure turns into a local-only commit of the finalized block.
As per coding guidelines, "bin/sentrix/src/main.rs: CONSENSUS-CRITICAL — suggestions only, no destructive rewrites."
Also applies to: 3001-3004
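The control-flow point above can be illustrated with a small model. This is only a sketch under assumptions: `Action`, `run`, and the `broadcast` list are hypothetical stand-ins for the BFT action loop and its persistence/broadcast path, and a speculative miss is modeled as `None`.

```rust
/// Hypothetical stand-in for a BFT finalize action.
enum Action {
    Finalize {
        height: u64,
        /// Result of the optional speculative N+1 pre-build; None models a miss.
        speculative_next: Option<u64>,
    },
}

/// Returns the heights that reached the broadcast step and the final stash.
fn run(actions: Vec<Action>) -> (Vec<u64>, Option<u64>) {
    let mut broadcast = Vec::new();
    let mut speculative_proposal = None;
    for action in actions {
        match action {
            Action::Finalize { height, speculative_next } => {
                // Height N is treated as already committed at this point
                // (the role played by bc.add_block(blk) in the review comment).

                // A speculative miss only clears the stash; it does not
                // break out of the loop or skip the steps below.
                speculative_proposal = speculative_next;

                // The finalized block still reaches persistence/broadcast.
                broadcast.push(height);
            }
        }
    }
    (broadcast, speculative_proposal)
}
```

In this model the optional optimization can fail on every height and each committed height is still disseminated, which is the property the comment asks for.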
    if blk.validator != wallet.address {
        tracing::info!(
            target: "finalize_trace",
            "BFT finalize peer-propose: h={} round={} block={:.16}… \
             proposer={} is not local validator {}; waiting for \
             libp2p NewBlock/sync instead of executing peer block \
             in the BFT loop",
            height,
            round,
            block_hash,
            blk.validator,
            wallet.address,
        );
        lp2p_clone.trigger_sync().await;
        break;
Remove the local-validator gate from peer finalization.
proposed_block here is the block carried by a peer's proposal, so on every non-proposer validator blk.validator is expected to be the peer proposer's address, not wallet.address. Cross-file, create_block_voyager stamps the passed proposer address into Block.validator, so this branch will fire on the normal peer-finalize path, skip bc.add_block, and make followers depend on a later NewBlock/sync to advance. If the intent was to validate speculative locally-built N+1 blocks, that check needs to live on the speculative build/stash path instead of the current finalized block.
As per coding guidelines, "bin/sentrix/src/main.rs: CONSENSUS-CRITICAL — suggestions only, no destructive rewrites."
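One way to picture the separation the comment asks for is sketched below. The names `finalize_block` and `stash_speculative` are hypothetical, `Block` is reduced to two fields, and the sketch assumes, as the comment states, that `Block.validator` carries the proposer's address.

```rust
// Hypothetical, simplified block; only the fields needed for the illustration.
#[derive(Clone, Debug, PartialEq)]
struct Block {
    height: u64,
    validator: String,
}

/// Finalize path: a block carried by a peer's proposal is applied whatever
/// address sits in `validator`; there is no local-validator gate here.
fn finalize_block(applied: &mut Vec<Block>, proposed: Block) {
    applied.push(proposed);
}

/// Speculative path: only stash an N+1 block whose validator field matches
/// the local address, since this node built it for itself.
fn stash_speculative(local_address: &str, built: Block) -> Option<Block> {
    if built.validator == local_address {
        Some(built)
    } else {
        None
    }
}
```

With this split, a follower applies a peer's finalized block immediately, while the address check only decides whether a locally built speculative block is kept.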
Summary
Risk tier
Check ONE:
sentrix-core, sentrix-trie, sentrix-staking, sentrix-bft), block_executor, apply_block_*, state_root path
Required by tier
🟢 Low — minimum bar
🟡 Medium — adds `#[test]` in same PR
🟠 High — adds
🔴 Critical — adds
Test plan
Rollback plan
Related
Summary by CodeRabbit