fix(consensus): two SIP-6 pre-deploy blockers (address case normalization + epoch transition in libp2p sync) #763
Fix #1: epoch_state_value_bytes now normalizes validator addresses before hashing (trim the 0x prefix, then lowercase). Without this, addresses stored with different casing across validators produce different validator_set_hash bytes → state_root fork at every post-fork block. Mirrors the normalization already in address_to_key. New test test_epoch_state_value_validator_set_case_insensitive pins the fix.

Fix #2: epoch boundary transition missing in libp2p sync paths. Extract the epoch bookkeeping block (process_unbonding → unbond releases → update_active_set → epoch rotation → check_liveness) into Blockchain::run_epoch_bookkeeping(height). Replace the two duplicate inline blocks in main.rs FinalizeBlock arms. Add the call to both libp2p catch-up paths (gossip + GetBlocks batch-sync), where it was absent entirely; nodes syncing via these paths diverged from BFT-finalized validators at every epoch boundary.

cargo check --workspace -D warnings: clean. 78 sentrix-trie + 8 sentrix-core + 32 sentrix-network tests: all pass.
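The normalization described in Fix #1 can be sketched as follows. This is a minimal illustration, not the code in crates/sentrix-trie/src/address.rs: the function names `normalize_address` and `validator_set_digest` are hypothetical, and `DefaultHasher` from the standard library stands in for the SHA-256 digest the PR describes, only so the sketch has no external dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Canonical form used before hashing: strip a leading "0x", then lowercase.
/// Mirrors the `trim_start_matches("0x").to_lowercase()` pattern named in the PR.
fn normalize_address(addr: &str) -> String {
    addr.trim_start_matches("0x").to_lowercase()
}

/// Hypothetical stand-in for the validator-set digest.
/// ASSUMPTION: the real code feeds the addresses into SHA-256; DefaultHasher
/// is used here only to keep the example std-only.
fn validator_set_digest(addrs: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for addr in addrs {
        // Hash the normalized form, never the raw string.
        normalize_address(addr).hash(&mut hasher);
    }
    hasher.finish()
}
```

With this shape, `validator_set_digest(&["0xABCD"])` and `validator_set_digest(&["0xabcd"])` agree, which is the property the new test is said to pin. Note that `trim_start_matches("0x")` is applied before lowercasing, so an uppercase `0X` prefix would not be stripped by this particular ordering.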
Codecov Report: ✅ All modified and coverable lines are covered by tests.
📝 Walkthrough
This PR consolidates epoch-boundary state transition logic into a single …
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/sentrix-core/src/blockchain.rs`:
- Around line 528-529: The boundary helper currently calls
self.epoch_manager.record_block(0) a second time (the duplicate call at the
boundary helper where `finished` is pushed into history), which double-counts
boundary blocks; remove the redundant self.epoch_manager.record_block(0)
invocation from the boundary helper so the helper assumes callers have already
invoked epoch_manager.record_block, and update the helper's docstring/comments
to state that contract explicitly (no other logic change required).
- Around line 540-548: process_unbonding currently removes entries from
StakeRegistry.unbonding_queue (StakeRegistry::process_unbonding) before the
caller in blockchain.rs performs payouts via self.accounts.transfer /
self.accounts.credit, and transfer failures are merely warned about, which can
permanently drop releases; fix by making the payout step atomic: either (A)
change StakeRegistry::process_unbonding to return the matured keys and amounts
without removing them and only remove entries after successful transfer/credit,
or (B) have process_unbonding keep removals but return a Result and on any
accounts.transfer/accounts.credit error reinsert the corresponding unbonding
entries or return Err to the caller so the block execution can roll back; ensure
you update the blockchain.rs caller to propagate errors instead of only logging
(remove unwrap_or_else warning-only handling) and reference the same
delegator/amount pairs when reinserting or committing removals.
In `@crates/sentrix-network/src/libp2p_node.rs`:
- Around line 1135-1139: The NewBlock request handling path must also invoke the
epoch-boundary transition; after the existing call to
chain.epoch_manager.record_block(reward) in the SentrixRequest::NewBlock apply
path, call chain.run_epoch_bookkeeping with that block's index (block_idx in the
suggested diff; the gossip path passes gossip.block.index) so the
active_set/unbonding/liveness slashing are updated for epoch-boundary blocks;
locate the SentrixRequest::NewBlock handler and add the run_epoch_bookkeeping
call immediately after chain.epoch_manager.record_block(reward) to keep epoch
state consistent with the gossip branch.
ℹ️ Review info
📒 Files selected for processing (4)
bin/sentrix/src/main.rs
crates/sentrix-core/src/blockchain.rs
crates/sentrix-network/src/libp2p_node.rs
crates/sentrix-trie/src/address.rs
/// Call AFTER record_block + distribute_reward have already run for
/// this block. Returns without doing anything if the height is not
Remove the second epoch_manager.record_block from the boundary helper.
The new contract says callers reach this helper after record_block has already run, and the PR objective says the libp2p apply paths already did that. Calling self.epoch_manager.record_block(0) again on Line 561 double-counts every boundary block before finished is pushed into history, so epoch block totals drift by one at each rollover.
Suggested fix
- self.epoch_manager.record_block(0);
  let finished = self.epoch_manager.current_epoch.clone();
  self.epoch_manager.history.push(finished);
Also applies to: 561-563
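The caller contract in this comment (callers run record_block once, the boundary helper never counts the block again) can be sketched as below. This is an illustrative model, not the sentrix-core code: the `Epoch`, `EpochManager`, and `Chain` types, the `apply_block` wrapper, the `EPOCH_LENGTH` constant, and the `height % EPOCH_LENGTH` boundary test are all assumptions made for the example, and the unbonding, active-set, and liveness steps are omitted.

```rust
/// Hypothetical epoch record; only the counter relevant to the review comment.
#[derive(Clone, Debug, Default, PartialEq)]
struct Epoch {
    number: u64,
    total_blocks_produced: u64,
}

#[derive(Default)]
struct EpochManager {
    current_epoch: Epoch,
    history: Vec<Epoch>,
}

impl EpochManager {
    /// Counts one block in the current epoch (reward handling omitted).
    fn record_block(&mut self, _reward: u64) {
        self.current_epoch.total_blocks_produced += 1;
    }
}

/// ASSUMPTION: a fixed epoch length chosen only for this sketch.
const EPOCH_LENGTH: u64 = 4;

#[derive(Default)]
struct Chain {
    epoch_manager: EpochManager,
}

impl Chain {
    /// Contract: call AFTER `record_block` has already run for `height`.
    /// The helper must not call `record_block` itself, or the boundary
    /// block is counted twice before the epoch is pushed into history.
    fn run_epoch_bookkeeping(&mut self, height: u64) {
        if height == 0 || height % EPOCH_LENGTH != 0 {
            return; // not an epoch boundary
        }
        let finished = self.epoch_manager.current_epoch.clone();
        let next_number = finished.number + 1;
        self.epoch_manager.history.push(finished);
        self.epoch_manager.current_epoch = Epoch {
            number: next_number,
            total_blocks_produced: 0,
        };
    }

    /// Hypothetical caller: records the block exactly once, then runs bookkeeping.
    fn apply_block(&mut self, height: u64, reward: u64) {
        self.epoch_manager.record_block(reward);
        self.run_epoch_bookkeeping(height);
    }
}
```

Applying heights 1 through 8 in this model leaves two finished epochs in history, each with exactly `EPOCH_LENGTH` blocks. Reintroducing a `record_block(0)` call inside the helper would make each finished epoch report one extra block, which is the drift the comment describes.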
let released = self.stake_registry.process_unbonding(height);
for (delegator, amount) in &released {
    let r = if Self::is_reward_v2_height(height) {
        self.accounts.transfer(PROTOCOL_TREASURY, delegator, *amount, 0)
    } else {
        self.accounts.credit(delegator, *amount)
    };
    r.unwrap_or_else(|e| tracing::warn!("unbonding release failed: {}", e));
}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== process_unbonding definitions and call sites =="
rg -n -C4 '\bfn\s+process_unbonding\b|\bprocess_unbonding\s*\(' --glob '*.rs'
echo
echo "== nearby unbonding mutation logic =="
rg -n -C4 '\bunbond|\brelease|\bretain\b|\bremove\b' --glob '*.rs' | rg 'process_unbonding|unbond'
Repository: sentrix-labs/sentrix
Length of output: 9218
process_unbonding mutates state before payout; swallowed transfer failures can drop releases
StakeRegistry::process_unbonding builds matured_keys and removes matured entries from self.unbonding_queue (unbonding_queue.remove(&key)) before returning released. The caller in crates/sentrix-core/src/blockchain.rs then performs accounts.transfer(...) / accounts.credit(...) but only warns on failure (unwrap_or_else(...warn...)). If those calls fail and the block execution doesn’t roll back the removal, the unbonding entries (and their eventual payouts) are consumed with no retry path.
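One way to avoid dropping releases, along the lines of option (A) in the agent prompt, is to read matured entries without removing them and commit each removal only after its payout succeeds. The sketch below is a simplified model under stated assumptions: the `StakeRegistry` layout (a map keyed by delegator and maturity height), the `matured` and `commit_release` methods, and the `release_unbonding` function with its `pay` callback are all hypothetical and are not the sentrix-core API.

```rust
use std::collections::BTreeMap;

type Address = String;

/// Hypothetical registry: (delegator, maturity height) -> amount.
struct StakeRegistry {
    unbonding_queue: BTreeMap<(Address, u64), u64>,
}

impl StakeRegistry {
    /// Read-only: list matured entries without removing them.
    fn matured(&self, height: u64) -> Vec<((Address, u64), u64)> {
        self.unbonding_queue
            .iter()
            .filter(|((_, mature_at), _)| *mature_at <= height)
            .map(|(key, amount)| (key.clone(), *amount))
            .collect()
    }

    /// Remove one entry; called only after its payout has succeeded.
    fn commit_release(&mut self, key: &(Address, u64)) {
        self.unbonding_queue.remove(key);
    }
}

/// Pays out matured entries via `pay` (a stand-in for transfer/credit).
/// On the first payout error the function returns Err and the failing entry
/// stays queued, so a later attempt can retry it.
fn release_unbonding<F>(
    registry: &mut StakeRegistry,
    height: u64,
    mut pay: F,
) -> Result<u64, String>
where
    F: FnMut(&str, u64) -> Result<(), String>,
{
    let mut paid = 0;
    for (key, amount) in registry.matured(height) {
        pay(&key.0, amount)?; // propagate instead of warn-and-continue
        registry.commit_release(&key);
        paid += amount;
    }
    Ok(paid)
}
```

In this model a failed payout leaves the entry in the queue and surfaces the error to the caller. Whether the caller should abort block execution or retry at the next boundary is a consensus decision this sketch does not settle.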
// Epoch boundary transition: rotate active set,
// release unbonding, run liveness slashing.
// Previously missing here; libp2p-synced nodes
// diverged from BFT-finalize path at boundaries.
chain.run_epoch_bookkeeping(gossip.block.index);
SentrixRequest::NewBlock still misses the same epoch-boundary transition.
This wires run_epoch_bookkeeping into gossip, but the inbound SentrixRequest::NewBlock apply path at Lines 1692-1714 still stops after chain.epoch_manager.record_block(reward). A peer delivering an epoch-boundary block over that request/response route will keep stale active_set / epoch state and can still fork for the same reason this PR is fixing here.
Suggested minimal follow-up
  chain.epoch_manager.record_block(reward);
+ chain.run_epoch_bookkeeping(block_idx);
As per coding guidelines, crates/sentrix-network/** is CONSENSUS-CRITICAL: suggestions only, no destructive rewrites.
…n_epoch_bookkeeping (#764)

Callers (main.rs BFT-finalize, libp2p gossip + batch-sync) already call epoch_manager.record_block(reward) before run_epoch_bookkeeping. The extra record_block(0) inside the method incremented total_blocks_produced twice for every boundary block, inflating epoch history by 1 block per rollover. Reported by CodeRabbit on PR #763.
Summary
Two bugs found during SIP-6 pre-deploy review that would cause immediate state_root forks on testnet.
Fix 1: epoch_state_value_bytes address case normalization (crates/sentrix-trie/src/address.rs)
The validator_set hash was feeding raw address strings into SHA-256 without normalizing case or stripping the 0x prefix. If any two validators stored the same address with different casing (e.g. 0xABCD vs 0xabcd), the 32-byte validator_set_hash at bytes 48..80 of the epoch state value would differ → state_root mismatch at every post-SIP-6 block. Fix: normalize each address with trim_start_matches("0x").to_lowercase() before hashing, matching the existing pattern in address_to_key. New test test_epoch_state_value_validator_set_case_insensitive pins the fix.
Fix 2: epoch boundary transition missing in libp2p sync paths (crates/sentrix-core/src/blockchain.rs, crates/sentrix-network/src/libp2p_node.rs, bin/sentrix/src/main.rs)
The gossip and GetBlocks batch-sync paths in libp2p_node.rs applied record_block_signatures + distribute_reward + epoch_manager.record_block but skipped the full epoch boundary transition (process_unbonding → unbond releases → update_active_set → epoch rotation → check_liveness). Nodes catching up via these paths would have a stale active_set and wrong epoch_state after every epoch boundary; with SIP-6 active, this surfaces immediately as a state_root mismatch. Fix: extract the epoch boundary block from the two duplicate main.rs FinalizeBlock arms into Blockchain::run_epoch_bookkeeping(height), then call it from all three paths (the main.rs BFT-finalize arms, libp2p gossip, and GetBlocks batch-sync).
Test plan
- cargo check --workspace -D warnings clean
- sentrix-trie tests pass (new case normalization test included)
- sentrix-core tests pass
- sentrix-network tests pass
Summary by CodeRabbit