Skip to content

fix(consensus): centralize reward bookkeeping in apply path (fork-gated)#782

Merged
github-actions[bot] merged 3 commits into
mainfrom
fix/reward-distribution-apply-path-determinism
Jun 3, 2026
Merged

fix(consensus): centralize reward bookkeeping in apply path (fork-gated)#782
github-actions[bot] merged 3 commits into
mainfrom
fix/reward-distribution-apply-path-determinism

Conversation

@satyakwok
Copy link
Copy Markdown
Collaborator

@satyakwok satyakwok commented Jun 3, 2026

Draft — do not merge/deploy/activate. Consensus-changing; ships off-by-default.

The real fix for the reward-escrow determinism bug that caused the 2026-06-03 testnet 4-way state drift (root cause in audits/reward-distribution-flow-audit-2026-04-27.md lineage).

Root cause

Reward + liveness + epoch-record bookkeeping ran in 5 separate network/finalize receive paths (libp2p_node.rs gossip-apply / peer-apply / catch-up-sync + main.rs validator-finalize ×2), not the deterministic block-apply path. Uneven path coverage → a block's reward applied a per-node-variable number of times → pending_rewards/delegator_rewards (and PROTOCOL_TREASURY = their sum) drift → state_root divergence. Confirmed on the stalled testnet: only PROTOCOL_TREASURY diverged across all 4 validators; every other account was byte-identical.

Fix

Run the bundle (record_block_signatures + distribute_reward + epoch_manager.record_block) exactly once inside apply_block_pass2, keyed off the committed block's justification, before update_trie_for_block (so it lands in this block's state_root). Every node applies it identically regardless of receive path.

New apply_reward_bookkeeping_for_latest_block() mirrors the external sites exactly (proposer, justification precommit stake_weight, get_block_reward(), fee_share=0). Bundle order is benign (3 independent structures).

Fork gate — REWARD_APPLY_PATH_HEIGHT (default u64::MAX, both nets)

  • pre-fork: the 5 external sites run the bundle (bit-identical to today)
  • post-fork: apply_block_pass2 runs it once; the 5 external sites skip it
  • boundary is clean: block H-1 via external, H via apply-path — each exactly once

run_epoch_bookkeeping (epoch-boundary rotation) intentionally stays on the receive paths — out of scope for this fork.

Scope / safety

  • Off by default → zero live behavior change until an activation height is pinned (halt-all + state-root-aligned + simul-start, per the activation-playbook pattern).
  • This does not itself re-converge an already-drifted chain — that's a separate recovery (cp canonical chain.db). It prevents future drift.
  • Pairs with native-state-root commitment (feat(state): commit native module state into trie/state_root (fork-gated) #779): once reward is deterministic and native state is committed, this class fails loud instead of silent.

Files

  • fork_heights.rs — gate + tests · blockchain.rs — wrapper
  • block_executor.rs — apply-path hook + apply_reward_bookkeeping_for_latest_block + tests
  • libp2p_node.rs ×3, main.rs ×2 — gate external sites to pre-fork only

Tests

Gate default-disabled both nets + activates at pinned height; apply-path helper credits pending_rewards (each call = one distribution → caller must run once-per-block); no-op without justification. Existing SRC-20/NFT/staking/fingerprint suites unchanged.

Commands

  • cargo test --workspace ✅ (67 suites, 0 fail) · cargo test -p sentrix-nft ✅ · cargo fmt --check ✅ · cargo clippy --workspace --all-targets -D warnings

No deploy, no activation, no fork enabled.

Summary by CodeRabbit

  • Bug Fixes

    • Improved consistency of reward and epoch bookkeeping application across different code paths to prevent divergence.
  • Chores

    • Version bumped to 2.2.27.

Root cause of the 2026-06-03 testnet 4-way state drift (only
PROTOCOL_TREASURY diverged, every other account identical): reward +
liveness + epoch-record bookkeeping was done in 5 separate network/
finalize receive paths (libp2p gossip-apply / peer-apply / catch-up-sync
+ main.rs validator-finalize ×2), not in the deterministic block-apply
path. Path coverage is uneven, so a block's reward got applied a per-node
variable number of times → pending_rewards / delegator_rewards (and thus
PROTOCOL_TREASURY = their sum) drifted → state_root divergence. Block-hash
agreement masked it (state_root is excluded from the block hash).

Fix: run the bundle (record_block_signatures + distribute_reward +
epoch_manager.record_block) exactly once inside apply_block_pass2, keyed
off the committed block's justification, before update_trie_for_block so it
lands in this block's state_root. Every node applies it identically
regardless of how the block arrived.

Fork-gated by NATIVE-style gate REWARD_APPLY_PATH_HEIGHT (default u64::MAX
on both nets):
- pre-fork: the 5 external sites run it (bit-identical to today)
- post-fork: apply_block_pass2 runs it once; the 5 external sites skip it
The boundary is clean (block H-1 via external, H via apply-path — each
once). run_epoch_bookkeeping (epoch-boundary rotation) is intentionally
left on the receive paths — out of scope for this fork.

Consensus-changing → stays off until an activation height is pinned
(halt-all + state-root-aligned + simul-start). Does NOT itself re-converge
a drifted chain; that's a separate recovery (cp canonical chain.db).

Tests: fork gate default-disabled both nets + activates at pinned height;
apply-path helper credits pending_rewards (and each call = one
distribution, so caller must run once); no-op without justification.
@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

…gate

Extends the apply-path bookkeeping fix to the other half of the bundle:
run_epoch_bookkeeping (epoch-boundary active-set rotation, unbonding
release, liveness slashing) was also called from all 5 network/finalize
receive paths. It is NOT idempotent — advancing epoch_number, pushing
history, and slashing — so a per-node-variable application count corrupts
epoch_state (trie-committed) and double-slashes. Same multipath drift
class as the reward bundle.

Post REWARD_APPLY_PATH_HEIGHT it runs once in apply_block_pass2, right
after the reward bundle (so it sees fresh liveness counts + pre-rotation
active_set, matching the external ordering) and before update_trie_for_block.
The 5 external run_epoch_bookkeeping calls are gated to pre-fork only.

One gate covers the full per-block bookkeeping (reward + liveness + epoch
+ slashing) so it all activates atomically and coherently.
@satyakwok
Copy link
Copy Markdown
Collaborator Author

Extended (commit 61d4a2e): the same gate now also centralizes run_epoch_bookkeeping (epoch-boundary rotation + unbonding + liveness slashing) into apply_block_pass2 — it's the other half of the same multipath bundle and is non-idempotent (advances epoch_number, double-slashes), so it had the identical drift risk against trie-committed epoch_state. One gate (REWARD_APPLY_PATH_HEIGHT) now covers the full per-block bookkeeping (reward + liveness + epoch + slashing), activated atomically. Pre-fork bit-identical; all gates green.

@satyakwok satyakwok self-assigned this Jun 3, 2026
@satyakwok satyakwok marked this pull request as ready for review June 3, 2026 22:57
@github-actions github-actions Bot enabled auto-merge (squash) June 3, 2026 22:57
@github-actions github-actions Bot merged commit ce36594 into main Jun 3, 2026
20 of 21 checks passed
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 3, 2026

Review Change Stack

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 13b33998-4b7a-4938-bb5c-8acf961c925b

📥 Commits

Reviewing files that changed from the base of the PR and between 8df5075 and 402ce9d.

⛔ Files ignored due to path filters (2)
  • CHANGELOG.md is excluded by !**/CHANGELOG.md
  • Cargo.lock is excluded by !**/*.lock, !**/*.lock
📒 Files selected for processing (6)
  • Cargo.toml
  • bin/sentrix/src/main.rs
  • crates/sentrix-core/src/block_executor.rs
  • crates/sentrix-core/src/blockchain.rs
  • crates/sentrix-core/src/fork_heights.rs
  • crates/sentrix-network/src/libp2p_node.rs

📝 Walkthrough

Walkthrough

This PR introduces a fork-height gate REWARD_APPLY_PATH_HEIGHT to centralize when reward/liveness/epoch bookkeeping occurs, preventing double-application of these mutations. The gate defaults to disabled (u64::MAX) and, when activated, moves reward bookkeeping from distributed BFT/network receive paths into the centralized apply_block_pass2 path. Fork-height infrastructure is defined in fork_heights.rs, exposed via a Blockchain delegator, applied conditionally in block executor, and gated across BFT finalize blocks and three network receive paths (gossipsub, request-response, and sync).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • sentrix-labs/sentrix#750: Addresses the same root cause—post-add_block mutations of pending_rewards, liveness, and epoch bookkeeping being applied outside state_root and causing divergence are now gated/centralized by the reward-apply-path fork to prevent double-application.

Possibly related PRs

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/reward-distribution-apply-path-determinism

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant