fix(node): Fix validator set hash mismatch at height 18409547 #828

FletcherMan · 2025-12-15T07:25:29Z

Problem

At block height 18409547, a transaction added a new sequencer with an invalid blsKey. This caused all validator nodes to halt because the original code returned an error when blsKey decoding failed.

After deploying a hotfix (using continue instead of return nil, err), nodes successfully resumed block production. However, a critical inconsistency emerged:

Old nodes (restarted after the hotfix): geth had already executed block 18409547, so on replay DeliverBlock called getParamsAndValsAtHeight(), which did not validate blsKey and therefore returned 5 sequencers (including the invalid one).
New nodes (syncing from scratch): executing block 18409547 normally, DeliverBlock called updateSequencerSet() → sequencerSetUpdates(), which skipped the invalid sequencer and returned 4.

This caused different next_validators_hash values:

Old nodes: 65C2A11E5A28F185EC039D2B9F7A0AAFFC6B577BC596BD46176F8B203F51D9FF
New nodes: D3CF5BD31E9E1776EA7E656E3E10B2DE3CE8AD413B205F286E816404A43D7071

As a result, new nodes fail validator hash verification and cannot sync past height 18409548.

Root Cause

Inconsistent blsKey validation between two code paths:

sequencerSetUpdates() - validates blsKey, skips invalid sequencers
getParamsAndValsAtHeight() - did NOT validate blsKey, included all sequencers
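The divergence can be reproduced in miniature. The sketch below is illustrative only: the `sequencer` type, the `blsKeyValid` rule (a key counts as valid only if it is exactly 4 bytes) and the `setHash` digest are all assumptions made for the example, not the node's real types, BLS decoding, or Tendermint's validator set hash. It shows only that one path filtering and the other not filtering yields sets of different size, and so different hashes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sequencer is a hypothetical, simplified record; the real type differs.
type sequencer struct {
	tmKey  []byte
	blsKey []byte
}

// blsKeyValid is a stand-in for real BLS decoding (assumption): a key is
// treated as valid only if it is exactly 4 bytes long.
func blsKeyValid(k []byte) bool { return len(k) == 4 }

// collect returns the Tendermint keys of the set. With validate=true it
// skips sequencers whose blsKey fails the check; with validate=false it
// includes every sequencer.
func collect(seqs []sequencer, validate bool) [][]byte {
	keys := make([][]byte, 0, len(seqs))
	for _, s := range seqs {
		if validate && !blsKeyValid(s.blsKey) {
			continue
		}
		keys = append(keys, s.tmKey)
	}
	return keys
}

// setHash is a toy digest (SHA-256 over the concatenated keys), not the
// hash Tendermint actually computes for a validator set.
func setHash(keys [][]byte) string {
	h := sha256.New()
	for _, k := range keys {
		h.Write(k)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// demoSet builds four well-formed sequencers and one with a bad blsKey.
func demoSet() []sequencer {
	return []sequencer{
		{tmKey: []byte("tm-1"), blsKey: []byte("bls1")},
		{tmKey: []byte("tm-2"), blsKey: []byte("bls2")},
		{tmKey: []byte("tm-3"), blsKey: []byte("bls3")},
		{tmKey: []byte("tm-4"), blsKey: []byte("bls4")},
		{tmKey: []byte("tm-5"), blsKey: []byte("bad")}, // fails blsKeyValid
	}
}

func main() {
	seqs := demoSet()
	unvalidated := collect(seqs, false) // like the replay path before the fix
	validated := collect(seqs, true)    // like the normal execution path
	fmt.Println(len(unvalidated), len(validated)) // prints: 5 4
	fmt.Println(setHash(unvalidated) == setHash(validated)) // prints: false
}
```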

Solution

Add a blsKeyCheckForkHeight = 18409547 constant
Add a height parameter to sequencerSetUpdates() and updateSequencerSet()
For heights <= 18409547: Include sequencers even if blsKey validation fails (historical compatibility)
For heights > 18409547: Skip sequencers with invalid blsKey (correct behavior)

This ensures:

New nodes compute the same validator set hash that the historical blocks recorded
Future blocks enforce proper blsKey validation
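A minimal sketch of the height gate described in the solution, under stated assumptions: the `sequencer` struct, its `blsValid` flag (standing in for a successful blsKey decode) and the `selectSequencers` function are hypothetical names for illustration. Only the constant's value and the `<=` / `>` split come from the description above.

```go
package main

import "fmt"

// forkHeight mirrors the value given in the PR description.
const forkHeight uint64 = 18409547

// sequencer is a hypothetical record; blsValid stands in for
// "the blsKey decoded successfully".
type sequencer struct {
	addr     string
	blsValid bool
}

// selectSequencers keeps every sequencer at or below forkHeight and drops
// sequencers with an invalid blsKey only above it.
func selectSequencers(height uint64, seqs []sequencer) []string {
	out := make([]string, 0, len(seqs))
	for _, s := range seqs {
		if !s.blsValid && height > forkHeight {
			continue // strict behavior after the fork
		}
		out = append(out, s.addr)
	}
	return out
}

// demoSequencers returns four valid sequencers and one invalid one.
func demoSequencers() []sequencer {
	return []sequencer{
		{"seq-1", true}, {"seq-2", true}, {"seq-3", true},
		{"seq-4", true}, {"seq-5", false},
	}
}

func main() {
	seqs := demoSequencers()
	fmt.Println(len(selectSequencers(forkHeight, seqs)))   // prints: 5
	fmt.Println(len(selectSequencers(forkHeight+1, seqs))) // prints: 4
}
```

Because the decision depends only on the block height, any code path that evaluates the same height should arrive at the same set, whether it is replaying or executing the block.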

Block Data Reference

Height	validators_hash	next_validators_hash
18409547	D3CF...	D3CF...
18409548	D3CF...	65C2...
18409549	65C2...	D3CF...
18409550+	D3CF...	D3CF...

Testing

New node can sync past height 18409548
Existing nodes continue to operate normally
blsKey validation works correctly for heights > 18409547

Summary by CodeRabbit

Bug Fixes
- Height-aware validator handling: improved BLS key validation around a protocol fork to avoid accepting invalid keys after the fork while preserving historical entries before it.
- Added logging for BLS key decoding failures.
New Features
- Added a configurable fork-height setting controlling when stricter BLS key checks apply.
Chores
- Sequencer/validator update flow now uses block height to ensure correct validator set updates across fork transitions.

coderabbitai · 2025-12-15T07:25:48Z

Walkthrough

Executor and sequencer flows were made height-aware: initialization and DeliverBlock now pass L2 block height into sequencer updates. BLS key decoding is gated by a configured fork height (BlsKeyCheckForkHeight): decode failures are logged and handled differently depending on the fork boundary.

Changes

Cohort / File(s)	Summary
Executor initialization & block delivery `node/core/executor.go`	Executor gained `blsKeyCheckForkHeight` field; `NewExecutor` initializes it from config. Init and `DeliverBlock` obtain current block height and call `updateSequencerSet(height)`. `getParamsAndValsAtHeight` builds `newValidators` with append, adds BLS decode checks, logs decode errors, and conditionally omits invalid entries depending on fork.
Sequencer set updates & BLS validation `node/core/sequencers.go`	Introduced `isBlsKeyCheckFork` helper and fork-aware logic. Signatures changed to `sequencerSetUpdates(height uint64)` and `updateSequencerSet(height uint64)`. Added at-fork-boundary cache bypass and height-propagated handling when decoding/skipping BLS keys.
Configuration `node/core/config.go`	Added public `MainnetBlsKeyCheckForkHeight` and `Config.BlsKeyCheckForkHeight` field; consolidated mainnet config path to set `BlsKeyCheckForkHeight` (and removed Holesky-specific constant).
Flags `node/flags/flags.go`	Removed `HoleskyFlag` and its inclusion from the exported `Flags` list.
Module metadata `go.mod`	Module metadata present; no exported API renames beyond internal method signature changes.

Sequence Diagram(s)

sequenceDiagram
    participant Executor
    participant Sequencer as SequencerSetManager
    participant Cache
    participant Logger

    Executor->>Sequencer: updateSequencerSet(height)
    Sequencer->>Cache: check cached sequencer set(height)
    alt at fork boundary (height == fork or fork+1)
        Sequencer->>Cache: bypass cache
    end
    Sequencer->>Sequencer: decode BLS keys for validators
    alt decode failure
        Sequencer->>Logger: log decode error
        alt height <= fork
            Sequencer-->>Sequencer: include historical TmKey despite decode fail
        else
            Sequencer-->>Sequencer: skip invalid entry
        end
    end
    Sequencer->>Cache: update cache with new validator set
    Sequencer-->>Executor: return updated sequencer params
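The cache-bypass branch of the diagram could look roughly like the following. This is a sketch, not the code in `node/core/sequencers.go`: the `setCache` type, its fields, and the `load` callback are invented for illustration, and the sketch ignores whatever other invalidation the real cache performs. Only the boundary condition (height equal to the fork height or the block after it) is taken from the diagram.

```go
package main

import "fmt"

// setCache is a hypothetical cache of the current sequencer set.
type setCache struct {
	forkHeight uint64
	cached     []string
	loads      int                          // how many times load was called
	load       func(height uint64) []string // recomputes the set for a height
}

// atForkBoundary reports whether height is the fork height or the next block.
func (c *setCache) atForkBoundary(height uint64) bool {
	return height == c.forkHeight || height == c.forkHeight+1
}

// get returns the cached set unless the cache is empty or the height sits
// on the fork boundary, in which case the set is recomputed and stored.
func (c *setCache) get(height uint64) []string {
	if c.cached != nil && !c.atForkBoundary(height) {
		return c.cached
	}
	c.cached = c.load(height)
	c.loads++
	return c.cached
}

func main() {
	c := &setCache{
		forkHeight: 100,
		load:       func(h uint64) []string { return []string{fmt.Sprintf("set@%d", h)} },
	}
	for _, h := range []uint64{98, 99, 100, 101, 102} {
		fmt.Println(h, c.get(h), "loads:", c.loads)
	}
}
```

In this run the set is loaded at 98 (empty cache), reused at 99, reloaded at 100 and 101 (boundary), and reused at 102.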

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Verify all call sites pass the correct height into updateSequencerSet / sequencerSetUpdates.
Inspect getParamsAndValsAtHeight for correct slice growth, off-by-one, and correct conditional inclusion of TmKey across fork boundary.
Review cache-bypass/at-fork-boundary logic to ensure intended cache semantics and performance trade-offs.
Confirm logs for BLS decode failures are informative but do not leak sensitive data.

Possibly related PRs

Fix bug for invalid sequencer BLS key #808 — Prior changes to sequencer BLS key handling; this PR extends that work with fork-height–aware validation and signature changes.

Suggested reviewers

secmgt
chengwenxi

Poem

🐰
I hopped through chains by height and light,
Skipped rotten keys when morning's right.
At the fork I paused, then kept the old,
New rules ahead, the cache controlled.
Sequencers hum — I nibble code tonight.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title accurately summarizes the main objective: fixing a validator set hash mismatch at a specific height by implementing fork-aware BLS key validation logic.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

node/core/executor.go (1)

377-377: Consider type consistency for height parameters.

getParamsAndValsAtHeight uses int64 for height, while sequencerSetUpdates uses uint64. The comparison on line 405 works correctly, but consistent typing would reduce cognitive overhead. This is a pre-existing pattern, so acceptable to defer.

Also applies to: 405-405

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7df07a7 and eba200c.

📒 Files selected for processing (2)

node/core/executor.go (3 hunks)
node/core/sequencers.go (4 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

node/core/sequencers.go (1)

node/core/executor.go (1)

Executor (31-60)

🔇 Additional comments (6)

node/core/sequencers.go (3)

21-24: LGTM - Fork height constant is well documented.

The constant clearly explains its purpose for maintaining historical block compatibility. The value aligns with the documented block height where the issue occurred.

63-63: Height-aware BLS validation logic is correct for historical compatibility.

The implementation correctly skips invalid BLS keys only after the fork height, preserving historical block compatibility.

One consideration: when decodeBlsPubKey fails for heights ≤ fork height, a zero-value blsPubKey is stored in valsByTmKey (line 118). If VerifySignature is later called for such a validator, it would attempt verification with a zero BLS public key. Verify this doesn't cause issues for historical signature verification paths.

Also applies to: 103-120

159-160: LGTM - Height parameter correctly propagated.

node/core/executor.go (3)

138-143: LGTM - Initialization now uses current height for sequencer set update.

This ensures height-aware BLS key validation is applied during executor startup.

335-335: LGTM - Block number correctly passed for height-aware sequencer updates.

398-410: BLS validation now consistent with sequencerSetUpdates - this fixes the root cause.

The logic correctly mirrors the validation in sequencerSetUpdates, ensuring both code paths produce the same validator set. The slice initialization change to capacity-based (make([][]byte, 0, len(addrs))) with append is correct for conditional inclusion.
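The slice remark is worth a small illustration. The functions below are hypothetical and not taken from `executor.go`; they only contrast the two allocation styles in plain Go. A slice created with a length and written by index keeps an empty slot for every skipped entry, while a zero-length slice with capacity, filled with `append`, contains only the accepted entries.

```go
package main

import "fmt"

// keepValid copies only the accepted keys. Starting at length 0 with
// capacity len(keys) means skipped entries leave no gaps.
func keepValid(keys [][]byte, ok func([]byte) bool) [][]byte {
	out := make([][]byte, 0, len(keys))
	for _, k := range keys {
		if ok(k) {
			out = append(out, k)
		}
	}
	return out
}

// keepValidIndexed shows the pitfall: a length-based slice written by
// index keeps a nil slot for every skipped key.
func keepValidIndexed(keys [][]byte, ok func([]byte) bool) [][]byte {
	out := make([][]byte, len(keys))
	for i, k := range keys {
		if ok(k) {
			out[i] = k
		}
	}
	return out
}

// nonEmpty is a toy acceptance rule used only for this demonstration.
func nonEmpty(k []byte) bool { return len(k) > 0 }

func main() {
	keys := [][]byte{[]byte("aa"), {}, []byte("bb")}
	fmt.Println(len(keepValid(keys, nonEmpty)))        // prints: 2
	fmt.Println(len(keepValidIndexed(keys, nonEmpty))) // prints: 3
}
```

A stray nil entry in a validator list would change any hash computed over that list, which is why conditional inclusion and `append` go together here.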

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

node/core/config.go (1)

161-170: Logger format issue and missing flexibility for non-mainnet configurations.

The logger.Info call on line 166 uses positional arguments without proper key-value formatting. Tendermint's logger expects key-value pairs.

When --mainnet flag is not set but --upgrade.batchTime is provided, BlsKeyCheckForkHeight remains at its zero default. This means isBlsKeyCheckFork() will return true for all heights (strict validation). Consider whether testnet/holesky deployments need a separate fork height configuration.
 	case ctx.GlobalIsSet(flags.MainnetFlag.Name):
 		c.UpgradeBatchTime = MainnetUpgradeBatchTime
 		c.BlsKeyCheckForkHeight = MainnetBlsKeyCheckForkHeight
-		logger.Info("set UpgradeBatchTime: ", c.UpgradeBatchTime, "BlsKeyCheckForkHeight: ", c.BlsKeyCheckForkHeight)
+		logger.Info("mainnet config applied", "UpgradeBatchTime", c.UpgradeBatchTime, "BlsKeyCheckForkHeight", c.BlsKeyCheckForkHeight)
 	case ctx.GlobalIsSet(flags.UpgradeBatchTime.Name):
 		c.UpgradeBatchTime = ctx.GlobalUint64(flags.UpgradeBatchTime.Name)
 		logger.Info("set UpgradeBatchTime: ", ctx.GlobalUint64(flags.UpgradeBatchTime.Name))

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f7cb93 and 47caf6e.

📒 Files selected for processing (4)

node/core/config.go (3 hunks)
node/core/executor.go (5 hunks)
node/core/sequencers.go (4 hunks)
node/flags/flags.go (0 hunks)

💤 Files with no reviewable changes (1)

node/flags/flags.go

🧰 Additional context used

🧬 Code graph analysis (2)

node/core/sequencers.go (1)

node/core/executor.go (1)

Executor (31-61)

node/core/config.go (1)

node/flags/flags.go (2)

MainnetFlag (229-232)

UpgradeBatchTime (224-228)

🔇 Additional comments (8)

node/core/sequencers.go (4)

21-25: LGTM!

The fork detection logic correctly handles:

blsKeyCheckForkHeight == 0: Always enforce validation (for testnets/new deployments)

height > blsKeyCheckForkHeight: Enforce validation post-fork

height <= blsKeyCheckForkHeight: Skip validation for historical compatibility
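The three cases listed above can be written as a small predicate. This is a sketch under assumptions: the review names a helper called `isBlsKeyCheckFork`, but its actual signature and receiver are not shown here, so the two-argument free function below is a guess made for illustration.

```go
package main

import "fmt"

// isBlsKeyCheckFork reports whether strict blsKey validation applies at
// height. The two-argument signature is an assumption for this sketch.
func isBlsKeyCheckFork(height, forkHeight uint64) bool {
	if forkHeight == 0 {
		// No fork height configured: always validate.
		return true
	}
	// Strictly after the fork height: validate. At or before it: do not.
	return height > forkHeight
}

func main() {
	cases := []struct{ height, fork uint64 }{
		{0, 0},
		{18409547, 0},
		{18409546, 18409547},
		{18409547, 18409547},
		{18409548, 18409547},
	}
	for _, c := range cases {
		fmt.Printf("height=%d fork=%d strict=%v\n",
			c.height, c.fork, isBlsKeyCheckFork(c.height, c.fork))
	}
}
```

The explicit zero check matters only for height 0, where `height > forkHeight` alone would return false; it also makes the "unset means strict" intent visible to the reader.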

69-73: LGTM!

The fork boundary cache bypass logic correctly invalidates the cached sequencer set at both the fork height and the subsequent block. This ensures the transition from lenient to strict BLS key validation produces the correct validator hash.

160-161: LGTM!

Clean propagation of the height parameter to sequencerSetUpdates.

105-121: Add clarification comment about pre-fork invalid BLS key handling.

When decodeBlsPubKey fails at heights ≤ fork (where isBlsKeyCheckFork returns false), a zero-value blsPubKey is stored in valsByTmKey. In practice this is mitigated because ConvertBlsData only processes new blocks being committed, not historical replays, but a comment explaining the behavior would improve maintainability. Consider documenting why pre-fork validators with invalid keys are included despite failing decode.

node/core/executor.go (4)

54-57: LGTM!

The blsKeyCheckForkHeight field is correctly added to the Executor struct, aligning with the configuration.

140-147: LGTM!

Correctly fetches the current chain height before initializing the sequencer set, ensuring fork-aware BLS key validation from startup.

336-342: LGTM!

Correctly passes the delivered block's number to updateSequencerSet, ensuring fork-aware validation is applied based on the block being processed.

400-410: Core fix looks correct - ensures consistency between the two validation paths.

This change addresses the root cause by adding the same fork-aware BLS key validation to getParamsAndValsAtHeight that exists in sequencerSetUpdates. Both paths now behave identically:

Heights ≤ 18409547: Include sequencers regardless of BLS key validity

Heights > 18409547: Skip sequencers with invalid BLS keys

Minor note: The height parameter is int64 but isBlsKeyCheckFork expects uint64. While block heights should never be negative in practice, consider adding a guard or using a consistent type.
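One way to address the int64/uint64 note is a checked conversion at the boundary between the two types. The helper name `toUint64Height` and the error value are hypothetical; nothing in the PR confirms that such a guard was added.

```go
package main

import (
	"errors"
	"fmt"
)

// errNegativeHeight is a hypothetical sentinel error for this sketch.
var errNegativeHeight = errors.New("negative block height")

// toUint64Height converts a signed height to unsigned, rejecting negative
// values. Without the check, a plain uint64(h) conversion of a negative
// int64 variable wraps around to a very large number, which would compare
// as greater than any fork height.
func toUint64Height(h int64) (uint64, error) {
	if h < 0 {
		return 0, errNegativeHeight
	}
	return uint64(h), nil
}

func main() {
	for _, h := range []int64{18409547, -1} {
		u, err := toUint64Height(h)
		if err != nil {
			fmt.Printf("height %d rejected: %v\n", h, err)
			continue
		}
		fmt.Printf("height %d converted to %d\n", h, u)
	}
}
```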

fix(node): skip blsKey check before fork height for historical compatibility

eba200c

FletcherMan requested a review from a team as a code owner December 15, 2025 07:25

FletcherMan requested review from Web3Jumb0 and removed request for a team December 15, 2025 07:25

coderabbitai bot reviewed Dec 15, 2025

fletcher.fan added 2 commits December 15, 2025 15:35

bypass cache at fork boundary for correct validator set

8f7cb93

make blsKeyCheckForkHeight configurable

47caf6e

coderabbitai bot reviewed Dec 15, 2025

Kukoomomo approved these changes Dec 15, 2025

panos-xyz approved these changes Dec 15, 2025

curryxbo approved these changes Dec 15, 2025

FletcherMan merged commit cb12de2 into main Dec 15, 2025
13 checks passed

FletcherMan deleted the fix-blskey-validation-fork-compatibility branch December 15, 2025 09:33

coderabbitai bot mentioned this pull request Dec 22, 2025

WIP:Update l2 RetryableClient #827

Open
