Skip to content

feat(evm): align BLOCKHASH opcode with Aptos consensus block_id#354

Open
nekomoto911 wants to merge 7 commits into
Galxe:mainfrom
nekomoto911:feat/blockhash-block-id-alignment
Open

feat(evm): align BLOCKHASH opcode with Aptos consensus block_id#354
nekomoto911 wants to merge 7 commits into
Galxe:mainfrom
nekomoto911:feat/blockhash-block-id-alignment

Conversation

@nekomoto911

@nekomoto911 nekomoto911 commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Summary

Make the EVM BLOCKHASH opcode return Gravity's Aptos consensus block_id on every execution path — live pipeline, historical eth_call, debug_trace*, trace_*, eth_callBundle, mev_simBundle, flashbots_validateBuilderSubmission, and the pending-block simulation. JSON-RPC layer is untouched: eth_getBlockByNumber(n).hash continues to return the standard keccak(rlp(header)) so devp2p, stage sync, eth_feeHistory, and genesis init all remain bit-compatible with upstream reth.

The opcode-visible distribution is binary {block_id, 0x0}: a miss (block predates the upgrade point or is not yet committed) maps to B256::ZERO. The mechanism never falls back to keccak, so cross-upgrade contracts need a single "is zero?" branch — same shape as the existing "current_block - n > 256 → 0" rule baked into the opcode.

Approach

Rather than threading a per-site BlockIdOverrideDb<DB> wrapper through 17 RPC call sites (the surface area would have spanned eth-api/debug/trace/bundle/validation), this lifts BlockNumberToBlockIdReader to a supertrait of StateProvider and changes the blanket impl<T: StateProvider> EvmStateProvider for T so block_hash dispatches to block_id_by_number. Every EVM path that goes through StateProviderDatabase is covered automatically. The miss-to-zero contract is preserved by the existing one-line unwrap_or_default() in StateProviderDatabase::block_hash_ref.

Production providers query a new tables::BlockNumberToBlockId table. Writes happen in BlockViewStorage::update_canonical synchronously before make_canonical returns, so the racy in-memory window is closed for the Gravity pipeline. Test stubs alias block_id_by_number to block_hash to preserve existing test expectations.

In-memory race-window closure

The upstream-reth path (MakeCanonical engine event with async advance_persistence) used to leave a brief window where block N was canonical-in-memory but its block_id row had not landed in tables::BlockNumberToBlockIdgravity_blockIdByNumber(N) returned null and BLOCKHASH(N) returned 0x0 during that window. This is now closed:

  • CanonicalInMemoryState carries a block_ids: BTreeMap<BlockNumber, B256> index with insert_block_id / block_id_by_number / remove_persisted_block_ids (range-evict on persistence) / snapshot_block_ids_for (point-snapshot for overlays). Eviction handles the u64::MAX boundary via checked_add(1).
  • MakeCanonicalEvent carries block_id. The engine tree inserts the index entry before make_executed_block_canonical advances the canonical head, so any RPC observer that sees N as canonical resolves block_id_by_number(N) immediately.
  • MemoryOverlayStateProviderRef gains a block_ids snapshot field and a with_block_ids builder. block_id_by_number is in-memory-first then historical; block_hash is unchanged (still keccak header hash from sealed in-memory blocks — the two trait methods deliberately diverge).
  • BlockchainProvider::block_id_by_number is now two-stage: in-memory first, then consistent_provider().
  • All construction sites with CanonicalInMemoryState access wire the snapshot (consistent.rs::block_state_provider_ref + state_by_hash; blockchain_provider.rs::block_state_provider + latest() validator path). StateProviderBuilder::build (upstream-reth run_inner, opt-in via GRETH_DISABLE_PIPE_EXECUTION) and ress/provider (stateless witness recorder) carry race-window doc notes; both sit outside the gravity hot path.

Additions

  • tables::BlockNumberToBlockId (BlockNumber → B256)
  • BlockNumberToBlockIdReader trait + impls across all StateProvider impls
  • gravity_blockIdByNumber JSON-RPC method under a new gravity namespace (--http.api gravity / --ws.api gravity); "pending" returns null
  • CanonicalInMemoryState::block_ids in-memory index closing the MakeCanonicaladvance_persistence race window
  • MemoryOverlayStateProviderRef::with_block_ids builder for race-window-safe overlays

Test plan

  • cargo test -p reth-rpc --lib gravity — pending-short-circuit + concrete-dispatch
  • cargo test -p reth-provider --lib block_number_to_block_id_roundtrip — table roundtrip via DatabaseProvider + ProviderFactory
  • cargo test -p reth-chain-state --features test-utils --lib memory_overlay::tests — T1 (no keccak leak when snapshot empty) / T2 (in-memory snapshot overrides historical) / T3 (block_hash and block_id_by_number diverge on the same overlay)
  • cargo test -p reth-chain-state --features test-utils --lib block_idCanonicalInMemoryState insert / lookup / inclusive evict / u64::MAX boundary / snapshot / clear
  • cargo test -p reth-pipe-exec-layer-ext-v2 --test gravity_blockhash_alignment_test test_blockhash_alignment_grevm — full pipeline E2E (grevm)
  • cargo test -p reth-pipe-exec-layer-ext-v2 --test gravity_blockhash_alignment_test test_blockhash_alignment_disable_grevm — full pipeline E2E (no grevm)
  • cargo +nightly fmt --all --check
  • RUSTFLAGS=-D warnings cargo +nightly clippy --lib --tests --benches --all-features --locked

The integration test runs 20 blocks through the live PipeExecLayerApi, then asserts:

Invariant Assertion
Table populated BlockNumberToBlockId(n) == mock_block_id(n) for n ∈ [1, 20]
Provider BlockchainProvider::block_id_by_number(n) == mock_block_id(n)
RPC stays keccak eth_getBlockByNumber(n).hash != mock_block_id(n)
EVM in window eth_call(BLOCKHASH(n)) returns mock_block_id(n) for n ∈ [1, 19]
Future block eth_call(BLOCKHASH(21)) returns 0x0
Current block eth_call(BLOCKHASH(20)) returns 0x0 (opcode contract)

Migration / known limits

  • No backfill for blocks executed before this change. EVM BLOCKHASH(n) for those returns 0x0, never keccak — matches the existing "out-of-window → 0" behavior. Documented in the design.
  • Stage-sync re-execution: operators wiping the DB and re-running stage-sync over previously-executed Gravity history must backfill BlockNumberToBlockId first; otherwise state roots will diverge from the live run. The live Gravity pipeline does not use this path.
  • RpcBlockchainStateProvider (RPC-proxy mode) returns Ok(None) rather than aliasing to block_hash — preserves the {block_id, 0} invariant. Real block_id reads in proxy mode require a node that queries the table directly.
  • update_canonical opens a sync RW tx on the async hot path (one ~40 byte commit per canonical block). Future optimization batches with engine-tree persistence or uses spawn_blocking.
  • StateProviderBuilder::build and ress/provider do not thread the in-memory block_ids snapshot. Both are off the gravity hot path (the former requires GRETH_DISABLE_PIPE_EXECUTION=1; the latter is a stateless witness recorder); explicit race-window doc notes are in place. Closing these requires plumbing a CanonicalInMemoryState handle and is deferred.

Labels

A-rpc A-evm

Make the EVM BLOCKHASH opcode return Gravity's Aptos consensus block_id
instead of keccak(rlp(header)) on every execution path — live pipeline
execution, historical eth_call, debug_trace*, trace_*, eth_callBundle,
mev_simBundle, flashbots_validateBuilderSubmission, and the pending block
simulation. JSON-RPC layer is untouched: eth_getBlockByNumber(n).hash
keeps returning the standard keccak header hash so devp2p, stage sync,
fee_history, and genesis init all remain bit-compatible with upstream
reth.

The opcode-visible distribution is binary {block_id, 0x0}: a miss
(block predates the upgrade point or is not yet committed) maps to
B256::ZERO. The wrapper never falls back to keccak, so cross-upgrade
contracts only need a single "is zero?" branch — the same shape as the
existing "current_block - n > 256 -> 0" rule baked into the opcode.

Mechanism: rather than threading a BlockIdOverrideDb wrapper through ~17
RPC call sites, this lifts BlockNumberToBlockIdReader to a supertrait of
StateProvider and changes the blanket impl<T: StateProvider>
EvmStateProvider for T so block_hash dispatches to block_id_by_number.
Every EVM execution path that goes through StateProviderDatabase is
covered automatically. Production providers query a new
tables::BlockNumberToBlockId table; the write path lives in
BlockViewStorage::update_canonical, so persisting runs synchronously
before make_canonical returns and the racy in-memory window is closed
for the Gravity pipeline. Test stubs alias block_id_by_number to
block_hash to preserve existing test expectations.

The BlockIdOverrideDb wrapper is retained as an escape hatch with three
unit tests that pin the never-keccak invariant directly.

Additions
- tables::BlockNumberToBlockId (BlockNumber -> B256)
- BlockNumberToBlockIdReader trait with provider impls across
  DatabaseProvider, ProviderFactory, ConsistentProvider,
  BlockchainProvider, MemoryOverlayStateProviderRef,
  LatestStateProviderRef, HistoricalStateProviderRef,
  StateProviderTraitObjWrapper, CachedStateProvider,
  InstrumentedStateProvider, NoopProvider, MockEthProvider,
  StateProviderTest, RpcBlockchainStateProvider
- gravity_blockIdByNumber JSON-RPC method under a new `gravity`
  namespace (selectable via --http.api gravity / --ws.api gravity);
  "pending" returns null since pending blocks have no consensus block_id
- BlockIdOverrideDb wrapper in reth-revm

Tests
- Unit: BlockNumberToBlockId roundtrip; BlockIdOverrideDb hit / miss /
  DatabaseRef-mirrors-Database (with a PanickingInnerDb that aborts if
  the wrapper ever falls back to inner.block_hash);
  gravity_blockIdByNumber pending-short-circuit and concrete-dispatch
- Integration (crates/pipe-exec-layer-ext-v2/execute/tests/
  gravity_blockhash_alignment_test.rs): runs the full Gravity pipeline
  through 20 blocks, then exercises both grevm and disable-grevm and
  asserts the BlockNumberToBlockId table is populated, the
  BlockchainProvider exposes block_id, eth_getBlockByNumber.hash stays
  keccak, eth_call(BLOCKHASH(n)) returns block_id for in-window n,
  BLOCKHASH(future) and BLOCKHASH(current) return 0x0

Deferred follow-ups
- Sealed-but-not-persisted in-memory window is closed for the Gravity
  pipeline (update_canonical persists before make_canonical returns)
  but remains open for upstream-reth's async-persistence path; the fix
  requires threading block_id through CanonicalInMemoryState.
- update_canonical opens its own RW transaction synchronously from the
  async hot path; future optimization batches with engine-tree
  persistence or wraps in spawn_blocking.
- Operators running stage-sync on previously-executed Gravity history
  must backfill BlockNumberToBlockId; the live pipeline does not use
  this path.

A-rpc A-evm
The test-only `MockStateProvider` in `crates/chain-state/src/in_memory.rs`
also needs `BlockNumberToBlockIdReader` now that `StateProvider` has it as
a supertrait. Returns `Ok(None)` like the other test stubs.

Caught by CI:
https://github.com/Galxe/gravity-reth/actions/runs/27395623691/job/80962206618
…race window

Closes the race window where `MakeCanonical` has marked block N canonical
in-memory but `advance_persistence` has not yet written
`tables::BlockNumberToBlockId`. In that window `gravity_blockIdByNumber(N)`
returned `null` and EVM `BLOCKHASH(N)` returned `0x0`.

- `CanonicalInMemoryState` gains a `block_ids: RwLock<BTreeMap>` index with
  `insert_block_id` / `block_id_by_number` / `remove_persisted_block_ids` /
  `snapshot_block_ids_for`. Eviction uses `checked_add(1)` to handle the
  `u64::MAX` boundary.
- `MakeCanonicalEvent` carries `block_id`; the engine tree inserts the
  index entry **before** `make_executed_block_canonical` updates the
  canonical head, so any RPC observer that sees N as canonical can
  immediately resolve `block_id_by_number(N)`.
- `MemoryOverlayStateProviderRef` gains a `block_ids` snapshot field and a
  `with_block_ids` builder; `BlockNumberToBlockIdReader::block_id_by_number`
  consults it before falling back to historical. `BlockHashReader::block_hash`
  is unchanged (still returns keccak header hash from sealed in-memory blocks).
- `BlockchainProvider::block_id_by_number` becomes two-stage: in-memory first,
  then `consistent_provider()`.
- Construction sites with `CanonicalInMemoryState` access wire the snapshot
  in: `consistent.rs::block_state_provider_ref` / `state_by_hash`,
  `blockchain_provider.rs::block_state_provider` / `latest()` validator path.
- `StateProviderBuilder::build` (upstream-reth `run_inner`, opt-in via
  `GRETH_DISABLE_PIPE_EXECUTION`) and `ress/provider` (stateless witness
  recorder) carry race-window doc notes; both are outside the gravity hot
  path.

Tests: 38 chain-state lib tests (T1/T2/T3 overlay semantics + 6
`CanonicalInMemoryState` tests including `u64::MAX` eviction boundary) +
existing 62 provider + 8 revm-blockid_override tests pass.
CI clippy runs with `-D warnings`, so `clippy::doc_markdown` is a hard
error. The doc comment added in the previous stage-7 commit missed
backticks around the type name.
The supertrait approach (`BlockNumberToBlockIdReader` lifted onto
`StateProvider`, `EvmStateProvider::block_hash` blanket dispatch) covers
every EVM entry point that goes through `StateProviderDatabase`. The
`BlockIdOverrideDb` wrapper from the original design draft was retained
as an "escape hatch" / invariant fixture, but on review it sits outside
the live call chain and its three unit tests only verify the wrapper's
own standalone behavior — they do not exercise any live code path.

Contract preservation:
- "hit → block_id" is covered by `gravity_blockhash_alignment_test::(4)`
  (`eth_call(BLOCKHASH(n)) == mock_block_id(n)` for `n ∈ [1, 19]`)
- "miss → 0x0, never keccak" is preserved by the existing one-line
  `unwrap_or_default()` in `StateProviderDatabase::block_hash_ref` and
  pinned by `gravity_blockhash_alignment_test::(5)`
  (`BLOCKHASH(future) == B256::ZERO`)

No production code referenced `BlockIdOverrideDb` outside its own file
and the `pub use` in `crates/revm/src/lib.rs`. YAGNI — if a future EVM
entry ever bypasses `StateProviderDatabase`, the simpler fix is to route
it through the supertrait too rather than reintroduce a parallel wrapper.
…doc comments

`DatabaseEnv` in this fork is hard-aliased to RocksDB
(`crates/storage/db/src/lib.rs`), so the recently-added doc comments
referring to "MDBX" were both inaccurate and unnecessarily backend-coupled.
Replace with "`tables::BlockNumberToBlockId`" / "the persisted table"
across the chain-state and provider crates so the abstraction line stays
at the trait/table level rather than naming a specific backend.

No behavior change; doc-only.
Two follow-up fixes flagged during the design.md sweep:

1. `crates/storage/storage-api/src/state.rs` — the supertrait doc
   comment said "Test stubs (MockEthProvider, MockStateProvider,
   NoopProvider) alias to BlockHashReader::block_hash". Live code shows
   those three all return `Ok(None)`; only `StateProviderTest` (in
   `reth-revm`) aliases to `block_hash`, because legacy tests there
   insert via `insert_block_hash` and assert the value flows back
   through EVM. Also drops the "MDBX table" backend label.

2. `crates/gravity-storage/src/block_view_storage/mod.rs` — three
   remaining "MDBX" mentions describing a backend-agnostic concern
   (sync RW tx blocking a tokio worker, single-writer lock contention
   with the persistence task). Rephrased to "the persistence backend"
   so the abstraction line stays at the trait layer.

No behavior change; doc-only.
@nekomoto911 nekomoto911 force-pushed the feat/blockhash-block-id-alignment branch from d27ea33 to 6580e7c Compare June 17, 2026 05:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant