
fix(ported_static): fork-specific Amsterdam balance for OoG refund tests #2790

Merged
leolara merged 6 commits into ethereum:devnets/snobal/6 from leolara:wt-snobal-4-amsterdam-oog-balances
May 5, 2026
Conversation

@leolara
Member

@leolara leolara commented May 1, 2026

🗒️ Description

Add Amsterdam-specific post-state balance overrides for the OoG-refund parametrizations of test_create_oog_from_call_refunds.py and test_create2_oog_from_call_refunds.py, then drop the 36 corresponding entries from amsterdam_skip_list.txt.

EIP-8037's two-dimensional gas model changes the refund arithmetic on OoG paths in these tests. The sender ends up with a non-zero residue where Cancun/Prague/Osaka leave 0:

| OoG path family | Sender residue on Amsterdam |
| --- | --- |
| SStore / SelfDestruct / LogOp OoG (data ∈ {1,2,4,5,7,8,10,11,13,14,16,17}) | 0x19CBC0 (1 690 560 wei) |
| SStore + Create / Create2 OoG (data ∈ {19,20,22,23}) | 0x284E5C (2 641 500 wei) |

For each of the five OoG expect_entries_ blocks per file, a clone with network: [">=Amsterdam"] and the residue value above is inserted before the existing network: [">=Cancun"] entry, so resolve_expect_post's first-match rule picks the Amsterdam-specific entry on Amsterdam and falls through to the original on earlier forks.
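
The first-match behaviour described above can be sketched as follows. This is a minimal illustration only, not the actual `resolve_expect_post` implementation: `FORK_ORDER`, `matches`, and `first_match` are hypothetical names, and the entries are simplified to a single `sender_balance` field.

```python
from typing import Any

# Hypothetical fork ordering, used only for this illustration.
FORK_ORDER = ["Cancun", "Prague", "Osaka", "Amsterdam"]


def matches(constraint: str, fork: str) -> bool:
    """Return True if `fork` satisfies a constraint such as ">=Cancun"."""
    if constraint.startswith(">="):
        return FORK_ORDER.index(fork) >= FORK_ORDER.index(constraint[2:])
    return constraint == fork


def first_match(entries: list[dict[str, Any]], fork: str) -> dict[str, Any]:
    """Return the `result` of the first entry whose network list matches."""
    for entry in entries:
        if any(matches(c, fork) for c in entry["network"]):
            return entry["result"]
    raise LookupError(f"no expect entry matches fork {fork!r}")


# The Amsterdam-specific clone is listed before the generic >=Cancun entry.
entries = [
    {"network": [">=Amsterdam"], "result": {"sender_balance": 0x19CBC0}},
    {"network": [">=Cancun"], "result": {"sender_balance": 0}},
]

amsterdam = first_match(entries, "Amsterdam")
cancun = first_match(entries, "Cancun")
```

Order matters under this rule: with the two entries swapped, `>=Cancun` would also match Amsterdam and the override would never be reached.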

Both files now carry an @manually-enhanced marker so the change survives future scripts/filler_to_python regeneration.

Caveat — values are empirical, not derived

The constants 0x19CBC0 and 0x284E5C are the observed got balances from the failing fill output on snøbal/4 — clustered by parametrization. They have not been derived from EIP-8037's specification text. A reviewer who knows EIP-8037's refund accounting should confirm these are the right targets (and ideally that they should be exactly these values on Amsterdam).

Verification

  • --fork Amsterdam on the two files: 144 passed, 0 failed (24 parametrizations × 3 fixture variants × 2 files).
  • --fork Cancun on the two files: 144 passed, 0 failed (sanity check that the >=Cancun fall-through still works).

🔗 Related Issues or PRs

✅ Checklist

  • All: Ran fast static checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    just static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
  • Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry to the list.
  • Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

@codecov

codecov Bot commented May 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (devnets/snobal/6@14389ab). Learn more about missing BASE report.

Additional details and impacted files
@@                 Coverage Diff                 @@
##             devnets/snobal/6    #2790   +/-   ##
===================================================
  Coverage                    ?   85.58%           
===================================================
  Files                       ?      630           
  Lines                       ?    39611           
  Branches                    ?     3937           
===================================================
  Hits                        ?    33902           
  Misses                      ?     5084           
  Partials                    ?      625           
| Flag | Coverage Δ |
| --- | --- |
| unittests | 85.58% <ø> (?) |


@leolara leolara marked this pull request as ready for review May 1, 2026 09:09
@leolara leolara marked this pull request as draft May 1, 2026 09:10
EIP-8037's two-dimensional gas model changes the refund arithmetic on
OoG paths in test_create_oog_from_call_refunds and
test_create2_oog_from_call_refunds. The sender ends up with a non-zero
residue where Cancun/Prague/Osaka leave 0 — 0x19CBC0 wei for
SStore/SelfDestruct/LogOp OoG paths and 0x284E5C wei for the SStore +
CREATE/CREATE2 paths.

Add per-fork overrides for the five OoG `expect_entries_` blocks (data
indexes [1,2,4,5,7,8,10,11], [13,14], [16,17], [19,20], [22,23]) so
Amsterdam matches the new balance via resolve_expect_post's first-match
rule.  Other forks keep the original `balance=0` post-state.

Drop the 36 corresponding entries from amsterdam_skip_list.txt and mark
both test files `@manually-enhanced` to keep these overrides immune to
future regeneration.

Note: the residue values are observed empirically from the failing fill
output on snøbal/4, not derived from the EIP-8037 specification text.
A reviewer who knows EIP-8037 should confirm these are the right
targets.

Verified: --fork Amsterdam on the two test files -> 144 passed (24
parametrizations x 3 fixture variants x 2 files), 0 failed.
@leolara leolara force-pushed the wt-snobal-4-amsterdam-oog-balances branch from 53a1301 to 2f57424 Compare May 1, 2026 09:14
@leolara leolara marked this pull request as ready for review May 1, 2026 09:18
Contributor

@kclowes kclowes left a comment

LGTM! The 0x19CBC0 and 0x284E5C values seem right to me. I left a comment about deriving the values instead of hard-coding them, but feel free to ignore it if it doesn't make sense!

},
"network": [">=Amsterdam"],
"result": {
sender: Account(balance=0x19CBC0, nonce=2),
Contributor

Since this is manually-enhanced anyway, I wonder if it's worth making a couple helpers to do this calculation instead of hard coding so that when cpsb inevitably changes, this test (and the create test below) will automatically pick it up. I think we also won't need this >=Amsterdam branch because these helpers will return 0 before Amsterdam. Maybe something like:

  residue_sstore = (
      fork.create_state_gas(code_size=0)
      + fork.sstore_state_gas()
  ) * gas_price

Then this line becomes something like:

Suggested change
- sender: Account(balance=0x19CBC0, nonce=2),
+ sender: Account(balance=residue_sstore, nonce=2),

"indexes": {"data": [19, 20], "gas": -1, "value": -1},
"network": [">=Amsterdam"],
"result": {
sender: Account(balance=0x284E5C, nonce=2),
Contributor

Similarly, you could have another helper like:

  residue_sstore_create = (
      fork.create_state_gas(code_size=0)
      + fork.create_state_gas(code_size=1)
  ) * gas_price

and then this line becomes:

Suggested change
- sender: Account(balance=0x284E5C, nonce=2),
+ sender: Account(balance=residue_sstore_create, nonce=2),
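
As a rough consistency check, the two suggested formulas can be evaluated by hand against the hard-coded constants. The sketch below makes several assumptions: the per-byte cost of 1174 is inferred from the figure 112 × cost_per_state_byte = 131 488 quoted in a later commit message in this thread, the gas price of 10 matches the `env.base_fee_per_gas=10` mentioned there, and the two functions are stand-ins rather than the real `fork` helpers. In particular, the idea that `create_state_gas` charges 112 bytes plus the code size is a guess that happens to reproduce the observed values.

```python
# Assumption: inferred as 131_488 / 112, not quoted from the EIP-8037 text.
COST_PER_STATE_BYTE = 1174
NEW_ACCOUNT_BYTES = 112  # state bytes charged for a new account
STORAGE_SET_BYTES = 32   # state bytes charged for a storage set
GAS_PRICE = 10           # matches env.base_fee_per_gas=10 in the tests


def create_state_gas(code_size: int) -> int:
    """Stand-in helper: assumed (new-account bytes + code bytes) * cpsb."""
    return (NEW_ACCOUNT_BYTES + code_size) * COST_PER_STATE_BYTE


def sstore_state_gas() -> int:
    """Stand-in helper: assumed storage-set bytes * cpsb."""
    return STORAGE_SET_BYTES * COST_PER_STATE_BYTE


residue_sstore = (create_state_gas(code_size=0) + sstore_state_gas()) * GAS_PRICE
residue_sstore_create = (
    create_state_gas(code_size=0) + create_state_gas(code_size=1)
) * GAS_PRICE

print(hex(residue_sstore), hex(residue_sstore_create))
```

Under these assumptions both results agree with the constants in the diff, which is consistent with the reviewer's reading of where the residue came from at this point in the thread.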

Replace the hardcoded balance constants (0x19CBC0 / 0x284E5C) and the
duplicated >=Amsterdam expect_entries_ blocks with formulas built from
fork.create_state_gas() and fork.sstore_state_gas() (per kclowes review
on PR ethereum#2790).

Pre-Amsterdam these helpers return 0, so the same formula yields
balance=0 on Cancun/Prague/Osaka and the original balance on Amsterdam.
The five OoG block-pairs collapse to five single blocks under a single
>=Cancun network constraint. The @manually-enhanced docstring now
points at the helpers instead of describing fork-specific overrides.

Verified: --fork Amsterdam and --fork Cancun on the two test files,
144 passed each, 0 failed.
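
Why a single formula can serve every fork is easiest to see with stub fork objects. The classes below are hypothetical stand-ins, not the framework's fork classes, and the Amsterdam per-byte cost is an assumed value inferred from figures elsewhere in this thread.

```python
class PreAmsterdam:
    """Hypothetical stand-in for Cancun/Prague/Osaka: no state gas."""

    def create_state_gas(self, code_size: int = 0) -> int:
        return 0

    def sstore_state_gas(self) -> int:
        return 0


class Amsterdam(PreAmsterdam):
    """Hypothetical stand-in; the per-byte cost is an assumed value."""

    cost_per_state_byte = 1174

    def create_state_gas(self, code_size: int = 0) -> int:
        return (112 + code_size) * self.cost_per_state_byte

    def sstore_state_gas(self) -> int:
        return 32 * self.cost_per_state_byte


def sstore_oog_residue(fork: PreAmsterdam, gas_price: int = 10) -> int:
    """Expected sender balance for the SStore-family OoG paths."""
    state_gas = fork.create_state_gas(code_size=0) + fork.sstore_state_gas()
    return state_gas * gas_price
```

Because the stub for earlier forks returns 0 from both helpers, one expect entry under `>=Cancun` covers all forks, and a change to the per-byte cost would flow through without editing the test.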
@leolara
Member Author

leolara commented May 4, 2026

@kclowes please check whether I implemented your suggestion correctly :-)

env.base_fee_per_gas is typed Optional, so multiplying it by the
state-gas helper sums tripped mypy's `int * None` check. Replace with
a literal 10 (matching the env.base_fee_per_gas=10 above), with a
comment that points to the env. Behavior unchanged; just satisfies
mypy and applies ruff's auto-formatting.
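
The typing issue can be reproduced in isolation. In this sketch `Env` is a hypothetical stand-in for the test environment object and `state_gas` is an illustrative number. It shows the literal approach the commit describes next to an alternative that narrows the `Optional` first, which the commit did not use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Env:
    """Hypothetical stand-in for the test environment object."""

    base_fee_per_gas: Optional[int] = None


env = Env(base_fee_per_gas=10)
state_gas = 169_056  # illustrative sum of state-gas helper results

# residue = state_gas * env.base_fee_per_gas
# mypy rejects the line above because the right operand may be None.

# Approach described in the commit: repeat the literal, point at the env.
GAS_PRICE = 10  # must match env.base_fee_per_gas above
residue_literal = state_gas * GAS_PRICE

# Alternative (not what the commit did): narrow the Optional first.
assert env.base_fee_per_gas is not None
residue_narrowed = state_gas * env.base_fee_per_gas
```

The literal keeps the test body free of runtime assertions but has to be kept in sync with the environment by hand. The narrowing variant stays in sync automatically and fails if the field is ever unset.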
@leolara leolara changed the base branch from devnets/snøbal/4 to devnets/snobal/6 May 5, 2026 08:58
leolara added 3 commits May 5, 2026 19:53
…-4-amsterdam-oog-balances

# Conflicts:
#	tests/ported_static/amsterdam_skip_list.txt
…-4-amsterdam-oog-balances

# Conflicts:
#	tests/ported_static/amsterdam_skip_list.txt
The per-fork balance overrides this branch added to test_create2_oog_from_call_refunds.py
and test_create_oog_from_call_refunds.py were derived from the failing fill output
on snøbal/4, where Amsterdam OoG paths left a sender residue. The EIP-8037 frame-
level accounting changes now in snobal/6 fix that at the spec level — the residue
collapses to zero on all forks. The override was wrong against the new spec.

Take snobal/6's version of both files. Tests pass on Amsterdam without the override
(144/144). Skip-list removals of the 36 corresponding entries (already in this
branch) remain correct: the tests pass naturally now.
@leolara
Member Author

leolara commented May 5, 2026

After merging snobal/6 in, CI failed on the two OoG tests this PR was supposed to fix (96 parametrizations, all BalanceMismatchError: want 0x19cbc0, got 0x00).

Root cause: the per-fork balance overrides on test_create2_oog_from_call_refunds.py and test_create_oog_from_call_refunds.py were derived empirically from snøbal/4 fill output, where Amsterdam OoG paths left a sender residue (0x19CBC0 / 0x284E5C). snobal/6's EIP-8037 frame-level accounting changes fixed that at the spec level — the residue collapses to zero on Amsterdam, matching Cancun/Prague/Osaka. The override became wrong against the new spec.

Fix in commit c799ffb: dropped the override (took snobal/6's version of both test files). All 144 OoG parametrizations pass on Amsterdam without any post-state modification.

The 36 skip-list removals in this branch stay valid: the tests pass naturally on snobal/6 now, so the skip entries (which snobal/6 still has) are stale and correctly removed.

Net effect: this PR no longer modifies the test files at all. Its actual deliverable is "remove 36 now-obsolete skip-list entries that snobal/6's spec already makes pass."

@leolara leolara merged commit e87390b into ethereum:devnets/snobal/6 May 5, 2026
18 checks passed
leolara added a commit to leolara/execution-specs that referenced this pull request May 8, 2026
…on Amsterdam

EIP-8037's two-dimensional gas model raises NEW_ACCOUNT state gas
(112 × cost_per_state_byte = 131 488) and per-storage-set state gas
(32 × cpsb = 37 568). Several ported_static tests were authored
against the pre-EIP-8037 gas budgets, where these state-gas costs
didn't exist; on Amsterdam they OoG before reaching the operations
the test exercises (SELFDESTRUCT-to-empty, nested CREATE, init-code
SSTORE, multi-CREATE wallet construction, ...). Cancun/Prague/Osaka
post-state expectations are unaffected — the helpers return 0 for
state gas there, so the same code path runs on the same budget.

Bump the relevant `tx_gas` (or inner CALL gas) on tests where:
  - the failure is a CREATE/SELFDESTRUCT path running out of state-gas
    headroom (not a refund/coinbase accounting shift), and
  - the bump preserves the test's intent (post-state on all forks
    unchanged; the OoG-by-design parametrizations, where applicable,
    keep their original budget).

Each modified file gets a `@manually-enhanced` docstring marker so a
future regenerator skips them.

Files modified (10) and skip-list entries dropped (24):
- stCreate2/test_create2collision_selfdestructed{,2}.py
- stCreate2/test_create2_smart_init_code.py
- stCreate2/test_create2_contract_suicide_during_init_then_store_then_return.py
- stCreateTest/test_create_transaction_call_data.py
- stRevertTest/test_revert_opcode_in_init.py
- stSStoreTest/test_sstore_change_from_external_call_in_init_code.py
- stInitCodeTest/test_call_the_contract_to_create_empty_contract.py
- stWalletTest/test_{day_limit,wallet}_construction.py

Verified: all 10 files, --fork Amsterdam and --fork Cancun, 102 passed,
0 failed. Reduces the still-failing fork_Amsterdam parametrization
count by 57 (1 269 -> 1 212).

Out of scope for this branch: tests where the post-state mismatch is
a coinbase-balance shift (formula-derived fix territory, like ethereum#2790),
tests where OoG/revert is the test premise, and the bulk
KeyValueMismatchError cluster (1 036) which needs a different fix
shape entirely.

2 participants