Skip to content

feat: add native symbolic testing#14796

Open
mattsse wants to merge 45 commits into
masterfrom
mattsse/native-symbolic-testing
Open

feat: add native symbolic testing#14796
mattsse wants to merge 45 commits into
masterfrom
mattsse/native-symbolic-testing

Conversation

@mattsse
Copy link
Copy Markdown
Member

@mattsse mattsse commented May 17, 2026

What changed

Adds Foundry-owned native symbolic testing behind forge test --symbolic.

This introduces a new foundry-evm-symbolic crate with a standalone symbolic EVM, Z3-backed solving, symbolic ABI input generation, byte-precise calldata/memory/returndata, symbolic storage and keccak support, multi-frame calls, creation, cheatcode handling, invariant sequence execution, and concrete replay before counterexamples are reported.

The Forge runner now discovers check*/prove* tests for stateless symbolic execution and invariant*/statefulFuzz* tests for bounded symbolic sequence execution when symbolic mode is enabled. Symbolic configuration is available through [profile.*.symbolic], CLI/env overrides, and native forge-config: inline settings, with compatibility translation for @custom:halmos annotations.

Why

This gives Forge a native path for symbolic testing without requiring a separate external runner, while still preserving normal Forge setup, compilation, test selection, result formatting, and replay semantics.

Impact

Users can write ordinary Forge-style Solidity tests and run them symbolically with forge test --symbolic. Dynamic ABI bounds, solver limits, invariant depth, symbolic call targeting, SMT dumping, and storage layout mode are configurable through Foundry config. Unsupported symbolic constructs are reported as incomplete/stuck rather than silently treated as proven.

grandizzy and others added 20 commits May 19, 2026 12:54
* test(symbolic): add fuzzer/standard-lib parity tests

Adds 23 CI-bounded symbolic-execution parity tests against canonical
benchmark corpora (Echidna, Medusa, ItyFuzz, devdacian, crytic/properties,
Halmos examples, OpenZeppelin patterns) plus engine-capability checks for
new opcodes.

All tests gated on z3_available(); total wall ~43s in parallel.

Amp-Thread-ID: https://ampcode.com/threads/T-019e3ae3-f6f9-76f0-86a0-845a6f893516
Co-authored-by: Amp <amp@ampcode.com>

* test(symbolic): convert parity tests to snapbox snapshots

Replaces `stdout_lossy() + contains()` assertions with
`stdout_eq(str![[...]])` snapshots for all 23 symbolic parity tests.

- Adds two helpers in the test module:
  - `assert_symbolic` redacts `[METRICS]` (paths/queries counts) and
    `[SENDER]` (symbolic invariant sender addresses).
  - `assert_symbolic_witness` adds `[CALLDATA]` and `[ARGS]` (with one
    level of nested scientific-notation brackets) for tests whose
    counterexample witnesses Z3 chooses freely.

- Uses `sh_eprintln` in `skip_unless_z3!` to satisfy the disallowed-macro
  clippy lint.

Amp-Thread-ID: https://ampcode.com/threads/T-019e3ae3-f6f9-76f0-86a0-845a6f893516
Co-authored-by: Amp <amp@ampcode.com>

* test(symbolic): trim parity snapshots with ellipsis wildcards

Replaces the repeated compile-header and pre-"Failing tests:" duplicated
counterexample line in each snapshot with a single `...` multi-line
wildcard from snapbox. Passing tests keep just `Ran 1 test for ...` and
the `[PASS] ...` line; failing tests keep from `Failing tests:` to the
`forge test --rerun` tip.

Amp-Thread-ID: https://ampcode.com/threads/T-019e3ae3-f6f9-76f0-86a0-845a6f893516
Co-authored-by: Amp <amp@ampcode.com>

* test(symbolic): split parity corpora

* test(symbolic): align kevm snapshot with trim convention

The split-out kevm test ended the snapshot at `Suite result: ok. ...;
finished in [..]`, but the global `[ELAPSED]` redaction rewrites
`finished in <n>s` to literal `[ELAPSED]`, so the assertion never
matched. Replace the trailing line with the same `...` multi-line
wildcard the other parity snapshots use.

Amp-Thread-ID: https://ampcode.com/threads/T-019e3ae3-f6f9-76f0-86a0-845a6f893516
Co-authored-by: Amp <amp@ampcode.com>

* test(symbolic): consolidate parity tests under symbolic_parity/ mod

- Move the 12 `symbolic_parity_*.rs` files into a `symbolic_parity/`
  subdirectory with a single `mod symbolic_parity` registration. Mirrors
  the existing `invariant/` layout.

- Re-export `assert_symbolic` and `assert_symbolic_witness` from
  `symbolic_parity::mod` so corpus submodules can `use super::*` instead
  of reaching into `crate::test_cmd::symbolic_helpers`.

- Add the missing snapshot assertions for hevm, manticore, scribble, and
  swc. These were checking only the failure exit code; now they assert
  the trimmed snapbox snapshot the rest of the corpus uses.

Amp-Thread-ID: https://ampcode.com/threads/T-019e3ae3-f6f9-76f0-86a0-845a6f893516
Co-authored-by: Amp <amp@ampcode.com>

* fix(symbolic): make manticore parity test actually exercise the engine

The previous snapshot showed:

    [FAIL: failed to set up invariant testing environment: No contracts to fuzz.]

That's a wiring bug, not a real `Manticore`-style multi-tx exploration:
the test contract held both `arm()` and `invariant_neverArmed()` in the
same contract, didn't inherit `forge-std/Test`, and had no `setUp` /
`targetContract` call, so the invariant runner had no fuzzable target.

Split the target out into a separate `ArmTarget` contract and register
it via `targetContract`. The engine now actually shrinks the symbolic
sequence to a single `arm(0xfeed)` call.

Amp-Thread-ID: https://ampcode.com/threads/T-019e3ae3-f6f9-76f0-86a0-845a6f893516
Co-authored-by: Amp <amp@ampcode.com>

* test(symbolic): align parity tests with upstream challenges

- devdacian rarely-false: port to canonical (n + 1234) % 2**80 == 0
- halmos minivat: rename trivial test to linear_smoke_parity and add
  canonical Fundamental Equation of DAI invariant as ignored regression
  (engine gap: nonlinear bv-mul symbolic*symbolic)
- erc20 approve race: tighten comment to reflect hardcoded sequence

Co-authored-by: grandizzy <38490174+grandizzy@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019e3ae3-f6f9-76f0-86a0-845a6f893516
Co-authored-by: Amp <amp@ampcode.com>

* Update crates/forge/tests/cli/test_cmd/symbolic_parity/halmos.rs

Co-authored-by: figtracer <me@figtracer.com>

---------

Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Derek <256792747+decofe@users.noreply.github.com>
Co-authored-by: figtracer <me@figtracer.com>
* feat(symbolic): support SMT solver selection

* fix: address grandizzy's comments

* fix: improve symbolic solver portfolio performance

* fix(symbolic): validate portfolio solver choices

Amp-Thread-ID: https://ampcode.com/threads/T-019e4487-e080-741d-b8ab-62d5aa43573a
Co-authored-by: Amp <amp@ampcode.com>

* fix(symbolic): avoid impossible mapping storage aliases

Amp-Thread-ID: https://ampcode.com/threads/T-019e4487-e080-741d-b8ab-62d5aa43573a
Co-authored-by: Amp <amp@ampcode.com>

* chore(symbolic): remove local solver version aliases

Amp-Thread-ID: https://ampcode.com/threads/T-019e4487-e080-741d-b8ab-62d5aa43573a
Co-authored-by: Amp <amp@ampcode.com>

* chore(symbolic): drop versioned solver aliases

Amp-Thread-ID: https://ampcode.com/threads/T-019e4487-e080-741d-b8ab-62d5aa43573a
Co-authored-by: Amp <amp@ampcode.com>

* test(symbolic): snapshot erc20 storage regression

Amp-Thread-ID: https://ampcode.com/threads/T-019e4487-e080-741d-b8ab-62d5aa43573a
Co-authored-by: Amp <amp@ampcode.com>

---------

Co-authored-by: Amp <amp@ampcode.com>
* chore(symbolic): share builtin solver registry

Amp-Thread-ID: https://ampcode.com/threads/T-019e4487-e080-741d-b8ab-62d5aa43573a
Co-authored-by: Amp <amp@ampcode.com>

* feat(symbolic): report portfolio solver outcomes

Amp-Thread-ID: https://ampcode.com/threads/T-019e4487-e080-741d-b8ab-62d5aa43573a
Co-authored-by: Amp <amp@ampcode.com>

* feat(symbolic): warn on degraded solver portfolios

Amp-Thread-ID: https://ampcode.com/threads/T-019e4487-e080-741d-b8ab-62d5aa43573a
Co-authored-by: Amp <amp@ampcode.com>

* chore(symbolic): fix portfolio diagnostics clippy

Amp-Thread-ID: https://ampcode.com/threads/T-019e4487-e080-741d-b8ab-62d5aa43573a
Co-authored-by: Amp <amp@ampcode.com>

* docs(symbolic): document solver portfolio helpers

---------

Co-authored-by: Amp <amp@ampcode.com>
* feat(symbolic): summarize portfolio diagnostics

Amp-Thread-ID: https://ampcode.com/threads/T-019e4487-e080-741d-b8ab-62d5aa43573a
Co-authored-by: Amp <amp@ampcode.com>

* Address symbolic portfolio diagnostics review

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* Fix transaction request clippy warning

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* Revert "Fix transaction request clippy warning"

This reverts commit 884799d.

* Allow transaction request enum size lint

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* Disambiguate portfolio diagnostics re-export

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* Use enum for solver outcomes

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

---------

Co-authored-by: Amp <amp@ampcode.com>
Defer symbolic diagnostics during progress output


Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694

Co-authored-by: Amp <amp@ampcode.com>
* feat(symbolic): adapt portfolio scheduling

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* fix(symbolic): tighten adaptive portfolio scoring

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* refactor(symbolic): name portfolio scheduler signals

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* fix(symbolic): clarify query progress limit

Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694

---------

Co-authored-by: Amp <amp@ampcode.com>
* feat(symbolic): configure exploration order

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* fix(symbolic): default exploration order when omitted

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* fix(symbolic): apply exploration order to local batches

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

---------

Co-authored-by: Amp <amp@ampcode.com>
)

Promotes scattered magic numbers and named constants spread across
executor.rs, solver.rs, evm.rs, memory.rs, state.rs and precompiles.rs
into a single `crates/evm/symbolic/src/consts.rs` module.  All previous
call-sites pick the constants up through the existing use `super::*` /
`pub(crate) use consts::*` re-export chain, so no public API changes.
* refactor(symbolic): consolidate constants into a `consts` module

Promotes scattered magic numbers and named constants spread across
executor.rs, solver.rs, evm.rs, memory.rs, state.rs and precompiles.rs
into a single `crates/evm/symbolic/src/consts.rs` module.  All previous
call-sites pick the constants up through the existing use `super::*` /
`pub(crate) use consts::*` re-export chain, so no public API changes.

* refactor(symbolic): split symbolic test file by feature

Splits the monolithic symbolic.rs (~7800 lines, ~150 tests) into focused
files grouped by the EVM feature they exercise:

- symbolic.rs: basic flags, core behavior, arrays, invariants
- symbolic_opcodes.rs: byte/signextend, shift, exp
- symbolic_memory.rs: mload/mstore/sha3/log/returndatacopy/mcopy
- symbolic_calls.rs: calldataload/copy,
  external/static/delegate/callcode
- symbolic_creates.rs: create, create2, nonce, vm.expectCreate
- symbolic_storage.rs: mapping, packed, ERC-20 storage, svm helpers
- symbolic_precompiles.rs: hash, identity, advanced precompile coverage
- symbolic_cheatcodes.rs: all vm.* cheatcode tests

No test logic changes; all symbolic CLI tests are preserved.

---------

Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Mablr <59505383+mablr@users.noreply.github.com>
* refactor(symbolic): split executor into modules

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* ci: skip tempo network checks without rpc secrets

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* Revert "ci: skip tempo network checks without rpc secrets"

This reverts commit af8e942.

---------

Co-authored-by: Amp <amp@ampcode.com>
…um` (#14898)

Introduces `SolverConfigError` with `EmptyCommand` and
`UnterminatedQuote(char)`
variants, replacing all `Result<T, String>` in the solver
command-building path.
Also tightens `split_quoted_args` to return the offending `char`
directly instead
of a pre-formatted string, letting callers own the message context.
…olic-testing

# Conflicts:
#	crates/forge/src/result.rs
#	crates/forge/src/runner.rs
decofe and others added 4 commits May 25, 2026 13:45
* test: snapshot symbolic CLI output checks

* fix: satisfy clippy on symbolic test branch

* Update crates/forge/tests/cli/test_cmd/symbolic_helpers.rs

Co-authored-by: Mablr <59505383+mablr@users.noreply.github.com>

---------

Co-authored-by: grandizzy <38490174+grandizzy@users.noreply.github.com>
Co-authored-by: Mablr <59505383+mablr@users.noreply.github.com>
* feat(symbolic): add tracing to executor and solver

Spans added:
- `symbolic_path` (`completed_paths`, `worklist_size`): wraps each
  worklist iteration in `run.rs`
- `symbolic_step` (`pc`, `op`): wraps each `step()` call, nested under
  `symbolic_path`
- `jumpi_branch` (`pc`, `dest`): wraps the symbolic `JUMPI` fork block
  in `opcodes.rs`
- `solver_query` (`query_id`, `constraint_count`, `kind`): wraps
  `is_sat`/`model` calls in `solver.rs`

Events added:
- trace per path pop and per solver query entry
- debug at path/depth limit exhaustion and final safe/revert-all outcome
- debug before counterexample model materialization
- debug (with `constraint_count`) on invalid solver model

* fix: add missing tracing spans to `execute_sequence_call`
* refactor(common): extract wallet helpers

Move `derive_key_path`, `derive_private_key<W>`,
`derive_private_key_with_language`, and `private_key_from_u256` from
`foundry-evm-symbolic` into `foundry-common` so that both
`foundry-cheatcodes` and `foundry-evm-symbolic` share a single impl.

* fix: enable `mnemonic-all-languages`
figtracer and others added 6 commits May 25, 2026 22:10
Co-authored-by: Amp <amp@ampcode.com>
* feat(symbolic): normalize hard arithmetic queries

Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694

* fix(symbolic): address hard arithmetic review

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* style(symbolic): fix nightly rustfmt import order

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

* test(symbolic): stabilize fake portfolio solvers

Amp-Thread-ID: https://ampcode.com/threads/T-019e3bf2-923d-727e-82f5-40ca3e956694
Co-authored-by: Amp <amp@ampcode.com>

---------

Co-authored-by: Amp <amp@ampcode.com>
feat(symbolic): expose solver telemetry


Amp-Thread-ID: https://ampcode.com/threads/T-019e645e-97a2-757f-adc7-2c765700d18d

Co-authored-by: Amp <amp@ampcode.com>
grandizzy and others added 11 commits May 27, 2026 12:21
* test(symbolic): regress default-layout unwritten mapping SLOAD

Pins that a symbolic-key read of an unwritten mapping yields a fresh
symbolic value under the default storage layout, complementing the
existing generic-layout coverage.

* test(symbolic): regress transient storage clear between invariant steps

Pins that EIP-1153 transient storage is cleared at the boundary of every
top-level symbolic invariant sequence step, so a prior step's TSTORE is
not observable from the next step.

* test(symbolic): regress revert-branch exploration under fail_on_revert

Pins that with fail_on_revert=false, the symbolic invariant runner keeps
exploring non-reverting symbolic branches of a target whose other branches
revert, instead of silently under-approximating and missing bugs.

* test(symbolic): regress GAS/gasleft bound and pin Unsupported gap

Pins that the GAS opcode is bounded by the transaction gas limit rather
than silently returning U256::MAX, and records an ignored regression that
gasleft-dependent branches should fail closed as Unsupported instead of
emitting phantom non-replaying counterexamples.

* test(symbolic): regress dynamic memory read epoch ordering

Pins that a symbolic-offset MLOAD respects write-epoch ordering: when a
later concrete MSTORE has written to an offset, a subsequent dynamic
read that aliases that offset must see the later concrete value, not
the stale earlier symbolic write.

* test(symbolic): regress overlay code lookup across invariant steps

Pins that top-level invariant sequence calls resolve account code through
the symbolic world overlay so prior-step vm.etch effects are visible at
the next step, instead of bypassing to the backend and silently missing
the mutation.

* test(symbolic): regress CREATE with symbolic runtime size fails closed

Pins that a constructor returning a symbolic-length runtime image is
reported as unsupported symbolic execution rather than silently installed
as max-length padded bytecode, which would corrupt EXTCODESIZE and later
selector dispatch.

* test(symbolic): regress vm.prank delegatecall overload fails closed

Pins that the vm.prank(address,bool) and vm.prank(address,address,bool)
overloads with delegateCall=true are reported as unsupported symbolic
execution instead of silently behaving like the plain address-only
overloads.

* test(symbolic): pin Keccak-heuristic surfacing gap

Adds an ignored regression that pins the plan goal: results that depend on
heuristic Keccak modeling must surface explicit Keccak/SHA3 vocabulary
(Incomplete or warning), instead of either passing silently or surfacing a
generic solver-error Incomplete with no Keccak attribution.

* style(symbolic): rustfmt
feat(symbolic): allow gas as call operand


Amp-Thread-ID: https://ampcode.com/threads/T-019e645e-97a2-757f-adc7-2c765700d18d

Co-authored-by: Amp <amp@ampcode.com>
feat(symbolic): support symbolic vm.deal values


Amp-Thread-ID: https://ampcode.com/threads/T-019e645e-97a2-757f-adc7-2c765700d18d

Co-authored-by: Amp <amp@ampcode.com>
…olic-testing-conflict-fix

Amp-Thread-ID: https://ampcode.com/threads/T-019e645e-97a2-757f-adc7-2c765700d18d
Co-authored-by: Amp <amp@ampcode.com>

# Conflicts:
#	crates/forge/src/cmd/test/mod.rs
#	crates/forge/src/result.rs
Copy link
Copy Markdown
Collaborator

@grandizzy grandizzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-preview soundness audit found three concrete false-PASS paths that I think should block any public preview/MVP framing, plus one undocumented cryptographic assumption. Engineering scaffolding is good (unsupported-opcode → Incomplete, solver unknownTimeout, model validation, concrete replay, world overlay, symbolic memory bounds all reviewed cleanly) — the issues below are localized and fixable.

Blockers:

  1. ADDMOD / MULMOD use wrapping intermediate arithmetic instead of widened (see executor/opcodes.rs).
  2. Cancun precompile 0x0a (KZG point-evaluation) is not in the precompile registry and falls through to empty-account success (runtime/precompiles.rs + executor/calls.rs).
  3. Observable cheatcodes expectSafeMemory*, lastCallGas, snapshotGasLastCall* are silently no-op / fabricated (executor/cheatcodes.rs) — can convert real concrete failures into symbolic Safe.

Documentation gap:
4. BoolExpr::eq rewrites keccak(a) == keccak(b) to byte-equality (runtime/expressions.rs), encoding collision resistance as a hard fact — fine as an assumption but the README currently presents PASS as fail-closed. Either document or remove the converse direction.

Would also like to see a small false_pass_prevention CLI suite under crates/forge/tests/cli/test_cmd/ covering each of the above so we don't regress. After these land I'm comfortable shipping --symbolic as an opt-in preview.

Comment thread crates/evm/symbolic/src/executor/opcodes.rs
Comment thread crates/evm/symbolic/src/executor/opcodes.rs
Comment thread crates/evm/symbolic/src/runtime/precompiles.rs
Comment thread crates/evm/symbolic/src/executor/cheatcodes.rs Outdated
Comment thread crates/evm/symbolic/src/runtime/expressions.rs
@grandizzy grandizzy self-requested a review May 28, 2026 13:11
@grandizzy grandizzy marked this pull request as ready for review May 28, 2026 13:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

5 participants