feat!: add integration_bench tool for end-to-end scenario latency, block, and tx-byte measurements #487
Merged
…tx-byte measurements
Arjentix
requested changes
May 19, 2026
Collaborator
Arjentix
left a comment
Hey, thanks for the PR.
Left some comments. My main concerns are:
- Usage of the bedrock binary (I suggest still running it with docker).
- Not using the binaries of our sequencer, indexer and wallet, and using their library code instead. It makes all of this `integration_bench`, not `e2e_bench`. IMO it's alright, just don't call it `e2e_bench`.
- Depending on `integration_tests` and mimicking `TestContext` with duplicated code, while we could possibly expand `TestContext`, move it to a separate crate and reuse it here.
I understand that these are just benches and not the main code. But this is still a part of our project which we have to support in the future (even after moving to criterion). So I'm suggesting refactoring some places to make them easier to work with.
…io::timeout
BREAKING CHANGE: bench JSON renames per-step / per-scenario timing fields from *_ms (float milliseconds) to *_s (float seconds).
Renames: submit_ms → submit_s, inclusion_ms → inclusion_s, wallet_sync_ms → wallet_sync_s, total_ms → total_s, setup_ms → setup_s, bedrock_finality_ms → bedrock_finality_s, total_wall_seconds → total_wall_s.
measure_bedrock_finality timeout floor also shifts slightly: on timeout the field is now ~60.000s rather than "first poll tick past 60s".
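For anyone holding older bench JSON, the rename list above amounts to a name mapping plus a divide-by-1000 on the millisecond fields. The following is only a sketch: `migrate_timing_field` is a hypothetical helper, not part of the crate, and it assumes `total_wall_seconds` already held seconds, so only its name changes.

```rust
/// Hypothetical helper (not from the PR): maps a pre-rename timing field
/// to its new name and value. Returns None for fields this commit did not rename.
fn migrate_timing_field(name: &str, value: f64) -> Option<(&'static str, f64)> {
    let (new_name, already_seconds) = match name {
        "submit_ms" => ("submit_s", false),
        "inclusion_ms" => ("inclusion_s", false),
        "wallet_sync_ms" => ("wallet_sync_s", false),
        "total_ms" => ("total_s", false),
        "setup_ms" => ("setup_s", false),
        "bedrock_finality_ms" => ("bedrock_finality_s", false),
        // Assumption: this field was already in seconds; only the name changed.
        "total_wall_seconds" => ("total_wall_s", true),
        _ => return None,
    };
    // Millisecond fields are converted to float seconds.
    let seconds = if already_seconds { value } else { value / 1000.0 };
    Some((new_name, seconds))
}

fn main() {
    for (name, value) in [("submit_ms", 1500.0), ("total_wall_seconds", 42.0)] {
        if let Some((new_name, seconds)) = migrate_timing_field(name, value) {
            println!("{name} = {value} -> {new_name} = {seconds}");
        }
    }
}
```

Fields outside the rename list are returned as `None`, so a caller can pass them through unchanged.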
20a7b4a to 563a9ce
…, share one node per run
BREAKING CHANGE:
- crate renamed e2e_bench → integration_bench. Run via `cargo run -p integration_bench`.
- env vars removed: LEZ_BEDROCK_BIN, LEZ_BEDROCK_CONFIG_DIR, LEZ_BEDROCK_PORT. Replaced by a docker prerequisite (docker-compose Bedrock via test_fixtures::TestContext).
- output filenames: target/e2e_bench_{dev,prove}.json → target/integration_bench_{dev,prove}.json.
- JSON schema: per-scenario `setup_s` field removed; replaced by run-level `shared_setup_s` (one TestContext is shared across all scenarios in a run).
- internal: bedrock_handle.rs and bench_context.rs deleted; placeholder-string config (PLACEHOLDER_CHAIN_START_TIME) gone.
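Scripts that read the renamed output files can derive the default path from the proving mode. This is a minimal sketch, assuming the `dev`/`prove` suffix follows whether `RISC0_DEV_MODE=1` is set, as the PR description states. `default_json_out` is a hypothetical helper, not a function from the crate.

```rust
use std::path::PathBuf;

/// Hypothetical helper (not from the PR): default JSON path for a run.
/// `dev_mode` is assumed to mirror whether RISC0_DEV_MODE=1 is set.
fn default_json_out(dev_mode: bool) -> PathBuf {
    let suffix = if dev_mode { "dev" } else { "prove" };
    PathBuf::from("target").join(format!("integration_bench_{suffix}.json"))
}

fn main() {
    // Treat any value other than "1" (or an unset variable) as real proving.
    let dev_mode = std::env::var("RISC0_DEV_MODE")
        .map(|v| v == "1")
        .unwrap_or(false);
    println!("{}", default_json_out(dev_mode).display());
}
```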
Arjentix
approved these changes
May 20, 2026
Collaborator
Arjentix
left a comment
Thanks for addressing my comments. Looks much better to me now.
…_fixtures docstring
🎯 Purpose
Add a benchmark tool for the full end-to-end LEZ stack: scenarios driven through the wallet against a real Bedrock node, in-process sequencer, and indexer, sharing a single docker-compose session across all scenarios in a run. Records per-step wall time (as `Duration` internally, f64 seconds on the wire), Bedrock L1-finality latency, borsh-serialized block sizes, and per-tx-variant wire sizes. Complements `cycle_bench` (program-level cycles) and `crypto_primitives_bench` (wallet-side primitives).
⚙️ Approach
New standalone package at `tools/integration_bench`. Bench uses `test_fixtures::TestContext` (extracted from `integration_tests` in this PR) to bring up the same docker-compose stack the integration tests use, one shared bedrock + sequencer + indexer session for the whole run. Five scenarios:
- `token_onboarding`: sequential public token Send + one shielded recipient setup.
- `amm_swap_flow`: pool create, swap, add liquidity, remove liquidity (all public).
- `multi_recipient_fanout`: one sender to N recipients, sequential.
- `private_chained_flow`: shielded, deshielded, private to private (PPE-bearing).
- `parallel_fanout`: N senders submit concurrently into one block.
Two modes: `RISC0_DEV_MODE=1` for fast latency-only runs, or unset for real STARK proving. JSON dumps default to `target/integration_bench_{dev,prove}.json`; override with `--json-out`. Per-step timing uses a `ScenarioOutput::step` closure helper instead of hand-rolled `Instant::now()` pairs, and end-to-end submit timeouts use `tokio::time::timeout`.
- `tools/integration_bench` package with 5 scenarios, harness, and shared `TestContext` setup.
- `test_fixtures` crate from `integration_tests` so the bench depends on the shared fixtures, not the test crate.
- `docs/benchmarks/integration_bench.md` with canonical dev-mode and real-proving result tables from the docker-compose sweep.
- `integration_bench` row to `docs/benchmarks/README.md`.
- `tools/integration_bench` and `test_fixtures` in workspace `members`.
🧪 How to Test
Docker must be running.
See `docs/benchmarks/integration_bench.md` for the canonical result tables (M2 Pro dev box, CPU only) including the per-PPE-step submit mean (~127 s) and `ppe_tx_bytes` (225,728, matching `cycle_bench` S_agg).
🔗 Dependencies
None.
🔜 Future Work
📋 PR Completion Checklist
docs/benchmarks/integration_bench.md)
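The per-step timing described under Approach (a `step` closure helper, `Duration` internally, f64 seconds on the wire) can be pictured roughly as below. This is an illustrative, synchronous sketch only: the struct layout, the `steps_as_seconds` method, and the `step` signature are assumptions, and the real helper is not shown on this page and is probably async.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the bench's `ScenarioOutput`; fields are assumed.
#[derive(Default)]
struct ScenarioOutput {
    steps: Vec<(String, Duration)>,
}

impl ScenarioOutput {
    /// Runs `f`, records its wall time under `label`, and returns its result,
    /// replacing a hand-rolled `Instant::now()` pair around each step.
    fn step<T>(&mut self, label: &str, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        self.steps.push((label.to_owned(), start.elapsed()));
        out
    }

    /// Wire form: each recorded `Duration` becomes f64 seconds.
    fn steps_as_seconds(&self) -> Vec<(&str, f64)> {
        self.steps
            .iter()
            .map(|(label, d)| (label.as_str(), d.as_secs_f64()))
            .collect()
    }
}

fn main() {
    let mut out = ScenarioOutput::default();
    // The closure stands in for a real step such as submitting a transaction.
    let sum = out.step("submit", || (1..=10u64).sum::<u64>());
    println!("sum = {sum}, steps = {:?}", out.steps_as_seconds());
}
```

Keeping `Duration` until serialization avoids unit mix-ups inside the harness; the conversion to seconds happens in one place.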