feat!: add integration_bench tool for end-to-end scenario latency, block, and tx-byte measurements #487

Merged: moudyellaz merged 11 commits into main from moudy/e2e-bench-tool on May 20, 2026
Collaborator

@moudyellaz moudyellaz commented May 19, 2026

🎯 Purpose

Add a benchmark tool for the full end-to-end LEZ stack: scenarios driven through the wallet against a real Bedrock node, in-process sequencer, and indexer, sharing a single docker-compose session across all scenarios in a run. Records per-step wall time (as Duration internally, f64 seconds on the wire), Bedrock L1-finality latency, borsh-serialized block sizes, and per-tx-variant wire sizes. Complements cycle_bench (program-level cycles) and crypto_primitives_bench (wallet-side primitives).

⚙️ Approach

New standalone package at tools/integration_bench. The bench uses test_fixtures::TestContext (extracted from integration_tests in this PR) to bring up the same docker-compose stack the integration tests use: one shared bedrock + sequencer + indexer session for the whole run. Five scenarios:

  • token_onboarding: sequential public token Send + one shielded recipient setup.
  • amm_swap_flow: pool create, swap, add liquidity, remove liquidity (all public).
  • multi_recipient_fanout: one sender to N recipients, sequential.
  • private_chained_flow: shielded, deshielded, private to private (PPE-bearing).
  • parallel_fanout: N senders submit concurrently into one block.

Two modes: RISC0_DEV_MODE=1 for fast latency-only runs, or unset for real STARK proving. JSON dumps default to target/integration_bench_{dev,prove}.json; override with --json-out. Per-step timing uses a ScenarioOutput::step closure helper instead of hand-rolled Instant::now() pairs, and end-to-end submit timeouts use tokio::time::timeout.
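
For readers unfamiliar with the closure-based timing pattern mentioned above, here is a minimal synchronous sketch. It is not the PR's code: the struct layout, the `steps` field, and the `total_s` method are assumptions made for illustration, and the real helper presumably wraps async calls.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the PR's `ScenarioOutput`; field names are assumptions.
#[derive(Debug, Default)]
pub struct ScenarioOutput {
    /// (step label, wall time) pairs in execution order.
    pub steps: Vec<(String, Duration)>,
}

impl ScenarioOutput {
    /// Runs `f`, records how long it took under `label`, and returns `f`'s result.
    pub fn step<T>(&mut self, label: &str, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        self.steps.push((label.to_owned(), start.elapsed()));
        out
    }

    /// Sum of all recorded step times, converted to f64 seconds for serialization.
    pub fn total_s(&self) -> f64 {
        self.steps.iter().map(|(_, d)| d.as_secs_f64()).sum()
    }
}
```

A call such as `out.step("submit", || do_submit())` (where `do_submit` is a placeholder) keeps the start and stop of each measurement in one place, so a step cannot be timed without also being recorded. Durations stay as `Duration` until the final conversion, which matches the "Duration internally, f64 seconds on the wire" split described in the Purpose section.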

  • Add tools/integration_bench package with 5 scenarios, harness, and shared TestContext setup.
  • Extract test_fixtures crate from integration_tests so the bench depends on the shared fixtures, not the test crate.
  • Add docs/benchmarks/integration_bench.md with canonical dev-mode and real-proving result tables from the docker-compose sweep.
  • Add integration_bench row to docs/benchmarks/README.md.
  • Register tools/integration_bench and test_fixtures in workspace members.

🧪 How to Test

Docker must be running.

# Dev mode, all 5 scenarios (~17 min on M2 Pro)
RISC0_DEV_MODE=1 cargo run --release -p integration_bench -- --scenario all

# Real proving for private (~13 min)
cargo run --release -p integration_bench -- --scenario private

# Real proving for amm (~9 min); override json path since default collides with private
cargo run --release -p integration_bench -- --scenario amm --json-out /tmp/integration_bench_prove_amm.json
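
The collision mentioned in the last command follows from the default path depending only on the mode, not on the scenario. A rough sketch of that selection logic, assuming a hypothetical `json_output_path` helper (the function name and signature are not taken from the PR):

```rust
use std::path::PathBuf;

/// Hypothetical helper: picks the JSON output path for a run.
/// `dev_mode` stands for "RISC0_DEV_MODE=1 was set"; `json_out` is the
/// value of `--json-out`, if one was given.
pub fn json_output_path(dev_mode: bool, json_out: Option<PathBuf>) -> PathBuf {
    json_out.unwrap_or_else(|| {
        // The default depends only on the mode, so two real-proving runs
        // of different scenarios would write to the same file.
        let suffix = if dev_mode { "dev" } else { "prove" };
        PathBuf::from(format!("target/integration_bench_{suffix}.json"))
    })
}
```

Under this sketch, both `--scenario private` and `--scenario amm` without `RISC0_DEV_MODE` resolve to target/integration_bench_prove.json, which is why the second command passes an explicit `--json-out`.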

See docs/benchmarks/integration_bench.md for the canonical result tables (M2 Pro dev box, CPU only) including the per-PPE-step submit mean (~127 s) and ppe_tx_bytes (225,728, matching cycle_bench S_agg).

🔗 Dependencies

None.

🔜 Future Work

  • Criterion-based regression detection (tracked separately).

📋 PR Completion Checklist

  • Complete PR description
  • Implement the core functionality
  • Add/update tests (bench tool with no automated tests; verified by running the binary across all scenarios; canonical numbers in docs/benchmarks/integration_bench.md)
  • Add/update documentation and inline comments

@moudyellaz moudyellaz marked this pull request as ready for review May 19, 2026 09:14
@moudyellaz moudyellaz requested review from Arjentix, Pravdyvy and schouhy and removed request for Pravdyvy May 19, 2026 09:14
Collaborator

@Pravdyvy Pravdyvy left a comment

lgtm

Collaborator

@Arjentix Arjentix left a comment

Hey, thanks for the PR.

Left some comments. My main concerns are:

  1. Use of the bedrock binary (I suggest still running it with Docker).
  2. Using the library code of our sequencer, indexer and wallet instead of their binaries. That makes this an integration_bench rather than an e2e_bench. IMO that's alright, just don't call it e2e_bench.
  3. Depending on integration_tests and mimicking TestContext with duplicated code, when we could instead extend TestContext, move it to a separate crate and reuse it here.

I understand that these are just benches and not the main code, but they are still part of our project and we will have to support them in the future (even after moving to criterion). So I'm suggesting we refactor some places to make them easier to work with.

Comment thread docs/benchmarks/e2e_bench.md Outdated
Comment thread docs/benchmarks/e2e_bench.md Outdated
Comment thread docs/benchmarks/e2e_bench.md Outdated
Comment thread docs/benchmarks/e2e_bench.md Outdated
Comment thread Cargo.toml Outdated
Comment thread tools/e2e_bench/src/harness.rs Outdated
Comment thread tools/e2e_bench/src/harness.rs Outdated
Comment thread tools/integration_bench/src/harness.rs
Comment thread tools/e2e_bench/src/scenarios/token.rs Outdated
Comment thread tools/integration_bench/src/scenarios/token.rs Outdated
…io::timeout

BREAKING CHANGE: bench JSON renames per-step / per-scenario timing fields from *_ms (float milliseconds) to *_s (float seconds). Renames: submit_ms → submit_s, inclusion_ms → inclusion_s, wallet_sync_ms → wallet_sync_s, total_ms → total_s, setup_ms → setup_s, bedrock_finality_ms → bedrock_finality_s, total_wall_seconds → total_wall_s. measure_bedrock_finality timeout floor also shifts slightly: on timeout the field is now ~60.000s rather than "first poll tick past 60s".
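
For anyone holding JSON produced before this commit, the rename list above can be turned into a small migration helper. This is an illustrative sketch only: `migrate_timing_field` is a hypothetical name, and it assumes values have already been parsed into `f64`.

```rust
/// Maps a pre-rename timing field (name, value) to its post-rename form.
/// Returns `None` for names that are not in the rename list.
pub fn migrate_timing_field(name: &str, value: f64) -> Option<(&'static str, f64)> {
    // Float milliseconds become float seconds.
    let ms = |new_name: &'static str| Some((new_name, value / 1000.0));
    match name {
        "submit_ms" => ms("submit_s"),
        "inclusion_ms" => ms("inclusion_s"),
        "wallet_sync_ms" => ms("wallet_sync_s"),
        "total_ms" => ms("total_s"),
        "setup_ms" => ms("setup_s"),
        "bedrock_finality_ms" => ms("bedrock_finality_s"),
        // This field was already in seconds: only the name changes.
        "total_wall_seconds" => Some(("total_wall_s", value)),
        _ => None,
    }
}
```

The one case that needs care is `total_wall_seconds`, which is renamed without a unit change; dividing it by 1000 along with the `*_ms` fields would silently corrupt the value.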
@moudyellaz moudyellaz force-pushed the moudy/e2e-bench-tool branch from 20a7b4a to 563a9ce Compare May 20, 2026 08:08
…, share one node per run

BREAKING CHANGE:
- crate renamed e2e_bench → integration_bench. Run via `cargo run -p integration_bench`.
- env vars removed: LEZ_BEDROCK_BIN, LEZ_BEDROCK_CONFIG_DIR, LEZ_BEDROCK_PORT. Replaced by a docker prerequisite (docker-compose Bedrock via test_fixtures::TestContext).
- output filenames: target/e2e_bench_{dev,prove}.json → target/integration_bench_{dev,prove}.json.
- JSON schema: per-scenario `setup_s` field removed; replaced by run-level `shared_setup_s` (one TestContext is shared across all scenarios in a run).
- internal: bedrock_handle.rs and bench_context.rs deleted; placeholder-string config (PLACEHOLDER_CHAIN_START_TIME) gone.
@moudyellaz moudyellaz requested a review from Arjentix May 20, 2026 11:59
@moudyellaz moudyellaz changed the title from "feat: add e2e_bench tool for end-to-end scenario latency, block, and tx-byte measurements" to "feat!: add integration_bench tool for end-to-end scenario latency, block, and tx-byte measurements" May 20, 2026
Collaborator

@Arjentix Arjentix left a comment

Thanks for addressing my comments. Looks much better to me now.

Comment thread test_fixtures/src/lib.rs Outdated
@moudyellaz moudyellaz merged commit bfdc087 into main May 20, 2026
9 checks passed

3 participants