chore: add workspace build profile gradient and wire prod deploys to profiling#104
chore: add workspace build profile gradient and wire prod deploys to profiling#104
Conversation
Clippy 1.95 reports `storage.sort_by(|(a, _), (b, _)| a.cmp(b))` as a `clippy::unnecessary_sort_by` (prefer `sort_by_key`) in the `apply_curie_hard_fork` test. The fix is the clippy-suggested rewrite. No behavioural change; only the test's sort comparator is reshaped.
morph-reth's root Cargo.toml previously defined no [profile.*] overrides,
so release builds fell back to Cargo defaults (lto=off) — strictly worse
than upstream reth's own release. Define an explicit profile gradient:
- profile.dev: line-tables-only debug + unpacked split-debuginfo.
- profile.hivetests: opt-level=3 + lto=thin, inherits test (debug_assert on).
- profile.release: opt-level=3, lto=thin, strip=symbols, codegen-units=16.
- profile.profiling: release + line-table debug + strip=none + incremental
— production binaries that still flamegraph cleanly.
- profile.bench: inherits profiling.
- profile.maxperf: release + lto=fat + codegen-units=1 for peak throughput
(expect a noticeably longer link stage).
Local `make build` still defaults to the `release` profile, which now
benefits from thin LTO without any user action.
Production-deploy paths (the Dockerfile that produces the container image and MakefileEc2.mk that uploads binaries to S3 for EC2 nodes) both flip their default Cargo profile from `release` to `maxperf`. That enables fat LTO + a single codegen unit so the shipped binary is 2-5% faster at runtime than the default release profile. Tradeoff: symbols are stripped, so flame graphs and symbolicated backtraces require a one-off `--build-arg BUILD_PROFILE=profiling` (or `PROFILE=profiling make ...`) rebuild. Image/link time also roughly doubles. Both are acceptable for an execution client where block- execution throughput is the hot metric and prod incidents are rare. Local `make build` remains on `release` so everyday iteration is unchanged.
Previously the release workflow hardcoded `target/<target>/release/` when packaging tarballs, which coupled it to the Makefile's `release` default and silently broke if either was overridden. - Add a `profile` workflow_dispatch input (default `profiling`, with `maxperf` and `release` as alternatives). Push-tag triggers keep defaulting to `maxperf` for peak throughput on public releases. - Resolve the profile in a dedicated step and thread it through both `make build-<target>` (via PROFILE=) and the subsequent `cp` path (using the computed `profile_dir`, which handles Cargo's dev->debug naming quirk).
📝 WalkthroughWalkthroughBuild and CI configuration updated to support selectable Cargo profiles and per-architecture RUSTFLAGS: new Cargo profiles added; GitHub Actions, Dockerfile, Makefile, and EC2 makefile now default to the Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Mirror upstream reth release defaults: x86_64 binaries target the x86-64-v3 baseline (Haswell+ / Excavator+, ~2013 onward) plus the pclmulqdq extension (carry-less multiply, used by keccak/GHASH but not auto-implied by v3). aarch64 is left untouched. - Makefile cross-build: target-specific RUSTFLAGS_ARCH so build-x86_64-* picks up the flags while build-aarch64-* stays clean. - Dockerfile: TARGETPLATFORM-conditional so multi-arch buildx builds get v3 only on linux/amd64. - MakefileEc2.mk: uname -m guard so EC2 native builds opt in on x86_64 and skip on Graviton ARM hosts. Pre-2013 Intel and pre-2015 AMD CPUs will SIGILL on these binaries; pass an empty RUSTFLAGS override for those hosts.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
Makefile (1)
43-50: Declare the target-specific names as .PHONY.These aren't recipe-bearing targets, but Make still treats the names as file targets, so a stray same-named artifact would shadow rebuilds. Marking them phony also silences the
checkmakewarning.♻️ Proposed fix
+.PHONY: build-x86_64-unknown-linux-gnu build-aarch64-unknown-linux-gnu + # x86_64 release artifacts target the x86-64-v3 baseline (Haswell+ / # Excavator+), matching upstream reth. `+pclmulqdq` enables carry-less # multiply (used by keccak/GHASH) which v3 does not auto-imply. Pre-2013 # Intel and pre-2015 AMD CPUs will SIGILL on these binaries. build-x86_64-unknown-linux-gnu: RUSTFLAGS_ARCH = -C target-cpu=x86-64-v3 -C target-feature=+pclmulqdq # aarch64 needs larger jemalloc page size (64KB pages on some ARM systems) build-aarch64-unknown-linux-gnu: export JEMALLOC_SYS_WITH_LG_PAGE=16🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 43 - 50, Make treats the target names build-x86_64-unknown-linux-gnu and build-aarch64-unknown-linux-gnu as file targets which can be shadowed by same-named artifacts and triggers checkmake warnings; mark these targets as phony by adding them to a .PHONY declaration (e.g., include build-x86_64-unknown-linux-gnu and build-aarch64-unknown-linux-gnu in the .PHONY list) so make won't treat them as files and the warning is silenced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@Makefile`:
- Around line 43-50: Make treats the target names build-x86_64-unknown-linux-gnu
and build-aarch64-unknown-linux-gnu as file targets which can be shadowed by
same-named artifacts and triggers checkmake warnings; mark these targets as
phony by adding them to a .PHONY declaration (e.g., include
build-x86_64-unknown-linux-gnu and build-aarch64-unknown-linux-gnu in the .PHONY
list) so make won't treat them as files and the warning is silenced.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6c3f384e-6456-4f66-9722-e75c7e5d574e
📒 Files selected for processing (6)
.github/workflows/release.ymlCargo.tomlDockerfileMakefileMakefileEc2.mkcrates/evm/src/block/curie.rs
Bench on M4 Pro across OLD/NEW(release)/MAXPERF/PROFILING binaries revealed maxperf hits a hard trade-off on ERC20 workloads: metric new(release) maxperf profiling eth median TPS 136,567 149,159 139,734 erc20 median TPS 102,667 94,517 (-10%)103,496 erc20 import p99 12.9ms 26.8ms 10.2ms erc20 import p99.9 23.8ms 90.1ms 21.3ms erc20 import max 59.6ms 456.7ms 39.4ms erc20 >50ms slow 0.017% 0.365% 0.000% Maxperf's +9% eth gain is not worth -10% ERC20 median + 18x slow-block rate vs profiling. Profiling (thin LTO + cgu=16) captures ~80% of the hot-path optimization with zero ERC20 long tail regression, AND ships line-table debug info — exactly what prod needs for incident triage. Switch all three production paths (Dockerfile / MakefileEc2.mk / tag-push release workflow) to default `profiling`. Maxperf remains selectable via override for eth-heavy reference builds.
Update 2026-04-24: switching prod default from
|
| metric | NEW (release) | MAXPERF | PROFILING |
|---|---|---|---|
| eth-transfer median TPS | 136,567 | 149,159 | 139,734 (+2.3%) |
| erc20-transfer median TPS | 102,667 | 94,517 (-10%) | 103,496 |
| erc20 import_ms p99 | 12.9 | 26.8 | 10.2 ⭐ |
| erc20 import_ms p99.9 | 23.8 | 90.1 | 21.3 ⭐ |
| erc20 import_ms max | 59.6 | 456.7 | 39.4 ⭐ |
| erc20 >50ms slow blocks | 0.017% | 0.365% ⚠ | 0.000% ⭐ |
maxperf (fat LTO + cgu=1) wins eth-transfer by +9% but regresses ERC20 median TPS by 10% and explodes the import_ms long tail (max 456ms, 0.36% slow blocks). Root cause: fat LTO inlining + single-cgu register pressure penalize cold paths in the reth engine-tree single-thread; high producer block-rate (41 blocks/s erc20 vs profiling's 27) amplifies stall-window probability.
profiling (thin LTO + cgu=16, line-table debug) captures most of maxperf's hot-path gain, has zero ERC20 long tail regression, AND keeps line-table debug info — exactly what we want for prod incident triage.
maxperf is preserved as an override option for any future eth-only reference builds.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
MakefileEc2.mk (1)
30-30:RUSTFLAGS=…clobbers any pre-existingRUSTFLAGSfrom the environment.If an operator or outer CI job exports
RUSTFLAGS(e.g., linker overrides, split-debuginfo tweaks), this assignment silently drops it. Append instead so architecture flags compose with the caller's environment.♻️ Preserve inherited RUSTFLAGS
- CARGO_NET_GIT_FETCH_WITH_CLI=true RUSTFLAGS="$(RUSTFLAGS_ARCH)" cargo build --bin $(BINARY) --profile "$(PROFILE)" --target-dir "$(CARGO_TARGET_DIR)" + CARGO_NET_GIT_FETCH_WITH_CLI=true RUSTFLAGS="$(RUSTFLAGS_ARCH) $$RUSTFLAGS" cargo build --bin $(BINARY) --profile "$(PROFILE)" --target-dir "$(CARGO_TARGET_DIR)"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@MakefileEc2.mk` at line 30, The current build command overwrites any environment RUSTFLAGS by setting RUSTFLAGS="$(RUSTFLAGS_ARCH)"; change it to merge the caller's RUSTFLAGS with the arch-specific flags instead—export or set RUSTFLAGS to combine the existing RUSTFLAGS and RUSTFLAGS_ARCH when invoking cargo (the line that runs cargo build --bin $(BINARY) --profile "$(PROFILE)" --target-dir "$(CARGO_TARGET_DIR)" should pass RUSTFLAGS composed from both RUSTFLAGS and RUSTFLAGS_ARCH so callers' flags are preserved).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@MakefileEc2.mk`:
- Line 30: The current build command overwrites any environment RUSTFLAGS by
setting RUSTFLAGS="$(RUSTFLAGS_ARCH)"; change it to merge the caller's RUSTFLAGS
with the arch-specific flags instead—export or set RUSTFLAGS to combine the
existing RUSTFLAGS and RUSTFLAGS_ARCH when invoking cargo (the line that runs
cargo build --bin $(BINARY) --profile "$(PROFILE)" --target-dir
"$(CARGO_TARGET_DIR)" should pass RUSTFLAGS composed from both RUSTFLAGS and
RUSTFLAGS_ARCH so callers' flags are preserved).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fc873d50-2bfe-4461-b252-d53fb0bc1db3
📒 Files selected for processing (3)
.github/workflows/release.ymlDockerfileMakefileEc2.mk
🚧 Files skipped from review as they are similar to previous changes (1)
- Dockerfile
Summary
morph-reth's root
Cargo.tomlpreviously defined no[profile.*]overrides, so release builds fell back to Cargo defaults (lto = off) — strictly worse than upstream reth's ownrelease. This PR defines an explicit profile gradient and wires every production-deploy path tomaxperffor peak runtime throughput.4 commits
chore(evm): silence clippy unnecessary_sort_by in curie.rs test— pre-existing clippy 1.95 error onmainthat blocks CI's lint job. Trivial fix, kept first so the rest of the PR can pass lint.chore(cargo): add workspace build profile gradient— definesdev/hivetests/release/profiling/bench/maxperf. Localmake buildstill usesrelease, which now transparently benefits from thin LTO.chore(deploy): default Docker and EC2 builds to maxperf—DockerfileBUILD_PROFILE=maxperf,MakefileEc2.mkPROFILE ?= maxperf. Fat LTO + single codegen unit → 2–5% runtime gain for the L2 execution hot loop.ci(release): parameterize cargo profile; default tag-push to maxperf— release workflow no longer hardcodestarget/.../release/. Push-tag →maxperf.workflow_dispatch→profileinput, defaultprofilingso ad-hoc release builds keep symbols.Profile matrix after this PR
make build(local dev)release(unchanged)ghcr.io/.../morph-reth)maxperfprofilingon-demand if an incident needs flame graphsMakefileEc2.mk)maxperfmaxperfprofiling(user-selectable)Performance impact vs. current
mainProd binaries move from Cargo-default
release(LTO off) → newmaxperf(fat LTO + codegen-units=1): expect 3–8% faster block execution on production nodes, with ~2–3× longer link time absorbed one-off at image build.What's NOT changed
[workspace.lints]— untouched.MakefiledefaultPROFILE ?= release— untouched. Dailymake build/make test/make lintbehave identically to today.Test plan
cargo metadataparses with new profile sectioncargo fmt --all -- --checkpassescargo clippy --all --all-targets -- -D warningspasses (curie fix lands first)lint/test/buildworkflows green on the PRworkflow_dispatchwith defaultprofile=profilingproduces a valid tarballBUILD_PROFILE=maxperfcompletes in an acceptable windowSummary by CodeRabbit
Chores
Tests