Skip to content

chore: add workspace build profile gradient and wire prod deploys to profiling#104

Open
panos-xyz wants to merge 6 commits intomainfrom
chore/add-build-profiles
Open

chore: add workspace build profile gradient and wire prod deploys to profiling#104
panos-xyz wants to merge 6 commits intomainfrom
chore/add-build-profiles

Conversation

@panos-xyz
Copy link
Copy Markdown
Contributor

@panos-xyz panos-xyz commented Apr 24, 2026

Summary

morph-reth's root Cargo.toml previously defined no [profile.*] overrides, so release builds fell back to Cargo defaults (lto = off) — strictly worse than upstream reth's own release. This PR defines an explicit profile gradient and wires every production-deploy path to maxperf for peak runtime throughput.

4 commits

  1. chore(evm): silence clippy unnecessary_sort_by in curie.rs test — pre-existing clippy 1.95 error on main that blocks CI's lint job. Trivial fix, kept first so the rest of the PR can pass lint.
  2. chore(cargo): add workspace build profile gradient — defines dev / hivetests / release / profiling / bench / maxperf. Local make build still uses release, which now transparently benefits from thin LTO.
  3. chore(deploy): default Docker and EC2 builds to maxperfDockerfile BUILD_PROFILE=maxperf, MakefileEc2.mk PROFILE ?= maxperf. Fat LTO + single codegen unit → 2–5% runtime gain for the L2 execution hot loop.
  4. ci(release): parameterize cargo profile; default tag-push to maxperf — release workflow no longer hardcodes target/.../release/. Push-tag → maxperf. workflow_dispatchprofile input, default profiling so ad-hoc release builds keep symbols.

Profile matrix after this PR

Artifact Profile Why
make build (local dev) release (unchanged) Fast link, good iteration loop; now includes thin LTO for free
Docker images (ghcr.io/.../morph-reth) maxperf Peak throughput in prod; rebuild with profiling on-demand if an incident needs flame graphs
S3-deployed EC2 binaries (MakefileEc2.mk) maxperf Same as Docker — all prod artifacts share one profile
GitHub Release tarballs (tag push) maxperf Peak throughput for public consumers
GitHub Release tarballs (manual dispatch) profiling (user-selectable) Flame-graphable debug artifacts

Performance impact vs. current main

Prod binaries move from Cargo-default release (LTO off)new maxperf (fat LTO + codegen-units=1): expect 3–8% faster block execution on production nodes, with ~2–3× longer link time absorbed one-off at image build.

What's NOT changed

  • [workspace.lints] — untouched.
  • Makefile default PROFILE ?= release — untouched. Daily make build / make test / make lint behave identically to today.
  • Everyday iteration loop — no impact.

Test plan

  • cargo metadata parses with new profile section
  • cargo fmt --all -- --check passes
  • cargo clippy --all --all-targets -- -D warnings passes (curie fix lands first)
  • CI lint / test / build workflows green on the PR
  • Dry-run release via workflow_dispatch with default profile=profiling produces a valid tarball
  • Spot-check a Docker image build with BUILD_PROFILE=maxperf completes in an acceptable window

Summary by CodeRabbit

  • Chores

    • Added multiple Cargo build profiles (dev, hivetests, profiling, bench, maxperf, revised release) for targeted optimization and analysis.
    • Build tooling now resolves and honors a selected build profile during CI, packaging, and local builds; production default switched to the profiling profile.
    • Introduced conditional x86-64 CPU-specific compiler flags and corresponding build argument handling across Docker and make-based cross-builds.
  • Tests

    • Fixed a test sorting step to ensure deterministic ordering in assertions.

Clippy 1.95 reports `storage.sort_by(|(a, _), (b, _)| a.cmp(b))` as
a `clippy::unnecessary_sort_by` (prefer `sort_by_key`) in the
`apply_curie_hard_fork` test. The fix is the clippy-suggested rewrite.

No behavioural change; only the test's sort comparator is reshaped.
morph-reth's root Cargo.toml previously defined no [profile.*] overrides,
so release builds fell back to Cargo defaults (lto=off) — strictly worse
than upstream reth's own release. Define an explicit profile gradient:

- profile.dev:       line-tables-only debug + unpacked split-debuginfo.
- profile.hivetests: opt-level=3 + lto=thin, inherits test (debug_assert on).
- profile.release:   opt-level=3, lto=thin, strip=symbols, codegen-units=16.
- profile.profiling: release + line-table debug + strip=none + incremental
                     — production binaries that still flamegraph cleanly.
- profile.bench:     inherits profiling.
- profile.maxperf:   release + lto=fat + codegen-units=1 for peak throughput
                     (expect a noticeably longer link stage).

Local `make build` still defaults to the `release` profile, which now
benefits from thin LTO without any user action.
Production-deploy paths (the Dockerfile that produces the container image
and MakefileEc2.mk that uploads binaries to S3 for EC2 nodes) both flip
their default Cargo profile from `release` to `maxperf`. That enables
fat LTO + a single codegen unit so the shipped binary is 2-5% faster at
runtime than the default release profile.

Tradeoff: symbols are stripped, so flame graphs and symbolicated
backtraces require a one-off `--build-arg BUILD_PROFILE=profiling` (or
`PROFILE=profiling make ...`) rebuild. Image/link time also roughly
doubles. Both are acceptable for an execution client where block-
execution throughput is the hot metric and prod incidents are rare.

Local `make build` remains on `release` so everyday iteration is
unchanged.
Previously the release workflow hardcoded `target/<target>/release/`
when packaging tarballs, which coupled it to the Makefile's `release`
default and silently broke if either was overridden.

- Add a `profile` workflow_dispatch input (default `profiling`, with
  `maxperf` and `release` as alternatives). Push-tag triggers keep
  defaulting to `maxperf` for peak throughput on public releases.
- Resolve the profile in a dedicated step and thread it through both
  `make build-<target>` (via PROFILE=) and the subsequent `cp` path
  (using the computed `profile_dir`, which handles Cargo's dev->debug
  naming quirk).
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 24, 2026

📝 Walkthrough

Walkthrough

Build and CI configuration updated to support selectable Cargo profiles and per-architecture RUSTFLAGS: new Cargo profiles added; GitHub Actions, Dockerfile, Makefile, and EC2 makefile now default to the profiling profile and pass resolved profile and architecture-specific RUSTFLAGS into build steps. One test sort refactor.

Changes

Cohort / File(s) Summary
Release Workflow
.github/workflows/release.yml
Adds a profile input and a "Resolve build profile" step to compute the effective Cargo profile and target subdirectory; non-workflow_dispatch tag pushes force profiling, workflow_dispatch uses user input; passes resolved PROFILE to make and packages binary from target/<target>/<profile_dir>/.
Cargo Profile Definitions
Cargo.toml
Adds custom Cargo profiles: dev, hivetests, release (redefined), profiling, bench, and maxperf with specific opt/debug/LTO/strip/codegen settings.
Docker Build
Dockerfile
Defaults BUILD_PROFILE to profiling; adds TARGETPLATFORM and RUSTFLAGS build args; conditionally injects -C target-cpu=x86-64-v3 and -C target-feature=+pclmulqdq for linux/amd64 when RUSTFLAGS not provided; applies conditional flags to both dependency cache and build stages.
Makefile (cross-builds)
Makefile
Introduces per-target RUSTFLAGS_ARCH; adds build-x86_64-unknown-linux-gnu target-specific RUSTFLAGS_ARCH for x86-64-v3 + pclmulqdq; generic build-% appends $(RUSTFLAGS_ARCH) to linker flags; relocates aarch64 jemalloc rule.
EC2 Makefile
MakefileEc2.mk
Changes default PROFILE to profiling; detects host arch (uname -m) and conditionally sets RUSTFLAGS_ARCH for x86_64; cargo_build_and_upload now runs cargo build with RUSTFLAGS="$(RUSTFLAGS_ARCH)" before packaging and upload.
Test tweak
crates/evm/src/block/curie.rs
Replaces a sort_by comparator with sort_by_key using the dereferenced U256 storage slot for deterministic ordering in the test.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • anylots
  • chengwenxi
  • curryxbo

Poem

🐰 I hopped through profiles, profiling in my paw,
Flags for x86-64-v3 I cheerfully gnaw,
Workflows resolve which build to compile,
Makefiles and Docker now race with a smile,
A tiny test sort fixed — hop, ship, deploy, hurrah! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The PR title mentions adding workspace build profiles and wiring production deploys, which aligns with the main changes (Cargo.toml profile definitions and workflow/Makefile updates), but contains a discrepancy: the title says 'profiling' while the PR objectives describe the actual intent as 'maxperf' for prod with follow-up changes to 'profiling'. Clarify whether the title should reference 'maxperf' or 'profiling' for production, or rephrase to be profile-agnostic since the main intent is establishing the profile gradient system.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch chore/add-build-profiles

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Mirror upstream reth release defaults: x86_64 binaries target the
x86-64-v3 baseline (Haswell+ / Excavator+, ~2013 onward) plus the
pclmulqdq extension (carry-less multiply, used by keccak/GHASH but not
auto-implied by v3). aarch64 is left untouched.

- Makefile cross-build: target-specific RUSTFLAGS_ARCH so build-x86_64-*
  picks up the flags while build-aarch64-* stays clean.
- Dockerfile: TARGETPLATFORM-conditional so multi-arch buildx builds get
  v3 only on linux/amd64.
- MakefileEc2.mk: uname -m guard so EC2 native builds opt in on x86_64
  and skip on Graviton ARM hosts.

Pre-2013 Intel and pre-2015 AMD CPUs will SIGILL on these binaries; pass
an empty RUSTFLAGS override for those hosts.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
Makefile (1)

43-50: Declare the target-specific names as .PHONY.

These aren't recipe-bearing targets, but Make still treats the names as file targets, so a stray same-named artifact would shadow rebuilds. Marking them phony also silences the checkmake warning.

♻️ Proposed fix
+.PHONY: build-x86_64-unknown-linux-gnu build-aarch64-unknown-linux-gnu
+
 # x86_64 release artifacts target the x86-64-v3 baseline (Haswell+ /
 # Excavator+), matching upstream reth. `+pclmulqdq` enables carry-less
 # multiply (used by keccak/GHASH) which v3 does not auto-imply. Pre-2013
 # Intel and pre-2015 AMD CPUs will SIGILL on these binaries.
 build-x86_64-unknown-linux-gnu: RUSTFLAGS_ARCH = -C target-cpu=x86-64-v3 -C target-feature=+pclmulqdq
 
 # aarch64 needs larger jemalloc page size (64KB pages on some ARM systems)
 build-aarch64-unknown-linux-gnu: export JEMALLOC_SYS_WITH_LG_PAGE=16
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 43 - 50, Make treats the target names
build-x86_64-unknown-linux-gnu and build-aarch64-unknown-linux-gnu as file
targets which can be shadowed by same-named artifacts and triggers checkmake
warnings; mark these targets as phony by adding them to a .PHONY declaration
(e.g., include build-x86_64-unknown-linux-gnu and
build-aarch64-unknown-linux-gnu in the .PHONY list) so make won't treat them as
files and the warning is silenced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@Makefile`:
- Around line 43-50: Make treats the target names build-x86_64-unknown-linux-gnu
and build-aarch64-unknown-linux-gnu as file targets which can be shadowed by
same-named artifacts and triggers checkmake warnings; mark these targets as
phony by adding them to a .PHONY declaration (e.g., include
build-x86_64-unknown-linux-gnu and build-aarch64-unknown-linux-gnu in the .PHONY
list) so make won't treat them as files and the warning is silenced.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6c3f384e-6456-4f66-9722-e75c7e5d574e

📥 Commits

Reviewing files that changed from the base of the PR and between faec4c8 and 72e2a4f.

📒 Files selected for processing (6)
  • .github/workflows/release.yml
  • Cargo.toml
  • Dockerfile
  • Makefile
  • MakefileEc2.mk
  • crates/evm/src/block/curie.rs

Bench on M4 Pro across OLD/NEW(release)/MAXPERF/PROFILING binaries
revealed maxperf hits a hard trade-off on ERC20 workloads:

  metric              new(release)  maxperf       profiling
  eth median TPS      136,567       149,159       139,734
  erc20 median TPS    102,667        94,517 (-10%)103,496
  erc20 import p99    12.9ms         26.8ms       10.2ms
  erc20 import p99.9  23.8ms         90.1ms       21.3ms
  erc20 import max    59.6ms        456.7ms       39.4ms
  erc20 >50ms slow    0.017%         0.365%       0.000%

Maxperf's +9% eth gain is not worth -10% ERC20 median + 18x slow-block
rate vs profiling. Profiling (thin LTO + cgu=16) captures ~80% of the
hot-path optimization with zero ERC20 long tail regression, AND ships
line-table debug info — exactly what prod needs for incident triage.

Switch all three production paths (Dockerfile / MakefileEc2.mk / tag-push
release workflow) to default `profiling`. Maxperf remains selectable via
override for eth-heavy reference builds.
@panos-xyz panos-xyz changed the title chore: add workspace build profile gradient and wire prod deploys to maxperf chore: add workspace build profile gradient and wire prod deploys to profiling Apr 24, 2026
@panos-xyz
Copy link
Copy Markdown
Contributor Author

Update 2026-04-24: switching prod default from maxperfprofiling

Pushed 6f7b550 which flips the default for Dockerfile, MakefileEc2.mk, and the tag-push release workflow.

Why: Ran a 4-binary openloop bench on M4 Pro (OLD b22219c / NEW feat HEAD release / MAXPERF HEAD + this PR / PROFILING HEAD + this PR), 2 workloads × 2 runs × 120s @ 200k target TPS:

metric NEW (release) MAXPERF PROFILING
eth-transfer median TPS 136,567 149,159 139,734 (+2.3%)
erc20-transfer median TPS 102,667 94,517 (-10%) 103,496
erc20 import_ms p99 12.9 26.8 10.2
erc20 import_ms p99.9 23.8 90.1 21.3
erc20 import_ms max 59.6 456.7 39.4
erc20 >50ms slow blocks 0.017% 0.365% 0.000%

maxperf (fat LTO + cgu=1) wins eth-transfer by +9% but regresses ERC20 median TPS by 10% and explodes the import_ms long tail (max 456ms, 0.36% slow blocks). Root cause: fat LTO inlining + single-cgu register pressure penalize cold paths in the reth engine-tree single-thread; high producer block-rate (41 blocks/s erc20 vs profiling's 27) amplifies stall-window probability.

profiling (thin LTO + cgu=16, line-table debug) captures most of maxperf's hot-path gain, has zero ERC20 long tail regression, AND keeps line-table debug info — exactly what we want for prod incident triage.

maxperf is preserved as an override option for any future eth-only reference builds.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
MakefileEc2.mk (1)

30-30: RUSTFLAGS=… clobbers any pre-existing RUSTFLAGS from the environment.

If an operator or outer CI job exports RUSTFLAGS (e.g., linker overrides, split-debuginfo tweaks), this assignment silently drops it. Append instead so architecture flags compose with the caller's environment.

♻️ Preserve inherited RUSTFLAGS
-	CARGO_NET_GIT_FETCH_WITH_CLI=true RUSTFLAGS="$(RUSTFLAGS_ARCH)" cargo build --bin $(BINARY) --profile "$(PROFILE)" --target-dir "$(CARGO_TARGET_DIR)"
+	CARGO_NET_GIT_FETCH_WITH_CLI=true RUSTFLAGS="$(RUSTFLAGS_ARCH) $$RUSTFLAGS" cargo build --bin $(BINARY) --profile "$(PROFILE)" --target-dir "$(CARGO_TARGET_DIR)"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@MakefileEc2.mk` at line 30, The current build command overwrites any
environment RUSTFLAGS by setting RUSTFLAGS="$(RUSTFLAGS_ARCH)"; change it to
merge the caller's RUSTFLAGS with the arch-specific flags instead—export or set
RUSTFLAGS to combine the existing RUSTFLAGS and RUSTFLAGS_ARCH when invoking
cargo (the line that runs cargo build --bin $(BINARY) --profile "$(PROFILE)"
--target-dir "$(CARGO_TARGET_DIR)" should pass RUSTFLAGS composed from both
RUSTFLAGS and RUSTFLAGS_ARCH so callers' flags are preserved).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@MakefileEc2.mk`:
- Line 30: The current build command overwrites any environment RUSTFLAGS by
setting RUSTFLAGS="$(RUSTFLAGS_ARCH)"; change it to merge the caller's RUSTFLAGS
with the arch-specific flags instead—export or set RUSTFLAGS to combine the
existing RUSTFLAGS and RUSTFLAGS_ARCH when invoking cargo (the line that runs
cargo build --bin $(BINARY) --profile "$(PROFILE)" --target-dir
"$(CARGO_TARGET_DIR)" should pass RUSTFLAGS composed from both RUSTFLAGS and
RUSTFLAGS_ARCH so callers' flags are preserved).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fc873d50-2bfe-4461-b252-d53fb0bc1db3

📥 Commits

Reviewing files that changed from the base of the PR and between 72e2a4f and 6f7b550.

📒 Files selected for processing (3)
  • .github/workflows/release.yml
  • Dockerfile
  • MakefileEc2.mk
🚧 Files skipped from review as they are similar to previous changes (1)
  • Dockerfile

Comment thread MakefileEc2.mk
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant