test(surfpool): native cargo runner via build.rs + per-IDL macro #285

Draft
shahan-khatchadourian-anchorage wants to merge 15 commits into main from shahankhatch/surfpool-rust-runner

Conversation

@shahan-khatchadourian-anchorage
Contributor

Summary

Stacked on #202. Replaces the bash runner (scripts/surfpool_fuzz_all_idls.sh) with a native Rust setup so cargo test itself enumerates the per-IDL roundtrip tests.

  • chain_parsers/visualsign-solana/build.rs runs cargo metadata at build time, locates solana_parser's IDL directory (the one matching the rev in Cargo.lock), and emits it as cargo:rustc-env=SOLANA_IDL_DIR. Tests pick it up via env!() — no runtime cargo invocation.
  • A macro_rules! macro in tests/surfpool_fuzz.rs generates one #[tokio::test] per IDL (surfpool_idl_<name>). Cargo's harness handles enumeration, parallelism, pass/fail counts, and per-test failure diagnostics — the things the bash script was reimplementing.
  • The surfpool test path now distinguishes the three failure modes (decode rejected, no instructions, missing discriminator) with messages that name the IDL file. Previously a malformed IDL produced a misleading "no discriminator" panic.
  • collision.json and cyclic.json are excluded: they're solana_parser's negative test fixtures (duplicate type names / cyclic type refs) intentionally rejected by decode_idl_data.
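The `build.rs` step in the first bullet can be sketched as follows. This is a hedged, self-contained stand-in: the hard-coded path and the `solana_idl_dir` helper are hypothetical, since the real script parses `cargo metadata --format-version 1` output to locate `solana_parser`'s checkout.

```rust
// build.rs sketch. The real implementation parses `cargo metadata
// --format-version 1` and picks the solana_parser package whose source
// rev matches Cargo.lock; the hard-coded path below is a hypothetical
// stand-in so the example is self-contained.
fn solana_idl_dir() -> String {
    String::from("/cargo/checkouts/solana_parser/idls") // hypothetical
}

fn main() {
    // Cargo consumes this directive at build time; the value becomes a
    // compile-time env var readable in tests via env!("SOLANA_IDL_DIR").
    println!("cargo:rustc-env=SOLANA_IDL_DIR={}", solana_idl_dir());
}
```

Because the value is baked in at compile time, the tests never shell out to cargo at runtime.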

CI change

The workflow drops the bash invocation and runs:

cargo test -p visualsign-solana --test surfpool_fuzz -- --ignored --test-threads=1

--test-threads=1 serialises surfpool spawns so each test owns its mainnet fork.

Documentation

Adds a Testing Patterns bullet in CLAUDE.md covering the macro-based pattern and how to add a new IDL.

Test plan

Why this PR

Addresses the third-party reviewer comment from #202 ("you might be able to do all this in Rust itself"). The narrow interpretation was already done there via the idl-meta Rust tool; this PR closes the loop by removing the bash orchestrator entirely.

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Use `surfpool start` subcommand with correct flags (--port, --ws-port,
  --rpc-url, --no-tui, --ci) instead of obsolete top-level flags
- Remove ledger_path/reset_ledger config (not surfpool concepts)
- Use multi_thread tokio runtime in tests (required by solana-rpc-client)
- All 15 IDLs pass surfpool integration tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wires the existing `HELIUS_API_KEY` repo secret into a label-gated CI job
that runs `scripts/surfpool_fuzz_all_idls.sh` against all 13 embedded IDLs.
The Solana parser's `SurfpoolConfig::default()` already prefers
`HELIUS_API_KEY` over the public mainnet endpoint, so passing the secret
through the job env is sufficient -- no code changes required.

Modelled on `fuzz-solana.yml` and `proptest-solana.yml`: label-gated to
control RPC quota usage, caches `~/.cargo/bin/surfpool` to skip the
~minute-long install on subsequent runs, and tags the PR with
`surfpool-failure` on a red run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses code review on the prior commit:

1. Pin surfpool to tag v1.1.1 (matches the version verified locally
   against the Helius mainnet fork) and key the cached binary on the
   pinned version. The previous cache used `src/Cargo.lock` hash, which
   is decoupled from surfpool's upstream and would silently pin CI to
   the first-cached binary indefinitely. Splitting the binary cache from
   the workspace cache also lets each invalidate independently.

2. Skip the job on fork PRs (head repo != base repo). Secrets are not
   passed to fork PRs, so HELIUS_API_KEY would be empty and the labeling
   step would lack `gh pr edit` permissions; both would silently fall
   back to flaky public mainnet and a no-op label step.

3. Use `surfpool --version` instead of `command -v surfpool` to gate
   reinstall. A corrupted cached binary (interrupted previous install,
   ABI mismatch after toolchain bump) is now detected and rebuilt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… install

Three fixes addressing outstanding review comments and the failing CI:

1. CI install command -- the `surfpool` package was renamed to
   `surfpool-cli` upstream, so `cargo install surfpool --tag v1.1.1`
   fails with "could not find `surfpool` ... with version `*`". Drop the
   package name argument so cargo uses the workspace's `default-members`
   (`crates/cli`), which produces the `surfpool` binary regardless of
   future package renames within the workspace.

2. `wait_ready` no longer blocks a Tokio worker. The synchronous
   `RpcClient::get_version()` is now dispatched via `spawn_blocking` and
   the inter-attempt delay uses `tokio::time::sleep` instead of
   `thread::sleep`. Without this, a stalled RPC could starve other tasks
   on the same multi-thread runtime.

3. `surfpool_fuzz_all_idls.sh` now prints the captured cargo output on
   failure (both the no-summary and `N failed` paths). Previously a red
   IDL row printed only `FAIL (...)` and the diagnostics were silently
   dropped, making CI failures hard to debug.
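Point 2's retry loop has roughly this shape. The sketch below is std-only and the probe closure, timings, and function name are illustrative; the real code wraps the synchronous `RpcClient::get_version()` in `tokio::task::spawn_blocking` and waits with `tokio::time::sleep`, so a stalled RPC cannot pin a runtime worker.

```rust
use std::time::{Duration, Instant};

// Hedged sketch of the wait_ready retry shape (std-only analogue; the
// async version offloads the blocking probe via spawn_blocking and
// sleeps with tokio::time::sleep instead of thread::sleep).
fn wait_ready(mut probe: impl FnMut() -> bool, timeout: Duration, interval: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    while Instant::now() < deadline {
        if probe() {
            return true; // RPC answered: surfpool is up
        }
        std::thread::sleep(interval); // real code: tokio::time::sleep(interval).await
    }
    false // gave up: surfpool never became ready within the timeout
}

fn main() {
    let mut calls = 0;
    let ok = wait_ready(
        || { calls += 1; calls >= 3 }, // succeeds on the third attempt
        Duration::from_secs(1),
        Duration::from_millis(1),
    );
    assert!(ok);
    assert_eq!(calls, 3);
}
```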

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tags are mutable -- a force-push could silently change what we install,
and `--locked` only protects transitive deps (it uses the source's
Cargo.lock; it does not lock the source commit). Pin to the SHA
`v1.1.1` resolved to (d58df4c) so the install target is bit-for-bit
reproducible. The cache key is now derived from the pinned rev.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The toolchain action exports `RUSTFLAGS: -D warnings` by default, which
made the `cargo install` step fail on a benign `unused_parens` lint in
upstream `surfpool-gql` v1.1.1. Deny-warnings is the right policy for
our own code but wrong for installing a third-party binary at a pinned
revision -- whether it compiles cleanly under future rustc is not
something we want CI to gate on.

Set `RUSTFLAGS=--cap-lints=warn` for just the install step so upstream
warnings stay warnings without affecting any subsequent step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the ~10-minute `cargo install` compile with a ~5-second curl
of the upstream release asset. Pins the tarball's SHA256 in the workflow
so a force-pushed release asset cannot silently change what we install.

The binary is dropped into `~/.local/bin/` (added to `GITHUB_PATH` for
subsequent steps). Cache is keyed on (version, sha256) so a checksum
bump invalidates automatically.

Drops `RUSTFLAGS=--cap-lints=warn` from the install step; that was only
needed to work around upstream lint regressions during compile.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The workflow toggles a `surfpool-failure` label on PRs. With the prior
`if: contains(labels, 'surfpool')` guard, that toggle re-fires the
workflow on the same PR (since the `surfpool` label is still present
when the `surfpool-failure` label is added). Any other `labeled` event
on a PR that happens to carry the `surfpool` label would also re-trigger
unnecessarily.

Match the pattern used in `.github/workflows/stagex.yml`: on `labeled`
events only run when the label being added is `surfpool`; on other
events (`opened`, `synchronize`, `reopened`) require the label to
already be present.

Addresses copilot review comment on #253 (which proposed the same
guard); folded here so #253 can be closed with the workflow's review
comments resolved on this PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the bash runner (`scripts/surfpool_fuzz_all_idls.sh`) with a
fully native Rust setup:

- `build.rs` runs `cargo metadata` once, locates `solana_parser`'s IDL
  directory (matching the rev in `Cargo.lock`), and emits it as
  `cargo:rustc-env=SOLANA_IDL_DIR`. Tests pick it up via `env!()` with
  no runtime cargo invocation.

- A `macro_rules!` macro generates one `#[tokio::test]` per IDL
  (`surfpool_idl_<name>`). Cargo's harness handles enumeration,
  parallelism, pass/fail counts, and per-test failure output -- the
  things the bash script was reimplementing by hand.

- `collision.json` and `cyclic.json` are excluded: they're solana_parser's
  own negative test fixtures (duplicate type names / cyclic type refs)
  and are intentionally rejected by `decode_idl_data`.

- The surfpool test now distinguishes the three failure modes (decode
  rejected, no instructions, missing discriminator) with messages that
  name the IDL file -- previously a malformed IDL produced a misleading
  "no discriminator" panic.

CI now invokes `cargo test ... --test surfpool_fuzz -- --ignored
--test-threads=1` directly. The 14 tests run serially in ~9s end-to-end
on the local Helius pin.

Documents the pattern in `CLAUDE.md`'s Testing Patterns section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e IDL lookup

`solana_parser` already exposes each IDL as a `pub const &str` via
`solana::embedded_idls::*` (compiled in via `include_str!`). Use those
consts directly in the per-IDL macro instead of resolving a filesystem
path at build time.

Net deletions:
- `build.rs::emit_solana_idl_dir()` (the `cargo metadata` invocation)
- `[build-dependencies] serde_json` from `visualsign-solana/Cargo.toml`
- `env!("SOLANA_IDL_DIR")` and the `std::fs::read_to_string` per test

Resolves the duplicated locate-IDL-dir logic the code review flagged
between `tools/idl-meta::locate_idl_dir` and `build.rs::emit_solana_idl_dir`.
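The swap can be illustrated as follows. The const name and JSON body are hypothetical; in the crate the consts are produced with `include_str!`, inlined here so the example is self-contained.

```rust
mod embedded_idls {
    // In solana_parser the real const looks like:
    //   pub const JUPITER: &str = include_str!("idls/jupiter.json");
    pub const JUPITER: &str = r#"{"instructions":[{"name":"swap"}]}"#; // hypothetical
}

fn main() {
    // Before: std::fs::read_to_string over env!("SOLANA_IDL_DIR") per test.
    // After: the IDL text is already a compile-time &'static str.
    let idl: &'static str = embedded_idls::JUPITER;
    assert!(idl.contains("instructions"));
}
```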

14 tests pass locally in ~9s (`HELIUS_API_KEY=<key> cargo test ... --
--ignored --test-threads=1`).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>