End-to-end test harness for the Routstr stack.
make sync
This clones (or fast-forwards) routstr-core, routstrd, and routstr-cli into vendor/
and writes vendor/COMMITS.txt with the pinned commit hashes.
cp .env.example .env
Set in .env (all gitignored):
- E2E_CASHU_TOKEN: a funded cashu token, for payment tests.
- CASHU_MINTS: the mint the token is from (e.g. https://mint.chorus.community). Both nodes must trust it.
- OPENROUTER_API_KEY (+ optional OPENROUTER_REFERER): to point the nodes at a real upstream. The node auto-seeds an openrouter provider at startup when this is present.
- NODE_A_ADMIN_PASSWORD / NODE_B_ADMIN_PASSWORD: admin password per node (default test-admin-pw), needed for routstr-cli config.
make up
The webui compose service currently fails to build (corepack/pnpm on the node20 base image), which aborts make up. Bring up only the core services until that's fixed, and build the UI on the host (node 22 / pnpm 10) instead:
docker compose up -d --build relay mock-openai node-a node-b routstrd cli-runner
# UI: make serve (host build, single origin :8000; see "Deploying the Web UI")
make test
Or drive a single scenario through the orchestrator and persist results to runs.db:
python -m runner.orchestrate --scenario smoke --token <cashu-token>
# or
make orchestrate SCENARIO=smoke TOKEN=cashuA...
Useful env flags:
- SKIP_SYNC=1: skip scripts/sync.sh
- KEEP_UP=1: leave compose services running after the run for debugging
The orchestrator writes one row to runs and one row per test to test_results in runs.db (SQLite via SQLModel). Logs land in logs/<run-timestamp>/.
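To check what a run wrote, runs.db can be opened with Python's built-in sqlite3 module. The sketch below assumes only the two table names mentioned above (runs and test_results) and deliberately avoids column names, since the SQLModel schema in runner/models.py is the source of truth. The helper name row_counts is hypothetical and not part of the harness.

```python
import sqlite3


def row_counts(conn: sqlite3.Connection, tables=("runs", "test_results")) -> dict:
    """Return {table: row count} for each requested table that exists."""
    existing = {
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    counts = {}
    for table in tables:
        # Only query names that sqlite_master confirms, so a missing table
        # is reported as absent instead of raising OperationalError.
        if table in existing:
            counts[table] = conn.execute(
                f'SELECT COUNT(*) FROM "{table}"'
            ).fetchone()[0]
    return counts
```

Usage, assuming runs.db is in the current directory: `row_counts(sqlite3.connect("runs.db"))`. After a single run, the runs count should have grown by one and test_results by the number of tests in the scenario.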
make logs
make down
The orchestrator can point the test suite at externally-deployed routstr
nodes instead of building node-a / node-b from vendor/routstr-core/:
python -m runner.orchestrate \
--scenario smoke \
--target-profile remote \
--remote-node-urls https://node1.example,https://node2.example
In remote mode:
- docker compose up is skipped; your deployment isn't touched.
- TARGET_PROFILE=remote, REMOTE_NODE_URLS=..., and ROUTSTRD_BOOTSTRAP_PROVIDERS=... are exported into pytest's env. The routstrd seed-providers step picks up ROUTSTRD_BOOTSTRAP_PROVIDERS so the daemon routes through the remote nodes.
- The tests/conftest.py skip-rule auto-skips any test tagged @pytest.mark.destructive, and skips @pytest.mark.admin_required tests unless at least one REMOTE_NODE_ADMIN_TOKEN_<i> env var is set.
- The resulting runs row carries target_profile=remote and remote_node_urls_json. Admin tokens are never persisted.
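The skip rule described above can be expressed as a small pure function. This is a minimal sketch of the decision logic only, not the actual contents of tests/conftest.py; the function name remote_skip_reason and the reason strings are assumptions.

```python
import re
from typing import Mapping, Optional

# Matches REMOTE_NODE_ADMIN_TOKEN_0, REMOTE_NODE_ADMIN_TOKEN_1, ...
ADMIN_TOKEN_VAR = re.compile(r"^REMOTE_NODE_ADMIN_TOKEN_\d+$")


def remote_skip_reason(markers: set, env: Mapping[str, str]) -> Optional[str]:
    """Return a skip reason for a test in remote mode, or None to run it."""
    if env.get("TARGET_PROFILE") != "remote":
        return None  # the rule only applies to remote runs
    if "destructive" in markers:
        return "destructive tests never run against a remote deployment"
    # A token variable counts only if it is set to a non-empty value.
    has_admin_token = any(
        ADMIN_TOKEN_VAR.match(key) and value for key, value in env.items()
    )
    if "admin_required" in markers and not has_admin_token:
        return "admin_required needs at least one REMOTE_NODE_ADMIN_TOKEN_<i>"
    return None
```

In a conftest.py, a pytest_collection_modifyitems hook could call this with `{m.name for m in item.iter_markers()}` and `os.environ`, then add `pytest.mark.skip(reason=...)` to the item whenever a reason comes back.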
Pass per-node admin tokens via env (preferred) or --remote-admin-tokens
(local dev only — argv is visible in ps):
REMOTE_NODE_ADMIN_TOKEN_0=secret1 REMOTE_NODE_ADMIN_TOKEN_1=secret2 \
python -m runner.orchestrate --scenario smoke \
--target-profile remote \
--remote-node-urls https://node1.example,https://node2.example
The Web UI Run modal exposes the same fields: a target_profile dropdown,
a node-URLs textarea, and a masked admin-token field per node. The Runs
table shows the profile badge per row and a filter in the header.
By default the routstr nodes talk to the in-compose mock-openai container
(upstream_profile=mock). You can instead point them at a real upstream LLM
provider — OpenAI, Anthropic, OpenRouter, Groq, Together, Fireworks — by
selecting a profile from providers/:
UPSTREAM_PROFILE=openai OPENAI_API_KEY=sk-... \
python -m runner.orchestrate --scenario openai_chat_completions
What happens:
- The orchestrator loads providers/openai.yaml and validates that every required_env (here OPENAI_API_KEY) is set, bailing with a clear error before any stack bring-up if not.
- It checks the scenario's estimated_upstream_cost_usd against UPSTREAM_MAX_USD (default $1.00) and refuses to start if over budget.
- It exports UPSTREAM_BASE_URL, UPSTREAM_API_KEY, and UPSTREAM_MODELS_PATH so node-a / node-b route to the real provider, then runs the scenario's real_upstream-tagged tests (default-skipped under mock).
- The run row records upstream_profile=openai, upstream_estimated_cost_usd, and a best-effort upstream_actual_cost_usd priced from the provider's model catalog. Provider API keys are never persisted.
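The first two steps above (required-env validation and the cost gate) amount to a pre-flight check that runs before any container starts. The following is a rough sketch of that check, not the code in runner/providers.py or runner/cost.py. The function name preflight and the wording of the missing-env error are assumptions; the over-budget message follows the format this README shows for a blocked run.

```python
from typing import Mapping, Sequence


def preflight(
    required_env: Sequence[str],
    estimated_cost_usd: float,
    env: Mapping[str, str],
) -> None:
    """Raise SystemExit (non-zero exit) if the run must not start."""
    # Step 1: every variable named in the profile's required_env must be set.
    missing = [name for name in required_env if not env.get(name)]
    if missing:
        raise SystemExit(
            f"missing required env for upstream profile: {', '.join(missing)}"
        )
    # Step 2: compare the scenario estimate with the ceiling (default $1.00).
    max_usd = float(env.get("UPSTREAM_MAX_USD", "1.00"))
    if estimated_cost_usd > max_usd:
        raise SystemExit(
            f"estimated upstream cost ${estimated_cost_usd:.4f} "
            f"exceeds UPSTREAM_MAX_USD ${max_usd:.4f}"
        )
```

Raising SystemExit with a string prints the message to stderr and exits with status 1, which matches the "bail before bring-up" behaviour described above.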
Cost controls:
# Block a scenario whose estimated cost exceeds the ceiling:
UPSTREAM_PROFILE=openai OPENAI_API_KEY=sk-... UPSTREAM_MAX_USD=0.001 \
python -m runner.orchestrate --scenario openai_chat_completions
# → exits non-zero: "estimated upstream cost $0.0100 exceeds UPSTREAM_MAX_USD $0.0010"
The profile/target matrix:
| target | upstream | runs |
|---|---|---|
| local | mock | everything (current default) |
| local | real provider | everything; real_upstream tests charge |
| remote | mock | ROU-151 read-only flow (invalid for real_upstream tests) |
| remote | real provider | safe_for_remote ∪ real_upstream, cost-gated |
In the Web UI Run modal, an "Upstream provider" dropdown (populated from
GET /api/providers) exposes masked, write-only key fields for the selected
provider and a cost preview (red when over UPSTREAM_MAX_USD). The Runs table
gains an "Upstream" column; the Run detail shows the resolved profile and the
estimated / actual USD spend. See providers/README.md
to add a provider.
Beyond the smoke / real-upstream scenarios, the harness ships integration
scenarios under tests/integration/ (driven via the orchestrator, results in
runs.db + the Runs UI). The stack must be up (make up) and KEEP_UP=1 is
recommended so the orchestrator runs pytest against the already-running stack.
routstrd discovers nodes from the Nostr relay and routes each request to the
cheapest node that serves the model (lowest provider_fee, exposed at
GET /models/<id>/providers, sorted by sats_pricing.max_cost). The scenario
sets per-node fees with routstr-cli and asserts the ranking follows:
SKIP_SYNC=1 KEEP_UP=1 python -m runner.orchestrate --scenario routstrd_cheapest --token placeholder
Fee update via CLI (what the test does):
docker exec routstr-testing-cli-runner-1 bun /app/dist/index.js \
--node http://node-a:8000 providers update 1 -t <admin-token> --fee 0.3
Relay isolation: the bundled @routstr/sdk hardcodes public discovery relays with no env override, so an unpatched daemon discovers the global routstr network. vendor-dockerfiles/routstrd.Dockerfile rewrites those to the local ws://relay:8080 so only node-a / node-b are discovered.
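The ranking that the routstrd_cheapest scenario asserts (cheapest node first, ordered by sats_pricing.max_cost) can be illustrated with a few lines of Python. This is only a sketch: the response shape of GET /models/<id>/providers is assumed to be a list of objects carrying a node_url and a nested sats_pricing.max_cost, and the fee values are made up for the example.

```python
def rank_providers(providers: list) -> list:
    """Sort provider entries cheapest-first by sats_pricing.max_cost."""
    return sorted(providers, key=lambda p: p["sats_pricing"]["max_cost"])


# Hypothetical payload: field names other than sats_pricing.max_cost are assumptions.
providers = [
    {"node_url": "http://node-a:8000", "sats_pricing": {"max_cost": 12.0}},
    {"node_url": "http://node-b:8000", "sats_pricing": {"max_cost": 9.5}},
]

ranked = rank_providers(providers)
cheapest = ranked[0]  # the node a cheapest-first router would pick
```

With these illustrative numbers node-b ranks first. Raising node-b's fee above node-a's and re-reading the endpoint should flip the order, which is the behaviour the test checks.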
Real /v1/chat/completions across many models through a node, paid from a
funded ecash balance (ecash → node → openrouter → completion). Provide a funded
node api-key (or cashu token) via NODE_A_API_KEY:
NODE_A_API_KEY=sk-... SKIP_SYNC=1 KEEP_UP=1 \
python -m runner.orchestrate --scenario real_inference --token placeholder
Single-use ecash: send X-Cashu: <token> (no auth); the node redeems, charges
exact cost, and returns change in the X-Cashu response header. Provide one
funded token per model (plus one for the change test) via X_CASHU_TOKENS
(comma-separated):
X_CASHU_TOKENS=cashuB...,cashuB...,... SKIP_SYNC=1 KEEP_UP=1 \
python -m runner.orchestrate --scenario xcashu --token placeholder
Node billing is sub-sat (millisats), so the Runs table renders precise spend:
paid tests append their spend to $SPEND_REPORT_PATH (set automatically by the
orchestrator); the run records token_consumed_msats and the UI shows e.g.
349 msat or 4 sats instead of a rounded 0 sats. See
docs/PLAN-full-node-coverage.md for the
full status, findings, and known vendor bugs (incl. a node refund/X-Cashu
change-retention fund leak — avoid repeated real-money runs until fixed).
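The msat-aware rendering mentioned above (349 msat, 4 sats) could look roughly like the helper below. This is a sketch under an assumed rule: amounts under 1000 msat are shown in msat, everything else in sats with trailing zeros trimmed. The UI's real thresholds and rounding may differ, and format_spend is a hypothetical name.

```python
def format_spend(msats: int) -> str:
    """Render a millisat amount without rounding sub-sat spend down to 0."""
    if msats < 1000:
        return f"{msats} msat"
    # Integer arithmetic keeps the value exact (no float rounding).
    whole, frac = divmod(msats, 1000)
    if frac == 0:
        return f"{whole} sats"
    # frac is non-zero here, so stripping zeros never leaves a bare "4."
    return f"{whole}.{frac:03d}".rstrip("0") + " sats"
```

For example, format_spend(349) gives "349 msat" and format_spend(4000) gives "4 sats", where a sats-only display would show the first as 0.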
These scenarios resolve their targets through tests/integration/targets.py, so
they run against deployed nodes too — add --target-profile remote and node
URLs. URLs, the routstrd endpoint, admin tokens, and node identity all come from
env; tests skip (not fail) when a piece is missing.
# real_inference / xcashu — just point at a deployed node and fund it:
NODE_A_API_KEY=sk-... python -m runner.orchestrate --scenario real_inference \
--target-profile remote --remote-node-urls https://node1.example
X_CASHU_TOKENS=cashuB...,... python -m runner.orchestrate --scenario xcashu \
--target-profile remote --remote-node-urls https://node1.example
# routstrd_cheapest — needs 2 nodes you can admin + a reachable routstrd + cli-runner:
ROUTSTRD_URL=https://routstrd.example \
REMOTE_NODE_ADMIN_TOKEN_0=tok0 REMOTE_NODE_ADMIN_TOKEN_1=tok1 \
python -m runner.orchestrate --scenario routstrd_cheapest \
--target-profile remote --remote-node-urls https://node1.example,https://node2.example
Env knobs (else local defaults :8001/:8002, :8091, docker node-a/node-b,
password login): REMOTE_NODE_URLS, REMOTE_NODE_ADMIN_TOKEN_<i>,
ROUTSTRD_URL, CLI_CONTAINER. The cheapest test identifies the serving node by
URL host (works with any remote URL), not a hardcoded node-a.
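As an illustration of how env-driven target resolution with local fallbacks might work, here is a minimal sketch. It is not the code in tests/integration/targets.py. The localhost hostnames are an assumption (the README gives only the ports), and resolve_targets / serving_node_index are hypothetical names.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlsplit

# Assumed local defaults: only the ports come from the README.
DEFAULT_NODE_URLS = ("http://localhost:8001", "http://localhost:8002")
DEFAULT_ROUTSTRD_URL = "http://localhost:8091"


def resolve_targets(env: Mapping[str, str] = os.environ) -> dict:
    """Build the target description from env, falling back to local defaults."""
    raw = env.get("REMOTE_NODE_URLS", "")
    node_urls = [u.strip() for u in raw.split(",") if u.strip()]
    if not node_urls:
        node_urls = list(DEFAULT_NODE_URLS)
    return {
        "node_urls": node_urls,
        "routstrd_url": env.get("ROUTSTRD_URL", DEFAULT_ROUTSTRD_URL),
        # None marks a node without an admin token; a test can skip on it.
        "admin_tokens": [
            env.get(f"REMOTE_NODE_ADMIN_TOKEN_{i}") for i in range(len(node_urls))
        ],
    }


def serving_node_index(served_by_url: str, node_urls: list) -> Optional[int]:
    """Match a serving URL to a configured node by host:port, not by name."""
    host = urlsplit(served_by_url).netloc
    for index, url in enumerate(node_urls):
        if urlsplit(url).netloc == host:
            return index
    return None
```

Comparing on netloc (host plus port) rather than hostname alone keeps the two local defaults distinguishable, since they share a host and differ only by port.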
The Run modal (cashu token + provider keys, target/upstream profile) and the Scenarios/Runs views can be driven entirely from a browser — no env-var token required. To stand up a browsable instance on one origin:
make serve # builds webui/dist, serves UI + /api on 0.0.0.0:8000
# open http://localhost:8000 → Scenarios → "Run scenario" → paste cashu token
make serve runs webui-build then launches the FastAPI server with
WEBUI_DIST_DIR pointed at the build output, so the same process serves both
the React UI (/, with SPA deep-link fallback) and the /api/* backend. One
port means one tunnel / one reverse-proxy host exposes the whole harness:
# Public URL for a quick shared test (any tunnel works):
ngrok http 8000 # → https://<id>.ngrok-free.app (UI + API, same origin)
# or put Caddy/nginx in front of :8000 on a host you control.
Because UI and API share an origin, the browser uses same-origin fetch and
VITE_API_BASE_URL can stay empty. The cashu token and provider API keys are
write-only — posted in the run body, forwarded to the orchestrator via env,
and never persisted or echoed (tests/test_server_token_hygiene.py).
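The write-only handling of secrets can be pictured as splitting the run request into a part that may be stored and a part that only ever travels to the orchestrator's environment. The sketch below is purely illustrative: the field names in SECRET_FIELDS and the function split_run_request are hypothetical, and the real server may structure this differently.

```python
# Hypothetical field names; the real request schema is not shown in this README.
SECRET_FIELDS = frozenset({"cashu_token", "provider_api_keys", "remote_admin_tokens"})


def split_run_request(body: dict) -> tuple:
    """Separate storable fields from secrets that must never be persisted."""
    persisted = {k: v for k, v in body.items() if k not in SECRET_FIELDS}
    secrets = {k: v for k, v in body.items() if k in SECRET_FIELDS}
    return persisted, secrets
```

Only the first dictionary would be written to runs.db or returned in API responses; the second would be passed to the orchestrator subprocess as env vars and then dropped.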
Run execution still needs the local docker compose stack (make up) for target_profile=local, or reachable remote node URLs entered in the modal. A persistent hosted deployment (containerized server with docker access) is tracked separately; see the ROU-125 follow-up.
| Service | Description |
|---|---|
| relay | Nostr relay (nostr-rs-relay) |
| mock-openai | WireMock-based OpenAI API mock |
| node-a | routstr-core node A |
| node-b | routstr-core node B |
| routstrd | routstrd daemon; discovery relay-isolated to local relay (override Dockerfile adds sqlite + cocod wallet) |
| cli-runner | routstr-cli test runner container |
| webui | Vite + React UI (Docker build currently broken; build on host) |
vendor/                 # auto-populated by make sync (gitignored)
  routstr-core/
  routstrd/
  routstr-cli/
  COMMITS.txt           # pinned commit SHAs
scripts/
  sync.sh               # vendor sync script
runner/                 # scenario-driven orchestrator
  orchestrate.py        # CLI entrypoint
  models.py             # SQLModel schema (scenarios, runs, test_results)
  scenario.py           # YAML loader
  providers.py          # upstream provider profile registry (ROU-153)
  cost.py               # best-effort upstream cost pricing
  junit.py              # junit XML parser
  compose.py            # docker compose wrappers
providers/              # upstream provider profiles + curated model catalogs
  <id>.yaml             # one per provider (openai, anthropic, ...)
  models/<id>.json
scenarios/              # YAML scenario library (smoke, routstrd_cheapest, real_inference, xcashu, ...)
tests/                  # pytest suite driven by the orchestrator
  cli/                  # routstr-cli tests (via docker exec)
  integration/          # routing + paid scenarios (cheapest, real_inference, xcashu) + spend helper
vendor-dockerfiles/     # local Dockerfile overrides (routstr-cli, routstrd: sqlite + cocod + relay isolation)
docs/                   # status / plan / findings (PLAN-full-node-coverage.md)
webui/                  # Vite + React UI (build on host: pnpm install && pnpm build)
compose.yml
Makefile
pyproject.toml          # runner dependencies (sqlmodel, pyyaml, pytest, ...)
.env.example