
Phase 3–5 engine: standalone mode, scenarios, security, multi-tenancy, ops tooling, large-payload testing #95

Merged
cbaugus merged 18 commits into main from dev
Mar 9, 2026


@cbaugus cbaugus commented Mar 3, 2026

Summary

Comprehensive feature set across multiple phases, all confirmed passing CI.

Security & Auth

Node Operations

Load Testing Capabilities

  • bodySize field (#96, "feat: body_size field for synthetic large-payload load testing"): generates a synthetic random payload (B/KB/MB) per request for upload/parser stress testing; mutually exclusive with body
  • Multi-host steps — full URL in path overrides baseUrl per step (e.g. auth service → API service)
  • Variable extraction passes JWT tokens between steps
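
The multi-host rule above (a full URL in `path` overrides `baseUrl` for that step) can be sketched roughly as follows. This is a minimal illustration, not the engine's code: `resolve_step_url` is a hypothetical helper, and treating any `http://` or `https://` prefix as "full URL" is an assumption.

```rust
/// Hypothetical helper: pick the URL a step should hit.
/// Assumption: a path counts as a "full URL" when it starts with an
/// http:// or https:// scheme; anything else is joined onto the base URL.
fn resolve_step_url(base_url: &str, path: &str) -> String {
    if path.starts_with("http://") || path.starts_with("https://") {
        // Full URL in the step: ignore baseUrl for this step only.
        path.to_string()
    } else {
        // Relative path: join with exactly one '/' between the parts.
        format!(
            "{}/{}",
            base_url.trim_end_matches('/'),
            path.trim_start_matches('/')
        )
    }
}
```

With this shape, an auth step could use `https://auth.example.com/login` while later steps use relative paths against the API's `baseUrl` (both hosts are placeholders).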

Infrastructure

Docs & Examples

  • DOCKER_HUB_OVERVIEW.md — bodySize section, security env vars, auto-registration vars
  • examples/configs/large-payload-test.yaml — new ready-to-use template
  • examples/configs/README.md — updated template list and customization guide
  • ACCEPTABLE_USE.md — scoped acceptable-use policy

Test Fixes

  • Converted all httpbin.org-dependent tests to wiremock
  • Fixed InconsistentCardinality panics after adding tenant label to all metrics
  • Fixed flaky test_realistic_user_pool, test_mixed_methods_scenario, test_case_insensitive_methods

Test plan

  • CI lint (rustfmt + clippy) passes
  • CI test suite passes (245+ tests)
  • Docker image builds for both Dockerfiles
  • POST /config with bodySize: "512KB" confirmed sending correct Content-Length to target
  • GET /ready returns 200 without auth when HEALTH_AUTH_ENABLED=true
  • GET /health returns 401 without token when HEALTH_AUTH_ENABLED=true

🤖 Generated with Claude Code

cbaugus and others added 2 commits March 3, 2026 13:54
- POST /config and POST /stop now check Authorization: Bearer <token>
  when API_AUTH_TOKEN env var is set; returns 401 if missing/invalid.
  Fully backwards-compatible: when unset, endpoints remain open.
- POST /stop sends stop signal to all workers, aborts handles, and
  transitions node_state to "idle". Returns JSON summary with last
  known RPS and worker count.
- Updated help text to document API_AUTH_TOKEN and POST /stop.
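
A minimal sketch of the bearer check described above, assuming the token and header have already been read by the caller. `is_authorized` and `constant_time_eq` are hypothetical names, and the constant-time comparison is an added precaution rather than something the commit message states.

```rust
/// Hypothetical helper: decide whether a request may proceed.
/// `configured_token` stands for the value of API_AUTH_TOKEN (None when unset);
/// `auth_header` is the raw `Authorization` header value, if present.
fn is_authorized(configured_token: Option<&str>, auth_header: Option<&str>) -> bool {
    let Some(expected) = configured_token else {
        // API_AUTH_TOKEN unset: endpoints stay open (backwards-compatible).
        return true;
    };
    match auth_header.and_then(|h| h.strip_prefix("Bearer ")) {
        Some(presented) => constant_time_eq(presented.as_bytes(), expected.as_bytes()),
        // Missing header or wrong scheme: the caller would answer 401.
        None => false,
    }
}

/// Assumption: compare without short-circuiting on the first differing byte.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}
```

A handler would map a `false` result to a 401 response before touching any worker state.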

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- YamlMetadata gains an optional `tenant` field (metadata.tenant in YAML).
  Backwards-compatible: omit the field and behaviour is unchanged.
- Six Prometheus metrics now carry a `tenant` label alongside `region`:
  REQUEST_TOTAL, REQUEST_STATUS_CODES, CONCURRENT_REQUESTS,
  REQUEST_DURATION_SECONDS, REQUEST_ERRORS_BY_CATEGORY,
  SCENARIO_REQUESTS_TOTAL.  Empty string when no tenant is set.
- WorkerConfig and ScenarioWorkerConfig gain a `tenant: String` field
  threaded through all worker-spawning sites (config-watcher, startup,
  standby, scenario paths) and all metric recording call sites.
- TestState tracks the active tenant; GET /health exposes it as
  `"tenant": null | "acme"` so the web layer can see who owns the node.
- POST /stop accepts an optional JSON body `{"tenant": "acme"}`.
  When supplied, the endpoint returns 409 Conflict if the active test
  belongs to a different tenant, preventing one client from stopping
  another client's test.  Omit the body to stop unconditionally.
- Metrics updater resets RPS delta tracking when the active tenant
  changes, preventing phantom RPS spikes at test boundaries.
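
The tenant guard on POST /stop might look roughly like this. It is a sketch under stated assumptions: `StopDecision` and `check_stop` are hypothetical names, and treating "tenant supplied but active test has none" as a conflict is a guess, since the commit message only covers the "different tenant" case.

```rust
/// Hypothetical outcome of the tenant check on POST /stop.
#[derive(Debug, PartialEq)]
enum StopDecision {
    /// Stop the test (the handler would return its JSON summary).
    Proceed,
    /// The active test belongs to someone else (the handler would return 409).
    Conflict,
}

/// `active_tenant` is the tenant tracked for the running test;
/// `requested_tenant` comes from the optional `{"tenant": "..."}` body.
fn check_stop(active_tenant: Option<&str>, requested_tenant: Option<&str>) -> StopDecision {
    match requested_tenant {
        // No body supplied: stop unconditionally.
        None => StopDecision::Proceed,
        // Caller owns the active test.
        Some(req) if active_tenant == Some(req) => StopDecision::Proceed,
        // Different tenant, or (assumption) no tenant on the active test.
        Some(_) => StopDecision::Conflict,
    }
}
```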

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cbaugus cbaugus changed the title from "Security: API_AUTH_TOKEN auth + POST /stop endpoint" to "Security + multi-tenant: API_AUTH_TOKEN auth, POST /stop, and tenant metrics" Mar 3, 2026
cbaugus and others added 7 commits March 3, 2026 14:38
…olations

- Move `new_tenant` extraction before worker spawning block so all
  three ScenarioWorkerConfig/WorkerConfig sites can reference it
- Collapse rustfmt two-line splits: current_tenant, active, curr_requests,
  and tenant:.clone().unwrap_or_default() in struct literal

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Split REQUEST_TOTAL chained calls in worker.rs to match rustfmt style
- Reformat body_bytes/stop_tenant block and curr_requests chain in main.rs
  to match rustfmt's expected indentation
- Replace test_realistic_user_pool httpbin.org dependency with wiremock
  mock server to eliminate CI flakiness from live endpoint unavailability

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All 4 metric helpers used the old label arity after tenant was added.
Workers write with &["local", ""] but reads used &["local"] causing
InconsistentCardinality panics in prometheus.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- test_mixed_methods_scenario: mock all 4 methods locally; fix the
  buggy assertion that panicked on out-of-bounds index when fewer
  than 4 steps were returned (change OR to assert_eq!(len, 4))
- test_case_insensitive_methods: mock GET /get and POST /post locally
  so the case-folding logic in executor.rs is tested without a live
  network dependency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… policy

Issue #92 — Authenticate GET /health:
- Add HEALTH_AUTH_ENABLED env var (default: false, backwards-compatible)
- When true, GET /health requires the same Bearer token as POST /config
- Add GET /ready — always unauthenticated, returns {"ready":true}
  safe for Nomad/K8s health probes regardless of auth settings
- Update Nomad example to use HTTP /ready check instead of TCP check
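
The per-route policy described above can be summarized in one function. This is only a sketch: `requires_bearer` is a hypothetical name, and two behaviours are assumptions rather than facts from the commit: that /health stays open when no token is configured, and that routes not listed here are unauthenticated.

```rust
/// Hypothetical policy table: does this request need a valid Bearer token?
/// `token_configured` is true when API_AUTH_TOKEN is set;
/// `health_auth_enabled` mirrors HEALTH_AUTH_ENABLED (default false).
fn requires_bearer(
    method: &str,
    path: &str,
    token_configured: bool,
    health_auth_enabled: bool,
) -> bool {
    if !token_configured {
        // Assumption: with no token configured, nothing can be enforced.
        return false;
    }
    match (method, path) {
        // Always open, so Nomad/K8s probes keep working.
        ("GET", "/ready") => false,
        // Opt-in protection for the detailed health payload.
        ("GET", "/health") => health_auth_enabled,
        ("POST", "/config") | ("POST", "/stop") => true,
        // Assumption: any other route is left unauthenticated.
        _ => false,
    }
}
```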

Issue #89 — Node auto-registration:
- New src/registry.rs: RegistrationConfig::from_env(), register_once(),
  spawn_registration_task()
- All three vars (NODE_REGISTRY_URL, AUTO_REGISTER_PSK, NODE_BASE_URL)
  must be set; missing any one logs a warning and skips registration
- Heartbeat re-registers at NODE_REGISTRY_INTERVAL (default 30s)
- Document new env vars in help text and Nomad HCL template
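
One way the all-or-nothing env lookup could be structured is sketched below. It does not reproduce `RegistrationConfig::from_env()`: `RegistrationSettings` and `registration_from` are hypothetical, the lookup is injected so the logic can be exercised without touching the process environment, and parsing NODE_REGISTRY_INTERVAL as a plain number of seconds is an assumption.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the PR's registration config.
#[derive(Debug, PartialEq)]
struct RegistrationSettings {
    registry_url: String,
    psk: String,
    node_base_url: String,
    interval: Duration,
}

/// Returns None unless all three required variables are present.
/// `lookup` abstracts the environment, e.g. `|k| std::env::var(k).ok()`.
fn registration_from<F>(lookup: F) -> Option<RegistrationSettings>
where
    F: Fn(&str) -> Option<String>,
{
    let registry_url = lookup("NODE_REGISTRY_URL")?;
    let psk = lookup("AUTO_REGISTER_PSK")?;
    let node_base_url = lookup("NODE_BASE_URL")?;
    // Assumption: the interval is given in whole seconds; fall back to 30.
    let secs = lookup("NODE_REGISTRY_INTERVAL")
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(30);
    Some(RegistrationSettings {
        registry_url,
        psk,
        node_base_url,
        interval: Duration::from_secs(secs),
    })
}
```

A caller that gets `None` back would log its warning and simply not spawn the heartbeat task.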

Housekeeping:
- Add ACCEPTABLE_USE.md with explicit prohibition on unauthorized testing
- Gitignore nomad/loadtest.nomad.hcl (personal deployment file)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Issue #42 — migrate subcommand:
- `rust-loadtest migrate [--output config.yaml]` reads current env vars
  and writes an equivalent YAML config file
- Handles Concurrent, Rps, and RampRps load models
- Includes baseUrl, workers, duration, timeout, skipTlsVerify, method,
  JSON body, region, and tenant when present
- Prints curl command to POST the generated config to a node
- Exits before tracing/metrics init so it works without TARGET_URL set

Issue #25 — weekly Docker build workflow:
- .github/workflows/weekly-docker-build.yaml
- Runs every Monday 00:00 UTC via cron; also triggerable via workflow_dispatch
- Builds both standard (Dockerfile) and Chainguard (Dockerfile.chainguard)
  images with latest, YYYY-MM-DD, and weekly-YYYY-WW tags
- Layer cache via Docker Hub buildcache tags to keep builds fast
- Generates CycloneDX SBOMs with Syft for both images
- Trivy vulnerability scan (CRITICAL+HIGH) uploaded to GitHub Security tab
- SBOMs uploaded as 90-day workflow artifacts
- Uses existing DOCKERHUB_USERNAME / DOCKERHUB_TOKEN secrets

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Wrap long warn! string in registry.rs (NODE_BASE_URL missing message)
- Collapse two-line node_name assignment to one line
- Collapse two-line env_or closure to one line
- Split long format! in load_section to multi-line form

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@cbaugus cbaugus changed the title from "Security + multi-tenant: API_AUTH_TOKEN auth, POST /stop, and tenant metrics" to "Phase 3–4 engine: standalone mode, scenarios, security, multi-tenancy, ops tooling" Mar 9, 2026
cbaugus and others added 9 commits March 8, 2026 20:28
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…#96)

Add a `bodySize` field to scenario step requests that auto-generates a
random alphanumeric body of the specified size on every request, enabling
stress tests of endpoint request-body handling without inlining large
strings in YAML.

YAML usage:
  steps:
    - name: "Hammer endpoint"
      request:
        method: "POST"
        path: "/api/upload"
        bodySize: "1MB"   # or "512KB", "128B"

- src/utils.rs: add parse_body_size() supporting B, KB, MB units with tests
- src/scenario.rs: add body_size: Option<usize> to RequestConfig
- src/yaml_config.rs: parse bodySize YAML field; validate mutual exclusion
  with body at config load time
- src/executor.rs: generate rand::Alphanumeric body when body_size is set
- tests/scenario_integration_tests.rs: integration test verifies 512-byte
  body reaches the server via wiremock Content-Length check
- All other test files: add body_size: None to existing RequestConfig literals
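
The load-time mutual-exclusion check between `body` and `bodySize` could be expressed as a single match. The names `BodySource` and `resolve_body` are hypothetical, and the error wording is illustrative; the actual validation in src/yaml_config.rs may be shaped differently.

```rust
/// Hypothetical result of validating a step's body settings.
#[derive(Debug, PartialEq)]
enum BodySource {
    /// Neither field set: send no body.
    Empty,
    /// Literal `body` from the YAML.
    Inline(String),
    /// `bodySize` spec (e.g. "1MB"), to be parsed and generated per request.
    Synthetic(String),
}

/// Reject configs that set both fields; otherwise report which one applies.
fn resolve_body(body: Option<&str>, body_size: Option<&str>) -> Result<BodySource, String> {
    match (body, body_size) {
        (Some(_), Some(_)) => {
            Err("`body` and `bodySize` are mutually exclusive".to_string())
        }
        (Some(b), None) => Ok(BodySource::Inline(b.to_string())),
        (None, Some(s)) => Ok(BodySource::Synthetic(s.to_string())),
        (None, None) => Ok(BodySource::Empty),
    }
}
```

Failing at config load keeps a bad scenario from starting workers at all, which matches the "at config load time" wording above.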

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…for rustfmt

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ion, reformat parse chain

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e.rs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
strip_suffix('B') matched "1GB" leaving "1G" as the numeric part,
producing the wrong error message. Now checks that the remaining
string is purely numeric before accepting the 'B' suffix.
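
A sketch of a size parser with the digits-only guard this fix describes. It is not the code from src/utils.rs: the error messages are invented, and using binary multiples (1 KB = 1024 bytes) and uppercase-only units are assumptions the source does not confirm.

```rust
/// Sketch of a `bodySize` parser accepting "<digits>B", "<digits>KB", "<digits>MB".
/// Assumptions: 1 KB = 1024 bytes, 1 MB = 1024 * 1024 bytes, units are uppercase.
fn parse_body_size(input: &str) -> Result<usize, String> {
    let s = input.trim();
    // Try the longer suffixes first so "KB"/"MB" are not treated as plain "B".
    let (digits, multiplier) = if let Some(rest) = s.strip_suffix("MB") {
        (rest, 1024 * 1024)
    } else if let Some(rest) = s.strip_suffix("KB") {
        (rest, 1024)
    } else if let Some(rest) = s.strip_suffix('B') {
        (rest, 1)
    } else {
        return Err(format!("missing unit in {s:?}; expected B, KB, or MB"));
    };
    // The guard from the fix: "1GB" strips to "1G", which must be rejected
    // as an unsupported unit rather than reported as a bad number.
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("unsupported size {s:?}; expected digits followed by B, KB, or MB"));
    }
    let count: usize = digits
        .parse()
        .map_err(|e| format!("invalid number in {s:?}: {e}"))?;
    count
        .checked_mul(multiplier)
        .ok_or_else(|| format!("size {s:?} overflows usize"))
}
```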

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…onfigs

- DOCKER_HUB_OVERVIEW.md: add "Large Payload / Upload Testing" section
  under Advanced Features with standalone and JWT-auth examples
- examples/configs/large-payload-test.yaml: new ready-to-use template
  (raw 1MB upload + JWT-auth 512KB upload, two weighted scenarios)
- examples/configs/README.md: add template entry #8, selection table row,
  and customization example #6 for bodySize

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cbaugus cbaugus merged commit faa607b into main Mar 9, 2026
7 checks passed
@cbaugus cbaugus changed the title from "Phase 3–4 engine: standalone mode, scenarios, security, multi-tenancy, ops tooling" to "Phase 3–5 engine: standalone mode, scenarios, security, multi-tenancy, ops tooling, large-payload testing" Mar 9, 2026