Distributed Stress Testing Tool

A flexible HTTP stress testing tool that supports both standalone and distributed execution modes.

Architecture Overview

The tool supports a controller-node architecture for distributed load testing, alongside the original standalone mode.

Components

Controller:
- Loads the test configuration file
- Waits for worker nodes to connect
- Distributes test specifications to all connected nodes
- Coordinates test execution start
- Aggregates and displays results from all nodes

Node (Worker):
- Connects to the controller
- Receives test specifications
- Executes tests with node-specific overrides
- Reports progress and results back to the controller

Standalone Mode:
- Traditional single-instance execution
- No network communication required

Usage

1. Standalone Mode (Single Machine)

Run tests directly without distribution:

./stresstool run -f config.yaml

Options:

-f, --file: Path to YAML configuration file (required)
--verbose: Print detailed logs
--dry-run: Validate config without executing tests
--parallel: Run all tests in parallel

2. Distributed Mode (Controller + Nodes)

Step 1: Start the Controller

./stresstool controller -f config.yaml --listen :8090

The controller will:

- Load the configuration
- Listen for node connections
- Wait for your command to start tests

Options:

-f, --file: Path to YAML configuration file (required)
--listen: Address to listen on (default: :8090)
--web: Enable the web UI for triggering tests
--web-port: Port for the web UI server (default: 8091, requires --web)
--parallel: Run tests in parallel on each node
--verbose: Print detailed logs

Step 2: Start Worker Nodes

On each worker machine, run:

# Node A
./stresstool node --node-name node-a --controller controller-host:8090

# Node B
./stresstool node --node-name node-b --controller controller-host:8090

# Node C
./stresstool node --node-name node-c --controller controller-host:8090

Each node will:

- Connect to the controller
- Identify itself by name
- Wait for test specifications

Options:

--node-name: Unique name for this node (required)
--controller: Controller address to connect to (required)
--verbose: Print detailed logs

Step 3: Start Tests from Controller

Once all nodes are connected, type start in the controller terminal:

Type 'start' when ready to begin tests, or 'nodes' to see connected nodes:
> nodes
Connected nodes (3):
  - node-a
  - node-b
  - node-c

> start
Starting tests on 3 node(s)...

The controller will:

- Send test specifications to all nodes
- Signal nodes to start simultaneously
- Display real-time progress from all nodes
- Aggregate and display final results

3. Controller Web Mode

The controller includes an optional web UI for managing tests from a browser. Enable it with the --web flag:

# Start controller with web UI on default port (8091)
./stresstool controller -f config.yaml --listen :8090 --web

# Or specify a custom web UI port
./stresstool controller -f config.yaml --listen :8090 --web --web-port 9000

Options:

--web: Enable the web UI (disabled by default)
--web-port: Port for the web UI server (default: 8091, requires --web)

Once running, open http://localhost:8091 in your browser. The UI auto-refreshes every 2 seconds and shows all connected nodes. Click Start Tests to trigger execution, or continue using the terminal commands (start / nodes) as usual.

Authentication: the web UI and /api/* endpoints require a bearer token. On startup the controller uses STRESSTOOL_WEB_TOKEN from the environment or generates a random token and prints it to the console. Sign in via /login (or send Authorization: Bearer <token> when scripting against the API).
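
When scripting against the API, the token goes in a standard Authorization header. The Go sketch below builds such a request; the helper name newAPIRequest and the path /api/example are placeholders for illustration, not endpoints documented here.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// newAPIRequest builds a request carrying the controller's bearer token.
// The helper and its parameters are illustrative, not part of stresstool.
func newAPIRequest(method, baseURL, path, token string) (*http.Request, error) {
	req, err := http.NewRequest(method, baseURL+path, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	// The token is either set by you or printed by the controller on startup.
	token := os.Getenv("STRESSTOOL_WEB_TOKEN")

	// "/api/example" is a placeholder path; substitute a real /api/* endpoint.
	req, err := newAPIRequest(http.MethodGet, "http://localhost:8091", "/api/example", token)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Pass the resulting request to http.DefaultClient.Do (or your own client) once the path points at a real endpoint.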

Configuration

Authentication

Define authentication at the top level of your config using the auth key. Only one auth type is allowed per config file. All tests use it automatically — set auth: false on individual tests to opt out.

Auth field values support {{ }} placeholders just like headers and body.

Supported auth types:

# Basic Auth — sends Authorization: Basic <base64(user:pass)>
auth:
  basic_auth:
    username: admin
    password: "{{ get_password() }}"

# Bearer Token — sends Authorization: Bearer <token>
auth:
  bearer:
    token: "{{ get_token() }}"

# API Key — sends a custom header
auth:
  api_key:
    header: X-API-Key
    key: "my-secret-key"

# OAuth2 Client Credentials — fetches token from token_url, sends Authorization: Bearer <token>
auth:
  oauth2_client_credentials:
    token_url: https://auth.example.com/token
    client_id: my-client
    client_secret: "{{ get_secret() }}"
    scopes: ["read", "write"]

OAuth2 tokens are cached and automatically refreshed when they expire.
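
For reference, the Authorization values produced by the basic and bearer types follow the usual HTTP conventions. The Go sketch below shows how such values are formed; the function names are illustrative and this is not the tool's own implementation.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// basicAuthHeader returns "Basic <base64(user:pass)>".
func basicAuthHeader(user, pass string) string {
	creds := []byte(user + ":" + pass)
	return "Basic " + base64.StdEncoding.EncodeToString(creds)
}

// bearerHeader returns "Bearer <token>".
func bearerHeader(token string) string {
	return "Bearer " + token
}

func main() {
	fmt.Println(basicAuthHeader("admin", "secret")) // Basic YWRtaW46c2VjcmV0
	fmt.Println(bearerHeader("my-token"))           // Bearer my-token
}
```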

Warmup (Ramp-Up) Phase

Add an optional warmup_seconds field to any test to linearly ramp the request rate from zero up to the configured requests_per_second before the main run begins. This lets the target system (and any autoscalers, caches, or connection pools) settle before full load hits.

tests:
  - name: api_test
    path: "https://api.example.com/endpoint"
    method: "GET"
    requests_per_second: 100
    threads: 10
    warmup_seconds: 15   # linear ramp from 0 → 100 RPS over 15s
    run_seconds: 60      # 60s of steady load at 100 RPS afterwards

Total test duration with a warmup is warmup_seconds + run_seconds — in the example above, 75 seconds total. Metrics cover the entire window, so RPS shown during warmup will appear lower than the target until the ramp completes. When warmup_seconds is omitted or set to 0, the runner behaves exactly as before: full RPS from t=0.
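
The ramp arithmetic can be sketched in Go as follows. This is a minimal illustration of a linear ramp under the description above, not the runner's actual code; rateAt and totalDuration are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// rateAt returns the target request rate at a given elapsed time,
// assuming a linear ramp from 0 to targetRPS over the warmup window
// and a constant targetRPS afterwards.
func rateAt(elapsed, warmup time.Duration, targetRPS float64) float64 {
	if warmup <= 0 || elapsed >= warmup {
		return targetRPS // no warmup configured, or ramp already complete
	}
	if elapsed < 0 {
		return 0
	}
	return targetRPS * float64(elapsed) / float64(warmup)
}

// totalDuration is warmup_seconds + run_seconds.
func totalDuration(warmup, run time.Duration) time.Duration {
	if warmup < 0 {
		warmup = 0
	}
	return warmup + run
}

func main() {
	warmup, run := 15*time.Second, 60*time.Second
	for _, t := range []time.Duration{0, 5 * time.Second, 15 * time.Second, 30 * time.Second} {
		fmt.Printf("t=%v target=%.1f RPS\n", t, rateAt(t, warmup, 100))
	}
	fmt.Println("total:", totalDuration(warmup, run))
}
```

With the example values, the target rate halfway through the warmup (7.5s) is 50 RPS, and the whole test lasts 75 seconds.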

Node overrides may also tune warmup per node (warmup_seconds: -1 disables warmup for a specific node, 0 inherits from the test default).

Node-Specific Overrides

You can configure different load parameters for specific nodes in your YAML config:

tests:
  - name: api_load_test
    path: "https://api.example.com/endpoint"
    method: "POST"
    requests_per_second: 100  # Default for all nodes
    threads: 10               # Default for all nodes
    run_seconds: 60
    
    # Node-specific overrides
    nodes:
      node-a:
        requests_per_second: 50
        threads: 5
      node-b:
        requests_per_second: 150
        threads: 15
      node-c:
        requests_per_second: 100
        # Uses default threads: 10
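
The way an override combines with the test defaults can be sketched in Go. The LoadSettings type and resolve function are hypothetical, and the sketch assumes that an omitted field (zero value) inherits the default, as node-c does for threads above.

```go
package main

import "fmt"

// LoadSettings is a hypothetical struct mirroring the per-test load fields.
type LoadSettings struct {
	RequestsPerSecond int
	Threads           int
}

// resolve applies a node's override on top of the test defaults.
// Assumption: a zero value means "not set, inherit the default".
func resolve(def LoadSettings, overrides map[string]LoadSettings, node string) LoadSettings {
	out := def
	ov, ok := overrides[node]
	if !ok {
		return out // node has no override block
	}
	if ov.RequestsPerSecond != 0 {
		out.RequestsPerSecond = ov.RequestsPerSecond
	}
	if ov.Threads != 0 {
		out.Threads = ov.Threads
	}
	return out
}

func main() {
	def := LoadSettings{RequestsPerSecond: 100, Threads: 10}
	overrides := map[string]LoadSettings{
		"node-a": {RequestsPerSecond: 50, Threads: 5},
		"node-b": {RequestsPerSecond: 150, Threads: 15},
		"node-c": {RequestsPerSecond: 100},
	}
	for _, n := range []string{"node-a", "node-b", "node-c", "node-d"} {
		fmt.Printf("%s: %+v\n", n, resolve(def, overrides, n))
	}
}
```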

Example Configuration

See example-config.yaml for a complete example:

funcs:
  - name: get_password
    cmd: ["echo", "test-password-12345"]

auth:
  basic_auth:
    username: "testuser"
    password: "{{ get_password() }}"

tests:
  - name: api_login_test
    path: "https://httpbin.org/post"
    method: "POST"
    requests_per_second: 5
    threads: 2
    run_seconds: 10

    # Auth headers (Authorization: Basic ...) are added automatically.
    headers:
      X-Request-Id: "{{ uuid() }}"
      X-Timestamp: "{{ now() }}"
      Content-Type: "application/json"

    body: >
      {
        "username": "testuser",
        "timestamp": "{{ now() }}"
      }

    assert:
      status_code: 200
      body_contains: "json"
      max_latency_ms: 2000

    nodes:
      node-a:
        requests_per_second: 3
        threads: 1
      node-b:
        requests_per_second: 8

  - name: api_get_public
    path: "https://httpbin.org/get"
    method: "GET"
    auth: false  # Disable auth for this test
    requests_per_second: 10
    threads: 3
    run_seconds: 5

    assert:
      status_code: 200
      max_latency_ms: 1000
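
The {{ }} placeholders used in the configuration above can be thought of as zero-argument function calls substituted into the string. The Go sketch below illustrates that idea only; the regular expression, the expand helper, and the decision to leave unknown placeholders untouched are assumptions, not the tool's actual template engine.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches zero-argument calls such as {{ uuid() }}.
var placeholder = regexp.MustCompile(`\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\(\)\s*\}\}`)

// expand replaces each placeholder with the result of the named function.
// Unknown names are left as-is (an assumption made for this sketch).
func expand(s string, funcs map[string]func() string) string {
	return placeholder.ReplaceAllStringFunc(s, func(m string) string {
		name := placeholder.FindStringSubmatch(m)[1]
		if f, ok := funcs[name]; ok {
			return f()
		}
		return m
	})
}

func main() {
	funcs := map[string]func() string{
		"get_password": func() string { return "test-password-12345" },
	}
	fmt.Println(expand(`password: "{{ get_password() }}"`, funcs))
}
```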

Custom Functions

Custom functions defined under funcs: are executed via os/exec and time out after 3 seconds by default. Override the timeout with the STRESSTOOL_FUNC_TIMEOUT environment variable, which accepts any Go duration (e.g. 500ms, 10s).

STRESSTOOL_FUNC_TIMEOUT=10s ./stresstool run -f config.yaml
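
A command timeout of this kind is typically built from time.ParseDuration and exec.CommandContext. The Go sketch below shows that pattern; parseTimeout and runFunc are hypothetical names, and falling back to the 3-second default on an invalid value is an assumption of the sketch.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
	"strings"
	"time"
)

const defaultFuncTimeout = 3 * time.Second

// parseTimeout turns a Go duration string (e.g. "500ms", "10s") into a
// timeout. Empty, invalid, or non-positive values fall back to the
// default (an assumption made for this sketch).
func parseTimeout(v string) time.Duration {
	if d, err := time.ParseDuration(v); err == nil && d > 0 {
		return d
	}
	return defaultFuncTimeout
}

// runFunc executes a command such as ["echo", "value"] and returns its
// trimmed stdout. The process is killed if the timeout elapses.
func runFunc(cmd []string, timeout time.Duration) (string, error) {
	if len(cmd) == 0 {
		return "", fmt.Errorf("empty command")
	}
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	out, err := exec.CommandContext(ctx, cmd[0], cmd[1:]...).Output()
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(out)), nil
}

func main() {
	timeout := parseTimeout(os.Getenv("STRESSTOOL_FUNC_TIMEOUT"))
	out, err := runFunc([]string{"echo", "test-password-12345"}, timeout)
	fmt.Println(out, err)
}
```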

Communication Protocol

The controller and nodes communicate over gRPC using a bidirectional Session stream (StressTestService in proto/api/v1/payload.proto). Messages are protobuf oneof variants on NodeMessage (node → controller) and ControllerMessage (controller → node).

Message flow (high level)

1. Hello: Node opens the stream and sends hello (name + version).
2. TestSpec: After you start a run, the controller sends each node its config slice.
3. Ready: Node validates config and responds with ready (or Error if validation fails).
4. StartTests: Controller tells all ready nodes to begin execution.
5. Progress / TestResult: Node streams progress and final metrics.
6. StopTests (optional): Controller can request early cancellation (e.g. web Stop or API).
7. Complete: Controller closes the logical run; nodes may disconnect.
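
Seen from a node, the flow above behaves like a small state machine driven by controller messages. The Go sketch below models it with a transition table; the phase and event names are invented for illustration and do not correspond to the generated protobuf types.

```go
package main

import "fmt"

// phase is the node's position in a run (hypothetical names).
type phase int

const (
	phaseHelloSent phase = iota // hello sent, waiting for a TestSpec
	phaseReady                  // spec validated, ready sent
	phaseRunning                // StartTests received, streaming progress
	phaseFinished               // run stopped or completed
)

// event is a message received from the controller (hypothetical names).
type event int

const (
	evTestSpec event = iota
	evStartTests
	evStopTests
	evComplete
)

// transitions lists the controller messages accepted in each phase.
var transitions = map[phase]map[event]phase{
	phaseHelloSent: {evTestSpec: phaseReady},
	phaseReady:     {evStartTests: phaseRunning},
	phaseRunning:   {evStopTests: phaseFinished, evComplete: phaseFinished},
}

// next returns the phase after handling e, or false if e is not
// expected in phase p.
func next(p phase, e event) (phase, bool) {
	to, ok := transitions[p][e]
	if !ok {
		return p, false
	}
	return to, true
}

func main() {
	p := phaseHelloSent
	for _, e := range []event{evTestSpec, evStartTests, evComplete} {
		var ok bool
		p, ok = next(p, e)
		fmt.Println("phase:", p, "accepted:", ok)
	}
}
```

A table keeps out-of-order messages (for example StartTests before TestSpec) easy to reject in one place.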

By default, traffic is plaintext gRPC (--insecure defaults to true on controller and node). For TLS or mTLS, set --insecure=false and pass --tls-cert, --tls-key, and when needed --tls-ca (see internal/cli/tls.go).

Protobuf / Buf

Payload schemas live in proto/api/v1/payload.proto. Regenerating Go code from them requires the Buf CLI.

Install Buf (macOS / Homebrew):

brew install bufbuild/buf/buf

Module and codegen config: buf.yaml, buf.gen.yaml. From the repository root:

buf generate

This runs the remote protocolbuffers/go and grpc/go plugins configured in buf.gen.yaml and writes generated files under internal/protocol/payloadpb/.

Benefits of Distributed Mode

Higher Load: Distribute load across multiple machines to test at scale
Geographic Distribution: Run nodes in different regions to test from multiple locations
Node Specialization: Configure different load profiles per node
Centralized Control: Single point to coordinate and monitor all tests
Aggregated Results: View combined results from all nodes in one place

Example Distributed Test Scenario

# Terminal 1: Start Controller
./stresstool controller -f load-test.yaml --listen :8090

# Terminal 2: Node in US-East
./stresstool node --node-name us-east --controller controller-host:8090

# Terminal 3: Node in US-West
./stresstool node --node-name us-west --controller controller-host:8090

# Terminal 4: Node in EU
./stresstool node --node-name eu-central --controller controller-host:8090

# Back to Terminal 1: Check nodes and start
> nodes
Connected nodes (3):
  - us-east
  - us-west
  - eu-central

> start

The controller will coordinate all three nodes to execute tests simultaneously, with each node applying its specific configuration overrides.

Monitoring

Real-Time Progress

During test execution, the controller displays live updates from all nodes:

→ us-east / api_login_test: 5s elapsed - 25 requests, 5.0 RPS, 0 failures
→ us-west / api_login_test: 5s elapsed - 40 requests, 8.0 RPS, 1 failures
→ eu-central / api_login_test: 5s elapsed - 15 requests, 3.0 RPS, 0 failures

Final Results

After all tests complete, view aggregated metrics per node:

=== Node: us-east ===
Test: api_login_test
  Requests: 50 total, 50 success, 0 failures
  Latency: Min: 45ms, Avg: 123ms, Max: 456ms, P95: 234ms, P99: 345ms
  Result: ✓ PASSED

=== Node: us-west ===
...
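
To get a single cross-node total from per-node results like these, the counts can simply be summed. The Go sketch below shows one way to do so; NodeResult and aggregate are hypothetical names, and latency percentiles are deliberately left out because they cannot be combined by simple addition.

```go
package main

import "fmt"

// NodeResult holds the request counts reported by one node for one test
// (a hypothetical type for this sketch).
type NodeResult struct {
	Node     string
	Total    int
	Success  int
	Failures int
}

// aggregate sums request counts across nodes.
func aggregate(results []NodeResult) NodeResult {
	sum := NodeResult{Node: "all"}
	for _, r := range results {
		sum.Total += r.Total
		sum.Success += r.Success
		sum.Failures += r.Failures
	}
	return sum
}

func main() {
	results := []NodeResult{
		{Node: "us-east", Total: 50, Success: 50, Failures: 0},
		{Node: "us-west", Total: 80, Success: 79, Failures: 1},
	}
	fmt.Printf("%+v\n", aggregate(results))
}
```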

Build from Source

Toolchain: Go 1.26.1+ (see go.mod). To regenerate protobuf/gRPC Go stubs after editing proto/, install Buf (e.g. brew install bufbuild/buf/buf) and run buf generate.

git clone <repository-url>
cd stresstool
go build -o stresstool ./cmd/stresstool

Install Globally

go install ./cmd/stresstool

Dependency Maintenance and Risk Checks

Use the helper script below to check for outdated modules and known vulnerabilities:

./scripts-deps-check.sh

To attempt dependency upgrades first, then run the same risk checks:

./scripts-deps-check.sh --upgrade

The script runs:

- go list -m -u all to report outdated dependencies
- govulncheck ./... to find known vulnerabilities in reachable code paths

Cloud Deployment with Terraform

Terraform configurations are provided for deploying stresstool's controller-node architecture on AWS, GCP, and Azure. Each configuration creates a dedicated network, firewall/security rules, a controller VM, and a configurable number of worker node VMs.

Prerequisites

Terraform >= 1.5
Cloud provider CLI authenticated (aws configure, gcloud auth login, or az login)

Quick Start

cd terraform/aws    # or terraform/gcp / terraform/azure
terraform init
terraform plan
terraform apply

Provider-Specific Variables

AWS

| Variable | Description | Default |
|---|---|---|
| `aws_region` | AWS region | `us-east-1` |
| `instance_type` | EC2 instance type | `t3.medium` |
| `node_count` | Number of worker nodes | `3` |
| `key_name` | EC2 key pair name (required) | (none) |
| `allowed_ssh_cidr` | CIDR allowed to SSH | `0.0.0.0/0` |
| `controller_port` | Controller listen port | `8090` |

terraform apply -var="key_name=my-keypair" -var="node_count=5"

GCP

| Variable | Description | Default |
|---|---|---|
| `project_id` | GCP project ID (required) | (none) |
| `region` | GCP region | `us-central1` |
| `zone` | GCP zone | `us-central1-a` |
| `machine_type` | GCE machine type | `e2-medium` |
| `node_count` | Number of worker nodes | `3` |
| `allowed_ssh_cidr` | CIDR allowed to SSH | `0.0.0.0/0` |
| `controller_port` | Controller listen port | `8090` |

terraform apply -var="project_id=my-gcp-project"

Azure

| Variable | Description | Default |
|---|---|---|
| `location` | Azure region | `East US` |
| `vm_size` | VM size | `Standard_B2s` |
| `node_count` | Number of worker nodes | `3` |
| `admin_username` | VM admin user | `azureuser` |
| `ssh_public_key_path` | Path to SSH public key | `~/.ssh/id_rsa.pub` |
| `allowed_ssh_cidr` | CIDR allowed to SSH | `0.0.0.0/0` |
| `controller_port` | Controller listen port | `8090` |

terraform apply -var="location=West US 2"

Opening the Controller Port to the Web

By default, the controller port (8090) is only accessible from within the private network (VPC/VNet). To expose it externally — for example, to connect worker nodes from other networks or to access the controller from the internet:

AWS

Add a public ingress rule to the controller security group:

# In terraform/aws/main.tf, add to aws_security_group.controller:
ingress {
  description = "Controller port from the web"
  from_port   = 8090
  to_port     = 8090
  protocol    = "tcp"
  cidr_blocks = ["0.0.0.0/0"]   # or restrict to your IP
}

GCP

Update the firewall source range:

# In terraform/gcp/main.tf, change google_compute_firewall.allow_controller:
source_ranges = ["0.0.0.0/0"]   # instead of source_tags
# Remove: source_tags = ["stresstool-node"]

Azure

Add an NSG rule for external access:

# In terraform/azure/main.tf, add to azurerm_network_security_group.controller:
security_rule {
  name                       = "ControllerWeb"
  priority                   = 1003
  direction                  = "Inbound"
  access                     = "Allow"
  protocol                   = "Tcp"
  source_port_range          = "*"
  destination_port_range     = "8090"
  source_address_prefix      = "0.0.0.0/0"   # or restrict to your IP
  destination_address_prefix = "*"
}

Security note: Exposing the controller port to 0.0.0.0/0 makes it accessible from the entire internet. In production, restrict allowed_ssh_cidr and the controller port to known IP ranges.

Outputs

After terraform apply, all providers output:

- Controller public/private IPs
- Node public/private IPs
- SSH command to connect to the controller

Cleanup

terraform destroy

Architecture Benefits

Separation of Concerns: Controller handles coordination, nodes handle execution
Scalability: Add more nodes to increase load capacity
Flexibility: Mix standalone and distributed modes as needed
Observability: Centralized progress monitoring and result aggregation
