SW4RM is an open coordination protocol for agent swarms. It provides guaranteed message delivery, persistent scheduling, multi-agent negotiation, crash-safe handoffs, and rich observability -- the runtime layer that sits between your agents and lets them work together reliably.
This repository contains five SDKs (Python, Rust, JavaScript, Elixir, Common Lisp), three reference services (Registry, Router, Scheduler), an A2A protocol gateway, and a Docker Compose stack that brings everything up in one command.
git clone https://github.com/rahulrajaram/sw4rm.git && cd sw4rm
# Start the full stack (Registry, Router, Scheduler, A2A Gateway)
docker compose up --build -d
# Verify the A2A gateway is running
curl http://localhost:8080/.well-known/agent.json
# Run the quickstart demo (registers agents, sends messages, heartbeats)
./quickstart.sh --local

Or use the JSON-RPC interface:
curl -X POST http://localhost:8080/ \
-H 'Content-Type: application/json' \
-d '{"jsonrpc":"2.0","method":"GetAgentCard","params":{},"id":1}'

| SDK | Directory | Tests | README |
|---|---|---|---|
| Python | sdks/py_sdk | 770 | README |
| Rust | sdks/rust_sdk | 329 | README |
| JavaScript/TypeScript | sdks/js_sdk | 410 | README |
| Elixir | sdks/ex_sdk | 331 | README |
| Common Lisp | sdks/cl_sdk | 87+333 | README |
All SDKs implement the same protocol (17 proto3 service definitions) and follow the same layered architecture adapted to language idioms.
The a2a_gateway/ module exposes any SW4RM agent swarm via the A2A (Agent-to-Agent) protocol. External A2A clients see standard Agent Cards and JSON-RPC 2.0; internally, SW4RM handles scheduling, negotiation, and crash recovery.
| Endpoint | Description |
|---|---|
| GET /.well-known/agent.json | A2A Agent Card for the gateway |
| POST / (JSON-RPC) SendMessage | Route a message to a SW4RM agent |
| POST / (JSON-RPC) GetTask | Query task state |
| POST / (JSON-RPC) CancelTask | Cancel via Scheduler preemption |
| POST / (JSON-RPC) GetAgentCard | Get agent card by ID |
See a2a_gateway/README.md for details.
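
The same JSON-RPC 2.0 call can be sketched in Python with only the standard library. The gateway URL and the `GetAgentCard` method come from the quick start; the helper names `build_request` and `call_gateway` are hypothetical and not part of any SW4RM SDK, and the sketch assumes the gateway accepts an empty `params` object as in the curl example.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8080/"  # from the quick start; adjust for your deployment


def build_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body (hypothetical helper, not an SDK API)."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params if params is not None else {},
        "id": request_id,
    }


def call_gateway(method, params=None, request_id=1, url=GATEWAY_URL, timeout=5.0):
    """POST a JSON-RPC request to the gateway and return the decoded JSON response."""
    body = json.dumps(build_request(method, params, request_id)).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the Docker Compose stack running, `call_gateway("GetAgentCard")` should mirror the curl request shown in the quick start.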
- Guaranteed Delivery: Router with persistent message queues, ACK lifecycle (received/read/fulfilled/rejected/failed/timed_out), and automatic reconciliation
- Scheduling and Preemption: Priority-based task scheduling with cooperative preemption, urgent lane semantics, and activity buffer persistence
- Multi-Agent Negotiation: Proposal/vote/decision protocol with configurable quorum policies, confidence-weighted vote aggregation, and timeout profiles
- Crash-Safe Handoffs: Structured agent-to-agent work transfer with context serialization, capability matching, and full audit trail
- Worktree Isolation: Policy-driven worktree binding with persistent state across restarts
- A2A Interoperability: Gateway translates between A2A protocol and SW4RM, with .well-known/agent.json discovery
- Five SDKs: Python, Rust, JavaScript/TypeScript, Elixir, Common Lisp -- all wire-compatible
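
To make the negotiation bullet concrete, here is a minimal sketch of confidence-weighted vote aggregation with a quorum check. It is not the SDK implementation: the `Vote` type, the `decide` function, the outcome labels, and the default quorum and threshold values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter_id: str
    approve: bool
    confidence: float  # assumed to lie in [0.0, 1.0]


def decide(votes, eligible_voters, quorum=0.5, threshold=0.5):
    """Return "approved", "rejected", or "no_quorum" (hypothetical outcome labels).

    quorum:    minimum fraction of eligible voters that must have voted.
    threshold: minimum confidence-weighted approval share required to approve.
    """
    if eligible_voters <= 0:
        raise ValueError("eligible_voters must be positive")
    if len(votes) / eligible_voters < quorum:
        return "no_quorum"

    total_weight = sum(v.confidence for v in votes)
    if total_weight == 0:
        return "rejected"  # nobody expressed any confidence

    approve_weight = sum(v.confidence for v in votes if v.approve)
    return "approved" if approve_weight / total_weight >= threshold else "rejected"
```

Weighting by confidence means a single hesitant approval cannot outvote a confident rejection, while the quorum check keeps a decision from being made by too few participants.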
pip install sw4rm-sdk
# or from source:
pip install -e ".[dev]"

[dependencies]
sw4rm-sdk = "0.6.0"
tokio = { version = "1.0", features = ["full"] }

npm install @sw4rm/js-sdk

# mix.exs
defp deps do
  [{:sw4rm, "~> 0.6.0"}]
end

(push (truename "sdks/cl_sdk/") asdf:*central-registry*)
(ql:quickload :sw4rm-sdk)

import grpc
from sw4rm.clients.registry import RegistryClient
from sw4rm.clients.router import RouterClient
# Connect to services
registry_ch = grpc.insecure_channel("localhost:50052")
router_ch = grpc.insecure_channel("localhost:50051")
registry = RegistryClient(registry_ch)
router = RouterClient(router_ch)
# Register agent
registry.register({
    "agent_id": "my-agent",
    "name": "My Agent",
    "capabilities": ["processing"],
})
# Send message
router.send_message({
    "producer_id": "my-agent",
    "consumer_id": "target-agent",
    "message_type": 2,  # DATA
    "payload": b"hello",
    "content_type": "text/plain",
    "correlation_id": "req-001",
})
# Heartbeat and deregister
registry.heartbeat(agent_id="my-agent", state=1)
registry.deregister(agent_id="my-agent")

use sw4rm_sdk::*;
use async_trait::async_trait;

struct EchoAgent {
    config: AgentConfig,
    preemption: PreemptionManager,
}

#[async_trait]
impl Agent for EchoAgent {
    async fn on_message(&mut self, envelope: EnvelopeData) -> Result<()> {
        if let Ok(text) = envelope.string_payload() {
            println!("Echo: {}", text);
        }
        Ok(())
    }
    fn config(&self) -> &AgentConfig { &self.config }
    fn preemption_manager(&self) -> &PreemptionManager { &self.preemption }
}

#[tokio::main]
async fn main() -> Result<()> {
    let config = AgentConfig::new("echo-1".into(), "Echo Agent".into());
    let agent = EchoAgent { config: config.clone(), preemption: PreemptionManager::new() };
    AgentRuntime::new(config).run(agent).await
}

import { RegistryClient, RouterClient, buildEnvelope, MessageType, CommunicationClass } from '@sw4rm/js-sdk';
const registry = new RegistryClient('localhost:50052');
const router = new RouterClient({ address: 'localhost:50051' });
await registry.registerAgent({
  agent_id: 'echo-1',
  name: 'EchoAgent',
  capabilities: ['echo'],
  communication_class: CommunicationClass.STANDARD,
});
const stream = router.streamIncoming('echo-1');
for await (const item of stream) {
  const reply = buildEnvelope({
    producer_id: 'echo-1',
    message_type: MessageType.DATA,
    payload: item.msg.payload,
    content_type: 'application/json',
  });
  await router.sendMessage(reply);
}

# Connect to the Registry and register an agent
{:ok, channel} = GRPC.Stub.connect("localhost:50052")
Sw4rm.Transport.Client.register(channel, %{agent_id: "my-agent", capabilities: ["echo"]})

See sdks/ex_sdk/examples/reference_demo.exs for a full demo exercising all 12 SDK features.
Three reference service implementations run the SW4RM protocol:
| Service | Port | Metrics | Description |
|---|---|---|---|
| Registry | 50052 | 9100 | Agent discovery, heartbeat, deregistration |
| Router | 50051 | 9101 | Message routing, delivery queues, streaming |
| Scheduler | 50053 | 9102 | Task scheduling, preemption, activity buffer |
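
Before pointing an agent at the services, it can help to confirm that the service ports accept connections. The sketch below uses only the Python standard library and the ports from the table above; the `SERVICES` mapping and the `port_open` and `check_services` helpers are illustrative names, and `localhost` is an assumption for a local stack. A successful TCP connect only shows that something is listening, not that the service is healthy.

```python
import socket

# Service ports from the table above; the default host assumes a local stack.
SERVICES = {"Registry": 50052, "Router": 50051, "Scheduler": 50053}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost", services=SERVICES):
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port) for name, port in services.items()}
```

Calling `check_services()` after `docker compose up --build -d` returns a dictionary keyed by service name, which makes it easy to see which service has not come up yet.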
Start locally:
cd sdks/py_sdk/reference-services
bash start_services.sh --local

Or via Docker:
docker compose up --build -d

- Code-Agent Tutorial: 3-agent code review swarm (writer + reviewers + deployer) using negotiation and handoff -- code_agent_tutorial.py | walkthrough

Python:
- echo_agent.py -- Minimal echo agent
- voting_example.py -- Multi-reviewer voting
- negotiation_debate_example.py -- Negotiation room debate
- handoff_example.py -- Agent-to-agent handoff
- hitl_escalation_example.py -- Human-in-the-loop
- workflow_orchestration_example.py -- Multi-step workflow
- tool_streaming_example.py -- Tool call streaming
- three_id_demo.py -- Three-ID correlation

Rust:
- echo_agent.rs -- Minimal echo agent
- advanced_agent.rs -- Full-featured agent
- handoff.rs -- Agent handoff
- workflow.rs -- Workflow orchestration
- negotiation_room.rs -- Negotiation room
- activity_demo.rs -- Activity buffer

JavaScript/TypeScript:
- echoAgent.ts -- Minimal echo agent
- advancedAgent.ts -- Full-featured agent
- handoffExample.ts -- Agent handoff
- workflowExample.ts -- Workflow orchestration
- negotiationRoomExample.ts -- Negotiation room
- hitlEscalation.ts -- Human-in-the-loop

Elixir:
- reference_demo.exs -- Full SDK feature demo
- basic_agent.exs -- Minimal agent
- negotiation_flow.exs -- Negotiation flow
- handoff.exs -- Agent handoff
- tool_execution.exs -- Tool execution

Common Lisp:
- echo-agent.lisp -- Minimal echo agent
- negotiation-voting.lisp -- Negotiation voting
- secret-management.lisp -- Secret management
- tool-streaming.lisp -- Tool streaming

All five SDKs follow the same layered architecture:
+---------------------------+
| Integration Layer | ACK lifecycle, message processing, workflows
+---------------------------+
| Client Layer | Registry, Router, Scheduler, HITL, Negotiation,
| | Handoff, Tool, Worktree, Connector, Reasoning
+---------------------------+
| Protocol Layer | Proto3 wire format (17 service definitions)
+---------------------------+
| Runtime Layer | Activity buffer, worktree state, state machine
+---------------------------+
Each SDK adapts this to language idioms: Python uses classes and context managers, Rust uses async/await traits, JavaScript uses Promises and async iterators, Elixir uses GenServers and supervisors, and Common Lisp uses the condition/restart system.
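
As an illustration of what the Integration Layer's ACK lifecycle tracking might look like, here is a small state-machine sketch in Python. The stage names are the ones listed under Guaranteed Delivery; the transition table, the `AckStage` enum, and the `AckTracker` class are assumptions made for this example and are not taken from the SDK source.

```python
from enum import Enum


class AckStage(Enum):
    RECEIVED = "received"
    READ = "read"
    FULFILLED = "fulfilled"
    REJECTED = "rejected"
    FAILED = "failed"
    TIMED_OUT = "timed_out"


_FAILURE_STAGES = {AckStage.REJECTED, AckStage.FAILED, AckStage.TIMED_OUT}

# ASSUMED transition table: the stage names come from this README, but the
# allowed ordering is a guess made for illustration. Stages without an entry
# are treated as terminal.
ALLOWED = {
    AckStage.RECEIVED: {AckStage.READ} | _FAILURE_STAGES,
    AckStage.READ: {AckStage.FULFILLED} | _FAILURE_STAGES,
}


class AckTracker:
    """Tracks the ACK stage of one message and refuses invalid transitions."""

    def __init__(self, message_id):
        self.message_id = message_id
        self.stage = AckStage.RECEIVED
        self.history = [AckStage.RECEIVED]

    @property
    def is_terminal(self):
        return self.stage not in ALLOWED

    def advance(self, new_stage):
        if new_stage not in ALLOWED.get(self.stage, set()):
            raise ValueError(f"{self.stage.value} -> {new_stage.value} is not allowed")
        self.stage = new_stage
        self.history.append(new_stage)
```

Keeping the history alongside the current stage gives a reconciliation step something to compare against when a sender and receiver disagree about how far a message got.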
| Workflow | What it does |
|---|---|
| Python CI | Python 3.12, pytest, smoke tests |
| Rust CI | cargo test --all --locked with protoc |
| JS CI | Node 20, npm ci && npm run build && npm test |
| Elixir CI | Elixir 1.16 / OTP 26, mix test + reference demo |
| Common Lisp CI | SBCL + Quicklisp, FiveAM test suite |
| Proto Check | Protocol file validation |
| Version Guard | Cross-SDK version consistency |
| Secrets Scan | Trufflehog credential scanning |
# All SDKs
make test
# Individual
make test-python # pytest -q sdks/py_sdk/tests
make test-rust # cd sdks/rust_sdk && cargo test --all --locked
make test-js # cd sdks/js_sdk && npm ci && npm run build && npm test
make test-lisp # cd sdks/cl_sdk && sbcl --load test/suite.lisp
# Elixir (Docker, no local Elixir required)
docker run --rm -v "$(pwd)":/app -w /app/sdks/ex_sdk elixir:1.16 \
bash -c "mix local.hex --force && mix local.rebar --force && mix deps.get && mix test"

pip install -e ".[dev]"
make protos

# Local mode (starts services, runs checks, cleans up)
./scripts/smoke_test.sh
# Docker mode (includes A2A gateway checks)
./scripts/smoke_test.sh --docker

Publishing is tag-driven per language via GitHub Actions:
# Python → PyPI
git tag py-v0.6.0 && git push origin py-v0.6.0
# JavaScript → npm
git tag npm-v0.6.0 && git push origin npm-v0.6.0
# Rust → crates.io
git tag rs-v0.6.0 && git push origin rs-v0.6.0

Requires environment secrets (PYPI_API_TOKEN, NPM_TOKEN, CRATES_IO_TOKEN) in the production GitHub Actions Environment.
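
The tag convention above can be sketched as a small parser, for example to sanity-check a tag before pushing it. The prefixes and publishing targets come from the commands above; the `parse_release_tag` helper and the assumption that versions are plain MAJOR.MINOR.PATCH are illustrative, and the actual workflow triggers may accept other patterns.

```python
import re

# Tag prefixes and publishing targets as listed above.
TAG_TARGETS = {"py": "PyPI", "npm": "npm", "rs": "crates.io"}

# ASSUMPTION: versions are plain MAJOR.MINOR.PATCH with no pre-release suffix.
_TAG_RE = re.compile(r"^(?P<prefix>[a-z]+)-v(?P<version>\d+\.\d+\.\d+)$")


def parse_release_tag(tag):
    """Return (target, version) for a release tag, or None if it does not match."""
    match = _TAG_RE.match(tag)
    if match is None or match.group("prefix") not in TAG_TARGETS:
        return None
    return TAG_TARGETS[match.group("prefix")], match.group("version")
```

A tag such as `py-v0.6.0` maps to PyPI with version `0.6.0`, while a tag with an unknown prefix or a missing `v` yields `None`.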
See CONTRIBUTING.md for versioning policy, commit hooks, and PR guidelines.