A voice-driven Claude Code orchestration engine. Put on headphones, walk around, and run multi-project agent workflows by voice — local-first, with optional cloud STT/TTS.
walkclaude is the open-source distribution of a voice-orchestration stack built around three Python modules:
- voice_gateway/: local aiortc WebRTC gateway. Browser microphone in, browser/Cartesia TTS out. STT via local Whisper (faster-whisper) by default, with Deepgram or OpenAI Realtime as cloud upgrades. Local LLM via Ollama for the zero-API "walk-mode" path.
- hermes_comms/: multi-channel communication bus with intent classification, decision packets, approval flows, evidence log, file custody, redaction, and a streaming HTTP server. Brokers between voice intents and downstream tool calls.
- hermes_parallel/: multi-project parallel agent runner. Tracks runs across projects, maintains a per-project registry, and exposes a CLI for inspecting in-flight work.
Plus minimal stub modules (crm_map/, hermes_inbox/) that satisfy the import surface so the code boots out-of-the-box without the private CRM/inbox components it was extracted from.
A working bench for the "walk around with headphones and orchestrate Claude Code" workflow:
- Open the gateway in a browser tab on your laptop.
- Pair your phone (Tailscale or your local network).
- Talk. STT transcribes locally via Whisper or via Deepgram/OpenAI Realtime if you supply keys. The local LLM (Ollama) routes intents. Approved actions hit the comms bus. The parallel runner tracks per-project state. The agent gives back a TTS reply.
- Walk.
You get the working substrate. You don't get a polished consumer app — that's a different project.
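The talk-route-approve-reply loop above can be sketched as a chain of stages. This is a minimal illustration, not the gateway's actual code: every name here (`Turn`, `handle_turn`, and the stage callables) is hypothetical, and the stages are passed in as plain functions so the flow can be exercised without audio hardware.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Turn:
    """One voice turn as it moves through the (hypothetical) pipeline."""
    audio: bytes
    transcript: str = ""
    intent: str = ""
    approved: bool = False
    reply: str = ""
    log: list[str] = field(default_factory=list)


def handle_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],    # STT: local Whisper or a cloud provider
    route_intent: Callable[[str], str],    # local LLM maps text to an intent label
    approve: Callable[[str], bool],        # approval gate before anything runs
    dispatch: Callable[[str, str], str],   # hand-off to the comms bus / runner
) -> Turn:
    turn = Turn(audio=audio)
    turn.transcript = transcribe(audio)
    turn.log.append("stt")
    turn.intent = route_intent(turn.transcript)
    turn.log.append("intent")
    turn.approved = approve(turn.intent)
    turn.log.append("approval")
    if turn.approved:
        turn.reply = dispatch(turn.intent, turn.transcript)
        turn.log.append("dispatch")
    else:
        # Unapproved intents never reach dispatch; the user only hears why.
        turn.reply = f"I did not run '{turn.intent}' because it was not approved."
    return turn  # turn.reply is the text that would be sent to TTS
```

Keeping each stage behind a callable is one way to swap local and cloud providers without touching the loop itself; the real modules may be organized differently.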
What's in:
- The full voice-gateway code (~5,000 lines of Python): WebRTC signalling, audio buffering, latency ledger, local + cloud STT/LLM/TTS adapters, browser client HTML, OpenAI Realtime adapter, task-loopback test harness.
- The full hermes_comms code (~3,000 lines): adapters, approvals, context store, decision packets, evidence log, file custody, intent classifier, outbox, policy, redaction, server.
- The full hermes_parallel code (~1,300 lines): registry, runner, policy, models, CLI.
- Public-worthy docs from the parent harness: voice-gateway runbooks, voice-operations runbook, secure-remote-voice-tunnel design, local-zero-api walk-mode design, voice-first multi-project deepresearch.
What's NOT in:
- The private commercial crm_map and hermes_inbox modules. Stubbed in this repo with no-op implementations that satisfy the imports; replace them with your own implementations if you need real CRM approval flows or external-channel inbox bridges (WhatsApp / Telegram / SMS / email).
- Two demo simulation files (week_simulation.py, real_day_simulation.py) that depended deeply on the private modules and a parent-harness project layout. Removed cleanly.
- CI / test infrastructure. The original tests required a real audio pipeline + private modules to run end-to-end. Tests are not included; PRs welcome.
- Polished UX. Browser client is a single-page HTML demo; production UX is your problem.
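For readers wondering what a no-op stub that "satisfies the imports" can look like, here is a generic sketch. It is not the code shipped in crm_map/ or hermes_inbox/; `NoOpStub` and `crm_client` are hypothetical names, and the actual stubs may be written quite differently.

```python
class NoOpStub:
    """Accepts any attribute access or call and does nothing."""

    def __init__(self, name: str = "stub") -> None:
        self._name = name

    def __getattr__(self, attr: str) -> "NoOpStub":
        # Only reached when normal lookup fails. Private names raise so that
        # tools probing for dunder/underscore attributes behave normally.
        if attr.startswith("_"):
            raise AttributeError(attr)
        return NoOpStub(f"{self._name}.{attr}")

    def __call__(self, *args, **kwargs) -> None:
        # Swallow the call; a real implementation would talk to a CRM here.
        return None

    def __repr__(self) -> str:
        return f"<NoOpStub {self._name}>"


# Hypothetical stand-in for a private client object.
crm_client = NoOpStub("crm_map")
crm_client.approvals.request(deal_id=1)  # does nothing, returns None
```

A catch-all like this keeps callers booting, but it also hides typos in attribute names, so a real replacement is better served by an explicit class.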
This is a v0.1.0 foundational drop — the working code, the working architecture, the docs. Expect rough edges. The system has been running internally since 2026-04 against a real workload; the open-source extraction was made on 2026-05-11.
Requires Python 3.12+, ffmpeg (for some audio paths), and at least one of: a local Whisper install for STT, a Deepgram API key, or an OpenAI API key with Realtime access.
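A quick way to check these prerequisites before installing is a small preflight script. The sketch below is an assumption-laden convenience, not part of the repo: `stt_backends` and `preflight` are hypothetical helpers, local Whisper is detected only by whether the `faster_whisper` package is importable, and a set `OPENAI_API_KEY` does not prove the key has Realtime access.

```python
import importlib.util
import shutil
import sys
from collections.abc import Mapping


def stt_backends(env: Mapping[str, str], have_local_whisper: bool) -> list[str]:
    """Return the STT options that look usable, in the order listed above."""
    found = []
    if have_local_whisper:
        found.append("local-whisper")
    if env.get("DEEPGRAM_API_KEY"):
        found.append("deepgram")
    if env.get("OPENAI_API_KEY"):  # presence only; Realtime access is not verified
        found.append("openai-realtime")
    return found


def preflight(env, version=None, have_ffmpeg=None, have_local_whisper=None) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    if version is None:
        version = sys.version_info[:2]
    if have_ffmpeg is None:
        have_ffmpeg = shutil.which("ffmpeg") is not None
    if have_local_whisper is None:
        have_local_whisper = importlib.util.find_spec("faster_whisper") is not None

    problems = []
    if tuple(version) < (3, 12):
        problems.append(f"Python 3.12+ required, found {version[0]}.{version[1]}")
    if not have_ffmpeg:
        problems.append("ffmpeg not found on PATH (needed for some audio paths)")
    if not stt_backends(env, have_local_whisper):
        problems.append("no STT backend: install Whisper or set an API key")
    return problems
```

Calling `preflight(os.environ)` after `import os` checks the current machine; the keyword arguments exist so the logic can be tested without touching the real environment.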
git clone https://github.com/waitdeadai/walkclaude
cd walkclaude
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements-voice.txt
pip install -r requirements-comms.txt
cp env.example .env
# Edit .env to add your keys (or leave them blank for local-only walk-mode).
If you have Ollama and a local Whisper model installed:
python -m voice_gateway.gateway --zero-api --port 8080
Open http://localhost:8080 in a browser, allow microphone access, and speak. The local LLM responds; the local TTS speaks back.
For the cloud path, run the gateway without --zero-api:
python -m voice_gateway.gateway --port 8080
This requires DEEPGRAM_API_KEY, CARTESIA_API_KEY, and OPENAI_API_KEY in your environment.
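The two gateway launch commands differ only in the `--zero-api` flag, so a launcher script can pick one based on which keys are set. The sketch below is one possible convenience wrapper: `missing_cloud_keys` and `gateway_command` are hypothetical helpers, and falling back to zero-API mode when any key is missing is this sketch's own policy, not documented gateway behavior.

```python
from collections.abc import Mapping

# The three keys the cloud path is documented to need.
CLOUD_KEYS = ("DEEPGRAM_API_KEY", "CARTESIA_API_KEY", "OPENAI_API_KEY")


def missing_cloud_keys(env: Mapping[str, str]) -> list[str]:
    """Names of cloud keys that are unset or empty in env."""
    return [key for key in CLOUD_KEYS if not env.get(key)]


def gateway_command(env: Mapping[str, str], port: int = 8080,
                    python: str = "python") -> list[str]:
    """Build the argv for the gateway, choosing cloud or zero-API mode."""
    cmd = [python, "-m", "voice_gateway.gateway"]
    if missing_cloud_keys(env):
        # Assumption: with incomplete keys, prefer the fully local path.
        cmd.append("--zero-api")
    cmd += ["--port", str(port)]
    return cmd
```

The resulting list can be passed to `subprocess.run`; printing `missing_cloud_keys(os.environ)` first makes it obvious why a given mode was chosen.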
To start the comms bus:
python -m hermes_comms.server --store .walkclaude/hermes-comms --port 8081
The HTTP API exposes intent classification, approval flows, decision packets, file custody, and evidence retrieval.
To work with the parallel runner:
python -m hermes_parallel.cli list
python -m hermes_parallel.cli start <project-slug> --objective "<...>" --lane-count 2
python -m hermes_parallel.cli status <run-id>
State is stored under .walkclaude/hermes-parallel/.
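If you want to drive the parallel-runner CLI from a script rather than a shell, a thin `subprocess` wrapper is enough. The sketch below uses only the subcommands and flags shown in this section; `cli_args` and `run_cli` are hypothetical helper names, and because the CLI's output format is not described here, the wrapper returns stdout as raw text rather than parsing it.

```python
import subprocess
import sys


def cli_args(command: str, *args: str, **options: object) -> list[str]:
    """Build argv for `python -m hermes_parallel.cli <command> ...`.

    Keyword options become flags: lane_count=2 -> --lane-count 2.
    """
    argv = [sys.executable, "-m", "hermes_parallel.cli", command, *args]
    for name, value in options.items():
        argv += [f"--{name.replace('_', '-')}", str(value)]
    return argv


def run_cli(command: str, *args: str, **options: object) -> str:
    """Run the CLI and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        cli_args(command, *args, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `run_cli("start", "my-project", objective="ship docs", lane_count=2)` mirrors the `start` command shown earlier, with `my-project` as a placeholder slug.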
See docs/local-zero-api-walk-mode.md for the local-only headless setup walkthrough, and docs/voice-first-multi-project-agent-orchestration-deepresearch.md for the design rationale.
For remote access (talking to your home machine from outside your LAN), see docs/secure-remote-voice-tunnel.md. The current recommendation is Tailscale; LiveKit/SIP cloud-bridge integration is documented separately.
┌──────────────┐           ┌──────────────┐          ┌─────────────────┐
│   Browser    │  WebRTC   │              │   HTTP   │                 │
│ (mic + TTS)  ├──────────►│ voice_gateway├─────────►│  hermes_comms   │
│              │◄──────────┤   (aiortc)   │◄─────────┤   (intents +    │
└──────────────┘           │              │          │   approvals)    │
                           │  STT: local  │          └────────┬────────┘
                           │  Whisper or  │                   │
                           │  Deepgram    │                   ▼
                           │              │          ┌─────────────────┐
                           │  LLM: local  │          │ hermes_parallel │
                           │  Ollama or   │          │ (multi-project  │
                           │  OpenAI      │          │  run tracker)   │
                           │              │          └─────────────────┘
                           │  TTS: local  │
                           │  Kokoro or   │
                           │  Cartesia    │
                           └──────────────┘
walkclaude is the voice-orchestration sibling to the LLM Dark Patterns Hooks suite (10 single-purpose Stop hooks for Claude Code) and the minmaxing governance harness. The three projects compose: minmaxing for spec/verify discipline, dark-patterns hooks for closeout-language enforcement, walkclaude for voice-driven multi-project orchestration.
PRs welcome on:
- Real CI / test infrastructure (the existing internal tests need too much setup to run portably).
- Replacing the crm_map/hermes_inbox stubs with adapter interfaces other CRMs/inbox systems can implement.
- Browser client polish.
- New STT / TTS / LLM provider adapters.
- Docs corrections.
The bar for new features: the change should make walkclaude more useful for somebody walking around with headphones running Claude Code. Anything that doesn't pass that bar belongs in a fork.
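As a starting point for the adapter-interface item above, one possible shape is a `typing.Protocol` that any inbox backend can satisfy structurally. Everything in this sketch is hypothetical (`InboundMessage`, `InboxAdapter`, `NullInbox`, and both method signatures); it is meant to seed discussion, not to describe an interface the repo already defines.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass(frozen=True)
class InboundMessage:
    channel: str   # e.g. "sms" or "email"
    sender: str
    text: str


@runtime_checkable
class InboxAdapter(Protocol):
    """Minimal surface an external-channel bridge might implement."""

    def poll(self) -> list[InboundMessage]:
        """Return messages received since the last poll."""
        ...

    def send(self, channel: str, recipient: str, text: str) -> bool:
        """Send a message; return True if it was accepted for delivery."""
        ...


class NullInbox:
    """No-op adapter: receives nothing and reports nothing as sent."""

    def poll(self) -> list[InboundMessage]:
        return []

    def send(self, channel: str, recipient: str, text: str) -> bool:
        return False
```

Because the protocol is structural, `NullInbox` satisfies it without inheriting from it, which keeps third-party adapters free of any import dependency on walkclaude.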
Apache-2.0.