All notable changes to the mastersof-ai harness.
- 49 new integration tests for IPC round-trip, worker lifecycle, frame sanitization, mutex contention, and env isolation
- Mock worker (`src/test-fixtures/mock-worker.ts`) speaks IPC protocol without SDK dependency
- Extracted `filterExecArgv()` and `buildWorkerEnv()` from worker-manager.ts so tests call production code instead of reimplementing logic inline
- Added `workerPath` constructor param for test injection
- Rewrote 5 tautological tests in session-worker; added 2 integration tests (SIGKILL+mutex, kill-on-resubscribe)
- 332 total tests, 0 failures, 5/5 flakiness runs clean
Seven waves of defense-in-depth security hardening, culminating in 282 passing tests and zero known critical vulnerabilities.
Wave 1: Tier 0 Security — Default bind to 127.0.0.1, shell env allowlist (src/env-safety.ts), SSRF URL validation (src/url-safety.ts), sandbox read-only mounts, API key hygiene.
Wave 2: Credential Architecture + Egress — CredentialStore with per-tool scoping (src/credentials.ts), egress domain filtering (src/egress-proxy.ts), headless run subcommand with runs.jsonl logging, per-user tool deny lists, canUseTool operation allowlists.
Wave 4: Defense in Depth — Content boundary tagging for prompt injection defense (src/content-safety.ts), SSRF hardening (IPv4-mapped IPv6, hex-short, decimal/octal/hex IP encoding, protocol restriction, redirect chaining), A2A server authentication, WS query param token deprecation, per-user logging.
Wave 5: Process Isolation + Partner Onboarding — Fork-per-session workers (src/session-worker.ts), IPC protocol (src/ipc-protocol.ts), worker lifecycle management (src/worker-manager.ts), per-user query serialization (src/query-mutex.ts), partner token generation. 15 review fixes: settled-flag race, IPC channel closed recursion, mutex FIFO ordering, fork bomb cap, IPC frame allowlist, execArgv sanitization, hashed rate limiter keys.
Wave 6: Process Isolation Hardening — Shared SDK stream processor (src/sdk-stream.ts), WebSocket message validation (src/ws-protocol.ts), mutex timeout, timing-safe token comparison, worker ready timeout, pending approval cleanup, configurable maxWorkers, worker pool health reporting.
Wave 7: Review Hardening + Type Safety + Observability — Exhaustive switch enforcement, Zod↔TypeScript type assertions, ALLOWED_FRAME_TYPES module extraction, safeSend wrapper, worker config minimization, WS schema tightening (content max, lastMessageId max), bounded health arrays, WorkerManager.getStats(), safeCompare JSDoc. CLI DX: credentials check, access create, access rotate, status, preflight, token rotation.
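As an illustration of the timing-safe token comparison mentioned in Wave 6, here is a minimal sketch. The name `safeCompareSketch` and the choice to hash both inputs first are assumptions made for this example; the changelog does not describe how the harness's own `safeCompare` is implemented.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper (not the harness's safeCompare): compares two secrets
// without leaking, through timing, how many leading characters matched.
function safeCompareSketch(a: string, b: string): boolean {
  // Hash both inputs so the buffers always have equal length;
  // timingSafeEqual throws when the lengths differ.
  const digestA = createHash("sha256").update(a, "utf8").digest();
  const digestB = createHash("sha256").update(b, "utf8").digest();
  return timingSafeEqual(digestA, digestB);
}
```

Hashing first is one common way to avoid both the length-mismatch exception and a length side channel; a plain `===` on strings can return early at the first differing character.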
- CLI subcommands extracted to `src/cli/` modules (index.tsx reduced from 753 to 161 lines)
- `WsClientMessage` type derived from Zod schema (single source of truth, no bidirectional assertion)
- Zero bare `ws.send(JSON.stringify(...))` calls in serve.ts; all use `safeSend`
- `isBufferableFrame()` runtime type guard replaces `as unknown as` casts in WS relay
- Security narrative documentation for external audit (`docs/security.md`)
- Architecture docs updated for all new modules
HIGH (3):
- Dispatcher fall-through: converted independent `if` blocks to `if/else if` chain with `process.exit(0)` safety nets
- Frame allowlist-output: `sanitizeFrame()` constructs new objects with only known fields before relaying IPC frames to WebSocket; strips extra properties from compromised workers
- Removed `as any` casts in credential grant iteration; uses Zod-inferred `CredentialGrant` type from manifest.ts
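The allowlist-output idea behind `sanitizeFrame()` can be sketched roughly as follows. The frame types and field names here (`text`, `done`, `content`, `sessionId`) are hypothetical, since this changelog does not list the real IPC frame shapes; the point is only that the relay builds a fresh object from known fields instead of forwarding the worker's object.

```typescript
// Hypothetical frame shapes, for illustration only.
type SafeFrame =
  | { type: "text"; content: string }
  | { type: "done"; sessionId: string };

// Builds a new object containing only known, type-checked fields.
// Anything else a (possibly compromised) worker attached is dropped.
function sanitizeFrameSketch(raw: unknown): SafeFrame | null {
  if (typeof raw !== "object" || raw === null) return null;
  const frame = raw as Record<string, unknown>;
  switch (frame.type) {
    case "text":
      return typeof frame.content === "string"
        ? { type: "text", content: frame.content }
        : null;
    case "done":
      return typeof frame.sessionId === "string"
        ? { type: "done", sessionId: frame.sessionId }
        : null;
    default:
      // Unknown frame types are rejected rather than relayed.
      return null;
  }
}
```

Constructing the output explicitly, rather than deleting unwanted keys from the input, means a newly invented property is excluded by default.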
MEDIUM (7):
- `safeSend` typed as `WsServerMessage`; protocol drift caught at compile time
- New `WsWarning`, `WsPong` types and `retryAfter?` on `WsError`
- `streamToStdout` uses `extractSdkEvent()` instead of raw `(msg as any).event` casts
- `safeClose()` wrapper for `ws.close()` at all call sites (auth failure, rate limit, idle timeout, shutdown)
- Unknown subcommands print usage error instead of silently launching TUI
- `--agents` flag warns on wildcard default (least-privilege)
- docs/security-model.md updated with Layers 9-10, process isolation details
LOW (6):
- `isBufferableFrame` moved to ipc-protocol.ts (co-located with `ALLOWED_FRAME_TYPES`)
- `access create --name` validated with `validateName` at creation time
- Dead `buildOptions` import removed from preflight.ts
- CLI subcommand examples added to CLAUDE.md
- `retryAfter?` added to `WsError` (covered by T05)
- Shared CLI context type deferred (not actionable at 10 modules)
The harness now has a dual-interface architecture: the original terminal TUI for single-user iteration, and a web UI for multi-user remote access. Both share the same agent runtime, tools, and configuration.
Phase 1: Frontmatter + Tool Filtering
- IDENTITY.md files now support YAML frontmatter for agent metadata (name, description, icon, tags, starters, access control)
- Per-agent tool filtering via `tools.allow` / `tools.deny` in frontmatter
- Zod-validated `AgentManifest` type; `--list-agents` shows rich metadata
- 55 unit tests covering frontmatter parsing, tool filtering, and agent context
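A rough sketch of how per-agent `tools.allow` / `tools.deny` filtering could behave. The precedence rules are assumptions made for this example (a missing allow list means every tool is allowed, and deny always wins); the changelog does not state the harness's actual semantics, and `ToolPolicy` and `filterToolsSketch` are hypothetical names.

```typescript
// Hypothetical shape of the `tools` frontmatter block.
interface ToolPolicy {
  allow?: string[];
  deny?: string[];
}

// Assumption: an absent allow list permits all tools, and deny takes precedence.
function filterToolsSketch(available: string[], policy: ToolPolicy): string[] {
  const deny = new Set(policy.deny ?? []);
  const allow = policy.allow ? new Set(policy.allow) : null;
  return available.filter(
    (name) => !deny.has(name) && (allow === null || allow.has(name)),
  );
}

// Illustrative tool names; "shell" is a placeholder.
const exampleTools = ["shell", "web_fetch", "current_time"];
const filteredTools = filterToolsSketch(exampleTools, {
  allow: ["shell", "web_fetch"],
  deny: ["shell"],
});
console.log(filteredTools); // [ 'web_fetch' ]
```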
Phase 2: Serve Mode Backend
- `--serve` starts a Fastify HTTP/WebSocket server (default port 3200)
- REST API: agent roster, session CRUD, usage tracking
- WebSocket: real-time streaming (text tokens, thinking tokens, tool calls, sub-agent progress)
- Token-based authentication via `~/.mastersof-ai/access.yaml` (SHA-256 hashed tokens)
- Per-user session isolation and message persistence
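To show what storing SHA-256 hashed tokens means in practice, here is a minimal sketch. The in-memory map, the user name, and the function names are hypothetical stand-ins; the real layout of `access.yaml` is not described in this changelog.

```typescript
import { createHash } from "node:crypto";

// Only the digest is stored, so a leaked access file does not reveal raw tokens.
const hashToken = (token: string): string =>
  createHash("sha256").update(token, "utf8").digest("hex");

// Hypothetical stand-in for the parsed access file: digest -> user name.
const accessByHash = new Map<string, string>([
  [hashToken("example-token"), "alice"],
]);

// Hash the presented token and look the digest up; the raw token is never stored.
function authenticateSketch(presented: string): string | undefined {
  return accessByHash.get(hashToken(presented));
}
```

With this scheme the server can verify a token it was shown but cannot recover one it has forgotten, which is why rotation rather than retrieval is the recovery path.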
Phase 3: Web Frontend SPA
- React + Vite + Tailwind CSS + Radix UI frontend in `web/`
- Agent card grid, conversation sidebar, streaming chat panel
- Tool call display with collapsible blocks and approval flow
- @mention autocomplete for agent switching
- Dark mode, i18n (English + Portuguese), voice input (Web Speech API)
- WebSocket reconnection with message replay
- Deploys to Cloudflare Pages (`wrangler pages deploy`)
Phase 4A: Security Foundation
- Mandatory remote sandbox for serve mode sessions
- Per-user workspace isolation (`workspace/{user}/`)
- Shell policy enforcement (requires both `tools.allow` + `sandbox.enforce`)
- Rate limiting (per-user message rate, connection limits, auth failure throttling)
- CORS origin validation (configurable allowlist, localhost in dev)
Phase 4B: Production Hardening
- Per-agent external MCP servers (URI and command-based, via `mcp` frontmatter field)
- Per-user token budgets with rolling windows (session, daily, monthly limits)
- Health monitoring (`/health` shallow + `/health/deep` admin endpoints)
- Hot reload: file watcher on agents dir, config, and access.yaml; broadcasts roster updates to connected clients
- LGPD compliance: data export, deletion, consent tracking, retention policies
- Graceful shutdown with 30s connection draining
- Structured logging with configurable levels
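The per-user token budgets with rolling windows listed above could work along these lines. This is a sketch under stated assumptions: the class name, the in-memory event list, and the explicit `now` parameter are all hypothetical, and the harness's real accounting is not described here.

```typescript
// Hypothetical rolling-window budget: usage older than windowMs no longer counts.
class RollingBudgetSketch {
  private events: { at: number; tokens: number }[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Records the usage and returns true if it fits in the current window;
  // otherwise returns false and records nothing.
  tryConsume(tokens: number, now: number): boolean {
    // Drop events that have aged out of the window.
    this.events = this.events.filter((e) => now - e.at < this.windowMs);
    const used = this.events.reduce((sum, e) => sum + e.tokens, 0);
    if (used + tokens > this.limit) return false;
    this.events.push({ at: now, tokens });
    return true;
  }
}
```

Passing `now` explicitly keeps the sketch deterministic to test; a daily or monthly limit would be the same structure with a longer `windowMs`.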
- A2A server module (`src/a2a/`) with Agent Card generation from IDENTITY.md
- `--card` flag outputs Agent Card JSON derived from identity H2 sections
- A2A client tools: `a2a_discover`, `a2a_call`, `a2a_list` for calling remote A2A agents
- AgentExecutor bridge connecting A2A task lifecycle to harness `sendMessage()` flow
- 17 code review fixes: CORS security, state management, component quality
- Path traversal hardening, error handling improvements, cancellation support
- SDK upgraded to @anthropic-ai/claude-agent-sdk ^0.2.76
- Fixed `model_query` text extraction from response blocks
- npm audit fix + dependency upgrades
- SDK upgraded to ^0.2.75; switched to Opus 4.6 with 1M context window
- Default effort level changed from `high` to `max`
- New TUI commands: `/help`, `/effort [level]`, `/model [model-id]`
- Error classification system with actionable diagnostics for API/auth failures
- Structured error categories surfaced to user with fix instructions
- Security audit: path traversal fixes, shell injection hardening, deprecated thinking config removal
- SDK upgraded to ^0.2.69; InstructionsLoaded hook, sub-agent token tracking
- Smart `web_fetch`: CSS error suppression, query-based content extraction
- Pre-flight auth check before starting agent (validates credentials early)
- Sandbox changed to opt-in (`--sandbox`) instead of default-on
- Linux install documentation added
- Home/End/Delete key fixes in TUI
- Prepared repo for GitHub publish as `@mastersof-ai/harness`
- SDK upgraded to 0.2.62; MCP tool search auto-enabled for large tool sets
- Memory system documentation added
- Auto-resume most recent session for bare `--resume` flag
- Biome lint/format fixes across codebase
- Agent workspace directories (`~/.mastersof-ai/agents/<name>/workspace/`)
- Bubblewrap sandbox with per-agent configuration
- `current_time` tool
- Context bar visibility improvements for dark terminals