framework: strands-python / strands-node — persistent subprocess runner (stdio JSON-RPC)

## Summary

Add `framework: strands` as a known value in `forge.yaml`. At startup, Forge spawns a persistent Python subprocess (the Strands runner) and routes A2A `tasks/send` to it via stdio JSON-RPC. Forge keeps ownership of the trust shell (auth, audit, egress, guardrails, OTel tracer install, platform policy), while Strands owns the executor loop, model providers, tool definitions, and multi-agent patterns.

Builds on the #182/#183 plumbing (W3C trace propagation + binary skill runtime + curated OTEL_* passthrough). Same env-injection pattern, just at process startup instead of per-skill-call.

## Design

### Transport: stdio JSON-RPC (MCP-style)

Newline-delimited JSON-RPC 2.0 over the runner's stdin/stdout. Same framing Forge already speaks for MCP stdio servers — no new ports, no UDS path management, supervisor reuses MCP's client-side code patterns.

Methods Forge calls on the runner:

| Method | Purpose |
|---|---|
| `invoke` | One A2A `tasks/send` → one invoke. Params carry `prompt`, `session_id`, and a per-request `traceparent` header (env injection only fires at fork time, so it cannot carry per-request trace context). |
| `healthz` | Periodic liveness probe. |
| `shutdown` | Graceful drain before SIGTERM. |

Server-pushed notifications (runner → Forge):

| Method | Purpose |
|---|---|
| `ready` | Signal that the runner has built its Strands Agent and is accepting invokes. Forge blocks A2A from listening until this lands. |
| `audit` | Audit event from inside the Strands loop (LLM call, tool call, guardrail decision). Forge translates it to an AuditEvent and emits it through its existing sink stack. |
| `stream_token` | SSE-equivalent token delta for `tasks/sendSubscribe`. |
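
The framing described above can be sketched in a few lines. This is a minimal illustration, not the Forge implementation: the helper names `encode_invoke` and `classify` are hypothetical, and the integer request ids are an assumption (any unique id would do). It shows one message per line, and how a reader can tell a runner notification (has `method`, no `id`) from a response.

```python
import json
from itertools import count

_ids = count(1)  # assumption: simple increasing integer ids


def encode_invoke(prompt, session_id, traceparent):
    """Build one newline-terminated JSON-RPC 2.0 `invoke` request."""
    msg = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "invoke",
        "params": {
            "prompt": prompt,
            "session_id": session_id,
            "traceparent": traceparent,
        },
    }
    # json.dumps escapes newlines inside strings, so one message is one line.
    return json.dumps(msg) + "\n"


def classify(line):
    """Return ("notification" | "response", message) for one line from the runner."""
    msg = json.loads(line)
    if "method" in msg and "id" not in msg:
        return "notification", msg
    return "response", msg
```

Because the encoder never emits a raw newline inside a message, a prompt containing line breaks still occupies exactly one frame.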

### Lifecycle (Forge supervises)

- **Startup**: spawn with full env (HTTP_PROXY, OTEL_* curated allowlist, FORGE_AUDIT_SOCKET, FORGE_AGENT_ID, skill-declared secrets). Wait up to `framework.strands.startup_timeout` (default 30s) for the `ready` notification.
- **Per-request**: open `a2a.<method>` → `strands.bridge.invoke` span. Inject `traceparent` into the invoke message params; the runner's OTel SDK extracts it on receive. Strands' own `llm.completion` / `tool.<name>` spans nest correctly.
- **Audit**: runner emits notifications; Forge stamps `correlation_id` + `task_id` + `seq` from the active invoke and emits through `EmitFromContext` so the audit stream looks identical to a Go-runtime invocation.
- **Health**: periodic `healthz` (default 30s); a missed probe surfaces in `agent_card_published` if the runner is still down at hot-reload time.
- **Crash recovery**: exponential backoff restart (100ms → 5s cap, mirrors the audit socket sink pattern). Emit a `framework_runner_crashed` audit event. In-flight invokes fail fast with a typed error.
- **Hot reload**: file-watcher fires on `agent.py` change → drain in-flight → restart → wait for `ready` → resume.
- **Shutdown**: SIGTERM, wait for drain up to `framework.strands.drain_timeout` (default 10s), then SIGKILL.
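
The audit stamping step in the list above can be sketched as follows. The real bridge would live in Go (`strands_runner_audit_bridge.go`); this Python version only illustrates the logic. `InvokeContext` and its `stamp` method are hypothetical names, and the rule that Forge-owned fields override anything the runner sent is an assumption rather than something the issue specifies.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class InvokeContext:
    """Per-invoke state held on the Forge side (hypothetical shape)."""

    correlation_id: str
    task_id: str
    _seq: count = field(default_factory=lambda: count(1))

    def stamp(self, notification):
        """Merge runner-supplied audit params with Forge-owned identifiers."""
        event = dict(notification.get("params", {}))
        # Assumption: Forge-owned fields win over anything the runner sent.
        event.update(
            correlation_id=self.correlation_id,
            task_id=self.task_id,
            seq=next(self._seq),
        )
        return event
```

Keeping the sequence counter on the per-invoke context means two concurrent invokes would each get their own `seq` series.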

### forge.yaml

```yaml
framework: strands
strands:
  runner_command: ["python3", "-m", "forge_strands_runner"]
  startup_timeout: 30s
  drain_timeout: 10s
  health_interval: 30s
  restart_policy:
    max_attempts: 5            # 0 = forever
    backoff_initial: 100ms
    backoff_max: 5s
  crash_audit: true
```
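
To make the `restart_policy` values concrete, here is a small sketch of the delay schedule they would produce. The function name `backoff_delays` is hypothetical, and the doubling factor is an assumption: the issue says "exponential" but does not fix the multiplier.

```python
def backoff_delays(max_attempts=5, initial_ms=100, max_ms=5000, factor=2):
    """Yield restart delays in milliseconds; max_attempts=0 means unbounded."""
    delay, attempt = initial_ms, 0
    while max_attempts == 0 or attempt < max_attempts:
        yield min(delay, max_ms)
        # Grow the delay but never past the configured cap.
        delay = min(delay * factor, max_ms)
        attempt += 1
```

With the defaults shown in the config, `list(backoff_delays())` gives `[100, 200, 400, 800, 1600]`; the 5s cap only comes into play from the seventh attempt onward.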

### Files

```
forge-cli/runtime/
  framework_runner.go               # process supervisor (lifecycle + JSON-RPC framing)
  framework_runner_test.go          # spawn / ready / crash / restart / drain
  strands_runner_invoke.go          # A2A tasks/send → invoke translation + response shaping
  strands_runner_audit_bridge.go    # notification → AuditEvent translation
forge-cli/templates/strands/
  agent.py                          # operator's Strands Agent factory (skeleton)
  requirements.txt                  # strands-agents>=1.0, forge-strands-runner
  forge_strands_runner.py           # stdio JSON-RPC loop wrapping the operator's Agent
  Dockerfile                        # python:3.12-slim + forge Go binary
forge-core/catalog/
  frameworks.go                     # register "strands" as a known framework value
forge-core/types/config.go          # FrameworkStrands constant + StrandsConfig struct
```

### Python shim (`forge_strands_runner`)

Ship as a PyPI package so operators don't write the JSON-RPC plumbing. They write `agent.py` with a `build_agent()` factory; the shim handles the ready/invoke/healthz/audit/shutdown messages and the OTel context extraction.

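```python
# forge_strands_runner.py: shim Forge owns
import json
import sys

from opentelemetry import trace
from opentelemetry.propagate import extract

from agent import build_agent  # operator-supplied factory

tracer = trace.get_tracer("forge.strands.runner")
agent = build_agent()


def write(obj):
    # stdout is the JSON-RPC channel: one JSON object per line, flushed immediately.
    print(json.dumps(obj), flush=True)


def reply(req, **body):
    # Respond only when the request carries an id (notifications have none).
    if "id" in req:
        write({"jsonrpc": "2.0", "id": req["id"], **body})


# Tell Forge the agent is built and the loop is about to accept invokes.
write({"jsonrpc": "2.0", "method": "ready"})

for line in sys.stdin:
    if not line.strip():
        continue  # tolerate blank lines between messages
    req = json.loads(line)
    method = req.get("method")
    params = req.get("params") or {}
    if method == "invoke":
        # Rebuild the parent trace context from the per-request traceparent.
        ctx = extract({"traceparent": params.get("traceparent", "")})
        try:
            with tracer.start_as_current_span("strands.invoke", context=ctx):
                result = agent(params["prompt"])
            reply(req, result={"text": str(result)})
        except Exception as exc:
            # Report the failure instead of killing the loop; -32000 is an assumed code.
            reply(req, error={"code": -32000, "message": str(exc)})
    elif method == "healthz":
        reply(req, result="ok")
    elif method == "shutdown":
        reply(req, result="draining")
        break
    else:
        # -32601 is the JSON-RPC 2.0 "Method not found" code.
        reply(req, error={"code": -32601, "message": f"unknown method: {method}"})
```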
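
For smoke-testing the shim before Strands is installed, a stand-in `agent.py` could look like the sketch below. `EchoAgent` is a hypothetical placeholder that mimics only the call shape the shim relies on (call with a prompt, convert the result with `str`); a real factory would return a Strands agent object instead.

```python
# agent.py: stand-in for exercising the shim without Strands installed.
class EchoAgent:
    """Hypothetical placeholder; not part of Strands."""

    def __call__(self, prompt):
        # Return something whose str() is the reply text, as the shim expects.
        return f"echo: {prompt}"


def build_agent():
    """Factory the shim imports via `from agent import build_agent`."""
    return EchoAgent()
```

This keeps the Phase 1 roundtrip testable with no model credentials or network access.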

## Phased rollout

This issue tracks the whole arc; PRs can land any one phase independently.

1. **Phase 1: Skeleton** (1-2 weeks). Supervisor + lifecycle + JSON-RPC framing + `invoke` roundtrip. `forge init --framework strands` scaffolds. No audit bridge, no guardrails. Validates the supervisor + happy path.
2. **Phase 2: Trace + audit bridge** (1 week). `strands.bridge.invoke` span. Per-request `traceparent` propagation. Notification → AuditEvent translation with `correlation_id` + `seq` stamping.
3. **Phase 3: Guardrails at the boundary** (1 week). Input gate before Forge sends `invoke`. Output gate when response comes back. Tool-call gate via runner notification round-trip.
4. **Phase 4: Container packaging + docs** (1-2 weeks). `forge package` emits a working image (python:3.12-slim base + Forge binary + Strands deps + operator agent.py). End-to-end `docs/frameworks/strands.md` walkthrough.
5. **Phase 5: Streaming** (optional). `stream_token` notifications wire to A2A `tasks/sendSubscribe` SSE responses.
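
The Phase 3 input and output gates can be pictured as a wrapper around the invoke call. This is a sketch under stated assumptions: `guarded_invoke`, `GuardrailBlocked`, and the boolean gate callables are hypothetical, and the tool-call gate (which needs a notification round-trip) is not covered here.

```python
class GuardrailBlocked(Exception):
    """Raised when a gate rejects a prompt or a response."""


def guarded_invoke(prompt, send_invoke, input_gate, output_gate):
    """Run the input gate, forward to the runner, then run the output gate."""
    if not input_gate(prompt):
        # Rejected prompts never reach the runner.
        raise GuardrailBlocked("input rejected before invoke")
    text = send_invoke(prompt)
    if not output_gate(text):
        raise GuardrailBlocked("output rejected after invoke")
    return text
```

Placing both checks on the Forge side of the boundary means they apply regardless of what runs inside the runner.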

## Open design questions

| Question | Default for v1 |
|---|---|
| Session model — Strands' (in-memory) or Forge's (file)? | Strands' in-memory. File-backed continuity is the operator's choice via Strands' Session abstraction. |
| Tool ownership — Strands tools only, or expose SKILL.md as Strands tools? | Strands tools only. SKILL.md bridge is a follow-up if operators ask for it. |
| LLM provider — Strands picks (Bedrock/Anthropic/OpenAI/LiteLLM), or Forge proxies? | Strands picks. Forge still tracks the chosen provider in forge.yaml for egress-allowlist auto-merge + security analysis. |
| Container shape — one image (Forge + Python venv) or two-container Pod? | One image. K8s sidecar pattern is a follow-up if scaling concerns surface. |

## What this does NOT do (out of scope)

- **Embedded Python via cgo**. Strands' dep tree (boto3, multiple OTel instrumentations, anthropic, openai...) makes cgo a pain. Subprocess is the practical answer.
- **Node / TypeScript runner**. Same supervisor pattern works for any subprocess agent runtime; do Node as a follow-up issue once Strands lands.
- **Multi-runner orchestration** (multiple Strands agents in one pod). One agent.py per Forge agent for v1.
- **Bridge from SKILL.md skills into Strands tools**. Out of scope for v1.

## Risk

Low-medium. The supervisor is a new code path; the risk is mitigated by the phased rollout and by the fact that the underlying JSON-RPC framing is the same one MCP-stdio already uses. The dual-language (Go/Python) coupling is confined to a single shim (forge_strands_runner.py). Operators on existing frameworks (custom, crewai, langchain) see no change.


framework: strands-python / strands-node — persistent subprocess runner (stdio JSON-RPC) #188
