121 changes: 121 additions & 0 deletions services/agent/README.md
@@ -0,0 +1,121 @@
# Agent runner (TypeScript)

The Node side of the agent workflow service. It runs the actual agent loop and serves one
contract: a JSON request in, a structured result out. The Python service
(`services/oss/src/agent/`) decides *what* to run (config, tools, secrets, trace) and calls
in here; this package *runs* it. It lives in Node because the harnesses (Pi, Claude Code,
rivet's `sandbox-agent`) are Node libraries with no Python SDK.

## How it is invoked

Two entrypoints, same `/run` contract (see `src/protocol.ts`):

- **`src/cli.ts`** — one JSON request on stdin, one result on stdout. The Python
SDK adapters use this subprocess transport when `AGENTA_AGENT_PI_URL` is unset. stdout is
the result channel only; logs go to stderr.
- **`src/server.ts`** — the same thing as a long-lived HTTP server on `:8765`
(`GET /health`, `POST /run`). This is the dockerized agent runner sidecar the Python SDK
adapters call over HTTP when `AGENTA_AGENT_PI_URL` points at it. The dev image
(`docker/Dockerfile.dev`) runs `tsx watch src/server.ts`.

Both route to an engine by the request's `backend` field.
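The transport choice described above can be sketched as follows. This is a minimal illustration, not code from the repository: the README places this decision in the Python SDK adapters, and the `Transport` type and `chooseTransport` helper are hypothetical names used only to show the rule (URL set: HTTP sidecar; URL unset: subprocess).

```typescript
// Hypothetical sketch: mirrors the rule the README describes for the Python adapters.
// Neither `Transport` nor `chooseTransport` exists in the repository.
type Transport =
  | { kind: "http"; healthUrl: string; runUrl: string }
  | { kind: "subprocess" };

function chooseTransport(env: Record<string, string | undefined>): Transport {
  const base = env.AGENTA_AGENT_PI_URL?.trim();
  // Unset (or blank) URL: fall back to the stdin/stdout CLI transport.
  if (!base) return { kind: "subprocess" };
  // Strip trailing slashes so the endpoint paths join cleanly.
  const root = base.replace(/\/+$/, "");
  return { kind: "http", healthUrl: `${root}/health`, runUrl: `${root}/run` };
}
```

Treating a blank value the same as an unset one is an assumption of this sketch; the README only says the subprocess transport is used when the variable is unset.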

## Layout (`src/`)

```
src/
  cli.ts              entrypoint: stdin/stdout (subprocess transport)
  server.ts           entrypoint: HTTP sidecar on :8765
  protocol.ts         the /run wire contract (request, result, events, capabilities)
  engines/
    pi.ts             engine: drive the Pi SDK in-process
    rivet.ts          engine: drive a harness over ACP via a rivet sandbox-agent daemon
  tracing/
    otel.ts           turn a run into OpenTelemetry spans nested under /invoke
  tools/
    callback.ts       the one /tools/call HTTP client
    code.ts           execute resolved code tools in a scoped subprocess
    dispatch.ts       dispatch resolved tools by executor kind
    mcp-bridge.ts     build the MCP server config that exposes tools to a harness
    mcp-server.ts     the stdio MCP server itself (launched per session by the daemon)
  extensions/
    agenta.ts         the Pi extension (tracing + tools), bundled into dist/ for Pi to load
```

## Engines

- **`pi`** (`engines/pi.ts`) — drives the Pi SDK directly in-process.
- **`rivet`** (`engines/rivet.ts`) — drives any harness (`pi`, `claude`) over the Agent
Client Protocol through a rivet `sandbox-agent` daemon, either local or in a Daytona
sandbox. This is the default on the platform.

The engine is a deployment choice (`backend` on the wire / `AGENT_BACKEND` env), not a
harness. Harness choice (`pi`, `claude`, or experimental `agenta`) and sandbox (`local` or
`daytona`, where supported) are per-run config the Python service sends.
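As a rough illustration of engine selection, the sketch below resolves a backend name from the request and the environment. The `resolveBackend` helper is hypothetical, and two details are assumptions rather than facts from this README: that the wire field takes precedence over `AGENT_BACKEND`, and that `rivet` is the fallback when neither is set.

```typescript
// Hypothetical sketch of backend resolution; not the repository's implementation.
type Backend = "pi" | "rivet";

interface BackendRequest {
  backend?: string; // the `backend` field on the wire
}

function resolveBackend(
  req: BackendRequest,
  env: Record<string, string | undefined>,
): Backend {
  // Assumed precedence: request field, then AGENT_BACKEND, then "rivet".
  const raw = req.backend ?? env.AGENT_BACKEND ?? "rivet";
  if (raw === "pi" || raw === "rivet") return raw;
  throw new Error(`unknown backend: ${raw}`);
}
```

Rejecting unknown names outright, rather than silently falling back, keeps a misconfigured deployment visible; whether the real runner does this is not stated here.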

## Result

```json
{
  "ok": true,
  "output": "Rome",
  "messages": [{ "role": "assistant", "content": "Rome" }],
  "events": [{ "type": "message", "text": "Rome" }, { "type": "done" }],
  "usage": { "input": 1297, "output": 5, "total": 1302, "cost": 0.0066 },
  "stopReason": "end_turn",
  "capabilities": { "mcpTools": false, "images": true, "...": "..." },
  "sessionId": "...",
  "model": "openai-codex/gpt-5.5",
  "traceId": "..."
}
```

`runRivet` probes the harness's capabilities and branches on them (for example, tools go
over MCP only when the harness advertises `mcpTools`); usage and the structured event log
come back on every run.
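For readers consuming this result, the shape can be sketched as a TypeScript type inferred from the sample above. The authoritative definitions live in `src/protocol.ts`; the interface and helper names here are hypothetical, and any field not shown in the sample is deliberately left out.

```typescript
// Types inferred from the sample result only; src/protocol.ts is the real contract.
interface RunUsage {
  input: number;
  output: number;
  total: number;
  cost: number;
}

interface RunEvent {
  type: string;
  text?: string;
}

interface RunResult {
  ok: boolean;
  output: string;
  messages: { role: string; content: string }[];
  events: RunEvent[];
  usage: RunUsage;
  stopReason: string;
  capabilities: Record<string, unknown>;
  sessionId: string;
  model: string;
  traceId: string;
}

// Concatenate the text of all "message" events in the event log.
function messageText(result: RunResult): string {
  return result.events
    .filter((e) => e.type === "message")
    .map((e) => e.text ?? "")
    .join("");
}

// In the sample, total equals input + output; this checks that relationship.
function usageAddsUp(usage: RunUsage): boolean {
  return usage.total === usage.input + usage.output;
}
```

Both helpers hold for the sample above (`"Rome"`, and 1297 + 5 = 1302), but the README does not promise either relationship for every run.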

## Tracing

When the request carries a `trace` block, the run is exported to Agenta as OpenTelemetry
spans nested under the caller's `/invoke` span. The Pi path self-instruments via the
bundled extension (`extensions/agenta.ts`); other harnesses are traced from the rivet ACP
event stream (`tracing/otel.ts`). The Python `tracing` module fills `trace` in from the
live workflow span.

## Tools

Tools are resolved in the Python backend and arrive on the request as `customTools` plus a
`toolCallback`. Delivery is capability-routed: the Pi extension registers them natively;
other harnesses get them over MCP through `tools/mcp-bridge.ts` + `tools/mcp-server.ts`.
Either way each call POSTs back to Agenta's `/tools/call` (`tools/callback.ts`), so the
provider key and connection auth stay server-side.
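The capability-routed delivery described above might look like the sketch below. `chooseToolDelivery` and the `ToolDelivery` labels are hypothetical names. The `"unavailable"` branch is an assumption: this README does not say what happens when a non-Pi harness does not advertise `mcpTools`.

```typescript
// Hypothetical sketch of capability-routed tool delivery; not the repository's code.
type ToolDelivery = "pi-extension" | "mcp" | "unavailable" | "none";

interface HarnessCapabilities {
  mcpTools?: boolean;
}

function chooseToolDelivery(
  harness: string,
  capabilities: HarnessCapabilities,
  toolCount: number,
): ToolDelivery {
  // Nothing to deliver when the request carries no custom tools.
  if (toolCount === 0) return "none";
  // Pi registers tools natively through the bundled extension.
  if (harness === "pi") return "pi-extension";
  // Other harnesses get tools over MCP only if they advertise support (assumed fallback otherwise).
  return capabilities.mcpTools === true ? "mcp" : "unavailable";
}
```

Whichever route is taken, the README states that each call still POSTs back to `/tools/call`, so the routing decision changes only how the harness sees the tool, not where it executes.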

## The extension bundle

`scripts/build-extension.mjs` esbuild-bundles `src/extensions/agenta.ts` into one
self-contained `dist/extensions/agenta.js` that Pi can load anywhere (host, the sidecar, a
Daytona snapshot). The dev image bakes it; rebuild after editing the extension or the
tracer:

```bash
pnpm run build:extension
```

## Auth

Provider keys arrive as `request.secrets` (resolved from the project vault) or fall back to
the harness's own login: Pi reads `~/.pi/agent/auth.json` (`pnpm exec pi` then `/login`),
Claude Code reads `~/.claude`. Set `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` to override.

## config/

`config/AGENTS.md` and `config/agent.json` are a fallback "hello-world" agent, used only
when a request arrives with no config. In practice the playground always sends the agent
revision's config, so these are rarely hit.

## Local use

```bash
pnpm install
echo '{"backend":"pi","messages":[{"role":"user","content":"Hi"}]}' | pnpm run run:cli
```
7 changes: 7 additions & 0 deletions services/agent/config/AGENTS.md
@@ -0,0 +1,7 @@
# Hello-world agent

You are a friendly hello-world agent running on the Agenta agent service.

- Greet the user warmly.
- Answer the user's message in one or two short sentences.
- Do not use tools. Keep replies plain text.
4 changes: 4 additions & 0 deletions services/agent/config/agent.json
@@ -0,0 +1,4 @@
{
  "model": "gpt-5.5",
  "tools": []
}
55 changes: 55 additions & 0 deletions services/agent/docker/Dockerfile
@@ -0,0 +1,55 @@
# Agent runner sidecar (sandbox-agent server), production image.
#
# Runs the TypeScript runner (src/server.ts) as a long-lived HTTP server on :8765.
# The Python agent service calls it in-network. Unlike Dockerfile.dev there is no
# `tsx watch` and no bind mount: the source is baked in.
#
# Licensing posture (see docker/README.md):
# - Pi (@earendil-works/pi-coding-agent, MIT) is baked via the npm dependencies.
# - Claude Code is proprietary (Anthropic Commercial Terms). It is NEVER baked into
# this image. The sandbox-agent daemon installs it at runtime from Anthropic over
# HTTPS (the reason ca-certificates is installed). That keeps Anthropic as the
# distributor, the only compliant path for an image we build and ship.
# - No credential is baked: no API key, no OAuth login. Auth is injected at runtime
# (ANTHROPIC_API_KEY / request secrets; OAuth self-host is a mounted opt-in only).

FROM node:24-slim

WORKDIR /app

# CA certificates: the sandbox-agent daemon (Rust) downloads harness CLIs (e.g. Claude
# Code) over HTTPS using the system trust store, which node:*-slim omits — without this
# the daemon's `install-agent claude` fails TLS verification. git lets npm/installers
# fetch git deps.
RUN apt-get update \
&& apt-get install -y --no-install-recommends ca-certificates git \
&& rm -rf /var/lib/apt/lists/*

RUN corepack enable

# Install deps as a cached layer (manifest + lockfile only). The full dependency set is
# installed (not --prod): the runtime uses `tsx` and the extension build uses `esbuild`,
# both devDependencies.
COPY package.json pnpm-lock.yaml ./
RUN pnpm install --frozen-lockfile

# Bake the source (no bind mount in production).
COPY tsconfig.json ./
COPY scripts ./scripts
COPY src ./src
COPY config ./config
COPY skills ./skills

# Bundle the Agenta Pi extension (tracing + tools) into dist/. runSandboxAgent installs
# this baked copy into Pi's agent dir on every run. Rebuild the image after editing
# src/extensions/agenta.ts or the tracer.
RUN pnpm run build:extension

ENV NODE_ENV=production \
PORT=8765

EXPOSE 8765

# Call the local tsx binary directly to avoid pnpm/corepack HOME writes when the
# container runs as a non-root host uid.
CMD ["node_modules/.bin/tsx", "src/server.ts"]
Comment on lines +16 to +55


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Run the production container as a non-root user.

Lines 16-55 currently run the sidecar as root (no `USER` is set), which weakens container isolation for a network-exposed service.

Suggested fix
 FROM node:24-slim
 
 WORKDIR /app
@@
 RUN pnpm run build:extension
 
 ENV NODE_ENV=production \
     PORT=8765
 
+RUN groupadd --system app && useradd --system --gid app --create-home app \
+    && chown -R app:app /app
+USER app
+
 EXPOSE 8765
@@
 CMD ["node_modules/.bin/tsx", "src/server.ts"]

Source: Linters/SAST tools

41 changes: 41 additions & 0 deletions services/agent/docker/Dockerfile.dev
@@ -0,0 +1,41 @@
# Pi harness sidecar (WP-2), dev image.
#
# Runs the TypeScript Pi wrapper as an HTTP server. The Python agent service calls
# it in-network. Source is bind-mounted in dev so `tsx watch` hot-reloads; node_modules
# stays baked into the image. Build context is services/agent.

FROM node:24-slim

WORKDIR /app

# CA certificates: the rivet daemon (Rust) downloads harness CLIs (e.g. Claude Code) over
# HTTPS using the system trust store, which node:*-slim omits — without this the daemon's
# `install-agent claude` fails TLS verification. git lets npm/installers fetch git deps.
RUN apt-get update \
&& apt-get install -y --no-install-recommends ca-certificates git \
&& rm -rf /var/lib/apt/lists/*

RUN corepack enable

# Install deps as a cached layer (manifest + lockfile only).
COPY package.json pnpm-lock.yaml ./
RUN pnpm install --frozen-lockfile

# Fallback copy for non-mounted runs; in dev these are bind-mounted over.
COPY tsconfig.json ./
COPY scripts ./scripts
COPY src ./src

# Bundle the Agenta Pi extension (tracing + tools) into dist/. dist/ is NOT bind-mounted
# in dev, so this baked copy is what runRivet installs into Pi's agent dir. Rebuild the
# image after editing src/extensions/agenta.ts or src/tracing/otel.ts.
RUN pnpm run build:extension

ENV NODE_ENV=development \
PORT=8765

EXPOSE 8765

# Call the local tsx binary directly to avoid pnpm/corepack HOME writes when the
# container runs as a non-root host uid.
CMD ["node_modules/.bin/tsx", "watch", "src/server.ts"]
Comment on lines +7 to +41


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use a non-root runtime user in the dev image too.

Lines 7-41 leave the process running as root. Switching to a non-root user keeps dev closer to prod hardening and avoids unnecessary privilege.

Suggested fix
 FROM node:24-slim
@@
 RUN pnpm run build:extension
 
 ENV NODE_ENV=development \
     PORT=8765
 
+RUN groupadd --system app && useradd --system --gid app --create-home app \
+    && chown -R app:app /app
+USER app
+
 EXPOSE 8765
@@
 CMD ["node_modules/.bin/tsx", "watch", "src/server.ts"]

Source: Linters/SAST tools

66 changes: 66 additions & 0 deletions services/agent/docker/README.md
@@ -0,0 +1,66 @@
# Agent sidecar images

Images for the agent runner sidecar (the `sandbox-agent server` runtime in
`services/agent/src/server.ts`). The Python service calls it in-network at
`:8765`.

- `Dockerfile.dev` — dev image. `tsx watch`, source bind-mounted, hot reload.
- `Dockerfile` — production image. Source baked in, no watcher.

## Licensing posture (read before changing any image or build recipe)

The rule that shapes every image here:

> **We ship build recipes, not Claude-containing images, and we never bake a
> credential into any image.**

Why:

- **Pi** (`@earendil-works/pi-coding-agent`) is MIT. We bake it freely via the npm
dependencies, in every image and snapshot.
- **Claude Code** is proprietary (© Anthropic PBC, governed by Anthropic's
[Commercial Terms](https://www.anthropic.com/legal/commercial-terms);
[legal & compliance](https://code.claude.com/docs/en/legal-and-compliance)). The
Commercial Terms grant a usage license only. They do not grant any right to
redistribute, resell, sublicense, or repackage the Services. So an image **we
build and distribute must not contain Claude Code.**
- Claude Code is installed **from Anthropic** (`npm install -g
@anthropic-ai/claude-code`, `https://claude.ai/install.sh`, or the daemon's
`install-agent claude`). That keeps Anthropic as the distributor, which is the
permitted path. The production sidecar does this at runtime; a snapshot we build
for our own use does it at build time.

## Authentication

Auth is injected at runtime, never baked into a layer.

- **API key (default, and the only option for cloud / multi-tenant).** Set
`ANTHROPIC_API_KEY` (or pass provider keys as request secrets from the vault).
Anthropic directs products and services that interact with Claude to use API key
auth, so this is the path for any Agenta-orchestrated run that serves users.
- **OAuth subscription (self-host opt-in only).** An individual operator may mount
their own Claude login (e.g. `~/.claude`) into the container and run with their
own subscription. This is for personal, individual use of Claude Code, never for
serving other users, and it is the operator's responsibility. Anthropic restricts
Free/Pro/Max OAuth to first-party use and forbids third parties routing requests
through it (enforced since 2026-03). Cloud and multi-tenant deployments must stay
API-key only.

We never bake an OAuth login or an API key into an image.

## Build recipes (two paths)

- **Cloud / Daytona (API key).** The Daytona snapshot recipe bakes Pi. Agenta Cloud
builds and uses its own snapshot internally; self-hosters run the same recipe
against their own Daytona account. We ship the build script (the recipe), not the
built snapshot, so we never distribute a Claude-containing artifact. Snapshot
builder: `docs/design/agent-workflows/scratch/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py`.
Today it bases on rivet's `-full` image, which already bundles Claude. That is
compliant under the recipe-not-image model. **Cleaner-provenance follow-up
(needs a live Daytona build to verify):** base on a daemon-only rivet image and
install Claude from Anthropic at build, so the snapshot's Claude comes straight
from Anthropic rather than from a third party's bundled image. Relocation of the
builder into this folder is a follow-up.
- **Self-host (API key, OAuth optional).** Build the production `Dockerfile` (it
bakes neither Claude nor a credential), then supply auth at runtime: an
`ANTHROPIC_API_KEY` env var, or, for individual use, a mounted OAuth login dir.
30 changes: 30 additions & 0 deletions services/agent/scripts/build-extension.mjs
@@ -0,0 +1,30 @@
/**
* Bundle the Agenta Pi extension into one self-contained file so its OpenTelemetry deps
* resolve wherever Pi loads it (host, docker sidecar, Daytona snapshot). Pi only accepts
* `.ts`/`.js` extension files, so we emit `.js` (ESM) with a default export.
*
* Run: pnpm run build:extension -> dist/extensions/agenta.js
*/
import { build } from "esbuild";
import { dirname, join } from "node:path";
import { fileURLToPath } from "node:url";

const root = join(dirname(fileURLToPath(import.meta.url)), "..");

await build({
  entryPoints: [join(root, "src/extensions/agenta.ts")],
  outfile: join(root, "dist/extensions/agenta.js"),
  bundle: true,
  platform: "node",
  format: "esm",
  target: "node20",
  // Pi provides the ExtensionAPI at load time; never bundle the harness SDK.
  external: ["@earendil-works/pi-coding-agent"],
  banner: {
    // protobufjs and some deps expect CommonJS globals under ESM; shim them.
    js: "import{createRequire as __cr}from'node:module';const require=__cr(import.meta.url);",
  },
  logLevel: "info",
});

process.stderr.write("[build-extension] wrote dist/extensions/agenta.js\n");
21 changes: 21 additions & 0 deletions services/agent/skills/agenta-getting-started/SKILL.md
@@ -0,0 +1,21 @@
---
name: agenta-getting-started
description: Baseline guidance for agents running on the Agenta platform. Use at the start of a task to recall how to work with the tools and skills Agenta provides and how to report results clearly.
---

# Agenta getting started

This is a placeholder Agenta skill that ships with the `AgentaHarness`. It proves the
forced-skill path end to end; replace its content with real Agenta guidance.

## When to use

Read this when you begin a task and want a reminder of the Agenta conventions below.

## Conventions

- Prefer the provided tools and skills over guessing; call a tool when one fits.
- When another skill matches the task, read its `SKILL.md` fully before acting.
- Keep answers grounded in what the tools and skills actually return. Do not fabricate
results or tool output.
- Be concise. State what you did, what it returned, and what is left.