42 changes: 30 additions & 12 deletions docs/rfcs/RFC-005-codex-code-cli-runtime.md
@@ -121,9 +121,14 @@ function normalizeRuntime(profileOrRuntime?: Profile | string): RuntimeName {
}
```

-### 3.3 Spawning the codex CLI (analogous to claude-code-cli)
+### 3.3 Spawning the codex CLI (analogous to claude-code-cli, TUI attached mode)

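-Add a \"spawn codex CLI\" block after the \"spawn claude CLI\" block near cli.ts:1640:
+Add a \"spawn codex CLI\" block after the \"spawn claude CLI\" block near cli.ts:1640.
+
+> **⚠️ Errata (2026-05-13, caught by 通信工程马 in step 1, verified by 通信龙)**:
+> An earlier draft used `["exec", ...]`, which was copied by mistake from the codex-sdk internal batch mode (`exec --experimental-json`) and conflicts with the intent of the §3.6 user workflow (the user attaches to the TUI).
+> **Correct**: the codex CLI with **no subcommand** runs in TUI mode, symmetric with `spawn("claude", ...)` in `claude-code-cli`. The node is an attached daemon: the user watches the codex TUI in the terminal and communicates with other agents directly.
+> §6.6 (session matters) is corrected to match: `codex resume <SESSION_ID>` is a **top-level subcommand** (not a `--resume <id>` flag).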

```ts
// (sketch, not the final implementation code)
@@ -137,11 +142,18 @@ if (runtime === "codex-code-cli") {
env[k] = v.replace(/^~/, home);
}

-const codexArgs: string[] = ["exec"];
+// TUI mode: codex with no subcommand = interactive, attached
+// (symmetric with claude-code-cli L1677 `spawn("claude", claudeArgs)`)
+const codexArgs: string[] = [];
+
+// Session resume (TUI mode uses the top-level `resume` subcommand)
+if (profile.session) {
+  codexArgs.push("resume", profile.session);
+}

// Inject the commhub MCP server inline, without polluting ~/.codex/config.toml
-// Note: use array args (execFileSync pattern), no shell, to avoid shell injection
-// (cf. #86 patch round 2)
+// Note: use array args, no shell, to avoid shell injection (cf. #86 patch round 2)
+// The inner double quotes are required TOML string syntax (codex parses the --config value as TOML)
codexArgs.push("--config", `mcp_servers.commhub.url="${commhubUrl}/mcp"`);
codexArgs.push("--config", `mcp_servers.commhub.bearer_token_env_var="COMMHUB_TOKEN"`);

@@ -153,12 +165,8 @@ if (runtime === "codex-code-cli") {
// Isolate the host user's codex config (R8 SaaS sandboxing recommendation)
codexArgs.push("--ignore-user-config", "--ignore-rules");

-// Session resume (codex resume)
-if (profile.session) {
-  codexArgs.unshift("resume", profile.session);
-}
-
// Same as claude-code-cli: after spawn, write the pid to the .pid file; the exit handler cleans it up
+// TUI mode: stdio: "inherit" attaches to the user's terminal
const child = spawn("codex", codexArgs, { env, stdio: "inherit" });
const pidFile = join(nodesDir(), nodeId, ".pid");
if (child.pid) writeFileSync(pidFile, String(child.pid));
@@ -307,9 +315,19 @@ anet node start my-codex

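This runtime's spawn calls **strictly use** array args with `execFileSync` or the `spawn(.., { stdio: "inherit" })` form, **without** `shell: true`, so that a user alias containing special characters cannot cause shell injection. This matches the defense in depth of #86 patch round 2.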

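-### 6.6 Session resume vs. new session
+### 6.6 Session resume vs. new session (**amended** per 通信工程马's step 1 catch)

+The semantics of codex `Thread.id` differ from those of the claude session UUID (the codex thread ID exists only after the turn starts, whereas the claude session UUID is pre-generated before spawn).
+
+**Session resume path in TUI mode** (the path the codex-code-cli runtime actually takes):
+
+- **First start**: `spawn("codex", [...flags])`, with no resume; codex generates a new thread ID automatically
+- **Subsequent starts**: `spawn("codex", ["resume", session, ...flags])`; `resume` is a **top-level subcommand** of the codex CLI (not a flag)
+- codex persists threads to `~/.codex/sessions/` (or, with `--ignore-user-config`, to the anet sandbox directory; to be verified by a spike)
+
+> ⚠️ Errata: an earlier draft wrote `codex exec resume <session>`, which confused this path with the codex-sdk internal batch path (`exec --experimental-json`). **The TUI runtime does not use the exec subcommand**, consistent with the §3.3 amendment.
+
-The semantics of codex `Thread.id` differ from those of the claude session UUID (the codex thread ID exists only after the turn starts, whereas the claude session UUID is pre-generated before spawn). The codex-code-cli runtime's session field **reuses** the profile.session UUID, but the **first start** runs `codex exec` (no resume) and subsequent starts run `codex exec resume <session>`. This matches the current codex-sdk runtime implementation (agent-node/src/cli.ts:634-636)
+This differs from the `--session-id` / `--resume` flags of claude-code-cli (claude uses flags, codex uses a subcommand), but both run in attached daemon mode, so the user experience is symmetric once the anet wrapper abstracts the difference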
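
The two start paths in §6.6 can be sketched as one argument builder. This is a minimal illustration, not the cli.ts implementation: `CodexSpawnOptions` and `buildCodexArgs` are hypothetical names, the session ID is a placeholder, and only the arguments visible in the §3.3 sketch are covered (the hunk gap in §3.3 may add others).

```typescript
// Hypothetical helper for illustration; these names do not appear in cli.ts.
interface CodexSpawnOptions {
  commhubUrl: string; // hub base URL, e.g. "http://127.0.0.1:9200"
  session?: string;   // profile.session; undefined on the first start
}

function buildCodexArgs(opts: CodexSpawnOptions): string[] {
  const args: string[] = [];

  // Subsequent starts: `resume` is a top-level subcommand, so it goes first.
  if (opts.session) {
    args.push("resume", opts.session);
  }

  // Inline MCP injection. The inner double quotes make each value a TOML string.
  args.push("--config", `mcp_servers.commhub.url="${opts.commhubUrl}/mcp"`);
  args.push("--config", `mcp_servers.commhub.bearer_token_env_var="COMMHUB_TOKEN"`);

  // Keep the host user's codex config and rules out of the runtime session.
  args.push("--ignore-user-config", "--ignore-rules");
  return args;
}

// First start: no subcommand, so codex runs as an attached TUI.
const firstStart = buildCodexArgs({ commhubUrl: "http://127.0.0.1:9200" });

// Later start: same flags, prefixed with `resume <session>`.
const laterStart = buildCodexArgs({
  commhubUrl: "http://127.0.0.1:9200",
  session: "hypothetical-session-id",
});

console.log(firstStart);
console.log(laterStart);
```

The resulting array would be passed as-is to `spawn("codex", args, { env, stdio: "inherit" })`; because it is an array and no shell is involved, the quoted TOML values need no extra escaping.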

## 7. Smoke matrix (Docker E2E test design)

26 changes: 26 additions & 0 deletions tests/test-codex-code-cli/Dockerfile
@@ -0,0 +1,26 @@
FROM node:20-bookworm

ENV DEBIAN_FRONTEND=noninteractive
WORKDIR /app

RUN apt-get update && apt-get install -y --no-install-recommends \
    bash \
    ca-certificates \
    curl \
    jq \
    procps \
    unzip \
    && rm -rf /var/lib/apt/lists/*

# bun (for hub spawn) + codex CLI (subject under test) + anet (preview).
RUN curl -fsSL https://bun.sh/install | bash
ENV PATH="/root/.bun/bin:${PATH}"

# Pin codex CLI to the version validated in RFC-005 §6.1 (≥ 0.128.0 required
# for inline `--config` MCP injection; 0.130.0 is what RFC-005 §7.1 specifies).
RUN npm install -g @openai/codex@0.130.0

COPY tests/test-codex-code-cli/run.sh /app/run.sh
RUN chmod +x /app/run.sh

CMD ["bash", "/app/run.sh"]
72 changes: 72 additions & 0 deletions tests/test-codex-code-cli/README.md
@@ -0,0 +1,72 @@
# test-codex-code-cli — E2E for RFC-005 `codex-code-cli` runtime

**RFC**: [RFC-005 codex-code-cli runtime](../../docs/rfcs/RFC-005-codex-code-cli-runtime.md)
**Assignment**: 通信龙 → 通信测试马 (Primary in test PR / Helper in cli.ts PR)
**Target version**: agent-network v2.1.8 (approved by Vincent, telegram 4019+4020)

## Scope

| Layer | Content | Status |
|---|---|---|
| L0 | `which codex` + `which anet` | scaffold sanity |
| L1 | `anet hub start` + `/health` + admin login (with retry) | scaffold sanity |
| L2 | `anet node create x --runtime codex-code-cli` + `config.json` written to disk + `.runtime == codex-code-cli` | **requires cli.ts to ship** |
| L3 | spawn the `codex` binary + `mcp_servers.commhub` flag injection | **requires cli.ts to ship** |
| L4 | `COMMHUB_TOKEN` env var injected into the codex process (verified via `/proc/<pid>/environ`) | **requires cli.ts to ship** |
| L5 | `--ignore-user-config` + `--ignore-rules` sandbox flags (verified via `/proc/<pid>/cmdline`) | **requires cli.ts to ship** |
| L6 | codex-code-cli and claude-code-cli aliases coexist + admin send_task routed across runtimes (hub-side) | **requires cli.ts to ship** |

Not covered (per [RFC-005 §7.3](../../docs/rfcs/RFC-005-codex-code-cli-runtime.md)):
- Real OpenAI / Anthropic API auth (the test env has no key)
- An agent actually consuming a task and producing a real LLM response (mocking is left to a separate test, same approach as qa-node-02 / docker-e2e SC05)
- Session resume: same kind of logic as [test31](../test31-claude-code-cli-resume); to be tested separately once the codex side supports it

## Running

```bash
sg docker -c 'docker build -t anet-test-codex-code-cli -f tests/test-codex-code-cli/Dockerfile .'
sg docker -c 'docker run --rm anet-test-codex-code-cli'
```

Budget: cold ~90s (including apt + bun + codex npm + anet npm), warm ~25s.

## Current status (2026-05-13)

⚠️ **L2+ is currently guaranteed to fail**: the `RuntimeName` enum at `agent-network/bin/cli.ts` L140 does not yet include `codex-code-cli` (通信工程马 will ship it within 1-2 days).

- L0 + L1 can run now (used as scaffold sanity)
- L2-L6 should all PASS in one go once 通信工程马's cli.ts is merged to main and the preview tag is published

Release workflow (通信龙's hard rule):
1. 通信工程马's cli.ts PR is merged to main
2. The preview tag `@sleep2agi/agent-network@x.y.z-preview.N` is published
3. This test passes via `sg docker -c 'docker build ... && docker run ...'`
4. Vincent tests it personally → promote to latest

## RFC-005 contracts locked in

| Contract | Test assertion |
|------|---------|
| RuntimeName enum includes `codex-code-cli` | L2 config.runtime field |
| `--config 'mcp_servers.commhub.url=...'` inline injection | L3 pgrep cmdline |
| `bearer_token_env_var=COMMHUB_TOKEN` + env injection | L4 /proc/environ |
| `--ignore-user-config` + `--ignore-rules` sandboxing | L5 /proc/cmdline |
| hub cross-runtime routing (alias resolution is runtime-agnostic) | L6 send_task to codex-bot + claude-peer |
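
The L4 and L5 assertions both reduce to reading NUL-separated entries from `/proc/<pid>/environ` and `/proc/<pid>/cmdline`. A minimal TypeScript sketch of that check follows, assuming the raw file content has already been read into a string. `splitNul`, `missingFlags`, and `hasEnvVar` are hypothetical helper names and the sample data is made up; run.sh itself performs these checks with `tr` and `grep`.

```typescript
// Hypothetical helpers for illustration; not part of run.sh or cli.ts.

// /proc/<pid>/cmdline and /proc/<pid>/environ separate entries with NUL bytes.
// Empty entries are dropped, which also discards the trailing separator.
function splitNul(raw: string): string[] {
  return raw.split("\0").filter((entry) => entry.length > 0);
}

// Returns the required flags that are absent from argv (exact argument match,
// which is stricter than the substring grep used in run.sh).
function missingFlags(argv: string[], required: string[]): string[] {
  return required.filter((flag) => argv.indexOf(flag) === -1);
}

// True if the environment block defines `name` (entries look like NAME=value).
function hasEnvVar(environ: string[], name: string): boolean {
  return environ.some((entry) => entry.indexOf(name + "=") === 0);
}

// Made-up sample data shaped like what a spawned codex process might expose.
const sampleCmdline =
  [
    "codex",
    "--config",
    'mcp_servers.commhub.url="http://127.0.0.1:9200/mcp"',
    "--ignore-user-config",
    "--ignore-rules",
  ].join("\0") + "\0";
const sampleEnviron = "HOME=/tmp/anet-codex-home\0COMMHUB_TOKEN=placeholder\0";

const argv = splitNul(sampleCmdline);
const missing = missingFlags(argv, ["--ignore-user-config", "--ignore-rules"]);
const tokenPresent = hasEnvVar(splitNul(sampleEnviron), "COMMHUB_TOKEN");

console.log({ missing, tokenPresent });
```

Matching whole arguments instead of substrings avoids a false pass when a flag name only appears inside another argument's value.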

## Resources

- Docker (`sg docker`)
- `node:20-bookworm` + apt (bash/curl/jq/procps/unzip/ca-certificates)
- `bun` via `bun.sh/install`
- `@openai/codex@0.130.0` (pinned per RFC-005 §7.1; includes `--config` inline MCP support)
- `@sleep2agi/agent-network@preview` (waiting for cli.ts to ship)
- 0 OpenAI / Anthropic API calls

## Relationship to existing tests

| Test | Relationship |
|------|------|
| [test31 claude-code-cli-resume](../test31-claude-code-cli-resume/) | Symmetric reference on the claude side (runtime + session resume) |
| [qa-hub-05-roundtrip](../qa-hub-05-roundtrip/) | Reuses the hub start + login + ntok mint + report_status pattern |
| [qa-node-02-success-reply](../qa-node-02-success-reply/) | mock-via-MCP pattern (L6 send_task hub-side verification) |
| [test32 shell-spawn-audit](../test32-agent-network-shell-spawn-audit/) | Defense in depth (spawn without `shell: true`); this test does not repeat the audit, but the cli.ts implementation must comply |
192 changes: 192 additions & 0 deletions tests/test-codex-code-cli/run.sh
@@ -0,0 +1,192 @@
#!/usr/bin/env bash
# test-codex-code-cli — E2E for RFC-005 codex-code-cli runtime
# Coverage (per 通信龙's assignment + RFC-005 §7):
#   L0 prerequisites: which codex + which anet
#   L1 hub up: anet hub start + /health
#   L2 node create: anet node create x --runtime codex-code-cli + config written to disk
#   L3 spawn verify: anet node start + codex fork + mcp_servers.commhub flag
#   L4 env injection: COMMHUB_TOKEN injected correctly
#   L5 sandbox flags: --ignore-user-config + --ignore-rules take effect
#   L6 cross-agent: codex-code-cli and claude-code-cli nodes coexist; send_task is
#      really routed across runtimes (verified hub-side; the agent side need not process it)
#
# Status: the codex-code-cli runtime in cli.ts is implemented by 通信工程马. Before
# cli.ts ships, L2+ will necessarily fail (runtime enum missing). Once 工程马 ships
# to main and the preview is published, this test should PASS in one go. L0/L1
# should PASS on any anet version (used as scaffold sanity).
set -Eeuo pipefail

LOG_DIR="${LOG_DIR:-/tmp/anet-codex-code-cli}"
HOME_DIR="${HOME_DIR:-/tmp/anet-codex-home}"
mkdir -p "$LOG_DIR" "$HOME_DIR"
export HOME="$HOME_DIR"
export COMMHUB_URL="http://127.0.0.1:9200"
export ANET_HUB="$COMMHUB_URL"

ADMIN_PW="StrongPassw0rd"

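cleanup() {
  set +e
  pkill -KILL -f 'anet node start' 2>/dev/null
  # TUI mode spawns codex without the `exec` subcommand (RFC-005 §3.3), so match
  # on the injected MCP flag, the same pattern the L3 check uses.
  pkill -KILL -f 'codex.*mcp_servers\.commhub' 2>/dev/null
  pkill -KILL -f 'commhub-server' 2>/dev/null
}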
trap cleanup EXIT

section() { echo ""; echo "========== $* =========="; }
pass() { echo "PASS: $*"; }
fail() {
echo "FAIL: $*" >&2
echo "---- anet -v ----"; tail -40 "$LOG_DIR/anet-version.log" 2>/dev/null
echo "---- hub.log ----"; tail -80 "$LOG_DIR/hub.log" 2>/dev/null
echo "---- node-start.log ----"; tail -80 "$LOG_DIR/node-start.log" 2>/dev/null
exit 1
}

mcp_call() {
# POST /mcp tools/call; print .result.content[0].text
local tok="$1" name="$2" args="$3" body
body=$(jq -nc --arg n "$name" --argjson a "$args" \
'{jsonrpc:"2.0",id:1,method:"tools/call",params:{name:$n,arguments:$a}}')
curl -sS -X POST "$COMMHUB_URL/mcp" \
-H "Authorization: Bearer $tok" \
-H 'Content-Type: application/json' \
-H 'Accept: application/json, text/event-stream' \
-H 'MCP-Protocol-Version: 2025-03-26' \
-d "$body" \
| sed -n 's/^data: //p' | head -1 \
| jq -r '.result.content[0].text // empty'
}

# ─────────────── L0 prerequisites ───────────────
section "L0 prerequisites"
which codex >/dev/null || fail "codex CLI not on PATH"
codex --version 2>&1 | tee "$LOG_DIR/codex-version.log"
pass "L0a codex installed"

npm install -g @sleep2agi/agent-network@preview >"$LOG_DIR/npm-install.log" 2>&1 \
|| fail "anet npm install failed"
which anet >/dev/null || fail "anet binary missing after install"
anet -v >"$LOG_DIR/anet-version.log" 2>&1
pass "L0b anet installed ($(head -1 "$LOG_DIR/anet-version.log"))"

# ─────────────── L1 hub up ───────────────
section "L1 hub up"
rm -rf "$HOME/.anet" "$HOME/.commhub"
anet hub start --host 127.0.0.1 --port 9200 --username admin --password "$ADMIN_PW" \
>"$LOG_DIR/hub.log" 2>&1 &
for _ in {1..60}; do curl -fsS "$COMMHUB_URL/health" >/dev/null 2>&1 && break; sleep 1; done
curl -fsS "$COMMHUB_URL/health" >/dev/null || fail "hub did not come up"
pass "L1 hub healthy on $COMMHUB_URL"

# admin login (retry — bootstrap race, see issue #31 R6/R8/R19)
UTOK=""
for _ in {1..20}; do
UTOK=$(curl -sS -X POST "$COMMHUB_URL/api/auth/login" -H 'Content-Type: application/json' \
-d "{\"username\":\"admin\",\"password\":\"$ADMIN_PW\"}" 2>/dev/null | jq -r '.token // empty')
[[ "$UTOK" == utok_* ]] && break
sleep 0.5
done
[[ "$UTOK" == utok_* ]] || fail "admin login never returned utok"
pass "L1b admin login OK"

# ─────────────── L2 anet node create --runtime codex-code-cli ───────────────
section "L2 node create --runtime codex-code-cli"
NODE_DIR="$HOME/.anet/nodes/codex-bot"
# Login + create network so node create works
anet login --hub "$COMMHUB_URL" --username admin --password "$ADMIN_PW" \
>"$LOG_DIR/anet-login.log" 2>&1 \
|| fail "anet login (CLI) failed"

# Try `--runtime codex-code-cli`. Will fail until 工程马 ships cli.ts.
anet node create codex-bot --runtime codex-code-cli >"$LOG_DIR/node-create.log" 2>&1 \
|| fail "anet node create --runtime codex-code-cli failed (cli.ts not yet shipped?)"

[[ -f "$NODE_DIR/config.json" ]] || fail "node config.json not written at $NODE_DIR"
RUNTIME=$(jq -r '.runtime' "$NODE_DIR/config.json")
[[ "$RUNTIME" == "codex-code-cli" ]] || fail "config.runtime != codex-code-cli (got '$RUNTIME')"
pass "L2 config.json runtime=codex-code-cli"

# ─────────────── L3 spawn verify — codex fork + mcp_servers.commhub flag ───────────────
section "L3 spawn verify"
( anet node start codex-bot >"$LOG_DIR/node-start.log" 2>&1 ) &
NODE_PID=$!
# Give it a moment to spawn
sleep 4
# Find a codex subprocess with mcp_servers.commhub flag
if pgrep -af 'codex.*mcp_servers\.commhub' >"$LOG_DIR/codex-spawn.log"; then
pass "L3 codex spawned with mcp_servers.commhub flag"
else
fail "no 'codex ... mcp_servers.commhub' process detected"
fi

# ─────────────── L4 env injection — COMMHUB_TOKEN bound ───────────────
section "L4 env injection"
# RFC-005 §3.3: COMMHUB_TOKEN passed via env (not hardcoded into config args).
# Check the spawned codex process /proc/<pid>/environ for COMMHUB_TOKEN.
CODEX_PID=$(pgrep -f 'codex.*mcp_servers\.commhub' | head -1 || true)
[[ -n "$CODEX_PID" ]] || fail "could not find codex pid for env probe"
if tr '\0' '\n' </proc/"$CODEX_PID"/environ 2>/dev/null | grep -q '^COMMHUB_TOKEN='; then
pass "L4 COMMHUB_TOKEN present in codex process env"
else
fail "COMMHUB_TOKEN missing from codex process env (pid $CODEX_PID)"
fi

# ─────────────── L5 sandbox flags — --ignore-user-config + --ignore-rules ───────────────
section "L5 sandbox flags"
# RFC-005 §3.3 + §6.2: spawn must pass --ignore-user-config + --ignore-rules
# so host user's ~/.codex/config.toml (e.g. Vincent's stale [mcp_servers.commhub-proxy])
# does NOT leak into the runtime session.
CMDLINE=$(tr '\0' ' ' </proc/"$CODEX_PID"/cmdline 2>/dev/null)
echo "$CMDLINE" >"$LOG_DIR/codex-cmdline.log"
echo "$CMDLINE" | grep -q -- '--ignore-user-config' || fail "--ignore-user-config flag missing"
echo "$CMDLINE" | grep -q -- '--ignore-rules' || fail "--ignore-rules flag missing"
pass "L5 sandbox flags applied"

# ─────────────── L6 cross-agent send_task between codex-code-cli and claude-code-cli ───────────────
section "L6 cross-agent (codex-code-cli ↔ claude-code-cli)"
# Strategy: mint ntoks for both aliases, register sessions via report_status MCP
# (mock-via-MCP, no real LLM needed). Then admin send_task to each — both must
# land + be queryable. Hub-side cross-runtime routing is what matters here;
# whether the real codex/claude process actually consumes the task is out
# of scope (see RFC-005 §7.3 "不测的 case").
# This verifies: hub does NOT special-case runtime, alias resolution is
# runtime-agnostic, ntok issued for codex-bot can mint another session row.
NETWORK_ID=$(curl -fsS -X POST "$COMMHUB_URL/api/networks" \
-H "Authorization: Bearer $UTOK" -H 'Content-Type: application/json' \
-d '{"name":"codex-cross-test"}' | jq -r '.network.network_id // .network_id // empty')
[[ -n "$NETWORK_ID" ]] || fail "could not create cross-test network"

# mint ntok for an additional alias 'claude-peer' (we don't spawn a real claude
# CLI — register a session row via MCP report_status to satisfy SSE-delivery
# precondition, same trick used in qa-hub-05 / qa-node-02)
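NTOK_PEER=$(curl -fsS -X POST "$COMMHUB_URL/api/auth/node-token" \
-H "Authorization: Bearer $UTOK" -H 'Content-Type: application/json' \
-d "{\"network_id\":\"$NETWORK_ID\",\"node_name\":\"claude-peer\"}" | jq -r '.token // empty')
# `// empty` keeps a missing token from turning into the literal string "null"
[[ -n "$NTOK_PEER" ]] || fail "could not mint ntok for claude-peer"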
ARG=$(jq -nc --arg net "$NETWORK_ID" \
'{resume_id:"00000000-aaaa-bbbb-cccc-codex0000l6",alias:"claude-peer",status:"idle",network_id:$net}')
mcp_call "$NTOK_PEER" "report_status" "$ARG" | jq -e '.ok == true' >/dev/null \
|| fail "claude-peer report_status failed"

# admin dispatches task to codex-bot (created in L2)
TASK_RESP=$(curl -fsS -X POST "$COMMHUB_URL/api/task" \
-H "Authorization: Bearer $UTOK" -H 'Content-Type: application/json' \
-d "{\"alias\":\"codex-bot\",\"task\":\"hello-from-claude-peer\",\"priority\":\"normal\"}")
echo "$TASK_RESP" | jq -e '.ok == true' >/dev/null \
|| fail "send_task to codex-bot rejected: $TASK_RESP"

# query /api/tasks — task must be delivered + visible
sleep 0.3
HIT=$(curl -fsS "$COMMHUB_URL/api/tasks?to_name=codex-bot" \
-H "Authorization: Bearer $UTOK" \
| jq '[.tasks[]? | select(.content=="hello-from-claude-peer")] | length')
[[ "$HIT" -ge 1 ]] || fail "task to codex-bot not in /api/tasks"
pass "L6 cross-runtime task dispatch routed correctly"

# ─────────────── Cleanup ───────────────
section "stopping node"
kill -KILL "$NODE_PID" 2>/dev/null || true
pkill -KILL -f 'codex.*mcp_servers\.commhub' 2>/dev/null || true
pkill -KILL -f 'anet node start' 2>/dev/null || true

echo ""
echo "PASS test-codex-code-cli (L0+L1+L2+L3+L4+L5+L6 all green)"