docs(agent): agent-workflows design and ground truth #4777
📝 Walkthrough
Adds agent-workflows design/status/review/archive documentation and a separate vault named secrets planning set.
Changes
Agent workflows documentation set
Vault named secrets planning docs
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 8
🧹 Nitpick comments (17)
docs/design/agent-workflows/status.md (1)
47-61: 💤 Low value
Consider refactoring the three successive "Should" questions into a bullet list for readability.
Lines 49–51 begin consecutive sentences with "Should", which LanguageTool flags as repetitive. While the current structure is correct and intentional (listing open decisions), a bulleted format would improve clarity:
- Should `LocalBackend` remain exported before it is implemented?
- Should MCP controls be hidden or constrained by selected harness/backend?
- Should `agenta` be hidden from non-local sandbox selections?
docs/design/agent-workflows/architecture.md (2)
27-54: ⚡ Quick win
Add language identifier to fenced code block.
The ASCII diagram block (lines 27–54) is missing a language specification. Add `text` or `none` for clarity:
📝 Proposed fix
-```
+```text
 browser / playground
   |
   | POST /invoke or POST /messages
   v
 services container
   Python workflow handler
   services/oss/src/agent/app.py
   |
   | POST /run, or spawn the runner CLI in local checkout mode
   v
 agent runner sidecar
   compose service: agent-pi
   Node HTTP server
   services/agent/src/server.ts
   |
   +-- in-process Pi engine
   |     services/agent/src/engines/pi.ts
   |
   +-- rivet engine
         services/agent/src/engines/rivet.ts
         |
         +-- sandbox-agent daemon
               |
               +-- ACP adapter: pi-acp or claude-agent-acp
               |
               +-- harness CLI: pi or claude
-```
+```
Source: Linters/SAST tools
56-64: 💤 Low value
Use hyphen in compound adjective "full-stack".
Line 62 uses "full stack environment" as a compound modifier. LanguageTool suggests the hyphenated form "full-stack environment" for clarity, or rephrasing to "entire stack environment":
📝 Proposed fixes
Option 1 (hyphenate):
-The sidecar deliberately does not inherit the full stack environment. Provider keys and
+The sidecar deliberately does not inherit the full-stack environment. Provider keys and
Option 2 (rephrase):
-The sidecar deliberately does not inherit the full stack environment. Provider keys and
+The sidecar deliberately does not inherit the entire stack environment. Provider keys and
Source: Linters/SAST tools
docs/design/agent-workflows/sessions.md (1)
43-51: 💤 Low value
Consider refactoring the three successive "If" conditions into a bullet list for readability.
Lines 46–50 begin consecutive sentences with "If", which LanguageTool flags as repetitive. While the structure clearly lists the three intended `session_id` semantics, a bulleted format would improve scannability:
The intended behavior is create-or-resume:
- If the client omits `session_id`, the server creates one and returns it.
- If the client supplies a known `session_id`, the server resumes that session.
- If the client supplies an unknown but valid `session_id`, the server creates a session using that id.
docs/design/agent-workflows/adapters/pi.md (2)
110-116: ⚡ Quick win
Add language identifier to fenced code block (span tree diagram).
The span tree diagram (lines 110–116) is missing a language specification. Add `text` or `none`:
📝 Proposed fix
-```
+```text
 invoke_agent (AGENT)
   turn N (CHAIN)
     chat <model> (LLM) real token usage from the provider call
     execute_tool <name> (TOOL) one per tool the turn ran
-```
+```
Source: Linters/SAST tools
144-147: 💤 Low value
Simplify "in order to" to "to" for conciseness.
Line 146 uses the phrase "in order to", which LanguageTool suggests is unnecessarily wordy:
📝 Proposed fix
-For output, Pi streams pure text deltas over ACP (`agent_message_chunk`). The runner appends them in order to build the final answer.
+For output, Pi streams pure text deltas over ACP (`agent_message_chunk`). The runner appends them to build the final answer.
Source: Linters/SAST tools
docs/design/agent-workflows/trash/sdk-local-backend/status.md (1)
3-5: ⚡ Quick win
Clarify the document's purpose statement.
Lines 3–5 claim this is "the only page that describes things that do not fully exist yet," but the document immediately documents substantial completed work in the "Done and verified" section (lines 19–49). The distinction seems to be that this document uniquely contains both complete and incomplete work, whereas other design pages document only what is built. Rephrase to make that distinction explicit, for example: "This page documents both completed work and ongoing/future work; other design pages in this directory document only what is currently built."
docs/design/agent-workflows/sdk-local-tools/review/scope.md (1)
14-21: ⚡ Quick win
Update branch reference for final merge.
Line 19 references `gitbutler/workspace` as a "working tree," indicating this scope was drafted during development. Before merging to main, update this reference to the final branch name (or commit hash if merging directly to main). This ensures the scope document accurately reflects the reviewed state.
docs/design/agent-workflows/trash/harness-port-redesign/research.md (1)
15-20: 💤 Low value
Add language specifiers to fenced code blocks per markdownlint (MD040).
Code blocks should declare their language for proper syntax highlighting. The Python class at lines 15–20, TypeScript Session interface at lines 83–94, and SessionPersistDriver interface at lines 130–138 are missing language specifiers.
Fix language specifiers
-```
+```python
 class Harness(ABC):
-```
+```ts
 class Session {
-```
+```ts
 interface SessionPersistDriver {
Also applies to: 83-94, 130-138
Source: Linters/SAST tools
docs/design/agent-workflows/trash/harness-port-redesign/proposal.md (1)
141-141: 💤 Low value
Fix hyphenation in compound adjectives (LanguageTool grammar feedback).
When compound adjectives modify a noun, they should be hyphenated. Lines 141 and 146 have two instances:
- Line 141: "client side streaming" → "client-side streaming"
- Line 146: "per invoke sandboxes" → "per-invocation sandboxes" (or "per-invoke sandboxes" if abbreviating "invoke")
Also applies to: 146-146
Source: Linters/SAST tools
docs/design/agent-workflows/trash/harness-port-redesign/plan.md (1)
93-93: 💤 Low value
Hyphenate compound modifier on line 93.
"Cross cutting" should be "Cross-cutting" (compound adjective).
Source: Linters/SAST tools
docs/design/agent-workflows/trash/wp-1-pi-tracing/integrating-the-tracing-extension.md (1)
24-29: 💤 Low value
Add language specifiers to fenced code blocks (MD040).
Three code blocks lack language identifiers for syntax highlighting:
- Lines 24–29: span tree structure (use `` ```text `` or `` ```yaml ``)
- Lines 148–156: dependency list (use `` ```text ``)
- Lines 162–170: curl command (use `` ```bash ``)
Add language specifiers
-```
+```text
 invoke_agent
-```
+```text
 `@earendil-works/pi-coding-agent` 0.79.4
-```
+```bash
 curl -s "${AGENTA_HOST}/api/spans/?trace_id=<id>"
Also applies to: 148-156, 162-170
Source: Linters/SAST tools
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/commit_agent_config.py (3)
1-4: 💤 Low value
Ensure Python formatting compliance before commit per coding guidelines.
This Python file should be run through `ruff format` and `ruff check --fix` before merging, per the coding guidelines for `**/*.py` files. While this is a POC script in the historical trash directory, maintaining formatting consistency is valuable.
Source: Coding guidelines
15-18: 💤 Low value
Consider HTTPS for the default Agenta host URL.
The default URL uses HTTP (line 15), which ast-grep flags as CWE-319. While this is a POC script pointing to a local dev box, consider using HTTPS as the default or allowing it to be overridden more safely. For production use, the caller would override `AGENTA_HOST` with an HTTPS URL, but a more defensive default would be clearer.
Source: Linters/SAST tools
23-75: ⚡ Quick win
Add error handling for JSON parsing and required env vars.
The script calls `.json()` on HTTP responses (lines 31, 69) without defensive handling if the response body is malformed. Additionally, `AGENTA_API_KEY` is read directly without a check (line 16). While this is a POC, adding basic validation would improve robustness:
- Validate `AGENTA_API_KEY` is set before use
- Wrap `.json()` calls in try/except to provide helpful error messages if the Agenta API response is unexpected
docs/design/agent-workflows/trash/research/pi-interaction.md (1)
58-58: ⚡ Quick win
Add language identifier to code fence.
Line 58 opens a code block without specifying the language. Based on the content (bash commands), please add `` ```bash `` instead of `` ``` ``.
📝 Proposed fix
-\`\`\`
+\`\`\`bash
 npm install `@earendil-works/pi-coding-agent` # SDK + CLI
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/architecture.md (1)
10-43: ⚡ Quick win
Add language identifier to ASCII diagram code fence.
The code fence at line 10 contains an ASCII diagram and has no language specified. Consider using `` ```text `` or `` ```diagram `` to clarify intent, though the content is clear as-is.
📝 Proposed fix
-\`\`\`
+\`\`\`text
 unchanged ┌───────────────────────────────────────────────┐
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 2e8ebdac-ec42-4f55-881a-919108e5e706
⛔ Files ignored due to path filters (1)
`docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (95)
docs/design/agent-workflows/README.md
docs/design/agent-workflows/adapters/agenta.md
docs/design/agent-workflows/adapters/claude-code.md
docs/design/agent-workflows/adapters/pi.md
docs/design/agent-workflows/agent-template.md
docs/design/agent-workflows/architecture.md
docs/design/agent-workflows/ground-truth.md
docs/design/agent-workflows/implementation-review.md
docs/design/agent-workflows/meeting-alignment.md
docs/design/agent-workflows/open-issues.md
docs/design/agent-workflows/ports-and-adapters.md
docs/design/agent-workflows/pr-stack.md
docs/design/agent-workflows/protocol.md
docs/design/agent-workflows/sdk-local-tools/README.md
docs/design/agent-workflows/sdk-local-tools/codebase-conventions.md
docs/design/agent-workflows/sdk-local-tools/context.md
docs/design/agent-workflows/sdk-local-tools/conventions-review.md
docs/design/agent-workflows/sdk-local-tools/organization-proposal.md
docs/design/agent-workflows/sdk-local-tools/plan.md
docs/design/agent-workflows/sdk-local-tools/research.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/app-mcp-reassign.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/attach-orthogonal-mutation.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/description-default-inconsistency.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-no-logging.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-orthogonal-untested.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/handler-resolution-error.md
docs/design/agent-workflows/sdk-local-tools/review/findings.md
docs/design/agent-workflows/sdk-local-tools/review/metadata.json
docs/design/agent-workflows/sdk-local-tools/review/plan.md
docs/design/agent-workflows/sdk-local-tools/review/progress.md
docs/design/agent-workflows/sdk-local-tools/review/questions.md
docs/design/agent-workflows/sdk-local-tools/review/risks.md
docs/design/agent-workflows/sdk-local-tools/review/scope.md
docs/design/agent-workflows/sdk-local-tools/review/scorecard.md
docs/design/agent-workflows/sdk-local-tools/review/summary.md
docs/design/agent-workflows/sdk-local-tools/status.md
docs/design/agent-workflows/sessions.md
docs/design/agent-workflows/status.md
docs/design/agent-workflows/trash/README.md
docs/design/agent-workflows/trash/harness-port-redesign/README.md
docs/design/agent-workflows/trash/harness-port-redesign/implementation.md
docs/design/agent-workflows/trash/harness-port-redesign/plan.md
docs/design/agent-workflows/trash/harness-port-redesign/proposal.md
docs/design/agent-workflows/trash/harness-port-redesign/research.md
docs/design/agent-workflows/trash/harness-port-redesign/status.md
docs/design/agent-workflows/trash/old-rfcs/agent-protocol-rfc.md
docs/design/agent-workflows/trash/old-rfcs/streaming-and-sessions.md
docs/design/agent-workflows/trash/research/auth-secrets.md
docs/design/agent-workflows/trash/research/daytona-sandbox.md
docs/design/agent-workflows/trash/research/diskless-in-memory-config.md
docs/design/agent-workflows/trash/research/open-questions.md
docs/design/agent-workflows/trash/research/otel-instrumentation.md
docs/design/agent-workflows/trash/research/pi-interaction.md
docs/design/agent-workflows/trash/research/sandbox-sharing.md
docs/design/agent-workflows/trash/sdk-local-backend/status.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/README.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/integrating-the-tracing-extension.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/.env.example
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/README.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/agenta-otel.ts
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/package.json
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/run.ts
docs/design/agent-workflows/trash/wp-1-pi-tracing/tracing-in-the-agent-service.md
docs/design/agent-workflows/trash/wp-2-agent-service/README.md
docs/design/agent-workflows/trash/wp-2-agent-service/implementation-plan.md
docs/design/agent-workflows/trash/wp-2-agent-service/qa.md
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/README.md
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/README.md
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/bench_coldstart.py
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/build_snapshot.py
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/cleanup.py
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/run_agent.py
docs/design/agent-workflows/trash/wp-4-multi-message-output/README.md
docs/design/agent-workflows/trash/wp-5-chat-vs-completion/README.md
docs/design/agent-workflows/trash/wp-6-workflow-type-and-template/README.md
docs/design/agent-workflows/trash/wp-7-tools/README.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/README.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/architecture.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/context.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/isolation-and-fork.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/plan.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/commit_agent_config.py
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/debug-events.ts
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/dump-full.ts
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/package.json
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/spike.ts
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/research.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/status.md
docs/design/agent-workflows/triggers.md
docs/design/vault-named-secrets/README.md
docs/design/vault-named-secrets/context.md
docs/design/vault-named-secrets/plan.md
docs/design/vault-named-secrets/research.md
docs/design/vault-named-secrets/status.md
| - [daytona-sandboxes] Daytona — Sandboxes (lifecycle, states, auto-stop/archive/delete, | ||
| refreshActivity, resource limits, per-sandbox isolation): | ||
| https://www.daytona.io/docs/en/sandboxes/ | ||
| - [daytona-process] Daytona — Process and Code Execution (exec/code_run vs Sessions, cwd, | ||
| env, create/execute/get/delete session): https://www.daytona.io/docs/en/process-code-execution/ | ||
| - [daytona-process-src] Daytona docs source — process-code-execution.mdx (verbatim session | ||
| example, SessionExecuteRequest fields): | ||
| https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/process-code-execution.mdx | ||
| - [daytona-volumes] Daytona — Volumes (creation, VolumeMount, mount_path/subpath, FUSE, | ||
| mounting via CreateSandboxFromSnapshotParams): https://www.daytona.io/docs/en/volumes/ | ||
| - [daytona-volumes-src] Daytona docs source — volumes.mdx (verbatim "mounted at creation", | ||
| persistence, FUSE not transactional, last-write-wins): | ||
| https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/volumes.mdx | ||
| - [daytona-fuse-issue] Daytona GitHub issue #3331 — FUSE volume permission limitations | ||
| (mv/touch/stat/copystat failures): https://github.com/daytonaio/daytona/issues/3331 | ||
| - [daytona-parallel-issue] Daytona GitHub issue #4001 — Design and Implement Parallel | ||
| Sandbox Execution API (fork filesystem+memory; current workaround = many sandboxes): | ||
| https://github.com/daytonaio/daytona/issues/4001 | ||
| - [daytona-blog-best] Northflank — "Best code execution sandbox for AI agents 2026" | ||
| (isolated sandbox per execution; Docker-default isolation weaker than microVMs): | ||
| https://northflank.com/blog/best-code-execution-sandbox-for-ai-agents | ||
| - [pi-home] pi.dev — product overview (harness, modes, AGENTS.md/SYSTEM.md context): | ||
| https://pi.dev | ||
| - [pi-docs] pi.dev — docs index (session tree, JSONL session format, RPC/SDK modes): | ||
| https://pi.dev/docs/latest | ||
| - [pi-sdk] pi.dev — SDK/RPC (SessionManager.create/continueRecent/open/inMemory, cwd, | ||
| agentDir, runRpcMode, `--mode rpc --no-session`): https://pi.dev/docs/latest/sdk | ||
| - Agenta repo — `api/oss/src/utils/env.py` `DaytonaConfig` (DAYTONA_API_KEY, |
Define the source labels as actual Markdown reference definitions.
The entries in ## Sources are formatted as list items, so labels like [daytona-process] are not defined and earlier references stay unresolved.
Suggested fix pattern
- - [daytona-sandboxes] Daytona — Sandboxes ...:
- https://www.daytona.io/docs/en/sandboxes/
- - [daytona-process] Daytona — Process and Code Execution ...:
- https://www.daytona.io/docs/en/process-code-execution/
+ [daytona-sandboxes]: https://www.daytona.io/docs/en/sandboxes/
+ [daytona-process]: https://www.daytona.io/docs/en/process-code-execution/
+ [daytona-process-src]: https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/process-code-execution.mdx
+ [daytona-volumes]: https://www.daytona.io/docs/en/volumes/
+ [daytona-volumes-src]: https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/volumes.mdx
+ [daytona-fuse-issue]: https://github.com/daytonaio/daytona/issues/3331
+ [daytona-parallel-issue]: https://github.com/daytonaio/daytona/issues/4001
+ [daytona-blog-best]: https://northflank.com/blog/best-code-execution-sandbox-for-ai-agents
+ [pi-home]: https://pi.dev
+ [pi-docs]: https://pi.dev/docs/latest
+ [pi-sdk]: https://pi.dev/docs/latest/sdk
Source: Linters/SAST tools
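To confirm that a fix along these lines resolves every label, the check can be scripted. The following is a minimal sketch and not part of the PR: `undefined_labels` is a hypothetical helper, and its regexes are a deliberate simplification that recognizes only shortcut references such as `[daytona-process]` and one-line definitions, not the full CommonMark grammar.

```python
import re

# A reference definition: up to three leading spaces, "[label]:", then a destination.
DEFINITION = re.compile(r"^[ ]{0,3}\[([^\]]+)\]:\s*\S+", re.MULTILINE)
# A bracketed label that is not an inline link "[x](...)", not the text part
# of a full reference "[x][y]", and not itself a definition "[x]:".
USAGE = re.compile(r"\[([^\]]+)\](?![(:\[])")


def undefined_labels(markdown: str) -> set[str]:
    """Return labels that are referenced but never defined (case-insensitive)."""
    defined = {m.group(1).lower() for m in DEFINITION.finditer(markdown)}
    used = {m.group(1).lower() for m in USAGE.finditer(markdown)}
    return used - defined
```

With the list-item form, a label such as `daytona-process` shows up in the result because a list item is not a definition; once the entry is rewritten as `[daytona-process]: <url>`, the set comes back empty.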
| AGENTA_HOST=http://144.76.237.122:8280/ | ||
| AGENTA_API_KEY=your-agenta-project-api-key |
Use a safe default collector endpoint in the example env file.
AGENTA_HOST is currently an http:// hard-coded host. That can expose Authorization: ApiKey and captured span content over plaintext transport if copied as-is.
Suggested fix
-AGENTA_HOST=http://144.76.237.122:8280/
+AGENTA_HOST=https://cloud.agenta.ai
 AGENTA_API_KEY=your-agenta-project-api-key
📝 Committable suggestion
| AGENTA_HOST=http://144.76.237.122:8280/ | |
| AGENTA_API_KEY=your-agenta-project-api-key | |
| AGENTA_HOST=https://cloud.agenta.ai | |
| AGENTA_API_KEY=your-agenta-project-api-key |
🧰 Tools
🪛 dotenv-linter (4.0.0)
[warning] 3-3: [UnorderedKey] The AGENTA_API_KEY key should go before the AGENTA_HOST key
(UnorderedKey)
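Beyond changing the example default, the plaintext-transport concern could also be caught when the POC loads its configuration. The sketch below is an assumption about how that might look, not code from the PR: `check_collector_host` and `LOCAL_HOSTS` are hypothetical names, and allowing `http://` for loopback hosts is a design choice made here for local development.

```python
from urllib.parse import urlparse

# Hypothetical allow-list: plaintext HTTP is tolerated only for loopback hosts.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def check_collector_host(url: str) -> str:
    """Return the URL unchanged if it is safe to send an API key to, else raise."""
    parsed = urlparse(url)
    if parsed.scheme == "https":
        return url
    if parsed.scheme == "http" and parsed.hostname in LOCAL_HOSTS:
        return url
    raise ValueError(f"refusing insecure collector endpoint: {url!r}")
```

A caller would pass the value of `AGENTA_HOST` through this check before attaching the `Authorization: ApiKey` header, so a copied `http://` address on a public IP fails fast instead of leaking the key.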
| ``` | ||
| invoke_agent (openinference.span.kind = AGENT, carries session.id) | ||
| turn N (CHAIN) | ||
| chat <model> (LLM — model, latency, token usage, finish reason) | ||
| execute_tool <name> (TOOL — args + result) | ||
| ``` |
Add languages to fenced code blocks to clear MD040.
Both fenced blocks are untyped; markdownlint flags them.
Suggested fix
-```
+```text
invoke_agent (openinference.span.kind = AGENT, carries session.id)
turn N (CHAIN)
chat <model> (LLM — model, latency, token usage, finish reason)
execute_tool <name> (TOOL — args + result)@@
-```
+```text
invoke_agent (agent) ag.data.inputs={prompt}, ag.data.outputs=text, ag.session.id, cumulative tokens
turn N (chain)
chat (chat) ag.data.inputs.prompt[] + ag.data.outputs.completion[] (OpenInference
messages), ag.meta.request.model, incremental token usage
execute_tool (tool) ag.data.inputs={args}, ag.data.outputs=result
Also applies to: 67-73
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 15-15: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
Source: Linters/SAST tools
| ``` | ||
| _agent workflow (the Python /invoke span, root) | ||
| invoke_agent AGENT (the Pi run, now a child of _agent) | ||
| turn N CHAIN | ||
| chat <model> LLM model, tokens, cost, message thread | ||
| execute_tool ... TOOL | ||
| ``` |
Add fenced-code languages to satisfy markdownlint (MD040).
The two code fences are missing language identifiers.
Suggested fix
-```
+```text
_agent workflow (the Python /invoke span, root)
invoke_agent AGENT (the Pi run, now a child of _agent)
turn N CHAIN
chat <model> LLM model, tokens, cost, message thread
 execute_tool ... TOOL
@@
-```
+```bash
 curl -s "${AGENTA_HOST}/api/spans/?trace_id=<id>" -H "Authorization: ApiKey ${AGENTA_API_KEY}"
Also applies to: 102-104
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 26-26: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
Source: Linters/SAST tools
|
|
||
| For pi.dev it might make sense to have two adapters one for RPC and the other for json | ||
|
|
||
| Success for this WP1 is: |
Fix the work-package label typo in success criteria.
This section is in WP-2 but says “Success for this WP1,” which makes scope tracking ambiguous.
Suggested fix
-Success for this WP1 is:
+Success for this WP-2 is:
📝 Committable suggestion
| Success for this WP1 is: | |
| Success for this WP-2 is: |
| for snap in SNAPSHOTS: | ||
| times: list[float] = [] | ||
| for i in range(N): | ||
| t = time.monotonic() | ||
| sb = daytona.create( | ||
| CreateSandboxFromSnapshotParams(snapshot=snap, auto_stop_interval=0), | ||
| timeout=120, | ||
| ) | ||
| dt = time.monotonic() - t | ||
| times.append(dt) | ||
| print(f"{snap:20} run {i + 1}/{N}: {dt:.2f}s state={sb.state}", flush=True) | ||
| daytona.delete(sb) | ||
| results[snap] = times |
Guarantee sandbox teardown with try/finally per run.
If an exception occurs after daytona.create(...) and before daytona.delete(sb), the sandbox can be left running.
Proposed fix
 for snap in SNAPSHOTS:
     times: list[float] = []
     for i in range(N):
-        t = time.monotonic()
-        sb = daytona.create(
-            CreateSandboxFromSnapshotParams(snapshot=snap, auto_stop_interval=0),
-            timeout=120,
-        )
-        dt = time.monotonic() - t
-        times.append(dt)
-        print(f"{snap:20} run {i + 1}/{N}: {dt:.2f}s state={sb.state}", flush=True)
-        daytona.delete(sb)
+        sb = None
+        try:
+            t = time.monotonic()
+            sb = daytona.create(
+                CreateSandboxFromSnapshotParams(snapshot=snap, auto_stop_interval=0),
+                timeout=120,
+            )
+            dt = time.monotonic() - t
+            times.append(dt)
+            print(f"{snap:20} run {i + 1}/{N}: {dt:.2f}s state={sb.state}", flush=True)
+        finally:
+            if sb is not None:
+                daytona.delete(sb)
     results[snap] = times
| def arg(name: str, default: str) -> str: | ||
| return sys.argv[sys.argv.index(name) + 1] if name in sys.argv else default |
Handle missing flag values in arg() to avoid IndexError.
Passing --auth or --model without a value currently crashes instead of exiting cleanly.
Proposed fix
 def arg(name: str, default: str) -> str:
-    return sys.argv[sys.argv.index(name) + 1] if name in sys.argv else default
+    if name not in sys.argv:
+        return default
+    idx = sys.argv.index(name) + 1
+    if idx >= len(sys.argv) or sys.argv[idx].startswith("--"):
+        raise ValueError(f"Missing value for {name}")
+    return sys.argv[idx]
📝 Committable suggestion
| def arg(name: str, default: str) -> str: | |
| return sys.argv[sys.argv.index(name) + 1] if name in sys.argv else default | |
| def arg(name: str, default: str) -> str: | |
| if name not in sys.argv: | |
| return default | |
| idx = sys.argv.index(name) + 1 | |
| if idx >= len(sys.argv) or sys.argv[idx].startswith("--"): | |
| raise ValueError(f"Missing value for {name}") | |
| return sys.argv[idx] |
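If a clean exit matters more than keeping the hand-rolled `arg()` helper, the standard library's `argparse` handles the missing-value case on its own. This is an alternative sketch rather than the reviewer's proposal: `parse_args` is a hypothetical function, and the default values shown for `--auth` and `--model` are placeholders, since the script's real defaults are not visible in this thread.

```python
import argparse


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse the two flags named in the review comment."""
    parser = argparse.ArgumentParser(description="POC runner (illustrative)")
    # Placeholder defaults; substitute whatever the script actually uses.
    parser.add_argument("--auth", default="api-key")
    parser.add_argument("--model", default="example-model")
    return parser.parse_args(argv)
```

Calling `parse_args(sys.argv[1:])` with a bare `--model` prints a usage message to stderr and exits with status 2, instead of raising `IndexError` or `ValueError`.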
| pi_cmd = ( | ||
| f"cd {run_dir} && TMPDIR={run_dir}/tmp " | ||
| f"pi -p {json.dumps(PROMPT)} " | ||
| f"--mode json --approve --provider {provider} --model {model} " | ||
| f"-t read,bash,edit,write,ls " | ||
| f"--session-dir {run_dir}/.pi-sessions --name {session_id} " | ||
| f"< /dev/null" | ||
| ) |
Quote CLI-derived values before composing the shell command.
provider/model are interpolated directly into pi_cmd; crafted values can break out of the intended command.
Proposed fix
 import asyncio
 import json
 import os
+import shlex
 import sys
@@
 pi_cmd = (
-    f"cd {run_dir} && TMPDIR={run_dir}/tmp "
-    f"pi -p {json.dumps(PROMPT)} "
-    f"--mode json --approve --provider {provider} --model {model} "
+    f"cd {shlex.quote(run_dir)} && TMPDIR={shlex.quote(run_dir + '/tmp')} "
+    f"pi -p {shlex.quote(PROMPT)} "
+    f"--mode json --approve --provider {shlex.quote(provider)} --model {shlex.quote(model)} "
     f"-t read,bash,edit,write,ls "
-    f"--session-dir {run_dir}/.pi-sessions --name {session_id} "
+    f"--session-dir {shlex.quote(run_dir + '/.pi-sessions')} --name {shlex.quote(session_id)} "
     f"< /dev/null"
 )
🧰 Tools
🪛 ast-grep (0.43.0)
[info] 260-260: use jsonify instead of json.dumps for JSON output
Context: json.dumps(PROMPT)
Note: Security best practice.
(use-jsonify)
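The effect of the quoting fix can be checked without running anything in a sandbox. The snippet below is an illustrative sketch: the `crafted` value is a made-up hostile `--model` argument, and `shlex.split` is used only to approximate POSIX word splitting, so it shows how the string is tokenized rather than how a real shell would execute it.

```python
import shlex

# Hypothetical hostile value for --model; not taken from the PR.
crafted = "gpt-x; rm -rf /tmp/demo"

# Interpolated directly, the value spills into several shell words.
unsafe = f"pi --model {crafted}"
# Quoted, the whole value stays a single argument.
safe = f"pi --model {shlex.quote(crafted)}"

unsafe_tokens = shlex.split(unsafe)
safe_tokens = shlex.split(safe)
```

Here `safe_tokens` holds exactly three words, with the crafted string intact as the third, while `unsafe_tokens` holds more than three, which is the break-out the review comment warns about.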
330b1c6 to 36eb5d5
Actionable comments posted: 5
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: bac6cb92-ec42-46bb-87c6-e1728ccfbda2
⛔ Files ignored due to path filters (1)
`docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
✅ Files skipped from review due to trivial changes (47)
- docs/design/agent-workflows/sdk-local-tools/review/risks.md
- docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-orthogonal-untested.md
- docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py
- docs/design/vault-named-secrets/README.md
- docs/design/agent-workflows/sdk-local-tools/review/evidence/description-default-inconsistency.md
- docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/package.json
- docs/design/agent-workflows/sdk-local-tools/review/evidence/attach-orthogonal-mutation.md
- docs/design/agent-workflows/agent-template.md
- docs/design/agent-workflows/trash/wp-1-pi-tracing/README.md
- docs/design/agent-workflows/sdk-local-tools/review/metadata.json
- docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/README.md
- docs/design/agent-workflows/trash/wp-4-multi-message-output/README.md
- docs/design/agent-workflows/trash/README.md
- docs/design/agent-workflows/sdk-local-tools/review/scorecard.md
- docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/context.md
- docs/design/agent-workflows/sdk-local-tools/review/questions.md
- docs/design/agent-workflows/pr-stack.md
- docs/design/agent-workflows/trash/sdk-local-backend/status.md
- docs/design/agent-workflows/open-issues.md
- docs/design/agent-workflows/sdk-local-tools/review/scope.md
- docs/design/agent-workflows/meeting-alignment.md
- docs/design/agent-workflows/ports-and-adapters.md
- docs/design/agent-workflows/sdk-local-tools/review/findings.md
- docs/design/agent-workflows/sdk-local-tools/review/progress.md
- docs/design/agent-workflows/trash/wp-3-daytona-sandbox/README.md
- docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/README.md
- docs/design/agent-workflows/sdk-local-tools/review/evidence/handler-resolution-error.md
- docs/design/agent-workflows/README.md
- docs/design/agent-workflows/trash/harness-port-redesign/implementation.md
- docs/design/agent-workflows/sdk-local-tools/review/plan.md
- docs/design/agent-workflows/trash/wp-2-agent-service/README.md
- docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/plan.md
- docs/design/agent-workflows/adapters/agenta.md
- docs/design/agent-workflows/trash/research/daytona-sandbox.md
- docs/design/agent-workflows/sdk-local-tools/status.md
- docs/design/agent-workflows/trash/wp-5-chat-vs-completion/README.md
- docs/design/vault-named-secrets/context.md
- docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/isolation-and-fork.md
- docs/design/agent-workflows/trash/wp-6-workflow-type-and-template/README.md
- docs/design/agent-workflows/sdk-local-tools/review/summary.md
- docs/design/agent-workflows/trash/research/diskless-in-memory-config.md
- docs/design/agent-workflows/trash/harness-port-redesign/README.md
- docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-no-logging.md
- docs/design/agent-workflows/ground-truth.md
- docs/design/agent-workflows/sdk-local-tools/review/evidence/app-mcp-reassign.md
- docs/design/vault-named-secrets/status.md
- docs/design/vault-named-secrets/research.md
🚧 Files skipped from review as they are similar to previous changes (11)
- docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/package.json
- docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/debug-events.ts
- docs/design/agent-workflows/triggers.md
- docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/build_snapshot.py
- docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/dump-full.ts
- docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/cleanup.py
- docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/run.ts
- docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/bench_coldstart.py
- docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/agenta-otel.ts
- docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/spike.ts
- docs/design/agent-workflows/protocol.md
| ## The span tree it produces |
| ``` |
Add fence languages to satisfy markdownlint MD040.
Line 24, Line 148, and Line 162 open fenced blocks without a language identifier.
Suggested patch
-```
+```text
 invoke_agent openinference.span.kind = AGENT (root, one per user prompt)
   turn N CHAIN
     chat <model> LLM model, latency, token usage, finish reason, messages
     execute_tool <name> TOOL args in, result out

-```
+```text
 @earendil-works/pi-coding-agent 0.79.4
 @opentelemetry/api 1.9.0
 @opentelemetry/exporter-trace-otlp-proto 0.54.0
 @opentelemetry/resources 1.28.0
 @opentelemetry/sdk-trace-base 1.28.0
 @opentelemetry/sdk-trace-node 1.28.0
 @opentelemetry/semantic-conventions 1.28.0

- ```
+ ```bash
  curl -s "${AGENTA_HOST}/api/spans/?trace_id=<id>" -H "Authorization: ApiKey ${AGENTA_API_KEY}"
  ```
Also applies to: 148-148, 162-162
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 24-24: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
Source: Linters/SAST tools
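The MD040 finding above can be reproduced mechanically before pushing. The following is a minimal sketch, not markdownlint's own implementation: `fences_without_language` is a hypothetical helper that tracks opening and closing backtick fences and reports the 1-based line numbers of opening fences with an empty info string. It assumes backtick fences only and ignores tilde fences and the finer CommonMark indentation rules.

```python
import re

# Optional indent (up to 3 spaces), 3+ backticks, then the info string.
FENCE = re.compile(r"^ {0,3}(`{3,})(.*)$")
MARK = "`" * 3  # built at runtime so this example contains no literal fence line


def fences_without_language(markdown: str) -> list[int]:
    """Return 1-based line numbers of opening fences that have no language."""
    missing = []
    open_ticks = 0  # length of the currently open fence, 0 when outside a block
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        ticks, info = len(match.group(1)), match.group(2).strip()
        if open_ticks == 0:
            # Opening fence: record it when the info string is empty.
            open_ticks = ticks
            if not info:
                missing.append(lineno)
        elif ticks >= open_ticks and not info:
            # Closing fence: at least as long as the opener, no info string.
            open_ticks = 0
    return missing


sample = f"intro\n{MARK}\nplain block\n{MARK}\n\n{MARK}bash\ncurl -s example\n{MARK}\n"
print(fences_without_language(sample))  # the first block opens on line 2 without a language
```

Running a check like this over the changed `.md` files would list the same openers the linter flags, which makes the "Also applies to" lines easy to confirm.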
| The service is harness-agnostic at its core, with the two ports the design doc calls out. |
| ``` |
Add a language identifier to the architecture code fence.
Line 154 opens a fenced block without a language, which triggers MD040.
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 154-154: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
Source: Linters/SAST tools
| `~/.pi/agent` doesn't exist yet, so no model is available. Pi can't reuse the `~/.codex` token directly; it needs its own login (same ChatGPT account, browser OAuth — I can't drive that for you): | ||
| ```bash |
| cd docs/design/agent-workflows/wp-1-pi-tracing/poc |
Fix the cd path in the Pi login snippet.
Line 268 points to docs/design/agent-workflows/wp-1-pi-tracing/poc, but this doc set places that folder under docs/design/agent-workflows/trash/wp-1-pi-tracing/poc.
Suggested patch
-cd docs/design/agent-workflows/wp-1-pi-tracing/poc
+cd docs/design/agent-workflows/trash/wp-1-pi-tracing/poc
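Stale `cd` paths like this one are easy to catch with a small script. The sketch below is an illustration only: `missing_cd_targets` is a hypothetical helper, and it assumes each `cd` command sits on its own line with a path relative to the repository root. The demo builds a temporary directory tree that mirrors the `trash/` layout described in this review.

```python
import re
import tempfile
from pathlib import Path

# A line that consists only of `cd <path>`.
CD_LINE = re.compile(r"^\s*cd\s+(\S+)\s*$")


def missing_cd_targets(markdown: str, repo_root: Path) -> list[str]:
    """Return `cd` targets in the text that are not directories under repo_root."""
    missing = []
    for line in markdown.splitlines():
        match = CD_LINE.match(line)
        if match and not (repo_root / match.group(1)).is_dir():
            missing.append(match.group(1))
    return missing


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Only the archived location exists in this demo tree.
    (root / "docs/design/agent-workflows/trash/wp-1-pi-tracing/poc").mkdir(parents=True)
    doc = (
        "cd docs/design/agent-workflows/wp-1-pi-tracing/poc\n"
        "cd docs/design/agent-workflows/trash/wp-1-pi-tracing/poc\n"
    )
    result = missing_cd_targets(doc, root)
    print(result)  # only the pre-move path is reported
```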
| await daytona.delete(sandbox) |
| log(" deleted.") |
Handle sandbox deletion failures to avoid leaked sandboxes.
Line 320 can raise on transient API errors and skip final cleanup confirmation, leaving costly sandboxes running.
Suggested patch
- await daytona.delete(sandbox)
- log(" deleted.")
+ try:
+     await daytona.delete(sandbox)
+     log(" deleted.")
+ except Exception as e:  # noqa: BLE001
+     log(f" delete failed for {sandbox.id}: {e}")
+     log(" sandbox may still be running; delete it manually.")
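Because the finding names transient API errors, one possible extension of the suggested patch is to retry the delete a bounded number of times before giving up. This is a sketch under stated assumptions, not part of the reviewed script: `delete_with_retry` and `FlakyClient` are hypothetical names, and the helper takes any async callable rather than depending on the Daytona SDK.

```python
import asyncio
from typing import Awaitable, Callable


async def delete_with_retry(
    delete: Callable[[], Awaitable[None]],
    attempts: int = 3,
    delay_s: float = 0.0,
) -> bool:
    """Call `delete` until it succeeds or `attempts` is exhausted.

    Returns True on success and False if every attempt raised, so the caller
    can tell the user to remove the sandbox manually.
    """
    for attempt in range(1, attempts + 1):
        try:
            await delete()
            return True
        except Exception as exc:  # noqa: BLE001 - cleanup must not crash the script
            print(f"  delete attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                await asyncio.sleep(delay_s)
    return False


class FlakyClient:
    """Hypothetical stand-in for a sandbox client; fails a fixed number of times."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    async def delete(self) -> None:
        self.calls += 1
        if self.calls <= self.failures:
            raise RuntimeError("transient API error")


client = FlakyClient(failures=2)
deleted = asyncio.run(delete_with_retry(client.delete))
print("deleted" if deleted else "sandbox may still be running; delete it manually")
```

In the reviewed script the callable would wrap the existing `daytona.delete(sandbox)` call, for example `lambda: daytona.delete(sandbox)`.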
| import os |
| import httpx |
| BASE = os.getenv("AGENTA_HOST", "http://144.76.237.122:8280").rstrip("/") |
Use HTTPS for API endpoint with credentials.
The default API host uses unencrypted HTTP (http://144.76.237.122:8280), which transmits the API key in cleartext. Even for development/POC scripts, use HTTPS or require it via environment override. This addresses CWE-319 (cleartext transmission of sensitive information).
🔒 Suggested fix
-BASE = os.getenv("AGENTA_HOST", "http://144.76.237.122:8280").rstrip("/")
+BASE = os.getenv("AGENTA_HOST", "https://144.76.237.122:8280").rstrip("/")
Alternatively, enforce HTTPS explicitly:
BASE = os.getenv("AGENTA_HOST", "https://144.76.237.122:8280").rstrip("/")
if not BASE.startswith("https://"):
    raise ValueError("AGENTA_HOST must use HTTPS; cleartext credentials are not allowed")
Source: Linters/SAST tools
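The "enforce HTTPS explicitly" variant can also be written as a small function that parses the URL instead of checking a string prefix. This is a sketch, assuming the same `AGENTA_HOST` variable name as the reviewed script; `resolve_base_url` and the `agenta.example` host are placeholders introduced here for illustration.

```python
from collections.abc import Mapping
from urllib.parse import urlsplit


def resolve_base_url(env: Mapping[str, str], default: str) -> str:
    """Return the API base URL, refusing any scheme other than https."""
    base = env.get("AGENTA_HOST", default).rstrip("/")
    parts = urlsplit(base)
    # Require both an https scheme and a host, so "https://" alone is rejected too.
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"AGENTA_HOST must be an https:// URL, got {base!r}")
    return base


# `agenta.example` is a placeholder host, not the real deployment.
print(resolve_base_url({"AGENTA_HOST": "https://agenta.example/"}, "https://agenta.example"))
try:
    resolve_base_url({"AGENTA_HOST": "http://agenta.example:8280"}, "https://agenta.example")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Passing `os.environ` as `env` gives the same behavior as the `os.getenv` call, while keeping the function easy to test with a plain dictionary.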
Reviewer guide: where to look
mmabrouk left a comment
Codex subagent review for #4777
No blocking findings. I compared the required ground-truth.md and pr-stack.md files against #4779's docs/agent-workflows branch; the two files have identical blob SHAs across #4777 and #4779. The changed-file lists also match, so the docs content is not drifting between those two PRs. The only cross-PR drift I saw is metadata/process-level: #4777's PR body points the runner-engine sibling at #4774, while #4779's PR body points at #4778. Since those look like duplicate docs PRs with different stack labels, coordinate which PR is authoritative before merging to avoid duplicated review state.
One non-blocking docs drift to fix or clarify:
- `docs/design/agent-workflows/status.md:22` says stale live comments and docs were updated. That conflicts with the same docs set at `docs/design/agent-workflows/implementation-review.md:123`, which says implementation comments still refer to WP-2/WP-7/WP-8, and with the sibling code PRs I sampled: `services/agent/src/server.ts:2` in #4774/#4778 still starts with `WP-2 Pi wrapper HTTP server`, and `sdks/python/agenta/sdk/agents/adapters/local.py:11` in #4771 still points at the old `docs/design/agent-workflows/scratch/...` path. Since this design folder asks reviewers to treat `ground-truth.md` and the current-state pages as authoritative, I would change the status bullet to say the stale-comment cleanup is still pending, or narrow it to docs-only cleanup if that was the intended claim.
The source-of-truth structure otherwise looks coherent: README.md and ground-truth.md both make ground-truth.md the implementation map, the current pages label planned/blocked/not-implemented work instead of presenting it as shipped, and trash/README.md clearly marks trash/ as historical and non-authoritative. I also sampled #4771-#4774 against the main implementation map: LocalBackend is a stub, NoopSessionStore is the default, the service composes the SDK runtime and backend selection, and the runner protocol/server files line up with the documented /run JSON/NDJSON contract.
Residual risk: this was a read-only GitHub review. I did not run a docs build, link checker, or any sibling PR tests.
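The blob-SHA comparison used in this review works because git derives a blob's object ID from its content alone. As a minimal sketch, assuming a repository that uses SHA-1 object IDs (the default), the ID is the SHA-1 of a `blob <size>\0` header followed by the file bytes. The helper names `git_blob_sha1` and `same_blob` are introduced here for illustration.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Compute the object ID git assigns to a blob in a SHA-1 repository."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def same_blob(a: bytes, b: bytes) -> bool:
    """Two files share a blob ID exactly when their bytes are identical."""
    return git_blob_sha1(a) == git_blob_sha1(b)


# The empty blob has a well-known ID in every SHA-1 git repository.
print(git_blob_sha1(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(same_blob(b"# Ground truth\n", b"# Ground truth\n"))
```

Identical blob IDs for `ground-truth.md` and `pr-stack.md` across two PRs therefore mean the file contents are byte-for-byte the same, regardless of branch or commit history.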
Agent-workflows: functional PR set
Sliced by functional area, final code only (no intermediate churn). Most PRs are independent off `main`; two pairs are stacked. This PR's base is `main`.
Context
Read this PR first if you are reviewing the agent-workflows stack. It is the docs functional slice, cut from `main`. It does not change behavior. It explains the behavior that the sibling code PRs ship, and it gives reviewers a map so they can read those PRs in the right order.
The agent-workflows feature was sliced by functional area. This slice carries all the design pages under `docs/design/agent-workflows/**` and the `docs/design/vault-named-secrets/**` pages that the named-secret work depends on.
What this changes
It adds the current-state design pages for the agent workflow and archives the historical material that led to them.
The current-state pages are:
- `ground-truth.md`: the implementation map. It lists every code surface, what is implemented, what is not, and where the tests live.
- `architecture.md`: the two-container runtime, the backends, and the harnesses.
- `ports-and-adapters.md`: the SDK runtime ports and the service and runner adapters that plug into them.
- `protocol.md`: `/invoke`, `/messages`, `/load-session`, and the internal `/run` runner wire.
- `sessions.md`: cold replay, streaming, session ids, and the missing session store.
- `agent-template.md`: the intended split between agent identity, harness config, and runtime infrastructure.
- `triggers.md`: the planned trigger and event integration.
- `meeting-alignment.md`: where the current work matches the design discussion and where it diverges.
- `implementation-review.md`: cleanup risks and slicing notes.
- `pr-stack.md`: the functional breakpoints for the reviewable PRs.
- `status.md`: cleanup state, decisions, and blockers.
The historical POC notes, research spikes, and superseded RFCs now live under `trash/`. They are kept for provenance, not as design truth. No code runs in this PR.
What to verify
This is a docs PR, so the key decision to confirm is whether the pages tell one consistent story.
- `ground-truth.md` is the single source of truth. The `README.md` and every other current-state page should defer to it when they disagree. The other pages should not restate facts that drift from it.
- `ground-truth.md` and the gaps in `architecture.md` are the parts most likely to drift from that code.
How to review this PR
Read `ground-truth.md` first. Then read `architecture.md` and `ports-and-adapters.md`. Those three pages carry the whole model.
The line count is large, but most of it is moved POC files under `trash/`. That folder is archival. Do not read it line by line. Open a `trash/` file only to check that a current page's claim matches the history it cites.