docs(agent): agent-workflows design and ground truth by mmabrouk · Pull Request #4777 · Agenta-AI/agenta

mmabrouk · 2026-06-19T16:29:10Z

Agent-workflows: functional PR set

Sliced by functional area, final code only (no intermediate churn). Most PRs are independent off main; two pairs are stacked. This PR's base is main.

feat(sdk): agent runtime behind backend/harness ports #4771: SDK agent runtime: ports, adapters, tools, messages protocol
- feat(agent): agent workflow service and tool-resolution API #4772: Agent service + tool-resolution API
feat(agent): runner wire contract and tool execution #4773: Runner wire contract + tool execution
- feat(agent): runner engines, server, and tracing #4774: Runner engines, server, tracing
feat(frontend): agent config playground controls #4775: Playground agent config UI
chore(hosting): wire the agent runner sidecar into compose #4776: Hosting compose wiring
docs(agent): agent-workflows design and ground truth #4777: Docs: design + ground truth <- you are here

Context

Read this PR first if you are reviewing the agent-workflows stack. It is the docs functional slice, cut from main. It does not change behavior. It explains the behavior that the sibling code PRs ship, and it gives reviewers a map so they can read those PRs in the right order.

The agent-workflows feature was sliced by functional area. This slice carries all the design pages under docs/design/agent-workflows/** and the docs/design/vault-named-secrets/** pages that the named-secret work depends on.

What this changes

It adds the current-state design pages for the agent workflow and archives the historical material that led to them.

The current-state pages are:

ground-truth.md: the implementation map. It lists every code surface, what is implemented, what is not, and where the tests live.
architecture.md: the two-container runtime, the backends, and the harnesses.
ports-and-adapters.md: the SDK runtime ports and the service and runner adapters that plug into them.
protocol.md: /invoke, /messages, /load-session, and the internal /run runner wire.
sessions.md: cold replay, streaming, session ids, and the missing session store.
agent-template.md: the intended split between agent identity, harness config, and runtime infrastructure.
triggers.md: the planned trigger and event integration.
meeting-alignment.md: where the current work matches the design discussion and where it diverges.
implementation-review.md: cleanup risks and slicing notes.
pr-stack.md: the functional breakpoints for the reviewable PRs.
status.md: cleanup state, decisions, and blockers.

The historical POC notes, research spikes, and superseded RFCs now live under trash/. They are kept for provenance, not as design truth. No code runs in this PR.

What to verify

This is a docs PR, so the key decision to confirm is whether the pages tell one consistent story.

Confirm ground-truth.md is the single source of truth. The README.md and every other current-state page should defer to it when they disagree. The other pages should not restate facts that drift from it.
Confirm the docs match the code that shipped in the sibling PRs: feat(sdk): agent runtime behind backend/harness ports #4771 (SDK runtime), feat(agent): agent workflow service and tool-resolution API #4772 (service), feat(agent): runner wire contract and tool execution #4773 (runner tools), and feat(agent): runner engines, server, and tracing #4774 (runner engines). The "Implemented" and "Not Implemented" sections in ground-truth.md and the gaps in architecture.md are the parts most likely to drift from that code.

How to review this PR

Read ground-truth.md first. Then read architecture.md and ports-and-adapters.md. Those three pages carry the whole model.

The line count is large, but most of it is moved POC files under trash/. That folder is archival. Do not read it line by line. Open a trash/ file only to check that a current page's claim matches the history it cites.

vercel · 2026-06-19T16:29:17Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 19, 2026 5:49pm

coderabbitai · 2026-06-19T16:29:37Z

📝 Walkthrough

Walkthrough

Adds agent-workflows design/status/review/archive documentation and a separate vault named secrets planning set.

Changes

Agent workflows documentation set

Layer / File(s)	Summary
Current workflow model and protocol `docs/design/agent-workflows/README.md`, `docs/design/agent-workflows/{ground-truth.md,status.md,architecture.md,ports-and-adapters.md,protocol.md,sessions.md,agent-template.md,triggers.md,meeting-alignment.md,open-issues.md,implementation-review.md,pr-stack.md}`, `docs/design/agent-workflows/adapters/*`	Documents the current workflow source of truth, HTTP and runner surfaces, harness/backend mappings, session behavior, adapter behavior for Pi/Claude/Agenta, open gaps, and follow-on planning.
SDK local tools design `docs/design/agent-workflows/sdk-local-tools/{README.md,context.md,codebase-conventions.md,conventions-review.md,organization-proposal.md,plan.md,research.md,status.md}`	Defines the standalone local-tools problem, repository conventions, organization proposal, phased plan, research findings, and status for SDK-owned tool resolution.
SDK local tools review package `docs/design/agent-workflows/sdk-local-tools/review/*`	Adds the review scope, plan, evidence notes, findings, risks, questions, progress, scorecard, summary, and metadata for the sdk-local-tools review.
Historical archive, research, and POCs `docs/design/agent-workflows/trash/*`	Adds archival READMEs, superseded redesign notes, old RFCs, research notes, and proof-of-concept scripts for tracing, agent service wrapping, Daytona sandboxing, tools, and Rivet ACP runtime experiments.

Vault named secrets planning docs

Layer / File(s)	Summary
Named secret scope, research, and implementation plan `docs/design/vault-named-secrets/{README.md,context.md,research.md,plan.md,status.md}`	Defines a separate design for project-scoped `custom_secret` vault entries, documents current vault behavior, lays out backend/frontend phases, and tracks planning status and verification steps.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.81% which is insufficient. The required threshold is 60.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically summarizes the main change: adding agent-workflows design and ground truth documentation.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	The PR description clearly documents the purpose, scope, and verification strategy for a comprehensive design documentation package for agent-workflows functionality.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch docs/agent-workflows-design

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 8

🧹 Nitpick comments (17)

docs/design/agent-workflows/status.md (1)
47-61: 💤 Low value

Consider refactoring the three successive "Should" questions into a bullet list for readability.

Lines 49–51 begin consecutive sentences with "Should", which LanguageTool flags as repetitive. While the current structure is correct and intentional (listing open decisions), a bulleted format would improve clarity:
- Should `LocalBackend` remain exported before it is implemented?
- Should MCP controls be hidden or constrained by selected harness/backend?
- Should `agenta` be hidden from non-local sandbox selections?
docs/design/agent-workflows/architecture.md (2)
27-54: ⚡ Quick win

Add language identifier to fenced code block.

The ASCII diagram block (lines 27–54) is missing a language specification. Add text or none for clarity:
📝 Proposed fix
-```
+```text
 browser / playground
     |
     | POST /invoke or POST /messages
     v
 services container
     Python workflow handler
     services/oss/src/agent/app.py
     |
     | POST /run, or spawn the runner CLI in local checkout mode
     v
 agent runner sidecar
     compose service: agent-pi
     Node HTTP server
     services/agent/src/server.ts
     |
     +-- in-process Pi engine
     |   services/agent/src/engines/pi.ts
     |
     +-- rivet engine
         services/agent/src/engines/rivet.ts
         |
         +-- sandbox-agent daemon
             |
             +-- ACP adapter: pi-acp or claude-agent-acp
                 |
                 +-- harness CLI: pi or claude
-```
+```
Source: Linters/SAST tools

56-64: 💤 Low value

Use hyphen in compound adjective "full-stack".

Line 62 uses "full stack environment" as a compound modifier. LanguageTool suggests the hyphenated form "full-stack environment" for clarity, or rephrasing to "entire stack environment":
📝 Proposed fixes

Option 1 (hyphenate):
-The sidecar deliberately does not inherit the full stack environment. Provider keys and
+The sidecar deliberately does not inherit the full-stack environment. Provider keys and
Option 2 (rephrase):
-The sidecar deliberately does not inherit the full stack environment. Provider keys and
+The sidecar deliberately does not inherit the entire stack environment. Provider keys and
Source: Linters/SAST tools
docs/design/agent-workflows/sessions.md (1)
43-51: 💤 Low value

Consider refactoring the three successive "If" conditions into a bullet list for readability.

Lines 46–50 begin consecutive sentences with "If", which LanguageTool flags as repetitive. While the structure clearly lists the three intended session_id semantics, a bulleted format would improve scannability:
The intended behavior is create-or-resume:

- If the client omits `session_id`, the server creates one and returns it.
- If the client supplies a known `session_id`, the server resumes that session.
- If the client supplies an unknown but valid `session_id`, the server creates a session using that id.
docs/design/agent-workflows/adapters/pi.md (2)
110-116: ⚡ Quick win

Add language identifier to fenced code block (span tree diagram).

The span tree diagram (lines 110–116) is missing a language specification. Add text or none:
📝 Proposed fix
-```
+```text
 invoke_agent            (AGENT)
   turn N                (CHAIN)
     chat <model>        (LLM)    real token usage from the provider call
     execute_tool <name> (TOOL)   one per tool the turn ran
-```
+```
Source: Linters/SAST tools

144-147: 💤 Low value

Simplify "in order to" to "to" for conciseness.

Line 146 uses the phrase "in order to", which LanguageTool suggests is unnecessarily wordy:
📝 Proposed fix
-For output, Pi streams pure text deltas over ACP (`agent_message_chunk`). The runner appends them in order to build the final answer.
+For output, Pi streams pure text deltas over ACP (`agent_message_chunk`). The runner appends them to build the final answer.
Source: Linters/SAST tools
docs/design/agent-workflows/trash/sdk-local-backend/status.md (1)

3-5: ⚡ Quick win

Clarify the document's purpose statement.

Lines 3–5 claim this is "the only page that describes things that do not fully exist yet," but the document immediately documents substantial completed work in the "Done and verified" section (lines 19–49). The distinction seems to be that this document uniquely contains both complete and incomplete work, whereas other design pages document only what is built. Rephrase to make that distinction explicit, for example: "This page documents both completed work and ongoing/future work; other design pages in this directory document only what is currently built."

docs/design/agent-workflows/sdk-local-tools/review/scope.md (1)

14-21: ⚡ Quick win

Update branch reference for final merge.

Line 19 references gitbutler/workspace as a "working tree," indicating this scope was drafted during development. Before merging to main, update this reference to the final branch name (or commit hash if merging directly to main). This ensures the scope document accurately reflects the reviewed state.
docs/design/agent-workflows/trash/harness-port-redesign/research.md (1)
15-20: 💤 Low value

Add language specifiers to fenced code blocks per markdownlint (MD040).

Code blocks should declare their language for proper syntax highlighting. The Python class at lines 15–20, TypeScript Session interface at lines 83–94, and SessionPersistDriver interface at lines 130–138 are missing language specifiers.
Fix language specifiers
-```
+```python
 class Harness(ABC):

-```
+```ts
 class Session {

-```
+```ts
 interface SessionPersistDriver {
Also applies to: 83-94, 130-138

Source: Linters/SAST tools
docs/design/agent-workflows/trash/harness-port-redesign/proposal.md (1)

141-141: 💤 Low value

Fix hyphenation in compound adjectives (LanguageTool grammar feedback).

When compound adjectives modify a noun, they should be hyphenated. Lines 141 and 146 have two instances:

Line 141: "client side streaming" → "client-side streaming"

Line 146: "per invoke sandboxes" → "per-invocation sandboxes" (or "per-invoke sandboxes" if abbreviating "invoke")

Also applies to: 146-146

Source: Linters/SAST tools

docs/design/agent-workflows/trash/harness-port-redesign/plan.md (1)

93-93: 💤 Low value

Hyphenate compound modifier on line 93.

"Cross cutting" should be "Cross-cutting" (compound adjective).

Source: Linters/SAST tools
docs/design/agent-workflows/trash/wp-1-pi-tracing/integrating-the-tracing-extension.md (1)
24-29: 💤 Low value

Add language specifiers to fenced code blocks (MD040).

Three code blocks lack language identifiers for syntax highlighting:

Lines 24–29: span tree structure (use ```text or ```yaml)

Lines 148–156: dependency list (use ```text)

Lines 162–170: curl command (use ```bash)
Add language specifiers
-```
+```text
 invoke_agent

-```
+```text
 `@earendil-works/pi-coding-agent`  0.79.4

-```
+```bash
 curl -s "${AGENTA_HOST}/api/spans/?trace_id=<id>"
Also applies to: 148-156, 162-170

Source: Linters/SAST tools
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/commit_agent_config.py (3)

1-4: 💤 Low value

Ensure Python formatting compliance before commit per coding guidelines.

This Python file should be run through ruff format and ruff check --fix before merging, per the coding guidelines for **/*.py files. While this is a POC script in the historical trash directory, maintaining formatting consistency is valuable.

Source: Coding guidelines

15-18: 💤 Low value

Consider HTTPS for the default Agenta host URL.

The default URL uses HTTP (line 15), which ast-grep flags as CWE-319. While this is a POC script pointing to a local dev box, consider using HTTPS as the default or allowing it to be overridden more safely. For production use, the caller would override AGENTA_HOST with an HTTPS URL, but a more defensive default would be clearer.

Source: Linters/SAST tools

23-75: ⚡ Quick win

Add error handling for JSON parsing and required env vars.

The script calls .json() on HTTP responses (lines 31, 69) without defensive handling if the response body is malformed. Additionally, AGENTA_API_KEY is read directly without a check (line 16). While this is a POC, adding basic validation would improve robustness:

Validate AGENTA_API_KEY is set before use

Wrap .json() calls in try/except to provide helpful error messages if the Agenta API response is unexpected
docs/design/agent-workflows/trash/research/pi-interaction.md (1)
58-58: ⚡ Quick win

Add language identifier to code fence.

Line 58 opens a code block without specifying the language. Based on the content (bash commands), please add ```bash instead of ```.
📝 Proposed fix
-\`\`\`
+\`\`\`bash
 npm install `@earendil-works/pi-coding-agent`   # SDK + CLI
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/architecture.md (1)
10-43: ⚡ Quick win

Add language identifier to ASCII diagram code fence.

The code fence at line 10 contains an ASCII diagram and has no language specified. Consider using ```text or ```diagram to clarify intent, though the content is clear as-is.
📝 Proposed fix
-\`\`\`
+\`\`\`text
                  unchanged
   ┌───────────────────────────────────────────────┐

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 2e8ebdac-ec42-4f55-881a-919108e5e706

📥 Commits

Reviewing files that changed from the base of the PR and between a97e608 and 330b1c6.

⛔ Files ignored due to path filters (1)

docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (95)

docs/design/agent-workflows/README.md
docs/design/agent-workflows/adapters/agenta.md
docs/design/agent-workflows/adapters/claude-code.md
docs/design/agent-workflows/adapters/pi.md
docs/design/agent-workflows/agent-template.md
docs/design/agent-workflows/architecture.md
docs/design/agent-workflows/ground-truth.md
docs/design/agent-workflows/implementation-review.md
docs/design/agent-workflows/meeting-alignment.md
docs/design/agent-workflows/open-issues.md
docs/design/agent-workflows/ports-and-adapters.md
docs/design/agent-workflows/pr-stack.md
docs/design/agent-workflows/protocol.md
docs/design/agent-workflows/sdk-local-tools/README.md
docs/design/agent-workflows/sdk-local-tools/codebase-conventions.md
docs/design/agent-workflows/sdk-local-tools/context.md
docs/design/agent-workflows/sdk-local-tools/conventions-review.md
docs/design/agent-workflows/sdk-local-tools/organization-proposal.md
docs/design/agent-workflows/sdk-local-tools/plan.md
docs/design/agent-workflows/sdk-local-tools/research.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/app-mcp-reassign.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/attach-orthogonal-mutation.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/description-default-inconsistency.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-no-logging.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-orthogonal-untested.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/handler-resolution-error.md
docs/design/agent-workflows/sdk-local-tools/review/findings.md
docs/design/agent-workflows/sdk-local-tools/review/metadata.json
docs/design/agent-workflows/sdk-local-tools/review/plan.md
docs/design/agent-workflows/sdk-local-tools/review/progress.md
docs/design/agent-workflows/sdk-local-tools/review/questions.md
docs/design/agent-workflows/sdk-local-tools/review/risks.md
docs/design/agent-workflows/sdk-local-tools/review/scope.md
docs/design/agent-workflows/sdk-local-tools/review/scorecard.md
docs/design/agent-workflows/sdk-local-tools/review/summary.md
docs/design/agent-workflows/sdk-local-tools/status.md
docs/design/agent-workflows/sessions.md
docs/design/agent-workflows/status.md
docs/design/agent-workflows/trash/README.md
docs/design/agent-workflows/trash/harness-port-redesign/README.md
docs/design/agent-workflows/trash/harness-port-redesign/implementation.md
docs/design/agent-workflows/trash/harness-port-redesign/plan.md
docs/design/agent-workflows/trash/harness-port-redesign/proposal.md
docs/design/agent-workflows/trash/harness-port-redesign/research.md
docs/design/agent-workflows/trash/harness-port-redesign/status.md
docs/design/agent-workflows/trash/old-rfcs/agent-protocol-rfc.md
docs/design/agent-workflows/trash/old-rfcs/streaming-and-sessions.md
docs/design/agent-workflows/trash/research/auth-secrets.md
docs/design/agent-workflows/trash/research/daytona-sandbox.md
docs/design/agent-workflows/trash/research/diskless-in-memory-config.md
docs/design/agent-workflows/trash/research/open-questions.md
docs/design/agent-workflows/trash/research/otel-instrumentation.md
docs/design/agent-workflows/trash/research/pi-interaction.md
docs/design/agent-workflows/trash/research/sandbox-sharing.md
docs/design/agent-workflows/trash/sdk-local-backend/status.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/README.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/integrating-the-tracing-extension.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/.env.example
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/README.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/agenta-otel.ts
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/package.json
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/run.ts
docs/design/agent-workflows/trash/wp-1-pi-tracing/tracing-in-the-agent-service.md
docs/design/agent-workflows/trash/wp-2-agent-service/README.md
docs/design/agent-workflows/trash/wp-2-agent-service/implementation-plan.md
docs/design/agent-workflows/trash/wp-2-agent-service/qa.md
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/README.md
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/README.md
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/bench_coldstart.py
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/build_snapshot.py
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/cleanup.py
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/run_agent.py
docs/design/agent-workflows/trash/wp-4-multi-message-output/README.md
docs/design/agent-workflows/trash/wp-5-chat-vs-completion/README.md
docs/design/agent-workflows/trash/wp-6-workflow-type-and-template/README.md
docs/design/agent-workflows/trash/wp-7-tools/README.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/README.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/architecture.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/context.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/isolation-and-fork.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/plan.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/commit_agent_config.py
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/debug-events.ts
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/dump-full.ts
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/package.json
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/spike.ts
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/research.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/status.md
docs/design/agent-workflows/triggers.md
docs/design/vault-named-secrets/README.md
docs/design/vault-named-secrets/context.md
docs/design/vault-named-secrets/plan.md
docs/design/vault-named-secrets/research.md
docs/design/vault-named-secrets/status.md

coderabbitai · 2026-06-19T16:41:29Z

+- [daytona-sandboxes] Daytona — Sandboxes (lifecycle, states, auto-stop/archive/delete,
+  refreshActivity, resource limits, per-sandbox isolation):
+  https://www.daytona.io/docs/en/sandboxes/
+- [daytona-process] Daytona — Process and Code Execution (exec/code_run vs Sessions, cwd,
+  env, create/execute/get/delete session): https://www.daytona.io/docs/en/process-code-execution/
+- [daytona-process-src] Daytona docs source — process-code-execution.mdx (verbatim session
+  example, SessionExecuteRequest fields):
+  https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/process-code-execution.mdx
+- [daytona-volumes] Daytona — Volumes (creation, VolumeMount, mount_path/subpath, FUSE,
+  mounting via CreateSandboxFromSnapshotParams): https://www.daytona.io/docs/en/volumes/
+- [daytona-volumes-src] Daytona docs source — volumes.mdx (verbatim "mounted at creation",
+  persistence, FUSE not transactional, last-write-wins):
+  https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/volumes.mdx
+- [daytona-fuse-issue] Daytona GitHub issue #3331 — FUSE volume permission limitations
+  (mv/touch/stat/copystat failures): https://github.com/daytonaio/daytona/issues/3331
+- [daytona-parallel-issue] Daytona GitHub issue #4001 — Design and Implement Parallel
+  Sandbox Execution API (fork filesystem+memory; current workaround = many sandboxes):
+  https://github.com/daytonaio/daytona/issues/4001
+- [daytona-blog-best] Northflank — "Best code execution sandbox for AI agents 2026"
+  (isolated sandbox per execution; Docker-default isolation weaker than microVMs):
+  https://northflank.com/blog/best-code-execution-sandbox-for-ai-agents
+- [pi-home] pi.dev — product overview (harness, modes, AGENTS.md/SYSTEM.md context):
+  https://pi.dev
+- [pi-docs] pi.dev — docs index (session tree, JSONL session format, RPC/SDK modes):
+  https://pi.dev/docs/latest
+- [pi-sdk] pi.dev — SDK/RPC (SessionManager.create/continueRecent/open/inMemory, cwd,
+  agentDir, runRpcMode, `--mode rpc --no-session`): https://pi.dev/docs/latest/sdk
+- Agenta repo — `api/oss/src/utils/env.py` `DaytonaConfig` (DAYTONA_API_KEY,


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Define the source labels as actual Markdown reference definitions.

The entries in ## Sources are formatted as list items, so labels like [daytona-process] are not defined and earlier references stay unresolved.

Suggested fix pattern

- - [daytona-sandboxes] Daytona — Sandboxes ...: - https://www.daytona.io/docs/en/sandboxes/ - - [daytona-process] Daytona — Process and Code Execution ...: - https://www.daytona.io/docs/en/process-code-execution/ + [daytona-sandboxes]: https://www.daytona.io/docs/en/sandboxes/ + [daytona-process]: https://www.daytona.io/docs/en/process-code-execution/ + [daytona-process-src]: https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/process-code-execution.mdx + [daytona-volumes]: https://www.daytona.io/docs/en/volumes/ + [daytona-volumes-src]: https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/volumes.mdx + [daytona-fuse-issue]: https://github.com/daytonaio/daytona/issues/3331 + [daytona-parallel-issue]: https://github.com/daytonaio/daytona/issues/4001 + [daytona-blog-best]: https://northflank.com/blog/best-code-execution-sandbox-for-ai-agents + [pi-home]: https://pi.dev + [pi-docs]: https://pi.dev/docs/latest + [pi-sdk]: https://pi.dev/docs/latest/sdk

Source: Linters/SAST tools

coderabbitai · 2026-06-19T16:41:29Z

+AGENTA_HOST=http://144.76.237.122:8280/
+AGENTA_API_KEY=your-agenta-project-api-key


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use a safe default collector endpoint in the example env file.

AGENTA_HOST is currently an http:// hard-coded host. That can expose Authorization: ApiKey and captured span content over plaintext transport if copied as-is.

Suggested fix

-AGENTA_HOST=http://144.76.237.122:8280/ +AGENTA_HOST=https://cloud.agenta.ai AGENTA_API_KEY=your-agenta-project-api-key

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

AGENTA_HOST=http://144.76.237.122:8280/

AGENTA_API_KEY=your-agenta-project-api-key

AGENTA_HOST=https://cloud.agenta.ai

AGENTA_API_KEY=your-agenta-project-api-key

🧰 Tools

🪛 dotenv-linter (4.0.0)

[warning] 3-3: [UnorderedKey] The AGENTA_API_KEY key should go before the AGENTA_HOST key

(UnorderedKey)

coderabbitai · 2026-06-19T16:41:29Z

+```
+invoke_agent              (openinference.span.kind = AGENT, carries session.id)
+  turn N                  (CHAIN)
+    chat <model>          (LLM   — model, latency, token usage, finish reason)
+    execute_tool <name>   (TOOL  — args + result)
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add languages to fenced code blocks to clear MD040.

Both fenced blocks are untyped; markdownlint flags them.

Suggested fix

-``` +```text invoke_agent (openinference.span.kind = AGENT, carries session.id) turn N (CHAIN) chat <model> (LLM — model, latency, token usage, finish reason) execute_tool <name> (TOOL — args + result)

@@
- +text
invoke_agent (agent) ag.data.inputs={prompt}, ag.data.outputs=text, ag.session.id, cumulative tokens
turn N (chain)
chat (chat) ag.data.inputs.prompt[] + ag.data.outputs.completion[] (OpenInference
messages), ag.meta.request.model, incremental token usage
execute_tool (tool) ag.data.inputs={args}, ag.data.outputs=result

Also applies to: 67-73

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 15-15: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Source: Linters/SAST tools

coderabbitai · 2026-06-19T16:41:29Z

+```
+_agent                 workflow   (the Python /invoke span, root)
+  invoke_agent         AGENT      (the Pi run, now a child of _agent)
+    turn N             CHAIN
+      chat <model>     LLM        model, tokens, cost, message thread
+      execute_tool ... TOOL
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add fenced-code languages to satisfy markdownlint (MD040).

The two code fences are missing language identifiers.

Suggested fix

-``` +```text _agent workflow (the Python /invoke span, root) invoke_agent AGENT (the Pi run, now a child of _agent) turn N CHAIN chat <model> LLM model, tokens, cost, message thread execute_tool ... TOOL

@@

curl -s "${AGENTA_HOST}/api/spans/?trace_id=<id>" -H "Authorization: ApiKey ${AGENTA_API_KEY}"

</details> Also applies to: 102-104 <details> <summary>🧰 Tools</summary> <details> <summary>🪛 markdownlint-cli2 (0.22.1)</summary> [warning] 26-26: Fenced code blocks should have a language specified (MD040, fenced-code-language) </details> </details>   _Source: Linters/SAST tools_ 

coderabbitai · 2026-06-19T16:41:29Z

+
+For pi.dev it might make sense to have two adapters one for RPC and the other for json 
+
+Success for this WP1 is:


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the work-package label typo in success criteria.

This section is in WP-2 but says “Success for this WP1,” which makes scope tracking ambiguous.

Suggested fix

-Success for this WP1 is: +Success for this WP-2 is:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Success for this WP1 is:

Success for this WP-2 is:

coderabbitai · 2026-06-19T16:41:29Z

+for snap in SNAPSHOTS:
+    times: list[float] = []
+    for i in range(N):
+        t = time.monotonic()
+        sb = daytona.create(
+            CreateSandboxFromSnapshotParams(snapshot=snap, auto_stop_interval=0),
+            timeout=120,
+        )
+        dt = time.monotonic() - t
+        times.append(dt)
+        print(f"{snap:20} run {i + 1}/{N}: {dt:.2f}s  state={sb.state}", flush=True)
+        daytona.delete(sb)
+    results[snap] = times


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guarantee sandbox teardown with try/finally per run.

If an exception occurs after daytona.create(...) and before daytona.delete(sb), the sandbox can be left running.

Proposed fix

for snap in SNAPSHOTS: times: list[float] = [] for i in range(N): - t = time.monotonic() - sb = daytona.create( - CreateSandboxFromSnapshotParams(snapshot=snap, auto_stop_interval=0), - timeout=120, - ) - dt = time.monotonic() - t - times.append(dt) - print(f"{snap:20} run {i + 1}/{N}: {dt:.2f}s state={sb.state}", flush=True) - daytona.delete(sb) + sb = None + try: + t = time.monotonic() + sb = daytona.create( + CreateSandboxFromSnapshotParams(snapshot=snap, auto_stop_interval=0), + timeout=120, + ) + dt = time.monotonic() - t + times.append(dt) + print(f"{snap:20} run {i + 1}/{N}: {dt:.2f}s state={sb.state}", flush=True) + finally: + if sb is not None: + daytona.delete(sb) results[snap] = times

coderabbitai · 2026-06-19T16:41:29Z

+def arg(name: str, default: str) -> str:
+    return sys.argv[sys.argv.index(name) + 1] if name in sys.argv else default


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Handle missing flag values in arg() to avoid IndexError.

Passing --auth or --model without a value currently crashes instead of exiting cleanly.

Proposed fix

def arg(name: str, default: str) -> str: - return sys.argv[sys.argv.index(name) + 1] if name in sys.argv else default + if name not in sys.argv: + return default + idx = sys.argv.index(name) + 1 + if idx >= len(sys.argv) or sys.argv[idx].startswith("--"): + raise ValueError(f"Missing value for {name}") + return sys.argv[idx]

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def arg(name: str, default: str) -> str:

return sys.argv[sys.argv.index(name) + 1] if name in sys.argv else default

def arg(name: str, default: str) -> str:

if name not in sys.argv:

return default

idx = sys.argv.index(name) + 1

if idx >= len(sys.argv) or sys.argv[idx].startswith("--"):

raise ValueError(f"Missing value for {name}")

return sys.argv[idx]

coderabbitai · 2026-06-19T16:41:29Z

+            pi_cmd = (
+                f"cd {run_dir} && TMPDIR={run_dir}/tmp "
+                f"pi -p {json.dumps(PROMPT)} "
+                f"--mode json --approve --provider {provider} --model {model} "
+                f"-t read,bash,edit,write,ls "
+                f"--session-dir {run_dir}/.pi-sessions --name {session_id} "
+                f"< /dev/null"
+            )


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Quote CLI-derived values before composing the shell command.

provider/model are interpolated directly into pi_cmd; crafted values can break out of the intended command.

Proposed fix

import asyncio import json import os +import shlex import sys @@ pi_cmd = ( - f"cd {run_dir} && TMPDIR={run_dir}/tmp " - f"pi -p {json.dumps(PROMPT)} " - f"--mode json --approve --provider {provider} --model {model} " + f"cd {shlex.quote(run_dir)} && TMPDIR={shlex.quote(run_dir + '/tmp')} " + f"pi -p {shlex.quote(PROMPT)} " + f"--mode json --approve --provider {shlex.quote(provider)} --model {shlex.quote(model)} " f"-t read,bash,edit,write,ls " - f"--session-dir {run_dir}/.pi-sessions --name {session_id} " + f"--session-dir {shlex.quote(run_dir + '/.pi-sessions')} --name {shlex.quote(session_id)} " f"< /dev/null" )

🧰 Tools

🪛 ast-grep (0.43.0)

[info] 260-260: use jsonify instead of json.dumps for JSON output
Context: json.dumps(PROMPT)
Note: Security best practice.

(use-jsonify)

coderabbitai

Actionable comments posted: 5

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: bac6cb92-ec42-46bb-87c6-e1728ccfbda2

📥 Commits

Reviewing files that changed from the base of the PR and between 330b1c6 and 36eb5d5.

⛔ Files ignored due to path filters (1)

docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (95)

docs/design/agent-workflows/README.md
docs/design/agent-workflows/adapters/agenta.md
docs/design/agent-workflows/adapters/claude-code.md
docs/design/agent-workflows/adapters/pi.md
docs/design/agent-workflows/agent-template.md
docs/design/agent-workflows/architecture.md
docs/design/agent-workflows/ground-truth.md
docs/design/agent-workflows/implementation-review.md
docs/design/agent-workflows/meeting-alignment.md
docs/design/agent-workflows/open-issues.md
docs/design/agent-workflows/ports-and-adapters.md
docs/design/agent-workflows/pr-stack.md
docs/design/agent-workflows/protocol.md
docs/design/agent-workflows/sdk-local-tools/README.md
docs/design/agent-workflows/sdk-local-tools/codebase-conventions.md
docs/design/agent-workflows/sdk-local-tools/context.md
docs/design/agent-workflows/sdk-local-tools/conventions-review.md
docs/design/agent-workflows/sdk-local-tools/organization-proposal.md
docs/design/agent-workflows/sdk-local-tools/plan.md
docs/design/agent-workflows/sdk-local-tools/research.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/app-mcp-reassign.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/attach-orthogonal-mutation.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/description-default-inconsistency.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-no-logging.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-orthogonal-untested.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/handler-resolution-error.md
docs/design/agent-workflows/sdk-local-tools/review/findings.md
docs/design/agent-workflows/sdk-local-tools/review/metadata.json
docs/design/agent-workflows/sdk-local-tools/review/plan.md
docs/design/agent-workflows/sdk-local-tools/review/progress.md
docs/design/agent-workflows/sdk-local-tools/review/questions.md
docs/design/agent-workflows/sdk-local-tools/review/risks.md
docs/design/agent-workflows/sdk-local-tools/review/scope.md
docs/design/agent-workflows/sdk-local-tools/review/scorecard.md
docs/design/agent-workflows/sdk-local-tools/review/summary.md
docs/design/agent-workflows/sdk-local-tools/status.md
docs/design/agent-workflows/sessions.md
docs/design/agent-workflows/status.md
docs/design/agent-workflows/trash/README.md
docs/design/agent-workflows/trash/harness-port-redesign/README.md
docs/design/agent-workflows/trash/harness-port-redesign/implementation.md
docs/design/agent-workflows/trash/harness-port-redesign/plan.md
docs/design/agent-workflows/trash/harness-port-redesign/proposal.md
docs/design/agent-workflows/trash/harness-port-redesign/research.md
docs/design/agent-workflows/trash/harness-port-redesign/status.md
docs/design/agent-workflows/trash/old-rfcs/agent-protocol-rfc.md
docs/design/agent-workflows/trash/old-rfcs/streaming-and-sessions.md
docs/design/agent-workflows/trash/research/auth-secrets.md
docs/design/agent-workflows/trash/research/daytona-sandbox.md
docs/design/agent-workflows/trash/research/diskless-in-memory-config.md
docs/design/agent-workflows/trash/research/open-questions.md
docs/design/agent-workflows/trash/research/otel-instrumentation.md
docs/design/agent-workflows/trash/research/pi-interaction.md
docs/design/agent-workflows/trash/research/sandbox-sharing.md
docs/design/agent-workflows/trash/sdk-local-backend/status.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/README.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/integrating-the-tracing-extension.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/.env.example
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/README.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/agenta-otel.ts
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/package.json
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/run.ts
docs/design/agent-workflows/trash/wp-1-pi-tracing/tracing-in-the-agent-service.md
docs/design/agent-workflows/trash/wp-2-agent-service/README.md
docs/design/agent-workflows/trash/wp-2-agent-service/implementation-plan.md
docs/design/agent-workflows/trash/wp-2-agent-service/qa.md
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/README.md
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/README.md
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/bench_coldstart.py
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/build_snapshot.py
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/cleanup.py
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/run_agent.py
docs/design/agent-workflows/trash/wp-4-multi-message-output/README.md
docs/design/agent-workflows/trash/wp-5-chat-vs-completion/README.md
docs/design/agent-workflows/trash/wp-6-workflow-type-and-template/README.md
docs/design/agent-workflows/trash/wp-7-tools/README.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/README.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/architecture.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/context.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/isolation-and-fork.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/plan.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/commit_agent_config.py
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/debug-events.ts
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/dump-full.ts
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/package.json
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/spike.ts
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/research.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/status.md
docs/design/agent-workflows/triggers.md
docs/design/vault-named-secrets/README.md
docs/design/vault-named-secrets/context.md
docs/design/vault-named-secrets/plan.md
docs/design/vault-named-secrets/research.md
docs/design/vault-named-secrets/status.md

✅ Files skipped from review due to trivial changes (47)

docs/design/agent-workflows/sdk-local-tools/review/risks.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-orthogonal-untested.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py
docs/design/vault-named-secrets/README.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/description-default-inconsistency.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/package.json
docs/design/agent-workflows/sdk-local-tools/review/evidence/attach-orthogonal-mutation.md
docs/design/agent-workflows/agent-template.md
docs/design/agent-workflows/trash/wp-1-pi-tracing/README.md
docs/design/agent-workflows/sdk-local-tools/review/metadata.json
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/README.md
docs/design/agent-workflows/trash/wp-4-multi-message-output/README.md
docs/design/agent-workflows/trash/README.md
docs/design/agent-workflows/sdk-local-tools/review/scorecard.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/context.md
docs/design/agent-workflows/sdk-local-tools/review/questions.md
docs/design/agent-workflows/pr-stack.md
docs/design/agent-workflows/trash/sdk-local-backend/status.md
docs/design/agent-workflows/open-issues.md
docs/design/agent-workflows/sdk-local-tools/review/scope.md
docs/design/agent-workflows/meeting-alignment.md
docs/design/agent-workflows/ports-and-adapters.md
docs/design/agent-workflows/sdk-local-tools/review/findings.md
docs/design/agent-workflows/sdk-local-tools/review/progress.md
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/README.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/README.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/handler-resolution-error.md
docs/design/agent-workflows/README.md
docs/design/agent-workflows/trash/harness-port-redesign/implementation.md
docs/design/agent-workflows/sdk-local-tools/review/plan.md
docs/design/agent-workflows/trash/wp-2-agent-service/README.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/plan.md
docs/design/agent-workflows/adapters/agenta.md
docs/design/agent-workflows/trash/research/daytona-sandbox.md
docs/design/agent-workflows/sdk-local-tools/status.md
docs/design/agent-workflows/trash/wp-5-chat-vs-completion/README.md
docs/design/vault-named-secrets/context.md
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/isolation-and-fork.md
docs/design/agent-workflows/trash/wp-6-workflow-type-and-template/README.md
docs/design/agent-workflows/sdk-local-tools/review/summary.md
docs/design/agent-workflows/trash/research/diskless-in-memory-config.md
docs/design/agent-workflows/trash/harness-port-redesign/README.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-no-logging.md
docs/design/agent-workflows/ground-truth.md
docs/design/agent-workflows/sdk-local-tools/review/evidence/app-mcp-reassign.md
docs/design/vault-named-secrets/status.md
docs/design/vault-named-secrets/research.md

🚧 Files skipped from review as they are similar to previous changes (11)

docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/package.json
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/debug-events.ts
docs/design/agent-workflows/triggers.md
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/build_snapshot.py
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/dump-full.ts
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/cleanup.py
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/run.ts
docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/bench_coldstart.py
docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/agenta-otel.ts
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/spike.ts
docs/design/agent-workflows/protocol.md

coderabbitai · 2026-06-19T17:54:18Z

+
+## The span tree it produces
+
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add fence languages to satisfy markdownlint MD040.

Line 24, Line 148, and Line 162 open fenced blocks without a language identifier.

Suggested patch

-``` +```text invoke_agent openinference.span.kind = AGENT (root, one per user prompt) turn N CHAIN chat <model> LLM model, latency, token usage, finish reason, messages execute_tool <name> TOOL args in, result out

- +text
@earendil-works/pi-coding-agent 0.79.4
@opentelemetry/api 1.9.0
@opentelemetry/exporter-trace-otlp-proto 0.54.0
@opentelemetry/resources 1.28.0
@opentelemetry/sdk-trace-base 1.28.0
@opentelemetry/sdk-trace-node 1.28.0
@opentelemetry/semantic-conventions 1.28.0

- ``` + ```bash curl -s "${AGENTA_HOST}/api/spans/?trace_id=<id>" -H "Authorization: ApiKey ${AGENTA_API_KEY}" ```

Also applies to: 148-148, 162-162

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 24-24: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Source: Linters/SAST tools

coderabbitai · 2026-06-19T17:54:18Z

+
+The service is harness-agnostic at its core, with the two ports the design doc calls out.
+
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language identifier to the architecture code fence.

Line 154 opens a fenced block without a language, which triggers MD040.

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 154-154: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Source: Linters/SAST tools

coderabbitai · 2026-06-19T17:54:18Z

+`~/.pi/agent` doesn't exist yet, so no model is available. Pi can't reuse the `~/.codex` token directly; it needs its own login (same ChatGPT account, browser OAuth — I can't drive that for you):
+
+```bash
+cd docs/design/agent-workflows/wp-1-pi-tracing/poc


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the cd path in the Pi login snippet.

Line 268 points to docs/design/agent-workflows/wp-1-pi-tracing/poc, but this doc set places that folder under docs/design/agent-workflows/trash/wp-1-pi-tracing/poc.

Suggested patch

-cd docs/design/agent-workflows/wp-1-pi-tracing/poc +cd docs/design/agent-workflows/trash/wp-1-pi-tracing/poc

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

cd docs/design/agent-workflows/wp-1-pi-tracing/poc

cd docs/design/agent-workflows/trash/wp-1-pi-tracing/poc

coderabbitai · 2026-06-19T17:54:18Z

+                await daytona.delete(sandbox)
+                log("      deleted.")


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle sandbox deletion failures to avoid leaked sandboxes.

Line 320 can raise on transient API errors and skip final cleanup confirmation, leaving costly sandboxes running.

Suggested patch

- await daytona.delete(sandbox) - log(" deleted.") + try: + await daytona.delete(sandbox) + log(" deleted.") + except Exception as e: # noqa: BLE001 + log(f" delete failed for {sandbox.id}: {e}") + log(" sandbox may still be running; delete it manually.")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

await daytona.delete(sandbox)

log(" deleted.")

try:

await daytona.delete(sandbox)

log(" deleted.")

except Exception as e: # noqa: BLE001

log(f" delete failed for {sandbox.id}: {e}")

log(" sandbox may still be running; delete it manually.")

coderabbitai · 2026-06-19T17:54:18Z

+import os
+import httpx
+
+BASE = os.getenv("AGENTA_HOST", "http://144.76.237.122:8280").rstrip("/")


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use HTTPS for API endpoint with credentials.

The default API host uses unencrypted HTTP (http://144.76.237.122:8280), which transmits the API key in cleartext. Even for development/POC scripts, use HTTPS or require it via environment override. This addresses CWE-319 (cleartext transmission of sensitive information).

🔒 Suggested fix

-BASE = os.getenv("AGENTA_HOST", "http://144.76.237.122:8280").rstrip("/") +BASE = os.getenv("AGENTA_HOST", "https://144.76.237.122:8280").rstrip("/")

Alternatively, enforce HTTPS explicitly:

BASE = os.getenv("AGENTA_HOST", "https://144.76.237.122:8280").rstrip("/") if not BASE.startswith("https://"): raise ValueError("AGENTA_HOST must use HTTPS; cleartext credentials are not allowed")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

BASE = os.getenv("AGENTA_HOST", "http://144.76.237.122:8280").rstrip("/")

BASE = os.getenv("AGENTA_HOST", "https://144.76.237.122:8280").rstrip("/")

Source: Linters/SAST tools

mmabrouk · 2026-06-19T17:54:52Z

Reviewer guide: where to look

docs/design/agent-workflows/ground-truth.md — read this first. It is the implementation map and the single source of truth. Every other page defers to it.
docs/design/agent-workflows/architecture.md — the two-container runtime, the backends, and the harness matrix.
docs/design/agent-workflows/ports-and-adapters.md — the SDK runtime ports and the service and runner adapters that plug into them.
docs/design/agent-workflows/protocol.md — /invoke, /messages, /load-session, and the internal /run runner wire.
docs/design/agent-workflows/pr-stack.md — the functional breakpoints that explain how the sibling code PRs are sliced.
docs/design/agent-workflows/trash/** — archival POC notes, research spikes, and old RFCs. This is most of the line count. Do not read it line by line.

mmabrouk

Codex subagent review for #4777

No blocking findings. I compared the required ground-truth.md and pr-stack.md files against #4779's docs/agent-workflows branch; the two files have identical blob SHAs across #4777 and #4779. The changed-file lists also match, so the docs content is not drifting between those two PRs. The only cross-PR drift I saw is metadata/process-level: #4777's PR body points the runner-engine sibling at #4774, while #4779's PR body points at #4778. Since those look like duplicate docs PRs with different stack labels, coordinate which PR is authoritative before merging to avoid duplicated review state.

One non-blocking docs drift to fix or clarify:

docs/design/agent-workflows/status.md:22 says stale live comments and docs were updated. That conflicts with the same docs set at docs/design/agent-workflows/implementation-review.md:123, which says implementation comments still refer to WP-2/WP-7/WP-8, and with the sibling code PRs I sampled: services/agent/src/server.ts:2 in #4774/#4778 still starts with WP-2 Pi wrapper HTTP server, and sdks/python/agenta/sdk/agents/adapters/local.py:11 in #4771 still points at the old docs/design/agent-workflows/scratch/... path. Since this design folder asks reviewers to treat ground-truth.md and the current-state pages as authoritative, I would change the status bullet to say the stale-comment cleanup is still pending, or narrow it to docs-only cleanup if that was the intended claim.

The source-of-truth structure otherwise looks coherent: README.md and ground-truth.md both make ground-truth.md the implementation map, the current pages label planned/blocked/not-implemented work instead of presenting it as shipped, and trash/README.md clearly marks trash/ as historical and non-authoritative. I also sampled #4771-#4774 against the main implementation map: LocalBackend is a stub, NoopSessionStore is the default, the service composes the SDK runtime and backend selection, and the runner protocol/server files line up with the documented /run JSON/NDJSON contract.

Residual risk: this was a read-only GitHub review. I did not run a docs build, link checker, or any sibling PR tests.

dosubot Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Jun 19, 2026

dosubot Bot added the documentation Improvements or additions to documentation label Jun 19, 2026

vercel Bot deployed to Preview June 19, 2026 16:29 View deployment

coderabbitai Bot reviewed Jun 19, 2026

View reviewed changes

docs(agent): agent-workflows design and ground truth

36eb5d5

mmabrouk force-pushed the docs/agent-workflows-design branch from 330b1c6 to 36eb5d5 Compare June 19, 2026 17:47

vercel Bot deployed to Preview June 19, 2026 17:49 View deployment

coderabbitai Bot reviewed Jun 19, 2026

View reviewed changes

mmabrouk commented Jun 19, 2026

View reviewed changes

		AGENTA_HOST=http://144.76.237.122:8280/
		AGENTA_API_KEY=your-agenta-project-api-key


		For pi.dev it might make sense to have two adapters one for RPC and the other for json

		Success for this WP1 is:

		def arg(name: str, default: str) -> str:
		return sys.argv[sys.argv.index(name) + 1] if name in sys.argv else default

-def arg(name: str, default: str) -> str:
-    return sys.argv[sys.argv.index(name) + 1] if name in sys.argv else default
+def arg(name: str, default: str) -> str:
+    if name not in sys.argv:
+        return default
+    idx = sys.argv.index(name) + 1
+    if idx >= len(sys.argv) or sys.argv[idx].startswith("--"):
+        raise ValueError(f"Missing value for {name}")
+    return sys.argv[idx]


		The service is harness-agnostic at its core, with the two ports the design doc calls out.

		```

	cd docs/design/agent-workflows/wp-1-pi-tracing/poc
	cd docs/design/agent-workflows/trash/wp-1-pi-tracing/poc

	BASE = os.getenv("AGENTA_HOST", "http://144.76.237.122:8280").rstrip("/")
	BASE = os.getenv("AGENTA_HOST", "https://144.76.237.122:8280").rstrip("/")

Conversation

mmabrouk commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Agent-workflows: functional PR set

Context

What this changes

What to verify

How to review this PR

Uh oh!

vercel Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

mmabrouk commented Jun 19, 2026

Reviewer guide: where to look

Uh oh!

mmabrouk left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mmabrouk commented Jun 19, 2026 •

edited

Loading

vercel Bot commented Jun 19, 2026 •

edited

Loading

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading