Skip to content

docs(agent): agent-workflows design and ground truth#4777

Open
mmabrouk wants to merge 1 commit into
mainfrom
docs/agent-workflows-design
Open

docs(agent): agent-workflows design and ground truth#4777
mmabrouk wants to merge 1 commit into
mainfrom
docs/agent-workflows-design

Conversation

@mmabrouk

@mmabrouk mmabrouk commented Jun 19, 2026

Copy link
Copy Markdown
Member

Agent-workflows: functional PR set

Sliced by functional area, final code only (no intermediate churn). Most PRs are independent off main; two pairs are stacked. This PR's base is main.

Context

Read this PR first if you are reviewing the agent-workflows stack. It is the docs functional slice, cut from main. It does not change behavior. It explains the behavior that the sibling code PRs ship, and it gives reviewers a map so they can read those PRs in the right order.

The agent-workflows feature was sliced by functional area. This slice carries all the design pages under docs/design/agent-workflows/** and the docs/design/vault-named-secrets/** pages that the named-secret work depends on.

What this changes

It adds the current-state design pages for the agent workflow and archives the historical material that led to them.

The current-state pages are:

  • ground-truth.md: the implementation map. It lists every code surface, what is implemented, what is not, and where the tests live.
  • architecture.md: the two-container runtime, the backends, and the harnesses.
  • ports-and-adapters.md: the SDK runtime ports and the service and runner adapters that plug into them.
  • protocol.md: /invoke, /messages, /load-session, and the internal /run runner wire.
  • sessions.md: cold replay, streaming, session ids, and the missing session store.
  • agent-template.md: the intended split between agent identity, harness config, and runtime infrastructure.
  • triggers.md: the planned trigger and event integration.
  • meeting-alignment.md: where the current work matches the design discussion and where it diverges.
  • implementation-review.md: cleanup risks and slicing notes.
  • pr-stack.md: the functional breakpoints for the reviewable PRs.
  • status.md: cleanup state, decisions, and blockers.

The historical POC notes, research spikes, and superseded RFCs now live under trash/. They are kept for provenance, not as design truth. No code runs in this PR.

What to verify

This is a docs PR, so the key decision to confirm is whether the pages tell one consistent story.

How to review this PR

Read ground-truth.md first. Then read architecture.md and ports-and-adapters.md. Those three pages carry the whole model.

The line count is large, but most of it is moved POC files under trash/. That folder is archival. Do not read it line by line. Open a trash/ file only to check that a current page's claim matches the history it cites.

@dosubot dosubot Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Jun 19, 2026
@vercel

vercel Bot commented Jun 19, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agenta-documentation Ready Ready Preview, Comment Jun 19, 2026 5:49pm

Request Review

@dosubot dosubot Bot added the documentation Improvements or additions to documentation label Jun 19, 2026
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds agent-workflows design/status/review/archive documentation and a separate vault named secrets planning set.

Changes

Agent workflows documentation set

Layer / File(s) Summary
Current workflow model and protocol
docs/design/agent-workflows/README.md, docs/design/agent-workflows/{ground-truth.md,status.md,architecture.md,ports-and-adapters.md,protocol.md,sessions.md,agent-template.md,triggers.md,meeting-alignment.md,open-issues.md,implementation-review.md,pr-stack.md}, docs/design/agent-workflows/adapters/*
Documents the current workflow source of truth, HTTP and runner surfaces, harness/backend mappings, session behavior, adapter behavior for Pi/Claude/Agenta, open gaps, and follow-on planning.
SDK local tools design
docs/design/agent-workflows/sdk-local-tools/{README.md,context.md,codebase-conventions.md,conventions-review.md,organization-proposal.md,plan.md,research.md,status.md}
Defines the standalone local-tools problem, repository conventions, organization proposal, phased plan, research findings, and status for SDK-owned tool resolution.
SDK local tools review package
docs/design/agent-workflows/sdk-local-tools/review/*
Adds the review scope, plan, evidence notes, findings, risks, questions, progress, scorecard, summary, and metadata for the sdk-local-tools review.
Historical archive, research, and POCs
docs/design/agent-workflows/trash/*
Adds archival READMEs, superseded redesign notes, old RFCs, research notes, and proof-of-concept scripts for tracing, agent service wrapping, Daytona sandboxing, tools, and Rivet ACP runtime experiments.

Vault named secrets planning docs

Layer / File(s) Summary
Named secret scope, research, and implementation plan
docs/design/vault-named-secrets/{README.md,context.md,research.md,plan.md,status.md}
Defines a separate design for project-scoped custom_secret vault entries, documents current vault behavior, lays out backend/frontend phases, and tracks planning status and verification steps.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.81% which is insufficient. The required threshold is 60.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically summarizes the main change: adding agent-workflows design and ground truth documentation.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description clearly documents the purpose, scope, and verification strategy for a comprehensive design documentation package for agent-workflows functionality.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch docs/agent-workflows-design

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Nitpick comments (17)
docs/design/agent-workflows/status.md (1)

47-61: 💤 Low value

Consider refactoring the three successive "Should" questions into a bullet list for readability.

Lines 49–51 begin consecutive sentences with "Should", which LanguageTool flags as repetitive. While the current structure is correct and intentional (listing open decisions), a bulleted format would improve clarity:

- Should `LocalBackend` remain exported before it is implemented?
- Should MCP controls be hidden or constrained by selected harness/backend?
- Should `agenta` be hidden from non-local sandbox selections?
docs/design/agent-workflows/architecture.md (2)

27-54: ⚡ Quick win

Add language identifier to fenced code block.

The ASCII diagram block (lines 27–54) is missing a language specification. Add text or none for clarity:

📝 Proposed fix
-```
+```text
 browser / playground
     |
     | POST /invoke or POST /messages
     v
 services container
     Python workflow handler
     services/oss/src/agent/app.py
     |
     | POST /run, or spawn the runner CLI in local checkout mode
     v
 agent runner sidecar
     compose service: agent-pi
     Node HTTP server
     services/agent/src/server.ts
     |
     +-- in-process Pi engine
     |   services/agent/src/engines/pi.ts
     |
     +-- rivet engine
         services/agent/src/engines/rivet.ts
         |
         +-- sandbox-agent daemon
             |
             +-- ACP adapter: pi-acp or claude-agent-acp
                 |
                 +-- harness CLI: pi or claude
-```
+```

Source: Linters/SAST tools


56-64: 💤 Low value

Use hyphen in compound adjective "full-stack".

Line 62 uses "full stack environment" as a compound modifier. LanguageTool suggests the hyphenated form "full-stack environment" for clarity, or rephrasing to "entire stack environment":

📝 Proposed fixes

Option 1 (hyphenate):

-The sidecar deliberately does not inherit the full stack environment. Provider keys and
+The sidecar deliberately does not inherit the full-stack environment. Provider keys and

Option 2 (rephrase):

-The sidecar deliberately does not inherit the full stack environment. Provider keys and
+The sidecar deliberately does not inherit the entire stack environment. Provider keys and

Source: Linters/SAST tools

docs/design/agent-workflows/sessions.md (1)

43-51: 💤 Low value

Consider refactoring the three successive "If" conditions into a bullet list for readability.

Lines 46–50 begin consecutive sentences with "If", which LanguageTool flags as repetitive. While the structure clearly lists the three intended session_id semantics, a bulleted format would improve scannability:

The intended behavior is create-or-resume:

- If the client omits `session_id`, the server creates one and returns it.
- If the client supplies a known `session_id`, the server resumes that session.
- If the client supplies an unknown but valid `session_id`, the server creates a session using that id.
docs/design/agent-workflows/adapters/pi.md (2)

110-116: ⚡ Quick win

Add language identifier to fenced code block (span tree diagram).

The span tree diagram (lines 110–116) is missing a language specification. Add text or none:

📝 Proposed fix
-```
+```text
 invoke_agent            (AGENT)
   turn N                (CHAIN)
     chat <model>        (LLM)    real token usage from the provider call
     execute_tool <name> (TOOL)   one per tool the turn ran
-```
+```

Source: Linters/SAST tools


144-147: 💤 Low value

Simplify "in order to" to "to" for conciseness.

Line 146 uses the phrase "in order to", which LanguageTool suggests is unnecessarily wordy:

📝 Proposed fix
-For output, Pi streams pure text deltas over ACP (`agent_message_chunk`). The runner appends them in order to build the final answer.
+For output, Pi streams pure text deltas over ACP (`agent_message_chunk`). The runner appends them to build the final answer.

Source: Linters/SAST tools

docs/design/agent-workflows/trash/sdk-local-backend/status.md (1)

3-5: ⚡ Quick win

Clarify the document's purpose statement.

Lines 3–5 claim this is "the only page that describes things that do not fully exist yet," but the document immediately documents substantial completed work in the "Done and verified" section (lines 19–49). The distinction seems to be that this document uniquely contains both complete and incomplete work, whereas other design pages document only what is built. Rephrase to make that distinction explicit, for example: "This page documents both completed work and ongoing/future work; other design pages in this directory document only what is currently built."

docs/design/agent-workflows/sdk-local-tools/review/scope.md (1)

14-21: ⚡ Quick win

Update branch reference for final merge.

Line 19 references gitbutler/workspace as a "working tree," indicating this scope was drafted during development. Before merging to main, update this reference to the final branch name (or commit hash if merging directly to main). This ensures the scope document accurately reflects the reviewed state.

docs/design/agent-workflows/trash/harness-port-redesign/research.md (1)

15-20: 💤 Low value

Add language specifiers to fenced code blocks per markdownlint (MD040).

Code blocks should declare their language for proper syntax highlighting. The Python class at lines 15–20, TypeScript Session interface at lines 83–94, and SessionPersistDriver interface at lines 130–138 are missing language specifiers.

Fix language specifiers
-```
+```python
 class Harness(ABC):

-```
+```ts
 class Session {

-```
+```ts
 interface SessionPersistDriver {

Also applies to: 83-94, 130-138

Source: Linters/SAST tools

docs/design/agent-workflows/trash/harness-port-redesign/proposal.md (1)

141-141: 💤 Low value

Fix hyphenation in compound adjectives (LanguageTool grammar feedback).

When compound adjectives modify a noun, they should be hyphenated. Lines 141 and 146 have two instances:

  • Line 141: "client side streaming" → "client-side streaming"
  • Line 146: "per invoke sandboxes" → "per-invocation sandboxes" (or "per-invoke sandboxes" if abbreviating "invoke")

Also applies to: 146-146

Source: Linters/SAST tools

docs/design/agent-workflows/trash/harness-port-redesign/plan.md (1)

93-93: 💤 Low value

Hyphenate compound modifier on line 93.

"Cross cutting" should be "Cross-cutting" (compound adjective).

Source: Linters/SAST tools

docs/design/agent-workflows/trash/wp-1-pi-tracing/integrating-the-tracing-extension.md (1)

24-29: 💤 Low value

Add language specifiers to fenced code blocks (MD040).

Three code blocks lack language identifiers for syntax highlighting:

  • Lines 24–29: span tree structure (use ```text or ```yaml)
  • Lines 148–156: dependency list (use ```text)
  • Lines 162–170: curl command (use ```bash)
Add language specifiers
-```
+```text
 invoke_agent

-```
+```text
 `@earendil-works/pi-coding-agent`  0.79.4

-```
+```bash
 curl -s "${AGENTA_HOST}/api/spans/?trace_id=<id>"

Also applies to: 148-156, 162-170

Source: Linters/SAST tools

docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/commit_agent_config.py (3)

1-4: 💤 Low value

Ensure Python formatting compliance before commit per coding guidelines.

This Python file should be run through ruff format and ruff check --fix before merging, per the coding guidelines for **/*.py files. While this is a POC script in the historical trash directory, maintaining formatting consistency is valuable.

Source: Coding guidelines


15-18: 💤 Low value

Consider HTTPS for the default Agenta host URL.

The default URL uses HTTP (line 15), which ast-grep flags as CWE-319. While this is a POC script pointing to a local dev box, consider using HTTPS as the default or allowing it to be overridden more safely. For production use, the caller would override AGENTA_HOST with an HTTPS URL, but a more defensive default would be clearer.

Source: Linters/SAST tools


23-75: ⚡ Quick win

Add error handling for JSON parsing and required env vars.

The script calls .json() on HTTP responses (lines 31, 69) without defensive handling if the response body is malformed. Additionally, AGENTA_API_KEY is read directly without a check (line 16). While this is a POC, adding basic validation would improve robustness:

  • Validate AGENTA_API_KEY is set before use
  • Wrap .json() calls in try/except to provide helpful error messages if the Agenta API response is unexpected
docs/design/agent-workflows/trash/research/pi-interaction.md (1)

58-58: ⚡ Quick win

Add language identifier to code fence.

Line 58 opens a code block without specifying the language. Based on the content (bash commands), please add ```bash instead of ```.

📝 Proposed fix
-\`\`\`
+\`\`\`bash
 npm install `@earendil-works/pi-coding-agent`   # SDK + CLI
docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/architecture.md (1)

10-43: ⚡ Quick win

Add language identifier to ASCII diagram code fence.

The code fence at line 10 contains an ASCII diagram and has no language specified. Consider using ```text or ```diagram to clarify intent, though the content is clear as-is.

📝 Proposed fix
-\`\`\`
+\`\`\`text
                  unchanged
   ┌───────────────────────────────────────────────┐

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 2e8ebdac-ec42-4f55-881a-919108e5e706

📥 Commits

Reviewing files that changed from the base of the PR and between a97e608 and 330b1c6.

⛔ Files ignored due to path filters (1)
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (95)
  • docs/design/agent-workflows/README.md
  • docs/design/agent-workflows/adapters/agenta.md
  • docs/design/agent-workflows/adapters/claude-code.md
  • docs/design/agent-workflows/adapters/pi.md
  • docs/design/agent-workflows/agent-template.md
  • docs/design/agent-workflows/architecture.md
  • docs/design/agent-workflows/ground-truth.md
  • docs/design/agent-workflows/implementation-review.md
  • docs/design/agent-workflows/meeting-alignment.md
  • docs/design/agent-workflows/open-issues.md
  • docs/design/agent-workflows/ports-and-adapters.md
  • docs/design/agent-workflows/pr-stack.md
  • docs/design/agent-workflows/protocol.md
  • docs/design/agent-workflows/sdk-local-tools/README.md
  • docs/design/agent-workflows/sdk-local-tools/codebase-conventions.md
  • docs/design/agent-workflows/sdk-local-tools/context.md
  • docs/design/agent-workflows/sdk-local-tools/conventions-review.md
  • docs/design/agent-workflows/sdk-local-tools/organization-proposal.md
  • docs/design/agent-workflows/sdk-local-tools/plan.md
  • docs/design/agent-workflows/sdk-local-tools/research.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/app-mcp-reassign.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/attach-orthogonal-mutation.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/description-default-inconsistency.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-no-logging.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-orthogonal-untested.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/handler-resolution-error.md
  • docs/design/agent-workflows/sdk-local-tools/review/findings.md
  • docs/design/agent-workflows/sdk-local-tools/review/metadata.json
  • docs/design/agent-workflows/sdk-local-tools/review/plan.md
  • docs/design/agent-workflows/sdk-local-tools/review/progress.md
  • docs/design/agent-workflows/sdk-local-tools/review/questions.md
  • docs/design/agent-workflows/sdk-local-tools/review/risks.md
  • docs/design/agent-workflows/sdk-local-tools/review/scope.md
  • docs/design/agent-workflows/sdk-local-tools/review/scorecard.md
  • docs/design/agent-workflows/sdk-local-tools/review/summary.md
  • docs/design/agent-workflows/sdk-local-tools/status.md
  • docs/design/agent-workflows/sessions.md
  • docs/design/agent-workflows/status.md
  • docs/design/agent-workflows/trash/README.md
  • docs/design/agent-workflows/trash/harness-port-redesign/README.md
  • docs/design/agent-workflows/trash/harness-port-redesign/implementation.md
  • docs/design/agent-workflows/trash/harness-port-redesign/plan.md
  • docs/design/agent-workflows/trash/harness-port-redesign/proposal.md
  • docs/design/agent-workflows/trash/harness-port-redesign/research.md
  • docs/design/agent-workflows/trash/harness-port-redesign/status.md
  • docs/design/agent-workflows/trash/old-rfcs/agent-protocol-rfc.md
  • docs/design/agent-workflows/trash/old-rfcs/streaming-and-sessions.md
  • docs/design/agent-workflows/trash/research/auth-secrets.md
  • docs/design/agent-workflows/trash/research/daytona-sandbox.md
  • docs/design/agent-workflows/trash/research/diskless-in-memory-config.md
  • docs/design/agent-workflows/trash/research/open-questions.md
  • docs/design/agent-workflows/trash/research/otel-instrumentation.md
  • docs/design/agent-workflows/trash/research/pi-interaction.md
  • docs/design/agent-workflows/trash/research/sandbox-sharing.md
  • docs/design/agent-workflows/trash/sdk-local-backend/status.md
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/README.md
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/integrating-the-tracing-extension.md
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/.env.example
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/README.md
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/agenta-otel.ts
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/package.json
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/run.ts
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/tracing-in-the-agent-service.md
  • docs/design/agent-workflows/trash/wp-2-agent-service/README.md
  • docs/design/agent-workflows/trash/wp-2-agent-service/implementation-plan.md
  • docs/design/agent-workflows/trash/wp-2-agent-service/qa.md
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/README.md
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/README.md
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/bench_coldstart.py
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/build_snapshot.py
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/cleanup.py
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/run_agent.py
  • docs/design/agent-workflows/trash/wp-4-multi-message-output/README.md
  • docs/design/agent-workflows/trash/wp-5-chat-vs-completion/README.md
  • docs/design/agent-workflows/trash/wp-6-workflow-type-and-template/README.md
  • docs/design/agent-workflows/trash/wp-7-tools/README.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/README.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/architecture.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/context.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/isolation-and-fork.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/plan.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/commit_agent_config.py
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/debug-events.ts
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/dump-full.ts
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/package.json
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/spike.ts
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/research.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/status.md
  • docs/design/agent-workflows/triggers.md
  • docs/design/vault-named-secrets/README.md
  • docs/design/vault-named-secrets/context.md
  • docs/design/vault-named-secrets/plan.md
  • docs/design/vault-named-secrets/research.md
  • docs/design/vault-named-secrets/status.md

Comment on lines +328 to +355
- [daytona-sandboxes] Daytona — Sandboxes (lifecycle, states, auto-stop/archive/delete,
refreshActivity, resource limits, per-sandbox isolation):
https://www.daytona.io/docs/en/sandboxes/
- [daytona-process] Daytona — Process and Code Execution (exec/code_run vs Sessions, cwd,
env, create/execute/get/delete session): https://www.daytona.io/docs/en/process-code-execution/
- [daytona-process-src] Daytona docs source — process-code-execution.mdx (verbatim session
example, SessionExecuteRequest fields):
https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/process-code-execution.mdx
- [daytona-volumes] Daytona — Volumes (creation, VolumeMount, mount_path/subpath, FUSE,
mounting via CreateSandboxFromSnapshotParams): https://www.daytona.io/docs/en/volumes/
- [daytona-volumes-src] Daytona docs source — volumes.mdx (verbatim "mounted at creation",
persistence, FUSE not transactional, last-write-wins):
https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/volumes.mdx
- [daytona-fuse-issue] Daytona GitHub issue #3331 — FUSE volume permission limitations
(mv/touch/stat/copystat failures): https://github.com/daytonaio/daytona/issues/3331
- [daytona-parallel-issue] Daytona GitHub issue #4001 — Design and Implement Parallel
Sandbox Execution API (fork filesystem+memory; current workaround = many sandboxes):
https://github.com/daytonaio/daytona/issues/4001
- [daytona-blog-best] Northflank — "Best code execution sandbox for AI agents 2026"
(isolated sandbox per execution; Docker-default isolation weaker than microVMs):
https://northflank.com/blog/best-code-execution-sandbox-for-ai-agents
- [pi-home] pi.dev — product overview (harness, modes, AGENTS.md/SYSTEM.md context):
https://pi.dev
- [pi-docs] pi.dev — docs index (session tree, JSONL session format, RPC/SDK modes):
https://pi.dev/docs/latest
- [pi-sdk] pi.dev — SDK/RPC (SessionManager.create/continueRecent/open/inMemory, cwd,
agentDir, runRpcMode, `--mode rpc --no-session`): https://pi.dev/docs/latest/sdk
- Agenta repo — `api/oss/src/utils/env.py` `DaytonaConfig` (DAYTONA_API_KEY,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Define the source labels as actual Markdown reference definitions.

The entries in ## Sources are formatted as list items, so labels like [daytona-process] are not defined and earlier references stay unresolved.

Suggested fix pattern
- - [daytona-sandboxes] Daytona — Sandboxes ...:
-   https://www.daytona.io/docs/en/sandboxes/
- - [daytona-process] Daytona — Process and Code Execution ...:
-   https://www.daytona.io/docs/en/process-code-execution/
+ [daytona-sandboxes]: https://www.daytona.io/docs/en/sandboxes/
+ [daytona-process]: https://www.daytona.io/docs/en/process-code-execution/
+ [daytona-process-src]: https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/process-code-execution.mdx
+ [daytona-volumes]: https://www.daytona.io/docs/en/volumes/
+ [daytona-volumes-src]: https://github.com/daytonaio/daytona/blob/main/apps/docs/src/content/docs/en/volumes.mdx
+ [daytona-fuse-issue]: https://github.com/daytonaio/daytona/issues/3331
+ [daytona-parallel-issue]: https://github.com/daytonaio/daytona/issues/4001
+ [daytona-blog-best]: https://northflank.com/blog/best-code-execution-sandbox-for-ai-agents
+ [pi-home]: https://pi.dev
+ [pi-docs]: https://pi.dev/docs/latest
+ [pi-sdk]: https://pi.dev/docs/latest/sdk

Source: Linters/SAST tools

Comment on lines +2 to +3
AGENTA_HOST=http://144.76.237.122:8280/
AGENTA_API_KEY=your-agenta-project-api-key

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use a safe default collector endpoint in the example env file.

AGENTA_HOST is currently an http:// hard-coded host. That can expose Authorization: ApiKey and captured span content over plaintext transport if copied as-is.

Suggested fix
-AGENTA_HOST=http://144.76.237.122:8280/
+AGENTA_HOST=https://cloud.agenta.ai
 AGENTA_API_KEY=your-agenta-project-api-key
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
AGENTA_HOST=http://144.76.237.122:8280/
AGENTA_API_KEY=your-agenta-project-api-key
AGENTA_HOST=https://cloud.agenta.ai
AGENTA_API_KEY=your-agenta-project-api-key
🧰 Tools
🪛 dotenv-linter (4.0.0)

[warning] 3-3: [UnorderedKey] The AGENTA_API_KEY key should go before the AGENTA_HOST key

(UnorderedKey)

Comment on lines +15 to +20
```
invoke_agent (openinference.span.kind = AGENT, carries session.id)
turn N (CHAIN)
chat <model> (LLM — model, latency, token usage, finish reason)
execute_tool <name> (TOOL — args + result)
```

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add languages to fenced code blocks to clear MD040.

Both fenced blocks are untyped; markdownlint flags them.

Suggested fix
-```
+```text
 invoke_agent              (openinference.span.kind = AGENT, carries session.id)
   turn N                  (CHAIN)
     chat <model>          (LLM   — model, latency, token usage, finish reason)
     execute_tool <name>   (TOOL  — args + result)

@@
- +text
invoke_agent (agent) ag.data.inputs={prompt}, ag.data.outputs=text, ag.session.id, cumulative tokens
turn N (chain)
chat (chat) ag.data.inputs.prompt[] + ag.data.outputs.completion[] (OpenInference
messages), ag.meta.request.model, incremental token usage
execute_tool (tool) ag.data.inputs={args}, ag.data.outputs=result

Also applies to: 67-73

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 15-15: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Source: Linters/SAST tools

Comment on lines +26 to +32
```
_agent workflow (the Python /invoke span, root)
invoke_agent AGENT (the Pi run, now a child of _agent)
turn N CHAIN
chat <model> LLM model, tokens, cost, message thread
execute_tool ... TOOL
```

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add fenced-code languages to satisfy markdownlint (MD040).

The two code fences are missing language identifiers.

Suggested fix
-```
+```text
 _agent                 workflow   (the Python /invoke span, root)
   invoke_agent         AGENT      (the Pi run, now a child of _agent)
     turn N             CHAIN
       chat <model>     LLM        model, tokens, cost, message thread
       execute_tool ... TOOL

@@

  • curl -s "${AGENTA_HOST}/api/spans/?trace_id=<id>" -H "Authorization: ApiKey ${AGENTA_API_KEY}"
</details>


Also applies to: 102-104

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>

[warning] 26-26: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

<!-- fingerprinting:phantom:poseidon:hawk -->

<!-- cr-comment:v1:b8932b53ba63b2c340391c34 -->

_Source: Linters/SAST tools_

<!-- This is an auto-generated comment by CodeRabbit -->


For pi.dev it might make sense to have two adapters one for RPC and the other for json

Success for this WP1 is:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the work-package label typo in success criteria.

This section is in WP-2 but says “Success for this WP1,” which makes scope tracking ambiguous.

Suggested fix
-Success for this WP1 is:
+Success for this WP-2 is:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Success for this WP1 is:
Success for this WP-2 is:

Comment on lines +30 to +42
for snap in SNAPSHOTS:
times: list[float] = []
for i in range(N):
t = time.monotonic()
sb = daytona.create(
CreateSandboxFromSnapshotParams(snapshot=snap, auto_stop_interval=0),
timeout=120,
)
dt = time.monotonic() - t
times.append(dt)
print(f"{snap:20} run {i + 1}/{N}: {dt:.2f}s state={sb.state}", flush=True)
daytona.delete(sb)
results[snap] = times

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guarantee sandbox teardown with try/finally per run.

If an exception occurs after daytona.create(...) and before daytona.delete(sb), the sandbox can be left running.

Proposed fix
 for snap in SNAPSHOTS:
     times: list[float] = []
     for i in range(N):
-        t = time.monotonic()
-        sb = daytona.create(
-            CreateSandboxFromSnapshotParams(snapshot=snap, auto_stop_interval=0),
-            timeout=120,
-        )
-        dt = time.monotonic() - t
-        times.append(dt)
-        print(f"{snap:20} run {i + 1}/{N}: {dt:.2f}s  state={sb.state}", flush=True)
-        daytona.delete(sb)
+        sb = None
+        try:
+            t = time.monotonic()
+            sb = daytona.create(
+                CreateSandboxFromSnapshotParams(snapshot=snap, auto_stop_interval=0),
+                timeout=120,
+            )
+            dt = time.monotonic() - t
+            times.append(dt)
+            print(f"{snap:20} run {i + 1}/{N}: {dt:.2f}s  state={sb.state}", flush=True)
+        finally:
+            if sb is not None:
+                daytona.delete(sb)
     results[snap] = times

Comment on lines +183 to +184
def arg(name: str, default: str) -> str:
return sys.argv[sys.argv.index(name) + 1] if name in sys.argv else default

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Handle missing flag values in arg() to avoid IndexError.

Passing --auth or --model without a value currently crashes instead of exiting cleanly.

Proposed fix
 def arg(name: str, default: str) -> str:
-    return sys.argv[sys.argv.index(name) + 1] if name in sys.argv else default
+    if name not in sys.argv:
+        return default
+    idx = sys.argv.index(name) + 1
+    if idx >= len(sys.argv) or sys.argv[idx].startswith("--"):
+        raise ValueError(f"Missing value for {name}")
+    return sys.argv[idx]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def arg(name: str, default: str) -> str:
return sys.argv[sys.argv.index(name) + 1] if name in sys.argv else default
def arg(name: str, default: str) -> str:
if name not in sys.argv:
return default
idx = sys.argv.index(name) + 1
if idx >= len(sys.argv) or sys.argv[idx].startswith("--"):
raise ValueError(f"Missing value for {name}")
return sys.argv[idx]

Comment on lines +259 to +266
pi_cmd = (
f"cd {run_dir} && TMPDIR={run_dir}/tmp "
f"pi -p {json.dumps(PROMPT)} "
f"--mode json --approve --provider {provider} --model {model} "
f"-t read,bash,edit,write,ls "
f"--session-dir {run_dir}/.pi-sessions --name {session_id} "
f"< /dev/null"
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Quote CLI-derived values before composing the shell command.

provider/model are interpolated directly into pi_cmd; crafted values can break out of the intended command.

Proposed fix
 import asyncio
 import json
 import os
+import shlex
 import sys
@@
             pi_cmd = (
-                f"cd {run_dir} && TMPDIR={run_dir}/tmp "
-                f"pi -p {json.dumps(PROMPT)} "
-                f"--mode json --approve --provider {provider} --model {model} "
+                f"cd {shlex.quote(run_dir)} && TMPDIR={shlex.quote(run_dir + '/tmp')} "
+                f"pi -p {shlex.quote(PROMPT)} "
+                f"--mode json --approve --provider {shlex.quote(provider)} --model {shlex.quote(model)} "
                 f"-t read,bash,edit,write,ls "
-                f"--session-dir {run_dir}/.pi-sessions --name {session_id} "
+                f"--session-dir {shlex.quote(run_dir + '/.pi-sessions')} --name {shlex.quote(session_id)} "
                 f"< /dev/null"
             )
🧰 Tools
🪛 ast-grep (0.43.0)

[info] 260-260: use jsonify instead of json.dumps for JSON output
Context: json.dumps(PROMPT)
Note: Security best practice.

(use-jsonify)

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: bac6cb92-ec42-46bb-87c6-e1728ccfbda2

📥 Commits

Reviewing files that changed from the base of the PR and between 330b1c6 and 36eb5d5.

⛔ Files ignored due to path filters (1)
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (95)
  • docs/design/agent-workflows/README.md
  • docs/design/agent-workflows/adapters/agenta.md
  • docs/design/agent-workflows/adapters/claude-code.md
  • docs/design/agent-workflows/adapters/pi.md
  • docs/design/agent-workflows/agent-template.md
  • docs/design/agent-workflows/architecture.md
  • docs/design/agent-workflows/ground-truth.md
  • docs/design/agent-workflows/implementation-review.md
  • docs/design/agent-workflows/meeting-alignment.md
  • docs/design/agent-workflows/open-issues.md
  • docs/design/agent-workflows/ports-and-adapters.md
  • docs/design/agent-workflows/pr-stack.md
  • docs/design/agent-workflows/protocol.md
  • docs/design/agent-workflows/sdk-local-tools/README.md
  • docs/design/agent-workflows/sdk-local-tools/codebase-conventions.md
  • docs/design/agent-workflows/sdk-local-tools/context.md
  • docs/design/agent-workflows/sdk-local-tools/conventions-review.md
  • docs/design/agent-workflows/sdk-local-tools/organization-proposal.md
  • docs/design/agent-workflows/sdk-local-tools/plan.md
  • docs/design/agent-workflows/sdk-local-tools/research.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/app-mcp-reassign.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/attach-orthogonal-mutation.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/description-default-inconsistency.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-no-logging.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-orthogonal-untested.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/handler-resolution-error.md
  • docs/design/agent-workflows/sdk-local-tools/review/findings.md
  • docs/design/agent-workflows/sdk-local-tools/review/metadata.json
  • docs/design/agent-workflows/sdk-local-tools/review/plan.md
  • docs/design/agent-workflows/sdk-local-tools/review/progress.md
  • docs/design/agent-workflows/sdk-local-tools/review/questions.md
  • docs/design/agent-workflows/sdk-local-tools/review/risks.md
  • docs/design/agent-workflows/sdk-local-tools/review/scope.md
  • docs/design/agent-workflows/sdk-local-tools/review/scorecard.md
  • docs/design/agent-workflows/sdk-local-tools/review/summary.md
  • docs/design/agent-workflows/sdk-local-tools/status.md
  • docs/design/agent-workflows/sessions.md
  • docs/design/agent-workflows/status.md
  • docs/design/agent-workflows/trash/README.md
  • docs/design/agent-workflows/trash/harness-port-redesign/README.md
  • docs/design/agent-workflows/trash/harness-port-redesign/implementation.md
  • docs/design/agent-workflows/trash/harness-port-redesign/plan.md
  • docs/design/agent-workflows/trash/harness-port-redesign/proposal.md
  • docs/design/agent-workflows/trash/harness-port-redesign/research.md
  • docs/design/agent-workflows/trash/harness-port-redesign/status.md
  • docs/design/agent-workflows/trash/old-rfcs/agent-protocol-rfc.md
  • docs/design/agent-workflows/trash/old-rfcs/streaming-and-sessions.md
  • docs/design/agent-workflows/trash/research/auth-secrets.md
  • docs/design/agent-workflows/trash/research/daytona-sandbox.md
  • docs/design/agent-workflows/trash/research/diskless-in-memory-config.md
  • docs/design/agent-workflows/trash/research/open-questions.md
  • docs/design/agent-workflows/trash/research/otel-instrumentation.md
  • docs/design/agent-workflows/trash/research/pi-interaction.md
  • docs/design/agent-workflows/trash/research/sandbox-sharing.md
  • docs/design/agent-workflows/trash/sdk-local-backend/status.md
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/README.md
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/integrating-the-tracing-extension.md
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/.env.example
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/README.md
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/agenta-otel.ts
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/package.json
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/run.ts
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/tracing-in-the-agent-service.md
  • docs/design/agent-workflows/trash/wp-2-agent-service/README.md
  • docs/design/agent-workflows/trash/wp-2-agent-service/implementation-plan.md
  • docs/design/agent-workflows/trash/wp-2-agent-service/qa.md
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/README.md
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/README.md
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/bench_coldstart.py
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/build_snapshot.py
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/cleanup.py
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/run_agent.py
  • docs/design/agent-workflows/trash/wp-4-multi-message-output/README.md
  • docs/design/agent-workflows/trash/wp-5-chat-vs-completion/README.md
  • docs/design/agent-workflows/trash/wp-6-workflow-type-and-template/README.md
  • docs/design/agent-workflows/trash/wp-7-tools/README.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/README.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/architecture.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/context.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/isolation-and-fork.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/plan.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/commit_agent_config.py
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/debug-events.ts
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/dump-full.ts
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/package.json
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/spike.ts
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/research.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/status.md
  • docs/design/agent-workflows/triggers.md
  • docs/design/vault-named-secrets/README.md
  • docs/design/vault-named-secrets/context.md
  • docs/design/vault-named-secrets/plan.md
  • docs/design/vault-named-secrets/research.md
  • docs/design/vault-named-secrets/status.md
✅ Files skipped from review due to trivial changes (47)
  • docs/design/agent-workflows/sdk-local-tools/review/risks.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-orthogonal-untested.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/build_rivet_snapshot.py
  • docs/design/vault-named-secrets/README.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/description-default-inconsistency.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/package.json
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/attach-orthogonal-mutation.md
  • docs/design/agent-workflows/agent-template.md
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/README.md
  • docs/design/agent-workflows/sdk-local-tools/review/metadata.json
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/README.md
  • docs/design/agent-workflows/trash/wp-4-multi-message-output/README.md
  • docs/design/agent-workflows/trash/README.md
  • docs/design/agent-workflows/sdk-local-tools/review/scorecard.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/context.md
  • docs/design/agent-workflows/sdk-local-tools/review/questions.md
  • docs/design/agent-workflows/pr-stack.md
  • docs/design/agent-workflows/trash/sdk-local-backend/status.md
  • docs/design/agent-workflows/open-issues.md
  • docs/design/agent-workflows/sdk-local-tools/review/scope.md
  • docs/design/agent-workflows/meeting-alignment.md
  • docs/design/agent-workflows/ports-and-adapters.md
  • docs/design/agent-workflows/sdk-local-tools/review/findings.md
  • docs/design/agent-workflows/sdk-local-tools/review/progress.md
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/README.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/README.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/handler-resolution-error.md
  • docs/design/agent-workflows/README.md
  • docs/design/agent-workflows/trash/harness-port-redesign/implementation.md
  • docs/design/agent-workflows/sdk-local-tools/review/plan.md
  • docs/design/agent-workflows/trash/wp-2-agent-service/README.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/plan.md
  • docs/design/agent-workflows/adapters/agenta.md
  • docs/design/agent-workflows/trash/research/daytona-sandbox.md
  • docs/design/agent-workflows/sdk-local-tools/status.md
  • docs/design/agent-workflows/trash/wp-5-chat-vs-completion/README.md
  • docs/design/vault-named-secrets/context.md
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/isolation-and-fork.md
  • docs/design/agent-workflows/trash/wp-6-workflow-type-and-template/README.md
  • docs/design/agent-workflows/sdk-local-tools/review/summary.md
  • docs/design/agent-workflows/trash/research/diskless-in-memory-config.md
  • docs/design/agent-workflows/trash/harness-port-redesign/README.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/gateway-no-logging.md
  • docs/design/agent-workflows/ground-truth.md
  • docs/design/agent-workflows/sdk-local-tools/review/evidence/app-mcp-reassign.md
  • docs/design/vault-named-secrets/status.md
  • docs/design/vault-named-secrets/research.md
🚧 Files skipped from review as they are similar to previous changes (11)
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/package.json
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/debug-events.ts
  • docs/design/agent-workflows/triggers.md
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/build_snapshot.py
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/dump-full.ts
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/cleanup.py
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/run.ts
  • docs/design/agent-workflows/trash/wp-3-daytona-sandbox/poc/bench_coldstart.py
  • docs/design/agent-workflows/trash/wp-1-pi-tracing/poc/agenta-otel.ts
  • docs/design/agent-workflows/trash/wp-8-rivet-acp-runtime/poc/spike.ts
  • docs/design/agent-workflows/protocol.md


## The span tree it produces

```

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add fence languages to satisfy markdownlint MD040.

Line 24, Line 148, and Line 162 open fenced blocks without a language identifier.

Suggested patch
-```
+```text
 invoke_agent              openinference.span.kind = AGENT   (root, one per user prompt)
   turn N                  CHAIN
     chat <model>          LLM    model, latency, token usage, finish reason, messages
     execute_tool <name>   TOOL   args in, result out

- +text
@earendil-works/pi-coding-agent 0.79.4
@opentelemetry/api 1.9.0
@opentelemetry/exporter-trace-otlp-proto 0.54.0
@opentelemetry/resources 1.28.0
@opentelemetry/sdk-trace-base 1.28.0
@opentelemetry/sdk-trace-node 1.28.0
@opentelemetry/semantic-conventions 1.28.0


-   ```
+   ```bash
   curl -s "${AGENTA_HOST}/api/spans/?trace_id=<id>" -H "Authorization: ApiKey ${AGENTA_API_KEY}"
   ```

Also applies to: 148-148, 162-162

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 24-24: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Source: Linters/SAST tools


The service is harness-agnostic at its core, with the two ports the design doc calls out.

```

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language identifier to the architecture code fence.

Line 154 opens a fenced block without a language, which triggers MD040.

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 154-154: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Source: Linters/SAST tools

`~/.pi/agent` doesn't exist yet, so no model is available. Pi can't reuse the `~/.codex` token directly; it needs its own login (same ChatGPT account, browser OAuth — I can't drive that for you):

```bash
cd docs/design/agent-workflows/wp-1-pi-tracing/poc

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the cd path in the Pi login snippet.

Line 268 points to docs/design/agent-workflows/wp-1-pi-tracing/poc, but this doc set places that folder under docs/design/agent-workflows/trash/wp-1-pi-tracing/poc.

Suggested patch
-cd docs/design/agent-workflows/wp-1-pi-tracing/poc
+cd docs/design/agent-workflows/trash/wp-1-pi-tracing/poc
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
cd docs/design/agent-workflows/wp-1-pi-tracing/poc
cd docs/design/agent-workflows/trash/wp-1-pi-tracing/poc

Comment on lines +320 to +321
await daytona.delete(sandbox)
log(" deleted.")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle sandbox deletion failures to avoid leaked sandboxes.

Line 320 can raise on transient API errors and skip final cleanup confirmation, leaving costly sandboxes running.

Suggested patch
-                await daytona.delete(sandbox)
-                log("      deleted.")
+                try:
+                    await daytona.delete(sandbox)
+                    log("      deleted.")
+                except Exception as e:  # noqa: BLE001
+                    log(f"      delete failed for {sandbox.id}: {e}")
+                    log("      sandbox may still be running; delete it manually.")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
await daytona.delete(sandbox)
log(" deleted.")
try:
await daytona.delete(sandbox)
log(" deleted.")
except Exception as e: # noqa: BLE001
log(f" delete failed for {sandbox.id}: {e}")
log(" sandbox may still be running; delete it manually.")

import os
import httpx

BASE = os.getenv("AGENTA_HOST", "http://144.76.237.122:8280").rstrip("/")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use HTTPS for API endpoint with credentials.

The default API host uses unencrypted HTTP (http://144.76.237.122:8280), which transmits the API key in cleartext. Even for development/POC scripts, use HTTPS or require it via environment override. This addresses CWE-319 (cleartext transmission of sensitive information).

🔒 Suggested fix
-BASE = os.getenv("AGENTA_HOST", "http://144.76.237.122:8280").rstrip("/")
+BASE = os.getenv("AGENTA_HOST", "https://144.76.237.122:8280").rstrip("/")

Alternatively, enforce HTTPS explicitly:

BASE = os.getenv("AGENTA_HOST", "https://144.76.237.122:8280").rstrip("/")
if not BASE.startswith("https://"):
    raise ValueError("AGENTA_HOST must use HTTPS; cleartext credentials are not allowed")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
BASE = os.getenv("AGENTA_HOST", "http://144.76.237.122:8280").rstrip("/")
BASE = os.getenv("AGENTA_HOST", "https://144.76.237.122:8280").rstrip("/")

Source: Linters/SAST tools

@mmabrouk

Copy link
Copy Markdown
Member Author

Reviewer guide: where to look

  • docs/design/agent-workflows/ground-truth.md — read this first. It is the implementation map and the single source of truth. Every other page defers to it.
  • docs/design/agent-workflows/architecture.md — the two-container runtime, the backends, and the harness matrix.
  • docs/design/agent-workflows/ports-and-adapters.md — the SDK runtime ports and the service and runner adapters that plug into them.
  • docs/design/agent-workflows/protocol.md/invoke, /messages, /load-session, and the internal /run runner wire.
  • docs/design/agent-workflows/pr-stack.md — the functional breakpoints that explain how the sibling code PRs are sliced.
  • docs/design/agent-workflows/trash/** — archival POC notes, research spikes, and old RFCs. This is most of the line count. Do not read it line by line.

@mmabrouk mmabrouk left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codex subagent review for #4777

No blocking findings. I compared the required ground-truth.md and pr-stack.md files against #4779's docs/agent-workflows branch; the two files have identical blob SHAs across #4777 and #4779. The changed-file lists also match, so the docs content is not drifting between those two PRs. The only cross-PR drift I saw is metadata/process-level: #4777's PR body points the runner-engine sibling at #4774, while #4779's PR body points at #4778. Since those look like duplicate docs PRs with different stack labels, coordinate which PR is authoritative before merging to avoid duplicated review state.

One non-blocking docs drift to fix or clarify:

  • docs/design/agent-workflows/status.md:22 says stale live comments and docs were updated. That conflicts with the same docs set at docs/design/agent-workflows/implementation-review.md:123, which says implementation comments still refer to WP-2/WP-7/WP-8, and with the sibling code PRs I sampled: services/agent/src/server.ts:2 in #4774/#4778 still starts with WP-2 Pi wrapper HTTP server, and sdks/python/agenta/sdk/agents/adapters/local.py:11 in #4771 still points at the old docs/design/agent-workflows/scratch/... path. Since this design folder asks reviewers to treat ground-truth.md and the current-state pages as authoritative, I would change the status bullet to say the stale-comment cleanup is still pending, or narrow it to docs-only cleanup if that was the intended claim.

The source-of-truth structure otherwise looks coherent: README.md and ground-truth.md both make ground-truth.md the implementation map, the current pages label planned/blocked/not-implemented work instead of presenting it as shipped, and trash/README.md clearly marks trash/ as historical and non-authoritative. I also sampled #4771-#4774 against the main implementation map: LocalBackend is a stub, NoopSessionStore is the default, the service composes the SDK runtime and backend selection, and the runner protocol/server files line up with the documented /run JSON/NDJSON contract.

Residual risk: this was a read-only GitHub review. I did not run a docs build, link checker, or any sibling PR tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation size:XXL This PR changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant