chore(hosting): wire the agent runner sidecar into compose#4776
chore(hosting): wire the agent runner sidecar into compose#4776mmabrouk wants to merge 1 commit into
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughWalkthroughThe EE dev ChangesEE Dev docker-compose Updates
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: e6864c15-c994-4861-8084-b3f87649d438
📒 Files selected for processing (1)
hosting/docker-compose/ee/docker-compose.dev.yml
| watchmedo auto-restart --directory=/app/ee/src --directory=/app/ee/databases --directory=/app/oss/src | ||
| --directory=/app/oss/databases --directory=/app/entrypoints --directory=/sdks/python/agenta | ||
| --directory=/clients/python/agenta_client --pattern=*.py --recursive --ignore-patterns=*/tests/* -- | ||
| watchmedo auto-restart --directory=/app/ --pattern=*.py --recursive --ignore-patterns=*/tests/* -- |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Inspect worker watch directories and mounted source directories.
# Expected: worker watchmedo commands include the mounted SDK/client paths or those mounts are intentionally not hot-reloaded.
sed -n '141,264p' hosting/docker-compose/ee/docker-compose.dev.ymlRepository: Agenta-AI/agenta
Length of output: 5516
🏁 Script executed:
# Check structure of SDK and client directories
find ../../../sdks/python -maxdepth 2 -type d 2>/dev/null | head -20
echo "---"
find ../../../clients/python -maxdepth 2 -type d 2>/dev/null | head -20Repository: Agenta-AI/agenta
Length of output: 64
🏁 Script executed:
# Search for API service hot-reload configuration to understand expected pattern
rg -A 10 "watchmedo.*directory=" hosting/docker-compose/ee/ --no-headingRepository: Agenta-AI/agenta
Length of output: 4521
🏁 Script executed:
# Check if there are any existing references to watching SDK/client paths
rg "sdks/python|clients/python" hosting/docker-compose/ -B 2 -A 2Repository: Agenta-AI/agenta
Length of output: 13615
Add SDK and client paths to worker watchmedo scopes.
The four worker services watch only /app/, but they mount SDK and client code at /sdks/python and /clients/python. This means edits to those dependencies will not trigger restarts, unlike the API service in the same file which includes them. Align worker watchers with the OSS version and API service by adding both directories:
♻️ Proposed watcher scope fix
- watchmedo auto-restart --directory=/app/ --pattern=*.py --recursive --ignore-patterns=*/tests/* --
+ watchmedo auto-restart --directory=/app/ --directory=/sdks/python/agenta --directory=/clients/python/agenta_client --pattern=*.py --recursive --ignore-patterns=*/tests/* --Apply to lines 146, 183, 220, and 263 (worker-evaluations, worker-tracing, worker-webhooks, worker-events).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| watchmedo auto-restart --directory=/app/ --pattern=*.py --recursive --ignore-patterns=*/tests/* -- | |
| watchmedo auto-restart --directory=/app/ --directory=/sdks/python/agenta --directory=/clients/python/agenta_client --pattern=*.py --recursive --ignore-patterns=*/tests/* -- |
| # Agent workflow: reach the agent runner sidecar in-network. | ||
| AGENTA_AGENT_PI_URL: http://agent-pi:8765 | ||
| # Agent runtime (WP-8): drive the harness over ACP via a rivet daemon. | ||
| # Harness (pi/claude) and sandbox (local/daytona) are independent axes. | ||
| AGENTA_AGENT_RUNTIME: ${AGENTA_AGENT_RUNTIME:-rivet} | ||
| AGENTA_AGENT_HARNESS: ${AGENTA_AGENT_HARNESS:-pi} | ||
| AGENTA_AGENT_SANDBOX: ${AGENTA_AGENT_SANDBOX:-local} |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Locate sidecar URL usage to decide whether service_started is enough or a healthcheck is required.
# Expected: startup-time probes should be guarded by depends_on + healthcheck; request-time calls should handle connection failures gracefully.
rg -n -C3 'AGENTA_AGENT_PI_URL|agent-pi|8765' services api hosting/docker-composeRepository: Agenta-AI/agenta
Length of output: 12396
🏁 Script executed:
# Get the full services container definition to check if depends_on already exists
head -n 470 hosting/docker-compose/ee/docker-compose.dev.yml | tail -n 100Repository: Agenta-AI/agenta
Length of output: 5329
🏁 Script executed:
# Search for where AGENTA_AGENT_PI_URL is actually used in Python code
rg -t py 'AGENTA_AGENT_PI_URL' --max-count 20Repository: Agenta-AI/agenta
Length of output: 42
🏁 Script executed:
# Check the agent-pi service definition (lines 421-468)
sed -n '421,468p' hosting/docker-compose/ee/docker-compose.dev.ymlRepository: Agenta-AI/agenta
Length of output: 2995
🏁 Script executed:
# Search more broadly for AGENTA_AGENT_PI_URL usage across all file types
rg 'AGENTA_AGENT_PI_URL' --max-count 30Repository: Agenta-AI/agenta
Length of output: 289
🏁 Script executed:
# Check if depends_on already exists in the services container definition
# by looking at the full services block
sed -n '370,420p' hosting/docker-compose/ee/docker-compose.dev.ymlRepository: Agenta-AI/agenta
Length of output: 2411
🏁 Script executed:
# Search for any healthcheck definition in the agent-pi service
grep -n 'healthcheck' hosting/docker-compose/ee/docker-compose.dev.ymlRepository: Agenta-AI/agenta
Length of output: 301
🏁 Script executed:
# Find where AGENTA_AGENT_PI_URL env var is consumed in the services application code
find services -type f -name '*.py' | xargs grep -l 'AGENTA_AGENT_PI_URL' 2>/dev/nullRepository: Agenta-AI/agenta
Length of output: 44
🏁 Script executed:
# Also search for 'agent_pi' or 'agent-pi' in services code
rg -t py 'agent.?pi|agent.?runner' services --max-count 20 -iRepository: Agenta-AI/agenta
Length of output: 42
🏁 Script executed:
# Search more broadly for agent pi in all file types, not just Python
rg 'agent.?pi|AGENTA_AGENT_PI' services --max-count 30 -iRepository: Agenta-AI/agenta
Length of output: 42
🏁 Script executed:
# Check if there's a TypeScript/JS implementation in services that uses the env var
rg 'AGENTA_AGENT_PI_URL|agent-pi:8765' services --type ts --type js --max-count 20Repository: Agenta-AI/agenta
Length of output: 42
🏁 Script executed:
# Look at the services entrypoint to understand startup flow
find services -name '__main__.py' -o -name 'main.py' -o -name 'app.py' | head -5 | xargs -I {} sh -c 'echo "=== {} ===" && head -50 {}'Repository: Agenta-AI/agenta
Length of output: 1733
🏁 Script executed:
# Search in the SDK code for agent-pi or AGENTA_AGENT_PI_URL usage
find services/sdk -type f \( -name '*.py' -o -name '*.ts' -o -name '*.js' \) 2>/dev/null | xargs grep -l 'AGENTA_AGENT_PI_URL\|agent.pi\|agent_pi' 2>/dev/null | head -10Repository: Agenta-AI/agenta
Length of output: 42
🏁 Script executed:
# Look for any config or environment initialization that reads AGENTA_AGENT_PI_URL
rg 'getenv|environ\[' services/sdk -t py -B2 -A2 | grep -A5 -B5 'AGENT' | head -40Repository: Agenta-AI/agenta
Length of output: 113
🏁 Script executed:
# Check if depends_on is present anywhere in the services block of the file
sed -n '340,420p' hosting/docker-compose/ee/docker-compose.dev.yml | grep -i 'depends_on'Repository: Agenta-AI/agenta
Length of output: 80
🏁 Script executed:
# Check the full depends_on section for services to see what's listed
sed -n '340,425p' hosting/docker-compose/ee/docker-compose.dev.yml | grep -A 10 'depends_on'Repository: Agenta-AI/agenta
Length of output: 405
🏁 Script executed:
# Look at the services structure more carefully - check lines before and after depends_on
sed -n '400,420p' hosting/docker-compose/ee/docker-compose.dev.ymlRepository: Agenta-AI/agenta
Length of output: 1230
🏁 Script executed:
# Look for where agent-pi URL is actually called - check for HTTP client initialization or requests
rg -t py 'http://' services --max-count 50 | grep -i 'agent\|8765' | head -20Repository: Agenta-AI/agenta
Length of output: 42
🏁 Script executed:
# Check if agent-pi has a healthcheck
sed -n '421,510p' hosting/docker-compose/ee/docker-compose.dev.yml | grep -A 5 'healthcheck'Repository: Agenta-AI/agenta
Length of output: 231
🏁 Script executed:
# Verify the exact structure: does services.depends_on only have postgres, or does it have agent-pi too?
sed -n '338,408p' hosting/docker-compose/ee/docker-compose.dev.ymlRepository: Agenta-AI/agenta
Length of output: 2888
🏁 Script executed:
# Verify the complete depends_on for the services container - look for agent-pi anywhere in it
sed -n '338,425p' hosting/docker-compose/ee/docker-compose.dev.yml | grep -A 20 'services:' | grep -A 15 'depends_on'Repository: Agenta-AI/agenta
Length of output: 42
🏁 Script executed:
# Check if there's any other reference to agent-pi in the services container section
sed -n '338,425p' hosting/docker-compose/ee/docker-compose.dev.yml | grep -i 'agent.pi'Repository: Agenta-AI/agenta
Length of output: 205
🏁 Script executed:
# Look at the full structure around lines 397-403 to confirm the exact context
sed -n '390,410p' hosting/docker-compose/ee/docker-compose.dev.ymlRepository: Agenta-AI/agenta
Length of output: 1139
🏁 Script executed:
# Check if the services container itself has a healthcheck or startup probe
sed -n '338,425p' hosting/docker-compose/ee/docker-compose.dev.yml | tail -40Repository: Agenta-AI/agenta
Length of output: 2128
🏁 Script executed:
# Let's look at the complete file structure around the services definition to see if there's a depends_on after the environment section
sed -n '395,425p' hosting/docker-compose/ee/docker-compose.dev.ymlRepository: Agenta-AI/agenta
Length of output: 1757
🏁 Script executed:
# Search the entire docker-compose.dev.yml file for any reference to agent-pi in depends_on
grep -n 'agent-pi' hosting/docker-compose/ee/docker-compose.dev.ymlRepository: Agenta-AI/agenta
Length of output: 217
Add startup ordering for the new agent-pi sidecar dependency.
The services container references http://agent-pi:8765 via AGENTA_AGENT_PI_URL but has no depends_on entry for agent-pi. Docker Compose will not guarantee startup order, risking race conditions where the services container initializes before the sidecar is ready.
🔧 Startup-order fix
# === NETWORK ============================================== #
networks:
- agenta-network
extra_hosts:
- "host.docker.internal:host-gateway"
+ # === ORCHESTRATION ======================================== #
+ depends_on:
+ agent-pi:
+ condition: service_started
# === LABELS =============================================== #| # Tracing export fallback (used when a request carries no usable OTLP | ||
| # credential). Must be reachable from this container. | ||
| AGENTA_HOST: ${AGENTA_HOST:-http://144.76.237.122:8280} | ||
| AGENTA_API_KEY: ${AGENTA_API_KEY:-} |
There was a problem hiding this comment.
Do not default telemetry to a public HTTP IP.
Line 446 sends the sidecar’s fallback tracing export to http://144.76.237.122:8280 whenever AGENTA_HOST is unset. For a dev stack, that can leak local trace data/metadata to an external endpoint by default.
🛡️ Proposed safer default
- AGENTA_HOST: ${AGENTA_HOST:-http://144.76.237.122:8280}
+ # Set AGENTA_HOST explicitly when exporting traces outside the local stack.
+ AGENTA_HOST: ${AGENTA_HOST:-}If tracing must work out of the box, default this to an in-stack service URL instead of a public IP.
Reviewer guide: interesting codeA dev-compose-only change. The spots worth a close look:
|
Reviewer guide: interesting codeA short tour of the parts worth a close read. All paths are in
|
| environment: | ||
| DOCKER_NETWORK_MODE: ${DOCKER_NETWORK_MODE:-bridge} | ||
| # Agent workflow: reach the agent runner sidecar in-network. | ||
| AGENTA_AGENT_PI_URL: http://agent-pi:8765 |
There was a problem hiding this comment.
This is the single line that routes runs to the sidecar. The runner reads AGENTA_AGENT_PI_URL and only goes over HTTP when it is set; unset means a local subprocess (services/oss/src/agent/app.py:63). The agent-pi host here must match the service name at line 421, or every run hits an unresolved host.
| # === LIFECYCLE ============================================ # | ||
| restart: always | ||
|
|
||
| agent-pi: |
There was a problem hiding this comment.
The agent-pi service has no env_file by design. The sandbox must not inherit the stack secrets (Composio, Stripe, PostHog, Google). It gets only its port, the mounted Pi login, the OTLP fallback, and the Daytona keys for the daytona sandbox axis. Tools run server-side via /tools/call, so the sandbox never needs the broad secret set. Worth confirming this stays empty as the stack grows.
mmabrouk
left a comment
There was a problem hiding this comment.
Codex subagent review for #4776
Blocking finding:
hosting/docker-compose/ee/docker-compose.dev.yml:426/:427addsagent-piwithbuild.context: ../../../services/agentanddockerfile: docker/Dockerfile.dev, but this PR is based onmainandservices/agent/*is not present onmain. I checked the related runner PR #4778, and that is whereservices/agent/package.json,services/agent/src/server.ts, andservices/agent/docker/Dockerfile.devare introduced. As-is, if #4776 merges independently as advertised,./hosting/docker-compose/run.sh --build --license ee --dev ...fails before the dev stack can start because the build context/dockerfile is missing. Please either stack/retarget this PR on the runner PR, enforce merge order by landing #4778 first, or include/gate the sidecar so the compose file remains buildable from its declared base.
Context checked:
hosting/docker-compose/ee/docker-compose.dev.yml:398matches the #4772 service contract:AGENTA_AGENT_PI_URLselects the HTTP sidecar transport, andhttp://agent-pi:8765matches the new service name and runner server port from #4778.hosting/docker-compose/ee/docker-compose.dev.yml:421keeps the sidecar outside the stackenv_file, which looks like the right secret boundary. The explicit Daytona and trace envs are scoped rather than inheriting the full API/service secret set.- Residual risk after the stack/base issue is fixed: because
AGENTA_AGENT_PI_URLis unconditional, any override/profile that drops or failsagent-pileavesservicesconfigured to call an unavailable host. A healthcheck/depends_onor tying the env var to the sidecar inclusion would make that failure mode clearer. I did not mark that separately because the sidecar is present in this file and existing review comments already called out startup ordering.
I could not submit this as REQUEST_CHANGES because GitHub rejects change requests on your own PR for the authenticated account, but I would treat the missing runner build context as blocking.
I did not run the compose stack locally; this review used GitHub PR metadata, patches, and files plus the #4779 ground-truth/stack docs and the #4772/#4778 contracts.
|
Codex subagent review for #4776 follow-up Additional review-map issue:
I checked the compose diff itself and did not find comments that directly name #4774/#4778; the stale references are in the PR body/review guidance. This reinforces the earlier blocking finding that #4776 is not independently buildable from its declared |
Agent-workflows: functional PR set
Sliced by functional area, final code only (no intermediate churn). Most PRs are independent off
main; two pairs are stacked. This PR's base ismain.Context
The agent runner needs a place to live in the dev stack. The Python service in
api/decides per run whether to call a Pi process in its own checkout or an in-network sidecar. Until now the EE dev compose had no sidecar to call. This PR adds one. It branches offmainand is the only hosting change in the agent-workflows slice: the runner stack, nothing else.What this changes
It adds an
agent-piservice tohosting/docker-compose/ee/docker-compose.dev.yml. The service builds fromservices/agent(thesandbox-agent serverruntime) and listens on port8765.It also wires the
servicescontainer to reach the sidecar. Four env vars carry the routing and the run defaults:AGENTA_AGENT_PI_URL=http://agent-pi:8765routes calls to the sidecar over the compose network.AGENTA_AGENT_RUNTIMEpicks the runtime (defaultrivet).AGENTA_AGENT_HARNESSpicks the harness (defaultpi).AGENTA_AGENT_SANDBOXpicks the sandbox (defaultlocal).The
agent-piservice runs with noenv_fileon purpose. The Pi sandbox must not inherit the stack secrets. It gets only its port, the mounted Pi login, an OTLP export fallback, and the Daytona credentials the SDK reads when the sandbox axis isdaytona.Key architectural decision to review
Confirm the in-network URL wiring lines up with the runner. The service reads
AGENTA_AGENT_PI_URLto choose HTTP-to-sidecar over a local subprocess (services/oss/src/agent/app.py:63). The compose valuehttp://agent-pi:8765(line 398) is the Docker DNS name of theagent-piservice (line 421), so the names must stay in sync.Confirm the two axes are independent. Harness (
pi/claude) and sandbox (local/daytona) are separate choices the Python service sends per run, not a fixed pairing (line 400).Confirm the licensing note. The comment at line 456 says we ship the snapshot builder, not the snapshot, so we never distribute a Claude-containing image. That matches the posture in
services/agent/docker/README.md: Pi (MIT) is baked freely, Claude Code is installed from Anthropic at runtime and never repackaged.How to review this PR
Read the
agent-piblock top to bottom and check the four points above: the matching service name and URL, the emptyenv_file, the scoped Daytona credentials, and the licensing comment.The regression to watch: a stack that does not mount the sidecar. If a deploy omits
agent-pibut theservicescontainer still hasAGENTA_AGENT_PI_URLset, every agent run hits an unresolved host. The URL and the service must land together or not at all.