feat(agent): agent workflow service and tool-resolution API by mmabrouk · Pull Request #4772 · Agenta-AI/agenta

mmabrouk · 2026-06-19T16:28:59Z

Agent-workflows: functional PR set

Sliced by functional area, final code only (no intermediate churn). Most PRs are independent off main; two pairs are stacked. This PR's base is #4771 (review that first).

feat(sdk): agent runtime behind backend/harness ports #4771: SDK agent runtime: ports, adapters, tools, messages protocol
- feat(agent): agent workflow service and tool-resolution API #4772: Agent service + tool-resolution API <- you are here
feat(agent): runner wire contract and tool execution #4773: Runner wire contract + tool execution
- feat(agent): runner engines, server, and tracing #4774: Runner engines, server, tracing
feat(frontend): agent config playground controls #4775: Playground agent config UI
chore(hosting): wire the agent runner sidecar into compose #4776: Hosting compose wiring
docs(agent): agent-workflows design and ground truth #4777: Docs: design + ground truth

Context

The agent workflow service is the Agenta app that runs an agent turn. It registers the agent as a normal Agenta workflow (ag.create_app + ag.workflow + ag.route), so it exposes /invoke and /inspect like the chat and completion services, plus the agent-only /messages and /load-session routes. The handler reads the agent config from parameters, resolves tools and provider secrets server-side, threads the trace context, picks a backend, and runs one turn.

This PR also adds the platform-side tool-resolution API the service calls back into. POST /tools/resolve turns an agent's tool references into model-ready specs without ever handing provider keys to the running agent.

This is a functional slice that shows the final code, not the path we took to get there. It stacks on the SDK runtime PR feat/agent-sdk-runtime, which owns the engine-agnostic ports (backend, environment, harness) and their adapters. This PR is the thin Agenta integration that feeds those ports resolved tools, vault secrets, and a trace context.

What this changes

The service composes the SDK's offline tool resolver with two server-only adapters. A gateway resolver calls POST /tools/resolve for Composio tools. A vault secret provider calls POST /secrets/resolve for named secrets. Both fail fast: a missing API base or a non-200 from the gateway raises rather than running an agent with half its tools.

select_backend makes the engine-routing decision explicit. Before, the harness and sandbox choices could be silently dropped. Now pi and agenta stay on the in-process Pi backend on a local sandbox. A claude harness, any non-local sandbox, or AGENTA_AGENT_RUNTIME=rivet routes to the rivet backend. The transport to the TypeScript runner follows AGENTA_AGENT_PI_URL: set means HTTP to the sidecar, unset means spawn the runner CLI.

On the platform side, call_tool and the new agent resolver previously duplicated connection lookup and validation. That logic now lives in one resolve_connection_by_slug method, so the call path and the resolve path raise the same precise errors: missing, inactive, invalid, or no provider handshake.

The agent advertises an agent_config catalog type through /inspect. The playground renders one composite control for the whole config and pre-fills it from the schema default.

Key architectural decision to review

Backend gating is split across two files, and that split is deliberate. select_backend in services/oss/src/agent/app.py only routes. It never raises. It sends agenta on a non-local sandbox to the rivet backend even though the Agenta harness cannot run there yet. The actual rejection happens one line later, when make_harness builds the harness and the harness validates its backend and raises UnsupportedHarnessError. Scrutinize whether routing and validation belong in separate places. The tradeoff: select_backend stays a pure mapping that is trivial to unit test, and the harness owns the truth about which backends it supports. The risk: a reader skimming select_backend sees agenta routed to rivet and may assume it runs there.

The second decision is the trust boundary in tool resolution. The platform never returns provider keys. resolve_agent_tools in api/oss/src/core/tools/service.py returns only a call_ref slug (tools.{provider}.{integration}.{action}.{connection}), and the running agent calls back through POST /tools/call to execute. The _SLUG_SEGMENT_RE validation matters here. __ is forbidden in a segment because /tools/call round-trips __ to . when parsing function names, so a __ inside a segment would corrupt the split. Check that the regex and the call-ref construction agree.

How to review this PR

Read services/oss/src/agent/app.py first. The _agent handler is the whole control flow: parse config, resolve tools and secrets, build a session config, pick a backend, run one turn batch or streaming. Confirm select_backend matches the routing table in its test, and confirm the streaming path owns setup and cleanup correctly.

Then read the tool adapters. tools/gateway.py is the largest file and the one to read closely. It validates the gateway response shape hard: one spec per ref, no duplicates, every ref returned. tools/resolver.py shows the composition. tools/secrets.py and secrets.py are short HTTP clients.

Then schemas.py and tracing.py. In tracing, check that record_usage stamps gen_ai.usage.* on the active span, because the harness exports its own OTLP batch and the per-batch roll-up cannot bridge totals onto the workflow span.

Then api/oss/src/core/tools/service.py: the new resolver and the extracted resolve_connection_by_slug.

Skip the test files on a first pass. They read as a spec for the behavior above.

Likely regression: the call_tool refactor. It now delegates connection lookup to resolve_connection_by_slug. Confirm the error-to-HTTP-status mapping in the router did not change for the call path.

Tests / notes

Unit tests cover select_backend routing, the invoke handler, gateway and secret mapping, and tool resolution. Integration tests exercise the gateway and secret HTTP adapters against a mocked backend. Platform tests cover the resolver, the legacy-shape coercion, and the stable call-ref.

Both secret resolvers are best-effort by design. An empty vault returns no env vars, and the harness falls back to its own login or OAuth. The gateway resolver is the opposite. It fails fast, because an agent missing a configured tool should not run silently. Watch that this asymmetry is intended when reviewing failure handling.

vercel · 2026-06-19T16:29:05Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 19, 2026 4:29pm

coderabbitai · 2026-06-19T16:29:08Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 74d40750-a0ad-45f2-9b36-c0d67aab0ecf

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

✅ Review completed - (🔄 Check again to review again)

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/agent-service

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

mmabrouk · 2026-06-19T16:34:23Z

Reviewer guide: interesting code

Start here, in this order:

services/oss/src/agent/app.py:65 — select_backend: the boolean that routes a run to rivet versus in-process Pi. The whole control flow turns on this. Line 72 is the in-process fallback; the Agenta-plus-non-local gap fails loudly instead.
services/oss/src/agent/tools/resolver.py:46 — the three-way composition: SDK offline resolver, gateway adapter, vault provider, all under MissingSecretPolicy.ERROR (line 49).
services/oss/src/agent/tools/gateway.py:123 — gateway response validation: count match, then joining specs to references by call_ref (line 149), not by position. This is where a malformed platform response is rejected.
services/oss/src/agent/tools/secrets.py:63 — VaultToolSecretProvider: best-effort, returns {} on a vault outage. Read this against the resolver's ERROR policy to see the split fail-fast behavior.
api/oss/src/core/tools/service.py:489 — resolve_agent_tools: up-front connection validation and the call-reference slug. The _SLUG_SEGMENT_RE at line 42 keeps __ out of a call reference so /tools/call can round-trip the function name.
services/oss/src/agent/tracing.py:29 — trace threading and usage stamping. Both are best-effort and must never fail a run.

mmabrouk · 2026-06-19T16:34:35Z

+        runtime == "rivet"
+        or selection.harness not in ("pi", "agenta")
+        or selection.sandbox != "local"
+    )


This OR decides rivet vs in-process. Worth a careful read: any harness other than pi/agenta, any non-local sandbox, or the runtime override sends the run to rivet. The Agenta-plus-non-local case has no backend and raises downstream rather than silently substituting one. Confirm a local Pi run never trips any of these clauses.

mmabrouk · 2026-06-19T16:34:42Z

+
+class VaultToolSecretProvider:
+    async def get_many(self, names: Sequence[str]) -> Mapping[str, str]:
+        return await resolve_named_secrets(names)


This provider is deliberately best-effort: a vault outage logs and returns {} rather than raising. It pairs with the SDK resolver running under MissingSecretPolicy.ERROR, so a code tool whose declared secret is now absent raises MissingToolSecretError. Net effect: an outage downgrades to 'secret missing' and then hard-fails for any tool that needed it, but a project with no secret-bearing tools still runs. Review this layering as one unit.

mmabrouk · 2026-06-19T16:34:55Z

+                )
+                log.warning("agent: %s", error)
+                raise error
+            reference = _normalize_reference(str(call_ref))


Specs are joined to references by call_ref here, not by list position. With the count check at line 123 and the duplicate-reference guards, a reordered or partial platform response is rejected instead of silently binding a tool to the wrong schema. This is the spot to verify the reference normalization (__ <-> .) matches the slug the service emits.

mmabrouk · 2026-06-19T16:34:56Z

+        provider_key = ToolProviderKind.COMPOSIO.value
+
+        for segment in (ref.integration, ref.action, ref.connection):
+            if not _SLUG_SEGMENT_RE.match(segment):


The slug-segment regex forbids __ inside a segment because /tools/call round-trips __ <-> . when parsing function names, so a __ in a segment would corrupt the split. Combined with the up-front resolve_connection_by_slug call, a missing/inactive/invalid connection fails the invoke before the agent loop starts. Confirm the regex is the only validation gate on these three segments.

mmabrouk · 2026-06-19T17:52:23Z

Reviewer guide: interesting code

A map to the load-bearing lines. This is a functional slice on top of feat/agent-sdk-runtime; read that PR's ports first.

services/oss/src/agent/app.py:51 — select_backend is the engine-routing table: pi/agenta stay in-process on a local sandbox, everything else (claude, non-local sandbox, AGENTA_AGENT_RUNTIME=rivet) goes to rivet. Pure mapping, never raises.
services/oss/src/agent/app.py:112 — make_harness is where an unsupported pairing actually raises. agenta on rivet is routed by the line above but rejected here. Routing and validation are split on purpose.
services/oss/src/agent/tools/gateway.py:123 — the gateway resolver validates the /tools/resolve response hard: exactly one spec per ref, no duplicates, every ref returned. Fail-fast, so an agent never runs missing a configured tool.
api/oss/src/core/tools/service.py:489 — resolve_agent_tools: builtins pass through as names; Composio refs are validated against project connections and enriched from the catalog before the model sees them. Returns a call_ref slug, never a provider key.
api/oss/src/core/tools/service.py:432 — resolve_connection_by_slug is the lookup+validation extracted from call_tool, now shared by the call path and the resolve path so both raise the same precise errors.
services/oss/src/agent/tracing.py:63 — record_usage stamps gen_ai.usage.* on the active /invoke span because the harness exports its own OTLP batch and the per-batch roll-up cannot bridge totals onto that span.

mmabrouk · 2026-06-19T17:52:37Z

+    url = os.getenv("AGENTA_AGENT_PI_URL")
+    cwd = str(wrapper_dir())
+    use_rivet = (
+        runtime == "rivet"


This boolean is the whole routing policy: agenta on a non-local sandbox routes to rivet here even though the Agenta harness can't run there yet. The rejection is deferred to make_harness below, so this function stays a pure mapping that the unit test can lock down. Worth a comment cross-link so a future reader doesn't assume routing here implies support.

mmabrouk · 2026-06-19T17:52:37Z

+        raw_specs = payload.get("custom") if isinstance(payload, dict) else None
+        if not isinstance(raw_specs, list):
+            raw_specs = []
+        if len(raw_specs) != len(tools):


Strict cardinality check: one spec per ref. Combined with the duplicate-ref guards above and the per-ref specs_by_reference.get below, the resolver refuses to run an agent with a tool catalog that doesn't exactly match what was requested. This is the fail-fast half of the secret-resolver asymmetry described in the PR body.

mmabrouk · 2026-06-19T17:52:38Z

+            action.schemas.inputs if action.schemas and action.schemas.inputs else None
+        )
+        name = ref.name or f"{ref.integration}__{ref.action}"
+        call_ref = (


The call_ref slug is the trust boundary: the platform returns this, never the provider key, and /tools/call parses it back. Note it uses raw __ joins for name two lines up but . for the call_ref; _SLUG_SEGMENT_RE forbids __ inside a segment precisely so the __<->. round-trip in /tools/call can't corrupt the split.

mmabrouk · 2026-06-19T17:52:39Z

+
+
+def record_usage(usage: Optional[Dict[str, Any]]) -> None:
+    """Stamp the agent's token/cost totals onto the active ``/invoke`` workflow span.


record_usage writes gen_ai.usage.* directly on the active span instead of relying on the cumulative roll-up, because the harness exports its span tree in a separate OTLP batch. Without this the /invoke trace would show zero tokens/cost for the agent run. The early return on a missing total keeps it best-effort.

coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (2)

api/oss/tests/pytest/unit/tools/test_agent_resolution.py (1)

21-48: ⚡ Quick win

Add regression coverage for provider and payload-shape edge cases

Please add tests for: (1) a gateway tool with provider != "composio" being rejected, and (2) non-array tools payloads (e.g. {}) failing validation. These two cases directly guard the new resolution boundary behavior.

Also applies to: 50-79
services/oss/src/agent/config.py (1)
65-72: ⚡ Quick win

tools is typed too narrowly in load_config().

agent.json tools can be structured objects, but the local variable is annotated as List[str]. Aligning it with AgentConfig.tools avoids incorrect assumptions in future edits.
Suggested fix
-    tools: List[str] = []
+    tools: List[Any] = []

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 9c1f9ecf-9a11-41a4-9004-cc4fc8a74aaa

📥 Commits

Reviewing files that changed from the base of the PR and between b9e62f9 and 9587798.

📒 Files selected for processing (35)

api/oss/src/apis/fastapi/tools/models.py
api/oss/src/apis/fastapi/tools/router.py
api/oss/src/core/tools/dtos.py
api/oss/src/core/tools/exceptions.py
api/oss/src/core/tools/service.py
api/oss/tests/pytest/unit/tools/__init__.py
api/oss/tests/pytest/unit/tools/test_agent_resolution.py
services/entrypoints/main.py
services/oss/src/agent/__init__.py
services/oss/src/agent/app.py
services/oss/src/agent/client.py
services/oss/src/agent/config.py
services/oss/src/agent/schemas.py
services/oss/src/agent/secrets.py
services/oss/src/agent/tools/__init__.py
services/oss/src/agent/tools/gateway.py
services/oss/src/agent/tools/resolver.py
services/oss/src/agent/tools/secrets.py
services/oss/src/agent/tracing.py
services/oss/tests/pytest/integration/__init__.py
services/oss/tests/pytest/integration/agent/__init__.py
services/oss/tests/pytest/integration/agent/conftest.py
services/oss/tests/pytest/integration/agent/test_resolve_secrets_http.py
services/oss/tests/pytest/integration/agent/tools/__init__.py
services/oss/tests/pytest/integration/agent/tools/test_gateway_http.py
services/oss/tests/pytest/integration/agent/tools/test_secrets_http.py
services/oss/tests/pytest/unit/__init__.py
services/oss/tests/pytest/unit/agent/__init__.py
services/oss/tests/pytest/unit/agent/conftest.py
services/oss/tests/pytest/unit/agent/test_invoke_handler.py
services/oss/tests/pytest/unit/agent/test_secrets_mapping.py
services/oss/tests/pytest/unit/agent/test_select_backend.py
services/oss/tests/pytest/unit/agent/tools/__init__.py
services/oss/tests/pytest/unit/agent/tools/test_gateway_mapping.py
services/oss/tests/pytest/unit/agent/tools/test_resolution.py

coderabbitai · 2026-06-19T18:26:14Z

+    def _coerce_tools(cls, value: Any) -> List[AgentToolReference]:
+        try:
+            configs = coerce_tool_configs(value or []).tool_configs
+        except ToolConfigurationError as exc:


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid falsy-coercing invalid payloads to an empty tools list

Using value or [] makes invalid falsy payloads (e.g. {}, 0, False) pass as [], so malformed requests can be silently accepted.

Suggested fix

def _coerce_tools(cls, value: Any) -> List[AgentToolReference]: + if value is None: + raw_values = [] + elif isinstance(value, list): + raw_values = value + else: + raise ValueError("tools must be an array") try: - configs = coerce_tool_configs(value or []).tool_configs + configs = coerce_tool_configs(raw_values).tool_configs except ToolConfigurationError as exc: raise ValueError(str(exc)) from exc

coderabbitai · 2026-06-19T18:26:14Z

+    async def _resolve_composio_tool(
+        self,
+        *,
+        project_id: UUID,
+        ref: AgentComposioTool,
+    ) -> ResolvedAgentTool:
+        provider_key = ToolProviderKind.COMPOSIO.value
+
+        for segment in (ref.integration, ref.action, ref.connection):
+            if not _SLUG_SEGMENT_RE.match(segment):
+                raise ToolSlugInvalidError(
+                    slug=f"{provider_key}.{ref.integration}.{ref.action}.{ref.connection}",
+                    detail=f"Invalid slug segment: {segment!r}",
+                )
+
+        # Fail fast if the connection is missing/inactive/invalid for this project.
+        await self.resolve_connection_by_slug(
+            project_id=project_id,
+            provider_key=provider_key,
+            integration_key=ref.integration,
+            connection_slug=ref.connection,
+        )
+
+        action = await self.get_action(
+            provider_key=provider_key,
+            integration_key=ref.integration,
+            action_key=ref.action,
+        )
+        if not action:
+            raise ActionNotFoundError(
+                provider_key=provider_key,
+                integration_key=ref.integration,
+                action_key=ref.action,
+            )
+
+        input_schema = (
+            action.schemas.inputs if action.schemas and action.schemas.inputs else None
+        )
+        name = ref.name or f"{ref.integration}__{ref.action}"
+        call_ref = (
+            f"tools.{provider_key}.{ref.integration}.{ref.action}.{ref.connection}"
+        )


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate gateway provider before composing a composio call-ref

_resolve_composio_tool always forces provider_key="composio" but never checks ref.provider. A gateway reference with a different provider is currently accepted and silently remapped into the composio namespace, which can resolve/execute against the wrong integration space.

Suggested fix

async def _resolve_composio_tool( self, *, project_id: UUID, ref: AgentComposioTool, ) -> ResolvedAgentTool: provider_key = ToolProviderKind.COMPOSIO.value + if ref.provider != provider_key: + raise ToolSlugInvalidError( + slug=f"{ref.provider}.{ref.integration}.{ref.action}.{ref.connection}", + detail=f"Unsupported gateway provider for /tools/resolve: {ref.provider!r}", + )

coderabbitai · 2026-06-19T18:26:14Z

+from agenta.sdk.engines.tracing.propagation import inject
+
+# Budget for a backend round-trip (the tool catalog/connection check, the vault fetch).
+TOOLS_TIMEOUT = float(os.getenv("AGENTA_AGENT_TOOLS_TIMEOUT", "30"))


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard TOOLS_TIMEOUT parsing to avoid import-time crashes.

A non-numeric AGENTA_AGENT_TOOLS_TIMEOUT will raise ValueError at import time and fail the agent app startup path.

Suggested fix

-TOOLS_TIMEOUT = float(os.getenv("AGENTA_AGENT_TOOLS_TIMEOUT", "30")) +try: + TOOLS_TIMEOUT = float(os.getenv("AGENTA_AGENT_TOOLS_TIMEOUT", "30")) +except ValueError: + TOOLS_TIMEOUT = 30.0

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

TOOLS_TIMEOUT = float(os.getenv("AGENTA_AGENT_TOOLS_TIMEOUT", "30"))

try:

TOOLS_TIMEOUT = float(os.getenv("AGENTA_AGENT_TOOLS_TIMEOUT", "30"))

except ValueError:

TOOLS_TIMEOUT = 30.0

coderabbitai · 2026-06-19T18:26:14Z

+            log.warning(
+                "agent: named-secret resolve HTTP %s for %s",
+                response.status_code,
+                names,
+            )
+            return {}
+        data = response.json() or {}
+    except Exception:  # pylint: disable=broad-except
+        log.warning("agent: named-secret resolve failed for %s", names, exc_info=True)
+        return {}
+
+    resolved = data.get("secrets") if isinstance(data, dict) else None
+    resolved = resolved if isinstance(resolved, dict) else {}
+    missing = [name for name in names if name not in resolved]
+    if missing:
+        log.warning("agent: unresolved named secret(s): %s", missing)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid logging raw secret names in warning paths.

Line 43, Line 50, and Line 57 log secret identifiers directly. Even without values, names can expose internal integrations and tenant metadata in shared logs.

Proposed patch

- if response.status_code >= 400: - log.warning( - "agent: named-secret resolve HTTP %s for %s", - response.status_code, - names, - ) + if response.status_code >= 400: + log.warning( + "agent: named-secret resolve HTTP %s for %d requested secret(s)", + response.status_code, + len(names), + ) return {} @@ - except Exception: # pylint: disable=broad-except - log.warning("agent: named-secret resolve failed for %s", names, exc_info=True) + except Exception: # pylint: disable=broad-except + log.warning( + "agent: named-secret resolve failed for %d requested secret(s)", + len(names), + exc_info=True, + ) return {} @@ - if missing: - log.warning("agent: unresolved named secret(s): %s", missing) + if missing: + log.warning("agent: unresolved named secret count=%d", len(missing))

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

log.warning(

"agent: named-secret resolve HTTP %s for %s",

response.status_code,

names,

)

return {}

data = response.json() or {}

except Exception: # pylint: disable=broad-except

log.warning("agent: named-secret resolve failed for %s", names, exc_info=True)

return {}

resolved = data.get("secrets") if isinstance(data, dict) else None

resolved = resolved if isinstance(resolved, dict) else {}

missing = [name for name in names if name not in resolved]

if missing:

log.warning("agent: unresolved named secret(s): %s", missing)

log.warning(

"agent: named-secret resolve HTTP %s for %d requested secret(s)",

response.status_code,

len(names),

)

return {}

data = response.json() or {}

except Exception: # pylint: disable=broad-except

log.warning(

"agent: named-secret resolve failed for %d requested secret(s)",

len(names),

exc_info=True,

)

return {}

resolved = data.get("secrets") if isinstance(data, dict) else None

resolved = resolved if isinstance(resolved, dict) else {}

missing = [name for name in names if name not in resolved]

if missing:

log.warning("agent: unresolved named secret count=%d", len(missing))

coderabbitai · 2026-06-19T18:26:14Z

+    resolved = data.get("secrets") if isinstance(data, dict) else None
+    resolved = resolved if isinstance(resolved, dict) else {}
+    missing = [name for name in names if name not in resolved]
+    if missing:
+        log.warning("agent: unresolved named secret(s): %s", missing)
+    return {
+        str(key): str(value) for key, value in resolved.items() if value is not None
+    }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restrict returned secrets to the requested name set.

Line 59 currently returns every key in resolved. If upstream returns extras, this path propagates unrequested secrets into runtime memory.

Proposed patch

- return { - str(key): str(value) for key, value in resolved.items() if value is not None - } + requested = {str(name) for name in names} + return { + str(key): str(value) + for key, value in resolved.items() + if value is not None and str(key) in requested + }

coderabbitai · 2026-06-19T18:26:14Z

+    if not usage or not usage.get("total"):
+        return
+    try:
+        span = otel_trace.get_current_span()
+        input_tokens = int(usage.get("input") or 0)
+        output_tokens = int(usage.get("output") or 0)
+        span.set_attribute("gen_ai.usage.input_tokens", input_tokens)
+        span.set_attribute("gen_ai.usage.output_tokens", output_tokens)
+        span.set_attribute("gen_ai.usage.prompt_tokens", input_tokens)
+        span.set_attribute("gen_ai.usage.completion_tokens", output_tokens)
+        span.set_attribute("gen_ai.usage.total_tokens", int(usage.get("total") or 0))
+        cost = usage.get("cost")
+        if cost:
+            span.set_attribute("gen_ai.usage.cost", float(cost))


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Zero-valued usage metrics are currently dropped.

Using truthy checks skips valid 0 totals/costs, so traces can’t distinguish “zero usage” from “missing usage”.

Suggested fix

- if not usage or not usage.get("total"): + if not usage or usage.get("total") is None: return @@ - cost = usage.get("cost") - if cost: - span.set_attribute("gen_ai.usage.cost", float(cost)) + cost = usage.get("cost") + if cost is not None: + span.set_attribute("gen_ai.usage.cost", float(cost))

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if not usage or not usage.get("total"):

return

try:

span = otel_trace.get_current_span()

input_tokens = int(usage.get("input") or 0)

output_tokens = int(usage.get("output") or 0)

span.set_attribute("gen_ai.usage.input_tokens", input_tokens)

span.set_attribute("gen_ai.usage.output_tokens", output_tokens)

span.set_attribute("gen_ai.usage.prompt_tokens", input_tokens)

span.set_attribute("gen_ai.usage.completion_tokens", output_tokens)

span.set_attribute("gen_ai.usage.total_tokens", int(usage.get("total") or 0))

cost = usage.get("cost")

if cost:

span.set_attribute("gen_ai.usage.cost", float(cost))

if not usage or usage.get("total") is None:

return

try:

span = otel_trace.get_current_span()

input_tokens = int(usage.get("input") or 0)

output_tokens = int(usage.get("output") or 0)

span.set_attribute("gen_ai.usage.input_tokens", input_tokens)

span.set_attribute("gen_ai.usage.output_tokens", output_tokens)

span.set_attribute("gen_ai.usage.prompt_tokens", input_tokens)

span.set_attribute("gen_ai.usage.completion_tokens", output_tokens)

span.set_attribute("gen_ai.usage.total_tokens", int(usage.get("total") or 0))

cost = usage.get("cost")

if cost is not None:

span.set_attribute("gen_ai.usage.cost", float(cost))

mmabrouk

Codex subagent review for #4772

Blocking finding:

services/oss/src/agent/app.py:63 / services/oss/src/agent/config.py:40: the default local path selects InProcessPiBackend(url=None, cwd=wrapper_dir()), and wrapper_dir() defaults to services/agent. At this PR's head (feat/agent-service), services/agent/src/cli.ts, services/agent/package.json, and services/agent/src/server.ts are not present; those files land in the runner PRs (#4773/#4778). After merging only #4771 + #4772, the newly mounted /agent/v0/invoke path will try to spawn pnpm exec tsx src/cli.ts from a missing package whenever AGENTA_AGENT_PI_URL is unset, so the service is not independently runnable against its declared base. The HTTP sidecar path also depends on the runner/hosting stack being present. Please either stack/retarget this PR on the runner PR that provides the runtime, include the runtime dependency here, or gate /agent/v0/backend selection until a runner package or sidecar URL is available, returning a clear 503/validation error with coverage.

Related cross-PR issue:

services/oss/src/agent/app.py:149 / services/oss/src/agent/schemas.py:36: related to the #4775 frontend path, this PR mounts a directly registered workflow and the comment explicitly says there is no first-class builtin URI yet (agenta:builtin:agent:v0). /inspect can describe the mounted handler, but I do not see a catalog template/registration that would make an unconditional frontend "create Agent" option work. Please either add that backend surface here, gate the frontend create option until it lands, or update the PR/stack docs so #4775 does not depend on a template this PR does not provide.

I also checked the tool resolver/call-path refactor, vault/gateway adapters, and streaming setup/cleanup path from the patch and did not find another blocker. I did not run tests; this review is based on the GitHub PR metadata, patches, and branch files. The PR body's stack map still points to #4774/#4777 while the newer runner/docs PRs are #4778/#4779, which is contributing to the dependency confusion above.

feat(agent): agent workflow service and gateway tool-resolution API

9587798

dosubot Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Jun 19, 2026

dosubot Bot added Backend feature python Pull requests that update Python code tests labels Jun 19, 2026

vercel Bot deployed to Preview June 19, 2026 16:29 View deployment

mmabrouk mentioned this pull request Jun 19, 2026

feat(sdk): agent runtime behind backend/harness ports #4771

Open

mmabrouk commented Jun 19, 2026

View reviewed changes

This was referenced Jun 19, 2026

docs(agent): agent-workflows design and ground truth #4777

Open

feat(agent): runner engines, server, and tracing #4774

Open

coderabbitai Bot reviewed Jun 19, 2026

View reviewed changes

mmabrouk commented Jun 19, 2026

View reviewed changes



		def record_usage(usage: Optional[Dict[str, Any]]) -> None:
		"""Stamp the agent's token/cost totals onto the active ``/invoke`` workflow span.

Conversation

mmabrouk commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Agent-workflows: functional PR set

Context

What this changes

Key architectural decision to review

How to review this PR

Tests / notes

Uh oh!

vercel Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

mmabrouk commented Jun 19, 2026

Reviewer guide: interesting code

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mmabrouk commented Jun 19, 2026

Reviewer guide: interesting code

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

mmabrouk left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mmabrouk commented Jun 19, 2026 •

edited

Loading

vercel Bot commented Jun 19, 2026 •

edited

Loading

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading