Skip to content

feat(agent): agent workflow service and tool-resolution API#4772

Open
mmabrouk wants to merge 1 commit into
feat/agent-sdk-runtimefrom
feat/agent-service
Open

feat(agent): agent workflow service and tool-resolution API#4772
mmabrouk wants to merge 1 commit into
feat/agent-sdk-runtimefrom
feat/agent-service

Conversation

@mmabrouk

@mmabrouk mmabrouk commented Jun 19, 2026

Copy link
Copy Markdown
Member

Agent-workflows: functional PR set

Sliced by functional area, final code only (no intermediate churn). Most PRs are independent off main; two pairs are stacked. This PR's base is #4771 (review that first).

Context

The agent workflow service is the Agenta app that runs an agent turn. It registers the agent as a normal Agenta workflow (ag.create_app + ag.workflow + ag.route), so it exposes /invoke and /inspect like the chat and completion services, plus the agent-only /messages and /load-session routes. The handler reads the agent config from parameters, resolves tools and provider secrets server-side, threads the trace context, picks a backend, and runs one turn.

This PR also adds the platform-side tool-resolution API the service calls back into. POST /tools/resolve turns an agent's tool references into model-ready specs without ever handing provider keys to the running agent.

This is a functional slice that shows the final code, not the path we took to get there. It stacks on the SDK runtime PR feat/agent-sdk-runtime, which owns the engine-agnostic ports (backend, environment, harness) and their adapters. This PR is the thin Agenta integration that feeds those ports resolved tools, vault secrets, and a trace context.

What this changes

The service composes the SDK's offline tool resolver with two server-only adapters. A gateway resolver calls POST /tools/resolve for Composio tools. A vault secret provider calls POST /secrets/resolve for named secrets. Both fail fast: a missing API base or a non-200 from the gateway raises rather than running an agent with half its tools.

select_backend makes the engine-routing decision explicit. Before, the harness and sandbox choices could be silently dropped. Now pi and agenta stay on the in-process Pi backend on a local sandbox. A claude harness, any non-local sandbox, or AGENTA_AGENT_RUNTIME=rivet routes to the rivet backend. The transport to the TypeScript runner follows AGENTA_AGENT_PI_URL: set means HTTP to the sidecar, unset means spawn the runner CLI.

On the platform side, call_tool and the new agent resolver previously duplicated connection lookup and validation. That logic now lives in one resolve_connection_by_slug method, so the call path and the resolve path raise the same precise errors: missing, inactive, invalid, or no provider handshake.

The agent advertises an agent_config catalog type through /inspect. The playground renders one composite control for the whole config and pre-fills it from the schema default.

Key architectural decision to review

Backend gating is split across two files, and that split is deliberate. select_backend in services/oss/src/agent/app.py only routes. It never raises. It sends agenta on a non-local sandbox to the rivet backend even though the Agenta harness cannot run there yet. The actual rejection happens one line later, when make_harness builds the harness and the harness validates its backend and raises UnsupportedHarnessError. Scrutinize whether routing and validation belong in separate places. The tradeoff: select_backend stays a pure mapping that is trivial to unit test, and the harness owns the truth about which backends it supports. The risk: a reader skimming select_backend sees agenta routed to rivet and may assume it runs there.

The second decision is the trust boundary in tool resolution. The platform never returns provider keys. resolve_agent_tools in api/oss/src/core/tools/service.py returns only a call_ref slug (tools.{provider}.{integration}.{action}.{connection}), and the running agent calls back through POST /tools/call to execute. The _SLUG_SEGMENT_RE validation matters here. __ is forbidden in a segment because /tools/call round-trips __ to . when parsing function names, so a __ inside a segment would corrupt the split. Check that the regex and the call-ref construction agree.

How to review this PR

Read services/oss/src/agent/app.py first. The _agent handler is the whole control flow: parse config, resolve tools and secrets, build a session config, pick a backend, run one turn batch or streaming. Confirm select_backend matches the routing table in its test, and confirm the streaming path owns setup and cleanup correctly.

Then read the tool adapters. tools/gateway.py is the largest file and the one to read closely. It validates the gateway response shape hard: one spec per ref, no duplicates, every ref returned. tools/resolver.py shows the composition. tools/secrets.py and secrets.py are short HTTP clients.

Then schemas.py and tracing.py. In tracing, check that record_usage stamps gen_ai.usage.* on the active span, because the harness exports its own OTLP batch and the per-batch roll-up cannot bridge totals onto the workflow span.

Then api/oss/src/core/tools/service.py: the new resolver and the extracted resolve_connection_by_slug.

Skip the test files on a first pass. They read as a spec for the behavior above.

Likely regression: the call_tool refactor. It now delegates connection lookup to resolve_connection_by_slug. Confirm the error-to-HTTP-status mapping in the router did not change for the call path.

Tests / notes

Unit tests cover select_backend routing, the invoke handler, gateway and secret mapping, and tool resolution. Integration tests exercise the gateway and secret HTTP adapters against a mocked backend. Platform tests cover the resolver, the legacy-shape coercion, and the stable call-ref.

Both secret resolvers are best-effort by design. An empty vault returns no env vars, and the harness falls back to its own login or OAuth. The gateway resolver is the opposite. It fails fast, because an agent missing a configured tool should not run silently. Watch that this asymmetry is intended when reviewing failure handling.

@dosubot dosubot Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Jun 19, 2026
@vercel

vercel Bot commented Jun 19, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agenta-documentation Ready Ready Preview, Comment Jun 19, 2026 4:29pm

Request Review

@dosubot dosubot Bot added Backend feature python Pull requests that update Python code tests labels Jun 19, 2026
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 74d40750-a0ad-45f2-9b36-c0d67aab0ecf

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • ✅ Review completed - (🔄 Check again to review again)
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/agent-service

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@mmabrouk

Copy link
Copy Markdown
Member Author

Reviewer guide: interesting code

Start here, in this order:

  • services/oss/src/agent/app.py:65select_backend: the boolean that routes a run to rivet versus in-process Pi. The whole control flow turns on this. Line 72 is the in-process fallback; the Agenta-plus-non-local gap fails loudly instead.
  • services/oss/src/agent/tools/resolver.py:46 — the three-way composition: SDK offline resolver, gateway adapter, vault provider, all under MissingSecretPolicy.ERROR (line 49).
  • services/oss/src/agent/tools/gateway.py:123 — gateway response validation: count match, then joining specs to references by call_ref (line 149), not by position. This is where a malformed platform response is rejected.
  • services/oss/src/agent/tools/secrets.py:63VaultToolSecretProvider: best-effort, returns {} on a vault outage. Read this against the resolver's ERROR policy to see the split fail-fast behavior.
  • api/oss/src/core/tools/service.py:489resolve_agent_tools: up-front connection validation and the call-reference slug. The _SLUG_SEGMENT_RE at line 42 keeps __ out of a call reference so /tools/call can round-trip the function name.
  • services/oss/src/agent/tracing.py:29 — trace threading and usage stamping. Both are best-effort and must never fail a run.

runtime == "rivet"
or selection.harness not in ("pi", "agenta")
or selection.sandbox != "local"
)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This OR decides rivet vs in-process. Worth a careful read: any harness other than pi/agenta, any non-local sandbox, or the runtime override sends the run to rivet. The Agenta-plus-non-local case has no backend and raises downstream rather than silently substituting one. Confirm a local Pi run never trips any of these clauses.


class VaultToolSecretProvider:
async def get_many(self, names: Sequence[str]) -> Mapping[str, str]:
return await resolve_named_secrets(names)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This provider is deliberately best-effort: a vault outage logs and returns {} rather than raising. It pairs with the SDK resolver running under MissingSecretPolicy.ERROR, so a code tool whose declared secret is now absent raises MissingToolSecretError. Net effect: an outage downgrades to 'secret missing' and then hard-fails for any tool that needed it, but a project with no secret-bearing tools still runs. Review this layering as one unit.

)
log.warning("agent: %s", error)
raise error
reference = _normalize_reference(str(call_ref))

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specs are joined to references by call_ref here, not by list position. With the count check at line 123 and the duplicate-reference guards, a reordered or partial platform response is rejected instead of silently binding a tool to the wrong schema. This is the spot to verify the reference normalization (__ <-> .) matches the slug the service emits.

provider_key = ToolProviderKind.COMPOSIO.value

for segment in (ref.integration, ref.action, ref.connection):
if not _SLUG_SEGMENT_RE.match(segment):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The slug-segment regex forbids __ inside a segment because /tools/call round-trips __ <-> . when parsing function names, so a __ in a segment would corrupt the split. Combined with the up-front resolve_connection_by_slug call, a missing/inactive/invalid connection fails the invoke before the agent loop starts. Confirm the regex is the only validation gate on these three segments.

@mmabrouk

Copy link
Copy Markdown
Member Author

Reviewer guide: interesting code

A map to the load-bearing lines. This is a functional slice on top of feat/agent-sdk-runtime; read that PR's ports first.

  • services/oss/src/agent/app.py:51select_backend is the engine-routing table: pi/agenta stay in-process on a local sandbox, everything else (claude, non-local sandbox, AGENTA_AGENT_RUNTIME=rivet) goes to rivet. Pure mapping, never raises.
  • services/oss/src/agent/app.py:112make_harness is where an unsupported pairing actually raises. agenta on rivet is routed by the line above but rejected here. Routing and validation are split on purpose.
  • services/oss/src/agent/tools/gateway.py:123 — the gateway resolver validates the /tools/resolve response hard: exactly one spec per ref, no duplicates, every ref returned. Fail-fast, so an agent never runs missing a configured tool.
  • api/oss/src/core/tools/service.py:489resolve_agent_tools: builtins pass through as names; Composio refs are validated against project connections and enriched from the catalog before the model sees them. Returns a call_ref slug, never a provider key.
  • api/oss/src/core/tools/service.py:432resolve_connection_by_slug is the lookup+validation extracted from call_tool, now shared by the call path and the resolve path so both raise the same precise errors.
  • services/oss/src/agent/tracing.py:63record_usage stamps gen_ai.usage.* on the active /invoke span because the harness exports its own OTLP batch and the per-batch roll-up cannot bridge totals onto that span.

url = os.getenv("AGENTA_AGENT_PI_URL")
cwd = str(wrapper_dir())
use_rivet = (
runtime == "rivet"

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This boolean is the whole routing policy: agenta on a non-local sandbox routes to rivet here even though the Agenta harness can't run there yet. The rejection is deferred to make_harness below, so this function stays a pure mapping that the unit test can lock down. Worth a comment cross-link so a future reader doesn't assume routing here implies support.

raw_specs = payload.get("custom") if isinstance(payload, dict) else None
if not isinstance(raw_specs, list):
raw_specs = []
if len(raw_specs) != len(tools):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strict cardinality check: one spec per ref. Combined with the duplicate-ref guards above and the per-ref specs_by_reference.get below, the resolver refuses to run an agent with a tool catalog that doesn't exactly match what was requested. This is the fail-fast half of the secret-resolver asymmetry described in the PR body.

action.schemas.inputs if action.schemas and action.schemas.inputs else None
)
name = ref.name or f"{ref.integration}__{ref.action}"
call_ref = (

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call_ref slug is the trust boundary: the platform returns this, never the provider key, and /tools/call parses it back. Note it uses raw __ joins for name two lines up but . for the call_ref; _SLUG_SEGMENT_RE forbids __ inside a segment precisely so the __<->. round-trip in /tools/call can't corrupt the split.



def record_usage(usage: Optional[Dict[str, Any]]) -> None:
"""Stamp the agent's token/cost totals onto the active ``/invoke`` workflow span.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

record_usage writes gen_ai.usage.* directly on the active span instead of relying on the cumulative roll-up, because the harness exports its span tree in a separate OTLP batch. Without this the /invoke trace would show zero tokens/cost for the agent run. The early return on a missing total keeps it best-effort.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (2)
api/oss/tests/pytest/unit/tools/test_agent_resolution.py (1)

21-48: ⚡ Quick win

Add regression coverage for provider and payload-shape edge cases

Please add tests for: (1) a gateway tool with provider != "composio" being rejected, and (2) non-array tools payloads (e.g. {}) failing validation. These two cases directly guard the new resolution boundary behavior.

Also applies to: 50-79

services/oss/src/agent/config.py (1)

65-72: ⚡ Quick win

tools is typed too narrowly in load_config().

agent.json tools can be structured objects, but the local variable is annotated as List[str]. Aligning it with AgentConfig.tools avoids incorrect assumptions in future edits.

Suggested fix
-    tools: List[str] = []
+    tools: List[Any] = []

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 9c1f9ecf-9a11-41a4-9004-cc4fc8a74aaa

📥 Commits

Reviewing files that changed from the base of the PR and between b9e62f9 and 9587798.

📒 Files selected for processing (35)
  • api/oss/src/apis/fastapi/tools/models.py
  • api/oss/src/apis/fastapi/tools/router.py
  • api/oss/src/core/tools/dtos.py
  • api/oss/src/core/tools/exceptions.py
  • api/oss/src/core/tools/service.py
  • api/oss/tests/pytest/unit/tools/__init__.py
  • api/oss/tests/pytest/unit/tools/test_agent_resolution.py
  • services/entrypoints/main.py
  • services/oss/src/agent/__init__.py
  • services/oss/src/agent/app.py
  • services/oss/src/agent/client.py
  • services/oss/src/agent/config.py
  • services/oss/src/agent/schemas.py
  • services/oss/src/agent/secrets.py
  • services/oss/src/agent/tools/__init__.py
  • services/oss/src/agent/tools/gateway.py
  • services/oss/src/agent/tools/resolver.py
  • services/oss/src/agent/tools/secrets.py
  • services/oss/src/agent/tracing.py
  • services/oss/tests/pytest/integration/__init__.py
  • services/oss/tests/pytest/integration/agent/__init__.py
  • services/oss/tests/pytest/integration/agent/conftest.py
  • services/oss/tests/pytest/integration/agent/test_resolve_secrets_http.py
  • services/oss/tests/pytest/integration/agent/tools/__init__.py
  • services/oss/tests/pytest/integration/agent/tools/test_gateway_http.py
  • services/oss/tests/pytest/integration/agent/tools/test_secrets_http.py
  • services/oss/tests/pytest/unit/__init__.py
  • services/oss/tests/pytest/unit/agent/__init__.py
  • services/oss/tests/pytest/unit/agent/conftest.py
  • services/oss/tests/pytest/unit/agent/test_invoke_handler.py
  • services/oss/tests/pytest/unit/agent/test_secrets_mapping.py
  • services/oss/tests/pytest/unit/agent/test_select_backend.py
  • services/oss/tests/pytest/unit/agent/tools/__init__.py
  • services/oss/tests/pytest/unit/agent/tools/test_gateway_mapping.py
  • services/oss/tests/pytest/unit/agent/tools/test_resolution.py

Comment on lines +111 to +114
def _coerce_tools(cls, value: Any) -> List[AgentToolReference]:
try:
configs = coerce_tool_configs(value or []).tool_configs
except ToolConfigurationError as exc:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid falsy-coercing invalid payloads to an empty tools list

Using value or [] makes invalid falsy payloads (e.g. {}, 0, False) pass as [], so malformed requests can be silently accepted.

Suggested fix
     def _coerce_tools(cls, value: Any) -> List[AgentToolReference]:
+        if value is None:
+            raw_values = []
+        elif isinstance(value, list):
+            raw_values = value
+        else:
+            raise ValueError("tools must be an array")
         try:
-            configs = coerce_tool_configs(value or []).tool_configs
+            configs = coerce_tool_configs(raw_values).tool_configs
         except ToolConfigurationError as exc:
             raise ValueError(str(exc)) from exc

Comment on lines +521 to +562
async def _resolve_composio_tool(
self,
*,
project_id: UUID,
ref: AgentComposioTool,
) -> ResolvedAgentTool:
provider_key = ToolProviderKind.COMPOSIO.value

for segment in (ref.integration, ref.action, ref.connection):
if not _SLUG_SEGMENT_RE.match(segment):
raise ToolSlugInvalidError(
slug=f"{provider_key}.{ref.integration}.{ref.action}.{ref.connection}",
detail=f"Invalid slug segment: {segment!r}",
)

# Fail fast if the connection is missing/inactive/invalid for this project.
await self.resolve_connection_by_slug(
project_id=project_id,
provider_key=provider_key,
integration_key=ref.integration,
connection_slug=ref.connection,
)

action = await self.get_action(
provider_key=provider_key,
integration_key=ref.integration,
action_key=ref.action,
)
if not action:
raise ActionNotFoundError(
provider_key=provider_key,
integration_key=ref.integration,
action_key=ref.action,
)

input_schema = (
action.schemas.inputs if action.schemas and action.schemas.inputs else None
)
name = ref.name or f"{ref.integration}__{ref.action}"
call_ref = (
f"tools.{provider_key}.{ref.integration}.{ref.action}.{ref.connection}"
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate gateway provider before composing a composio call-ref

_resolve_composio_tool always forces provider_key="composio" but never checks ref.provider. A gateway reference with a different provider is currently accepted and silently remapped into the composio namespace, which can resolve/execute against the wrong integration space.

Suggested fix
 async def _resolve_composio_tool(
     self,
     *,
     project_id: UUID,
     ref: AgentComposioTool,
 ) -> ResolvedAgentTool:
     provider_key = ToolProviderKind.COMPOSIO.value
+    if ref.provider != provider_key:
+        raise ToolSlugInvalidError(
+            slug=f"{ref.provider}.{ref.integration}.{ref.action}.{ref.connection}",
+            detail=f"Unsupported gateway provider for /tools/resolve: {ref.provider!r}",
+        )

from agenta.sdk.engines.tracing.propagation import inject

# Budget for a backend round-trip (the tool catalog/connection check, the vault fetch).
TOOLS_TIMEOUT = float(os.getenv("AGENTA_AGENT_TOOLS_TIMEOUT", "30"))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard TOOLS_TIMEOUT parsing to avoid import-time crashes.

A non-numeric AGENTA_AGENT_TOOLS_TIMEOUT will raise ValueError at import time and fail the agent app startup path.

Suggested fix
-TOOLS_TIMEOUT = float(os.getenv("AGENTA_AGENT_TOOLS_TIMEOUT", "30"))
+try:
+    TOOLS_TIMEOUT = float(os.getenv("AGENTA_AGENT_TOOLS_TIMEOUT", "30"))
+except ValueError:
+    TOOLS_TIMEOUT = 30.0
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
TOOLS_TIMEOUT = float(os.getenv("AGENTA_AGENT_TOOLS_TIMEOUT", "30"))
try:
TOOLS_TIMEOUT = float(os.getenv("AGENTA_AGENT_TOOLS_TIMEOUT", "30"))
except ValueError:
TOOLS_TIMEOUT = 30.0

Comment on lines +42 to +57
log.warning(
"agent: named-secret resolve HTTP %s for %s",
response.status_code,
names,
)
return {}
data = response.json() or {}
except Exception: # pylint: disable=broad-except
log.warning("agent: named-secret resolve failed for %s", names, exc_info=True)
return {}

resolved = data.get("secrets") if isinstance(data, dict) else None
resolved = resolved if isinstance(resolved, dict) else {}
missing = [name for name in names if name not in resolved]
if missing:
log.warning("agent: unresolved named secret(s): %s", missing)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid logging raw secret names in warning paths.

Line 43, Line 50, and Line 57 log secret identifiers directly. Even without values, names can expose internal integrations and tenant metadata in shared logs.

Proposed patch
-        if response.status_code >= 400:
-            log.warning(
-                "agent: named-secret resolve HTTP %s for %s",
-                response.status_code,
-                names,
-            )
+        if response.status_code >= 400:
+            log.warning(
+                "agent: named-secret resolve HTTP %s for %d requested secret(s)",
+                response.status_code,
+                len(names),
+            )
             return {}
@@
-    except Exception:  # pylint: disable=broad-except
-        log.warning("agent: named-secret resolve failed for %s", names, exc_info=True)
+    except Exception:  # pylint: disable=broad-except
+        log.warning(
+            "agent: named-secret resolve failed for %d requested secret(s)",
+            len(names),
+            exc_info=True,
+        )
         return {}
@@
-    if missing:
-        log.warning("agent: unresolved named secret(s): %s", missing)
+    if missing:
+        log.warning("agent: unresolved named secret count=%d", len(missing))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
log.warning(
"agent: named-secret resolve HTTP %s for %s",
response.status_code,
names,
)
return {}
data = response.json() or {}
except Exception: # pylint: disable=broad-except
log.warning("agent: named-secret resolve failed for %s", names, exc_info=True)
return {}
resolved = data.get("secrets") if isinstance(data, dict) else None
resolved = resolved if isinstance(resolved, dict) else {}
missing = [name for name in names if name not in resolved]
if missing:
log.warning("agent: unresolved named secret(s): %s", missing)
log.warning(
"agent: named-secret resolve HTTP %s for %d requested secret(s)",
response.status_code,
len(names),
)
return {}
data = response.json() or {}
except Exception: # pylint: disable=broad-except
log.warning(
"agent: named-secret resolve failed for %d requested secret(s)",
len(names),
exc_info=True,
)
return {}
resolved = data.get("secrets") if isinstance(data, dict) else None
resolved = resolved if isinstance(resolved, dict) else {}
missing = [name for name in names if name not in resolved]
if missing:
log.warning("agent: unresolved named secret count=%d", len(missing))

Comment on lines +53 to +60
resolved = data.get("secrets") if isinstance(data, dict) else None
resolved = resolved if isinstance(resolved, dict) else {}
missing = [name for name in names if name not in resolved]
if missing:
log.warning("agent: unresolved named secret(s): %s", missing)
return {
str(key): str(value) for key, value in resolved.items() if value is not None
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restrict returned secrets to the requested name set.

Line 59 currently returns every key in resolved. If upstream returns extras, this path propagates unrequested secrets into runtime memory.

Proposed patch
-    return {
-        str(key): str(value) for key, value in resolved.items() if value is not None
-    }
+    requested = {str(name) for name in names}
+    return {
+        str(key): str(value)
+        for key, value in resolved.items()
+        if value is not None and str(key) in requested
+    }

Comment on lines +70 to +83
if not usage or not usage.get("total"):
return
try:
span = otel_trace.get_current_span()
input_tokens = int(usage.get("input") or 0)
output_tokens = int(usage.get("output") or 0)
span.set_attribute("gen_ai.usage.input_tokens", input_tokens)
span.set_attribute("gen_ai.usage.output_tokens", output_tokens)
span.set_attribute("gen_ai.usage.prompt_tokens", input_tokens)
span.set_attribute("gen_ai.usage.completion_tokens", output_tokens)
span.set_attribute("gen_ai.usage.total_tokens", int(usage.get("total") or 0))
cost = usage.get("cost")
if cost:
span.set_attribute("gen_ai.usage.cost", float(cost))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Zero-valued usage metrics are currently dropped.

Using truthy checks skips valid 0 totals/costs, so traces can’t distinguish “zero usage” from “missing usage”.

Suggested fix
-    if not usage or not usage.get("total"):
+    if not usage or usage.get("total") is None:
         return
@@
-        cost = usage.get("cost")
-        if cost:
-            span.set_attribute("gen_ai.usage.cost", float(cost))
+        cost = usage.get("cost")
+        if cost is not None:
+            span.set_attribute("gen_ai.usage.cost", float(cost))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if not usage or not usage.get("total"):
return
try:
span = otel_trace.get_current_span()
input_tokens = int(usage.get("input") or 0)
output_tokens = int(usage.get("output") or 0)
span.set_attribute("gen_ai.usage.input_tokens", input_tokens)
span.set_attribute("gen_ai.usage.output_tokens", output_tokens)
span.set_attribute("gen_ai.usage.prompt_tokens", input_tokens)
span.set_attribute("gen_ai.usage.completion_tokens", output_tokens)
span.set_attribute("gen_ai.usage.total_tokens", int(usage.get("total") or 0))
cost = usage.get("cost")
if cost:
span.set_attribute("gen_ai.usage.cost", float(cost))
if not usage or usage.get("total") is None:
return
try:
span = otel_trace.get_current_span()
input_tokens = int(usage.get("input") or 0)
output_tokens = int(usage.get("output") or 0)
span.set_attribute("gen_ai.usage.input_tokens", input_tokens)
span.set_attribute("gen_ai.usage.output_tokens", output_tokens)
span.set_attribute("gen_ai.usage.prompt_tokens", input_tokens)
span.set_attribute("gen_ai.usage.completion_tokens", output_tokens)
span.set_attribute("gen_ai.usage.total_tokens", int(usage.get("total") or 0))
cost = usage.get("cost")
if cost is not None:
span.set_attribute("gen_ai.usage.cost", float(cost))

@mmabrouk mmabrouk left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codex subagent review for #4772

Blocking finding:

  • services/oss/src/agent/app.py:63 / services/oss/src/agent/config.py:40: the default local path selects InProcessPiBackend(url=None, cwd=wrapper_dir()), and wrapper_dir() defaults to services/agent. At this PR's head (feat/agent-service), services/agent/src/cli.ts, services/agent/package.json, and services/agent/src/server.ts are not present; those files land in the runner PRs (#4773/#4778). After merging only #4771 + #4772, the newly mounted /agent/v0/invoke path will try to spawn pnpm exec tsx src/cli.ts from a missing package whenever AGENTA_AGENT_PI_URL is unset, so the service is not independently runnable against its declared base. The HTTP sidecar path also depends on the runner/hosting stack being present. Please either stack/retarget this PR on the runner PR that provides the runtime, include the runtime dependency here, or gate /agent/v0/backend selection until a runner package or sidecar URL is available, returning a clear 503/validation error with coverage.

Related cross-PR issue:

  • services/oss/src/agent/app.py:149 / services/oss/src/agent/schemas.py:36: related to the #4775 frontend path, this PR mounts a directly registered workflow and the comment explicitly says there is no first-class builtin URI yet (agenta:builtin:agent:v0). /inspect can describe the mounted handler, but I do not see a catalog template/registration that would make an unconditional frontend "create Agent" option work. Please either add that backend surface here, gate the frontend create option until it lands, or update the PR/stack docs so #4775 does not depend on a template this PR does not provide.

I also checked the tool resolver/call-path refactor, vault/gateway adapters, and streaming setup/cleanup path from the patch and did not find another blocker. I did not run tests; this review is based on the GitHub PR metadata, patches, and branch files. The PR body's stack map still points to #4774/#4777 while the newer runner/docs PRs are #4778/#4779, which is contributing to the dependency confusion above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Backend feature python Pull requests that update Python code size:XXL This PR changes 1000+ lines, ignoring generated files. tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant