From 9f4a48451fa147f44aab15c689a56a3479af72ed Mon Sep 17 00:00:00 2001 From: Sairen Christian Buerano <122254250+sairenchristianbuerano@users.noreply.github.com> Date: Thu, 28 May 2026 21:19:36 +0800 Subject: [PATCH 1/6] chore: add repo hygiene policy --- google_adk/repo_hygiene.yaml | 34 ++++++++++++++++++++++++++++++++++ openai_sdk/repo_hygiene.yaml | 35 +++++++++++++++++++++++++++++++++++ 2 files changed, 69 insertions(+) create mode 100644 google_adk/repo_hygiene.yaml create mode 100644 openai_sdk/repo_hygiene.yaml diff --git a/google_adk/repo_hygiene.yaml b/google_adk/repo_hygiene.yaml new file mode 100644 index 0000000..b6f57c7 --- /dev/null +++ b/google_adk/repo_hygiene.yaml @@ -0,0 +1,34 @@ +policy: + id: google_adk_repo_hygiene + name: Google ADK repo hygiene + category: google_adk + description: Repo-scoped hygiene rules for projects that use the Google ADK. Fire + once per scan against the repo manifest and inventory rather than per tool or + agent. +rules: +- id: ADK-201 + title: Google ADK project missing CLAUDE.md + scope: repo + severity: low + confidence: 0.9 + language: python + applies_to: + - google_adk + match: + all: + - repo_has_sdk_in_code: + - google_adk + - not: + repo_component_present: + - claude_md + explanation: The project uses the Google ADK in code but ships no CLAUDE.md at the + repo root. When Claude Code (or any agent following the CLAUDE.md convention) + edits this repo, it has no project-specific guidance on agent-class choices (LlmAgent + vs SequentialAgent vs ParallelAgent vs LoopAgent), sub_agents composition, FunctionTool + wrapping conventions, required guardrails, or the local test and build commands. + The likely consequence is generated code that violates the project's tool and + agent contracts because nothing in-tree teaches the agent the local rules. + fix: Add a CLAUDE.md at the repo root. 
State which ADK agent classes the project + uses and why, how tools must be wrapped, any required guardrails or sandboxing, + and the exact test, lint, and build commands. Keep it short and concrete so an + editing agent can act on it without re-deriving the conventions. diff --git a/openai_sdk/repo_hygiene.yaml b/openai_sdk/repo_hygiene.yaml new file mode 100644 index 0000000..193628d --- /dev/null +++ b/openai_sdk/repo_hygiene.yaml @@ -0,0 +1,35 @@ +policy: + id: openai_sdk_repo_hygiene + name: OpenAI Agents SDK repo hygiene + category: openai_sdk + description: Repo-scoped hygiene rules for projects that use the OpenAI Agents SDK. + Fire once per scan against the repo manifest and inventory rather than per tool + or agent. +rules: +- id: OAI-202 + title: OpenAI Agents project missing CLAUDE.md + scope: repo + severity: low + confidence: 0.9 + language: python + applies_to: + - openai_agents + match: + all: + - repo_has_sdk_in_code: + - openai_agents + - not: + repo_component_present: + - claude_md + explanation: The project uses the OpenAI Agents SDK in code but ships no CLAUDE.md + at the repo root. When Claude Code (or any agent honoring the CLAUDE.md convention) + edits this repo, it has no project-specific guidance on Agent vs SandboxAgent + choice, handoff topology, required input/output guardrails, tool_choice settings, + or the local test and build commands. The likely consequence is generated code + that bypasses the project's safety contracts because nothing in-tree teaches the + agent the local rules. + fix: Add a CLAUDE.md at the repo root. State whether the project uses Agent or SandboxAgent, + list required guardrails (input_guardrails, output_guardrails) and tool_choice + conventions, note any handoff or tracing policy, and give the exact test, lint, + and build commands. Keep it short and concrete so an editing agent can act on + it without re-deriving the conventions. 
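Reviewer note on the CLAUDE.md rules in this patch: the `match` block combines `repo_has_sdk_in_code` with a negated `repo_component_present`. A minimal sketch of that condition follows, assuming the engine's inventory reduces to a set of detected SDK names and that `claude_md` means a file named CLAUDE.md at the repo root. The function name and parameters are hypothetical and are not the scanner's real API.

```python
from pathlib import Path


def missing_claude_md(repo_root: str, sdks_in_code: set[str], sdk: str) -> bool:
    """Return True when `sdk` is used in code and no CLAUDE.md sits at the repo root.

    Assumption: `sdks_in_code` stands in for whatever the engine's
    `repo_has_sdk_in_code` predicate reads from its inventory.
    """
    uses_sdk = sdk in sdks_in_code
    has_claude_md = (Path(repo_root) / "CLAUDE.md").is_file()
    # Mirrors `all: [repo_has_sdk_in_code, not: repo_component_present]`.
    return uses_sdk and not has_claude_md
```

Under these assumptions the rule fires once per repo, which matches the `scope: repo` setting rather than a per-tool or per-agent check.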
From 37a229c36512f310899dabd1efb646a5a2b74dc8 Mon Sep 17 00:00:00 2001 From: Sairen Christian Buerano <122254250+sairenchristianbuerano@users.noreply.github.com> Date: Thu, 28 May 2026 21:28:38 +0800 Subject: [PATCH 2/6] chore: repo hygiene policy added --- google_adk/repo_hygiene.yaml | 61 +++++++++++++++++++--------------- openai_sdk/repo_hygiene.yaml | 64 ++++++++++++++++++++---------------- 2 files changed, 69 insertions(+), 56 deletions(-) diff --git a/google_adk/repo_hygiene.yaml b/google_adk/repo_hygiene.yaml index b6f57c7..ce41747 100644 --- a/google_adk/repo_hygiene.yaml +++ b/google_adk/repo_hygiene.yaml @@ -2,33 +2,40 @@ policy: id: google_adk_repo_hygiene name: Google ADK repo hygiene category: google_adk - description: Repo-scoped hygiene rules for projects that use the Google ADK. Fire - once per scan against the repo manifest and inventory rather than per tool or + description: > + Repo-scoped hygiene rules for projects that use the Google ADK. Fire once + per scan against the repo manifest and inventory rather than per tool or agent. + rules: -- id: ADK-201 - title: Google ADK project missing CLAUDE.md - scope: repo - severity: low - confidence: 0.9 - language: python - applies_to: - - google_adk - match: - all: - - repo_has_sdk_in_code: + - id: ADK-201 + title: Google ADK project missing CLAUDE.md + severity: low + confidence: 0.9 + language: python + applies_to: - google_adk - - not: - repo_component_present: - - claude_md - explanation: The project uses the Google ADK in code but ships no CLAUDE.md at the - repo root. When Claude Code (or any agent following the CLAUDE.md convention) - edits this repo, it has no project-specific guidance on agent-class choices (LlmAgent - vs SequentialAgent vs ParallelAgent vs LoopAgent), sub_agents composition, FunctionTool - wrapping conventions, required guardrails, or the local test and build commands. 
- The likely consequence is generated code that violates the project's tool and - agent contracts because nothing in-tree teaches the agent the local rules. - fix: Add a CLAUDE.md at the repo root. State which ADK agent classes the project - uses and why, how tools must be wrapped, any required guardrails or sandboxing, - and the exact test, lint, and build commands. Keep it short and concrete so an - editing agent can act on it without re-deriving the conventions. + scope: repo + match: + all: + - repo_has_sdk_in_code: + - google_adk + - not: + repo_component_present: + - claude_md + explanation: > + The project uses the Google ADK in code but ships no CLAUDE.md at the + repo root. When Claude Code (or any agent following the CLAUDE.md + convention) edits this repo, it has no project-specific guidance on + agent-class choices (LlmAgent vs SequentialAgent vs ParallelAgent vs + LoopAgent), sub_agents composition, FunctionTool wrapping conventions, + required guardrails, or the local test and build commands. The likely + consequence is generated code that violates the project's tool and + agent contracts because nothing in-tree teaches the agent the local + rules. + fix: > + Add a CLAUDE.md at the repo root. State which ADK agent classes the + project uses and why, how tools must be wrapped, any required guardrails + or sandboxing, and the exact test, lint, and build commands. Keep it + short and concrete so an editing agent can act on it without re-deriving + the conventions. diff --git a/openai_sdk/repo_hygiene.yaml b/openai_sdk/repo_hygiene.yaml index 193628d..4622e1c 100644 --- a/openai_sdk/repo_hygiene.yaml +++ b/openai_sdk/repo_hygiene.yaml @@ -2,34 +2,40 @@ policy: id: openai_sdk_repo_hygiene name: OpenAI Agents SDK repo hygiene category: openai_sdk - description: Repo-scoped hygiene rules for projects that use the OpenAI Agents SDK. - Fire once per scan against the repo manifest and inventory rather than per tool - or agent. 
+ description: > + Repo-scoped hygiene rules for projects that use the OpenAI Agents SDK. + Fire once per scan against the repo manifest and inventory rather than per + tool or agent. + rules: -- id: OAI-202 - title: OpenAI Agents project missing CLAUDE.md - scope: repo - severity: low - confidence: 0.9 - language: python - applies_to: - - openai_agents - match: - all: - - repo_has_sdk_in_code: + - id: OAI-202 + title: OpenAI Agents project missing CLAUDE.md + severity: low + confidence: 0.9 + language: python + applies_to: - openai_agents - - not: - repo_component_present: - - claude_md - explanation: The project uses the OpenAI Agents SDK in code but ships no CLAUDE.md - at the repo root. When Claude Code (or any agent honoring the CLAUDE.md convention) - edits this repo, it has no project-specific guidance on Agent vs SandboxAgent - choice, handoff topology, required input/output guardrails, tool_choice settings, - or the local test and build commands. The likely consequence is generated code - that bypasses the project's safety contracts because nothing in-tree teaches the - agent the local rules. - fix: Add a CLAUDE.md at the repo root. State whether the project uses Agent or SandboxAgent, - list required guardrails (input_guardrails, output_guardrails) and tool_choice - conventions, note any handoff or tracing policy, and give the exact test, lint, - and build commands. Keep it short and concrete so an editing agent can act on - it without re-deriving the conventions. + scope: repo + match: + all: + - repo_has_sdk_in_code: + - openai_agents + - not: + repo_component_present: + - claude_md + explanation: > + The project uses the OpenAI Agents SDK in code but ships no CLAUDE.md at + the repo root. 
When Claude Code (or any agent honoring the CLAUDE.md + convention) edits this repo, it has no project-specific guidance on + Agent vs SandboxAgent choice, handoff topology, required input/output + guardrails, tool_choice settings, or the local test and build commands. + The likely consequence is generated code that bypasses the project's + safety contracts because nothing in-tree teaches the agent the local + rules. + fix: > + Add a CLAUDE.md at the repo root. State whether the project uses Agent + or SandboxAgent, list required guardrails (input_guardrails, + output_guardrails) and tool_choice conventions, note any handoff or + tracing policy, and give the exact test, lint, and build commands. Keep + it short and concrete so an editing agent can act on it without + re-deriving the conventions. From ca7e11fcd07ac451486082ce3a155fff81ebf096 Mon Sep 17 00:00:00 2001 From: Sairen Christian Buerano <122254250+sairenchristianbuerano@users.noreply.github.com> Date: Thu, 28 May 2026 21:32:18 +0800 Subject: [PATCH 3/6] chore: added google adk agent safety policy --- google_adk/agent_safety.yaml | 2 +- google_adk/builtin_tools.yaml | 37 ----------------------------------- 2 files changed, 1 insertion(+), 38 deletions(-) delete mode 100644 google_adk/builtin_tools.yaml diff --git a/google_adk/agent_safety.yaml b/google_adk/agent_safety.yaml index 09f6828..af85ac6 100644 --- a/google_adk/agent_safety.yaml +++ b/google_adk/agent_safety.yaml @@ -124,7 +124,7 @@ rules: match: all: - agent_class: [LlmAgent] - - agent_uses_hosted_tool_class: [google_search, VertexAISearch] + - agent_uses_hosted_tool_class: [GoogleSearchTool, VertexAiSearchTool] - agent_kwarg_missing: [before_tool_callback] explanation: > This LlmAgent is wired with a web search built-in (google_search or diff --git a/google_adk/builtin_tools.yaml b/google_adk/builtin_tools.yaml deleted file mode 100644 index e6846a4..0000000 --- a/google_adk/builtin_tools.yaml +++ /dev/null @@ -1,37 +0,0 @@ -policy: - id: 
google_adk_builtin_tools - name: ADK built-in tool safety configuration - category: google_adk - description: > - Rules that flag unsafe configurations on Google ADK built-in tool - classes (BashTool, etc.). Their safety kwargs must be set explicitly — - the defaults are permissive. - -rules: - - id: ADK-008 - title: BashTool missing shell metacharacter blocking - severity: high - confidence: 0.9 - language: python - applies_to: - - adk_function_tool - scope: tool - match: - all: - - name_in: - - BashTool - - not: - tool_decorator_kwarg_value: - kwarg: block_shell_metacharacters - value: "True" - explanation: > - BashTool executes shell commands on behalf of the model. Without - block_shell_metacharacters=True, model-supplied input can contain shell - metacharacters (;, |, &&, $(...), backticks) that rewrite the intended - command into an arbitrary shell injection. This parameter was added in - google-adk v1.34 and defaults to False — omitting it leaves the shell - fully injectable. - fix: > - Pass block_shell_metacharacters=True to the BashTool constructor. If the - tool legitimately needs metacharacters for a specific use case, document - the threat model and pair it with an explicit command allowlist. 
From 81f3ed3bfc9b034bb4c89c26bd226818e4bf6a49 Mon Sep 17 00:00:00 2001 From: Sairen Christian Buerano <122254250+sairenchristianbuerano@users.noreply.github.com> Date: Fri, 29 May 2026 08:10:27 +0800 Subject: [PATCH 4/6] chore: repo hygiene policy --- claude_sdk/repo_hygiene.yaml | 44 +++++++++++++++++++++++++++++++++ claude_sdk/tool_definition.yaml | 32 ++++++++++++++++++++++++ google_adk/tool_definition.yaml | 26 +++++++++++++++++++ openai_sdk/network.yaml | 31 +++++++++++++++++++++++ 4 files changed, 133 insertions(+) create mode 100644 claude_sdk/repo_hygiene.yaml diff --git a/claude_sdk/repo_hygiene.yaml b/claude_sdk/repo_hygiene.yaml new file mode 100644 index 0000000..adfe259 --- /dev/null +++ b/claude_sdk/repo_hygiene.yaml @@ -0,0 +1,44 @@ +policy: + id: claude_sdk_repo_hygiene + name: Repo hygiene + category: claude_sdk + description: > + Repository-scope rules covering the project-level scaffolding a Claude + Agent SDK codebase should ship with — agent-facing READMEs, hook scripts, + sandbox policies, and other components a Claude session relies on at cold + start. + +rules: + - id: CSDK-201 + title: Repo ships Claude Agent SDK code without a CLAUDE.md + severity: low + confidence: 0.9 + language: python + applies_to: + - claude_sdk + scope: repo + match: + all: + - repo_has_sdk_in_code: + - claude_agent_sdk + - not: + repo_component_present: + - claude_md + explanation: > + A repository that builds on the Claude Agent SDK but ships no top-level + CLAUDE.md leaves any Claude session that opens the repo with no + project-specific guidance. The conventions, build commands, test + runners, lint scripts, and "do not touch" boundaries a human maintainer + would describe in onboarding are absent, so the agent has to infer them + from the source on every session and frequently guesses wrong — + bypassing the project's lint, picking the wrong test command, or + violating commit conventions that were never written down anywhere it + could read. 
The cost compounds: each new contributor (human or agent) + reinvents the same wrong assumptions. + fix: > + Add a CLAUDE.md at the repository root documenting how to build, test, + and lint the project, the coding conventions Claude must respect, the + directories or files Claude must not modify, and any project-specific + safety guardrails (e.g. "never run migrations", "never push to main"). + Treat it as the project's agent-facing README and keep it under version + control so reviewers see drift. diff --git a/claude_sdk/tool_definition.yaml b/claude_sdk/tool_definition.yaml index 08d1900..228b488 100644 --- a/claude_sdk/tool_definition.yaml +++ b/claude_sdk/tool_definition.yaml @@ -74,3 +74,35 @@ rules: refuse to call it at all. fix: > Rename to a verb-object form, e.g. `summarize_invoice`, `refund_charge`. + + - id: CSDK-008 + title: Tool exposes **kwargs without explicit input_schema + severity: medium + confidence: 0.8 + language: python + applies_to: + - claude_sdk_tool + scope: tool + match: + all: + - param_name_matches: + exact: + - kwargs + - not: + tool_decorator_kwarg_present: + - input_schema + explanation: > + The Claude Agent SDK derives a tool's JSON input schema from the + function signature. A tool whose accepted arguments live entirely under + `**kwargs` has no typed surface for the SDK to introspect, so the model + sees an empty parameter object and gets no signal about which keys to + send. Calls then either omit data the body requires or invent keys the + body does not handle, and the failure surfaces as a runtime KeyError at + invoke time instead of a clean schema-validation error before the tool + ever runs. 
+ fix: > + Either declare each accepted parameter on the function signature with a + type annotation so the SDK can derive the schema, or pass an explicit + `input_schema=` to the `@tool` decorator — either a JSON Schema dict or + a Pydantic `BaseModel` subclass — so the contract published to the model + matches what the body actually reads from `kwargs`. diff --git a/google_adk/tool_definition.yaml b/google_adk/tool_definition.yaml index 9436f10..aa7d51f 100644 --- a/google_adk/tool_definition.yaml +++ b/google_adk/tool_definition.yaml @@ -76,3 +76,29 @@ rules: refuse to call it at all when a better-named alternative is present. fix: > Rename to a verb-object form, e.g. `summarize_document`, `fetch_order`. + + - id: ADK-008 + title: FunctionTool body prints to stdout + severity: low + confidence: 0.7 + language: python + applies_to: + - adk_function_tool + scope: tool + match: + has_body_text: + - print( + explanation: > + Google ADK FunctionTool runs inside the agent's process and shares its + stdout with the runtime. A wrapped function that calls `print(...)` for + debug tracing leaks raw arguments (paths, IDs, decoded blob contents) + into the same stream the runner uses for structured logs, mangles JSON + log lines, and can echo secrets pulled from `tool_context.state` into + terminal scrollback or container log shippers. The output is also not + addressable by the model — `print` writes go to the process, not to the + tool response, so the operator sees noise while the agent sees nothing. + fix: > + Delete debug `print` calls before shipping, or replace them with a + module logger (`logging.getLogger(__name__).debug(...)`) so the operator + can silence them with a level switch. If the data is meant for the + model, return it as part of the tool's structured result instead. 
diff --git a/openai_sdk/network.yaml b/openai_sdk/network.yaml index 0a69625..5da426f 100644 --- a/openai_sdk/network.yaml +++ b/openai_sdk/network.yaml @@ -67,3 +67,34 @@ rules: every urllib.request.urlopen call. Surface timeouts as a structured tool error the model can react to using failure_error_function (see OAI-004). OAI-005 covers the equivalent gap for requests/httpx callees. + + - id: OAI-014 + title: Tool builds outbound URL from non-literal value + severity: medium + confidence: 0.55 + language: python + applies_to: + - openai_tool + scope: tool + match: + has_dynamic_url_call: true + explanation: > + An OpenAI Agents SDK tool builds its outbound HTTP URL from a + non-literal value — typically a tool parameter interpolated into an + f-string or concatenated onto a base URL. Because tool arguments are + produced by the model from conversation context (including prior tool + output and user input), an attacker who can shape that context can + steer the request to an attacker-controlled host or to an internal + address the agent's network egress can reach but the model was never + meant to touch. The same channel also leaks request bodies (auth + headers, JSON payloads, the model's reasoning) to whichever host the + URL resolves to, so the failure mode is both SSRF and data + exfiltration in one call. + fix: > + Treat the model-supplied value as untrusted input to URL construction. + Validate it against an allow-list of permitted hosts/path segments + before any HTTP client touches it, or look the value up against a + server-side registry and build the URL from the registry's trusted + entry. If the value is meant to be an opaque ID (e.g. a connection_id), + pass it as a query parameter or path segment of a fixed base URL and + reject characters that would let it escape that slot. 
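Reviewer note on OAI-014: the fix text names two remedies, a host allow-list and a fixed base URL with the model-supplied ID confined to one slot. A minimal sketch of both using only the standard library is below. `ALLOWED_HOSTS`, `BASE_URL`, and both function names are hypothetical placeholders, not part of the OpenAI Agents SDK or of this policy pack.

```python
import re
from urllib.parse import quote, urlsplit

# Hypothetical values; a real project would load its own allow-list.
ALLOWED_HOSTS = frozenset({"api.example.com"})
BASE_URL = "https://api.example.com/v1/connections/"

# Conservative shape for an opaque ID: ASCII letters, digits, "_" and "-".
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def validate_outbound_url(url: str) -> str:
    """Return `url` unchanged if it is https and targets an allow-listed host."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    # `hostname` excludes userinfo and port, so "allowed.host@evil.test" fails.
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"host not allowed: {parts.hostname!r}")
    return url


def build_connection_url(connection_id: str) -> str:
    """Place an opaque ID into a single path segment of a fixed base URL."""
    if not _ID_PATTERN.fullmatch(connection_id):
        raise ValueError("connection_id must match [A-Za-z0-9_-]+")
    # quote(..., safe="") also escapes "/", keeping the value inside its slot.
    return BASE_URL + quote(connection_id, safe="")
```

A tool body would call one of these before handing the URL to any HTTP client. The sketch does not cover redirects or DNS rebinding, which would need separate handling.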
From 7282240a8557bcd5e82dec870a20c08918f47eae Mon Sep 17 00:00:00 2001 From: Sairen Christian Buerano <122254250+sairenchristianbuerano@users.noreply.github.com> Date: Fri, 29 May 2026 10:45:12 +0800 Subject: [PATCH 5/6] chore: add new rules for openai_sdk code_exec idempotency and network --- openai_sdk/code_execution.yaml | 104 ++++++++++------ openai_sdk/idempotency.yaml | 119 ++++++++++++------ openai_sdk/network.yaml | 220 +++++++++++++++++++-------------- 3 files changed, 279 insertions(+), 164 deletions(-) diff --git a/openai_sdk/code_execution.yaml b/openai_sdk/code_execution.yaml index 6640941..f553450 100644 --- a/openai_sdk/code_execution.yaml +++ b/openai_sdk/code_execution.yaml @@ -2,40 +2,74 @@ policy: id: openai_sdk_code_execution name: OpenAI Agents SDK dynamic code execution category: openai_sdk - description: > - Rules covering tool bodies that evaluate dynamic Python code. eval/exec/ - compile inside an @function_tool turns the agent process into an - arbitrary-code-execution surface when any input flows from the model. + description: 'Rules covering tool bodies that evaluate dynamic Python code. eval/exec/ + compile inside an @function_tool turns the agent process into an arbitrary-code-execution + surface when any input flows from the model. + ' rules: - - id: OAI-013 - title: Tool body calls eval/exec/compile on dynamic input - severity: high - confidence: 0.9 - language: python - applies_to: - - openai_tool - scope: tool - match: - has_body_text: - - exec( - - eval( - - compile( - explanation: > - The @function_tool body invokes Python's eval, exec, or compile. When - any portion of the input flows from the model (directly or via session - state the model writes to), the call is arbitrary-code-execution inside - the agent process. Unlike subprocess, no OS-level sandbox stands between - the call and the agent's full Python runtime: imports, file handles, - and credentials in memory are all reachable from inside the exec'd - string. 
Even an exec into a restricted globals dict is escapable via - the __builtins__ chain unless explicitly stripped, and a model that - controls the body of the exec'd code controls the agent. - fix: > - Remove eval/exec/compile from the tool. Replace dynamic evaluation with - a constrained interpreter (asteval, RestrictedPython) only if the use - case truly requires expression evaluation; for arithmetic, - ast.literal_eval is safe. If the tool is intentionally a code-execution - surface (e.g. notebook-style), run the exec inside a separate process - with seccomp + no network + no filesystem write, and treat the process - as a single-use sacrificial sandbox. +- id: OAI-013 + title: Tool body calls eval/exec/compile on dynamic input + severity: high + confidence: 0.9 + language: python + applies_to: + - openai_tool + scope: tool + match: + has_body_text: + - exec( + - eval( + - compile( + explanation: 'The @function_tool body invokes Python''s eval, exec, or compile. + When any portion of the input flows from the model (directly or via session state + the model writes to), the call is arbitrary-code-execution inside the agent process. + Unlike subprocess, no OS-level sandbox stands between the call and the agent''s + full Python runtime: imports, file handles, and credentials in memory are all + reachable from inside the exec''d string. Even an exec into a restricted globals + dict is escapable via the __builtins__ chain unless explicitly stripped, and a + model that controls the body of the exec''d code controls the agent. + + ' + fix: 'Remove eval/exec/compile from the tool. Replace dynamic evaluation with a + constrained interpreter (asteval, RestrictedPython) only if the use case truly + requires expression evaluation; for parsing literal values, ast.literal_eval is safe + (it does not evaluate arithmetic expressions). If the + tool is intentionally a code-execution surface (e.g.
notebook-style), run the + exec inside a separate process with seccomp + no network + no filesystem write, + and treat the process as a single-use sacrificial sandbox. + + ' +- id: OAI-017 + title: TypeScript tool body calls eval / new Function on dynamic input + scope: tool + severity: high + confidence: 0.9 + language: typescript + applies_to: + - openai_tool + match: + has_body_text: + - eval( + - new Function( + - Function( + explanation: "A TypeScript tool body invokes `eval()` or constructs a `new Function(...)`\ + \ from a string. When any portion of that string flows from the model (directly,\ + \ via tool arguments, or via session state the model writes to), the call is arbitrary-code-execution\ + \ inside the agent's Node / Worker / browser runtime. No OS-level sandbox stands\ + \ between the call and the runtime's full capabilities: file I/O, env-var access,\ + \ fetch credentials, and the agent's own keys in memory are all reachable from\ + \ the evaluated string. Even a `with({})`-restricted scope is escapable through\ + \ globalThis, prototype chains, and the platform API surface unless explicitly\ + \ stripped. The miner-flagged `calculate` tool in the Claude SDK template feeds\ + \ a tool argument straight into `eval` to act as a math evaluator \u2014 the canonical\ + \ worst-case shape. Provisional: this rule loads and validates today but will\ + \ not fire until the engine's TypeScript tool parser ships.\n" + fix: 'Remove `eval` and `new Function` from the tool. For arithmetic, parse the + input with a hardened expression library (`mathjs`, `expr-eval`) configured to + disallow function calls and property access. If the tool genuinely needs to run + model-supplied code, isolate it: a separate Worker with no bindings (no env, no + KV, no secrets), a `vm.createContext` with an empty global plus a CPU/memory/time + limit, or a sandboxed iframe with the network and storage origins stripped. 
Treat + the sandbox as single-use and discard it after every call. + + ' diff --git a/openai_sdk/idempotency.yaml b/openai_sdk/idempotency.yaml index 528ec93..03a35b6 100644 --- a/openai_sdk/idempotency.yaml +++ b/openai_sdk/idempotency.yaml @@ -2,42 +2,87 @@ policy: id: openai_sdk_idempotency name: OpenAI Agents SDK mutating-tool idempotency category: openai_sdk - description: > - Rules that flag mutating OpenAI Agents SDK tools without an idempotency - key. Agents retry tool calls under timeouts and ambiguous failures; without - a key the same side-effecting action can fire more than once. + description: 'Rules that flag mutating OpenAI Agents SDK tools without an idempotency + key. Agents retry tool calls under timeouts and ambiguous failures; without a + key the same side-effecting action can fire more than once. + ' rules: - - id: OAI-009 - title: Mutating tool has no idempotency key - severity: medium - confidence: 0.55 - language: python - applies_to: - - openai_tool - scope: tool - match: - all: - - name_has_prefix: - - create_ - - send_ - - delete_ - - post_ - - update_ - - refund_ - - charge_ - - issue_ - - not: - param_name_matches: - contains: - - idempot - exact: - - request_id - - txn_id - explanation: > - Tool name suggests a side effect (create/send/refund/…). The OpenAI - Agents SDK retries tool calls on timeout and ambiguous failures; without - an idempotency key the same action can fire twice. Downstream services - must also honor the key for this protection to be effective. - fix: > - Add an `idempotency_key: str` parameter and pass it to the backing API. 
+- id: OAI-009 + title: Mutating tool has no idempotency key + severity: medium + confidence: 0.55 + language: python + applies_to: + - openai_tool + scope: tool + match: + all: + - name_has_prefix: + - create_ + - send_ + - delete_ + - post_ + - update_ + - refund_ + - charge_ + - issue_ + - not: + param_name_matches: + contains: + - idempot + exact: + - request_id + - txn_id + explanation: "Tool name suggests a side effect (create/send/refund/\u2026). The\ + \ OpenAI Agents SDK retries tool calls on timeout and ambiguous failures; without\ + \ an idempotency key the same action can fire twice. Downstream services must\ + \ also honor the key for this protection to be effective.\n" + fix: 'Add an `idempotency_key: str` parameter and pass it to the backing API. + + ' +- id: OAI-015 + title: TypeScript mutating tool has no idempotency key + scope: tool + severity: medium + confidence: 0.5 + language: typescript + applies_to: + - openai_tool + match: + all: + - name_has_prefix: + - create_ + - send_ + - delete_ + - post_ + - update_ + - refund_ + - charge_ + - issue_ + - not: + has_body_text: + - idempot + - request_id + - requestId + - txn_id + - txnId + - correlation_id + - correlationId + explanation: 'An OpenAI Agents SDK tool authored in TypeScript whose name implies + a side effect (create_/send_/delete_/refund_/...) has no idempotency token visible + in its body. The SDK retries tool calls on timeouts and ambiguous failures and + the model itself will retry whenever a tool result reads as inconclusive; without + a key threaded through to the backing API the same action fires twice, producing + duplicate tickets, payments, emails, or deletions. Provisional: this rule loads + and validates today but will not fire until the engine''s TypeScript tool parser + ships. 
+ + ' + fix: 'Add an `idempotencyKey: string` parameter to the tool''s zod schema and forward + it to the underlying API (Stripe `Idempotency-Key` header, REST `X-Request-ID`, + GraphQL `clientMutationId`, etc.). If the backing service does not honor idempotency + keys, document the gap and dedupe in your own store keyed by the token before + issuing the call. + + ' diff --git a/openai_sdk/network.yaml b/openai_sdk/network.yaml index 5da426f..56ee258 100644 --- a/openai_sdk/network.yaml +++ b/openai_sdk/network.yaml @@ -2,99 +2,135 @@ policy: id: openai_sdk_network name: OpenAI Agents SDK network hygiene category: openai_sdk - description: > - Rules covering outbound network calls made from inside OpenAI Agents SDK - tools. The SDK does not enforce timeouts; they must be set explicitly. + description: 'Rules covering outbound network calls made from inside OpenAI Agents + SDK tools. The SDK does not enforce timeouts; they must be set explicitly. + ' rules: - - id: OAI-005 - title: Network call has no timeout - severity: high - confidence: 0.85 - language: python - applies_to: - - openai_tool - scope: tool - match: - call_without_kwarg: - callees: - - requests.get - - requests.post - - requests.put - - requests.delete - - requests.patch - - requests.head - - httpx.get - - httpx.post - - httpx.put - - httpx.delete - - httpx.patch - missing: timeout - explanation: > - An OpenAI Agents SDK tool that makes a network request without a - timeout can hang indefinitely, blocking the agent's run loop and - consuming the conversation's wall-clock budget. The SDK does not - enforce timeouts on tool code. - fix: > - Pass timeout= (typically 5-30 seconds depending on the endpoint). - Surface timeouts as a structured tool error the model can react to, - using failure_error_function (see OAI-004). 
+- id: OAI-005 + title: Network call has no timeout + severity: high + confidence: 0.85 + language: python + applies_to: + - openai_tool + scope: tool + match: + call_without_kwarg: + callees: + - requests.get + - requests.post + - requests.put + - requests.delete + - requests.patch + - requests.head + - httpx.get + - httpx.post + - httpx.put + - httpx.delete + - httpx.patch + missing: timeout + explanation: 'An OpenAI Agents SDK tool that makes a network request without a timeout + can hang indefinitely, blocking the agent''s run loop and consuming the conversation''s + wall-clock budget. The SDK does not enforce timeouts on tool code. - - id: OAI-011 - title: urllib network call has no timeout - severity: high - confidence: 0.85 - language: python - applies_to: - - openai_tool - scope: tool - match: - call_without_kwarg: - callees: - - urllib.request.urlopen - - urlopen - missing: timeout - explanation: > - An OpenAI Agents SDK tool that calls urllib.request.urlopen without - timeout= can hang on a slow or unresponsive host. Python's default for - urlopen is socket._GLOBAL_DEFAULT_TIMEOUT, which is None unless - socket.setdefaulttimeout has been called process-wide; in practice - that means no timeout. The agent's run loop blocks inside the tool - until the OS-level TCP timeout (often minutes), consuming the - conversation's wall-clock budget and tying up the worker. - fix: > - Pass timeout= (typically 5-30 seconds depending on the endpoint) to - every urllib.request.urlopen call. Surface timeouts as a structured - tool error the model can react to using failure_error_function (see - OAI-004). OAI-005 covers the equivalent gap for requests/httpx callees. + ' + fix: 'Pass timeout= (typically 5-30 seconds depending on the endpoint). Surface + timeouts as a structured tool error the model can react to, using failure_error_function + (see OAI-004). 
- - id: OAI-014 - title: Tool builds outbound URL from non-literal value - severity: medium - confidence: 0.55 - language: python - applies_to: - - openai_tool - scope: tool - match: - has_dynamic_url_call: true - explanation: > - An OpenAI Agents SDK tool builds its outbound HTTP URL from a - non-literal value — typically a tool parameter interpolated into an - f-string or concatenated onto a base URL. Because tool arguments are - produced by the model from conversation context (including prior tool - output and user input), an attacker who can shape that context can - steer the request to an attacker-controlled host or to an internal - address the agent's network egress can reach but the model was never - meant to touch. The same channel also leaks request bodies (auth - headers, JSON payloads, the model's reasoning) to whichever host the - URL resolves to, so the failure mode is both SSRF and data - exfiltration in one call. - fix: > - Treat the model-supplied value as untrusted input to URL construction. - Validate it against an allow-list of permitted hosts/path segments - before any HTTP client touches it, or look the value up against a - server-side registry and build the URL from the registry's trusted - entry. If the value is meant to be an opaque ID (e.g. a connection_id), - pass it as a query parameter or path segment of a fixed base URL and - reject characters that would let it escape that slot. + ' +- id: OAI-011 + title: urllib network call has no timeout + severity: high + confidence: 0.85 + language: python + applies_to: + - openai_tool + scope: tool + match: + call_without_kwarg: + callees: + - urllib.request.urlopen + - urlopen + missing: timeout + explanation: 'An OpenAI Agents SDK tool that calls urllib.request.urlopen without + timeout= can hang on a slow or unresponsive host. 
Python''s default for urlopen + is socket._GLOBAL_DEFAULT_TIMEOUT, which is None unless socket.setdefaulttimeout + has been called process-wide; in practice that means no timeout. The agent''s + run loop blocks inside the tool until the OS-level TCP timeout (often minutes), + consuming the conversation''s wall-clock budget and tying up the worker. + + ' + fix: 'Pass timeout= (typically 5-30 seconds depending on the endpoint) to every + urllib.request.urlopen call. Surface timeouts as a structured tool error the model + can react to using failure_error_function (see OAI-004). OAI-005 covers the equivalent + gap for requests/httpx callees. + + ' +- id: OAI-014 + title: Tool builds outbound URL from non-literal value + severity: medium + confidence: 0.55 + language: python + applies_to: + - openai_tool + scope: tool + match: + has_dynamic_url_call: true + explanation: "An OpenAI Agents SDK tool builds its outbound HTTP URL from a non-literal\ + \ value \u2014 typically a tool parameter interpolated into an f-string or concatenated\ + \ onto a base URL. Because tool arguments are produced by the model from conversation\ + \ context (including prior tool output and user input), an attacker who can shape\ + \ that context can steer the request to an attacker-controlled host or to an internal\ + \ address the agent's network egress can reach but the model was never meant to\ + \ touch. The same channel also leaks request bodies (auth headers, JSON payloads,\ + \ the model's reasoning) to whichever host the URL resolves to, so the failure\ + \ mode is both SSRF and data exfiltration in one call.\n" + fix: 'Treat the model-supplied value as untrusted input to URL construction. Validate + it against an allow-list of permitted hosts/path segments before any HTTP client + touches it, or look the value up against a server-side registry and build the + URL from the registry''s trusted entry. If the value is meant to be an opaque + ID (e.g. 
a connection_id), pass it as a query parameter or path segment of a fixed + base URL and reject characters that would let it escape that slot. + + ' +- id: OAI-016 + title: TypeScript tool fetch call has no AbortSignal timeout + scope: tool + severity: high + confidence: 0.6 + language: typescript + applies_to: + - openai_tool + match: + all: + - has_body_text: + - fetch( + - not: + has_body_text: + - AbortSignal + - AbortController + - 'signal:' + - AbortSignal.timeout + explanation: 'An OpenAI Agents SDK tool authored in TypeScript calls `fetch()` without + attaching an AbortSignal. Node''s and the browser''s `fetch` have no implicit + timeout; a slow or unresponsive host blocks the tool''s `execute` callback indefinitely, + which in turn blocks the agent''s run loop, consumes the conversation''s wall-clock + budget, and ties up the worker that owns the request. The miner''s examples include + Cloudflare Worker and realtime-agent templates that fetch from URLs interpolated + from tool arguments, so the gap also amplifies SSRF and exfiltration impact: the + call cannot be cancelled even when downstream behaviour goes obviously wrong. + Provisional: this rule loads and validates today but will not fire until the engine''s + TypeScript tool parser ships. + + ' + fix: 'Attach an AbortSignal that fires after a bounded deadline. Modern runtimes: + `await fetch(url, { signal: AbortSignal.timeout(15_000) })`. For runtimes without + `AbortSignal.timeout`, create an `AbortController`, schedule `controller.abort()` + via `setTimeout`, pass `controller.signal` to `fetch`, and clear the timer in + a `finally` block. Surface the resulting `AbortError` as a structured tool error + the model can react to (see OAI-004''s `failure_error_function` for the Python + equivalent). 
+ + ' From 5489d9043e9933e721d7ac25071954c9a2aa2a79 Mon Sep 17 00:00:00 2001 From: Sairen Christian Buerano <122254250+sairenchristianbuerano@users.noreply.github.com> Date: Mon, 1 Jun 2026 14:51:29 +0800 Subject: [PATCH 6/6] fix: follow schema for idempotency policy --- openai_sdk/idempotency.yaml | 162 ++++++++++++++++++------------------ 1 file changed, 81 insertions(+), 81 deletions(-) diff --git a/openai_sdk/idempotency.yaml b/openai_sdk/idempotency.yaml index 03a35b6..9a0796e 100644 --- a/openai_sdk/idempotency.yaml +++ b/openai_sdk/idempotency.yaml @@ -2,87 +2,87 @@ policy: id: openai_sdk_idempotency name: OpenAI Agents SDK mutating-tool idempotency category: openai_sdk - description: 'Rules that flag mutating OpenAI Agents SDK tools without an idempotency - key. Agents retry tool calls under timeouts and ambiguous failures; without a - key the same side-effecting action can fire more than once. + description: > + Rules that flag mutating OpenAI Agents SDK tools without an idempotency + key. Agents retry tool calls under timeouts and ambiguous failures; without + a key the same side-effecting action can fire more than once. - ' rules: -- id: OAI-009 - title: Mutating tool has no idempotency key - severity: medium - confidence: 0.55 - language: python - applies_to: - - openai_tool - scope: tool - match: - all: - - name_has_prefix: - - create_ - - send_ - - delete_ - - post_ - - update_ - - refund_ - - charge_ - - issue_ - - not: - param_name_matches: - contains: - - idempot - exact: - - request_id - - txn_id - explanation: "Tool name suggests a side effect (create/send/refund/\u2026). The\ - \ OpenAI Agents SDK retries tool calls on timeout and ambiguous failures; without\ - \ an idempotency key the same action can fire twice. Downstream services must\ - \ also honor the key for this protection to be effective.\n" - fix: 'Add an `idempotency_key: str` parameter and pass it to the backing API. 
+ - id: OAI-009 + title: Mutating tool has no idempotency key + severity: medium + confidence: 0.55 + language: python + applies_to: + - openai_tool + scope: tool + match: + all: + - name_has_prefix: + - create_ + - send_ + - delete_ + - post_ + - update_ + - refund_ + - charge_ + - issue_ + - not: + param_name_matches: + contains: + - idempot + exact: + - request_id + - txn_id + explanation: > + Tool name suggests a side effect (create/send/refund/…). The OpenAI + Agents SDK retries tool calls on timeout and ambiguous failures; without + an idempotency key the same action can fire twice. Downstream services + must also honor the key for this protection to be effective. + fix: > + Add an `idempotency_key: str` parameter and pass it to the backing API. - ' -- id: OAI-015 - title: TypeScript mutating tool has no idempotency key - scope: tool - severity: medium - confidence: 0.5 - language: typescript - applies_to: - - openai_tool - match: - all: - - name_has_prefix: - - create_ - - send_ - - delete_ - - post_ - - update_ - - refund_ - - charge_ - - issue_ - - not: - has_body_text: - - idempot - - request_id - - requestId - - txn_id - - txnId - - correlation_id - - correlationId - explanation: 'An OpenAI Agents SDK tool authored in TypeScript whose name implies - a side effect (create_/send_/delete_/refund_/...) has no idempotency token visible - in its body. The SDK retries tool calls on timeouts and ambiguous failures and - the model itself will retry whenever a tool result reads as inconclusive; without - a key threaded through to the backing API the same action fires twice, producing - duplicate tickets, payments, emails, or deletions. Provisional: this rule loads - and validates today but will not fire until the engine''s TypeScript tool parser - ships. - - ' - fix: 'Add an `idempotencyKey: string` parameter to the tool''s zod schema and forward - it to the underlying API (Stripe `Idempotency-Key` header, REST `X-Request-ID`, - GraphQL `clientMutationId`, etc.). 
If the backing service does not honor idempotency - keys, document the gap and dedupe in your own store keyed by the token before - issuing the call. - - ' + - id: OAI-019 + title: TypeScript mutating tool has no idempotency key + severity: medium + confidence: 0.5 + language: typescript + applies_to: + - openai_tool + scope: tool + match: + all: + - name_has_prefix: + - create_ + - send_ + - delete_ + - post_ + - update_ + - refund_ + - charge_ + - issue_ + - not: + has_body_text: + - idempot + - request_id + - requestId + - txn_id + - txnId + - correlation_id + - correlationId + explanation: > + An OpenAI Agents SDK tool authored in TypeScript whose name implies a + side effect (create_/send_/delete_/refund_/...) has no idempotency token + visible in its body. The SDK retries tool calls on timeouts and + ambiguous failures and the model itself will retry whenever a tool + result reads as inconclusive; without a key threaded through to the + backing API the same action fires twice, producing duplicate tickets, + payments, emails, or deletions. Provisional: this rule loads and + validates today but will not fire until the engine's TypeScript tool + parser ships. + fix: > + Add an `idempotencyKey: string` parameter to the tool's zod schema and + forward it to the underlying API (Stripe `Idempotency-Key` header, + REST `X-Request-ID`, GraphQL `clientMutationId`, etc.). If the backing + service does not honor idempotency keys, document the gap and dedupe + in your own store keyed by the token before issuing the call.