diff --git a/claude_sdk/repo_hygiene.yaml b/claude_sdk/repo_hygiene.yaml
new file mode 100644
index 0000000..c366bc0
--- /dev/null
+++ b/claude_sdk/repo_hygiene.yaml
@@ -0,0 +1,44 @@
+policy:
+  id: claude_sdk_repo_hygiene
+  name: Repo hygiene
+  category: claude_sdk
+  description: >
+    Repository-scope rules covering the project-level scaffolding a Claude
+    Agent SDK codebase should ship with: agent-facing READMEs, hook scripts,
+    sandbox policies, and other components a Claude session relies on at cold
+    start.
+
+rules:
+  - id: CSDK-203
+    title: Repo ships Claude Agent SDK code without a CLAUDE.md
+    severity: low
+    confidence: 0.9
+    language: python
+    applies_to:
+      - claude_sdk
+    scope: repo
+    match:
+      all:
+        - repo_has_sdk_in_code:
+            - claude_agent_sdk
+        - not:
+            repo_component_present:
+              - claude_md
+    explanation: >
+      A repository that builds on the Claude Agent SDK but ships no top-level
+      CLAUDE.md leaves any Claude session that opens the repo with no
+      project-specific guidance. The conventions, build commands, test
+      runners, lint scripts, and "do not touch" boundaries a human maintainer
+      would describe in onboarding are absent, so the agent has to infer them
+      from the source on every session and frequently guesses wrong,
+      bypassing the project's lint, picking the wrong test command, or
+      violating commit conventions that were never written down anywhere it
+      could read. The cost compounds: each new contributor (human or agent)
+      repeats the same wrong assumptions.
+    fix: >
+      Add a CLAUDE.md at the repository root documenting how to build, test,
+      and lint the project, the coding conventions Claude must respect, the
+      directories or files Claude must not modify, and any project-specific
+      safety guardrails (e.g. "never run migrations", "never push to main").
+      Treat it as the project's agent-facing README and keep it under version
+      control so reviewers see drift.
diff --git a/claude_sdk/tool_definition.yaml b/claude_sdk/tool_definition.yaml
index 575690d..97ddf04 100644
--- a/claude_sdk/tool_definition.yaml
+++ b/claude_sdk/tool_definition.yaml
@@ -77,3 +77,35 @@ rules:
       refuse to call it at all.
     fix: >
       Rename to a verb-object form, e.g. `summarize_invoice`, `refund_charge`.
+
+  - id: CSDK-008
+    title: Tool exposes **kwargs without explicit input_schema
+    severity: medium
+    confidence: 0.8
+    language: python
+    applies_to:
+      - claude_sdk_tool
+    scope: tool
+    match:
+      all:
+        - param_name_matches:
+            exact:
+              - kwargs
+        - not:
+            tool_decorator_kwarg_present:
+              - input_schema
+    explanation: >
+      The Claude Agent SDK derives a tool's JSON input schema from the
+      function signature. A tool whose accepted arguments live entirely under
+      `**kwargs` has no typed surface for the SDK to introspect, so the model
+      sees an empty parameter object and gets no signal about which keys to
+      send. Calls then either omit data the body requires or invent keys the
+      body does not handle, and the failure surfaces as a runtime KeyError at
+      invoke time instead of a clean schema-validation error before the tool
+      ever runs.
+    fix: >
+      Either declare each accepted parameter on the function signature with a
+      type annotation so the SDK can derive the schema, or pass an explicit
+      `input_schema=` to the `@tool` decorator (either a JSON Schema dict or
+      a Pydantic `BaseModel` subclass) so the contract published to the model
+      matches what the body actually reads from `kwargs`.
diff --git a/google_adk/repo_hygiene.yaml b/google_adk/repo_hygiene.yaml
new file mode 100644
index 0000000..ce41747
--- /dev/null
+++ b/google_adk/repo_hygiene.yaml
@@ -0,0 +1,41 @@
+policy:
+  id: google_adk_repo_hygiene
+  name: Google ADK repo hygiene
+  category: google_adk
+  description: >
+    Repo-scoped hygiene rules for projects that use the Google ADK. Fire once
+    per scan against the repo manifest and inventory rather than per tool or
+    agent.
+
+rules:
+  - id: ADK-201
+    title: Google ADK project missing CLAUDE.md
+    severity: low
+    confidence: 0.9
+    language: python
+    applies_to:
+      - google_adk
+    scope: repo
+    match:
+      all:
+        - repo_has_sdk_in_code:
+            - google_adk
+        - not:
+            repo_component_present:
+              - claude_md
+    explanation: >
+      The project uses the Google ADK in code but ships no CLAUDE.md at the
+      repo root. When Claude Code (or any agent following the CLAUDE.md
+      convention) edits this repo, it has no project-specific guidance on
+      agent-class choices (LlmAgent vs SequentialAgent vs ParallelAgent vs
+      LoopAgent), sub_agents composition, FunctionTool wrapping conventions,
+      required guardrails, or the local test and build commands. The likely
+      consequence is generated code that violates the project's tool and
+      agent contracts because nothing in-tree teaches the agent the local
+      rules.
+    fix: >
+      Add a CLAUDE.md at the repo root. State which ADK agent classes the
+      project uses and why, how tools must be wrapped, any required guardrails
+      or sandboxing, and the exact test, lint, and build commands. Keep it
+      short and concrete so an editing agent can act on it without re-deriving
+      the conventions.
diff --git a/google_adk/tool_definition.yaml b/google_adk/tool_definition.yaml
index 9436f10..aa7d51f 100644
--- a/google_adk/tool_definition.yaml
+++ b/google_adk/tool_definition.yaml
@@ -76,3 +76,29 @@ rules:
       refuse to call it at all when a better-named alternative is present.
     fix: >
       Rename to a verb-object form, e.g. `summarize_document`, `fetch_order`.
+
+  - id: ADK-008
+    title: FunctionTool body prints to stdout
+    severity: low
+    confidence: 0.7
+    language: python
+    applies_to:
+      - adk_function_tool
+    scope: tool
+    match:
+      has_body_text:
+        - print(
+    explanation: >
+      A Google ADK FunctionTool runs inside the agent's process and shares its
+      stdout with the runtime. A wrapped function that calls `print(...)` for
+      debug tracing leaks raw arguments (paths, IDs, decoded blob contents)
+      into the same stream the runner uses for structured logs, mangles JSON
+      log lines, and can echo secrets pulled from `tool_context.state` into
+      terminal scrollback or container log shippers. The output is also not
+      addressable by the model: `print` writes go to the process, not to the
+      tool response, so the operator sees noise while the agent sees nothing.
+    fix: >
+      Delete debug `print` calls before shipping, or replace them with a
+      module logger (`logging.getLogger(__name__).debug(...)`) so the operator
+      can silence them with a level switch. If the data is meant for the
+      model, return it as part of the tool's structured result instead.
diff --git a/openai_sdk/code_execution.yaml b/openai_sdk/code_execution.yaml
index dd28efd..603ecd1 100644
--- a/openai_sdk/code_execution.yaml
+++ b/openai_sdk/code_execution.yaml
@@ -36,3 +36,41 @@ rules:
       surface (e.g. notebook-style), run the exec inside a separate process
       with seccomp + no network + no filesystem write, and treat the process
       as a single-use sacrificial sandbox.
+
+  - id: OAI-017
+    title: TypeScript tool body calls eval / new Function on dynamic input
+    severity: high
+    confidence: 0.9
+    language: typescript
+    applies_to:
+      - openai_tool
+    scope: tool
+    match:
+      has_body_text:
+        - eval(
+        - new Function(
+        - Function(
+    explanation: >
+      A TypeScript tool body invokes `eval()` or constructs a `new Function(...)`
+      from a string. When any portion of that string flows from the model
+      (directly, via tool arguments, or via session state the model writes to),
+      the call is arbitrary-code-execution inside the agent's Node / Worker /
+      browser runtime. No OS-level sandbox stands between the call and the
+      runtime's full capabilities: file I/O, env-var access, fetch credentials,
+      and the agent's own keys in memory are all reachable from the evaluated
+      string. Even a `with({})`-restricted scope is escapable through
+      globalThis, prototype chains, and the platform API surface unless
+      explicitly stripped. The miner-flagged `calculate` tool in the Claude
+      SDK template feeds a tool argument straight into `eval` to act as a math
+      evaluator, the canonical worst-case shape. Provisional: this rule loads
+      and validates today but will not fire until the engine's TypeScript tool
+      parser ships.
+    fix: >
+      Remove `eval` and `new Function` from the tool. For arithmetic, parse
+      the input with a hardened expression library (`mathjs`, `expr-eval`)
+      configured to disallow function calls and property access. If the tool
+      genuinely needs to run model-supplied code, isolate it: a separate
+      Worker with no bindings (no env, no KV, no secrets), a
+      child process with a scrubbed environment plus CPU/memory/time limits,
+      or a sandboxed iframe with the network and storage origins stripped.
+      Treat the sandbox as single-use and discard it after every call.
diff --git a/openai_sdk/idempotency.yaml b/openai_sdk/idempotency.yaml
index 528ec93..9a0796e 100644
--- a/openai_sdk/idempotency.yaml
+++ b/openai_sdk/idempotency.yaml
@@ -41,3 +41,48 @@ rules:
       must also honor the key for this protection to be effective.
     fix: >
       Add an `idempotency_key: str` parameter and pass it to the backing API.
+
+  - id: OAI-019
+    title: TypeScript mutating tool has no idempotency key
+    severity: medium
+    confidence: 0.5
+    language: typescript
+    applies_to:
+      - openai_tool
+    scope: tool
+    match:
+      all:
+        - name_has_prefix:
+            - create_
+            - send_
+            - delete_
+            - post_
+            - update_
+            - refund_
+            - charge_
+            - issue_
+        - not:
+            has_body_text:
+              - idempot
+              - request_id
+              - requestId
+              - txn_id
+              - txnId
+              - correlation_id
+              - correlationId
+    explanation: >
+      An OpenAI Agents SDK tool authored in TypeScript whose name implies a
+      side effect (create_/send_/delete_/refund_/...) has no idempotency token
+      visible in its body. The SDK retries tool calls on timeouts and
+      ambiguous failures and the model itself will retry whenever a tool
+      result reads as inconclusive; without a key threaded through to the
+      backing API the same action fires twice, producing duplicate tickets,
+      payments, emails, or deletions. Provisional: this rule loads and
+      validates today but will not fire until the engine's TypeScript tool
+      parser ships.
+    fix: >
+      Add an `idempotencyKey: string` parameter to the tool's zod schema and
+      forward it to the underlying API (Stripe `Idempotency-Key` header,
+      REST `X-Request-ID`, GraphQL `clientMutationId`, etc.). If the backing
+      service does not honor idempotency keys, document the gap and dedupe
+      in your own store keyed by the token before issuing the call.
diff --git a/openai_sdk/network.yaml b/openai_sdk/network.yaml
index 0a69625..b21362e 100644
--- a/openai_sdk/network.yaml
+++ b/openai_sdk/network.yaml
@@ -67,3 +67,73 @@ rules:
       every urllib.request.urlopen call. Surface timeouts as a structured tool
       error the model can react to using failure_error_function (see OAI-004).
       OAI-005 covers the equivalent gap for requests/httpx callees.
+
+  - id: OAI-016
+    title: TypeScript tool fetch call has no AbortSignal timeout
+    severity: high
+    confidence: 0.6
+    language: typescript
+    applies_to:
+      - openai_tool
+    scope: tool
+    match:
+      all:
+        - has_body_text:
+            - fetch(
+        - not:
+            has_body_text:
+              - AbortSignal
+              - AbortController
+              - 'signal:'
+              - AbortSignal.timeout
+    explanation: >
+      An OpenAI Agents SDK tool authored in TypeScript calls `fetch()` without
+      attaching an AbortSignal. Node's and the browser's `fetch` have no
+      implicit timeout; a slow or unresponsive host blocks the tool's
+      `execute` callback indefinitely, which in turn blocks the agent's run
+      loop, consumes the conversation's wall-clock budget, and ties up the
+      worker that owns the request. The miner's examples include Cloudflare
+      Worker and realtime-agent templates that fetch from URLs interpolated
+      from tool arguments, so the gap also amplifies SSRF and exfiltration
+      impact: the call cannot be cancelled even when downstream behaviour
+      goes obviously wrong. Provisional: this rule loads and validates today
+      but will not fire until the engine's TypeScript tool parser ships.
+    fix: >
+      Attach an AbortSignal that fires after a bounded deadline. Modern
+      runtimes: `await fetch(url, { signal: AbortSignal.timeout(15_000) })`.
+      For runtimes without `AbortSignal.timeout`, create an `AbortController`,
+      schedule `controller.abort()` via `setTimeout`, pass `controller.signal`
+      to `fetch`, and clear the timer in a `finally` block. Surface the
+      resulting `TimeoutError` or `AbortError` as a structured tool error the
+      model can react to (see OAI-004 for the Python `failure_error_function`).
+
+  - id: OAI-018
+    title: Tool builds outbound URL from non-literal value
+    severity: medium
+    confidence: 0.55
+    language: python
+    applies_to:
+      - openai_tool
+    scope: tool
+    match:
+      has_dynamic_url_call: true
+    explanation: >
+      An OpenAI Agents SDK tool builds its outbound HTTP URL from a
+      non-literal value, typically a tool parameter interpolated into an
+      f-string or concatenated onto a base URL. Because tool arguments are
+      produced by the model from conversation context (including prior tool
+      output and user input), an attacker who can shape that context can
+      steer the request to an attacker-controlled host or to an internal
+      address the agent's network egress can reach but the model was never
+      meant to touch. The same channel also leaks request bodies (auth
+      headers, JSON payloads, the model's reasoning) to whichever host the
+      URL resolves to, so the failure mode is both SSRF and data
+      exfiltration in one call.
+    fix: >
+      Treat the model-supplied value as untrusted input to URL construction.
+      Validate it against an allow-list of permitted hosts/path segments
+      before any HTTP client touches it, or look the value up against a
+      server-side registry and build the URL from the registry's trusted
+      entry. If the value is meant to be an opaque ID (e.g. a connection_id),
+      pass it as a query parameter or path segment of a fixed base URL and
+      reject characters that would let it escape that slot.
diff --git a/openai_sdk/repo_hygiene.yaml b/openai_sdk/repo_hygiene.yaml
new file mode 100644
index 0000000..4622e1c
--- /dev/null
+++ b/openai_sdk/repo_hygiene.yaml
@@ -0,0 +1,41 @@
+policy:
+  id: openai_sdk_repo_hygiene
+  name: OpenAI Agents SDK repo hygiene
+  category: openai_sdk
+  description: >
+    Repo-scoped hygiene rules for projects that use the OpenAI Agents SDK.
+    Fire once per scan against the repo manifest and inventory rather than per
+    tool or agent.
+
+rules:
+  - id: OAI-202
+    title: OpenAI Agents project missing CLAUDE.md
+    severity: low
+    confidence: 0.9
+    language: python
+    applies_to:
+      - openai_agents
+    scope: repo
+    match:
+      all:
+        - repo_has_sdk_in_code:
+            - openai_agents
+        - not:
+            repo_component_present:
+              - claude_md
+    explanation: >
+      The project uses the OpenAI Agents SDK in code but ships no CLAUDE.md at
+      the repo root. When Claude Code (or any agent honoring the CLAUDE.md
+      convention) edits this repo, it has no project-specific guidance on
+      Agent vs SandboxAgent choice, handoff topology, required input/output
+      guardrails, tool_choice settings, or the local test and build commands.
+      The likely consequence is generated code that bypasses the project's
+      safety contracts because nothing in-tree teaches the agent the local
+      rules.
+    fix: >
+      Add a CLAUDE.md at the repo root. State whether the project uses Agent
+      or SandboxAgent, list required guardrails (input_guardrails,
+      output_guardrails) and tool_choice conventions, note any handoff or
+      tracing policy, and give the exact test, lint, and build commands. Keep
+      it short and concrete so an editing agent can act on it without
+      re-deriving the conventions.