diff --git a/claude_sdk/repo_hygiene.yaml b/claude_sdk/repo_hygiene.yaml
deleted file mode 100644
index c366bc0..0000000
--- a/claude_sdk/repo_hygiene.yaml
+++ /dev/null
@@ -1,44 +0,0 @@
-policy:
-  id: claude_sdk_repo_hygiene
-  name: Repo hygiene
-  category: claude_sdk
-  description: >
-    Repository-scope rules covering the project-level scaffolding a Claude
-    Agent SDK codebase should ship with — agent-facing READMEs, hook scripts,
-    sandbox policies, and other components a Claude session relies on at cold
-    start.
-
-rules:
-  - id: CSDK-203
-    title: Repo ships Claude Agent SDK code without a CLAUDE.md
-    severity: low
-    confidence: 0.9
-    language: python
-    applies_to:
-      - claude_sdk
-    scope: repo
-    match:
-      all:
-        - repo_has_sdk_in_code:
-            - claude_agent_sdk
-        - not:
-            repo_component_present:
-              - claude_md
-    explanation: >
-      A repository that builds on the Claude Agent SDK but ships no top-level
-      CLAUDE.md leaves any Claude session that opens the repo with no
-      project-specific guidance. The conventions, build commands, test
-      runners, lint scripts, and "do not touch" boundaries a human maintainer
-      would describe in onboarding are absent, so the agent has to infer them
-      from the source on every session and frequently guesses wrong —
-      bypassing the project's lint, picking the wrong test command, or
-      violating commit conventions that were never written down anywhere it
-      could read. The cost compounds: each new contributor (human or agent)
-      reinvents the same wrong assumptions.
-    fix: >
-      Add a CLAUDE.md at the repository root documenting how to build, test,
-      and lint the project, the coding conventions Claude must respect, the
-      directories or files Claude must not modify, and any project-specific
-      safety guardrails (e.g. "never run migrations", "never push to main").
-      Treat it as the project's agent-facing README and keep it under version
-      control so reviewers see drift.
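Reviewer note on CSDK-203: the rule's `match` block combines two repo-level predicates, `repo_has_sdk_in_code` and a negated `repo_component_present`. The engine's real matchers are not part of this diff, so the Python below is only a rough sketch of what that combination might check. It assumes a plain-text scan for import statements and a `CLAUDE.md` file directly at the repository root; the function names echo the predicate names but are hypothetical.

```python
from pathlib import Path


def repo_has_sdk_in_code(root: Path, module: str = "claude_agent_sdk") -> bool:
    """Return True if any .py file under root appears to import `module`.

    Assumption: a substring scan stands in for whatever parsing the real
    engine performs; it can be fooled by comments or string literals.
    """
    needles = (f"import {module}", f"from {module}")
    for path in root.rglob("*.py"):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than fail the scan
        if any(needle in text for needle in needles):
            return True
    return False


def missing_claude_md(root: Path) -> bool:
    """Approximate CSDK-203: SDK imported somewhere, no CLAUDE.md at the root."""
    return repo_has_sdk_in_code(root) and not (root / "CLAUDE.md").is_file()
```

Calling `missing_claude_md(Path("."))` from a repository root would report whether this sketch considers the rule's condition met; it says nothing about how the production engine resolves the same predicates.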
diff --git a/claude_sdk/tool_definition.yaml b/claude_sdk/tool_definition.yaml
index 97ddf04..575690d 100644
--- a/claude_sdk/tool_definition.yaml
+++ b/claude_sdk/tool_definition.yaml
@@ -77,35 +77,3 @@ rules:
       refuse to call it at all.
     fix: >
       Rename to a verb-object form, e.g. `summarize_invoice`, `refund_charge`.
-
-  - id: CSDK-008
-    title: Tool exposes **kwargs without explicit input_schema
-    severity: medium
-    confidence: 0.8
-    language: python
-    applies_to:
-      - claude_sdk_tool
-    scope: tool
-    match:
-      all:
-        - param_name_matches:
-            exact:
-              - kwargs
-        - not:
-            tool_decorator_kwarg_present:
-              - input_schema
-    explanation: >
-      The Claude Agent SDK derives a tool's JSON input schema from the
-      function signature. A tool whose accepted arguments live entirely under
-      `**kwargs` has no typed surface for the SDK to introspect, so the model
-      sees an empty parameter object and gets no signal about which keys to
-      send. Calls then either omit data the body requires or invent keys the
-      body does not handle, and the failure surfaces as a runtime KeyError at
-      invoke time instead of a clean schema-validation error before the tool
-      ever runs.
-    fix: >
-      Either declare each accepted parameter on the function signature with a
-      type annotation so the SDK can derive the schema, or pass an explicit
-      `input_schema=` to the `@tool` decorator — either a JSON Schema dict or
-      a Pydantic `BaseModel` subclass — so the contract published to the model
-      matches what the body actually reads from `kwargs`.
diff --git a/google_adk/repo_hygiene.yaml b/google_adk/repo_hygiene.yaml
deleted file mode 100644
index ce41747..0000000
--- a/google_adk/repo_hygiene.yaml
+++ /dev/null
@@ -1,41 +0,0 @@
-policy:
-  id: google_adk_repo_hygiene
-  name: Google ADK repo hygiene
-  category: google_adk
-  description: >
-    Repo-scoped hygiene rules for projects that use the Google ADK. Fire once
-    per scan against the repo manifest and inventory rather than per tool or
-    agent.
-
-rules:
-  - id: ADK-201
-    title: Google ADK project missing CLAUDE.md
-    severity: low
-    confidence: 0.9
-    language: python
-    applies_to:
-      - google_adk
-    scope: repo
-    match:
-      all:
-        - repo_has_sdk_in_code:
-            - google_adk
-        - not:
-            repo_component_present:
-              - claude_md
-    explanation: >
-      The project uses the Google ADK in code but ships no CLAUDE.md at the
-      repo root. When Claude Code (or any agent following the CLAUDE.md
-      convention) edits this repo, it has no project-specific guidance on
-      agent-class choices (LlmAgent vs SequentialAgent vs ParallelAgent vs
-      LoopAgent), sub_agents composition, FunctionTool wrapping conventions,
-      required guardrails, or the local test and build commands. The likely
-      consequence is generated code that violates the project's tool and
-      agent contracts because nothing in-tree teaches the agent the local
-      rules.
-    fix: >
-      Add a CLAUDE.md at the repo root. State which ADK agent classes the
-      project uses and why, how tools must be wrapped, any required guardrails
-      or sandboxing, and the exact test, lint, and build commands. Keep it
-      short and concrete so an editing agent can act on it without re-deriving
-      the conventions.
diff --git a/google_adk/tool_definition.yaml b/google_adk/tool_definition.yaml
index aa7d51f..9436f10 100644
--- a/google_adk/tool_definition.yaml
+++ b/google_adk/tool_definition.yaml
@@ -76,29 +76,3 @@ rules:
       refuse to call it at all when a better-named alternative is present.
     fix: >
       Rename to a verb-object form, e.g. `summarize_document`, `fetch_order`.
-
-  - id: ADK-008
-    title: FunctionTool body prints to stdout
-    severity: low
-    confidence: 0.7
-    language: python
-    applies_to:
-      - adk_function_tool
-    scope: tool
-    match:
-      has_body_text:
-        - print(
-    explanation: >
-      Google ADK FunctionTool runs inside the agent's process and shares its
-      stdout with the runtime. A wrapped function that calls `print(...)` for
-      debug tracing leaks raw arguments (paths, IDs, decoded blob contents)
-      into the same stream the runner uses for structured logs, mangles JSON
-      log lines, and can echo secrets pulled from `tool_context.state` into
-      terminal scrollback or container log shippers. The output is also not
-      addressable by the model — `print` writes go to the process, not to the
-      tool response, so the operator sees noise while the agent sees nothing.
-    fix: >
-      Delete debug `print` calls before shipping, or replace them with a
-      module logger (`logging.getLogger(__name__).debug(...)`) so the operator
-      can silence them with a level switch. If the data is meant for the
-      model, return it as part of the tool's structured result instead.
diff --git a/openai_sdk/code_execution.yaml b/openai_sdk/code_execution.yaml
index 603ecd1..dd28efd 100644
--- a/openai_sdk/code_execution.yaml
+++ b/openai_sdk/code_execution.yaml
@@ -36,41 +36,3 @@ rules:
       surface (e.g. notebook-style), run the exec inside a separate process
       with seccomp + no network + no filesystem write, and treat the process
       as a single-use sacrificial sandbox.
-
-  - id: OAI-017
-    title: TypeScript tool body calls eval / new Function on dynamic input
-    severity: high
-    confidence: 0.9
-    language: typescript
-    applies_to:
-      - openai_tool
-    scope: tool
-    match:
-      has_body_text:
-        - eval(
-        - new Function(
-        - Function(
-    explanation: >
-      A TypeScript tool body invokes `eval()` or constructs a `new Function(...)`
-      from a string. When any portion of that string flows from the model
-      (directly, via tool arguments, or via session state the model writes to),
-      the call is arbitrary-code-execution inside the agent's Node / Worker /
-      browser runtime. No OS-level sandbox stands between the call and the
-      runtime's full capabilities: file I/O, env-var access, fetch credentials,
-      and the agent's own keys in memory are all reachable from the evaluated
-      string. Even a `with({})`-restricted scope is escapable through
-      globalThis, prototype chains, and the platform API surface unless
-      explicitly stripped. The miner-flagged `calculate` tool in the Claude
-      SDK template feeds a tool argument straight into `eval` to act as a math
-      evaluator — the canonical worst-case shape. Provisional: this rule loads
-      and validates today but will not fire until the engine's TypeScript tool
-      parser ships.
-    fix: >
-      Remove `eval` and `new Function` from the tool. For arithmetic, parse
-      the input with a hardened expression library (`mathjs`, `expr-eval`)
-      configured to disallow function calls and property access. If the tool
-      genuinely needs to run model-supplied code, isolate it: a separate
-      Worker with no bindings (no env, no KV, no secrets), a
-      `vm.createContext` with an empty global plus a CPU/memory/time limit,
-      or a sandboxed iframe with the network and storage origins stripped.
-      Treat the sandbox as single-use and discard it after every call.
diff --git a/openai_sdk/idempotency.yaml b/openai_sdk/idempotency.yaml
index 9a0796e..528ec93 100644
--- a/openai_sdk/idempotency.yaml
+++ b/openai_sdk/idempotency.yaml
@@ -41,48 +41,3 @@ rules:
       must also honor the key for this protection to be effective.
     fix: >
       Add an `idempotency_key: str` parameter and pass it to the backing API.
-
-  - id: OAI-019
-    title: TypeScript mutating tool has no idempotency key
-    severity: medium
-    confidence: 0.5
-    language: typescript
-    applies_to:
-      - openai_tool
-    scope: tool
-    match:
-      all:
-        - name_has_prefix:
-            - create_
-            - send_
-            - delete_
-            - post_
-            - update_
-            - refund_
-            - charge_
-            - issue_
-        - not:
-            has_body_text:
-              - idempot
-              - request_id
-              - requestId
-              - txn_id
-              - txnId
-              - correlation_id
-              - correlationId
-    explanation: >
-      An OpenAI Agents SDK tool authored in TypeScript whose name implies a
-      side effect (create_/send_/delete_/refund_/...) has no idempotency token
-      visible in its body. The SDK retries tool calls on timeouts and
-      ambiguous failures and the model itself will retry whenever a tool
-      result reads as inconclusive; without a key threaded through to the
-      backing API the same action fires twice, producing duplicate tickets,
-      payments, emails, or deletions. Provisional: this rule loads and
-      validates today but will not fire until the engine's TypeScript tool
-      parser ships.
-    fix: >
-      Add an `idempotencyKey: string` parameter to the tool's zod schema and
-      forward it to the underlying API (Stripe `Idempotency-Key` header,
-      REST `X-Request-ID`, GraphQL `clientMutationId`, etc.). If the backing
-      service does not honor idempotency keys, document the gap and dedupe
-      in your own store keyed by the token before issuing the call.
diff --git a/openai_sdk/network.yaml b/openai_sdk/network.yaml
index b21362e..0a69625 100644
--- a/openai_sdk/network.yaml
+++ b/openai_sdk/network.yaml
@@ -67,73 +67,3 @@ rules:
       every urllib.request.urlopen call. Surface timeouts as a structured tool
       error the model can react to using failure_error_function (see OAI-004).
       OAI-005 covers the equivalent gap for requests/httpx callees.
-
-  - id: OAI-016
-    title: TypeScript tool fetch call has no AbortSignal timeout
-    severity: high
-    confidence: 0.6
-    language: typescript
-    applies_to:
-      - openai_tool
-    scope: tool
-    match:
-      all:
-        - has_body_text:
-            - fetch(
-        - not:
-            has_body_text:
-              - AbortSignal
-              - AbortController
-              - 'signal:'
-              - AbortSignal.timeout
-    explanation: >
-      An OpenAI Agents SDK tool authored in TypeScript calls `fetch()` without
-      attaching an AbortSignal. Node's and the browser's `fetch` have no
-      implicit timeout; a slow or unresponsive host blocks the tool's
-      `execute` callback indefinitely, which in turn blocks the agent's run
-      loop, consumes the conversation's wall-clock budget, and ties up the
-      worker that owns the request. The miner's examples include Cloudflare
-      Worker and realtime-agent templates that fetch from URLs interpolated
-      from tool arguments, so the gap also amplifies SSRF and exfiltration
-      impact: the call cannot be cancelled even when downstream behaviour
-      goes obviously wrong. Provisional: this rule loads and validates today
-      but will not fire until the engine's TypeScript tool parser ships.
-    fix: >
-      Attach an AbortSignal that fires after a bounded deadline. Modern
-      runtimes: `await fetch(url, { signal: AbortSignal.timeout(15_000) })`.
-      For runtimes without `AbortSignal.timeout`, create an `AbortController`,
-      schedule `controller.abort()` via `setTimeout`, pass `controller.signal`
-      to `fetch`, and clear the timer in a `finally` block. Surface the
-      resulting `AbortError` as a structured tool error the model can react
-      to (see OAI-004's `failure_error_function` for the Python equivalent).
-
-  - id: OAI-018
-    title: Tool builds outbound URL from non-literal value
-    severity: medium
-    confidence: 0.55
-    language: python
-    applies_to:
-      - openai_tool
-    scope: tool
-    match:
-      has_dynamic_url_call: true
-    explanation: >
-      An OpenAI Agents SDK tool builds its outbound HTTP URL from a
-      non-literal value — typically a tool parameter interpolated into an
-      f-string or concatenated onto a base URL. Because tool arguments are
-      produced by the model from conversation context (including prior tool
-      output and user input), an attacker who can shape that context can
-      steer the request to an attacker-controlled host or to an internal
-      address the agent's network egress can reach but the model was never
-      meant to touch. The same channel also leaks request bodies (auth
-      headers, JSON payloads, the model's reasoning) to whichever host the
-      URL resolves to, so the failure mode is both SSRF and data
-      exfiltration in one call.
-    fix: >
-      Treat the model-supplied value as untrusted input to URL construction.
-      Validate it against an allow-list of permitted hosts/path segments
-      before any HTTP client touches it, or look the value up against a
-      server-side registry and build the URL from the registry's trusted
-      entry. If the value is meant to be an opaque ID (e.g. a connection_id),
-      pass it as a query parameter or path segment of a fixed base URL and
-      reject characters that would let it escape that slot.
diff --git a/openai_sdk/repo_hygiene.yaml b/openai_sdk/repo_hygiene.yaml
deleted file mode 100644
index 4622e1c..0000000
--- a/openai_sdk/repo_hygiene.yaml
+++ /dev/null
@@ -1,41 +0,0 @@
-policy:
-  id: openai_sdk_repo_hygiene
-  name: OpenAI Agents SDK repo hygiene
-  category: openai_sdk
-  description: >
-    Repo-scoped hygiene rules for projects that use the OpenAI Agents SDK.
-    Fire once per scan against the repo manifest and inventory rather than per
-    tool or agent.
-
-rules:
-  - id: OAI-202
-    title: OpenAI Agents project missing CLAUDE.md
-    severity: low
-    confidence: 0.9
-    language: python
-    applies_to:
-      - openai_agents
-    scope: repo
-    match:
-      all:
-        - repo_has_sdk_in_code:
-            - openai_agents
-        - not:
-            repo_component_present:
-              - claude_md
-    explanation: >
-      The project uses the OpenAI Agents SDK in code but ships no CLAUDE.md at
-      the repo root. When Claude Code (or any agent honoring the CLAUDE.md
-      convention) edits this repo, it has no project-specific guidance on
-      Agent vs SandboxAgent choice, handoff topology, required input/output
-      guardrails, tool_choice settings, or the local test and build commands.
-      The likely consequence is generated code that bypasses the project's
-      safety contracts because nothing in-tree teaches the agent the local
-      rules.
-    fix: >
-      Add a CLAUDE.md at the repo root. State whether the project uses Agent
-      or SandboxAgent, list required guardrails (input_guardrails,
-      output_guardrails) and tool_choice conventions, note any handoff or
-      tracing policy, and give the exact test, lint, and build commands. Keep
-      it short and concrete so an editing agent can act on it without
-      re-deriving the conventions.
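Reviewer note on OAI-018 (removed in the `openai_sdk/network.yaml` hunk): the rule's `fix` text describes two mitigations, an allow-list check on the host and slotting an opaque ID into a fixed base URL. A minimal Python sketch of both, using only the standard library, might look like the following. The host name, base URL, and function names are placeholders invented for illustration, not values from this repository.

```python
import re
from urllib.parse import quote, urlsplit

# Hypothetical values: substitute the hosts and base URL the tool really needs.
ALLOWED_HOSTS = frozenset({"api.example.com"})
BASE_URL = "https://api.example.com/v1/connections/"

# Conservative character set for an opaque ID (an assumption, tighten as needed).
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def validate_outbound_url(url: str) -> str:
    """Return url unchanged if it is https and its host is allow-listed."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    # .hostname drops any userinfo and port, so "allowed@evil.test" is rejected.
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"host not allowed: {parts.hostname!r}")
    return url


def build_connection_url(connection_id: str) -> str:
    """Place a model-supplied ID into one path segment of a fixed base URL."""
    if not _ID_PATTERN.fullmatch(connection_id):
        raise ValueError("connection_id contains disallowed characters")
    # quote() is redundant for the allowed character set; kept as defence in depth.
    return BASE_URL + quote(connection_id, safe="")
```

Either helper would run before any HTTP client sees the value, and a raised `ValueError` can be turned into a structured tool error for the model. Host allow-listing alone does not cover redirects or DNS rebinding, so this is a starting point rather than a complete SSRF defence.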