44 changes: 44 additions & 0 deletions claude_sdk/repo_hygiene.yaml
@@ -0,0 +1,44 @@
policy:
id: claude_sdk_repo_hygiene
name: Repo hygiene
category: claude_sdk
description: >
Repository-scope rules covering the project-level scaffolding a Claude
Agent SDK codebase should ship with — agent-facing READMEs, hook scripts,
sandbox policies, and other components a Claude session relies on at cold
start.

rules:
- id: CSDK-203
title: Repo ships Claude Agent SDK code without a CLAUDE.md
severity: low
confidence: 0.9
language: python
applies_to:
- claude_sdk
scope: repo
match:
all:
- repo_has_sdk_in_code:
- claude_agent_sdk
- not:
repo_component_present:
- claude_md
explanation: >
A repository that builds on the Claude Agent SDK but ships no top-level
CLAUDE.md leaves any Claude session that opens the repo with no
project-specific guidance. The conventions, build commands, test
runners, lint scripts, and "do not touch" boundaries a human maintainer
would describe in onboarding are absent, so the agent has to infer them
from the source on every session and frequently guesses wrong —
bypassing the project's lint, picking the wrong test command, or
violating commit conventions that were never written down anywhere it
could read. The cost compounds: each new contributor (human or agent)
repeats the same wrong guesses.
fix: >
Add a CLAUDE.md at the repository root documenting how to build, test,
and lint the project, the coding conventions Claude must respect, the
directories or files Claude must not modify, and any project-specific
safety guardrails (e.g. "never run migrations", "never push to main").
Treat it as the project's agent-facing README and keep it under version
control so reviewers see drift.
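The CSDK-203 match combines two repo-level predicates. The boolean logic can be sketched in Python as follows, assuming `repo_has_sdk_in_code` amounts to finding a `claude_agent_sdk` import in a `.py` file and `claude_md` means a top-level `CLAUDE.md`. The engine's real predicates may inspect more than this; `csdk_203_fires` and the `files` inventory are hypothetical names used only for illustration.

```python
import re

# Matches "import claude_agent_sdk" or "from claude_agent_sdk import ..."
_SDK_IMPORT = re.compile(r"^\s*(?:from|import)\s+claude_agent_sdk\b", re.MULTILINE)


def csdk_203_fires(files: dict[str, str]) -> bool:
    """Return True when the repo imports the SDK but has no root CLAUDE.md.

    `files` maps repo-relative paths to file contents (a hypothetical
    stand-in for the scanner's repo inventory).
    """
    uses_sdk = any(
        path.endswith(".py") and _SDK_IMPORT.search(text) is not None
        for path, text in files.items()
    )
    has_claude_md = "CLAUDE.md" in files  # top-level file only
    return uses_sdk and not has_claude_md
```

Under these assumptions the rule fires once per repo, and adding a root `CLAUDE.md` is enough to clear it.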
32 changes: 32 additions & 0 deletions claude_sdk/tool_definition.yaml
@@ -77,3 +77,35 @@ rules:
refuse to call it at all.
fix: >
Rename to a verb-object form, e.g. `summarize_invoice`, `refund_charge`.

- id: CSDK-008
title: Tool exposes **kwargs without explicit input_schema
severity: medium
confidence: 0.8
language: python
applies_to:
- claude_sdk_tool
scope: tool
match:
all:
- param_name_matches:
exact:
- kwargs
- not:
tool_decorator_kwarg_present:
- input_schema
explanation: >
The Claude Agent SDK derives a tool's JSON input schema from the
function signature. A tool whose accepted arguments live entirely under
`**kwargs` has no typed surface for the SDK to introspect, so the model
sees an empty parameter object and gets no signal about which keys to
send. Calls then either omit data the body requires or invent keys the
body does not handle, and the failure surfaces as a runtime KeyError at
invoke time instead of a clean schema-validation error before the tool
ever runs.
fix: >
Either declare each accepted parameter on the function signature with a
type annotation so the SDK can derive the schema, or pass an explicit
`input_schema=` to the `@tool` decorator — either a JSON Schema dict or
a Pydantic `BaseModel` subclass — so the contract published to the model
matches what the body actually reads from `kwargs`.
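To illustrate why `**kwargs` leaves nothing to introspect, here is a stdlib-only sketch of signature-based schema derivation. `derive_schema` is a hypothetical stand-in and not the SDK's implementation; the two `refund_charge_*` functions are invented examples. It only shows what any signature inspection can and cannot see.

```python
import inspect
from typing import Any

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def derive_schema(func) -> dict[str, Any]:
    """Build a JSON-Schema-like dict from a function's named parameters."""
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        # *args / **kwargs carry no parameter names or types to publish.
        if param.kind in (
            inspect.Parameter.VAR_POSITIONAL,
            inspect.Parameter.VAR_KEYWORD,
        ):
            continue
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


def refund_charge_opaque(**kwargs):
    # The body depends on keys that no schema ever advertises.
    return f"refunded {kwargs['charge_id']} for {kwargs['amount_cents']}"


def refund_charge_typed(charge_id: str, amount_cents: int):
    return f"refunded {charge_id} for {amount_cents}"


opaque_schema = derive_schema(refund_charge_opaque)  # empty "properties"
typed_schema = derive_schema(refund_charge_typed)    # both parameters listed
```

Calling `refund_charge_opaque(charge_id="ch_1")` raises `KeyError` for the missing `amount_cents`, which is the runtime failure the rule describes; the typed variant would be rejected at the schema boundary instead.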
41 changes: 41 additions & 0 deletions google_adk/repo_hygiene.yaml
@@ -0,0 +1,41 @@
policy:
id: google_adk_repo_hygiene
name: Google ADK repo hygiene
category: google_adk
description: >
Repo-scoped hygiene rules for projects that use the Google ADK. Fire once
per scan against the repo manifest and inventory rather than per tool or
agent.

rules:
- id: ADK-201
title: Google ADK project missing CLAUDE.md
severity: low
confidence: 0.9
language: python
applies_to:
- google_adk
scope: repo
match:
all:
- repo_has_sdk_in_code:
- google_adk
- not:
repo_component_present:
- claude_md
explanation: >
The project uses the Google ADK in code but ships no CLAUDE.md at the
repo root. When Claude Code (or any agent following the CLAUDE.md
convention) edits this repo, it has no project-specific guidance on
agent-class choices (LlmAgent vs SequentialAgent vs ParallelAgent vs
LoopAgent), sub_agents composition, FunctionTool wrapping conventions,
required guardrails, or the local test and build commands. The likely
consequence is generated code that violates the project's tool and
agent contracts because nothing in-tree teaches the agent the local
rules.
fix: >
Add a CLAUDE.md at the repo root. State which ADK agent classes the
project uses and why, how tools must be wrapped, any required guardrails
or sandboxing, and the exact test, lint, and build commands. Keep it
short and concrete so an editing agent can act on it without re-deriving
the conventions.
26 changes: 26 additions & 0 deletions google_adk/tool_definition.yaml
@@ -76,3 +76,29 @@ rules:
refuse to call it at all when a better-named alternative is present.
fix: >
Rename to a verb-object form, e.g. `summarize_document`, `fetch_order`.

- id: ADK-008
title: FunctionTool body prints to stdout
severity: low
confidence: 0.7
language: python
applies_to:
- adk_function_tool
scope: tool
match:
has_body_text:
- print(
explanation: >
Google ADK FunctionTool runs inside the agent's process and shares its
stdout with the runtime. A wrapped function that calls `print(...)` for
debug tracing leaks raw arguments (paths, IDs, decoded blob contents)
into the same stream the runner uses for structured logs, mangles JSON
log lines, and can echo secrets pulled from `tool_context.state` into
terminal scrollback or container log shippers. The output is also not
addressable by the model: `print` writes go to the process's stdout, not to
the tool response, so the operator sees noise while the agent sees nothing.
fix: >
Delete debug `print` calls before shipping, or replace them with a
module logger (`logging.getLogger(__name__).debug(...)`) so the operator
can silence them with a level switch. If the data is meant for the
model, return it as part of the tool's structured result instead.
38 changes: 38 additions & 0 deletions openai_sdk/code_execution.yaml
@@ -36,3 +36,41 @@ rules:
surface (e.g. notebook-style), run the exec inside a separate process
with seccomp + no network + no filesystem write, and treat the process
as a single-use sacrificial sandbox.

- id: OAI-017
title: TypeScript tool body calls eval / new Function on dynamic input
severity: high
confidence: 0.9
language: typescript
applies_to:
- openai_tool
scope: tool
match:
has_body_text:
- eval(
- new Function(
- Function(
explanation: >
A TypeScript tool body invokes `eval()` or builds a function from a string
with `new Function(...)` or bare `Function(...)`. When any portion of that
string flows from the model (directly, via tool arguments, or via session
state the model writes to), the call is arbitrary code execution inside the
agent's Node / Worker /
browser runtime. No OS-level sandbox stands between the call and the
runtime's full capabilities: file I/O, env-var access, fetch credentials,
and the agent's own keys in memory are all reachable from the evaluated
string. Even a `with({})`-restricted scope is escapable through
globalThis, prototype chains, and the platform API surface unless
explicitly stripped. The miner-flagged `calculate` tool in the Claude
SDK template feeds a tool argument straight into `eval` to act as a math
evaluator — the canonical worst-case shape. Provisional: this rule loads
and validates today but will not fire until the engine's TypeScript tool
parser ships.
fix: >
Remove `eval` and `new Function` from the tool. For arithmetic, parse
the input with a hardened expression library (`mathjs`, `expr-eval`)
configured to disallow function calls and property access. If the tool
genuinely needs to run model-supplied code, isolate it: a separate
Worker with no bindings (no env, no KV, no secrets), a
`vm.createContext` with an empty global plus a CPU/memory/time limit
(only inside a separate, unprivileged process, because Node documents
`node:vm` as not being a security mechanism), or a sandboxed iframe with
the network and storage origins stripped. Treat the sandbox as single-use
and discard it after every call.
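OAI-017 targets TypeScript, but the "parse instead of eval" idea is language-independent. The sketch below shows it in Python (the language of the other rules in this file) using the standard `ast` module: only numeric literals and `+ - * /` are accepted, so names, calls, and attribute access are rejected before anything runs. The `calculate` name mirrors the tool mentioned in the explanation; the 200-character cap is an arbitrary assumption, and this is an illustration rather than a replacement for a hardened library.

```python
import ast
import operator

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def calculate(expression: str) -> float:
    """Evaluate +, -, *, / over numeric literals; reject everything else."""
    if len(expression) > 200:  # arbitrary cap to bound parse depth and cost
        raise ValueError("expression too long")
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body)


def _eval_node(node: ast.AST) -> float:
    # Plain int/float literals only (bool and str constants are rejected).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left, right = _eval_node(node.left), _eval_node(node.right)
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    # Names, calls, attribute access, subscripts, ** and so on land here.
    raise ValueError(f"unsupported expression element: {type(node).__name__}")
```

`calculate("2 + 3 * 4")` returns `14`, while `calculate("__import__('os').system('id')")` raises `ValueError` because a call node is never evaluated. Exponentiation is deliberately left out so a short input cannot demand an enormous computation.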
45 changes: 45 additions & 0 deletions openai_sdk/idempotency.yaml
@@ -41,3 +41,48 @@ rules:
must also honor the key for this protection to be effective.
fix: >
Add an `idempotency_key: str` parameter and pass it to the backing API.

- id: OAI-019
title: TypeScript mutating tool has no idempotency key
severity: medium
confidence: 0.5
language: typescript
applies_to:
- openai_tool
scope: tool
match:
all:
- name_has_prefix:
- create_
- send_
- delete_
- post_
- update_
- refund_
- charge_
- issue_
- not:
has_body_text:
- idempot
- request_id
- requestId
- txn_id
- txnId
- correlation_id
- correlationId
explanation: >
An OpenAI Agents SDK tool authored in TypeScript whose name implies a
side effect (create_/send_/delete_/refund_/...) has no idempotency token
visible in its body. The SDK retries tool calls on timeouts and
ambiguous failures and the model itself will retry whenever a tool
result reads as inconclusive; without a key threaded through to the
backing API the same action fires twice, producing duplicate tickets,
payments, emails, or deletions. Provisional: this rule loads and
validates today but will not fire until the engine's TypeScript tool
parser ships.
fix: >
Add an `idempotencyKey: string` parameter to the tool's zod schema and
forward it to the underlying API (Stripe `Idempotency-Key` header,
REST `X-Request-ID`, GraphQL `clientMutationId`, etc.). If the backing
service does not honor idempotency keys, document the gap and dedupe
in your own store keyed by the token before issuing the call.
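The fallback in the fix, deduplicating in your own store when the backing service ignores idempotency keys, can be sketched as follows. The sketch is in Python to match the existing Python rule in this file; the same shape applies to a TypeScript tool. `run_once`, `create_ticket`, and the in-memory dict are hypothetical; a real deployment would need a durable store and an atomic check-and-set, which this sketch does not provide.

```python
from typing import Callable

# In-memory stand-in for a durable store (for example a database table).
_results_by_key: dict[str, dict] = {}

# Records each real side effect so the example can show it ran only once.
calls: list[str] = []


def run_once(idempotency_key: str, action: Callable[[], dict]) -> dict:
    """Run `action` at most once per key; replay the stored result on retry."""
    if idempotency_key in _results_by_key:
        return _results_by_key[idempotency_key]
    result = action()
    _results_by_key[idempotency_key] = result
    return result


def create_ticket(title: str, idempotency_key: str) -> dict:
    def _issue() -> dict:
        calls.append(title)  # stands in for the real API call
        return {"ticket_id": len(calls), "title": title}

    return run_once(idempotency_key, _issue)
```

A retried `create_ticket("Printer down", "k1")` returns the stored result without appending to `calls` again, so the duplicate ticket is never issued.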
70 changes: 70 additions & 0 deletions openai_sdk/network.yaml
@@ -67,3 +67,73 @@ rules:
every urllib.request.urlopen call. Surface timeouts as a structured
tool error the model can react to using failure_error_function (see
OAI-004). OAI-005 covers the equivalent gap for requests/httpx callees.

- id: OAI-016
title: TypeScript tool fetch call has no AbortSignal timeout
severity: high
confidence: 0.6
language: typescript
applies_to:
- openai_tool
scope: tool
match:
all:
- has_body_text:
- fetch(
- not:
has_body_text:
- AbortSignal
- AbortController
- 'signal:'
- AbortSignal.timeout
explanation: >
An OpenAI Agents SDK tool authored in TypeScript calls `fetch()` without
attaching an AbortSignal. Node's and the browser's `fetch` have no
implicit timeout; a slow or unresponsive host blocks the tool's
`execute` callback indefinitely, which in turn blocks the agent's run
loop, consumes the conversation's wall-clock budget, and ties up the
worker that owns the request. The miner's examples include Cloudflare
Worker and realtime-agent templates that fetch from URLs interpolated
from tool arguments, so the gap also amplifies SSRF and exfiltration
impact: the call cannot be cancelled even when downstream behaviour
goes obviously wrong. Provisional: this rule loads and validates today
but will not fire until the engine's TypeScript tool parser ships.
fix: >
Attach an AbortSignal that fires after a bounded deadline. Modern
runtimes: `await fetch(url, { signal: AbortSignal.timeout(15_000) })`.
For runtimes without `AbortSignal.timeout`, create an `AbortController`,
schedule `controller.abort()` via `setTimeout`, pass `controller.signal`
to `fetch`, and clear the timer in a `finally` block. Surface the
resulting `AbortError` as a structured tool error the model can react
to (see OAI-004's `failure_error_function` for the Python equivalent).

- id: OAI-018
title: Tool builds outbound URL from non-literal value
severity: medium
confidence: 0.55
language: python
applies_to:
- openai_tool
scope: tool
match:
has_dynamic_url_call: true
explanation: >
An OpenAI Agents SDK tool builds its outbound HTTP URL from a
non-literal value — typically a tool parameter interpolated into an
f-string or concatenated onto a base URL. Because tool arguments are
produced by the model from conversation context (including prior tool
output and user input), an attacker who can shape that context can
steer the request to an attacker-controlled host or to an internal
address the agent's network egress can reach but the model was never
meant to touch. The same channel also leaks request contents (auth
headers, JSON payloads, the model's reasoning) to whichever host the
URL resolves to, so the failure mode is both SSRF and data
exfiltration in one call.
fix: >
Treat the model-supplied value as untrusted input to URL construction.
Validate it against an allow-list of permitted hosts/path segments
before any HTTP client touches it, or look the value up against a
server-side registry and build the URL from the registry's trusted
entry. If the value is meant to be an opaque ID (e.g. a connection_id),
pass it as a query parameter or path segment of a fixed base URL and
reject characters that would let it escape that slot.
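Both remedies in the OAI-018 fix can be sketched with the standard library. The host `api.example.com`, the base path, and the ID pattern are placeholder assumptions; the point is that the model-supplied value is either confined to one path segment of a fixed base URL or checked against an allow-list before any HTTP client sees it.

```python
import re
from urllib.parse import quote, urlsplit

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allow-list
BASE_URL = "https://api.example.com/v1/connections/"  # hypothetical fixed base
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def build_connection_url(connection_id: str) -> str:
    """Treat the value as an opaque ID confined to a single path segment."""
    if not _ID_PATTERN.fullmatch(connection_id):
        raise ValueError("connection_id contains disallowed characters")
    return BASE_URL + quote(connection_id, safe="")


def check_allowed_url(url: str) -> str:
    """Accept a full URL only if its scheme and host are on the allow-list."""
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"URL not on allow-list: {url!r}")
    return url
```

Checking `parts.hostname` rather than searching the raw string matters: `https://api.example.com@evil.test/` contains the allowed name but resolves to `evil.test`, and is rejected here. An HTTP client that follows redirects can still leave the allow-list, so redirect handling needs its own check.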
41 changes: 41 additions & 0 deletions openai_sdk/repo_hygiene.yaml
@@ -0,0 +1,41 @@
policy:
id: openai_sdk_repo_hygiene
name: OpenAI Agents SDK repo hygiene
category: openai_sdk
description: >
Repo-scoped hygiene rules for projects that use the OpenAI Agents SDK.
Fire once per scan against the repo manifest and inventory rather than per
tool or agent.

rules:
- id: OAI-202
title: OpenAI Agents project missing CLAUDE.md
severity: low
confidence: 0.9
language: python
applies_to:
- openai_agents
scope: repo
match:
all:
- repo_has_sdk_in_code:
- openai_agents
- not:
repo_component_present:
- claude_md
explanation: >
The project uses the OpenAI Agents SDK in code but ships no CLAUDE.md at
the repo root. When Claude Code (or any agent honoring the CLAUDE.md
convention) edits this repo, it has no project-specific guidance on
Agent vs SandboxAgent choice, handoff topology, required input/output
guardrails, tool_choice settings, or the local test and build commands.
The likely consequence is generated code that bypasses the project's
safety contracts because nothing in-tree teaches the agent the local
rules.
fix: >
Add a CLAUDE.md at the repo root. State whether the project uses Agent
or SandboxAgent, list required guardrails (input_guardrails,
output_guardrails) and tool_choice conventions, note any handoff or
tracing policy, and give the exact test, lint, and build commands. Keep
it short and concrete so an editing agent can act on it without
re-deriving the conventions.