Skip to content

feat(scripts): add OpenAI-compatible shim for Claude Code OAuth subscription#85

Open
jarrettj wants to merge 1 commit into
NousResearch:mainfrom
jarrettj:feat/claude-oauth-shim
Open

feat(scripts): add OpenAI-compatible shim for Claude Code OAuth subscription#85
jarrettj wants to merge 1 commit into
NousResearch:mainfrom
jarrettj:feat/claude-oauth-shim

Conversation

@jarrettj
Copy link
Copy Markdown

Adds scripts/claude_oai_shim.py — a FastAPI app exposing /v1/chat/completions that routes through the official claude-agent-sdk, letting any OpenAI-speaking client (litellm, DSPy, etc.) use a Claude Code subscription instead of a metered Anthropic Console API key.

Addresses issue #9 (use Anthropic Max plan subscription instead of API keys).

Changes

  • scripts/claude_oai_shim.py: FastAPI OpenAI-compat shim
  • scripts/README_claude_shim.md: usage documentation
  • pyproject.toml: adds claude-shim optional dependency group

…ription

Adds `scripts/claude_oai_shim.py` — a FastAPI app exposing
`/v1/chat/completions` that routes through the official
`claude-agent-sdk`, letting any OpenAI-speaking client (litellm, DSPy,
etc.) use a Claude Code subscription instead of a metered Anthropic
Console API key.

Why: `evolve_skill` and friends default to OpenAI clients via litellm.
If you already pay for Claude Code, you'd rather not provision a second
billing account just to run evolutions.

Usage:

    pip install -e ".[claude-shim]"
    python scripts/claude_oai_shim.py &
    export OPENAI_BASE_URL=http://127.0.0.1:8765/v1
    export OPENAI_API_KEY=anything

    python -m evolution.skills.evolve_skill \
        --skill github-code-review \
        --optimizer-model openai/claude-opus-4-7 \
        --eval-model openai/claude-haiku-4-5

Verified end-to-end: completes a 5-iteration MIPROv2 run against
github-code-review, optimizer (opus) + eval (haiku) both routed through
the shim, ~29 min wall time, score 54.28%.

Adds a `claude-shim` optional dependency group so the three runtime
deps (claude-agent-sdk, fastapi, uvicorn) don't bloat the default
install.

Caveats:
- Streaming not yet supported (stream: true is buffered)
- Token usage counts are best-effort
- Users should review Claude Code's acceptable-use terms before
  automating against it (note included in README)
Copy link
Copy Markdown
Author

@jarrettj jarrettj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Review

Found 1 Critical and 3 Warning issues that should be addressed before merge. See inline comments for details.

Severity Count
🔴 Critical 1
⚠️ Warning 3
💡 Suggestion 4

Note: Cannot request changes on own PR — submitting as comment review.


from __future__ import annotations

import asyncio
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Warning: import asyncio is unused throughout the file. This will cause a ruff lint failure in CI (F401 'asyncio' imported but unused). Remove this import.

return {"ok": True}


@app.post("/v1/chat/completions")
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Critical: No authentication on /v1/chat/completions. Any process on the local machine can call this endpoint and consume the owner's Claude Code subscription without restriction. While the server binds to 127.0.0.1 (mitigating remote network access), add at minimum an optional API key check via SHIM_API_KEY env var validated against the Authorization header. Example:

import os
from fastapi import Depends, HTTPException, Security
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

_bearer = HTTPBearer(auto_error=False)

def _verify(creds: HTTPAuthorizationCredentials = Security(_bearer)):
    key = os.environ.get('SHIM_API_KEY')
    if key and (not creds or creds.credentials != key):
        raise HTTPException(status_code=401, detail='Unauthorized')

This prevents malicious local processes or supply-chain compromises from silently draining the subscription.


text_chunks: list[str] = []
usage = {"input_tokens": 0, "output_tokens": 0}
async for message in query(prompt=prompt, options=options):
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Warning: No error handling around the claude-agent-sdk async call. Auth failures, network errors, rate limits, or SDK version mismatches will bubble up as raw 500 tracebacks. Clients like litellm expect an OpenAI-shaped error envelope. Wrap the async for block in try/except Exception as exc and return:

{"error": {"message": "...", "type": "server_error", "code": null}}

# ResultMessage carries usage at the end
u = getattr(message, "usage", None)
if u:
usage["input_tokens"] = u.get("input_tokens", 0) or 0
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Warning: u.get('input_tokens', 0) assumes usage is a dict. If claude-agent-sdk returns an object (dataclass or Pydantic model) for ResultMessage.usage, this raises AttributeError. Prefer getattr(u, 'input_tokens', None) or 0 to handle both shapes safely.

@app.post("/v1/chat/completions")
async def chat_completions(request: Request) -> JSONResponse:
body = await request.json()
messages = body.get("messages", [])
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Suggestion: No validation that messages is non-empty. An empty list sends an empty string to Claude, wasting quota. Add an early-exit 422:

if not messages:
    from fastapi import HTTPException
    raise HTTPException(status_code=422, detail='messages must be non-empty')



@app.post("/v1/chat/completions")
async def chat_completions(request: Request) -> JSONResponse:
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Suggestion: No request-size guard. A caller can send an arbitrarily large messages array, consuming significant Claude Code quota in one request. Consider a configurable cap on total message content length before forwarding to the SDK.

resolved = _resolve_model(model)
prompt, system = _messages_to_prompt(messages)

options = ClaudeAgentOptions(model=resolved, system_prompt=system) if system else ClaudeAgentOptions(model=resolved)
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Suggestion: Duplicate ClaudeAgentOptions constructor calls. Simplify:

kw = {'model': resolved}
if system:
    kw['system_prompt'] = system
options = ClaudeAgentOptions(**kw)

{
"index": 0,
"message": {"role": "assistant", "content": response_text},
"finish_reason": "stop",
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Suggestion: finish_reason is hardcoded to 'stop'. If the SDK signals truncation or content-filter stop this is silently swallowed. Map the SDK result status to 'stop', 'length', or 'content_filter' as appropriate.

@jarrettj
Copy link
Copy Markdown
Author

Code Review Summary — PR #85 (feat/claude-oauth-shim)

Verdict: Changes Requested (posted as COMMENT — cannot self-approve/block own PR)


🔴 Critical (1)

[scripts/claude_oai_shim.py:114] No authentication on /v1/chat/completions

The endpoint has zero authentication. Any process running on the local machine can freely call it and consume the owner's Claude Code subscription quota. While 127.0.0.1 binding blocks remote access, local processes (scripts, test runners, malicious deps) are entirely unrestricted.

Fix: Add an optional SHIM_API_KEY env var check against the Authorization: Bearer header via FastAPI Depends. If not set, warn at startup but allow; if set, enforce. This provides opt-in protection without breaking zero-config usage.


⚠️ Warnings (3)

  1. [scripts/claude_oai_shim.py:53] Unused import asyncio — Will fail ruff CI (F401). Remove.

  2. [scripts/claude_oai_shim.py:127] No error handling around SDK call — Auth failures, network errors, and rate limits from claude-agent-sdk bubble up as raw 500 tracebacks. litellm and DSPy expect an OpenAI-shaped error envelope {"error": {...}}. Wrap the async for block in try/except Exception.

  3. [scripts/claude_oai_shim.py:138] u.get() on potentially non-dict usage objectResultMessage.usage may be a dataclass/Pydantic model in some SDK versions, not a dict. u.get() will raise AttributeError. Use getattr(u, 'input_tokens', None) or 0.


💡 Suggestions (4)

  1. [L117] Validate non-empty messages — Empty messages list sends an empty string to Claude, wasting quota. Add a 422 guard.
  2. [L115] No request-size cap — A caller can send an arbitrarily large messages array. Consider a configurable content-length limit.
  3. [L123] Duplicate ClaudeAgentOptions constructor — Use conditional kwargs dict to avoid repeating the call.
  4. [L153] finish_reason hardcoded to 'stop' — Map SDK result status to 'stop', 'length', or 'content_filter' for OpenAI compatibility.

✅ Looks Good

  • Localhost-only binding (127.0.0.1:8765) correctly limits network attack surface
  • claude-agent-sdk is a legitimate PyPI package (current version: 0.2.82)
  • pyproject.toml correctly uses an optional dependency group [claude-shim] to avoid bloating the default install
  • _resolve_model() gracefully defaults unknown models to sonnet
  • _messages_to_prompt() correctly handles both string and content-block-array message content
  • README documents caveats (streaming unsupported, acceptable-use terms) clearly
  • No hardcoded secrets, no SQL, no XSS vectors, no path traversal

Missing Tests

No test file added for scripts/claude_oai_shim.py. At minimum, unit tests for _resolve_model() and _messages_to_prompt() (both pure functions, no SDK dependency) would validate the core mapping logic without requiring a live Claude Code session.


Reviewed by Hermes Agent (curl path — no gh CLI)

Copy link
Copy Markdown
Author

@jarrettj jarrettj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — feat(scripts): add OpenAI-compatible shim for Claude Code OAuth subscription

Overall the idea is sound and the implementation is mostly clean, but there are several issues that need to be addressed before merging.


Critical / Must-Fix

1. import asyncio is unused (will fail ruff/flake8 CI)
asyncio is imported on line 53 but never referenced anywhere in the file. The async for loop uses query() from the SDK directly — Python's async machinery doesn't require an explicit asyncio import here. This will cause a lint failure in the CI pipeline (ruff is configured in this repo).

2. usage access pattern is fragile — u.get(...) assumes a dict, not an object
Lines 138-139: u.get("input_tokens", 0) will AttributeError at runtime if message.usage is a dataclass/object (which is the typical shape for SDK response types) rather than a plain dict. Should use getattr(u, "input_tokens", 0) consistently with how content and text are accessed elsewhere in the same loop.

3. No error handling in chat_completions — any SDK failure returns a bare 500
If query() raises (e.g. auth expired, rate limited, network error), FastAPI will return a bare 500 with no JSON body. OpenAI-compatible clients expect an error response shaped as {"error": {"message": "...", "type": "...", "code": ...}}. A try/except block with an appropriate JSONResponse(status_code=...) is needed.


Warnings / Should-Fix

4. _resolve_model silently maps all unknown model names to sonnet
If a caller passes "gpt-4" or "gpt-4o" (common when reusing existing DSPy config), the shim silently routes to sonnet with no warning. The README says to use claude-* names but there is no enforcement at the API layer. Consider logging a warning or returning a 400 for unrecognized families.

5. stream: true is silently ignored
The README and docstring mention this limitation, but the endpoint itself never reads body.get("stream") — it simply ignores the field. A well-behaved shim should at minimum log a warning. Clients that assert on stream behavior (some LiteLLM retry paths) may produce unexpected results.

6. model field in response echoes the requested model, not the resolved model
Line 148 returns "model": model (the raw OpenAI-style string from the request). Returning the actual resolved Claude model string would be more transparent and easier to debug.

7. Credentials path typo in README
Line 23 of README_claude_shim.md: ~/.claude/.credentials.json — double dot before credentials. The actual Claude Code credentials path is ~/.claude/credentials.json (single dot directory, no leading dot on the file).


Suggestions / Nice-to-Have

8. Port is hardcoded — no --port/--host CLI arguments
uvicorn.run(app, host="127.0.0.1", port=8765) is fixed. A simple argparse or click argument for --port and optionally --host would make this more composable (e.g. for CI environments or Docker).

9. No tests
Given the repo already has pytest + pytest-asyncio in dev deps, even a minimal test (mock query(), assert response shape) would add meaningful coverage and prevent regressions.

10. claude-opus-4-1-20250805 date stamp will go stale
The MODEL_MAP hardcodes a date-stamped opus model. Claude model IDs change; this will silently route to an outdated or deprecated model once Anthropic updates the version. Consider documenting the update cadence or sourcing the model ID from an environment variable / config.


Looks Good

  • Optional dependency group claude-shim in pyproject.toml is correctly scoped and won't affect existing installs.
  • The _messages_to_prompt() function correctly handles both string and list-of-block content formats, covering the OpenAI vision/tools content shape.
  • Localhost-only binding (127.0.0.1) is a good default for a dev-only tool — avoids accidental exposure.
  • The uuid4 ID generation and timestamp in the response correctly mimics the OpenAI response envelope.
  • Documentation is clear about acceptable-use caveats.

Suggested verdict: REQUEST_CHANGES — items 1–3 are blocking (unused import breaks CI, usage dict access will AttributeError in production, missing error handling degrades client compatibility). Items 4–7 are strongly recommended before merge.

(Note: PR #300 was requested for review but does not exist in this repository — the highest PR number is #85. This review covers PR #85, the most recent open PR.)

@jarrettj
Copy link
Copy Markdown
Author

Code Review Summary — PR #85: feat(scripts): add OpenAI-compatible shim for Claude Code OAuth subscription

What this PR does: Adds scripts/claude_oai_shim.py — a FastAPI server exposing /v1/chat/completions that proxies requests through the claude-agent-sdk, routing them to the authenticated Claude Code subscription. This lets DSPy, LiteLLM, and other OpenAI-speaking tools use a Claude Code Max plan instead of a separate Anthropic Console API key. Also adds scripts/README_claude_shim.md and a claude-shim optional dependency group in pyproject.toml.


Critical (blocking)

# Issue File Line
1 import asyncio is unused — will fail ruff CI claude_oai_shim.py 53
2 u.get("input_tokens") assumes usage is a dict; SDK objects will AttributeError — use getattr(u, ...) instead claude_oai_shim.py 138-139
3 No try/except in chat_completions — any SDK error returns a bare 500 without the OpenAI error envelope {"error": {...}} claude_oai_shim.py 127

Warnings (should fix)

# Issue
4 _resolve_model silently maps gpt-4, gpt-4o, etc. to sonnet — no log warning or 400 response
5 stream: true field is never read — silently ignored without any warning to the caller
6 Response model field echoes the raw requested name, not the resolved Claude model ID
7 README typo: ~/.claude/.credentials.json should be ~/.claude/credentials.json (extra leading dot)

Suggestions (nice-to-have)

  • Add --port/--host CLI args to main() for composability
  • Add at least one pytest test (mock query(), verify response shape)
  • claude-opus-4-1-20250805 will go stale — consider an env var override

Looks good

  • pyproject.toml optional group is correctly scoped
  • _messages_to_prompt() handles both string and list-of-block content
  • 127.0.0.1-only binding is the right default for a dev tool
  • UUID + timestamp response envelope matches OpenAI shape
  • Acceptable-use caveat is prominently documented

Verdict: REQUEST_CHANGES — items 1–3 are blocking before merge.

Copy link
Copy Markdown
Author

@jarrettj jarrettj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hermes Agent Review — PR #85

Verdict: Would Request Changes — 2 bugs, 3 warnings, 2 suggestions (self-review; GitHub doesn't allow REQUEST_CHANGES on your own PR).

🔴 Bugs (fix before merge)

  • claude_oai_shim.py:139u.get() assumes usage is a dict; SDK may return a typed object → AttributeError at runtime
  • claude_oai_shim.py:118stream=true silently ignored; clients expecting SSE will hang or parse incorrectly

⚠️ Warnings

  • claude_oai_shim.py:120 — no error handling around query() — SDK exceptions return raw 500 with full traceback
  • pyproject.toml:34claude-agent-sdk>=0.1.0 is too loose; a breaking 0.x bump will pull in incompatible code silently
  • README_claude_shim.md:43 — example uses claude-opus-4-7 but MODEL_MAP resolves opus to claude-opus-4-1-20250805

💡 Suggestions

  • Port 8765 hard-coded with no --port CLI arg or CLAUDE_SHIM_PORT env var
  • No unit tests for _resolve_model() or _messages_to_prompt()

✅ Looks Good

  • Clean docstring with usage, caveats, and model routing table
  • from __future__ import annotations + typed signatures throughout
  • Correct if __name__ == '__main__' guard
  • Optional dep group claude-shim keeps the default install slim
  • All 139 existing tests pass; ruff: clean

Reviewed by Hermes Agent

u = getattr(message, "usage", None)
if u:
usage["input_tokens"] = u.get("input_tokens", 0) or 0
usage["output_tokens"] = u.get("output_tokens", 0) or 0
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: u.get() assumes usage is a plain dict. The claude-agent-sdk may return a typed object (Pydantic model, dataclass, etc.) — calling .get() on it raises AttributeError at runtime whenever actual usage data is returned.

Suggested fix:

u = getattr(message, 'usage', None)
if u:
    def _tok(obj, key):
        if isinstance(obj, dict):
            return obj.get(key, 0) or 0
        return getattr(obj, key, 0) or 0
    usage['input_tokens'] = _tok(u, 'input_tokens')
    usage['output_tokens'] = _tok(u, 'output_tokens')

async def chat_completions(request: Request) -> JSONResponse:
body = await request.json()
messages = body.get("messages", [])
model = body.get("model", "claude-sonnet-4-5")
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Bug: stream=True is silently downgraded to a buffered JSON response. OpenAI-compatible clients (litellm, DSPy, openai-python) that set stream=True expect a text/event-stream SSE response. Returning plain JSON will cause them to hang waiting for chunks or mis-parse the response.

Short-term fix — return an explicit 400 so callers know streaming isn't supported yet:

if body.get('stream'):
    return JSONResponse(status_code=400, content={
        'error': {'message': 'Streaming is not yet supported by this shim.', 'type': 'invalid_request_error'}
    })

messages = body.get("messages", [])
model = body.get("model", "claude-sonnet-4-5")

resolved = _resolve_model(model)
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Warning: No try/except around the query() call. Any SDK exception (auth failure, model not found, rate-limit) will surface as a raw 500 with a full Python traceback in the response body.

Wrap the handler:

try:
    # ... existing code ...
except Exception as exc:
    return JSONResponse(
        status_code=500,
        content={'error': {'message': str(exc), 'type': 'internal_error'}}
    )

Comment thread pyproject.toml
"darwinian-evolver",
]
claude-shim = [
"claude-agent-sdk>=0.1.0",
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Warning: claude-agent-sdk>=0.1.0 has no upper bound. For an early-stage 0.x package, any minor release can introduce breaking API changes. Recommend:

"claude-agent-sdk>=0.1.0,<1.0",

--skill github-code-review \
--optimizer-model openai/claude-opus-4-7 \
--eval-model openai/claude-haiku-4-5 \
--iterations 5
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Inconsistency: This example passes --optimizer-model openai/claude-opus-4-7, but MODEL_MAP resolves all opus-* names to claude-opus-4-1-20250805. Either update MODEL_MAP to use claude-opus-4-7 (or the latest opus alias), or update this example to match what the shim actually resolves.

@jarrettj
Copy link
Copy Markdown
Author

Code Review Summary — PR #85

Verdict: Would Request Changes (2 bugs, 3 warnings, 2 suggestions)

Note: GitHub prevents REQUEST_CHANGES on your own PR. Full inline review is posted above.

🔴 Bugs

  • scripts/claude_oai_shim.py:139u.get() called on usage assumes it's a dict. The SDK may return a typed object (Pydantic/dataclass) → AttributeError at runtime when usage data arrives. Use getattr(u, key, 0) as a fallback.
  • scripts/claude_oai_shim.py:118stream=True is silently ignored and a non-streaming JSON response is returned. OpenAI clients expecting SSE (text/event-stream) will hang or mis-parse. Return a 400 error explicitly rather than silently degrading.

⚠️ Warnings

  • scripts/claude_oai_shim.py:120 — No try/except around the query() call. SDK errors (auth, rate-limit, network) return a raw 500 with Python traceback. Wrap in a proper JSON error handler.
  • pyproject.toml:34claude-agent-sdk>=0.1.0 has no upper bound. For a pre-1.0 package this risks silent breaking changes. Add <1.0.
  • scripts/README_claude_shim.md:43 — Example uses claude-opus-4-7 but MODEL_MAP resolves opus to claude-opus-4-1-20250805. Align the two.

💡 Suggestions

  • Port 8765 is hard-coded with no --port flag or CLAUDE_SHIM_PORT env var — makes it awkward when the port is in use.
  • No unit tests for _resolve_model() or _messages_to_prompt() — both are pure functions, easy to test in isolation.

✅ Looks Good

  • Clean, well-structured module docstring with usage examples and caveats
  • from __future__ import annotations + full type annotations throughout
  • Correct if __name__ == '__main__' guard
  • Optional claude-shim dep group keeps default install slim
  • All 139 existing tests pass; ruff reports no linting issues
  • Good idea — solves a real pain point for Max-plan users running evolutions

Reviewed by Hermes Agent

Copy link
Copy Markdown
Author

@jarrettj jarrettj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hermes Agent Review

Found 3 warnings and 3 suggestions. Core logic is solid — model routing, message flattening, and the OpenAI response shape are all correct. All 139 existing tests pass. Three issues worth fixing before this lands in main: an unused import, missing error handling on query(), and a potential AttributeError on the usage object. See inline comments.


from __future__ import annotations

import asyncio
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Warning (linting): asyncio is imported but never used — ruff flags this as F401. Remove this import line.

resolved = _resolve_model(model)
prompt, system = _messages_to_prompt(messages)

options = ClaudeAgentOptions(model=resolved, system_prompt=system) if system else ClaudeAgentOptions(model=resolved)
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Suggestion: The conditional is unnecessary if ClaudeAgentOptions accepts system_prompt=None. Most SDKs treat None as 'no system prompt'. Simplify to:

options = ClaudeAgentOptions(model=resolved, system_prompt=system)


text_chunks: list[str] = []
usage = {"input_tokens": 0, "output_tokens": 0}
async for message in query(prompt=prompt, options=options):
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Warning: No error handling around the query() call. If the claude-agent-sdk raises (auth failure, rate limit, network error), FastAPI returns a raw 500 traceback instead of an OpenAI-compatible error body. Clients that parse {"error": {"message": ...}} will get a confusing response.

Suggested fix:

try:
    async for message in query(prompt=prompt, options=options):
        ...
except Exception as exc:
    return JSONResponse(
        status_code=500,
        content={"error": {"message": str(exc), "type": "internal_error"}},
    )

# ResultMessage carries usage at the end
u = getattr(message, "usage", None)
if u:
usage["input_tokens"] = u.get("input_tokens", 0) or 0
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Warning: u.get() assumes usage is a dict, but the claude-agent-sdk may return a Usage dataclass or named object — calling .get() on it raises AttributeError at runtime.

Use getattr instead:

usage["input_tokens"] = getattr(u, "input_tokens", 0) or 0
usage["output_tokens"] = getattr(u, "output_tokens", 0) or 0

(Or confirm the SDK always returns a dict here and add a comment noting that.)

from fastapi.responses import JSONResponse

MODEL_MAP = {
"opus": "claude-opus-4-1-20250805",
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Suggestion: Model versions are hardcoded and will go stale as Claude releases new versions. Consider env-var overrides so users can pin to newer versions without editing source:

import os
MODEL_MAP = {
    "opus":   os.getenv("CLAUDE_SHIM_OPUS",   "claude-opus-4-1-20250805"),
    "sonnet": os.getenv("CLAUDE_SHIM_SONNET", "claude-sonnet-4-5"),
    "haiku":  os.getenv("CLAUDE_SHIM_HAIKU",  "claude-haiku-4-5"),
}

@app.post("/v1/chat/completions")
async def chat_completions(request: Request) -> JSONResponse:
body = await request.json()
messages = body.get("messages", [])
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Suggestion: stream: true is silently ignored. A client requesting streaming will receive a non-streaming response with no indication the mode was not honoured. Consider a warning log so this is observable:

import logging
if body.get("stream"):
    logging.getLogger(__name__).warning(
        "stream=true requested but not supported — returning buffered response"
    )

@jarrettj
Copy link
Copy Markdown
Author

Code Review Summary — PR #85

Verdict: Comments (non-blocking) — 3 warnings, 3 suggestions. Core implementation is correct; two runtime bugs should be fixed before this sees production use.


⚠️ Warnings

  • scripts/claude_oai_shim.py:53 — Unused asyncio import. Ruff F401 — remove it.

  • scripts/claude_oai_shim.py:127 — No error handling around query(). SDK errors (auth failure, rate limit, network) will surface as raw FastAPI 500 tracebacks instead of structured {"error": {"message": ...}} JSON. OpenAI-compatible clients that parse the error body will get confused. Wrap in try/except and return a well-formed error response.

  • scripts/claude_oai_shim.py:138u.get() assumes the usage object is a dict. If the claude-agent-sdk returns a dataclass or object, this raises AttributeError at runtime. Use getattr(u, "input_tokens", 0) instead.


💡 Suggestions

  • scripts/claude_oai_shim.py:64 — Hardcoded model version strings (claude-opus-4-1-20250805, etc.) will go stale. Consider env-var overrides (CLAUDE_SHIM_OPUS, etc.) so users can update without editing source.

  • scripts/claude_oai_shim.py:117stream: true is silently ignored. Add a warning log so the behaviour is observable when streaming clients connect.

  • scripts/claude_oai_shim.py:123 — Verbose conditional for ClaudeAgentOptions — if the SDK accepts system_prompt=None, simplify to a single constructor call.


✅ Looks Good

  • Model routing logic (_resolve_model) is clean and has sensible fallback behaviour.
  • _messages_to_prompt correctly handles both string and content-block message formats.
  • OpenAI response shape (choices, usage, id, created, model) is spec-compliant.
  • pyproject.toml optional dep group is the right pattern — doesn't bloat default installs.
  • README and module docstring are thorough; caveats (streaming, acceptable-use) are clearly called out.
  • No secrets, credentials, or debug statements committed.
  • All 139 existing tests pass.

Reviewed by Hermes Agent · PR #85

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant