
feat: add LiteLLM as LLM provider #119

Merged
chauncygu merged 3 commits into SafeRL-Lab:main from RheagalFire:feat/add-litellm-provider on May 12, 2026

Conversation

@RheagalFire
Contributor

Summary

  • Adds LiteLLMProvider as a new LLM adapter in cc_kernel/runner/llm/, enabling access to 100+ providers (OpenAI, Anthropic, Google, Azure, Bedrock, Ollama, etc.) via a single unified SDK
  • Implements the same Provider protocol as AnthropicProvider: __call__(LlmRequest) -> LlmResponse + stream(LlmRequest, on_delta) -> LlmResponse (a minimal sketch follows this list)
  • Ships as an optional dependency: pip install cheetahclaws[litellm]
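
For context, a minimal sketch of the Provider protocol shape, using only the names quoted in this PR body; the Protocol spelling itself is an assumption, not the repo's actual definition:

from typing import Callable, Protocol

class Provider(Protocol):
    def __call__(self, request: "LlmRequest") -> "LlmResponse": ...
    def stream(
        self,
        request: "LlmRequest",
        on_delta: Callable[[str], None],
    ) -> "LlmResponse": ...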

Changes

  • cc_kernel/runner/llm/litellm_provider.py - new LiteLLMProvider with:
    • litellm.completion() with drop_params=True
    • stream() with on_delta callback for streaming text deltas
    • Multi-turn messages + system prompt support
    • Tool calls parsed into the canonical {id, name, input} format (see the parsing sketch after this list)
    • Token usage tracking
  • pyproject.toml - added litellm optional dependency (pip install cheetahclaws[litellm]), also included in all extra
  • requirements.txt - added litellm>=1.60.0,<2.0.0
  • tests/test_litellm_provider.py - 12 unit tests (all passing)
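
For illustration, a hypothetical version of the tool-call normalization described above, assuming litellm's OpenAI-style tool_call objects (id, function.name, function.arguments as a JSON string); the merged helper may differ:

import json

def _parse_tool_calls(raw_tool_calls):
    # Normalize OpenAI-style tool calls into the canonical {id, name, input} shape.
    parsed = []
    for call in raw_tool_calls or []:
        fn = getattr(call, "function", None)
        if fn is None or not fn.name:
            continue  # skip malformed entries rather than failing the whole response
        try:
            args = json.loads(fn.arguments or "{}")
        except json.JSONDecodeError:
            args = {}
        if not isinstance(args, dict):
            args = {}  # coerce JSON-valid-but-non-dict arguments (e.g. "[1,2]") to {}
        parsed.append({"id": call.id, "name": fn.name, "input": args})
    return parsed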

Tests

Unit tests (12/12 passing):

$ PYTHONPATH=. python -m pytest tests/test_litellm_provider.py -v
test_default_timeout PASSED
test_custom_api_key PASSED
test_no_lazy_state PASSED
test_calls_litellm_completion PASSED
test_omits_api_key_when_none PASSED
test_system_prompt_included PASSED
test_multi_turn_messages PASSED
test_returns_llm_response PASSED
test_rejects_non_llm_request PASSED
test_api_error_raises_provider_unavailable PASSED
test_stream_rejects_non_callable_on_delta PASSED
test_litellm_in_requirements PASSED
12 passed in 1.18s

Live E2E tests (3/3 passing against a real API):

import os

from cc_kernel.runner.llm.litellm_provider import LiteLLMProvider
from cc_kernel.runner.llm.provider import LlmRequest

provider = LiteLLMProvider(api_key=os.environ["ANTHROPIC_FOUNDRY_API_KEY"])

req = LlmRequest(model="anthropic/claude-sonnet-4-6", user="What is 2+2?", max_tokens=10)
resp = provider(req)
print(resp.text, resp.tokens_input, resp.tokens_output)

Test 1 (basic): text="4", tokens_in=20, tokens_out=5
Test 2 (stream): text="OK", chunks=1
Test 3 (system): text="Ahoy!"
ALL 3 E2E TESTS PASSED

Example usage

from cc_kernel.runner.llm.litellm_provider import LiteLLMProvider
from cc_kernel.runner.llm.provider import LlmRequest

# LiteLLM reads provider keys from env automatically
provider = LiteLLMProvider()

# Basic call
req = LlmRequest(
    model="anthropic/claude-sonnet-4-20250514",
    system="You are a security expert.",
    user="Find SQL injection vectors in this query...",
)
resp = provider(req)
print(resp.text)

# Streaming
def on_token(delta):
    print(delta, end="", flush=True)

resp = provider.stream(req, on_delta=on_token)
See https://docs.litellm.ai/docs/providers for 100+ supported model strings.

Impact

  • Additive only, existing AnthropicProvider untouched
  • Follows the same Provider protocol (__call__ + stream)
  • Same error types (ProviderUnavailable, ProviderInvalidRequest)
  • litellm is an optional dependency (pip install cheetahclaws[litellm])
  • drop_params=True silently drops provider-unsupported kwargs (illustrated below)
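
As a quick illustration of the drop_params behavior (the model string and the extra kwarg here are arbitrary examples, not part of this PR):

import litellm

# frequency_penalty is not supported by every backend; with drop_params=True
# litellm strips it for providers that would otherwise reject the call.
resp = litellm.completion(
    model="ollama/llama3",
    messages=[{"role": "user", "content": "ping"}],
    frequency_penalty=0.2,
    drop_params=True,
)
print(resp.choices[0].message.content)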

@RheagalFire
Contributor Author

cc @chauncygu

@chauncygu
Contributor

Hi @RheagalFire, thanks for the PR. A few blockers before this can merge:

  1. Description vs. diff mismatch. Body says litellm is an optional dep, but the third commit moved it into [project] dependencies in pyproject.toml and the core block of requirements.txt. There's no litellm extra and it's not in all. Please move it back to [project.optional-dependencies] and add it to all.

  2. Not wired into either user path. cc_kernel/runner/llm/__main__.py:_select_provider() only knows mock / scripted / anthropic, and the top-level providers.py (the registry the CLI / Web UI consult for --model X) has no litellm entry. So no end-to-end caller can reach the new class — the 12 mocked tests pass by constructing it directly. Please add a litellm branch to _select_provider() and a "litellm" entry to PROVIDERS in providers.py.

  3. stream() drops usage + tool_calls. It always returns tokens_input=0, tokens_output=0, tool_calls=(). The runner emits ledger charge messages from those fields and decides tool dispatch via response.is_tool_use, so streamed calls bypass quota and silently lose tool calls. Please request stream_options={"include_usage": True} and read usage off the final chunk; assemble tool_calls from streaming deltas, or raise ProviderInvalidRequest when request.tools and request.stream are both set (see the sketch after this list).

  4. cost_micro hardcoded to 0 in both paths. AnthropicProvider keeps a pricing table with a fallback — please mirror that, or set metadata={"cost_unknown": True} so callers can detect it.
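
To make blocker 3 concrete, a hypothetical sketch of the suggested fix, reading usage off the final streamed chunk; field access follows litellm's OpenAI-compatible chunk shape, and the helper name is illustrative:

import litellm

def stream_with_usage(model, messages, on_delta):
    # Stream text deltas to on_delta and return (text, tokens_in, tokens_out).
    text_parts, tokens_in, tokens_out = [], 0, 0
    stream = litellm.completion(
        model=model,
        messages=messages,
        stream=True,
        stream_options={"include_usage": True},
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            text_parts.append(chunk.choices[0].delta.content)
            on_delta(chunk.choices[0].delta.content)
        usage = getattr(chunk, "usage", None)
        if usage:  # the final chunk carries usage when include_usage is set
            tokens_in, tokens_out = usage.prompt_tokens, usage.completion_tokens
    return "".join(text_parts), tokens_in, tokens_out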

Smaller items: live E2E results in the PR body aren't in the test suite (please add a skipif-gated tests/e2e_litellm_provider.py; see the sketch below); test_litellm_in_requirements reads Path("requirements.txt") and fails outside the repo root; __call__ and stream() duplicate ~30 lines of message-building (factor like anthropic_provider._build_kwargs); no doc update under docs/guides/ — would help to call out when to prefer litellm/<provider>/<model> over the existing custom/ / ollama/ / anthropic/ adapters (Bedrock SigV4, Azure deployment routing, Vertex auth are the real value-add).

Direction is good — fix blockers 1–4 and I'm happy to merge. Thanks!
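
For the E2E gap, a sketch of what the requested skipif-gated live test file could look like; the CC_LITELLM_E2E gate is taken from the follow-up commit further down, everything else is illustrative:

import os

import pytest

from cc_kernel.runner.llm.litellm_provider import LiteLLMProvider
from cc_kernel.runner.llm.provider import LlmRequest

pytestmark = pytest.mark.skipif(
    os.environ.get("CC_LITELLM_E2E") != "1",
    reason="set CC_LITELLM_E2E=1 and provider credentials to run live tests",
)

def test_basic_completion():
    provider = LiteLLMProvider()
    req = LlmRequest(model="anthropic/claude-sonnet-4-20250514",
                     user="What is 2+2?", max_tokens=10)
    assert "4" in provider(req).text

The cwd fix for test_litellm_in_requirements is just resolving from __file__, e.g. Path(__file__).resolve().parents[1] / "requirements.txt" instead of Path("requirements.txt").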

chauncygu merged commit 3bdda8f into SafeRL-Lab:main on May 12, 2026
6 checks passed
chauncygu added a commit that referenced this pull request on May 12, 2026
…ull entry in docs/news.md

- README News: one-paragraph May 12 entry summarising what PR #119
  shipped, what the follow-up fixed (dep classification, ledger,
  streaming, registry wiring), and pointing to docs/news.md.
- docs/news.md: long-form entry covering motivation, the four
  integration gaps the original PR left open, the implementation of
  each fix, the five additional bugs caught in self-review, test
  count delta (12 → 23 unit + 3 e2e), full-suite numbers, and the
  doc surface added in this branch.

"(latest)" marker moved off the May 11 daemon F-4 entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@chauncygu
Contributor

Hi @RheagalFire, thanks for your great contribution. I've merged and fixed the issues.

Thanks.

tsint pushed a commit to tsint/cheetahclaws that referenced this pull request on May 13, 2026
…fix ledger, wire into providers.py

Addresses the upstream review blockers on PR SafeRL-Lab#119:

- Move litellm from core deps to [project.optional-dependencies] under
  the `litellm` extra, and add to `all`. PR SafeRL-Lab#119's body said it was
  optional but the diff put it in core, forcing every install to pull
  litellm's transitive chain.
- Lazy-import the SDK from inside LiteLLMProvider so the module stays
  importable on machines without litellm installed (matches the
  AnthropicProvider contract; verified via test that runs in a dev env
  without litellm installed).
- Wire `litellm` into both user-facing paths:
    • cc_kernel/runner/llm/__main__.py:_select_provider gains a
      `litellm` branch so CC_LLM_PROVIDER=litellm reaches the runner.
    • Top-level providers.PROVIDERS gains a `litellm` entry and a new
      stream_litellm() generator so the CLI / Web UI can resolve
      --model litellm/<provider>/<model>. Without this, no end-to-end
      caller could reach the new class.
- Populate cost_micro via litellm.completion_cost (was hard-coded 0,
  silently zeroing every charge through this path). When pricing isn't
  available, set metadata['cost_unknown']=True so the ledger can tell
  an unpriced model apart from a real $0 (Ollama, free NIM tier).
- Reassemble streaming chunks via litellm.stream_chunk_builder with
  stream_options={"include_usage": True} so the final response carries
  token counts, tool_calls, and the real finish_reason. Streaming
  previously dropped all three, breaking the RFC 0022 multi-iteration
  tool-call loop.
- Map litellm.exceptions.{AuthenticationError, BadRequestError,
  NotFoundError, UnsupportedParamsError} to ProviderInvalidRequest
  instead of swallowing every error into ProviderUnavailable.
- Defensive tool_call parsing: skip malformed entries (missing
  `function`, empty `name`) and coerce JSON-valid-but-non-dict
  arguments (e.g. "null", "[1,2]") to {} so a single broken tool_call
  no longer crashes the entire response via LlmResponse validation.
- Factor message-building into _build_messages / _build_params so
  __call__ and stream() no longer duplicate ~30 lines.
- Surface litellm_provider and actual_model in response metadata to
  aid cross-provider debugging.
- Add tests/e2e_litellm_provider.py (3 live-API tests, skipif-gated on
  CC_LITELLM_E2E=1 + per-provider credentials).
- Make test_litellm_in_requirements cwd-agnostic by resolving paths
  from __file__ rather than Path("requirements.txt").
- Add docs/guides/recipes.md section explaining when to prefer
  litellm/<provider>/<model> over custom/ — Bedrock SigV4, Azure
  deployment routing, Vertex AI service-account JWTs are the
  value-add.

Tests: 29 unit tests in test_litellm_provider.py (was 12), 3 e2e in
e2e_litellm_provider.py. Full non-e2e suite: 2222 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
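
Condensed for reference, the streaming, cost, and error-mapping fixes described in this commit could look roughly like the following. This is a sketch under the assumption that ProviderInvalidRequest and ProviderUnavailable live in cc_kernel.runner.llm.provider; the branch's actual structure may differ:

import litellm
from litellm import stream_chunk_builder

from cc_kernel.runner.llm.provider import ProviderInvalidRequest, ProviderUnavailable

# Errors that indicate a bad request rather than a transient outage.
_INVALID_REQUEST_ERRORS = (
    litellm.exceptions.AuthenticationError,
    litellm.exceptions.BadRequestError,
    litellm.exceptions.NotFoundError,
    litellm.exceptions.UnsupportedParamsError,
)

def _complete(params):
    # Map SDK exceptions onto the kernel's error types instead of
    # swallowing everything into ProviderUnavailable.
    try:
        return litellm.completion(**params)
    except _INVALID_REQUEST_ERRORS as exc:
        raise ProviderInvalidRequest(str(exc)) from exc
    except Exception as exc:
        raise ProviderUnavailable(str(exc)) from exc

def _finalize_stream(chunks, messages):
    # Reassemble streamed chunks so usage, tool_calls, and finish_reason survive.
    response = stream_chunk_builder(chunks, messages=messages)
    try:
        cost = litellm.completion_cost(completion_response=response)
        return response, int(cost * 1_000_000), {}
    except Exception:
        # Pricing unknown: flag it so the ledger can tell this apart from a real $0.
        return response, 0, {"cost_unknown": True}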