
Fix/litellm provider followup #122

Merged
chauncygu merged 3 commits into main from fix/litellm-provider-followup on May 12, 2026

Conversation

@chauncygu
Contributor

No description provided.

chauncygu and others added 3 commits May 12, 2026 11:35
…, wire into providers.py

Addresses the upstream review blockers on PR #119:

- Move litellm from core deps to [project.optional-dependencies] under
  the `litellm` extra, and add it to `all`. PR #119's description said it
  was optional, but the diff put it in core, forcing every install to
  pull litellm's transitive dependency chain.
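  A minimal sketch of the pyproject.toml shape this describes (the extra
  names come from the commit text; version pins and the full `all` list
  are assumptions):

  ```toml
  [project.optional-dependencies]
  litellm = ["litellm"]
  # `all` aggregates the extras, so `pip install ".[all]"` still pulls litellm
  all = ["litellm"]
  ```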
- Lazy-import the SDK from inside LiteLLMProvider so the module stays
  importable on machines without litellm installed (matches the
  AnthropicProvider contract; verified via test that runs in a dev env
  without litellm installed).
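  A sketch of the lazy-import contract described above (the class name is
  from the commit; the accessor and error message are assumptions):

  ```python
  class LiteLLMProvider:
      """The module imports cleanly even when litellm is absent;
      the SDK is loaded on first use instead of at import time."""

      def __init__(self, model: str):
          self.model = model
          self._sdk = None  # resolved lazily

      def _litellm(self):
          if self._sdk is None:
              try:
                  import litellm  # deferred so bare installs stay importable
              except ImportError as exc:
                  raise RuntimeError(
                      "LiteLLMProvider requires the optional dependency: "
                      "pip install '.[litellm]'"
                  ) from exc
              self._sdk = litellm
          return self._sdk
  ```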
- Wire `litellm` into both user-facing paths:
    • cc_kernel/runner/llm/__main__.py:_select_provider gains a
      `litellm` branch so CC_LLM_PROVIDER=litellm reaches the runner.
    • Top-level providers.PROVIDERS gains a `litellm` entry and a new
      stream_litellm() generator so the CLI / Web UI can resolve
      --model litellm/<provider>/<model>. Without this, no end-to-end
      caller could reach the new class.
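  The route shape the registry has to resolve can be sketched as a small
  hypothetical helper (the real wiring lives in providers.PROVIDERS and
  _select_provider; this function name is invented for illustration):

  ```python
  def split_litellm_model(model_arg: str):
      """Split a --model litellm/<provider>/<model> argument into the
      provider segment and the model id handed to litellm itself."""
      prefix, _, rest = model_arg.partition("/")
      if prefix != "litellm" or "/" not in rest:
          raise ValueError(
              f"expected litellm/<provider>/<model>, got {model_arg!r}")
      provider, _, model = rest.partition("/")
      return provider, model
  ```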
- Populate cost_micro via litellm.completion_cost (was hard-coded 0,
  silently zeroing every charge through this path). When pricing isn't
  available, set metadata['cost_unknown']=True so the ledger can tell
  an unpriced model apart from a real $0 (Ollama, free NIM tier).
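  A sketch of the cost path, assuming litellm's `completion_cost` is
  injected as a callable so the unpriced fallback is testable without
  pricing data (the helper name is an assumption):

  ```python
  def cost_micro_from(response, completion_cost):
      """Return (cost_micro, metadata). `completion_cost` is
      litellm.completion_cost in production; it raises when a model
      has no pricing entry."""
      metadata = {}
      try:
          usd = completion_cost(completion_response=response)
      except Exception:
          # Unpriced model: flag it so the ledger can tell this
          # apart from a genuinely free model.
          metadata["cost_unknown"] = True
          return 0, metadata
      # A real $0 (e.g. Ollama) comes back as 0.0 and stays a plain 0.
      return int(round(usd * 1_000_000)), metadata
  ```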
- Reassemble streaming chunks via litellm.stream_chunk_builder with
  stream_options={"include_usage": True} so the final response carries
  token counts, tool_calls, and the real finish_reason. Streaming
  previously dropped all three, breaking the RFC 0022 multi-iteration
  tool-call loop.
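  The reassembly step can be sketched with the chunk builder injected
  (litellm.stream_chunk_builder in production); the request itself must
  pass stream_options={"include_usage": True} for usage to appear:

  ```python
  def drain_stream(chunks_iter, chunk_builder, messages):
      """Collect text deltas while buffering every chunk, then rebuild
      the final response so usage, tool_calls and finish_reason survive."""
      chunks, parts = [], []
      for chunk in chunks_iter:
          chunks.append(chunk)
          for choice in getattr(chunk, "choices", []) or []:
              delta = getattr(choice.delta, "content", None)
              if delta:
                  parts.append(delta)
      final = chunk_builder(chunks, messages=messages)
      return "".join(parts), final
  ```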
- Map litellm.exceptions.{AuthenticationError, BadRequestError,
  NotFoundError, UnsupportedParamsError} to ProviderInvalidRequest
  instead of swallowing every error into ProviderUnavailable.
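  The mapping can be sketched as follows; the two stand-in exception
  classes mirror the project's real types (names from the commit), and
  the tuple of litellm exception classes is passed in:

  ```python
  class ProviderInvalidRequest(Exception): ...
  class ProviderUnavailable(Exception): ...

  def map_litellm_error(exc, invalid_types):
      """invalid_types: the litellm.exceptions classes that signal a
      caller mistake (AuthenticationError, BadRequestError, NotFoundError,
      UnsupportedParamsError). Everything else stays ProviderUnavailable
      so transient failures remain retryable."""
      if isinstance(exc, invalid_types):
          return ProviderInvalidRequest(str(exc))
      return ProviderUnavailable(str(exc))
  ```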
- Defensive tool_call parsing: skip malformed entries (missing
  `function`, empty `name`) and coerce JSON-valid-but-non-dict
  arguments (e.g. "null", "[1,2]") to {} so a single broken tool_call
  no longer crashes the entire response via LlmResponse validation.
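  The defensive parse can be sketched like so (dict-shaped tool_calls
  assumed for brevity; the real entries are SDK objects):

  ```python
  import json

  def parse_tool_calls(raw_calls):
      """Skip malformed entries and coerce JSON-valid-but-non-dict
      argument strings ("null", "[1,2]") to {} instead of crashing."""
      parsed = []
      for call in raw_calls or []:
          fn = call.get("function")
          if not isinstance(fn, dict) or not fn.get("name"):
              continue  # malformed: missing `function` or empty name
          try:
              args = json.loads(fn.get("arguments") or "{}")
          except (TypeError, json.JSONDecodeError):
              args = {}
          if not isinstance(args, dict):
              args = {}  # valid JSON, wrong shape
          parsed.append({"name": fn["name"], "arguments": args})
      return parsed
  ```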
- Factor message-building into _build_messages / _build_params so
  __call__ and stream() no longer duplicate ~30 lines.
- Surface litellm_provider and actual_model in response metadata to
  aid cross-provider debugging.
- Add tests/e2e_litellm_provider.py (3 live-API tests, skipif-gated on
  CC_LITELLM_E2E=1 + per-provider credentials).
- Make test_litellm_in_requirements cwd-agnostic by resolving paths
  from __file__ rather than Path("requirements.txt").
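  The cwd-agnostic pattern looks roughly like this (the parents[1]
  depth assumes tests/ sits one level below the repo root):

  ```python
  from pathlib import Path

  # Resolve repo files relative to this test file, not the process cwd,
  # so the test passes no matter where pytest is invoked from.
  REPO_ROOT = Path(__file__).resolve().parents[1]
  REQUIREMENTS = REPO_ROOT / "requirements.txt"
  ```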
- Add docs/guides/recipes.md section explaining when to prefer
  litellm/<provider>/<model> over custom/ — Bedrock SigV4, Azure
  deployment routing, Vertex AI service-account JWTs are the
  value-add.

Tests: 29 unit tests in test_litellm_provider.py (was 12), 3 e2e in
e2e_litellm_provider.py. Full non-e2e suite: 2222 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ble, usage section

Three README updates to surface the new `litellm/` provider so users
can find it without reading the recipes guide:

- Optional extras list gains `pip install ".[litellm]"` with a one-line
  hint about its purpose (AWS Bedrock / Azure / Vertex auth).
- Supported Models table adds three rows — `litellm/bedrock/<model>`,
  `litellm/azure/<deployment>`, `litellm/vertex_ai/<model>` — covering
  the auth-heavy providers that aren't reachable via the existing
  custom/ adapter.
- New "LiteLLM" subsection under "Usage: Closed-Source API Models"
  with concrete env-var + invocation examples for the three flagship
  use cases, and an explicit pointer toward custom/ for OpenAI-shaped
  endpoints so users don't pull litellm when they don't need it.

i18n READMEs (CN/JP/ES/DE/PT) intentionally left for the maintainer to
re-translate at their cadence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ull entry in docs/news.md

- README News: one-paragraph May 12 entry summarising what PR #119
  shipped, what the follow-up fixed (dep classification, ledger,
  streaming, registry wiring), and pointing to docs/news.md.
- docs/news.md: long-form entry covering motivation, the four
  integration gaps the original PR left open, the implementation of
  each fix, the five additional bugs caught in self-review, test
  count delta (12 → 23 unit + 3 e2e), full-suite numbers, and the
  doc surface added in this branch.

"(latest)" marker moved off the May 11 daemon F-4 entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@chauncygu chauncygu merged commit 0f21e94 into main May 12, 2026
6 checks passed
