Fix/litellm provider followup #122
Merged
Conversation
…, wire into providers.py

Addresses the upstream review blockers on PR #119:

- Move litellm from core deps to `[project.optional-dependencies]` under the `litellm` extra, and add it to `all`. PR #119's body said it was optional, but the diff put it in core, forcing every install to pull litellm's transitive dependency chain.
- Lazy-import the SDK from inside LiteLLMProvider so the module stays importable on machines without litellm installed (matches the AnthropicProvider contract; verified by a test that runs in a dev env without litellm installed).
- Wire `litellm` into both user-facing paths:
  - cc_kernel/runner/llm/__main__.py:_select_provider gains a `litellm` branch so CC_LLM_PROVIDER=litellm reaches the runner.
  - Top-level providers.PROVIDERS gains a `litellm` entry and a new stream_litellm() generator so the CLI / Web UI can resolve `--model litellm/<provider>/<model>`.

  Without this, no end-to-end caller could reach the new class.
- Populate cost_micro via litellm.completion_cost (it was hard-coded to 0, silently zeroing every charge through this path). When pricing isn't available, set metadata['cost_unknown']=True so the ledger can tell an unpriced model apart from a real $0 (Ollama, free NIM tier).
- Reassemble streaming chunks via litellm.stream_chunk_builder with stream_options={"include_usage": True} so the final response carries token counts, tool_calls, and the real finish_reason. Streaming previously dropped all three, breaking the RFC 0022 multi-iteration tool-call loop.
- Map litellm.exceptions.{AuthenticationError, BadRequestError, NotFoundError, UnsupportedParamsError} to ProviderInvalidRequest instead of swallowing every error into ProviderUnavailable.
- Defensive tool_call parsing: skip malformed entries (missing `function`, empty `name`) and coerce JSON-valid-but-non-dict arguments (e.g. "null", "[1,2]") to {} so a single broken tool_call no longer crashes the entire response via LlmResponse validation.
- Factor message-building into _build_messages / _build_params so __call__ and stream() no longer duplicate ~30 lines.
- Surface litellm_provider and actual_model in response metadata to aid cross-provider debugging.
- Add tests/e2e_litellm_provider.py (3 live-API tests, skipif-gated on CC_LITELLM_E2E=1 plus per-provider credentials).
- Make test_litellm_in_requirements cwd-agnostic by resolving paths from __file__ rather than Path("requirements.txt").
- Add a docs/guides/recipes.md section explaining when to prefer litellm/<provider>/<model> over custom/: Bedrock SigV4, Azure deployment routing, and Vertex AI service-account JWTs are the value-add.

Tests: 29 unit tests in test_litellm_provider.py (was 12), 3 e2e in e2e_litellm_provider.py. Full non-e2e suite: 2222 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
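The lazy-import contract described above can be sketched as follows. This is an illustrative shape, not the actual LiteLLMProvider source; the class and method names here are assumptions.

```python
import importlib


class LiteLLMProvider:
    """Sketch: importing this module must never require litellm itself."""

    def _sdk(self):
        # Defer the import until first use so the module stays importable
        # on machines that did not install the optional `litellm` extra.
        try:
            return importlib.import_module("litellm")
        except ImportError as exc:
            raise RuntimeError(
                "litellm is not installed; install with `pip install '.[litellm]'`"
            ) from exc
```

The point of the contract is that `import`-time side effects never touch the optional dependency, so registry wiring (PROVIDERS, _select_provider) can reference the class unconditionally.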
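The cost_micro / cost_unknown split can be illustrated with a small helper. In the real provider the pricing call is litellm.completion_cost; here it is injected as `cost_fn` so the sketch stays dependency-free, and the helper name is hypothetical.

```python
def resolve_cost_micro(response, cost_fn, metadata):
    """Return the call's cost in micro-dollars, flagging unpriced models.

    `cost_fn` stands in for litellm.completion_cost. When pricing is
    unavailable (it returns None or raises), record
    metadata['cost_unknown'] = True so the ledger can tell an unpriced
    model apart from a genuinely free one (Ollama, free NIM tier).
    """
    try:
        usd = cost_fn(response)
    except Exception:
        usd = None
    if usd is None:
        metadata["cost_unknown"] = True
        return 0
    # Convert dollars to integer micro-dollars for the ledger.
    return int(round(usd * 1_000_000))
```

Returning 0 with an explicit flag, rather than a bare 0, is what lets downstream accounting distinguish "free" from "unknown".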
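Why the stream reassembly matters can be shown with a generic accumulator. This is not litellm.stream_chunk_builder itself, only a dependency-free sketch of the idea: content arrives as deltas, while usage and finish_reason arrive only on the final chunk (and only when stream_options={"include_usage": True} is set).

```python
def assemble_stream(chunks):
    """Fold streamed delta chunks into one final response dict.

    Mirrors what a chunk builder does: concatenate text deltas and
    keep the usage / finish_reason carried by the closing chunk.
    """
    text, usage, finish_reason = [], None, None
    for chunk in chunks:
        text.append(chunk.get("delta") or "")
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
        if chunk.get("finish_reason"):
            finish_reason = chunk["finish_reason"]
    return {"content": "".join(text), "usage": usage,
            "finish_reason": finish_reason}
```

Dropping the final chunk's fields is exactly how the earlier implementation lost token counts, tool_calls, and the real finish_reason, which in turn broke the multi-iteration tool-call loop.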
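The defensive tool_call parsing rules above can be sketched in isolation (the function name and dict shapes are illustrative, not the provider's actual types):

```python
import json


def parse_tool_calls(raw_calls):
    """Keep only well-formed tool calls; one bad entry must not
    crash the whole response."""
    parsed = []
    for call in raw_calls or []:
        fn = (call or {}).get("function") or {}
        name = fn.get("name")
        if not name:
            continue  # skip malformed entries: missing `function` or empty `name`
        try:
            args = json.loads(fn.get("arguments") or "{}")
        except (TypeError, ValueError):
            args = {}
        if not isinstance(args, dict):
            # JSON-valid but non-dict arguments, e.g. "null" or "[1,2]"
            args = {}
        parsed.append({"name": name, "arguments": args})
    return parsed
```

Coercing to `{}` rather than raising is the key design choice: response-level validation (LlmResponse in this codebase) then sees a well-typed arguments dict for every surviving call.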
…ble, usage section

Three README updates to surface the new `litellm/` provider so users can find it without reading the recipes guide:

- The optional-extras list gains `pip install ".[litellm]"` with a one-line hint about its purpose (AWS Bedrock / Azure / Vertex auth).
- The Supported Models table adds three rows (`litellm/bedrock/<model>`, `litellm/azure/<deployment>`, `litellm/vertex_ai/<model>`), covering the auth-heavy providers that aren't reachable via the existing custom/ adapter.
- A new "LiteLLM" subsection under "Usage: Closed-Source API Models" gives concrete env-var + invocation examples for the three flagship use cases, plus an explicit pointer toward custom/ for OpenAI-shaped endpoints so users don't pull in litellm when they don't need it.

The i18n READMEs (CN/JP/ES/DE/PT) are intentionally left for the maintainer to re-translate at their own cadence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
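The Bedrock flavor of those README examples would look roughly like the sketch below. The CLI entry point and exact flag shape are assumptions (the source only confirms the `--model litellm/<provider>/<model>` form and the CC_LLM_PROVIDER env var); the AWS credential variables are the standard ones LiteLLM reads for SigV4 signing.

```shell
# Hypothetical invocation shape; <cli>, <key>, <secret>, <model> are placeholders.
export AWS_ACCESS_KEY_ID=<key>
export AWS_SECRET_ACCESS_KEY=<secret>
export AWS_REGION_NAME=us-east-1

# Route through LiteLLM instead of the custom/ adapter:
<cli> --model "litellm/bedrock/<model>"

# Or, for the runner path:
CC_LLM_PROVIDER=litellm <cli> ...
```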
…ull entry in docs/news.md

- README News: a one-paragraph May 12 entry summarising what PR #119 shipped, what the follow-up fixed (dependency classification, ledger, streaming, registry wiring), and pointing to docs/news.md.
- docs/news.md: a long-form entry covering the motivation, the four integration gaps the original PR left open, the implementation of each fix, the five additional bugs caught in self-review, the test-count delta (12 → 23 unit + 3 e2e), full-suite numbers, and the doc surface added in this branch. The "(latest)" marker moved off the May 11 daemon F-4 entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>