
refactor: Decouple ModelFacade from LiteLLM via ModelClient adapter#373

Open
nabinchha wants to merge 40 commits into main from nm/overhaul-model-facade-guts-pr2

Conversation

@nabinchha
Contributor

📋 Summary

Decouples ModelFacade from direct LiteLLM router usage by introducing a ModelClient adapter layer. The facade now operates entirely on canonical request/response types (ChatCompletionRequest, ChatCompletionResponse, etc.) instead of raw LiteLLM objects, making it testable without LiteLLM and preparing for future client backends.

This is PR 2 of the model facade overhaul series, building on the canonical types and LiteLLMBridgeClient introduced in PR 1.
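To make the injection pattern concrete, here is a minimal, self-contained sketch. The types and method surface are deliberately simplified stand-ins for the canonical types described in this PR; the real classes carry many more fields (tool calls, usage, streaming options, etc.):

```python
from dataclasses import dataclass
from typing import List, Protocol

# Simplified stand-ins for the canonical request/response types.
@dataclass
class ChatCompletionRequest:
    model: str
    messages: List[str]

@dataclass
class ChatCompletionResponse:
    content: str

class ModelClient(Protocol):
    """Adapter boundary: anything that speaks canonical types can back the facade."""
    def completion(self, request: ChatCompletionRequest) -> ChatCompletionResponse: ...
    def close(self) -> None: ...

class ModelFacade:
    """The facade no longer builds its own router; it receives a client."""
    def __init__(self, client: ModelClient, model: str) -> None:
        self._client = client
        self._model = model

    def completion(self, messages: List[str]) -> ChatCompletionResponse:
        request = ChatCompletionRequest(model=self._model, messages=messages)
        return self._client.completion(request)

    def close(self) -> None:
        self._client.close()

class StubClient:
    """Test double: exercising the facade needs no LiteLLM import at all."""
    def completion(self, request: ChatCompletionRequest) -> ChatCompletionResponse:
        return ChatCompletionResponse(content=f"echo:{request.messages[-1]}")

    def close(self) -> None:
        pass

facade = ModelFacade(StubClient(), model="gpt-4o-mini")
print(facade.completion(["hello"]).content)  # → echo:hello
```

This is the testability win the summary refers to: the stub implements the protocol structurally, so no LiteLLM objects ever enter facade tests.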

🔄 Changes

✨ Added

  • clients/factory.py — create_model_client() factory that handles provider resolution, API key setup, and LiteLLM router construction
  • TransportKwargs — Unified transport preparation that flattens extra_body into top-level kwargs and separates extra_headers
  • _raise_from_provider_error() in errors.py — Maps canonical ProviderError to DataDesignerError subclasses
  • extract_message_from_exception_string() for parsing human-readable messages from stringified LiteLLM exceptions
  • make_stub_completion_response() test helper for creating canonical test fixtures
  • close()/aclose() lifecycle methods on ModelFacade and ModelRegistry
  • New test file test_parsing.py for TransportKwargs behavior
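A rough sketch of the TransportKwargs flattening described above. The body/headers field names follow this description, but the real implementation may differ in shape and field handling:

```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class TransportKwargs:
    """Illustrative: flatten extra_body into top-level kwargs, keep headers separate."""
    body: Dict[str, Any] = field(default_factory=dict)
    headers: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_request(cls, request: Dict[str, Any]) -> "TransportKwargs":
        headers = dict(request.get("extra_headers") or {})
        # Forward only fields that were actually set (drop None optionals).
        body = {
            k: v
            for k, v in request.items()
            if k not in ("extra_body", "extra_headers") and v is not None
        }
        # Flatten extra_body entries into the top-level kwargs.
        body.update(request.get("extra_body") or {})
        return cls(body=body, headers=headers)

tk = TransportKwargs.from_request({
    "model": "gpt-4o",
    "temperature": None,            # dropped: unset optional field
    "extra_body": {"top_k": 40},    # flattened to top level
    "extra_headers": {"X-Trace": "abc"},
})
print(tk.body)     # {'model': 'gpt-4o', 'top_k': 40}
print(tk.headers)  # {'X-Trace': 'abc'}
```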

🔧 Changed

  • ModelFacade — Now accepts a ModelClient via constructor injection instead of creating its own CustomRouter. All methods use canonical types (ChatCompletionRequest/Response, EmbeddingRequest/Response, ImageGenerationRequest/Response)
  • MCPFacade — Operates on canonical ChatCompletionResponse and ToolCall types instead of raw LiteLLM response objects; removed internal tool call normalization (_extract_tool_calls, _normalize_tool_call) since parsing now happens in the client layer
  • LiteLLMBridgeClient — Uses TransportKwargs.from_request() instead of collect_non_none_optional_fields() for cleaner request forwarding
  • ProviderError — Refactored from @dataclass to regular Exception subclass for proper exception semantics
  • Usage tracking — Consolidated three separate tracking methods into single _track_usage() operating on canonical Usage type
  • Test suite — All facade and MCP tests now use canonical types instead of StubResponse/StubMessage/FakeResponse/FakeMessage; tests mock ModelClient instead of CustomRouter
  • model_facade_factory — Now creates a ModelClient first, then injects it into ModelFacade

🗑️ Removed

  • _try_extract_base64() and direct image parsing from ModelFacade (moved to client layer in PR1)
  • _get_litellm_deployment() from ModelFacade (moved to create_model_client())
  • collect_non_none_optional_fields() from parsing.py (replaced by TransportKwargs)
  • Three separate usage tracking methods (_track_token_usage_from_completion, _track_token_usage_from_embedding, _track_token_usage_from_image_diffusion) replaced by unified _track_usage()
  • StubResponse/FakeResponse usage in tests (replaced by canonical types)
  • Several tests for internal normalization logic that is now handled by the client layer
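The consolidation into a single _track_usage() might look roughly like this. The Usage fields and stat names are illustrative, not the project's exact attributes:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Usage:
    """Simplified stand-in for the canonical Usage type."""
    input_tokens: int = 0
    output_tokens: int = 0

class ModelUsageStats:
    def __init__(self) -> None:
        self.input_tokens = 0
        self.output_tokens = 0
        self.successful_requests = 0
        self.failed_requests = 0

    def _track_usage(self, usage: Optional[Usage], is_request_successful: bool) -> None:
        # One method replaces the three per-modality trackers: completion,
        # embedding, and image responses all surface the same canonical Usage.
        if usage is not None:
            self.input_tokens += usage.input_tokens
            self.output_tokens += usage.output_tokens
        if is_request_successful:
            self.successful_requests += 1
        else:
            self.failed_requests += 1

stats = ModelUsageStats()
stats._track_usage(Usage(input_tokens=10, output_tokens=5), is_request_successful=True)
stats._track_usage(None, is_request_successful=False)
print(stats.input_tokens, stats.successful_requests, stats.failed_requests)  # 10 1 1
```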

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

  • facade.py — Core refactor: constructor signature change (client replaces secret_resolver + internal router), all methods now use canonical types
  • errors.py — New _raise_from_provider_error() mapping function and the ProviderError exception refactor
  • types.py — TransportKwargs design: flattening extra_body vs keeping extra_headers separate
  • mcp/facade.py — Significant simplification from canonical types; verify the _convert_canonical_tool_calls_to_dicts bridge is correct

🤖 Generated with Claude Code

nabinchha and others added 30 commits February 19, 2026 15:50
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…provements

- Wrap all LiteLLM router calls in try/except to normalize raw exceptions
  into canonical ProviderError at the bridge boundary (blocking review item)
- Extract reusable response-parsing helpers into clients/parsing.py for
  shared use across future native adapters
- Add async image parsing path using httpx.AsyncClient to avoid blocking
  the event loop in agenerate_image
- Add retry_after field to ProviderError for future retry engine support
- Fix _to_int_or_none to parse numeric strings from providers
- Create test conftest.py with shared mock_router/bridge_client fixtures
- Parametrize duplicate image generation and error mapping tests
- Add tests for exception wrapping across all bridge methods
…larity

- Parse RFC 7231 HTTP-date strings in Retry-After header (used by
  Azure and Anthropic during rate-limiting) in addition to numeric
  delay-seconds
- Clarify collect_non_none_optional_fields docstring explaining why
  f.default is None is the correct check for optional field forwarding
- Add tests for HTTP-date and garbage Retry-After values
- Fix misleading comment about prompt field defaults in _IMAGE_EXCLUDE
- Handle list-format detail arrays in _extract_structured_message for
  FastAPI/Pydantic validation errors
- Document scope boundary for vision content in collect_raw_image_candidates
- Replace @dataclass + __post_init__ with explicit __init__ that calls
  super().__init__ properly, avoiding brittle field-ordering dependency
- Store cause via __cause__ only, removing the redundant .cause attr
- Update match pattern in handle_llm_exceptions for non-dataclass type
- Rename shadowed local `fields` to `optional_fields` in TransportKwargs
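The Retry-After handling described in the commits above (numeric delay-seconds plus RFC 7231 HTTP-dates) can be sketched with the standard library. parse_retry_after is an illustrative name, not the PR's actual helper:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

def parse_retry_after(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return a delay in seconds from a Retry-After header value, accepting
    numeric delay-seconds or an RFC 7231 HTTP-date; None for garbage input."""
    value = value.strip()
    try:
        return float(value)  # numeric delay-seconds, e.g. "120"
    except ValueError:
        pass
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):  # unparseable garbage (exception type varies by Python version)
        return None
    if when.tzinfo is None:  # RFC 7231 dates are GMT; treat a naive result as UTC
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())

print(parse_retry_after("120"))         # → 120.0
print(parse_retry_after("not-a-date"))  # → None
```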
@nabinchha nabinchha requested a review from a team as a code owner March 6, 2026 00:01
@nabinchha nabinchha changed the base branch from nm/overhaul-model-facade-guts-pr1 to main March 6, 2026 00:02
@greptile-apps
Contributor

greptile-apps bot commented Mar 6, 2026

Greptile Summary

This PR decouples ModelFacade from direct LiteLLM router usage by introducing a ModelClient adapter layer. ModelFacade now accepts a ModelClient via constructor injection and operates entirely on canonical request/response types (ChatCompletionRequest, ChatCompletionResponse, EmbeddingRequest/Response, ImageGenerationRequest/Response), making it independently testable and paving the way for alternative backends. The new LiteLLMBridgeClient encapsulates all LiteLLM-specific logic, and TransportKwargs cleanly handles the extra_body/extra_headers flattening pattern LiteLLM expects.

Key changes:

  • ModelFacade now uses constructor injection for ModelClient instead of building its own CustomRouter
  • create_model_client() factory centralizes provider resolution, API key setup, and router construction
  • MCPFacade simplified significantly — raw LiteLLM response parsing moved to the client layer
  • ProviderError refactored from @dataclass to a proper Exception subclass with _raise_from_provider_error() mapping it to DataDesignerError subclasses
  • Three separate usage tracking methods unified into a single _track_usage() operating on canonical Usage type
  • close()/aclose() lifecycle methods added to both ModelFacade and ModelRegistry
  • Test suite migrated from StubResponse/FakeResponse to canonical types, mocking ModelClient instead of CustomRouter

Issues found:

  • extract_tool_calls in parsing.py coerces a missing tool name to "" (via or ""), silently creating a ToolCall(name="") that propagates up and fails with a misleading MCPConfigurationError("Tool '' not found…") rather than an explicit early-validation error
  • In generate_image/agenerate_image, when the API returns a response with no images and ImageGenerationError is raised, the finally block still records the request as successful (response is not None, so is_request_successful=True), which can overstate success rates in usage statistics
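A possible shape for the suggested fix to the second issue, distinguishing API-level from application-level success (all names here are illustrative, not the facade's actual code):

```python
from dataclasses import dataclass
from typing import Callable, List

class ImageGenerationError(Exception):
    pass

@dataclass
class ImageResponse:
    images: List[str]

@dataclass
class Stats:
    successes: int = 0
    failures: int = 0

def generate_image(call: Callable[[], ImageResponse], stats: Stats) -> List[str]:
    # Count the request as successful only once we know images were actually
    # returned, instead of inferring success from `response is not None`
    # inside the finally block.
    is_success = False
    try:
        response = call()
        if not response.images:
            raise ImageGenerationError("provider returned no images")
        is_success = True
        return response.images
    finally:
        if is_success:
            stats.successes += 1
        else:
            stats.failures += 1

stats = Stats()
try:
    generate_image(lambda: ImageResponse(images=[]), stats)
except ImageGenerationError:
    pass
print(stats.successes, stats.failures)  # → 0 1
```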

Confidence Score: 4/5

  • Safe to merge with two minor logic issues: tool name coercion produces misleading errors on malformed tool calls, and image generation error tracking overstates success rates.
  • The refactoring is architecturally sound and well-tested. The ModelClient abstraction correctly decouples ModelFacade from LiteLLM internals, most previous concerns are resolved, and the two remaining issues are edge cases with clear fixes: (1) early validation of empty tool names in the parsing layer, and (2) restructuring error tracking to distinguish between API-level success and application-level success. Neither affects core functionality or prevents the code from working in normal use cases.
  • packages/data-designer-engine/src/data_designer/engine/models/clients/parsing.py (empty tool name coercion) and packages/data-designer-engine/src/data_designer/engine/models/facade.py (image generation error tracking)

Class Diagram

%%{init: {'theme': 'neutral'}}%%
classDiagram
    class ModelFacade {
        -ModelClient _client
        -ModelConfig _model_config
        -ModelUsageStats _usage_stats
        +completion(messages) ChatCompletionResponse
        +acompletion(messages) ChatCompletionResponse
        +generate(prompt) tuple
        +generate_text_embeddings(texts) list
        +generate_image(prompt) list
        +close()
        +aclose()
        -_build_chat_completion_request() ChatCompletionRequest
        -_track_usage(usage, is_request_successful)
    }

    class ModelClient {
        <<Protocol>>
        +completion(request) ChatCompletionResponse
        +acompletion(request) ChatCompletionResponse
        +embeddings(request) EmbeddingResponse
        +aembeddings(request) EmbeddingResponse
        +generate_image(request) ImageGenerationResponse
        +agenerate_image(request) ImageGenerationResponse
        +close()
        +aclose()
    }

    class LiteLLMBridgeClient {
        -LiteLLMRouter _router
        -str provider_name
        +completion(request) ChatCompletionResponse
        +acompletion(request) ChatCompletionResponse
        +embeddings(request) EmbeddingResponse
        +generate_image(request) ImageGenerationResponse
    }

    class TransportKwargs {
        +dict body
        +dict headers
        +from_request(request) TransportKwargs$
        -_collect_optional_fields(request) dict$
    }

    class ProviderError {
        +ProviderErrorKind kind
        +str message
        +int status_code
        +float retry_after
    }

    class create_model_client {
        <<factory>>
        +create_model_client(config, resolver, registry) ModelClient
    }

    ModelFacade --> ModelClient : injects via constructor
    LiteLLMBridgeClient ..|> ModelClient : implements
    LiteLLMBridgeClient --> TransportKwargs : uses
    LiteLLMBridgeClient --> ProviderError : raises
    create_model_client --> LiteLLMBridgeClient : creates
    create_model_client --> ModelClient : returns

Comments Outside Diff (1)

  1. packages/data-designer-engine/src/data_designer/engine/models/clients/parsing.py, line 211-214 (link)

    Empty tool name silently creates invalid ToolCall

    get_value_from(function, "name") or "" coerces a missing or None tool name to an empty string "", which is then passed into a ToolCall(name=""). This bypasses any early validation.

    The empty ToolCall(name="") propagates through _execute_tool_calls_from_canonical() and reaches _find_resolved_provider_for_tool(""), which raises the much less informative MCPConfigurationError("Tool '' not found on any configured provider.") instead of failing with a clear message at the parsing layer.

    Consider raising a clear error at the parsing layer when name is empty:

    name = get_value_from(function, "name")
    if not name:
        raise ValueError(f"Tool call {tool_call_id!r} is missing a name; skipping.")

    Or, if the client layer should stay error-free, at minimum document in _execute_tool_calls_from_canonical that a ToolCall with name="" will surface as a misleading MCPConfigurationError.


Last reviewed commit: bfed5af

@nabinchha nabinchha self-assigned this Mar 6, 2026
