
fix(wren-ai-service): handle json_schema response_format for non-OpenAI models and guard empty SQL diagnosis response#2167

Open
Thangoun wants to merge 3 commits into Canner:main from Thangoun:fix/json-schema-empty-response

Conversation


@Thangoun Thangoun commented Mar 24, 2026

Summary

This PR fixes issues when using non-OpenAI models (e.g., Ollama or LiteLLM proxy backends).

Changes

  1. Strip response_format={"type": "json_schema"} for models served through a custom api_base, since structured outputs are only supported by native OpenAI models and can yield empty responses on other backends.
  2. Add guard for empty SQL diagnosis response to prevent orjson.loads("") crash.

Files Changed

  • wren-ai-service/src/providers/llm/litellm.py
  • wren-ai-service/src/pipelines/generation/sql_diagnosis.py

Result

  • Prevents empty responses from LiteLLM non-OpenAI backends
  • Prevents crash when SQL diagnosis reply is empty

Summary by CodeRabbit

  • Bug Fixes
    • SQL diagnosis now handles missing or empty model replies gracefully, returning a clear standardized message instead of failing.
    • Improved compatibility with custom LLM endpoints by no longer forwarding incompatible response-format instructions, reducing errors on non-standard backends.

Contributor

coderabbitai bot commented Mar 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: aadea215-2de9-4e85-a82f-9b24ebb96b8f

📥 Commits

Reviewing files that changed from the base of the PR and between 5dfdafc and 8c5e239.

📒 Files selected for processing (2)
  • wren-ai-service/src/pipelines/generation/sql_diagnosis.py
  • wren-ai-service/src/providers/llm/litellm.py
✅ Files skipped from review due to trivial changes (1)
  • wren-ai-service/src/providers/llm/litellm.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • wren-ai-service/src/pipelines/generation/sql_diagnosis.py

Walkthrough

Two robustness changes: SQL diagnosis post_process now returns a dict and defensively handles missing/empty LLM replies, and the Litellm provider conditionally strips json_schema response_format when api_base is set to avoid forwarding OpenAI-specific formatting to custom backends.

Changes

Cohort / File(s) Summary
SQL Diagnosis
wren-ai-service/src/pipelines/generation/sql_diagnosis.py
post_process return type changed from str to dict; extracts first LLM reply defensively via get("replies", [""])[0]; returns a fallback {"reasoning": "LLM did not return any response."} for missing/empty replies, otherwise parses reply with orjson.loads.
Litellm LLM Provider
wren-ai-service/src/providers/llm/litellm.py
Before building completion request, conditionally remove response_format when self._api_base is set and response_format is a dict with type == "json_schema", preventing forwarding OpenAI json_schema formatting to non-OpenAI/custom API backends.
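The described post_process guard can be sketched as follows (the real function is async and parses with orjson; this synchronous sketch substitutes the stdlib json module so it stands alone):

```python
import json  # stand-in for orjson, which the service actually uses


def post_process(generate_sql_diagnosis: dict) -> dict:
    # Defensively take the first reply; some backends return no replies at all.
    reply = generate_sql_diagnosis.get("replies", [""])[0]
    if not reply or not reply.strip():
        # Without this guard, parsing an empty string raises a decode error
        # (orjson.loads("") crashes the pipeline).
        return {"reasoning": "LLM did not return any response."}
    return json.loads(reply)
```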

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

module/ai-service, wren-ai-service

Suggested reviewers

  • cyyeh
  • yichieh-lu

Poem

🐰 I nibble on bugs in code so bright,
I patch the silence of a missing byte,
JSON and replies now neatly aligned,
Hops of reliability — coded and signed! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title clearly and specifically summarizes both main changes: handling json_schema response_format for non-OpenAI models and guarding empty SQL diagnosis responses.


Contributor

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
wren-ai-service/src/pipelines/generation/sql_diagnosis.py (1)

91-98: ⚠️ Potential issue | 🟡 Minor

Return type annotation is incorrect.

The function returns a dict (either the fallback {"reasoning": ...} or the parsed JSON), but the type hint declares -> str.

Proposed fix
 @observe(capture_input=False)
 async def post_process(
     generate_sql_diagnosis: dict,
-) -> str:
+) -> dict:
     reply = generate_sql_diagnosis.get("replies", [""])[0]
     if not reply or not reply.strip():
         return {"reasoning": "LLM did not return any response."}
     return orjson.loads(reply)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `wren-ai-service/src/pipelines/generation/sql_diagnosis.py` around lines 91 -
98, The return type annotation on post_process is wrong: the function returns a
dict (either the fallback {"reasoning": ...} or the JSON parsed via
orjson.loads) but is annotated as -> str; update the signature of post_process
to return a dict type (e.g., -> Dict[str, Any] or -> dict) and import typing as
needed so the annotation matches the actual returns from the post_process
function.
🧹 Nitpick comments (2)
wren-ai-service/src/providers/llm/litellm.py (2)

102-106: Trailing whitespace on line 105.

There appears to be unintentional whitespace added on line 105 after the opening brace merge.

Proposed cleanup
             generation_kwargs = {
                 **combined_generation_kwargs,
                 **(generation_kwargs or {}),
-                
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `wren-ai-service/src/providers/llm/litellm.py` around lines 102 - 106, The
assignment to generation_kwargs in litellm.py contains unintended trailing
whitespace after the opening brace in the merged dict expression; edit the
generation_kwargs = { **combined_generation_kwargs, **(generation_kwargs or {}),
} statement (the dict merge that constructs generation_kwargs) to remove the
stray whitespace so the line is clean (ensure no extra spaces before the closing
brace or after the comma).

107-114: Heuristic for stripping json_schema may be overly broad.

Using self._api_base presence as the sole indicator to strip json_schema may inadvertently affect:

  • Azure OpenAI endpoints (which support json_schema on newer models)
  • OpenAI-compatible proxies that do support structured outputs
  • Users accessing native OpenAI through corporate proxies

This fix was added to address real issues where Ollama and similar backends return empty responses with json_schema set. However, LiteLLM itself supports json_schema for many non-OpenAI models, making the heuristic potentially too broad. Consider a more targeted approach such as:

  • Checking the model name prefix (e.g., gpt-, o1-, o3-)
  • Using a configuration flag to opt-in/out of json_schema stripping
  • Checking for specific model providers known not to support it

The TODO comment in the constructor suggests _api_base may be removed in the future, which aligns with reconsidering this approach.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `wren-ai-service/src/providers/llm/litellm.py` around lines 107 - 114, The
current heuristic in litellm.py strips generation_kwargs["response_format"]
whenever self._api_base is set, which is too broad; update the condition in the
block that checks response_format/type == "json_schema" to be more targeted by
either (a) honoring a new instance flag such as
self.strip_json_schema_for_legacy_backends (settable in the constructor), or (b)
inspecting the model identifier (e.g., generation_kwargs.get("model") or
self._model_name) and only popping response_format for known incompatible
backends (match specific provider/model prefixes like "ollama", or a maintained
list of incompatible backends) while allowing OpenAI/Azure-style prefixes
("gpt-", "o1-", "o3-") to keep json_schema; ensure the symbol references are
self._api_base, generation_kwargs, and response_format so the change is
localized to the existing conditional and preserve current behavior for backends
that support structured outputs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: aca0aa23-3251-4feb-9692-c0f36e80adb7

📥 Commits

Reviewing files that changed from the base of the PR and between eea642f and 5dfdafc.

📒 Files selected for processing (2)
  • wren-ai-service/src/pipelines/generation/sql_diagnosis.py
  • wren-ai-service/src/providers/llm/litellm.py

Author

@Thangoun Thangoun left a comment

