fix(models): surface empty LiteLlm streaming completions as error event #6195
Open
kevin-hs-sohn wants to merge 1 commit into
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Streaming completions where the provider returns a finish_reason but no text + no tool calls currently produce ZERO yielded LlmResponse events: ``aggregated_llm_response`` only gets set when ``(text or reasoning_parts)`` is truthy, and ``aggregated_llm_response_with_tool_call`` needs a function_call. With neither, the loop simply exits and the downstream Runner observes a silent successful empty stream.

This pattern is reported across multiple stalled fix attempts:

* google#5394: AnthropicLlm never populates finish_reason on LlmResponse
* google#5006: retry with resume message when model returns empty response
* google#5636: surface error when model returns STOP with empty content
* google#3618 / google#3699: Handle empty message in LiteLLM response

It hits providers under several real conditions: anthropic content_filter, gemini 2.5-flash-lite STOP-with-empty after tool calls, 0-token completions under safety, model_not_found responses normalized to stop, etc. From the user's perspective the agent "successfully" ends a turn with no visible output.

Fix

- Track ``last_finish_reason`` + ``last_model_version`` across the stream so we can attribute the empty response.
- After both ``aggregated_llm_response`` and ``aggregated_llm_response_with_tool_call`` checks, if BOTH are None AND a finish_reason was observed, yield ONE LlmResponse with ``error_code`` set to the mapped finish_reason, ``error_message`` describing the failure mode, and the provider's ``model_version`` preserved. ``usage_metadata`` + ``grounding_metadata`` (if any) attach to that response so callers do not lose them.
- Minimum-surface change: the guard only fires when the stream produced no aggregated response AND a finish_reason was observed. Streams that genuinely yield nothing (test doubles, empty iterators) stay byte-identical.

Tests

- tests/unittests/models/test_litellm.py adds 4 cases:
  * content_filter-empty → surfaces with SAFETY error_code
  * stop-empty → surfaces with STOP finish_reason + error_message
  * normal text stream → empty-guard does NOT fire (regression)
  * literally-empty stream (no chunks, no finish_reason) → byte-identical zero responses

281 lite_llm tests pass + 1 skip; 0 regressions.
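The guard described in the Fix list can be sketched roughly as below. This is an illustration under stated assumptions, not the code in this PR: `Chunk`, `Response`, `aggregate_stream` and the `FINISH_REASON_TO_ERROR_CODE` table are hypothetical stand-ins, tool-call aggregation is omitted, and the real finish_reason mapping in the library may differ.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in for one provider stream chunk."""
    text: str = ""
    finish_reason: Optional[str] = None
    model_version: Optional[str] = None


@dataclass
class Response:
    """Hypothetical stand-in for LlmResponse (only the fields used here)."""
    text: Optional[str] = None
    error_code: Optional[str] = None
    error_message: Optional[str] = None
    model_version: Optional[str] = None


# Assumed mapping for illustration only; the library's own mapping may differ.
FINISH_REASON_TO_ERROR_CODE = {"content_filter": "SAFETY", "stop": "STOP"}


def aggregate_stream(chunks: Iterable[Chunk]) -> Iterator[Response]:
    """Aggregate text chunks; surface an error event if nothing was produced."""
    text_parts: List[str] = []
    last_finish_reason: Optional[str] = None
    last_model_version: Optional[str] = None

    for chunk in chunks:
        if chunk.text:
            text_parts.append(chunk.text)
        # Remember the most recent values so an empty result can be attributed.
        if chunk.finish_reason:
            last_finish_reason = chunk.finish_reason
        if chunk.model_version:
            last_model_version = chunk.model_version

    if text_parts:
        # Normal path: the guard does not fire.
        yield Response(text="".join(text_parts), model_version=last_model_version)
    elif last_finish_reason is not None:
        # Empty guard: a finish_reason was seen but no content was aggregated.
        yield Response(
            error_code=FINISH_REASON_TO_ERROR_CODE.get(
                last_finish_reason, last_finish_reason.upper()
            ),
            error_message=(
                f"Stream ended with finish_reason={last_finish_reason!r} "
                "but produced no text or tool calls."
            ),
            model_version=last_model_version,
        )
    # No chunks and no finish_reason: yield nothing, as before.
```

Keeping the guard in the final `elif` means a stream with no chunks at all still yields zero responses, which matches the "byte-identical" case in the test list.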
0d97329 to 7240cdb
Author
I signed the CLA. Re-pushed without the Co-Authored-By trailer so the check only sees my GitHub-account email.
Why
Streaming completions where the provider returns a `finish_reason` but no text + no tool calls currently produce zero yielded `LlmResponse` events: `aggregated_llm_response` only gets set when `(text or reasoning_parts)` is truthy, and `aggregated_llm_response_with_tool_call` needs a function_call. With neither, the loop exits and the downstream Runner observes a silent successful empty stream.

This pattern is reported across multiple stalled fix attempts (see Context).

It hits providers under real conditions: anthropic `content_filter`, gemini 2.5-flash-lite STOP-with-empty after tool calls, 0-token completions under safety, model_not_found responses normalized to `stop`. From the user's perspective the agent "successfully" ends a turn with no visible output. Downstream agent frameworks have no actionable signal to retry / surface / escalate.

Change
- Track `last_finish_reason` + `last_model_version` across the stream.
- After both `aggregated_llm_response` and `aggregated_llm_response_with_tool_call` checks, if BOTH are `None` AND a `finish_reason` was observed, yield ONE `LlmResponse` with `error_code` set to the mapped finish_reason, `error_message` describing the failure mode, and `model_version` preserved. `usage_metadata` + `grounding_metadata` attach to that response.
- Minimum-surface change: the guard only fires when the stream produced no aggregated response AND a `finish_reason` was observed. Streams that genuinely yield nothing (test doubles, empty iterators) stay byte-identical.

Tests
`tests/unittests/models/test_litellm.py` (4 new):
- `content_filter`-empty → surfaces with SAFETY error_code
- `stop`-empty → surfaces with STOP finish_reason + error_message
- normal text stream → empty-guard does NOT fire (regression)
- literally-empty stream (no chunks, no finish_reason) → byte-identical zero responses

281 lite_llm tests pass + 1 skip; 0 regressions.
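The Why section notes that downstream frameworks currently have no signal to retry, surface, or escalate. Once an error event is yielded, a caller could branch on it along these lines. This is a sketch only: `Event`, `EmptyCompletionError`, `run_turn`, and the policy of retrying everything except SAFETY are hypothetical choices for illustration, not part of this PR or of the library.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Event:
    """Hypothetical stand-in for a yielded LlmResponse."""
    text: Optional[str] = None
    error_code: Optional[str] = None
    error_message: Optional[str] = None


class EmptyCompletionError(RuntimeError):
    """Hypothetical exception raised when every attempt ends in an error event."""


def run_turn(call_model: Callable[[], Iterable[Event]], max_attempts: int = 2) -> str:
    """Call the model, retrying when the stream surfaces an error event."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error: Optional[Event] = None
    for _ in range(max_attempts):
        events = list(call_model())
        errors = [e for e in events if e.error_code]
        if not errors:
            # No error event: return whatever text was produced.
            return "".join(e.text or "" for e in events)
        last_error = errors[-1]
        if last_error.error_code == "SAFETY":
            # Assumed policy: a filtered completion is not worth retrying.
            break

    # Escalate instead of ending the turn silently.
    raise EmptyCompletionError(f"{last_error.error_code}: {last_error.error_message}")
```

The retry-versus-escalate decision is only possible because the stream now carries an `error_code`; before the change, the caller would have seen an empty list of events and no way to tell a filtered completion from a stream that legitimately had nothing to say.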
Context
Filed after several months of stalled PRs in this area (#5512 closed without merge, #5636 closed without merge, #5006 / #3699 open). Submitting a fresh attempt that addresses the same class of bugs with minimum-surface logic + thorough regression coverage. Happy to revise based on review.
Co-Authored-By: Kevin Sohn kevin@openmagi.ai