fix(llm): raise on API failure instead of silent empty fallback #326

Open
ZwwWayne wants to merge 1 commit into v1.0.0 from fix/async-wrapper-raise-on-error

Conversation

@ZwwWayne
Collaborator

Summary

  • AsyncOpenAIWrapper.chat() previously returned a fabricated ChatCompletion with empty content when all retries were exhausted. Callers could not distinguish a real empty response from a total API outage, which silently corrupted downstream results (e.g. evaluation pipelines saw 0-score rounds with no indication of the actual failure).
  • Now raises RuntimeError after retries are exhausted so callers can handle failures explicitly.
  • Non-retryable exceptions are re-raised immediately instead of being swallowed.
  • Replaced print() with logger.warning() for retry diagnostics.
  • Cleaned up unused imports (uuid, Choice, ChatCompletionMessage).
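
The behavior described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `call_with_retries`, `RETRYABLE`, `make_call`, and the delay parameters are hypothetical names, and which exception types count as retryable is an assumption.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

# Assumption: these exception types stand in for "transient" API failures.
RETRYABLE = (TimeoutError, ConnectionError)


async def call_with_retries(make_call, max_retries=3, base_delay=0.0):
    """Await make_call() up to max_retries times; raise if every attempt fails."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return await make_call()
        except RETRYABLE as exc:
            last_error = exc
            # Log through the logging module rather than print().
            logger.warning("attempt %d/%d failed: %r", attempt, max_retries, exc)
            if attempt < max_retries:
                await asyncio.sleep(base_delay * attempt)
        # Other exception types are not caught, so they propagate immediately.
    # No fabricated empty response: surface the failure to the caller.
    raise RuntimeError(
        f"API call failed after {max_retries} attempts"
    ) from last_error
```

Chaining with `from last_error` keeps the final underlying exception available on `__cause__`, so a caller can still see whether the outage was a timeout or a connection error.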

Motivation

When using AsyncOpenAIWrapper in a benchmark evaluation harness, intermittent 502/timeout errors from the API caused 15 out of 27 tasks to silently produce zero-score results, because the fallback ChatCompletion(content='') was indistinguishable from a valid but empty model response. The caller had no opportunity to retry at a higher level or log the actual failure.

Test plan

  • Verified that `from lagent.llms.openai import AsyncOpenAIWrapper` imports cleanly
  • Existing downstream callers that relied on the fallback should add try/except RuntimeError to handle persistent failures gracefully
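
For downstream callers, the suggested `try/except RuntimeError` might look like the sketch below. It assumes a hypothetical harness: `TaskResult`, `run_task`, and the `chat` callable are illustrative stand-ins rather than part of lagent, and `chat` here simply returns a string instead of a ChatCompletion.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class TaskResult:
    task_id: str
    # None means the API call failed; "" means the model really returned nothing.
    content: Optional[str]
    error: Optional[str] = None


async def run_task(task_id: str, chat: Callable[[], Awaitable[str]]) -> TaskResult:
    """Run one task and record a persistent API failure separately from a score."""
    try:
        content = await chat()
    except RuntimeError as exc:
        # Retries were exhausted upstream; mark the task as failed, not zero-scored.
        return TaskResult(task_id, None, error=str(exc))
    return TaskResult(task_id, content)
```

Keeping failed tasks distinct from empty responses lets the harness retry them later or exclude them from aggregate scores instead of counting them as zeros.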

🤖 Generated with Claude Code

AsyncOpenAIWrapper.chat() previously returned a fabricated
ChatCompletion with empty content when all retries were exhausted.
Callers had no way to distinguish a real empty response from a total
API outage, which silently corrupted downstream results.

- Remove the fallback_response pattern; raise RuntimeError after
  retries are exhausted so callers can handle the failure explicitly.
- Non-retryable exceptions are re-raised immediately instead of being
  swallowed.
- Replace print() with logger.warning() for retry diagnostics.
- Clean up unused imports (uuid, Choice, ChatCompletionMessage).