

@Olocool17
Contributor

Previous PR: #9123

Cache retrieval fix

While implementing the fix outlined in #9130, I also stumbled on an error affecting Responses API models when caching is enabled.
When an item is successfully retrieved from the cache, an ad-hoc attribute .cache_hit is created and set to True on the response object.

Unfortunately, this is only possible for litellm.ModelResponse (the response for chat models) and not for litellm.ResponsesAPIResponse (the response for response models), because the latter is a Pydantic model without extra="allow" set in its config.
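A minimal sketch of the difference, using hypothetical stand-in models (not the real litellm classes), assuming Pydantic v2 semantics:

```python
from pydantic import BaseModel, ConfigDict


class ChatLikeResponse(BaseModel):
    # stand-in for litellm.ModelResponse, which permits extra attributes
    model_config = ConfigDict(extra="allow")
    text: str


class ResponsesLikeResponse(BaseModel):
    # stand-in for litellm.ResponsesAPIResponse: no extra="allow" in its config
    text: str


chat = ChatLikeResponse(text="hi")
chat.cache_hit = True  # works: extra="allow" accepts ad-hoc attributes

resp = ResponsesLikeResponse(text="hi")
try:
    resp.cache_hit = True  # Pydantic rejects assignment to an unknown field
except ValueError as exc:
    print("setting cache_hit failed:", exc)
```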

My proposed fix is to simply remove this ad-hoc attribute altogether, since it is actually superfluous: the response.usage attribute is cleared on a cache hit anyway, which makes settings.usage_tracker.add_usage(self.model, dict(results.usage)) a no-op.
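To illustrate why the call becomes a no-op, here is a hypothetical stand-in for the usage tracker (the real settings.usage_tracker API is only referenced above, not reproduced here); with usage cleared on a cache hit, the dict passed in is empty and nothing accumulates:

```python
# Hypothetical stand-in tracker, not the actual implementation.
class UsageTracker:
    def __init__(self):
        self.totals = {}

    def add_usage(self, model: str, usage: dict) -> None:
        # Accumulate per-key token counts; an empty dict changes nothing.
        for key, value in usage.items():
            self.totals[key] = self.totals.get(key, 0) + value


tracker = UsageTracker()
# On a cache hit, response.usage is cleared, so dict(results.usage) is empty:
tracker.add_usage("some-model", {})
print(tracker.totals)  # nothing was accumulated
```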

