Feat: Content Filtering Exception Handling #3634
Is it possible to handle AWS Bedrock as well? Thanks.
dsfaccini
left a comment
Hey @AlanPonnachan thank you for the PR! I've requested a couple small changes, let me know if you have any questions
one more thing I missed, please include the name of the new exception in the fallback model docs
@dsfaccini Thank you for the review. I’ve made the requested changes.
hey @AlanPonnachan thanks a lot for your work! It looks very good now, I requested a couple more changes but once that's done and coverage passes I think the PR will be ready for merge.
@dsfaccini Thanks again for the review! I’ve applied the requested changes. Test coverage is now at 100%.
DouweM
left a comment
@AlanPonnachan Thanks for working on this! My main concern is that this doesn't actually make it consistent for all models that respond with finish_reason=='content_filter', just for Anthropic/Google/OpenAI.
        self.message = message


class PromptContentFilterError(ContentFilterError):
Can we name this RequestContentFilterError for consistency with our ModelRequest/ModelResponse?
    """Raised when the prompt triggers a content filter."""

    def __init__(self, status_code: int, model_name: str, body: object | None = None):
        message = f"Model '{model_name}' content filter was triggered by the user's prompt"
Or by the instructions, right? I'd prefer it to be a bit more generic
provider_details = {'finish_reason': raw_finish_reason}
finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason)
if finish_reason == 'content_filter':
    raise ResponseContentFilterError(
I think we should do this in ModelRequestNode, so that it'll work for all models, not just Anthropic.
That would mean that it wouldn't trigger FallbackModel to try another model, but that will be possible once we do #3640.
if finish_reason == 'content_filter' and raw_finish_reason:
    raise UnexpectedModelBehavior(
        f'Content filter {raw_finish_reason.value!r} triggered', response.model_dump_json()
    raise ResponseContentFilterError(
This is a breaking change for users that currently have except UnexpectedModelBehavior. So if we add a new exception, I think it should be a subclass of UnexpectedModelBehavior, not ModelAPIError (as it isn't really an API/HTTP/connection error).
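To illustrate the compatibility point, here is a minimal sketch. Both classes are local stand-ins defined for the example (they are not imported from the library), and `call_model`, `legacy_caller` and `new_caller` are hypothetical helpers. It shows that if the new exception subclasses `UnexpectedModelBehavior`, existing `except UnexpectedModelBehavior` handlers keep catching it.

```python
from __future__ import annotations


class UnexpectedModelBehavior(RuntimeError):
    """Local stand-in for the existing exception (assumption, not the library class)."""

    def __init__(self, message: str, body: str | None = None):
        self.message = message
        self.body = body
        super().__init__(message)


class ContentFilterError(UnexpectedModelBehavior):
    """Hypothetical new exception, subclassing the old one as suggested in review."""


def call_model() -> str:
    # Simulates a model whose response was stopped by a content filter.
    raise ContentFilterError("Content filter 'SAFETY' triggered")


def legacy_caller() -> str:
    # User code written before the new exception existed.
    try:
        return call_model()
    except UnexpectedModelBehavior as exc:
        return f'handled: {exc.message}'


def new_caller() -> str:
    # User code that opts into the more specific exception.
    try:
        return call_model()
    except ContentFilterError as exc:
        return f'filtered: {exc.message}'
```

Had the new exception derived from an unrelated base instead, `legacy_caller` would let it propagate, which is the breaking change described above.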
Combined with the above, the solution may be to remove this check here, and to implement the same if response.finish_reason == 'content_filter' check in ModelRequestNode that will apply to all models.
If we want to get some details from the response up to that level, we can store them in response.provider_details
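A rough sketch of what such a provider-independent check could look like. `FakeModelResponse` and `check_content_filter` are hypothetical names invented for this example, and the exception classes are local stand-ins; the real check would live wherever `ModelRequestNode` receives the response.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


class UnexpectedModelBehavior(RuntimeError):
    """Local stand-in for the existing exception (assumption)."""


class ContentFilterError(UnexpectedModelBehavior):
    """Hypothetical exception carrying provider details up to the caller."""

    def __init__(self, message: str, provider_details: dict[str, Any] | None = None):
        super().__init__(message)
        self.provider_details = provider_details or {}


@dataclass
class FakeModelResponse:
    """Simplified stand-in for a model response (assumption, not the real class)."""

    parts: list[str] = field(default_factory=list)
    finish_reason: str | None = None
    provider_details: dict[str, Any] | None = None
    model_name: str | None = None


def check_content_filter(response: FakeModelResponse) -> FakeModelResponse:
    # One check for every provider: model classes only need to map their raw
    # finish reason to 'content_filter' and fill in provider_details.
    if response.finish_reason == 'content_filter':
        raise ContentFilterError(
            f'Model {response.model_name!r} stopped because of a content filter',
            response.provider_details,
        )
    return response
```

With this shape, per-provider code stops raising and only records the raw reason, e.g. `provider_details={'finish_reason': 'SAFETY'}`.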
if (error := body_dict.get('error')) and isinstance(error, dict):
    error_dict = cast(dict[str, Any], error)
    if error_dict.get('code') == 'content_filter':
        raise PromptContentFilterError(
What do you think about instead handling this by returning a ModelResponse with empty parts and finish_reason == 'content_filter', so that it can go through the same logic as the Anthropic and Google models?
I also don't think they distinguish between input/request and output/response content filter errors, so I don't think we need to here either.
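As a sketch of that alternative: the helper below turns a 400 error body with `code == 'content_filter'` into a response with empty parts instead of raising. `FakeModelResponse` and `response_from_error_body` are hypothetical names for illustration, and the body shape is assumed from the diff excerpt above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeModelResponse:
    """Simplified stand-in for a model response (assumption, not the real class)."""

    parts: list[str] = field(default_factory=list)
    finish_reason: str | None = None
    provider_details: dict[str, Any] | None = None


def response_from_error_body(status_code: int, body: object) -> FakeModelResponse | None:
    """Return an empty content-filter response for a matching error body, else None."""
    if status_code != 400 or not isinstance(body, dict):
        return None
    error = body.get('error')
    if isinstance(error, dict) and error.get('code') == 'content_filter':
        # Keep the provider's error payload so callers can still inspect it.
        return FakeModelResponse(
            parts=[],
            finish_reason='content_filter',
            provider_details={'error': error},
        )
    return None
```

A `None` result means the error is not a content-filter rejection and should be handled as any other HTTP error.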
Unified Content Filtering Exception Handling
This PR closes #1035
Standardizing how content filtering events are handled across different model providers.
Previously, triggering a content filter resulted in inconsistent behaviors: generic ModelHTTPError (Azure), UnexpectedModelBehavior (Google), or silent failures, depending on the provider. This PR introduces a dedicated exception hierarchy to allow users to catch and handle prompt refusals and response interruptions programmatically and consistently.
Key Changes:
- ContentFilterError (base), PromptContentFilterError (for input rejections, e.g., Azure 400), and ResponseContentFilterError (for output refusals).
- PromptContentFilterError for Azure's specific 400 error body and ResponseContentFilterError when finish_reason='content_filter'.
- _process_response to raise ResponseContentFilterError instead of UnexpectedModelBehavior when safety thresholds are triggered.
- ResponseContentFilterError.
- tests/models/.
Example Usage:
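A minimal sketch of how a caller might use the hierarchy described above. The three exception classes are simplified local stand-ins (plain message constructors, not the signatures from the diff), and `run_agent` and `safe_run` are hypothetical helpers. The names may also differ after review, since renaming and merging the subclasses was suggested in the thread.

```python
class ContentFilterError(Exception):
    """Stand-in base class for content filter events (assumption)."""


class PromptContentFilterError(ContentFilterError):
    """Stand-in: the input was rejected by the provider."""


class ResponseContentFilterError(ContentFilterError):
    """Stand-in: the output was withheld by the provider."""


def run_agent(prompt: str) -> str:
    # Hypothetical model call that simulates both kinds of filtering.
    if 'blocked-input' in prompt:
        raise PromptContentFilterError('prompt triggered the content filter')
    if 'blocked-output' in prompt:
        raise ResponseContentFilterError('response triggered the content filter')
    return f'echo: {prompt}'


def safe_run(prompt: str) -> str:
    try:
        return run_agent(prompt)
    except PromptContentFilterError:
        return 'prompt rejected'
    except ResponseContentFilterError:
        return 'response withheld'
```

Catching the base `ContentFilterError` instead would handle both cases with a single clause.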