Add OpenRouter image generation support #3599
base: main
Changes from all commits
a23194b
1eb728b
06f868a
e32f767
967e612
cfbcb51
5c32e9b
584e4dd
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -73,3 +73,58 @@ model = OpenRouterModel('openai/gpt-5') | |
| agent = Agent(model, model_settings=settings) | ||
| ... | ||
| ``` | ||
|
|
||
| ## Image Generation | ||
|
|
||
| You can use OpenRouter models that support image generation with the `openrouter_modalities` setting: | ||
|
|
||
| ```python {test="skip"} | ||
| from pydantic_ai import Agent, BinaryImage | ||
| from pydantic_ai.models.openrouter import OpenRouterModelSettings | ||
|
|
||
| agent = Agent( | ||
| model='openrouter:google/gemini-2.5-flash-image-preview', | ||
| output_type=str | BinaryImage, | ||
| model_settings=OpenRouterModelSettings(openrouter_modalities=['image', 'text']), | ||
| ) | ||
|
|
||
| result = agent.run_sync('A cat') | ||
| assert isinstance(result.output, BinaryImage) | ||
| ``` | ||
|
|
||
| You can further customize image generation using `openrouter_image_config`: | ||
|
|
||
| ```python | ||
| from pydantic_ai.models.openrouter import OpenRouterModelSettings | ||
|
|
||
| settings = OpenRouterModelSettings( | ||
| openrouter_modalities=['image', 'text'], | ||
| openrouter_image_config={'aspect_ratio': '3:2'} | ||
|
Collaborator
I want this to be an option on
Collaborator
I'm OK with it also being a model setting if it supports more keys than If you want, you can finish that PR as we're at it to make your life here easier.
Contributor (Author)
Yep! #3672 |
||
| ) | ||
| ``` | ||
|
|
||
| > Available aspect ratios: `'1:1'`, `'2:3'`, `'3:2'`, `'3:4'`, `'4:3'`, `'4:5'`, `'5:4'`, `'9:16'`, `'16:9'`, `'21:9'`. | ||
|
|
||
| Image generation also works with streaming: | ||
|
|
||
| ```python {test="skip"} | ||
| from pydantic_ai import Agent, BinaryImage | ||
| from pydantic_ai.models.openrouter import OpenRouterModelSettings | ||
|
|
||
| agent = Agent( | ||
| model='openrouter:google/gemini-2.5-flash-image-preview', | ||
| output_type=str | BinaryImage, | ||
| model_settings=OpenRouterModelSettings( | ||
| openrouter_modalities=['image', 'text'], | ||
| openrouter_image_config={'aspect_ratio': '3:2'}, | ||
| ), | ||
| ) | ||
|
|
||
| response = agent.run_stream_sync('A dog') | ||
| for output in response.stream_output(): | ||
|     if isinstance(output, str): | ||
|         print(output) | ||
|     elif isinstance(output, BinaryImage): | ||
|         # Handle the generated image | ||
|         print(f'Generated image: {output.media_type}') | ||
| ``` | ||
We should also make this work with `builtin_tools=[ImageGenerationTool()]` and document it here: https://ai.pydantic.dev/builtin-tools/#image-generation-tool

As with Google, which doesn't expose that as a tool, using that tool or `BinaryImage` in `output_type` should automatically enable the modality.