Conversation

@Pavanmanikanta98 (Contributor) commented Nov 29, 2025

Fixes #3530

Key Changes:

  • Updated Profile: Added openai_audio_input_encoding: Literal['base64', 'uri'] to OpenAIModelProfile.
    • 'base64' (default): Maintains strict OpenAI compliance.
    • 'uri': Enables Data URI formatting for providers like Qwen Omni.
  • Updated Model Logic: Modified OpenAIChatModel._map_user_prompt to respect this setting.
    • For BinaryContent: Uses item.data_uri when encoding is 'uri'.
    • For AudioUrl: Manually constructs the Data URI with the correct MIME type (e.g., audio/mpeg for mp3) when
      encoding is 'uri'.
  • New Tests: Added tests/models/test_openai_audio.py covering both default and URI encoding scenarios for both binary content and audio URLs.
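
The difference between the two encodings can be sketched as follows; `encode_audio` is a hypothetical helper for illustration, not the actual pydantic-ai code:

```python
import base64

def encode_audio(data: bytes, media_type: str, encoding: str = 'base64') -> str:
    """Hypothetical helper showing the two encodings discussed above."""
    b64 = base64.b64encode(data).decode('utf-8')
    if encoding == 'uri':
        # Data URI form, required by some OpenAI-compatible providers (e.g. Qwen Omni)
        return f'data:{media_type};base64,{b64}'
    # Plain base64 payload: the strict OpenAI Chat Completions form
    return b64

print(encode_audio(b'\x00\x01\x02', 'audio/mpeg', 'uri'))  # data:audio/mpeg;base64,AAEC
```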

@Pavanmanikanta98 (Contributor, Author) commented

@DouweM, for the Qwen Omni integration specifically, I'd like to follow your suggestion and handle the Data URI requirement via a dedicated provider rather than changing the shared qwen_model_profile.

Concretely, my plan is:

  • Add a new provider class for Qwen's OpenAI-compatible Chat Completions endpoint (e.g. QwenOpenAIProvider), which implements its own model_profile(self, model_name: str).
  • That model_profile will start from the standard OpenAI profile (e.g. openai_model_profile(model_name)) and then update it to set openai_chat_audio_input_encoding='uri'.
  • Users who want to talk to Qwen Omni via an OpenAI‑style API would instantiate OpenAIChatModel with this provider and the Qwen Omni base URL, and they’d automatically get Data URI audio, while other Qwen providers keep the default base64 behavior.
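
A minimal sketch of that plan, using stand-in types rather than the real pydantic-ai classes (field and class names follow the discussion above but are assumptions):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class OpenAIModelProfile:  # stand-in for the real profile class
    openai_chat_audio_input_encoding: str = 'base64'

def openai_model_profile(model_name: str) -> OpenAIModelProfile:
    # Stand-in for the standard OpenAI profile factory
    return OpenAIModelProfile()

class QwenProvider:
    def model_profile(self, model_name: str) -> OpenAIModelProfile:
        profile = openai_model_profile(model_name)
        # Qwen's OpenAI-compatible endpoint expects Data URI audio input
        return replace(profile, openai_chat_audio_input_encoding='uri')

profile = QwenProvider().model_profile('qwen-omni-turbo')
print(profile.openai_chat_audio_input_encoding)  # uri
```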

@DouweM (Collaborator) commented Dec 2, 2025

@Pavanmanikanta98 Thanks, makes sense. It should be just QwenProvider, and we should also support the qwen: model name prefix, update the docs, etc. See https://ai.pydantic.dev/models/openai/#openai-compatible-models for examples; anywhere those are referenced in the code, we should add a branch for qwen as well.
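
The qwen: prefix support mentioned here can be sketched as a lookup-table branch; the mapping below is illustrative, not the actual pydantic-ai dispatch code:

```python
# Illustrative prefix-to-provider mapping; the real dispatch lives in
# pydantic-ai's model inference logic, and the names here are assumptions.
KNOWN_PREFIXES = {
    'openai': 'OpenAIProvider',
    'deepseek': 'DeepSeekProvider',
    'qwen': 'QwenProvider',  # the new branch requested in this review
}

def provider_for(model: str) -> tuple[str, str]:
    """Split a 'prefix:model-name' string and resolve the provider."""
    prefix, sep, name = model.partition(':')
    if not sep or prefix not in KNOWN_PREFIXES:
        raise ValueError(f'unknown model: {model!r}')
    return KNOWN_PREFIXES[prefix], name

print(provider_for('qwen:qwen-omni-turbo'))  # ('QwenProvider', 'qwen-omni-turbo')
```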

@Pavanmanikanta98 (Contributor, Author) commented Dec 4, 2025

Hi @DouweM,
I've addressed your feedback: renamed to openai_chat_audio_input_encoding, used item.media_type instead of hardcoded mapping, and added QwenProvider with automatic Omni audio encoding. All tests pass.
Ready for review.

@Pavanmanikanta98 (Contributor, Author) commented Dec 8, 2025

Hi @DouweM, following up on this when you have time. Ready for review.


### Qwen

To use Qwen models via the OpenAI-compatible API from [Alibaba Cloud DashScope](https://www.alibabacloud.com/help/doc-detail/2712576.html), you can set the `QWEN_API_KEY` (or `DASHSCOPE_API_KEY`) environment variable and use [`QwenProvider`][pydantic_ai.providers.qwen.QwenProvider] by name:

@DouweM (Collaborator) commented:

  • Should the provider be named AlibabaProvider instead, and the prefix alibaba:, as that's the platform name, while Qwen is a model family? Or DashScopeProvider?
  • Let's link to the product page rather than the docs: https://www.alibabacloud.com/en/product/modelstudio
  • Let's support only 1 env var, likely ALIBABA_API_KEY or DASHSCOPE_API_KEY


The `QwenProvider` uses the international DashScope compatible endpoint `https://dashscope-intl.aliyuncs.com/compatible-mode/v1` by default.

When using **Qwen Omni** models (e.g. `qwen-omni-turbo`), this provider automatically handles audio input using the Data URI format required by the DashScope API.

@DouweM (Collaborator) commented:

We can drop this, the user will assume everything just works, we don't have to explain the specific things we did to make it so

```python
elif item.is_audio:
    assert item.format in ('wav', 'mp3')
    audio = InputAudio(data=base64.b64encode(item.data).decode('utf-8'), format=item.format)
    profile = OpenAIModelProfile.from_profile(self.profile)
```

@DouweM (Collaborator) commented:
Let's do this just once at the top of the method

```python
profile = OpenAIModelProfile.from_profile(self.profile)
if profile.openai_chat_audio_input_encoding == 'uri':
    mime_type = item.media_type or f'audio/{downloaded_item["data_type"]}'
    data_uri = f'data:{mime_type};base64,{downloaded_item["data"]}'
```

@DouweM (Collaborator) commented:
We shouldn't need to do this ourselves, we can call download_item with data_format='base64_uri'

```python
@property
def base_url(self) -> str:
    # Using the international endpoint by default as it's more standard for global users
    # Users in China region can override this via passing `openai_client` or implementing logic to check region
```

@DouweM (Collaborator) commented:
Can we take a base_url argument? And mention this in the docs

@DouweM changed the title from "Add configurable audio encoding for OpenAI models (Data URI support)" to "Support audio on Alibaba Cloud Qwen Omni" on Dec 9, 2025
Pavanmanikanta98 pushed a commit to Pavanmanikanta98/pydantic-ai that referenced this pull request Dec 11, 2025
…r Omni models

- Rename QwenProvider to DashScopeProvider with dashscope: prefix
- Use single DASHSCOPE_API_KEY environment variable
- Add base_url argument to DashScopeProvider constructor
- Refactor audio mapping to fetch profile once at method top
- Use download_item with base64_uri format for AudioUrl
- Remove Qwen-specific mentions from docstrings
- Update documentation with product page link and base_url example
- Add comprehensive tests for DashScopeProvider

Addresses all maintainer feedback from PR pydantic#3596
Fixes pydantic#3530
...

### DashScope

@DouweM (Collaborator) commented:
We should mention it on docs/models/overview.md, docs/index.md and README.md where we mention all the other providers


@DouweM (Collaborator) suggested a change:

```diff
-### DashScope
+### Alibaba Cloud DashScope
```

@DouweM changed the title from "Support audio on Alibaba Cloud Qwen Omni" to "Add Alibaba Cloud DashScopeProvider and support audio input for Qwen Omni" on Dec 12, 2025
@Pavanmanikanta98 (Contributor, Author) commented

Hi @DouweM,

Addressed the feedback. Ready for review.


### DashScope

To use Qwen models via [Alibaba Cloud DashScope](https://www.alibabacloud.com/en/product/modelstudio), you can set the `DASHSCOPE_API_KEY` environment variable and use [`DashScopeProvider`][pydantic_ai.providers.dashscope.DashScopeProvider] by name:

@DouweM (Collaborator) commented:
Is DashScopeProvider really the most appropriate/recognizable name? I see DashScope mentioned only once on https://www.alibabacloud.com/en/product/modelstudio, so maybe it should just be AlibabaProvider?

@Pavanmanikanta98 (Contributor, Author) commented

Hi @DouweM,

I'm happy to rename the provider to AlibabaProvider if that's the preferred direction for discovery, but I wanted to share my reasoning for choosing DashScopeProvider initially, backed by the official SDK and documentation:

  1. Official SDK & Identity: The official Python SDK is explicitly named dashscope (https://pypi.org/project/dashscope/), and the official documentation refers to the API usage as "DashScope" (e.g., "First API Call to Qwen": https://www.alibabacloud.com/help/en/model-studio/first-api-call-to-qwen).
  2. Service vs. Platform: "DashScope" is the specific name of the OpenAI-compatible API service we are connecting to. This aligns with other provider naming conventions like BedrockProvider (wrapping Amazon Bedrock, not AmazonProvider) and AzureProvider (wrapping Azure OpenAI, not MicrosoftProvider).
  3. Environment Variable: The standard API keys generated in the console are explicitly for DashScope (DASHSCOPE_API_KEY), so DashScopeProvider aligns with the configuration users will already have.

If you still prefer AlibabaProvider for better recognizability, I can definitely make the switch! I just wanted to clarify that DashScopeProvider was chosen to match the specific API service name.

Let me know what you think!

@DouweM (Collaborator) commented Dec 15, 2025

@Pavanmanikanta98 It's a bit confusing/inconsistent, but I interpret these 2 sentences on https://www.alibabacloud.com/help/en/model-studio/first-api-call-to-qwen to mean that they actually prefer the platform to be called "Alibaba Cloud Model Studio", and DashScope is just the SDK:

> Alibaba Cloud Model Studio lets you call large language models (LLMs) through OpenAI compatible interfaces or the DashScope SDK.

(this seems to contradict your saying that "DashScope is the specific name of the OpenAI-compatible API service we are connecting to", as they here say "OpenAI compatible interfaces or the DashScope SDK")

> # Replace YOUR_DASHSCOPE_API_KEY with your Alibaba Cloud Model Studio API key.

(not "DashScope API key")

But then again the URLs all include "dashscope" as well, so clearly it's not just the SDK. So maybe it's the name of their AI inference platform, but also they don't want to actually call it that in marketing material (anymore?) and say "Alibaba Cloud Model Studio" instead?

I think the fact that https://www.alibabacloud.com/help/en/model-studio/what-is-model-studio only mentions "dashscope" in the code example is the deciding factor here. Someone who reads that page and wants to look for Pydantic AI support seems far more likely to scan a list for "Alibaba Cloud Model Studio" or (in one word) "Alibaba" than "DashScope".

So yeah please make the rename, and where we can have the full name (in docs etc), we can say "Alibaba Cloud Model Studio (DashScope)".

Of course Alibaba does a lot more, but it's similar to HerokuProvider and VercelProvider that in Pydantic AI context clearly refer to their AI inference features.

Azure and Bedrock are a little different because those are established brand names in their own right; if Alibaba still called it Alibaba DashScope AI or something, DashScope would've been fine.

All of that is to say, please make the change 😄

@Pavanmanikanta98 (Contributor, Author) commented

@DouweM

I've renamed DashScopeProvider to AlibabaProvider (using the alibaba: prefix) and updated the documentation to reference "Alibaba Cloud Model Studio (DashScope)" as discussed. DASHSCOPE_API_KEY is preserved for consistency with the official docs. Ready for review.


### Alibaba Cloud Model Studio (DashScope)

To use Qwen models via [Alibaba Cloud Model Studio (DashScope)](https://www.alibabacloud.com/en/product/modelstudio), you can set the `DASHSCOPE_API_KEY` environment variable and use [`AlibabaProvider`][pydantic_ai.providers.alibaba.AlibabaProvider] by name:

@DouweM (Collaborator) commented:
Can we additionally support ALIBABA_API_KEY please? Like for Google we support GOOGLE_ and GEMINI_ both. That'll be easier to recognize in a .env file for someone looking for the token that's used by AlibabaProvider who's not familiar with the DashScope name

pavan added 10 commits December 17, 2025 22:22
This commit introduces `openai_audio_input_encoding` to `OpenAIModelProfile`, allowing users to choose between `'base64'` (default) and `'uri'` encoding for audio inputs. This addresses compatibility issues with providers like Qwen Omni that require Data URI format for audio data.

Key changes:
- Added `openai_audio_input_encoding` to `OpenAIModelProfile`.
- Updated `OpenAIChatModel._map_user_prompt` to respect the configured encoding for `BinaryContent` and `AudioUrl`.
- Added new tests in `tests/models/test_openai_audio.py` covering both encoding modes.
…i models

- Add QwenProvider for DashScope OpenAI-compatible API
- Rename openai_audio_input_encoding to openai_chat_audio_input_encoding
- Use item.media_type for Data URI MIME types instead of hardcoded mapping
- Automatically set Data URI audio encoding for Qwen Omni models
- Add comprehensive tests for QwenProvider and audio encoding
- Add Qwen documentation section to OpenAI-compatible models docs

Fixes pydantic#3530
- Include 'qwen' in the model inference options for compatibility with Qwen models.
- Set up environment variable for Qwen API key in test_examples.py to facilitate testing.

This enhances the integration of Qwen models within the existing framework.
- Add tests for initializing QwenProvider with `openai_client` and `http_client` to ensure full branch coverage.
…r Omni models

- Rename QwenProvider to DashScopeProvider with dashscope: prefix
- Use single DASHSCOPE_API_KEY environment variable
- Add base_url argument to DashScopeProvider constructor
- Refactor audio mapping to fetch profile once at method top
- Use download_item with base64_uri format for AudioUrl
- Remove Qwen-specific mentions from docstrings
- Update documentation with product page link and base_url example
- Add comprehensive tests for DashScopeProvider

Addresses all maintainer feedback from PR pydantic#3596
Fixes pydantic#3530
- Add DashScope to provider lists in README.md and docs/index.md
- Add pydantic_ai.providers.dashscope to docs/api/providers.md
- Merge test_openai_audio.py into test_openai.py and remove redundant test
  - Rename DashScopeProvider → AlibabaProvider with prefix alibaba:
  - Keep DASHSCOPE_API_KEY env var (matches Alibaba official docs)
  - Update all documentation references
  - Add Alibaba Cloud to README.md, docs/index.md, docs/models/overview.md
  - Add ALIBABA_API_KEY as primary env var (easier to recognize)
  - Keep DASHSCOPE_API_KEY for compatibility with Alibaba's docs
  - ALIBABA_API_KEY takes precedence (like GOOGLE_API_KEY/GEMINI_API_KEY)
  - Update docs, tests, and test fixtures
@Pavanmanikanta98 force-pushed the fix/qwen-omni-audio-encoding branch from 9e6b391 to 1126ba5 on December 17, 2025 17:03
@Pavanmanikanta98 (Contributor, Author) commented

@DouweM Done : )

Ready for review

@DouweM merged commit 81c0d4a into pydantic:main on Dec 17, 2025
57 of 59 checks passed
@DouweM (Collaborator) commented Dec 17, 2025

@Pavanmanikanta98 Thanks Pavan!


Linked issue: The way OpenAIChatModel sends input audio is incompatible with Qwen Omni API
2 participants