Summary
OpenRouterProvider.generate_audio in sdk/python/agentfield/media_providers.py hardcodes "stream": True in the chat-completions audio payload. OpenRouter rejects every format except pcm16 when stream=true, so format="wav" (as well as mp3, flac, and opus) fails with HTTP 400 against models in the gpt-audio family.
The SDK signature advertises format from {"wav", "mp3", "flac", "opus", "pcm16"}, so callers reasonably expect any of those values to work.
Repro
import asyncio
from agentfield.media_providers import OpenRouterProvider

async def main():
    await OpenRouterProvider().generate_audio(
        text="Hello world",
        model="openrouter/openai/gpt-audio-mini",
        format="wav",
    )

asyncio.run(main())
Error
RuntimeError: OpenRouter audio request failed (400):
{"error":{"message":"Unsupported value: 'audio.format' does not support
'wav' when stream=true. Supported values are: 'pcm16'."}}
Suggested fix
Either:
- auto-switch stream=false when format != "pcm16", or
- request pcm16 over the wire when the caller asks for wav and wrap the streamed pcm16 in the requested container client-side before returning.
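As a rough illustration of the two options, here is a minimal standard-library sketch. The helper names choose_stream_flag and wrap_pcm16_as_wav are hypothetical and are not the SDK's actual functions, and the 24 kHz mono default is an assumption about the model's pcm16 output that should be checked against the real stream. The second helper only covers wav; mp3, flac, and opus would need a real encoder.

```python
import io
import wave


def choose_stream_flag(audio_format: str) -> bool:
    """Option 1 (hypothetical helper): only stream when pcm16 is requested."""
    return audio_format == "pcm16"


def wrap_pcm16_as_wav(pcm: bytes, sample_rate: int = 24000, channels: int = 1) -> bytes:
    """Option 2 (hypothetical helper): wrap raw 16-bit PCM in a WAV container.

    sample_rate and channels are assumed defaults, not values confirmed
    by the issue; pass the real values for the model in use.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # pcm16 means 2 bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

With helpers like these, a caller asking for wav could either send stream=false, or keep streaming pcm16 and pass the concatenated chunks through wrap_pcm16_as_wav before returning.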
Status
This was discovered against agentfield==0.1.84 while building the reel-af example. Looking at origin/main, the second strategy appears to already be implemented (see _wrap_pcm16_as_wav_b64 in media_providers.py and the wire_format = "pcm16" if audio_format == "wav" else audio_format branch around line 1382). Filing for tracking so a release can be cut and consumers can drop their workarounds.
Found by the Python SDK consumer reel-af (an AgentField example pipeline).