voice: add response_format to ModelSettings and chat() #4559
Conversation
📝 Walkthrough

This pull request adds support for a `response_format` parameter in `ModelSettings` and the LLM `chat()` interface, threaded through the core agents library and the provider plugins.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/llm.py (1)
125-202: `response_format` parameter is accepted but silently ignored.

The `response_format` parameter (line 134) is part of the base LLM interface and is accepted by the AWS provider's `chat()` method, but it is never used: it's not added to `opts`, passed to the API, or processed in any way. AWS Bedrock's Converse API does not support a response_format or JSON mode parameter, so this silently ignores user configuration without any indication that the feature is unsupported.

Add a warning when `response_format` is provided, or raise an error indicating the parameter is not supported for AWS Bedrock, consistent with how other unsupported features are handled in this provider.
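A minimal sketch of the suggested guard, assuming the AWS plugin's existing `logger` and `is_given` helpers (names taken from the surrounding review, not verified against the file):

```python
# Inside the AWS provider's chat(), after parsing arguments.
# Warn instead of silently dropping the unsupported option.
if is_given(response_format):
    logger.warning(
        "response_format is not supported by the AWS Bedrock Converse API; "
        "the value will be ignored"
    )
```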
🤖 Fix all issues with AI agents
In
`@livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`:
- Around line 122-123: The response_format parameter is accepted by the method
but never forwarded to the OpenAI Responses API; update the Responses API
invocation (the same call that already forwards parallel_tool_calls and
tool_choice) to include response_format when it is not NOT_GIVEN: check for
NOT_GIVEN and, if a real value, add response_format=response_format to the API
call arguments so the Responses API receives the structured output directive.
Ensure you reference the existing NOT_GIVEN sentinel and the response_format
parameter in your change.
🧹 Nitpick comments (1)
livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py (1)
133-133: Unused `response_format` parameter is acceptable for interface consistency.

The parameter is added to match the abstract `LLM.chat()` signature but is intentionally unused, since Anthropic's API doesn't natively support structured response formats the way OpenAI's does. Consider adding a brief comment noting this limitation, or logging a warning if a caller provides a value.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- livekit-agents/livekit/agents/llm/fallback_adapter.py
- livekit-agents/livekit/agents/llm/llm.py
- livekit-agents/livekit/agents/voice/agent.py
- livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py
- livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/llm.py
- livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py
- livekit-plugins/livekit-plugins-mistralai/livekit/plugins/mistralai/llm.py
- livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py
- tests/fake_llm.py
💤 Files with no reviewable changes (1)
- livekit-plugins/livekit-plugins-mistralai/livekit/plugins/mistralai/llm.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
- livekit-agents/livekit/agents/llm/llm.py
- livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py
- livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py
- tests/fake_llm.py
- livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/llm.py
- livekit-agents/livekit/agents/voice/agent.py
- livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py
- livekit-agents/livekit/agents/llm/fallback_adapter.py
🧠 Learnings (2)
📚 Learning: 2026-01-19T07:59:36.851Z
Learnt from: keenranger
Repo: livekit/agents PR: 4511
File: livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py:46-54
Timestamp: 2026-01-19T07:59:36.851Z
Learning: In the LiveKit LangChain LangGraph integration (`livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py`), passing an empty list for `stream_mode` parameter (i.e., `stream_mode=[]`) is valid and intentional behavior—it allows users to opt out of streaming modes.
Applied to files:
livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py
📚 Learning: 2026-01-18T01:08:55.648Z
Learnt from: davidzhao
Repo: livekit/agents PR: 4548
File: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py:77-81
Timestamp: 2026-01-18T01:08:55.648Z
Learning: In the OpenAI responses LLM (`livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`), reasoning effort defaults are intentionally set lower than OpenAI's API defaults for voice interactions: "none" for gpt-5.1/gpt-5.2 and "minimal" for other reasoning-capable models like gpt-5, to avoid enabling reasoning by default in voice contexts.
Applied to files:
livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py
🧬 Code graph analysis (1)
livekit-agents/livekit/agents/voice/agent.py (2)
livekit-agents/livekit/agents/llm/llm.py (3)
- chat (123-133)
- chat_ctx (336-337)
- tools (340-341)

tests/fake_llm.py (1)
- chat (57-68)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: unit-tests
- GitHub Check: type-check (3.9)
- GitHub Check: type-check (3.13)
🔇 Additional comments (8)
livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py (1)
67-67: Unused `response_format` parameter is acceptable for the LangGraph adapter.

The parameter is added for interface consistency. Since LangGraph manages its own LLM invocations internally, propagating `response_format` would require deeper integration with the underlying graph's LLM configuration.

livekit-agents/livekit/agents/llm/llm.py (1)
131-131: Abstract signature correctly extended with the `response_format` parameter.

The addition of `response_format: NotGivenOr[Any]` to the abstract `chat()` method establishes the interface contract for all LLM implementations. The `Any` type provides flexibility for provider-specific formats.
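For reference, a sketch of what the extended abstract method plausibly looks like, reconstructed from the signature fragment quoted later in this review; the surrounding parameters and import paths are assumptions, not copied from the file:

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any

from livekit.agents.llm import ChatContext, FunctionTool, LLMStream, ToolChoice
from livekit.agents.types import NOT_GIVEN, NotGivenOr


class LLM(ABC):
    @abstractmethod
    def chat(
        self,
        *,
        chat_ctx: ChatContext,
        tools: list[FunctionTool] | None = None,
        tool_choice: NotGivenOr[ToolChoice] = NOT_GIVEN,
        response_format: NotGivenOr[Any] = NOT_GIVEN,  # new in this PR
        extra_kwargs: NotGivenOr[dict[str, Any]] = NOT_GIVEN,
    ) -> LLMStream: ...
```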
tests/fake_llm.py (1)

65-65: Test fake correctly updated to match the new interface.

The `FakeLLM.chat()` signature now includes `response_format` for interface compatibility. Not utilizing the parameter in the fake implementation is acceptable for testing purposes.

livekit-agents/livekit/agents/voice/agent.py (2)
32-33: `ModelSettings` correctly extended with a `response_format` field.

The new field with proper typing and docstring follows the existing pattern for `tool_choice`.

414-424: `response_format` correctly extracted and propagated to the LLM chat call.

The implementation properly reads `response_format` from `model_settings` and passes it through to `activity_llm.chat()`. The defensive `if model_settings else NOT_GIVEN` check on line 414 is consistent with line 413's handling of `tool_choice`.
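Roughly, the pattern the comment describes, written as a hypothetical helper (variable and parameter names are assumptions based on the comment, not the actual file):

```python
from livekit.agents.types import NOT_GIVEN


def build_llm_stream(activity_llm, chat_ctx, tools, model_settings):
    # Mirrors the described lines 413-424: read both settings defensively,
    # then forward them to the provider LLM.
    tool_choice = model_settings.tool_choice if model_settings else NOT_GIVEN
    response_format = model_settings.response_format if model_settings else NOT_GIVEN
    return activity_llm.chat(
        chat_ctx=chat_ctx,
        tools=tools,
        tool_choice=tool_choice,
        response_format=response_format,
    )
```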
livekit-agents/livekit/agents/llm/fallback_adapter.py (3)

95-107: LGTM! The `response_format` parameter is correctly added to the `chat()` method signature and properly passed to the `FallbackLLMStream` constructor.

129-137: LGTM! The `response_format` parameter is properly accepted and stored as instance state for use during generation.

166-179: LGTM! The `response_format` is correctly propagated to the underlying LLM's `chat()` call, ensuring consistent behavior across fallback attempts.
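A condensed, hypothetical sketch of the flow these three comments describe; class and attribute names are illustrative, not copied from the adapter:

```python
from __future__ import annotations

from typing import Any, AsyncIterator


class FallbackStreamSketch:
    def __init__(self, llms: list[Any], chat_ctx: Any, response_format: Any = None) -> None:
        self._llms = llms
        self._chat_ctx = chat_ctx
        self._response_format = response_format  # stored as instance state

    async def run(self) -> AsyncIterator[Any]:
        for target in self._llms:  # try providers in order
            try:
                async with target.chat(
                    chat_ctx=self._chat_ctx,
                    response_format=self._response_format,  # same format each attempt
                ) as stream:
                    async for chunk in stream:
                        yield chunk
                return  # a provider succeeded; stop falling back
            except Exception:
                continue  # fall through to the next provider
```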
response_format: NotGivenOr[Any] = NOT_GIVEN,
extra_kwargs: NotGivenOr[dict[str, Any]] = NOT_GIVEN,
🧩 Analysis chain
🌐 Web query:
OpenAI Responses API response_format parameter structure 2025
💡 Result:
Short answer — structure and examples (2025)
- Two main modes: JSON mode and Structured Outputs (schema-enforced). Use JSON mode with type "json_object"; use Structured Outputs with type "json_schema" and include a JSON Schema under "json_schema". [1][2]
- Current Responses API placement (2025): put the format inside the text field:
Example (schema-enforced):
{
"text": {
"format": {
"type": "json_schema",
"json_schema": {
"type": "object",
"properties": {
"name": {"type":"string"},
"age": {"type":"integer"}
},
"required":["name","age"],
"additionalProperties": false
}
}
}
}
Example (JSON mode):
{
"text": {
"format": { "type": "json_object" }
}
}
(The older parameter name response_format is still seen in examples but the Responses API places format under text.format). [1][2][3]
Sources
- OpenAI Structured Outputs / JSON mode docs. [1]
- OpenAI announcement + SDK examples for Structured Outputs. [2]
- 2025 technical deep-dive showing text.format usage and examples. [3]
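If the text.format placement above is right, a minimal call with the official openai Python SDK would look roughly like this; the model name and prompt are placeholders, and the JSON-mode variant is shown:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.responses.create(
    model="gpt-4o-mini",
    input="Respond in JSON: extract name and age from 'Ada is 36.'",
    text={"format": {"type": "json_object"}},  # JSON mode variant
)
print(resp.output_text)  # the model's JSON string
```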
🏁 Script executed:
# First, let's check the actual file to understand the context
fd -type f -name "llm.py" | grep -i openai | head -5
Repository: livekit/agents
Length of output: 228
🏁 Script executed:
# Find the llm.py file in openai responses directory
find . -path "*/openai/responses/llm.py" -type f
Repository: livekit/agents
Length of output: 138
🏁 Script executed:
# Also check the directory structure
fd "llm.py" | grep openaiRepository: livekit/agents
Length of output: 205
🏁 Script executed:
# Read the llm.py file to examine the response_format parameter and API call
cat -n ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py | sed -n '110,170p'
Repository: livekit/agents
Length of output: 2576
🏁 Script executed:
# Search for all usages of response_format in the openai plugin
rg "response_format" livekit-plugins/livekit-plugins-openai/Repository: livekit/agents
Length of output: 1615
🏁 Script executed:
# Check LLMStream class definition to see if it handles response_format
cat -n ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py | sed -n '170,250p'
Repository: livekit/agents
Length of output: 3694
🏁 Script executed:
# Check the regular llm.py file to see response_format handling
cat -n ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py | sed -n '1,50p'
Repository: livekit/agents
Length of output: 1868
🏁 Script executed:
# Search for the chat() method in llm.py to see how response_format is used
rg -A 30 "def chat" ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py | head -80
Repository: livekit/agents
Length of output: 1173
🏁 Script executed:
# Check llm_utils to understand to_openai_response_format
fd "llm_utils" livekit-plugins/livekit-plugins-openai/Repository: livekit/agents
Length of output: 40
🏁 Script executed:
# Find where llm_utils is located
fd -name "*.py" | xargs grep -l "def to_openai_response_format" | head -3
Repository: livekit/agents
Length of output: 288
🏁 Script executed:
# Check how response_format is used in the regular llm.py
rg -B 5 -A 10 "extra\[\"response_format\"\]" ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py
Repository: livekit/agents
Length of output: 684
🏁 Script executed:
# Check if there are any tests or examples showing how response_format is used
rg -B 5 -A 5 "response_format" ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/ | head -40
Repository: livekit/agents
Length of output: 1398
🏁 Script executed:
# Find and examine the llm_utils to_openai_response_format function
find . -path "*/agents/llm/utils.py" -o -path "*/livekit/agents/llm/utils.py" | xargs cat 2>/dev/null | head -200
Repository: livekit/agents
Length of output: 6557
🏁 Script executed:
# Or search more broadly
rg -B 2 -A 15 "def to_openai_response_format" --type py
Repository: livekit/agents
Length of output: 1403
🏁 Script executed:
# Let's verify the exact structure of response_format for Responses API
# by checking if there are any tests or examples
rg -r "responses.create" ./livekit-plugins/livekit-plugins-openai/ --type py -A 10 | head -50Repository: livekit/agents
Length of output: 40
🏁 Script executed:
# Check the full to_openai_response_format function to understand the structure better
rg -B 5 -A 25 "def to_openai_response_format" livekit-agents/
Repository: livekit/agents
Length of output: 2378
response_format parameter is accepted but not utilized in the API call.
OpenAI's Responses API supports response_format for structured output control. The parameter is defined in the method signature but never propagated to the API request, unlike parallel_tool_calls and tool_choice.
🔧 Suggested fix to propagate response_format
if is_given(tool_choice):
oai_tool_choice: response_create_params.ToolChoice
if isinstance(tool_choice, dict):
oai_tool_choice = {
"type": "function",
"name": tool_choice["function"]["name"],
}
extra["tool_choice"] = oai_tool_choice
elif tool_choice in ("auto", "required", "none"):
oai_tool_choice = tool_choice # type: ignore
extra["tool_choice"] = oai_tool_choice
+ if is_given(response_format):
+ extra["text"] = {"format": response_format}
+
return LLMStream(

🤖 Prompt for AI Agents
In
`@livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`
around lines 122-123: The response_format parameter is accepted by the method
but never forwarded to the OpenAI Responses API; update the Responses API
invocation (the same call that already forwards parallel_tool_calls and
tool_choice) to include response_format when it is not NOT_GIVEN: check for
NOT_GIVEN and, if a real value, add response_format=response_format to the API
call arguments so the Responses API receives the structured output directive.
Ensure you reference the existing NOT_GIVEN sentinel and the response_format
parameter in your change.
Since the goal of this PR is to make it work only for google, leaving this as a passthrough.
@vishal-seshagiri-infinitusai, understood! Since the goal is to implement response_format for Google only, leaving it as a passthrough for OpenAI Responses makes sense for this PR. The parameter maintains interface consistency across providers while keeping the implementation scoped appropriately.
✏️ Learnings added
Learnt from: vishal-seshagiri-infinitusai
Repo: livekit/agents PR: 4559
File: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py:122-123
Timestamp: 2026-01-19T23:21:47.799Z
Learning: In PR `#4559`, response_format was intentionally added as a passthrough (accepted but not utilized) to the OpenAI Responses API in `livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`. The implementation is scoped only for Google provider, not for OpenAI.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: davidzhao
Repo: livekit/agents PR: 4548
File: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py:77-81
Timestamp: 2026-01-18T01:08:55.648Z
Learning: In the OpenAI responses LLM (`livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`), reasoning effort defaults are intentionally set lower than OpenAI's API defaults for voice interactions: "none" for gpt-5.1/gpt-5.2 and "minimal" for other reasoning-capable models like gpt-5, to avoid enabling reasoning by default in voice contexts.
livekit-plugins/livekit-plugins-mistralai/livekit/plugins/mistralai/llm.py (resolved)
class ModelSettings:
    tool_choice: NotGivenOr[llm.ToolChoice] = NOT_GIVEN
    """The tool choice to use when calling the LLM."""
    response_format: NotGivenOr[Any] = NOT_GIVEN
Since not all LLMs support response format, perhaps a better way is to use it in a custom llm_node instead of adding it to ModelSettings; here is an example.
Btw, I didn't see how you want to pass the response_format to the agent from the user's code.
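The reviewer's linked example isn't preserved here; as a rough reconstruction of the suggested llm_node approach, assuming current LiveKit Agents APIs (signatures and import paths are from memory and unverified):

```python
from typing import Any, AsyncIterable

from livekit.agents import llm
from livekit.agents.voice import Agent, ModelSettings


class StructuredOutputAgent(Agent):
    """Carries its own response_format instead of widening ModelSettings."""

    def __init__(self, response_format: Any, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        self._response_format = response_format

    async def llm_node(
        self,
        chat_ctx: llm.ChatContext,
        tools: list[llm.FunctionTool],
        model_settings: ModelSettings,
    ) -> AsyncIterable[llm.ChatChunk]:
        # Call the provider LLM directly so response_format can be injected,
        # assuming the underlying plugin's chat() accepts the kwarg.
        async with self.llm.chat(
            chat_ctx=chat_ctx,
            tools=tools,
            response_format=self._response_format,
        ) as stream:
            async for chunk in stream:
                yield chunk
```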
Addresses - #4558 (Support structured outputs for google llm models)