voice: add response_format to ModelSettings and chat() #4559
Conversation
📝 Walkthrough

This pull request adds support for a `response_format` parameter in `ModelSettings` and the LLM `chat()` interface, threaded through the core agents library and the provider plugins.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/llm.py (1)
125-202: `response_format` parameter is accepted but silently ignored.

The `response_format` parameter (line 134) is part of the base LLM interface and is accepted by the AWS provider's `chat()` method, but it is never used: it's not added to `opts`, passed to the API, or processed in any way. AWS Bedrock's Converse API does not support a response_format or JSON mode parameter, so this silently ignores user configuration without any indication that the feature is unsupported.

Add a warning when `response_format` is provided, or raise an error indicating the parameter is not supported for AWS Bedrock, consistent with how other unsupported features are handled in this provider.
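A minimal sketch of the suggested guard, assuming the AWS plugin's existing `logger` and `is_given` helpers (names taken from the surrounding review, not verified against the file):

```python
# Inside the AWS provider's chat(), after parsing arguments.
# Warn instead of silently dropping the unsupported option.
if is_given(response_format):
    logger.warning(
        "response_format is not supported by the AWS Bedrock Converse API; "
        "the value will be ignored"
    )
```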
🤖 Fix all issues with AI agents
In
`@livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`:
- Around line 122-123: The response_format parameter is accepted by the method
but never forwarded to the OpenAI Responses API; update the Responses API
invocation (the same call that already forwards parallel_tool_calls and
tool_choice) to include response_format when it is not NOT_GIVEN: check for
NOT_GIVEN and, if a real value, add response_format=response_format to the API
call arguments so the Responses API receives the structured output directive.
Ensure you reference the existing NOT_GIVEN sentinel and the response_format
parameter in your change.
🧹 Nitpick comments (1)
livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py (1)
133-133: Unused `response_format` parameter is acceptable for interface consistency.

The parameter is added to match the abstract `LLM.chat()` signature but is intentionally unused, since Anthropic's API doesn't natively support structured response formats the way OpenAI's does. Consider adding a brief comment noting this limitation, or logging a warning if a caller provides a value.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- livekit-agents/livekit/agents/llm/fallback_adapter.py
- livekit-agents/livekit/agents/llm/llm.py
- livekit-agents/livekit/agents/voice/agent.py
- livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py
- livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/llm.py
- livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py
- livekit-plugins/livekit-plugins-mistralai/livekit/plugins/mistralai/llm.py
- livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py
- tests/fake_llm.py
💤 Files with no reviewable changes (1)
- livekit-plugins/livekit-plugins-mistralai/livekit/plugins/mistralai/llm.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
- livekit-agents/livekit/agents/llm/llm.py
- livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py
- livekit-plugins/livekit-plugins-anthropic/livekit/plugins/anthropic/llm.py
- tests/fake_llm.py
- livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/llm.py
- livekit-agents/livekit/agents/voice/agent.py
- livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py
- livekit-agents/livekit/agents/llm/fallback_adapter.py
🧠 Learnings (2)
📚 Learning: 2026-01-19T07:59:36.851Z
Learnt from: keenranger
Repo: livekit/agents PR: 4511
File: livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py:46-54
Timestamp: 2026-01-19T07:59:36.851Z
Learning: In the LiveKit LangChain LangGraph integration (`livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py`), passing an empty list for `stream_mode` parameter (i.e., `stream_mode=[]`) is valid and intentional behavior—it allows users to opt out of streaming modes.
Applied to files:
livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py
📚 Learning: 2026-01-18T01:08:55.648Z
Learnt from: davidzhao
Repo: livekit/agents PR: 4548
File: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py:77-81
Timestamp: 2026-01-18T01:08:55.648Z
Learning: In the OpenAI responses LLM (`livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`), reasoning effort defaults are intentionally set lower than OpenAI's API defaults for voice interactions: "none" for gpt-5.1/gpt-5.2 and "minimal" for other reasoning-capable models like gpt-5, to avoid enabling reasoning by default in voice contexts.
Applied to files:
livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py
🧬 Code graph analysis (1)
livekit-agents/livekit/agents/voice/agent.py (2)
livekit-agents/livekit/agents/llm/llm.py (3)
- chat (123-133)
- chat_ctx (336-337)
- tools (340-341)

tests/fake_llm.py (1)
- chat (57-68)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: unit-tests
- GitHub Check: type-check (3.9)
- GitHub Check: type-check (3.13)
🔇 Additional comments (8)
livekit-plugins/livekit-plugins-langchain/livekit/plugins/langchain/langgraph.py (1)
67-67: Unused `response_format` parameter is acceptable for the LangGraph adapter.

The parameter is added for interface consistency. Since LangGraph manages its own LLM invocations internally, propagating `response_format` would require deeper integration with the underlying graph's LLM configuration.

livekit-agents/livekit/agents/llm/llm.py (1)
131-131: Abstract signature correctly extended with the `response_format` parameter.

The addition of `response_format: NotGivenOr[Any]` to the abstract `chat()` method establishes the interface contract for all LLM implementations. The `Any` type provides flexibility for provider-specific formats.
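For reference, a sketch of what the extended abstract method plausibly looks like, reconstructed from the signature fragment quoted later in this review; the surrounding parameters and import paths are assumptions, not copied from the file:

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any

from livekit.agents.llm import ChatContext, FunctionTool, LLMStream, ToolChoice
from livekit.agents.types import NOT_GIVEN, NotGivenOr


class LLM(ABC):
    @abstractmethod
    def chat(
        self,
        *,
        chat_ctx: ChatContext,
        tools: list[FunctionTool] | None = None,
        tool_choice: NotGivenOr[ToolChoice] = NOT_GIVEN,
        response_format: NotGivenOr[Any] = NOT_GIVEN,  # new in this PR
        extra_kwargs: NotGivenOr[dict[str, Any]] = NOT_GIVEN,
    ) -> LLMStream: ...
```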
tests/fake_llm.py (1)

65-65: Test fake correctly updated to match the new interface.

The `FakeLLM.chat()` signature now includes `response_format` for interface compatibility. Not utilizing the parameter in the fake implementation is acceptable for testing purposes.

livekit-agents/livekit/agents/voice/agent.py (2)
32-33: `ModelSettings` correctly extended with a `response_format` field.

The new field with proper typing and docstring follows the existing pattern for `tool_choice`.

414-424: `response_format` correctly extracted and propagated to the LLM chat call.

The implementation properly reads `response_format` from `model_settings` and passes it through to `activity_llm.chat()`. The defensive `if model_settings else NOT_GIVEN` check on line 414 is consistent with line 413's handling of `tool_choice`.
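Roughly, the pattern the comment describes, written as a hypothetical helper (variable and parameter names are assumptions based on the comment, not the actual file):

```python
from livekit.agents.types import NOT_GIVEN


def build_llm_stream(activity_llm, chat_ctx, tools, model_settings):
    # Mirrors the described lines 413-424: read both settings defensively,
    # then forward them to the provider LLM.
    tool_choice = model_settings.tool_choice if model_settings else NOT_GIVEN
    response_format = model_settings.response_format if model_settings else NOT_GIVEN
    return activity_llm.chat(
        chat_ctx=chat_ctx,
        tools=tools,
        tool_choice=tool_choice,
        response_format=response_format,
    )
```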
livekit-agents/livekit/agents/llm/fallback_adapter.py (3)

95-107: LGTM! The `response_format` parameter is correctly added to the `chat()` method signature and properly passed to the `FallbackLLMStream` constructor.

129-137: LGTM! The `response_format` parameter is properly accepted and stored as instance state for use during generation.

166-179: LGTM! The `response_format` is correctly propagated to the underlying LLM's `chat()` call, ensuring consistent behavior across fallback attempts.
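A condensed, hypothetical sketch of the flow these three comments describe; class and attribute names are illustrative, not copied from the adapter:

```python
from __future__ import annotations

from typing import Any, AsyncIterator


class FallbackStreamSketch:
    def __init__(self, llms: list[Any], chat_ctx: Any, response_format: Any = None) -> None:
        self._llms = llms
        self._chat_ctx = chat_ctx
        self._response_format = response_format  # stored as instance state

    async def run(self) -> AsyncIterator[Any]:
        for target in self._llms:  # try providers in order
            try:
                async with target.chat(
                    chat_ctx=self._chat_ctx,
                    response_format=self._response_format,  # same format each attempt
                ) as stream:
                    async for chunk in stream:
                        yield chunk
                return  # a provider succeeded; stop falling back
            except Exception:
                continue  # fall through to the next provider
```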
response_format: NotGivenOr[Any] = NOT_GIVEN,
extra_kwargs: NotGivenOr[dict[str, Any]] = NOT_GIVEN,
🧩 Analysis chain
🌐 Web query:
OpenAI Responses API response_format parameter structure 2025
💡 Result:
Short answer — structure and examples (2025)
- Two main modes: JSON mode and Structured Outputs (schema-enforced). Use JSON mode with type "json_object"; use Structured Outputs with type "json_schema" and include a JSON Schema under "json_schema". [1][2]
- Current Responses API placement (2025): put the format inside the text field:
Example (schema-enforced):
{
"text": {
"format": {
"type": "json_schema",
"json_schema": {
"type": "object",
"properties": {
"name": {"type":"string"},
"age": {"type":"integer"}
},
"required":["name","age"],
"additionalProperties": false
}
}
}
}
Example (JSON mode):
{
"text": {
"format": { "type": "json_object" }
}
}
(The older parameter name response_format is still seen in examples but the Responses API places format under text.format). [1][2][3]
Sources
- OpenAI Structured Outputs / JSON mode docs. [1]
- OpenAI announcement + SDK examples for Structured Outputs. [2]
- 2025 technical deep-dive showing text.format usage and examples. [3]
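If the text.format placement above is right, a minimal call with the official openai Python SDK would look roughly like this; the model name and prompt are placeholders, and the JSON-mode variant is shown:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.responses.create(
    model="gpt-4o-mini",
    input="Respond in JSON: extract name and age from 'Ada is 36.'",
    text={"format": {"type": "json_object"}},  # JSON mode variant
)
print(resp.output_text)  # the model's JSON string
```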
🏁 Script executed:
# First, let's check the actual file to understand the context
fd -type f -name "llm.py" | grep -i openai | head -5
Repository: livekit/agents
Length of output: 228
🏁 Script executed:
# Find the llm.py file in openai responses directory
find . -path "*/openai/responses/llm.py" -type f
Repository: livekit/agents
Length of output: 138
🏁 Script executed:
# Also check the directory structure
fd "llm.py" | grep openaiRepository: livekit/agents
Length of output: 205
🏁 Script executed:
# Read the llm.py file to examine the response_format parameter and API call
cat -n ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py | sed -n '110,170p'
Repository: livekit/agents
Length of output: 2576
🏁 Script executed:
# Search for all usages of response_format in the openai plugin
rg "response_format" livekit-plugins/livekit-plugins-openai/Repository: livekit/agents
Length of output: 1615
🏁 Script executed:
# Check LLMStream class definition to see if it handles response_format
cat -n ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py | sed -n '170,250p'
Repository: livekit/agents
Length of output: 3694
🏁 Script executed:
# Check the regular llm.py file to see response_format handling
cat -n ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py | sed -n '1,50p'
Repository: livekit/agents
Length of output: 1868
🏁 Script executed:
# Search for the chat() method in llm.py to see how response_format is used
rg -A 30 "def chat" ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py | head -80
Repository: livekit/agents
Length of output: 1173
🏁 Script executed:
# Check llm_utils to understand to_openai_response_format
fd "llm_utils" livekit-plugins/livekit-plugins-openai/Repository: livekit/agents
Length of output: 40
🏁 Script executed:
# Find where llm_utils is located
fd -name "*.py" | xargs grep -l "def to_openai_response_format" | head -3
Repository: livekit/agents
Length of output: 288
🏁 Script executed:
# Check how response_format is used in the regular llm.py
rg -B 5 -A 10 "extra\[\"response_format\"\]" ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/llm.py
Repository: livekit/agents
Length of output: 684
🏁 Script executed:
# Check if there are any tests or examples showing how response_format is used
rg -B 5 -A 5 "response_format" ./livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/ | head -40
Repository: livekit/agents
Length of output: 1398
🏁 Script executed:
# Find and examine the llm_utils to_openai_response_format function
find . -path "*/agents/llm/utils.py" -o -path "*/livekit/agents/llm/utils.py" | xargs cat 2>/dev/null | head -200
Repository: livekit/agents
Length of output: 6557
🏁 Script executed:
# Or search more broadly
rg -B 2 -A 15 "def to_openai_response_format" --type py
Repository: livekit/agents
Length of output: 1403
🏁 Script executed:
# Let's verify the exact structure of response_format for Responses API
# by checking if there are any tests or examples
rg -r "responses.create" ./livekit-plugins/livekit-plugins-openai/ --type py -A 10 | head -50Repository: livekit/agents
Length of output: 40
🏁 Script executed:
# Check the full to_openai_response_format function to understand the structure better
rg -B 5 -A 25 "def to_openai_response_format" livekit-agents/
Repository: livekit/agents
Length of output: 2378
response_format parameter is accepted but not utilized in the API call.
OpenAI's Responses API supports response_format for structured output control. The parameter is defined in the method signature but never propagated to the API request, unlike parallel_tool_calls and tool_choice.
🔧 Suggested fix to propagate response_format
if is_given(tool_choice):
oai_tool_choice: response_create_params.ToolChoice
if isinstance(tool_choice, dict):
oai_tool_choice = {
"type": "function",
"name": tool_choice["function"]["name"],
}
extra["tool_choice"] = oai_tool_choice
elif tool_choice in ("auto", "required", "none"):
oai_tool_choice = tool_choice # type: ignore
extra["tool_choice"] = oai_tool_choice
+ if is_given(response_format):
+ extra["text"] = {"format": response_format}
+
return LLMStream(

🤖 Prompt for AI Agents
In
`@livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`
around lines 122-123: The response_format parameter is accepted by the method
but never forwarded to the OpenAI Responses API; update the Responses API
invocation (the same call that already forwards parallel_tool_calls and
tool_choice) to include response_format when it is not NOT_GIVEN: check for
NOT_GIVEN and, if a real value, add response_format=response_format to the API
call arguments so the Responses API receives the structured output directive.
Ensure you reference the existing NOT_GIVEN sentinel and the response_format
parameter in your change.
Since the goal of this PR is to make it work only for google, leaving this as a passthrough.
@vishal-seshagiri-infinitusai, understood! Since the goal is to implement response_format for Google only, leaving it as a passthrough for OpenAI Responses makes sense for this PR. The parameter maintains interface consistency across providers while keeping the implementation scoped appropriately.
✏️ Learnings added
Learnt from: vishal-seshagiri-infinitusai
Repo: livekit/agents PR: 4559
File: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py:122-123
Timestamp: 2026-01-19T23:21:47.799Z
Learning: In PR `#4559`, response_format was intentionally added as a passthrough (accepted but not utilized) to the OpenAI Responses API in `livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`. The implementation is scoped only for Google provider, not for OpenAI.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: davidzhao
Repo: livekit/agents PR: 4548
File: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/responses/llm.py:77-81
Timestamp: 2026-01-18T01:08:55.648Z
Learning: In the OpenAI responses LLM (`livekit-plugins-openai/livekit/plugins/openai/responses/llm.py`), reasoning effort defaults are intentionally set lower than OpenAI's API defaults for voice interactions: "none" for gpt-5.1/gpt-5.2 and "minimal" for other reasoning-capable models like gpt-5, to avoid enabling reasoning by default in voice contexts.
livekit-plugins/livekit-plugins-mistralai/livekit/plugins/mistralai/llm.py (resolved)
class ModelSettings:
    tool_choice: NotGivenOr[llm.ToolChoice] = NOT_GIVEN
    """The tool choice to use when calling the LLM."""
    response_format: NotGivenOr[Any] = NOT_GIVEN
Since not all LLMs support response format, perhaps a better way is to use it in a custom llm_node instead of adding it to ModelSettings; here is an example.
Btw, I didn't see how you want to pass the response_format to the agent from the user's code.
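The reviewer's linked example isn't preserved here; as a rough reconstruction of the suggested llm_node approach, assuming current LiveKit Agents APIs (signatures and import paths are from memory and unverified):

```python
from typing import Any, AsyncIterable

from livekit.agents import llm
from livekit.agents.voice import Agent, ModelSettings


class StructuredOutputAgent(Agent):
    """Carries its own response_format instead of widening ModelSettings."""

    def __init__(self, response_format: Any, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        self._response_format = response_format

    async def llm_node(
        self,
        chat_ctx: llm.ChatContext,
        tools: list[llm.FunctionTool],
        model_settings: ModelSettings,
    ) -> AsyncIterable[llm.ChatChunk]:
        # Call the provider LLM directly so response_format can be injected,
        # assuming the underlying plugin's chat() accepts the kwarg.
        async with self.llm.chat(
            chat_ctx=chat_ctx,
            tools=tools,
            response_format=self._response_format,
        ) as stream:
            async for chunk in stream:
                yield chunk
```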
Addresses - #4558 (Support structured outputs for google llm models)