
Support structured outputs for google llm models #4558

@vishal-seshagiri-infinitusai

Description

Feature Type

Would make my life easier

Feature Description

Structured output is a crucial capability for modern LLMs, as it ensures predictable, type-safe results and simplifies extracting structured data from unstructured text in agentic workflows (ai.google.dev documentation).
Currently, using this feature within a VoiceAgent requires developers to override the default llm_node. This issue proposes updating the base livekit-agents SDK's llm.ModelSettings dataclass to include a response_format field. This would allow agents to configure structured outputs explicitly during setup, simplifying integration for the Google plugin and, hopefully, other LLM providers in the future.
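
For illustration, a rough sketch of what the proposed change could look like (this is not current SDK code; the existing field shown is only illustrative, and the response_format field is the addition this issue is asking for):

```python
from dataclasses import dataclass
from typing import Any, Optional


# Sketch only: llm.ModelSettings after the proposed change.
# `tool_choice` stands in for whatever fields the dataclass already carries;
# `response_format` is the proposed addition (e.g. a Pydantic model class or a
# JSON schema dict that each plugin translates into its provider's native
# structured-output option, such as Gemini's response_schema).
@dataclass
class ModelSettings:
    tool_choice: Optional[Any] = None
    response_format: Optional[Any] = None
```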

Workarounds / Alternatives

The current workaround requires developers to create a custom VoiceAgent implementation and explicitly override the default llm_node method. This method must manually call the underlying self.llm.chat() function with the desired response_format parameter, adding boilerplate code for a common use case.
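
For concreteness, a minimal sketch of that workaround, assuming the current Agent.llm_node override pattern; the exact llm_node signature, the response_format keyword on chat(), and the Recipe schema are illustrative and may not match the installed plugin version:

```python
from pydantic import BaseModel

from livekit.agents import Agent
from livekit.plugins import google


class Recipe(BaseModel):
    name: str
    ingredients: list[str]
    steps: list[str]


class StructuredOutputAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="Reply with a recipe as JSON.",
            llm=google.LLM(model="gemini-2.0-flash"),
        )

    # Override the default llm_node so the schema reaches the provider.
    # The signature and the `response_format` keyword below are sketched from
    # this issue's description, not guaranteed by the current SDK.
    async def llm_node(self, chat_ctx, tools, model_settings):
        async with self.llm.chat(
            chat_ctx=chat_ctx,
            tools=tools,
            response_format=Recipe,
        ) as stream:
            async for chunk in stream:
                yield chunk
```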

Additional Context

https://ai.google.dev/gemini-api/docs/structured-output?example=recipe
https://geminibyexample.com/020-structured-output/
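
For reference, the underlying Gemini capability the plugin would forward to, roughly as shown in the linked ai.google.dev guide (using the google-genai SDK directly; the model name and Recipe schema are just example values):

```python
from google import genai
from pydantic import BaseModel


class Recipe(BaseModel):
    recipe_name: str
    ingredients: list[str]


# Reads the API key (GEMINI_API_KEY / GOOGLE_API_KEY) from the environment.
client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="List a few popular cookie recipes.",
    config={
        "response_mime_type": "application/json",
        "response_schema": list[Recipe],
    },
)

print(response.text)        # raw JSON string conforming to the schema
recipes = response.parsed   # list[Recipe] instances parsed from the JSON
```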
