Description
Bug Description
When using LiveKit Agents with Gemini Live on Vertex AI (native audio), the agent consistently produces two spoken responses after a function call. The first response occurs during/after the tool call, and a second response occurs after the tool result is sent back. This happens even when the tool itself does not call generate_reply or say and simply returns a small string output.
Why can't I move to the Gemini API (and why you shouldn't either)? All Gemini API models are in preview and not ready for production. Gemini realtime 12-2025 has issues with tool calls, unreliable and highly variable latency, and other problems.
Expected Behavior
Only one spoken response per user turn. A tool call should not trigger a second, separate spoken response for the same turn.
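To make the expectation concrete, here is a minimal sketch of how the duplicate could be flagged from a per-turn event log. The event tuples and the kind names ("agent_speech", "tool_call", "tool_result") are hypothetical and used only for illustration; they are not LiveKit or Gemini event types, and the log would have to be collected by whatever instrumentation you already have.

```python
from collections import Counter
from typing import Iterable

# Hypothetical event record: (user_turn_id, event_kind). These names are
# illustrative only and are not LiveKit or Gemini event types.
Event = tuple[int, str]


def turns_with_duplicate_speech(events: Iterable[Event]) -> list[int]:
    """Return the user-turn ids that produced more than one spoken response."""
    spoken = Counter(turn for turn, kind in events if kind == "agent_speech")
    return sorted(turn for turn, count in spoken.items() if count > 1)


# Turn 1 shows the reported pattern: speech, tool call, tool result, speech again.
observed = [
    (0, "agent_speech"),
    (1, "agent_speech"),
    (1, "tool_call"),
    (1, "tool_result"),
    (1, "agent_speech"),
]
print(turns_with_duplicate_speech(observed))
```

With the expected behavior, the returned list would be empty for every session.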
Reproduction Steps
# repro.py
import asyncio
from typing import Any

from dotenv import load_dotenv
from livekit.agents import cli, JobContext, WorkerOptions, function_tool, RunContext
from livekit.agents.voice import AgentSession, Agent
from livekit.plugins import google

load_dotenv()


@function_tool
async def store_call_info(context: RunContext, info: dict[str, Any]) -> str:
    # No extra say/generate_reply here
    # await asyncio.sleep(0.1)
    return "Information saved."


async def entrypoint(ctx: JobContext):
    await ctx.connect()
    participant = await ctx.wait_for_participant()
    agent = Agent(
        # Trailing spaces keep the concatenated sentences from running together.
        instructions=(
            "Introduce yourself (you are John). Ask the user for their name, then call store_call_info when you have it to save his name. "
            "Then ask his job, and save also it with the function. "
            "Then, continue the conversation saying something about his job."
        ),
        tools=[store_call_info],
    )
    session = AgentSession(
        llm=google.realtime.RealtimeModel(
            vertexai=True,
            model="gemini-live-2.5-flash-native-audio",
            voice="Charon",
        ),
    )
    await session.start(agent=agent, room=ctx.room)


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
Operating System
macOS, Linux dev server
Models Used
gemini-live-2.5-flash-native-audio
Package Versions
python 3.12.11
livekit 1.0.23
livekit-agents 1.3.11
livekit-plugins-google 1.3.11
google-genai 1.58.0
Session/Room/Call IDs
No response
Proposed Solution
Additional Context
No response
Screenshots and Recordings
No response