Description
Bug Description
The AI agent intermittently fails to produce audible speech in LiveKit rooms. In affected cases, the agent successfully joins the room, receives and processes jobs, initializes without errors, and publishes an audio track. The client can also receive the agent’s initial text response, confirming that the LLM logic and job execution are functioning correctly.
However, despite the audio track being published, no speech is heard by the client. This indicates that the failure occurs at the media/audio pipeline level rather than in the AI reasoning or job handling logic. The issue does not occur consistently and is observed only in some sessions, while other sessions work as expected.
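One way to narrow this down further is to check whether an affected session publishes silent frames or no frames at all. The following is only a minimal sketch, assuming the raw audio can be captured as little-endian signed 16-bit PCM bytes; how to tap those bytes from the agent's TTS output or from the subscribed track is not shown. The names `rms_int16` and `classify_audio` and the threshold of 50 are hypothetical and not part of any LiveKit API.

```python
import math
import struct


def rms_int16(pcm: bytes) -> float:
    """Return the RMS amplitude of little-endian signed 16-bit PCM data."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    # Ignore a trailing odd byte, if any, so unpack always gets whole samples.
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def classify_audio(frames: list[bytes], silence_threshold: float = 50.0) -> str:
    """Classify captured frames as 'no_frames', 'silent', or 'audible'.

    silence_threshold is an arbitrary illustrative value, not a tuned constant.
    """
    if not frames:
        return "no_frames"
    loudest = max(rms_int16(frame) for frame in frames)
    return "audible" if loudest >= silence_threshold else "silent"
```

A result of "no_frames" would point at the TTS or publishing step never emitting audio, while "silent" would suggest frames are flowing but carry no signal.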
Expected Behavior
When the AI agent joins a LiveKit room and receives a job, it should consistently generate both text and audible speech output for every session. After initializing successfully and publishing an audio track, the agent’s synthesized speech should be clearly heard by all connected clients without delay or silence.
Reproduction Steps
from livekit.agents import AgentSession, RoomInputOptions, RoomOutputOptions
from livekit.plugins import noise_cancellation
from livekit.plugins.turn_detector.multilingual import MultilingualModel

# Excerpt from an async method; ctx, params, coach, and logger are defined elsewhere.

# Get model language (map 'eg' to 'ar')
lang_model = params.get_model_language()

# Get VAD
vad = self.model_provider.get_vad()

# Set up turn detector
turn_detector = MultilingualModel()

# Create session runner
session_runner = AgentSession(
    llm=self.model_provider.get_llm(creativity=params.creativity),
    stt=self.model_provider.get_stt(language=lang_model),
    tts=self.model_provider.get_tts(voice_type=params.voice_type, speed=params.voice_speed),
    vad=vad,
    turn_detection=turn_detector,
)

# Start session (this is blocking and won't return until session ends)
try:
    logger.info(f"Starting session in room: {ctx.room.name}")
    await session_runner.start(
        room=ctx.room,
        agent=coach,
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC()
        ),
        room_output_options=RoomOutputOptions(
            audio_enabled=True,
            transcription_enabled=True,
            sync_transcription=False,
        ),
    )
except Exception:
    logger.exception("Failed to start session runner")
    # Bare raise re-raises the active exception with its original traceback.
    raise
Operating System
Ubuntu 2024 and Windows 10
Models Used
No response
Package Versions
"livekit-agents==1.3.11",
"livekit-plugins-silero",
"livekit-plugins-noise-cancellation",
"livekit-plugins-turn-detector",
"livekit-plugins-openai",
"livekit-plugins-deepgram",
Session/Room/Call IDs
No response
Proposed Solution
Additional Context
No response
Screenshots and Recordings
No response