Fix playback flush and speech interruption races #1518

Open
toubatbrian wants to merge 6 commits into main from brian/fix-audio-interruption-playback-races

toubatbrian (Contributor) commented May 15, 2026

Summary

Fixes several related playback/interruption races found while debugging the agents-js repro:

  • Prevent ParticipantAudioOutput.flush() from canceling/replacing an active flush for the same playback segment, which could emit duplicate playback_finished events and desync segment counters.
  • Make VAD/adaptive interruption end agent speech only once per paused speech handle, avoiding repeated onEndOfAgentSpeech() cascades.
  • Ensure AudioRecognition leaves isAgentSpeaking before waiting on interruption sentinel delivery, so STT does not remain stuck buffering user speech.
  • Mark agent speech as ended in AudioRecognition when pipeline completion moves the session from speaking to listening.
  • Handle confirmed interruption when only a paused speech remains, so later user speech is not treated as permanent overlap/interruption.
  • Stop tool-reply continuation after tool execution if the speech handle was interrupted.
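
The first fix above (reusing an active flush instead of replacing it) can be illustrated with a minimal sketch. This is not the ParticipantAudioOutput implementation: `SegmentedOutput`, `onPlayoutDone`, and the counter fields are hypothetical names chosen for illustration, and playout is simulated by a manual call. The point is only that a second `flush()` for the same segment returns the pending promise, so one segment yields one finished event.

```typescript
// Hypothetical stand-in for an audio output. Names and shape are assumptions,
// not the ParticipantAudioOutput API.
interface PlaybackFinished {
  segment: number;
  interrupted: boolean;
}

class SegmentedOutput {
  private started = 0; // segments opened by captureFrame()
  private finished = 0; // segments whose finished event was recorded
  private activeFlush: Promise<void> | null = null;
  private resolveFlush: (() => void) | null = null;
  readonly events: PlaybackFinished[] = [];

  captureFrame(): void {
    // The first frame after a finished segment opens a new segment.
    if (this.started === this.finished) this.started += 1;
  }

  flush(): Promise<void> {
    // Nothing is pending, so there is nothing to wait for.
    if (this.started === this.finished) return Promise.resolve();
    // A flush for this segment is already in flight: reuse it rather than
    // cancelling and replacing it.
    if (this.activeFlush !== null) return this.activeFlush;
    this.activeFlush = new Promise<void>((resolve) => {
      this.resolveFlush = () => resolve();
    });
    return this.activeFlush;
  }

  // Simulates the playout draining; records at most one event per segment.
  onPlayoutDone(interrupted = false): void {
    if (this.activeFlush === null) return; // stray completion, ignore
    this.finished += 1;
    this.events.push({ segment: this.finished, interrupted });
    const resolve = this.resolveFlush;
    this.activeFlush = null;
    this.resolveFlush = null;
    if (resolve !== null) resolve();
  }
}

const output = new SegmentedOutput();
output.captureFrame();
const first = output.flush();
const second = output.flush(); // same segment, same pending promise
output.onPlayoutDone();
output.onPlayoutDone(); // ignored: no flush is active
console.log(first === second, output.events.length); // prints "true 1"
```

Keying the guard on the in-flight promise keeps the started and finished counters in step, which is the invariant the duplicate-event bug would break.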

Verification

  • Added regression coverage for duplicate flush finishing.
  • Added regression coverage for repeated VAD interruption on the same paused speech.
  • Added regression coverage for pipeline completion ending audio recognition.
  • Added regression coverage for confirmed interruption with only paused speech.
  • Added deterministic AudioRecognition regression for STT buffering while agent speech end is in progress.
  • Verified repro logs before/after: max onEndOfAgentSpeech() burst dropped from 36 to 1, and STT no longer stays stuck buffering after interruption.
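
The repeated-VAD-interruption regression can be pictured with a small once-per-handle guard. This is a sketch under stated assumptions, not the code in this PR: `SpeechEndGuard`, `PausedSpeech`, and `endOnce` are hypothetical names, and a counter stands in for `onEndOfAgentSpeech()`. The loop count of 36 mirrors the burst size reported in the repro logs above.

```typescript
// Hypothetical handle type; only object identity matters here.
interface PausedSpeech {
  id: string;
}

class SpeechEndGuard {
  // Handles that have already had their agent speech ended.
  private readonly ended = new WeakSet<PausedSpeech>();
  endOfSpeechCalls = 0; // stands in for onEndOfAgentSpeech() invocations

  // Returns true only the first time a given handle is ended.
  endOnce(handle: PausedSpeech): boolean {
    if (this.ended.has(handle)) return false;
    this.ended.add(handle);
    this.endOfSpeechCalls += 1;
    return true;
  }
}

const guard = new SpeechEndGuard();
const paused: PausedSpeech = { id: "speech_1" };
// Simulate a burst of VAD interruption signals for the same paused speech.
for (let i = 0; i < 36; i++) guard.endOnce(paused);
console.log(guard.endOfSpeechCalls); // prints 1
```

A `WeakSet` is used so that tracking a handle does not keep it alive after the session drops it; a boolean flag on the handle itself would work equally well.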

changeset-bot (Bot) commented May 15, 2026

🦋 Changeset detected

Latest commit: 42261c3

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 31 packages
Name Type
@livekit/agents Patch
@livekit/agents-plugin-anam Patch
@livekit/agents-plugin-assemblyai Patch
@livekit/agents-plugin-baseten Patch
@livekit/agents-plugin-bey Patch
@livekit/agents-plugin-cartesia Patch
@livekit/agents-plugin-cerebras Patch
@livekit/agents-plugin-deepgram Patch
@livekit/agents-plugin-elevenlabs Patch
@livekit/agents-plugin-fishaudio Patch
@livekit/agents-plugin-google Patch
@livekit/agents-plugin-hedra Patch
@livekit/agents-plugin-hume Patch
@livekit/agents-plugin-inworld Patch
@livekit/agents-plugin-lemonslice Patch
@livekit/agents-plugin-liveavatar Patch
@livekit/agents-plugin-livekit Patch
@livekit/agents-plugin-minimax Patch
@livekit/agents-plugin-mistral Patch
@livekit/agents-plugin-mistralai Patch
@livekit/agents-plugin-neuphonic Patch
@livekit/agents-plugin-openai Patch
@livekit/agents-plugin-phonic Patch
@livekit/agents-plugin-resemble Patch
@livekit/agents-plugin-rime Patch
@livekit/agents-plugin-runway Patch
@livekit/agents-plugin-sarvam Patch
@livekit/agents-plugin-silero Patch
@livekit/agents-plugins-test Patch
@livekit/agents-plugin-trugen Patch
@livekit/agents-plugin-xai Patch

chatgpt-codex-connector[bot]

This comment was marked as resolved.

devin-ai-integration (Bot, Contributor) left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

Ensure agent speech end state is synchronized across VAD interruption, confirmed interruption, and pipeline completion paths so STT does not remain stuck buffering user speech as overlap. Add focused regressions for repeated VAD interruption, paused-speech interruption recovery, and pipeline completion cleanup.

devin-ai-integration[bot]

This comment was marked as resolved.
