-
Notifications
You must be signed in to change notification settings - Fork 212
Add support for noiseCancellation frameProcessors #966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
🦋 Changeset detectedLatest commit: 509ab45 The changes in this PR will be included in the next version bump. This PR includes changesets to release 17 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
1egoman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any docs that need to be updated to go along with this? Maybe an example of using the aic enhancer via the FrameProcessor interface?
Other than that, this looks to be pretty much the same as the python change generally so this looks good to me! Not approving because I think probably somebody from the agents team should take a look and this is still a draft.
xianshijing-lk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
…into lukas/frame-processor
|
@lukasIO What is the status of this, is it ready to be merged in or is there more work you are intending to do? Also it looks like there are some merge conflicts, would you be able to fix those? |
📝 WalkthroughWalkthroughThis PR introduces support for custom Changes
Sequence Diagram(s)sequenceDiagram
participant App as Agent App
participant Stream as ParticipantAudioInputStream
participant FP as FrameProcessor
participant Room as Room
participant Track as AudioTrack
App->>Stream: new ParticipantAudioInputStream(frameProcessor)
Stream->>Stream: Store frameProcessor
Room->>Stream: TokenRefreshed event
Stream->>FP: onTokenRefreshed(updatedCredentials)
Room->>Stream: onTrackSubscribed(track)
Stream->>FP: Notify stream info & credentials
App->>Stream: createStream()
Stream->>FP: Process audio frames
FP-->>Track: Enhanced audio output
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
agents/src/voice/room_io/_input.ts (1)
121-128: Move frameProcessor closure to close() method instead of closeStream().The
closeStream()method is called on track changes (participant switches, track subscriptions, track unpublished events), and closing the frameProcessor there permanently disables noise cancellation for subsequent tracks. The frameProcessor is reused across tracks and updated with credentials and stream info after eachcloseStream()call (lines 152-160), making it essential to preserve its lifecycle independent of stream changes. Close the frameProcessor only during final cleanup in theclose()method.Suggested change
private closeStream() { if (this.deferredStream.isSourceSet) { this.deferredStream.detachSource(); } - - this.frameProcessor?.close(); - this.publication = null; } @@ async close() { this.room.off(RoomEvent.TrackSubscribed, this.onTrackSubscribed); this.room.off(RoomEvent.TrackUnpublished, this.onTrackUnpublished); this.room.off(RoomEvent.TokenRefreshed, this.onTokenRefreshed); + this.frameProcessor?.close(); this.closeStream(); // Ignore errors - stream may be locked by RecorderIO or already cancelled await this.deferredStream.stream.cancel().catch(() => {}); }examples/src/inworld_tts.ts (1)
4-12: Initialize the logger before any LLM usage (examples guideline).This example uses LLM functionality (line 66:
llm: 'openai/gpt-4.1-mini') but doesn't initialize the logger. AddinitializeLogger({ pretty: true })early before agent/session setup.Proposed fix
import { type JobContext, type JobProcess, WorkerOptions, cli, defineAgent, + initializeLogger, metrics, voice, } from '@livekit/agents'; import * as inworld from '@livekit/agents-plugin-inworld'; import * as livekit from '@livekit/agents-plugin-livekit'; import * as silero from '@livekit/agents-plugin-silero'; import { BackgroundVoiceCancellation } from '@livekit/noise-cancellation-node'; import { fileURLToPath } from 'node:url'; +initializeLogger({ pretty: true }); + export default defineAgent({examples/src/basic_agent.ts (1)
4-13: Initialize the logger before any LLM usage (example file requirement).This example uses LLM functionality (
llm: 'openai/gpt-4.1-mini') but doesn't initialize the logger. AddinitializeLogger({ pretty: true })at the top of the entry point before the agent/session setup.Proposed fix
import { type JobContext, type JobProcess, WorkerOptions, cli, defineAgent, + initializeLogger, llm, metrics, voice, } from '@livekit/agents'; import * as livekit from '@livekit/agents-plugin-livekit'; import * as silero from '@livekit/agents-plugin-silero'; import * as aic from '@livekit/plugins-ai-coustics'; import { fileURLToPath } from 'node:url'; import { z } from 'zod'; +initializeLogger({ pretty: true }); + export default defineAgent({
🤖 Fix all issues with AI agents
In `@examples/package.json`:
- Around line 41-43: Remove or correct the invalid npm dependency entry
"@livekit/plugins-ai-coustics": "0.1.7" in package.json: either delete that line
or replace it with the correct npm package name and version (or a valid
git/registry spec) for the intended Node.js plugin; ensure the resulting
package.json remains valid JSON and run npm install to verify the dependency
resolves.
🧹 Nitpick comments (2)
examples/src/inworld_tts.ts (1)
82-84: Prefer a typed alignment event overany(avoid eslint disable).If the Inworld SDK exposes an alignment event type, using it removes the need for
anyand the eslint override.♻️ Possible refinement
-// eslint-disable-next-line `@typescript-eslint/no-explicit-any` -session.tts!.on('alignment' as any, (data: any) => { +type AlignmentEvent = { + wordAlignment?: { words: string[]; starts: number[]; ends: number[] }; + characterAlignment?: { chars: string[]; starts: number[]; ends: number[] }; +}; + +session.tts!.on('alignment', (data: AlignmentEvent) => {.changeset/large-cars-pull.md (1)
1-2: Consider using "minor" instead of "patch" for this version bump.This PR introduces new functionality (support for
FrameProcessor<AudioFrame>as an alternative toNoiseCancellationOptions), which constitutes a backwards-compatible feature addition rather than a bug fix. According to semantic versioning conventions, new features should trigger a "minor" version bump, while "patch" is reserved for backwards-compatible bug fixes.📦 Proposed change to version bump type
--- -"@livekit/agents": patch +"@livekit/agents": minor ---
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (7)
.changeset/large-cars-pull.mdagents/src/voice/room_io/_input.tsagents/src/voice/room_io/room_io.tsexamples/package.jsonexamples/src/basic_agent.tsexamples/src/inworld_tts.tsturbo.json
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'
Files:
agents/src/voice/room_io/room_io.tsagents/src/voice/room_io/_input.tsexamples/src/inworld_tts.tsexamples/src/basic_agent.ts
**/*.{ts,tsx}?(test|example|spec)
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
When testing inference LLM, always use full model names from
agents/src/inference/models.ts(e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')
Files:
agents/src/voice/room_io/room_io.tsagents/src/voice/room_io/_input.tsexamples/src/inworld_tts.tsexamples/src/basic_agent.ts
**/*.{ts,tsx}?(test|example)
📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)
Initialize logger before using any LLM functionality with
initializeLogger({ pretty: true })from '@livekit/agents'
Files:
agents/src/voice/room_io/room_io.tsagents/src/voice/room_io/_input.tsexamples/src/inworld_tts.tsexamples/src/basic_agent.ts
🔇 Additional comments (5)
turbo.json (1)
55-56: LGTM: Added VITEST to globalEnv.No concerns with this config update.
examples/src/basic_agent.ts (1)
85-85: LGTM: aic.audioEnhancement() as noiseCancellation.Looks good; please just confirm the plugin API returns a
FrameProcessor<AudioFrame>compatible with the widened noiseCancellation type.agents/src/voice/room_io/room_io.ts (1)
5-9: LGTM: noiseCancellation now supports FrameProcessor.This API widening fits the new frame-processor flow; please just confirm the rtc-node FrameProcessor type is the intended public surface here.
Also applies to: 80-80
agents/src/voice/room_io/_input.ts (1)
38-48: Theinstanceof FrameProcessorcheck is correct and will work at runtime.
FrameProcessoris imported as a concrete class (not a type-only import), and it's an abstract class from@livekit/rtc-nodethat supportsinstanceofchecks. The pattern is consistent with other similar checks in the same file (e.g.,instanceof RemoteParticipant).Likely an incorrect or invalid review comment.
.changeset/large-cars-pull.md (1)
5-5: Description is clear and concise.The changeset description accurately summarizes the feature addition. While it could be expanded to mention that
FrameProcessor<AudioFrame>is now supported as an alternative toNoiseCancellationOptions, the current brevity is typical and acceptable for changeset files.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
1egoman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, per my last comment I still think it would be good to get somebody from the agents team to look at this as well.
Add support for noiseCancellation frameProcessors, depends on livekit/node-sdks#605
Summary by CodeRabbit
Release Notes
New Features
Chores
✏️ Tip: You can customize this high-level summary in your review settings.