@lukasIO
Contributor

@lukasIO lukasIO commented Jan 14, 2026

Add support for noiseCancellation frameProcessors. Depends on livekit/node-sdks#605.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for custom frame processors in audio noise cancellation, enabling more flexible audio enhancement options alongside traditional noise cancellation settings.
    • Example updated to demonstrate integration with AI-powered audio enhancement plugin.
  • Chores

    • Updated dependencies and build configuration.

✏️ Tip: You can customize this high-level summary in your review settings.

@changeset-bot

changeset-bot bot commented Jan 14, 2026

🦋 Changeset detected

Latest commit: 509ab45

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 17 packages

| Name | Type |
| --- | --- |
| @livekit/agents | Patch |
| @livekit/agents-plugin-anam | Patch |
| @livekit/agents-plugin-baseten | Patch |
| @livekit/agents-plugin-bey | Patch |
| @livekit/agents-plugin-cartesia | Patch |
| @livekit/agents-plugin-deepgram | Patch |
| @livekit/agents-plugin-elevenlabs | Patch |
| @livekit/agents-plugin-google | Patch |
| @livekit/agents-plugin-inworld | Patch |
| @livekit/agents-plugin-livekit | Patch |
| @livekit/agents-plugin-neuphonic | Patch |
| @livekit/agents-plugin-openai | Patch |
| @livekit/agents-plugin-resemble | Patch |
| @livekit/agents-plugin-rime | Patch |
| @livekit/agents-plugin-silero | Patch |
| @livekit/agents-plugins-test | Patch |
| @livekit/agents-plugin-xai | Patch |

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@lukasIO lukasIO marked this pull request as draft January 14, 2026 10:37
@lukasIO lukasIO changed the title from "Lukas/frame processor" to "Add support for noiseCancellation frameProcessors," Jan 14, 2026
@lukasIO lukasIO changed the title from "Add support for noiseCancellation frameProcessors," to "Add support for noiseCancellation frameProcessors" Jan 14, 2026
@1egoman 1egoman left a comment

Are there any docs that need to be updated to go along with this? Maybe an example of using the aic enhancer via the FrameProcessor interface?

Other than that, this looks pretty much the same as the Python change, so it looks good to me! I'm not approving because somebody from the agents team should probably take a look, and this is still a draft.

@xianshijing-lk xianshijing-lk left a comment

lgtm

@1egoman

1egoman commented Jan 20, 2026

@lukasIO What is the status of this? Is it ready to be merged, or is there more work you intend to do?

Also, it looks like there are some merge conflicts. Would you be able to fix those?

@coderabbitai

coderabbitai bot commented Jan 20, 2026

📝 Walkthrough

This PR introduces support for custom FrameProcessor<AudioFrame> objects as an alternative to standard noise-cancellation options in the agents package. The core ParticipantAudioInputStream class is updated to accept and manage frame processors, including token refresh event handling and track subscription lifecycle management. Examples are updated to use the new AI-Coustics plugin.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Changeset & Dependency Management: .changeset/large-cars-pull.md, examples/package.json | Added patch release changeset for frame processor support; introduced new @livekit/plugins-ai-coustics (v0.1.7) dependency |
| Core Audio Processing: agents/src/voice/room_io/_input.ts, agents/src/voice/room_io/room_io.ts | Extended ParticipantAudioInputStream and RoomInputOptions to accept FrameProcessor<AudioFrame> alongside NoiseCancellationOptions; added token refresh event listener and frame processor lifecycle management (initialization, cleanup, stream info/credentials updates) |
| Example Usage: examples/src/basic_agent.ts | Replaced BackgroundVoiceCancellation from @livekit/noise-cancellation-node with aic.audioEnhancement() from @livekit/plugins-ai-coustics |
| Lint & Configuration: examples/src/inworld_tts.ts, turbo.json | Added eslint disable comment for explicit any type in event handler; added VITEST environment variable to turbo.json globalEnv |

Sequence Diagram(s)

sequenceDiagram
    participant App as Agent App
    participant Stream as ParticipantAudioInputStream
    participant FP as FrameProcessor
    participant Room as Room
    participant Track as AudioTrack

    App->>Stream: new ParticipantAudioInputStream(frameProcessor)
    Stream->>Stream: Store frameProcessor
    Room->>Stream: TokenRefreshed event
    Stream->>FP: onTokenRefreshed(updatedCredentials)
    
    Room->>Stream: onTrackSubscribed(track)
    Stream->>FP: Notify stream info & credentials
    
    App->>Stream: createStream()
    Stream->>FP: Process audio frames
    FP-->>Track: Enhanced audio output
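
The flow in the diagram above can be sketched in TypeScript roughly as follows. This is a minimal, self-contained illustration and not the code in `_input.ts`: `SketchFrameProcessor`, `HalfGainProcessor`, `SketchInputStream`, `Credentials`, and `StreamInfo` are hypothetical stand-ins, and the hook names (`onCredentialsUpdated`, `onStreamInfoUpdated`, `process`) are assumptions rather than the confirmed `@livekit/rtc-node` API.

```typescript
// Hypothetical stand-ins; the real types live in @livekit/rtc-node and may differ.
interface Credentials { token: string; url: string }
interface StreamInfo { roomName: string; participantIdentity: string; publicationSid: string }
type Frame = number[];

abstract class SketchFrameProcessor {
  abstract process(frame: Frame): Frame;
  // Lifecycle hooks default to no-ops so simple processors can ignore them.
  onCredentialsUpdated(_creds: Credentials): void {}
  onStreamInfoUpdated(_info: StreamInfo): void {}
  close(): void {}
}

// Toy processor: halves each sample and records which hooks were called.
class HalfGainProcessor extends SketchFrameProcessor {
  readonly events: string[] = [];

  process(frame: Frame): Frame {
    return frame.map((sample) => Math.trunc(sample / 2));
  }

  onCredentialsUpdated(_creds: Credentials): void {
    this.events.push('credentials');
  }

  onStreamInfoUpdated(_info: StreamInfo): void {
    this.events.push('streamInfo');
  }
}

class SketchInputStream {
  private readonly processor?: SketchFrameProcessor;

  constructor(processor?: SketchFrameProcessor) {
    this.processor = processor;
  }

  // Mirrors "Room ->> Stream: TokenRefreshed event".
  handleTokenRefreshed(creds: Credentials): void {
    this.processor?.onCredentialsUpdated(creds);
  }

  // Mirrors "Stream ->> FP: Notify stream info & credentials".
  handleTrackSubscribed(info: StreamInfo, creds: Credentials): void {
    this.processor?.onStreamInfoUpdated(info);
    this.processor?.onCredentialsUpdated(creds);
  }

  // Mirrors "Stream ->> FP: Process audio frames"; passes frames through if no processor is set.
  push(frame: Frame): Frame {
    return this.processor ? this.processor.process(frame) : frame;
  }
}
```

The stream forwards room events to the processor and otherwise treats it as an opaque frame transform, which is the shape the walkthrough describes.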

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A frame processor hops into view,
With tokens refreshed and streams anew,
AI coustics make the audio bright,
The agents hear clearly, day and night!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description is incomplete and does not follow the required template. It lacks sections for Changes Made, Pre-Review Checklist, Testing, and Additional Notes, providing only a one-line summary with a dependency reference. | Complete the PR description by adding all required template sections: detailed Changes Made list, Pre-Review Checklist with completion status, Testing approach, and any Additional Notes. This helps reviewers understand the scope and testing coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title 'Add support for noiseCancellation frameProcessors' accurately reflects the main change: adding FrameProcessor support as an alternative to NoiseCancellationOptions in the noiseCancellation parameter across multiple files. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

@lukasIO
Contributor Author

lukasIO commented Jan 20, 2026

@coderabbitai review

@coderabbitai

coderabbitai bot commented Jan 20, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@lukasIO lukasIO marked this pull request as ready for review January 20, 2026 15:56
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
agents/src/voice/room_io/_input.ts (1)

121-128: Move the frameProcessor.close() call from closeStream() to close().

The closeStream() method is called on track changes (participant switches, track subscriptions, track unpublished events), and closing the frameProcessor there permanently disables noise cancellation for subsequent tracks. The frameProcessor is reused across tracks and updated with credentials and stream info after each closeStream() call (lines 152-160), making it essential to preserve its lifecycle independent of stream changes. Close the frameProcessor only during final cleanup in the close() method.

Suggested change
   private closeStream() {
     if (this.deferredStream.isSourceSet) {
       this.deferredStream.detachSource();
     }
-
-    this.frameProcessor?.close();
-
     this.publication = null;
   }
@@
   async close() {
     this.room.off(RoomEvent.TrackSubscribed, this.onTrackSubscribed);
     this.room.off(RoomEvent.TrackUnpublished, this.onTrackUnpublished);
     this.room.off(RoomEvent.TokenRefreshed, this.onTokenRefreshed);
+    this.frameProcessor?.close();
     this.closeStream();
     // Ignore errors - stream may be locked by RecorderIO or already cancelled
     await this.deferredStream.stream.cancel().catch(() => {});
   }
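
To illustrate the concern, here is a minimal sketch with hypothetical names (`CountingProcessor`, `SketchStream`). It does not reproduce `ParticipantAudioInputStream`; it only shows the split between per-track teardown and final cleanup, assuming a processor that rejects frames once it has been closed.

```typescript
// Hypothetical processor that refuses to process frames after close().
class CountingProcessor {
  closed = false;

  process(frame: number[]): number[] {
    if (this.closed) throw new Error('processor already closed');
    return frame;
  }

  close(): void {
    this.closed = true;
  }
}

// Hypothetical stream that reuses one processor across several tracks.
class SketchStream {
  private readonly processor: CountingProcessor;
  private trackSid: string | null = null;

  constructor(processor: CountingProcessor) {
    this.processor = processor;
  }

  // Called on every track change; tears down per-track state only.
  switchTrack(sid: string): void {
    this.closeStream();
    this.trackSid = sid;
  }

  private closeStream(): void {
    // The processor is deliberately left open so the next track can reuse it.
    this.trackSid = null;
  }

  push(frame: number[]): number[] {
    if (this.trackSid === null) throw new Error('no track attached');
    return this.processor.process(frame);
  }

  // Final cleanup: the only place the processor is closed.
  close(): void {
    this.processor.close();
    this.closeStream();
  }
}
```

If `closeStream()` closed the processor instead, the second `switchTrack()` in a session would leave the stream with a processor that can no longer handle frames.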
examples/src/inworld_tts.ts (1)

4-12: Initialize the logger before any LLM usage (examples guideline).

This example uses LLM functionality (line 66: llm: 'openai/gpt-4.1-mini') but doesn't initialize the logger. Add initializeLogger({ pretty: true }) early before agent/session setup.

Proposed fix
 import {
   type JobContext,
   type JobProcess,
   WorkerOptions,
   cli,
   defineAgent,
+  initializeLogger,
   metrics,
   voice,
 } from '@livekit/agents';
 import * as inworld from '@livekit/agents-plugin-inworld';
 import * as livekit from '@livekit/agents-plugin-livekit';
 import * as silero from '@livekit/agents-plugin-silero';
 import { BackgroundVoiceCancellation } from '@livekit/noise-cancellation-node';
 import { fileURLToPath } from 'node:url';
 
+initializeLogger({ pretty: true });
+
 export default defineAgent({
examples/src/basic_agent.ts (1)

4-13: Initialize the logger before any LLM usage (example file requirement).

This example uses LLM functionality (llm: 'openai/gpt-4.1-mini') but doesn't initialize the logger. Add initializeLogger({ pretty: true }) at the top of the entry point before the agent/session setup.

Proposed fix
 import {
   type JobContext,
   type JobProcess,
   WorkerOptions,
   cli,
   defineAgent,
+  initializeLogger,
   llm,
   metrics,
   voice,
 } from '@livekit/agents';
 import * as livekit from '@livekit/agents-plugin-livekit';
 import * as silero from '@livekit/agents-plugin-silero';
 import * as aic from '@livekit/plugins-ai-coustics';
 import { fileURLToPath } from 'node:url';
 import { z } from 'zod';
 
+initializeLogger({ pretty: true });
+
 export default defineAgent({
🤖 Fix all issues with AI agents
In `@examples/package.json`:
- Around line 41-43: Remove or correct the invalid npm dependency entry
"@livekit/plugins-ai-coustics": "0.1.7" in package.json: either delete that line
or replace it with the correct npm package name and version (or a valid
git/registry spec) for the intended Node.js plugin; ensure the resulting
package.json remains valid JSON and run npm install to verify the dependency
resolves.
🧹 Nitpick comments (2)
examples/src/inworld_tts.ts (1)

82-84: Prefer a typed alignment event over any (avoid eslint disable).

If the Inworld SDK exposes an alignment event type, using it removes the need for any and the eslint override.

♻️ Possible refinement
-// eslint-disable-next-line `@typescript-eslint/no-explicit-any`
-session.tts!.on('alignment' as any, (data: any) => {
+type AlignmentEvent = {
+  wordAlignment?: { words: string[]; starts: number[]; ends: number[] };
+  characterAlignment?: { chars: string[]; starts: number[]; ends: number[] };
+};
+
+session.tts!.on('alignment', (data: AlignmentEvent) => {
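
As a rough, self-contained sketch of what a typed alignment event buys: the emitter below (`SketchTypedEmitter`) and its event map (`SketchTtsEvents`) are hypothetical and are not the Inworld plugin's or `@livekit/agents` API. The `AlignmentEvent` shape is copied from the refinement above. With an event map, the listener's `data` parameter is inferred and no `any` or eslint override is needed.

```typescript
type AlignmentEvent = {
  wordAlignment?: { words: string[]; starts: number[]; ends: number[] };
  characterAlignment?: { chars: string[]; starts: number[]; ends: number[] };
};

// Hypothetical event map: event name -> payload type.
type SketchTtsEvents = { alignment: AlignmentEvent };

// Hypothetical minimal emitter whose on()/emit() are checked against the event map.
class SketchTypedEmitter<Events extends Record<string, unknown>> {
  private listeners: { [K in keyof Events]?: Array<(data: Events[K]) => void> } = {};

  on<K extends keyof Events>(event: K, listener: (data: Events[K]) => void): void {
    const list = this.listeners[event] ?? [];
    list.push(listener);
    this.listeners[event] = list;
  }

  emit<K extends keyof Events>(event: K, data: Events[K]): void {
    for (const listener of this.listeners[event] ?? []) listener(data);
  }
}

// Usage: `data` is inferred as AlignmentEvent, so optional chaining is type-checked.
const sketchTts = new SketchTypedEmitter<SketchTtsEvents>();
const seenWords: string[] = [];
sketchTts.on('alignment', (data) => {
  seenWords.push(...(data.wordAlignment?.words ?? []));
});
sketchTts.emit('alignment', {
  wordAlignment: { words: ['hello', 'world'], starts: [0, 0.4], ends: [0.4, 0.9] },
});
```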
.changeset/large-cars-pull.md (1)

1-2: Consider using "minor" instead of "patch" for this version bump.

This PR introduces new functionality (support for FrameProcessor<AudioFrame> as an alternative to NoiseCancellationOptions), which constitutes a backwards-compatible feature addition rather than a bug fix. According to semantic versioning conventions, new features should trigger a "minor" version bump, while "patch" is reserved for backwards-compatible bug fixes.

📦 Proposed change to version bump type
 ---
-"@livekit/agents": patch
+"@livekit/agents": minor
 ---
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d35967 and 509ab45.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (7)
  • .changeset/large-cars-pull.md
  • agents/src/voice/room_io/_input.ts
  • agents/src/voice/room_io/room_io.ts
  • examples/package.json
  • examples/src/basic_agent.ts
  • examples/src/inworld_tts.ts
  • turbo.json
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

  • agents/src/voice/room_io/room_io.ts
  • agents/src/voice/room_io/_input.ts
  • examples/src/inworld_tts.ts
  • examples/src/basic_agent.ts
**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

  • agents/src/voice/room_io/room_io.ts
  • agents/src/voice/room_io/_input.ts
  • examples/src/inworld_tts.ts
  • examples/src/basic_agent.ts
**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

  • agents/src/voice/room_io/room_io.ts
  • agents/src/voice/room_io/_input.ts
  • examples/src/inworld_tts.ts
  • examples/src/basic_agent.ts
🔇 Additional comments (5)
turbo.json (1)

55-56: LGTM: Added VITEST to globalEnv.

No concerns with this config update.

examples/src/basic_agent.ts (1)

85-85: LGTM: aic.audioEnhancement() as noiseCancellation.

Looks good; please just confirm the plugin API returns a FrameProcessor<AudioFrame> compatible with the widened noiseCancellation type.

agents/src/voice/room_io/room_io.ts (1)

5-9: LGTM: noiseCancellation now supports FrameProcessor.

This API widening fits the new frame-processor flow; please just confirm the rtc-node FrameProcessor type is the intended public surface here.

Also applies to: 80-80

agents/src/voice/room_io/_input.ts (1)

38-48: The instanceof FrameProcessor check is correct and will work at runtime.

FrameProcessor is imported as a concrete class (not a type-only import), and it's an abstract class from @livekit/rtc-node that supports instanceof checks. The pattern is consistent with other similar checks in the same file (e.g., instanceof RemoteParticipant).
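
A small sketch of why this works, using hypothetical stand-ins (`AbstractProcessorSketch`, `OptionsSketch`, `PassthroughSketch`) rather than the real `@livekit/rtc-node` types: an abstract class still exists at runtime, so `instanceof` can tell it apart from a plain options object and narrows the union. With a type-only import the class would be erased at compile time and could not be used with `instanceof`.

```typescript
// Hypothetical stand-in for an abstract processor base class.
abstract class AbstractProcessorSketch {
  abstract process(frame: number[]): number[];
}

// Hypothetical stand-in for a plain options object; field names are assumptions.
interface OptionsSketch {
  moduleId: string;
  options: Record<string, unknown>;
}

class PassthroughSketch extends AbstractProcessorSketch {
  process(frame: number[]): number[] {
    return frame;
  }
}

// Branches on the runtime class; TypeScript narrows `nc` in each branch.
function describeNoiseCancellation(nc?: OptionsSketch | AbstractProcessorSketch): string {
  if (nc instanceof AbstractProcessorSketch) {
    return 'frame-processor';
  }
  if (nc) {
    return `options:${nc.moduleId}`;
  }
  return 'none';
}
```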

Likely an incorrect or invalid review comment.

.changeset/large-cars-pull.md (1)

5-5: Description is clear and concise.

The changeset description accurately summarizes the feature addition. While it could be expanded to mention that FrameProcessor<AudioFrame> is now supported as an alternative to NoiseCancellationOptions, the current brevity is typical and acceptable for changeset files.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@1egoman 1egoman left a comment

Looks good to me. Per my last comment, I still think it would be good to get somebody from the agents team to look at this as well.