Skip to content

feat: prevent OOM from unconsumed video frames in voice-only agents#458

Merged
aliev merged 7 commits intomainfrom
debug/video-memory-leak
Apr 2, 2026
Merged

feat: prevent OOM from unconsumed video frames in voice-only agents#458
aliev merged 7 commits intomainfrom
debug/video-memory-leak

Conversation

@aliev
Copy link
Copy Markdown
Member

@aliev aliev commented Mar 30, 2026

Why

Voice-only agents receive video tracks from WebRTC but never consume the frames. aiortc's RTCRtpReceiver uses an unbounded queue for decoded video frames (aiortc#554), causing ~400 MiB/10sec growth and OOM within minutes.

Changes

  • Extract Agent._needs_video() to check if agent has video consumers (VideoProcessors or VideoLLM)
  • Skip video track subscription in _on_track_added when agent doesn't need video, so stream-py's drain keeps running
  • Remove drain_video_frames flag from StreamEdge constructor (stream-py now drains by default)

How it works

stream-py attaches a MediaBlackhole to every incoming video track by default, draining unconsumed frames. When Agent needs video, it calls add_track_subscriber() which stops the drain for that track and returns a real proxy. Voice-only agents never call add_track_subscriber, so drain runs indefinitely.

Benchmark (local k8s, 30sec video call)

without drain with drain
Peak RSS 1654 MiB (+1404) 230 MiB (+10)

Companion PR

aliev added 3 commits March 30, 2026 19:10
Pass video_buffered=False to disable unbounded video frame buffering.
Defaults to True (current behavior). Propagates through rtc.join()
to ConnectionManager → PeerConnectionManager → SubscriberPeerConnection.

Voice-only agents should use video_buffered=False to prevent OOM from
accumulating unconsumed video frames (~400 MiB/10sec).
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 30, 2026

📝 Walkthrough

Walkthrough

StreamEdge.join now passes an additional boolean argument drain_video_frames=not agent._needs_video() to rtc.join(...). Agent class adds a private _needs_video() helper and _needs_audio_or_video_input() delegates video decision to it. No other control flow or error handling changes.

Changes

Cohort / File(s) Summary
StreamEdge / RTC join
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py
rtc.join(...) call updated to include drain_video_frames=not agent._needs_video() as an explicit boolean argument. No other logic changes.
Agent media logic
agents-core/vision_agents/core/agents/agents.py
Added private Agent._needs_video() -> bool; _needs_audio_or_video_input() now uses self._needs_video() for video determination. Refactor only — behavior preserved.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I clipped the dark with quiet scissors, kept
the small split light beneath my tongue;
a room of lenses breathing like a kept thing,
I watched the hush unspool into a wire,
and named the silence by the tilt of film.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately reflects the main objective: preventing out-of-memory errors in voice-only agents by automatically draining unconsumed video frames.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch debug/video-memory-leak

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py`:
- Around line 155-158: Validate that the constructor argument drain_video_frames
is a bool and annotate the signature: change def __init__(self,
drain_video_frames: bool, **kwargs: typing.Any) -> None, and at the start of
__init__ add a runtime check like if not isinstance(drain_video_frames, bool):
raise ValueError("drain_video_frames must be a bool") so non-bool values are
rejected early (this protects later uses such as in the code path referenced by
_drain_video_frames at line ~423); also annotate **kwargs with typing.Any and
the return type as None for consistency.
- Around line 420-423: The rtc.join(...) call in stream_edge_transport.py passes
an unsupported keyword drain_video_frames; either remove the drain_video_frames
argument from the rtc.join call (and any related property _drain_video_frames
usage) to stay compatible with released stream-py, or update the
getstream[webrtc,telemetry] dependency in pyproject.toml to point to the
development git reference/branch that provides drain_video_frames (e.g., set the
package to the git URL with branch "audio-more") so the runtime supports
rtc.join(..., drain_video_frames=...). Ensure you modify only the rtc.join
invocation and/or the pyproject.toml dependency entry and keep surrounding call
parameters (call, agent.agent_user.id, subscription_config) unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5a508231-3f96-415c-a5d1-222a80af8d77

📥 Commits

Reviewing files that changed from the base of the PR and between 37c479c and 6dadf78.

📒 Files selected for processing (1)
  • plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py

@aliev aliev changed the title feat: add drain_video_frames option to StreamEdge (WIP) feat: add drain_video_frames option to StreamEdge Mar 30, 2026
Remove manual drain_video_frames flag from StreamEdge constructor.
Instead, derive it from agent config: drain when no video_processors
and no VideoLLM. Prevents OOM from unbounded aiortc frame queue
in voice-only agents (~400 MiB/10sec growth).

Variant 1 approach (agent-level). Variant 2 (transport-level via
add_track_subscriber) to follow.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py (1)

419-422: Prefer a public capability contract over calling a private Agent method.

At Line 422, StreamEdge.join() depends on agent._needs_video(), which couples transport to Agent internals and makes refactors brittle. Consider exposing a public capability/property (or completing Variant 2 transport-local drain control) to keep boundaries clean.

As per coding guidelines, "Use leading underscore for private attributes and methods (_sessions, _warmup_agent). Public API is plain snake_case".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py`
around lines 419 - 422, StreamEdge.join currently calls the private Agent method
agent._needs_video(), coupling transport to Agent internals; change the call to
a public capability or property (e.g., agent.needs_video() or
agent.capabilities.get("video")) and implement that public accessor on the Agent
class so transports can query video capability without touching private members;
alternatively, implement the transport-local drain control variant and use that
public flag in StreamEdge.join instead of agent._needs_video().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py`:
- Around line 419-422: StreamEdge.join currently calls the private Agent method
agent._needs_video(), coupling transport to Agent internals; change the call to
a public capability or property (e.g., agent.needs_video() or
agent.capabilities.get("video")) and implement that public accessor on the Agent
class so transports can query video capability without touching private members;
alternatively, implement the transport-local drain control variant and use that
public flag in StreamEdge.join instead of agent._needs_video().

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6a385490-16f9-43a1-88b5-3acee27da4c9

📥 Commits

Reviewing files that changed from the base of the PR and between 6dadf78 and 6d850f3.

📒 Files selected for processing (2)
  • agents-core/vision_agents/core/agents/agents.py
  • plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py

…mers

Agent._on_track_added unconditionally called add_track_subscriber for
every video track, even for voice-only agents. This stopped the
stream-py drain and left an unread relay proxy queue growing unbounded.

Guard with _needs_video() so voice-only agents never subscribe.
Remove drain_video_frames from StreamEdge.join() - stream-py now
handles drain by default (companion PR stream-py#232).
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py`:
- Around line 419-422: The call in StreamEdge.join uses the private
Agent._needs_video(), so add a public Agent.needs_video_input() method that
returns the same boolean (delegating to the private implementation) and update
StreamEdge.join to call agent.needs_video_input() instead of
agent._needs_video(); ensure any other callers are updated and run/adjust tests
that reference the private method so the transport layer no longer depends on
Agent's private API.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c901036a-c0af-410a-9b09-10ab8f1f6df7

📥 Commits

Reviewing files that changed from the base of the PR and between 7337a70 and 59ce64f.

📒 Files selected for processing (2)
  • agents-core/vision_agents/core/agents/agents.py
  • plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • agents-core/vision_agents/core/agents/agents.py

@aliev aliev force-pushed the debug/video-memory-leak branch from 59ce64f to 7337a70 Compare April 2, 2026 11:28
@aliev aliev changed the title (WIP) feat: add drain_video_frames option to StreamEdge feat: add drain_video_frames option to StreamEdge Apr 2, 2026
@aliev aliev changed the title feat: add drain_video_frames option to StreamEdge feat: prevent OOM from unconsumed video frames in voice-only agents Apr 2, 2026
@aliev aliev merged commit 28eaf82 into main Apr 2, 2026
9 checks passed
@aliev aliev deleted the debug/video-memory-leak branch April 2, 2026 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants