feat: prevent OOM from unconsumed video frames in voice-only agents#458
feat: prevent OOM from unconsumed video frames in voice-only agents#458
Conversation
Pass video_buffered=False to disable unbounded video frame buffering. Defaults to True (current behavior). Propagates through rtc.join() to ConnectionManager → PeerConnectionManager → SubscriberPeerConnection. Voice-only agents should use video_buffered=False to prevent OOM from accumulating unconsumed video frames (~400 MiB/10sec).
📝 WalkthroughWalkthroughStreamEdge.join now passes an additional boolean argument Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py`:
- Around line 155-158: Validate that the constructor argument drain_video_frames
is a bool and annotate the signature: change def __init__(self,
drain_video_frames: bool, **kwargs: typing.Any) -> None, and at the start of
__init__ add a runtime check like if not isinstance(drain_video_frames, bool):
raise ValueError("drain_video_frames must be a bool") so non-bool values are
rejected early (this protects later uses such as in the code path referenced by
_drain_video_frames at line ~423); also annotate **kwargs with typing.Any and
the return type as None for consistency.
- Around line 420-423: The rtc.join(...) call in stream_edge_transport.py passes
an unsupported keyword drain_video_frames; either remove the drain_video_frames
argument from the rtc.join call (and any related property _drain_video_frames
usage) to stay compatible with released stream-py, or update the
getstream[webrtc,telemetry] dependency in pyproject.toml to point to the
development git reference/branch that provides drain_video_frames (e.g., set the
package to the git URL with branch "audio-more") so the runtime supports
rtc.join(..., drain_video_frames=...). Ensure you modify only the rtc.join
invocation and/or the pyproject.toml dependency entry and keep surrounding call
parameters (call, agent.agent_user.id, subscription_config) unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5a508231-3f96-415c-a5d1-222a80af8d77
📒 Files selected for processing (1)
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py
Outdated
Show resolved
Hide resolved
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py
Outdated
Show resolved
Hide resolved
Remove manual drain_video_frames flag from StreamEdge constructor. Instead, derive it from agent config: drain when no video_processors and no VideoLLM. Prevents OOM from unbounded aiortc frame queue in voice-only agents (~400 MiB/10sec growth). Variant 1 approach (agent-level). Variant 2 (transport-level via add_track_subscriber) to follow.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py (1)
419-422: Prefer a public capability contract over calling a private Agent method.At Line 422,
StreamEdge.join()depends onagent._needs_video(), which couples transport to Agent internals and makes refactors brittle. Consider exposing a public capability/property (or completing Variant 2 transport-local drain control) to keep boundaries clean.As per coding guidelines, "Use leading underscore for private attributes and methods (
_sessions,_warmup_agent). Public API is plain snake_case".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py` around lines 419 - 422, StreamEdge.join currently calls the private Agent method agent._needs_video(), coupling transport to Agent internals; change the call to a public capability or property (e.g., agent.needs_video() or agent.capabilities.get("video")) and implement that public accessor on the Agent class so transports can query video capability without touching private members; alternatively, implement the transport-local drain control variant and use that public flag in StreamEdge.join instead of agent._needs_video().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py`:
- Around line 419-422: StreamEdge.join currently calls the private Agent method
agent._needs_video(), coupling transport to Agent internals; change the call to
a public capability or property (e.g., agent.needs_video() or
agent.capabilities.get("video")) and implement that public accessor on the Agent
class so transports can query video capability without touching private members;
alternatively, implement the transport-local drain control variant and use that
public flag in StreamEdge.join instead of agent._needs_video().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6a385490-16f9-43a1-88b5-3acee27da4c9
📒 Files selected for processing (2)
agents-core/vision_agents/core/agents/agents.pyplugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py
…mers Agent._on_track_added unconditionally called add_track_subscriber for every video track, even for voice-only agents. This stopped the stream-py drain and left an unread relay proxy queue growing unbounded. Guard with _needs_video() so voice-only agents never subscribe. Remove drain_video_frames from StreamEdge.join() - stream-py now handles drain by default (companion PR stream-py#232).
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py`:
- Around line 419-422: The call in StreamEdge.join uses the private
Agent._needs_video(), so add a public Agent.needs_video_input() method that
returns the same boolean (delegating to the private implementation) and update
StreamEdge.join to call agent.needs_video_input() instead of
agent._needs_video(); ensure any other callers are updated and run/adjust tests
that reference the private method so the transport layer no longer depends on
Agent's private API.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c901036a-c0af-410a-9b09-10ab8f1f6df7
📒 Files selected for processing (2)
agents-core/vision_agents/core/agents/agents.pyplugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py
🚧 Files skipped from review as they are similar to previous changes (1)
- agents-core/vision_agents/core/agents/agents.py
plugins/getstream/vision_agents/plugins/getstream/stream_edge_transport.py
Outdated
Show resolved
Hide resolved
59ce64f to
7337a70
Compare
Why
Voice-only agents receive video tracks from WebRTC but never consume the frames. aiortc's
RTCRtpReceiveruses an unbounded queue for decoded video frames (aiortc#554), causing ~400 MiB/10sec growth and OOM within minutes.Changes
Agent._needs_video()to check if agent has video consumers (VideoProcessors or VideoLLM)_on_track_addedwhen agent doesn't need video, so stream-py's drain keeps runningdrain_video_framesflag fromStreamEdgeconstructor (stream-py now drains by default)How it works
stream-py attaches a MediaBlackhole to every incoming video track by default, draining unconsumed frames. When Agent needs video, it calls
add_track_subscriber()which stops the drain for that track and returns a real proxy. Voice-only agents never calladd_track_subscriber, so drain runs indefinitely.Benchmark (local k8s, 30sec video call)
Companion PR