
Conversation


@pr0gr3sR pr0gr3sR commented Feb 8, 2026

Wire up the existing LimitedPool and MemoryLimiter infrastructure (added in 2023, never used in Context) to cap packet and frame buffer pool memory. This prevents unbounded memory growth (~137 MB/min) when roc_receiver_read() is not called, e.g. when a PipeWire node is suspended while UDP sources keep transmitting.

Changes:

  • context.h/cpp: Add MemoryLimiter and LimitedPool members that wrap the existing SlabPools. Pass limited pools to NetworkLoop (wiring sketched after this list). Default limits: 32 MB for packets, 8 MB for frames (0 = unlimited).
  • udp_port.cpp/h: Handle NULL buffer from alloc_cb_ gracefully in recv_cb_. When allocation fails, pause UDP receive and schedule a 200ms retry timer to avoid busy-waiting on a full pool.
  • memory_limiter.cpp: Change acquire-failure log from LogError to LogTrace since hitting the limit is expected behavior, not an error.
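
A rough sketch of the Context wiring, for orientation only. The member names
follow this PR's description, but the constructor shapes of MemoryLimiter and
LimitedPool are assumptions rather than the actual roc-toolkit declarations:

// context.h (hypothetical shape)
core::MemoryLimiter packet_limiter_;       // shared budget for packet + buffer pools
core::LimitedPool   limited_packet_pool_;  // wraps the existing SlabPool

// context.cpp (hypothetical constructor wiring)
//   packet_limiter_("packet_pools", config.max_packet_pool_bytes), // 0 = unlimited
//   limited_packet_pool_(packet_pool_, packet_limiter_),
// NetworkLoop then receives limited_packet_pool_ instead of the raw SlabPool,
// so every allocation on the UDP receive path is checked against the budget.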

Description

roc-toolkit 0.4.0's receiver grows memory without bound when UDP packets arrive,
regardless of whether roc_receiver_read() is being called by the consumer. The
process grows from ~40 MB to 10-15 GB over hours, triggering the OOM killer.

Root cause: The receiver's UDP network thread (NetworkLoop::run()) allocates
packet buffers via UdpPort::alloc_cb_() and UdpPort::recv_cb_() for every
incoming packet. These buffers are allocated from a SlabPool, which grows
exponentially and never frees memory. There is no back-pressure mechanism, no
queue size limit, and no way for the receiver to signal the network thread to
stop accepting packets when internal queues are full.

The problem is most severe when roc_receiver_read() is not called (e.g., when
the PipeWire node is suspended with no consumers), but also occurs during normal
operation due to session churn — the watchdog kills sessions during blank audio
periods, and new sessions are immediately created when packets arrive, each
cycle growing the slab pool.

I traced this using heaptrack with a debug build of roc-toolkit and roc-toolkit's
own debug logging (via roc_log_set_level(ROC_LOG_DEBUG)).

Environment

  • roc-toolkit: 0.4.0 (Arch package roc-toolkit 0.4.0-1; debug build from git commit d599961a)
  • PipeWire: 1.4.10-2 (using libpipewire-module-roc-source and libpipewire-module-roc-sink)
  • libuv: 1.51.0-1
  • OS: Arch Linux, kernel 6.12.67-1-lts (x86_64)
  • CPU: Intel i7-1195G7
  • RAM: 32 GB + 4 GB zram swap
  • Network: WiFi LAN, both machines on 192.168.1.0/24

Configuration

4 ROC endpoints loaded via PipeWire modules (2 receivers, 2 senders) for ham radio audio streaming between two machines:

# Receiver (roc-source) — ploptop listens, workstation sends
roc-source: local ports 10001-10003, fec=disable, latency=50ms
roc-source: local ports 10005-10007, fec=disable, latency=50ms

# Sender (roc-sink) — ploptop sends to workstation
roc-sink:   remote 192.168.1.151 ports 10101-10103, fec=disable, latency=50ms
roc-sink:   remote 192.168.1.151 ports 10105-10107, fec=disable, latency=100ms

The remote machine running the corresponding roc-sink/roc-source endpoints does NOT exhibit the leak.

Heaptrack Evidence

Run 1 — stripped binary (84 minutes)

total runtime: 5032.48s
peak heap memory consumption: 5.09 GB
total memory leaked: 5.09 GB

Run 2 — debug build (44 minutes)

total runtime: 2659.80s
calls to allocation functions: 174,946 (65/s)
peak heap memory consumption: 2.86 GB
peak RSS (including heaptrack overhead): 3.04 GB
total memory leaked: 2.86 GB

Leak rate: ~3.9 GB/hour (under heaptrack), ~700 MB/hour observed without heaptrack.

Leak path 1 — packet buffers (2.13 GB, 5 slab allocations)

roc::core::HeapArena::allocate()                    ← heap_arena.cpp:52
  roc::core::SlabPoolImpl::allocate_new_slab_()     ← slab_pool_impl.cpp:244
  roc::core::SlabPoolImpl::acquire_slot_()          ← slab_pool_impl.cpp:191
  roc::core::SlabPoolImpl::allocate()               ← slab_pool_impl.cpp:89
  roc::core::SlabPool<>::allocate()                 ← slab_pool.h:116
  operator new(unsigned long, roc::core::IPool&)    ← ipool.h:63
  roc::packet::PacketFactory::new_packet_buffer()   ← packet_factory.cpp:52
  roc::netio::UdpPort::alloc_cb_()                  ← udp_port.cpp:225
  libuv (uv__udp_recvmsg → uv_run)
  roc::netio::NetworkLoop::run()                    ← network_loop.cpp:279
  roc::core::Thread::thread_runner_()               ← thread.cpp:114

Leak path 2 — packets (731 MB, 5 slab allocations)

roc::core::HeapArena::allocate()                    ← heap_arena.cpp:52
  roc::core::SlabPoolImpl::allocate_new_slab_()     ← slab_pool_impl.cpp:244
  roc::core::SlabPoolImpl::acquire_slot_()          ← slab_pool_impl.cpp:191
  roc::core::SlabPoolImpl::allocate()               ← slab_pool_impl.cpp:89
  roc::core::SlabPool<>::allocate()                 ← slab_pool.h:116
  operator new(unsigned long, roc::core::IPool&)    ← ipool.h:63
  roc::packet::PacketFactory::new_packet()          ← packet_factory.cpp:56
  roc::netio::UdpPort::recv_cb_()                   ← udp_port.cpp:324
  libuv (uv__udp_recvmsg → uv_run)
  roc::netio::NetworkLoop::run()                    ← network_loop.cpp:279
  roc::core::Thread::thread_runner_()               ← thread.cpp:114

Session churn (10 KB, 58 sessions in 44 min)

ReceiverSessionGroup::create_session_()             ← receiver_session_group.cpp:379
  ReceiverSessionGroup::route_transport_packet_()   ← receiver_session_group.cpp:323
  ReceiverSessionGroup::route_packet()              ← receiver_session_group.cpp:82
  ReceiverEndpoint::pull_packets()                  ← receiver_endpoint.cpp:206
  ...
  roc_receiver_read()                               ← receiver.cpp:216

58 sessions created in 44 minutes (~1.3/min). Sessions are created and destroyed
repeatedly. roc-toolkit debug logging confirmed SSRCs are consistent (not randomly
changing); instead, the watchdog kills sessions due to blank audio, and new
sessions are immediately created when packets continue arriving.

Run 3 — with roc debug logging (6 minutes, 2026-02-07)

total runtime: 358.50s
calls to allocation functions: 25,030 (69/s)
peak heap memory consumption: 553.65 MB
peak RSS (including heaptrack overhead): 766.03 MB
total memory leaked: 0B (memory still held by SlabPool, just never freed to OS)

Growth rate: ~137 MB/min measured, ~117 MB/min theoretical
(2 receivers × 345 pkt/s × 2816 bytes/pkt).

roc-toolkit Debug Log Evidence (2026-02-07)

Enabled via roc_log_set_level(ROC_LOG_DEBUG):

20:51:25 [INF] session group: creating session: src=192.168.1.151:49033 dst=0.0.0.0:10005
20:51:25 [DBG] session router: SSRC does not exist, creating new route: ssrc=2871749397
20:51:28 [DBG] watchdog: status: .................ibb
20:51:28 [DBG] watchdog: status: bbbbbbbbbbbbbbbbD...
20:52:32 [INF] session group: creating session: src=192.168.1.151:52377 dst=0.0.0.0:10001
20:52:32 [INF] session group: removing session
20:52:32 [DBG] session router: removing route: ssrc cname="12ba0111..." (old FTdx10 session)
20:52:32 [INF] session group: creating session: src=192.168.1.151:49033 dst=0.0.0.0:10005

Key observations from debug logs:

  • SSRCs are consistent per sender (not randomly changing)
  • Sessions cycle ~1/min: created → blank audio → watchdog drop → removed → recreated
  • Watchdog status: .=OK i=init b=blank D=drop
  • The 7300-RX PipeWire node was suspended (no consumer connected)
  • Even the FTdx10-RX node (running) had blank audio periods

Analysis

Primary issue: No back-pressure on UDP receive path

The receiver's UDP network thread allocates a packet buffer for every incoming
UDP packet via UdpPort::alloc_cb_(). These are stored in SlabPool, which:

  • Uses exponential growth (slab_cur_slots_ *= 2, slab_pool_impl.cpp:231); see the sketch after this list
  • Has max_slab=0 (no limit on slab count or memory)
  • Never frees slabs — no deallocation, shrink, or reclamation path
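
A simplified model of that growth behavior (illustrative only; the real
slab_pool_impl.cpp is more involved):

void* acquire_slot() {
    if (free_slots_.is_empty()) {
        slab_cur_slots_ *= 2;                // each new slab doubles in size
        allocate_new_slab_(slab_cur_slots_); // arena_.allocate(); never freed
    }
    return free_slots_.pop(); // with max_slab == 0, growth is unbounded
}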

When roc_receiver_read() is not called by the consumer (PipeWire node suspended),
packets accumulate in internal queues without limit. Even when it IS called, the
rate of packet allocation in the network thread exceeds the rate of consumption.

Growth rate math

  • packet.len=128 samples at 44.1kHz ≈ 345 packets/second per stream
  • packet_buffer_pool slot_size=2096 + packet_pool slot_size=720 = 2816 B/pkt
  • 2 receivers × 345 pkt/s × 2816 B = ~1.94 MB/s = 117 MB/min (theoretical)
  • Measured: 137 MB/min (includes resampler/session overhead)
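
The same arithmetic as a compile-time sanity check (C++; the constants are the
packet rate and slot sizes quoted above):

constexpr double pkts_per_sec  = 44100.0 / 128;            // ≈ 344.5 pkt/s per stream
constexpr double bytes_per_sec = 2 * pkts_per_sec * 2816;  // 2 receivers, 2816 B/pkt
constexpr double mb_per_min    = bytes_per_sec * 60 / 1e6; // ≈ 116 MB/min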

Session churn amplifies growth

Each session cycle (create → blank → drop → remove → recreate) grows the slab
pool because the new session's allocations overlap briefly with the old session's
cleanup, pushing the high-water mark. The SpeexResampler constructor allocates
~60 MB of frame buffers per session from the shared frame_buffer_pool.

Why the remote machine doesn't leak

Both machines run identical roc-toolkit 0.4.0-1 and pipewire 1.4.10-2. The remote
machine (zAI) is stable at 33 MB RSS after 3+ days. The difference is traffic
direction: zAI's roc-sources (receivers) get very few packets because the
laptop's TX sinks are suspended (nobody transmitting). The leak is proportional
to incoming UDP traffic volume.

OOM kill history (single boot, Feb 3-4 2026)

Time            PipeWire RSS at kill
Feb 03 21:19    ~10.6 GB
Feb 03 22:45    ~9.7 GB
Feb 04 01:37    ~14.4 GB
Feb 04 04:29    ~14.1 GB
Feb 04 07:21    ~14.0 GB
Feb 04 10:12    ~14.0 GB

Key observations

  • Leak is proportional to incoming UDP packet rate, not audio activity
  • Disabling ROC modules stops the leak completely
  • Remote machine does NOT leak (same roc-toolkit/pipewire versions) because its
    receivers get very few packets (the laptop's senders are suspended)
  • Receiver (roc-source) leaks ~3x more than sender (roc-sink)
  • WirePlumber stable at ~20 MB (not involved)
  • Persists across pipewire 1.4.9 and 1.4.10 (bug is in libroc.so.0.4)
  • A slab reclamation patch (adding deallocation of fully-empty slabs) had
    no effect — slabs never become fully empty because packet allocations
    overlap across session cycles

Proposed fix (PR submitted)

I've implemented a fix that wires up roc-toolkit's existing LimitedPool +
MemoryLimiter infrastructure (added in 2023, never used in Context) to cap
packet pool memory. The fix is minimal (5 files, +94/-12 lines), fully backwards
compatible (default limits are 0 = unlimited for upstream), and uses only existing
roc-toolkit classes.

Files modified:

File                          Change
roc_node/context.h            Add MemoryLimiter + LimitedPool members and
                              max_packet_pool_bytes / max_frame_pool_bytes config fields
roc_node/context.cpp          Wire limiters into the constructor; pass limited pools to NetworkLoop
roc_netio/udp_port.h          Add recv_retry_timer_ and recv_retry_cb_()
roc_netio/udp_port.cpp        Handle NULL alloc in recv_cb_(); pause UDP recv with a 200ms retry timer when the pool is full
roc_core/memory_limiter.cpp   Change acquire-failure log from LogError to LogTrace (expected behavior when a limit is active)

How it works:

  1. LimitedPool wraps each SlabPool, calling MemoryLimiter::acquire() before allocating
  2. When limit reached: allocate() returns NULL → alloc_cb_() returns buf->base=NULL
  3. recv_cb_() detects NULL, calls uv_udp_recv_stop() to pause receiving
  4. A 200ms one-shot timer restarts receiving; if the pool is still full, it pauses again (see the sketch after this list)
  5. When packets are consumed and freed, MemoryLimiter::release() restores budget
  6. Memory stabilizes at the configured cap instead of growing unbounded
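
A minimal, self-contained sketch of steps 2-4 against the real libuv API
(uv_udp_recv_stop(), uv_timer_start(), uv_udp_recv_start()). The handle and
timer are assumed already initialized, the pool hookup is stubbed, and none of
this is the literal udp_port.cpp code:

#include <stddef.h>
#include <uv.h>

static uv_udp_t   udp_handle;   // assumed initialized via uv_udp_init()
static uv_timer_t retry_timer;  // assumed initialized via uv_timer_init()

static void alloc_cb(uv_handle_t* handle, size_t suggested, uv_buf_t* buf);
static void recv_cb(uv_udp_t* handle, ssize_t nread, const uv_buf_t* buf,
                    const struct sockaddr* addr, unsigned flags);

static void alloc_cb(uv_handle_t* handle, size_t suggested, uv_buf_t* buf) {
    // In roc this asks the LimitedPool for a packet buffer; when the
    // MemoryLimiter budget is exhausted the pool returns NULL. Stubbed here:
    buf->base = NULL; // pretend the pool is full
    buf->len = 0;
    (void)handle; (void)suggested;
}

static void retry_cb(uv_timer_t* timer) {
    // Restart receiving; if the pool is still full, alloc_cb hands back a
    // NULL base again and recv_cb re-arms this timer.
    uv_udp_recv_start(&udp_handle, alloc_cb, recv_cb);
    (void)timer;
}

static void recv_cb(uv_udp_t* handle, ssize_t nread, const uv_buf_t* buf,
                    const struct sockaddr* addr, unsigned flags) {
    if (buf->base == NULL) {
        // Pool limit hit in alloc_cb. Stop receiving here, NOT inside
        // alloc_cb (see the implementation notes below), then retry in 200ms.
        uv_udp_recv_stop(handle);
        uv_timer_start(&retry_timer, retry_cb, 200, 0); // one-shot
        return;
    }
    // ... normal packet handling; when the packet is later freed,
    // MemoryLimiter::release() restores the budget ...
    (void)nread; (void)addr; (void)flags;
}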

Implementation notes:

  • uv_udp_recv_stop() must be called from recv_cb_(), NOT from alloc_cb_() — calling
    it from alloc_cb_() causes libuv to NULL the recv callback pointer, and libuv then
    SEGVs when it tries to call recv_cb_() after alloc_cb_() returns
  • The LogError in MemoryLimiter::acquire() was changed to LogTrace because hitting the
    limit is expected behavior, not an error — at high packet rates (~300/sec) the log volume
    alone pegs the CPU at 27%
  • Without the recv pause, libuv busy-waits (poll→alloc fail→poll) because the socket always
    has data ready. The 200ms timer breaks the busy-wait loop.

Verified results (32 MB packet limit, 8 MB frame limit):

  • Memory: 81 MB stable (was growing ~137 MB/min → 10+ GB → OOM kill)
  • CPU: ~2.4% (comparable to pre-leak baseline)
  • Audio works normally with plenty of budget for active streams
  • Tested for 30+ minutes with continuous UDP traffic from 2 senders

Suggested additional improvements

  1. Expose limits via public API: Add max_packet_pool_bytes / max_frame_pool_bytes
    to roc_context_config so users can tune limits without recompiling (a
    hypothetical sketch follows this list)
  2. Pause reception when consumer is idle: If roc_receiver_read() hasn't been called
    for a timeout period, stop UDP reception entirely (would be a PipeWire-side improvement)
  3. Release empty slabs: After session cleanup, fully-empty slabs could be freed to the
    OS. (Note: this alone does NOT fix the issue since slabs rarely become fully empty)
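
For improvement 1, a hypothetical usage sketch: the two max_*_pool_bytes
fields do NOT exist in today's roc_context_config and are shown only to
illustrate the proposal; roc_context_open() is the real API:

#include <string.h>
#include <roc/context.h>

int open_limited_context(roc_context** context) {
    roc_context_config config;
    memset(&config, 0, sizeof(config));
    config.max_packet_pool_bytes = 32 * 1024 * 1024; /* hypothetical: 32 MB */
    config.max_frame_pool_bytes  = 8 * 1024 * 1024;  /* hypothetical: 8 MB  */
    return roc_context_open(&config, context);       /* real API; 0 on success */
}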

Reproducer

  1. Create a roc_receiver (or use PipeWire libpipewire-module-roc-source)
  2. Send continuous audio via roc-send or libpipewire-module-roc-sink from
    another machine
  3. Do NOT call roc_receiver_read() (or let the PipeWire node be suspended
    with no consumers connected)
  4. Monitor RSS: while true; do ps -o rss= -p $(pidof pipewire); sleep 60; done
  5. Growth rate: ~137 MB/min with 2 receivers at 44.1kHz/128-sample packets
  6. Even WITH active reading, memory grows due to session churn and packet
    buffer accumulation outpacing consumption

Attachments

I can provide:

  • heaptrack dumps (4 captures: stripped, debug, patched, and with roc logging)
  • roc-toolkit debug logs showing session lifecycle and watchdog events
  • 3-day memory monitoring CSV (3,165 samples at 1-minute intervals)
  • PipeWire module configuration files
  • LD_PRELOAD shim for enabling roc debug logging in PipeWire (sketched below)
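
For reference, a minimal sketch of such a shim, assuming only roc-toolkit's
public roc/log.h; the constructor runs as soon as the library is preloaded
into the pipewire process:

#include <roc/log.h>

__attribute__((constructor)) static void enable_roc_debug(void) {
    roc_log_set_level(ROC_LOG_DEBUG);
}

/* Example build and usage:
     gcc -shared -fPIC -o roc_debug_shim.so shim.c
     LD_PRELOAD=./roc_debug_shim.so pipewire
*/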

@pr0gr3sR pr0gr3sR closed this Feb 8, 2026
@rocstreaming-bot rocstreaming-bot added the contrib PR not by a maintainer label Feb 8, 2026
@rocstreaming-bot

🤖 Pull request description does not have a link to an issue.
If there is a related issue, please add it to the description using any of the supported formats.

@rocstreaming-bot

🤖 Pull request is not targeted to develop branch, which is usually wrong.
If this was not intentional, please rebase on fresh develop branch, force-push, and re-target pull request using github web interface. Remember to use rebase with force-push instead of regular merge.

