feat: port dynamic load-based idle process scaling from Python SDK#1512
KrishnaShuk wants to merge 2 commits into
🦋 Changeset detected
Latest commit: 89d525f
The changes in this PR will be included in the next version bump. This PR includes changesets to release 31 packages.
    if (isFull) {
      this.#procPool.setTargetIdleProcesses(this.#opts.numIdleProcesses);
🔴 When worker is at full load, target idle processes is set to maximum instead of zero
In worker.ts:686-687, when isFull is true (worker load >= threshold), the code sets targetIdleProcesses to this.#opts.numIdleProcesses (the configured maximum). This is inverted — when the system is overloaded, idle processes should be reduced to 0, not maximized. The else branch correctly computes a lower target as load increases (proportional to remaining capacity), but the isFull branch jumps to max, defeating the purpose of load-based scaling and causing the pool to spawn/maintain idle processes even when the worker is at full capacity, wasting resources and potentially worsening the overload.
Suggested change:

    - if (isFull) {
    -   this.#procPool.setTargetIdleProcesses(this.#opts.numIdleProcesses);
    + if (isFull) {
    +   this.#procPool.setTargetIdleProcesses(0);
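The intended behaviour described in this comment can be sketched as a small pure function. This is an illustration only: `computeTargetIdleProcesses` is a hypothetical helper, not something in `worker.ts`, and the linear scaling by remaining capacity is an assumption about what "proportional to remaining capacity" means, not the PR's actual formula.

```typescript
// Hypothetical helper, not part of the SDK. The proportional formula below is
// an assumption used to illustrate the review comment.
function computeTargetIdleProcesses(
  currentLoad: number,
  loadThreshold: number,
  maxIdleProcesses: number,
): number {
  // At or above the threshold the worker is full: keep no idle processes.
  if (loadThreshold <= 0 || currentLoad >= loadThreshold) {
    return 0;
  }
  // Fraction of capacity still available below the threshold, in (0, 1].
  const remaining = (loadThreshold - Math.max(currentLoad, 0)) / loadThreshold;
  // Scale the configured maximum by the remaining capacity. Rounding up keeps
  // at least one warm process whenever any capacity is left.
  return Math.min(maxIdleProcesses, Math.ceil(maxIdleProcesses * remaining));
}
```

With this shape the target falls monotonically as load rises and reaches 0 at the threshold, so the full-load case is the natural end of the same curve rather than a jump back to the maximum.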
      }
    });
    while (!signal.aborted) {
      const currentPending = this.warmedProcQueue.items.length + this.spawnTasks.size;
🔴 spawnTasks overcounts by including tasks for processes already running jobs, preventing idle pool replenishment
In proc_pool.ts:142, currentPending is computed as warmedProcQueue.items.length + spawnTasks.size. However, spawnTasks includes tasks for processes that have already been consumed from the queue by launchJob() and are currently running jobs (still awaiting proc.join() at proc_pool.ts:115). This causes double-counting: idle processes appear both in items.length and spawnTasks, and running processes appear in spawnTasks despite no longer being idle. As a result, toSpawn = target - currentPending is systematically too low, and the pool never spawns replacement idle processes after jobs consume them.
Concrete scenario showing the bug
With numIdleProcesses=4, after 4 processes are warmed: queue=4, spawnTasks=4 (waiting on join). After 4 jobs consume all processes: queue=0, spawnTasks=4. A 5th job calls launchJob() where spawnTasks.size (4) < jobsWaitingForProcess (1) is false, so no demand spawn occurs. The run() loop computes currentPending=4, target=4, toSpawn=0. The 5th job blocks on warmedProcQueue.get() indefinitely until a prior job finishes — a regression from the old MultiMutex approach where entry.unlock() immediately freed a slot.
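The arithmetic in this scenario can be replayed with a simplified model of the bookkeeping. The field names mirror the review comment, but `PoolState` and `toSpawn` are hypothetical stand-ins, not the real `ProcPool` API, and the model assumes a spawn task stays counted until its process exits.

```typescript
// Simplified model of the accounting described above, not the SDK's ProcPool.
interface PoolState {
  warmedQueueLength: number; // warmed, idle processes ready to take a job
  spawnTasksSize: number; // tasks that stay alive until proc.join() resolves
}

// Mirrors the run() loop: toSpawn = target - currentPending, floored at 0.
function toSpawn(state: PoolState, target: number): number {
  const currentPending = state.warmedQueueLength + state.spawnTasksSize;
  return Math.max(target - currentPending, 0);
}

const target = 4; // numIdleProcesses

// After warm-up, each idle process is counted twice (queue and spawnTasks).
const afterWarmup: PoolState = { warmedQueueLength: 4, spawnTasksSize: 4 };

// After four jobs consume the queue, the tasks are still awaiting join().
const afterFourJobs: PoolState = { warmedQueueLength: 0, spawnTasksSize: 4 };

// A fifth job arrives: the demand-driven check from launchJob().
const jobsWaitingForProcess = 1;
const demandSpawn = afterFourJobs.spawnTasksSize < jobsWaitingForProcess;

// The polling loop also decides that nothing needs to be spawned.
const replenish = toSpawn(afterFourJobs, target);
```

In this model both `demandSpawn` and `replenish` indicate that no process is started, which is the stall the scenario describes.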
Prompt for agents
The root cause is that spawnTasks includes tasks for processes currently running jobs (they stay in spawnTasks until proc.join() resolves), so spawnTasks.size conflates initializing processes, idle processes, and running-job processes.
In the old code, the MultiMutex slot was released via entry.unlock() the moment a process was consumed from the queue, immediately allowing a replacement to be spawned. The new code needs an equivalent mechanism.
Possible approaches:
1. Track a separate counter or set for processes that are actively running jobs (post-consumption). Subtract those from currentPending or exclude them from spawnTasks.
2. Remove the spawn task from spawnTasks when its process is consumed from warmedProcQueue, and track the running-job lifecycle separately.
3. Use a different signal (like the queue's consumption event) to trigger replacement spawning, rather than relying on polling with spawnTasks.size.
The fix should also update the demand-driven check in launchJob() (line 69) which has the same overcounting issue: spawnTasks.size < this.jobsWaitingForProcess should compare against tasks that haven't yet produced a consumable process, not all spawn tasks.
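A minimal sketch of the first approach, assuming the pool can observe four lifecycle events (spawn started, warmed, consumed by a job, exited). `IdleAccounting` and its method names are hypothetical and exist only to show the counting; wiring it into `proc_pool.ts` is left open.

```typescript
// Hypothetical bookkeeping class, not part of the SDK. It keeps the three
// states that spawnTasks.size currently conflates in separate counters.
class IdleAccounting {
  private initializing = 0; // spawned, not yet in the warmed queue
  private idle = 0; // warmed and waiting in the queue
  private running = 0; // consumed from the queue, running a job

  spawnStarted(): void {
    this.initializing++;
  }

  warmed(): void {
    this.initializing--;
    this.idle++;
  }

  consumed(): void {
    this.idle--;
    this.running++;
  }

  exited(): void {
    this.running--;
  }

  // Only processes that are or will become idle count toward the target.
  toSpawn(target: number): number {
    return Math.max(target - (this.initializing + this.idle), 0);
  }

  // Demand-driven check: compare waiting jobs against processes that have
  // not yet produced a consumable process, as the comment above suggests.
  needsDemandSpawn(jobsWaitingForProcess: number): boolean {
    return this.initializing < jobsWaitingForProcess;
  }
}
```

Replaying the earlier scenario (four spawns, four warmed, four consumed) leaves `initializing` and `idle` at zero, so `toSpawn(4)` returns 4 and a fifth waiting job triggers a demand spawn instead of blocking.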
Description
ref: #1449
This PR ports the dynamic idle process scaling logic from the Python SDK (livekit-agents) into the JS SDK, ensuring the worker dynamically scales down idle processes to prevent over-allocation on large multicore systems under heavy load.
Pre-Review Checklist
Testing
restaurant_agent.ts and realtime_agent.ts work properly (for major changes)
Additional Notes
Note to reviewers: Please ensure the pre-review checklist is completed before starting your review.