[scheduler/cuebot] Bulk resource accounting #2198

Draft
DiegoTavares wants to merge 13 commits into AcademySoftwareFoundation:master from DiegoTavares:sched_subs_lock

Conversation


@DiegoTavares DiegoTavares commented Mar 6, 2026

This change shifts resource accounting (subscription, layer_resource, job_resource, folder_resource,
point tables) from incremental delta updates at dispatch/release time to periodic bulk re-computation
from the proc table. This affects both the Java Cuebot and the Rust scheduler.

Key changes:

  1. Cuebot: Wraps existing incremental resource updates behind a dispatcher.scheduler_manages_resources feature flag
  2. Scheduler: Replaces the delta-accumulate-and-flush pattern with periodic recompute_all_from_proc() and recalculate_subs()
  3. New ResourceAccountingService: Periodic loop recomputing layer/job/folder/point resource tables
  4. Simplified AllocationService: Removes pending_deltas mutex, DeltaKey/DeltaValue types, retry logic, and delta re-application after cache refresh

Attention: when dispatcher.scheduler_manages_resources=true, the scheduler service must be running so that the resource tables (subscription, layer_resource, job_resource, folder_resource, point) are periodically refreshed from the proc table.

Motivation

Each frame dispatch currently triggers updates across 5 resource-accounting tables, where concurrent dispatches contend for the same rows. This creates lock contention on the database that scales with dispatch volume. During crunch times, this contention has led to instability (deadlocks, slow dispatches, cascading timeouts).

Multiple frames starting on the same subscription contend for row locks when updating the
subscription table. The scheduler already had a cache for reads, but writes were still issued
on every frame update.

To prevent this contention, the cache is now updated on each dispatch and a flush happens on each
cache-update tick (defaults to every 3 seconds).
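The accumulate-then-flush behavior described above can be sketched as follows. This is an illustrative in-memory model, not the actual scheduler code; `SubscriptionCache`, `record_delta`, and `flush` are hypothetical names:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Illustrative sketch: dispatches record core deltas in memory, and a periodic
// tick drains them so only one write per subscription reaches the database,
// instead of one contended UPDATE per frame.
#[derive(Default)]
struct SubscriptionCache {
    // subscription id -> pending core delta (positive on dispatch, negative on release)
    pending_deltas: Mutex<HashMap<String, i64>>,
}

impl SubscriptionCache {
    // Called per dispatch/release: cheap in-memory accumulation, no DB row lock.
    fn record_delta(&self, sub_id: &str, cores: i64) {
        let mut deltas = self.pending_deltas.lock().unwrap();
        *deltas.entry(sub_id.to_string()).or_insert(0) += cores;
    }

    // Called on each cache tick (~3s): drains the accumulated deltas, returning
    // the batch that a single write per subscription would flush.
    fn flush(&self) -> HashMap<String, i64> {
        let mut deltas = self.pending_deltas.lock().unwrap();
        std::mem::take(&mut *deltas)
    }
}
```

Many per-frame deltas against the same subscription thus collapse into one value per flush interval, which is the contention win the description refers to.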

When running multiple instances, this can lead to running slightly above allocation limits, but the
recalculate_subs scheduled function and the trigger__verify_subscription trigger should prevent
large drift in the long run.

Entire-Checkpoint: 059ff47f5f92
Entire-Checkpoint: 57fcdec6f3b4
Signed-off-by: Diego Tavares <dtavares@imageworks.com>
@DiegoTavares DiegoTavares changed the title [scheduler] Accumulate subscription updates to avoid locks [scheduler/cuebot] Bulk resource accounting Mar 10, 2026
@DiegoTavares (Collaborator, Author)

PR Assessment using Claude Code

PR Evaluation: Bulk Resource Accounting Effectiveness

Context

This PR replaces per-frame resource accounting updates with periodic bulk recomputation. The scheduler_manages_resources flag is global — when true, Cuebot skips ALL resource accounting updates (subscription, layer_resource, job_resource, folder_resource, point).

What the PR Removes from the Dispatch Path

Each frame dispatch previously executed 4-5 inline UPDATEs within the dispatch transaction:

  • UPDATE subscription SET int_cores = int_cores + ?
  • UPDATE layer_resource SET int_cores = int_cores + ?
  • UPDATE job_resource SET int_cores = int_cores + ?
  • UPDATE folder_resource SET int_cores = int_cores + ?
  • UPDATE point SET int_cores = int_cores + ?

These are now gone from the per-frame transaction. This is the core fix — it eliminates the primary source of row contention that scales with dispatch throughput.
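The replacement bulk-recompute idea can be sketched as an in-memory simulation. This is not the actual resource_accounting_dao.rs code; the `Proc` struct and `recompute_from_procs` function are hypothetical, but they show the principle of deriving the counters from the proc table rather than incrementing them per frame:

```rust
use std::collections::HashMap;

// A running proc, reduced to the fields needed for resource accounting.
struct Proc {
    layer_id: String,
    job_id: String,
    cores: i64,
}

// One pass over the procs yields the running-core totals per layer and per
// job; these totals are what a periodic bulk UPDATE would write into
// layer_resource and job_resource (folder_resource and point follow the same
// pattern). Because the totals are recomputed from the source of truth, any
// drift from missed increments or decrements is self-healing.
fn recompute_from_procs(procs: &[Proc]) -> (HashMap<String, i64>, HashMap<String, i64>) {
    let mut layer_cores = HashMap::new();
    let mut job_cores = HashMap::new();
    for p in procs {
        *layer_cores.entry(p.layer_id.clone()).or_insert(0) += p.cores;
        *job_cores.entry(p.job_id.clone()).or_insert(0) += p.cores;
    }
    (layer_cores, job_cores)
}
```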

What Remains in the Per-Frame Transaction

  1. INSERT proc — no row contention (new rows)
  2. UPDATE host — idle cores/memory decrement (mitigated by advisory locks per host)
  3. UPDATE frame SET state='RUNNING' — fires trigger that updates layer_stat and job_stat

The layer_stat/job_stat trigger is the remaining hotspot: every frame in the same layer/job contends on these rows. However, Cuebot has the same trigger and handles 10 shows without issues, so this alone isn't the destabilizing factor — it was the combination of trigger contention + resource accounting contention that overwhelmed the database.

Effectiveness Assessment

Since scheduler_manages_resources is global, cross-system contention on resource tables is eliminated — Cuebot won't be doing inline updates.

| Aspect | Before PR | After PR | Verdict |
| --- | --- | --- | --- |
| Per-frame resource UPDATEs | 4-5 per frame, O(dispatch_rate) | 0 | Fixed |
| Transaction duration | Long (5+ UPDATEs + trigger) | Short (proc INSERT + host UPDATE + frame UPDATE) | Fixed |
| Subscription contention | Per-frame deltas, O(dispatch_rate) | Bulk recompute every 3s (no inline contention) | Fixed |
| Resource table contention | Per-frame, O(dispatch_rate) | Bulk recompute every 10s (no inline contention) | Fixed |
| layer_stat/job_stat triggers | Contended during long transactions | Contended during shorter transactions | Improved |

Overall: The PR is effective. It addresses the root cause (high-concurrency dispatch generating O(dispatch_rate) contention on resource tables) and the shorter transactions also reduce lock hold time on layer_stat/job_stat.

Remaining Concerns (Minor)

1. recalculate_subs() Efficiency (LOW PRIORITY)

The scheduler calls recalculate_subs() every ~3s. This PL/pgSQL function:

  1. Zeros ALL subscription rows (UPDATE subscription SET int_cores = 0)
  2. Loops through proc aggregates, doing 3 queries per subscription (SELECT burst, UPDATE with burst bypass, UPDATE to restore burst)

This is not a contention issue (Cuebot won't be updating subscriptions), but it's unnecessarily expensive: it touches every subscription row even if nothing changed, and the row-by-row loop doesn't scale well with many subscriptions.

Optional improvement: Replace with a single bulk UPDATE (same pattern as resource_accounting_dao.rs) that also handles the burst-bypass. This would be cleaner and faster but is not blocking.
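One way to see why the single-pass alternative is cheaper: instead of zeroing every row and looping, compute the target value per subscription from the proc aggregates and write only the rows that actually changed. The sketch below is a hypothetical in-memory model of that diffing step, not the proposed SQL itself:

```rust
use std::collections::HashMap;

// Given the stored int_cores per subscription and the totals recomputed from
// proc, return only the (id, new_value) pairs that differ. Rows whose value is
// unchanged are never touched, unlike the zero-everything-then-restore loop.
// Subscriptions absent from `recomputed` have no running procs, so their
// target value is 0.
fn changed_rows(
    current: &HashMap<String, i64>,
    recomputed: &HashMap<String, i64>,
) -> Vec<(String, i64)> {
    current
        .iter()
        .map(|(id, cur)| (id, *cur, *recomputed.get(id).unwrap_or(&0)))
        .filter(|(_, cur, new)| cur != new)
        .map(|(id, _, new)| (id.clone(), new))
        .collect()
}
```

In SQL this would correspond to a single UPDATE joined against the proc aggregate with a `WHERE int_cores IS DISTINCT FROM <new value>` style guard; the burst-bypass handling mentioned above would still need to be folded into that statement.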

2. layer_stat/job_stat Trigger Contention (MONITOR)

This remains the only per-dispatch hot-row contention. The PR's shorter transactions help (trigger locks held for less time), but under extreme dispatch rates this could still surface. Worth monitoring but unlikely to be a problem given Cuebot handles 10 shows with the same triggers.

If it surfaces: Consider reducing cluster_buffer_size or job_buffer_size to throttle concurrent dispatch streams, or batch frame status updates per layer.

When a show or show.allocation is being served to the scheduler, only resources for that show should
be recomputed on a schedule.
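The per-show scoping suggested here amounts to filtering the proc set before aggregating. A minimal sketch, with hypothetical names (`ShowProc`, `show_running_cores`):

```rust
// A running proc reduced to the fields relevant for per-show recompute.
struct ShowProc {
    show: String,
    cores: i64,
}

// Restrict the scheduled recompute to procs belonging to the show (or
// show.allocation) currently served by the scheduler, instead of aggregating
// over every proc in the database.
fn show_running_cores(procs: &[ShowProc], show: &str) -> i64 {
    procs.iter().filter(|p| p.show == show).map(|p| p.cores).sum()
}
```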

Refactor allocation_dao into resource_accounting as both serve a similar purpose.