Fix/actionhandler frozen mtc passthrough by ibiltari · Pull Request #11 · stagesoft/cuems-engine

ibiltari · 2026-05-04T17:22:13Z

Summary

When a post_go='go' chain reached an ActionCue with play or fade_in, the chain's MTC snapshot was dropped at every link of the dispatch path. The target then read live MTC inside CueHandler.go and drifted relative to the chain's other cues by however long the chain had been running.

Fix

Threads frozen_mtc_ms through every link:

run_actionCue forwards it to execute_action
ActionHandler.execute_action accepts and propagates; ActionHookContext carries it
All 8 _handle_* accept the parameter (most ignore)
_handle_play and _handle_fade_in pass it to ch.go(target, mtc, frozen_mtc_ms)
CueHandler.execute_action wrapper accepts and forwards

Test plan

2 new unit tests pin threading: frozen value reaches CueHandler.go; None for standalone ActionCues
All 52 existing ActionCue tests pass
Manual chain test: ActionCue('play', target=audio_cue) inside a chain — verify target's start_mtc matches chain origin's snapshot (not live MTC at action firing)

Discovery

Found during a chain-dispatch audit done while planning §5.1.2 (per-cue MTC snapshot) for a separate audio-cut fix.

Add Oschub in controller and None mode

coonect each player to the audiomixer

New ossia coms

improve task handling

Duplicate NNG load commands (from multiple editor clients or repeated requests) spawned concurrent load threads that armed the same cue twice. The second audioplayer failed at UDP bind but overwrote the first's tracking reference, orphaning it. The orphan held the JACK client name, so new players got a -01 suffix and audio was routed into the dead process. Add threading lock+flag in NodeEngine.load_project to reject concurrent loads while one is already in progress.

After engine restart, audioplayer processes from the previous instance survive as independent subprocesses. The new engine has no reference to them, so they steal JACK client names causing new players to get a -01 suffix and routing audio into the dead orphan. Add kill_orphaned_audio_processes() that pgrep's for audioplayer-cuems processes not tracked by the current engine and SIGKILL's them. Called during project load cleanup before arming new cues.

Swap the order of OSC commands sent to the videocomposer during GO: offset → mtcfollow → visible (was: offset → visible → mtcfollow). This ensures the videocomposer starts MTC sync and loads the correct frame while the layer is still invisible, preventing a stale frame from previous playback from flashing on screen. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Three fixes for multi-output video layer positioning: 1. Add VideoOutput.get_layer_scale() — returns uniform scale factor based on output region height vs canvas height, so video layers fit within smaller output regions (e.g. 1080p on a 4K canvas). 2. Send /scale OSC to videocomposer in arm_cue and run_cue when the output region is smaller than the canvas. 3. Fix Y-axis inversion in get_layer_placement() — the OpenGL FBO has Y=0 at the bottom, and the renderer negates Y. The old formula (output_cy - canvas_cy) placed layers at the wrong vertical position for outputs not centered on the canvas. Changed to (canvas_cy - output_cy) to compensate.

arm_videoCue() now sends /layer/load for the first output (driver) and /layer/load_shared for subsequent outputs, enabling the videocomposer to share a single decoder across all outputs of the same cue.

UI sliderToFloat() already sends 0.0-1.0 gain values; the engine was dividing by 100 again, which would clamp any realtime cue volume to the 0.0-0.01 range.

The per-cue /mtcfollow OSC control was accidentally removed during the blackout race-condition refactor (c858705). Without it the dmxplayer runs on its own internal clock and ignores MTC stopping, so queued scenes keep firing after the user hits STOP. - DmxPlayer: add _mtcfollow_param and enable/disable_mtcfollow() - run_cue: send /mtcfollow 1 before each DMX scene - NodeEngine: send /mtcfollow 0 on STOP before ola_set_dmx blackout

Replace "arm all at load" strategy with a duration-aware sliding window that arms 2 cues with meaningful duration ahead in the target chain. Short/zero-duration cues (ActionCue, short DMX) are armed but don't count toward the limit — the window walks past them. Key changes: - arm() rewritten with _loading sentinel to prevent race conditions (double arm_cue) and infinite recursion (non-local/disabled cycles) - ActionCue(play) + its target treated as 1 unit — arming an ActionCue with action_type='play' also arms _action_target_object - _effective_duration_ms calculates prewait + body + postwait per type - _arm_ahead walks the target chain, counting only cues >= 1000ms - go() and go_threaded() use _arm_ahead instead of fixed lookahead - initial_cuelist_process arms first cue + _arm_ahead (not all cues) - set_next_cue extends the arm window when user selects a cue - Safety net in go() re-arms with Logger.warning if pre-arm missed Fixes ActionCue not re-firing on cuelist loop (ClickUp 869cqzr67). Also reduces VRAM usage for large go_at_end projects (~3-5 layers loaded instead of all).

CuemsDeploy._create_deploy_log() failed with Permission denied when /tmp/cuems was missing or owned by root. Add os.makedirs() to create the directory with correct ownership before writing.

connect_to_jack() had no retry mechanism unlike connect_player_to_mixer(). Add port_exists() checks with retries so it survives race conditions when jack-volume hasn't registered its ports yet.

DmxCue and CueList types are armed but never create player entries, so remove_cue_player() logged spurious ERRORs on every disarm. Change to DEBUG and return early when no player is found.

Send an early ADD notification before the pre-wait sleep so the controller sets status=1 and the UI colors the cue immediately. Guard with _stop_requested checks to handle stop-during-prewait cleanly and bail out if stop arrives while sleeping.

- ossia: fix Impulse push_value(None) crash by sending True instead - ossia: force RepetitionFilter OFF for Impulse parameters (prevents silent suppression of repeated sends — root cause of reset not reaching videocomposer) - stop_playback: call stop_all_cues() before cleanup so loop_cue threads exit before DMX blackout and video reset - load_project: add video reset and DMX mtcfollow disable (were missing — stale layers survived project reload) - PlayerHandler: upgrade reset error log from DEBUG to WARNING

- DmxPlayer: send /blackout impulse to dmxplayer to clear its internal fade engine (scenes queue + active transitions), fixing HTP merge issue where dmxplayer's fading values overrode ola_set_dmx zeros - endpoints: add /blackout Impulse endpoint to OSC_DMXPLAYER_CONF - CueHandler: check _stop_requested before firing post_go and go_at_end chains, preventing DMX cue cascade after STOP - NodeEngine: call stop_all_cues() before cleanup in load_project

…ccess When two threads try to arm the same cue simultaneously (e.g. _arm_ahead from one loop and _handle_play from another), the _loading sentinel caused the second caller to return False immediately, breaking the loop chain permanently with "Target could not be armed". Replace the boolean _loading sentinel with a threading.Event so concurrent callers block until the in-progress arm completes, then return the result.

Allow the UI to toggle cue enabled/disabled at show time via /engine/command/cue_enabled. Engine broadcasts confirmation at /engine/status/cue_enabled/<uuid>. Show-time overrides are in-memory only and reset to XML values on project reload. Includes: - WebSocket on_connect callback for late-join state dump - NNG feedback from NodeEngine to sync ActionCue-driven toggles - Project generation counter to abort stale daemon arm threads - cue.enabled check in go_script to prevent firing disabled cues - Async re-arm in daemon thread to avoid blocking GO command

Disabled cues are now invisible to the execution engine: - go_script: advances next_cue_pointer past disabled cues on GO - ready_script: sets initial pointer to first enabled cue - _handle_cue_enabled: recalculates pointer when nextcue is disabled - CueHandler.go: enabled gate rejects disabled cues (returns None) - go_threaded: guards wait_for_cue against None from disabled targets - _arm_ahead: skips disabled cues in lookahead window - arm: skips recursive arm of disabled post_go='go' targets - initial_cuelist_process: arms first enabled cue, not contents[0] - run_cueList: dispatches run_cue() on first enabled child instead of calling nonexistent .go() method - is_playing: checks loaded instead of not _stop_requested, matching comment intent (loaded=False after disarm means not playing) All checks happen at execution time for real-time disable/enable during a running show. A disabled ActionCue will not execute its action target. All 50 tests pass.

…s-dmxplayer Track the player-binary rename happening across the ecosystem. Runtime-critical updates: - PlayerHandler.kill_orphaned_audio_processes pgrep filter - scripts/kill_cuems.sh and kill_cuems_processes.py patterns - dev/test_xml_files/settings.xml <path> values Docstrings and comments in AudioPlayer, AudioMixer, DmxPlayer, PlayerHandler, mock_audioplayer, mock_dmxplayer updated to match.

The dev copy had drifted far from the production unit in cuems-common and was never installed on any system. Specifically: - pkill -u stagelab (prod uses -u cuems) - ExecStart=/home/ion/.pyenv/…/python3 /home/ion/src/… (hardcoded to a different developer's machine; also points at an obsolete scripts/node_engine.py entrypoint) - no User=/Group=, no hardening directives, missing jackd-cuems.service dependency The production unit at cuems-common/etc/systemd/system/cuems-node-engine.service is the single source of truth. Developers can read that file or install the cuems-common package.

loop_audioCue and loop_videoCue rebuilt _start_mtc via CTimecode(start_seconds=_end_mtc.milliseconds/1000), losing one frame of the target framerate each iteration (40ms at 25fps MTC) through the ms->s->frames round-trip. Replace with direct frame-count assignment to skip the lossy conversion. Zero drift verified across 24/25/30fps over 10+ iterations. Keep Phase 1 diagnostic _dbg instrumentation at /offset send points as permanent infrastructure. Add tests/test_loop_rebase.py pinning both the fix (zero drift) and the pre-fix behaviour (-40ms/iter at 25fps) as a regression sentinel.

Fixture was missing required <id> elements on every audio/video/dmx output and <canvas_region> on every video output. project_mappings.xsd has required these since the canvas_region addition, but the fixture had drifted. When engine tests set CUEMS_CONF_PATH to this directory and ConfigManager.load_net_and_node_mappings() falls back to the default mappings, BaseEngine would exit(-1) on schema validation.

settings.xsd VideoPlayerType requires an <outputs> child (positive integer). The dev fixture was stale and caused BaseEngine.load_config() to exit(-1) during test startup, cascading into 7 SystemExit errors across test_core_baseengine_status.

Adds required <id> on every put, <canvas_region> on every video output, and the required (possibly empty) <new_nodes> element — all required by project_mappings.xsd but missing from the fixture.

Parameterized XSD check over every config fixture under dev/test_xml_files/ that BaseEngine loads when tests set CUEMS_CONF_PATH there. Catches drift the moment a schema gains a required element and a fixture is not updated to match.

NodeEngine reads output_latency_ms from node_conf for audioplayer and dmxplayer. If the value is a Python int (operator set a specific ms value in settings.xml), appends --output-latency-ms <int> to the args string when spawning the process. If the value is "auto" or absent, no flag is emitted and the binary uses its built-in default/auto path (audioplayer: JACK query; dmxplayer: 35 ms Phase-5A default). Videocomposer is not a subprocess of the engine (systemd service); its args come from its systemd unit, so its --output-latency-ms flag is wired in cuems-common instead of here. isinstance(value, int) distinguishes int from the "auto" string returned by xmlschema's union decoder — the typing contract pinned by test_output_latency_ms_type_round_trip in cuems-utils.

Seeds the dev settings.xml template with output_latency_ms values and inline comments documenting per-component semantics: - videoplayer: "auto" (systemd drop-in documents override path) - audioplayer: "auto" (JACK query) - dmxplayer: 35 ms with per-adapter starting-point table (ENTTEC/ DMXKing, OpenDMX, ArtNet) Operators use this file as a starting point for /etc/cuems/settings.xml on each node. Semantics mirror cuems-utils settings.xsd.

…flag When <args></args> is empty in settings.xml, xmlschema decodes it as Python None (not ''). The original f-string f'{args} --output-latency-ms {value}' produced a literal "None --output-latency-ms 42" token sequence which leaked into the spawned argv of dmxplayer (dmx has empty default args in the shipped settings.xml). Normalize args to '' via `args = args or ''` before concatenation and strip trailing whitespace on the returned value. Verified against 7 shapes of (args, output_latency_ms) including the None/int combination that triggered this on dev node 02 during Phase-5B manual test today.

Undoes pollution from my previous commit (79a9a9e) which accidentally used `git add -A` and slurped in ~96 files of debuild output plus local IDE/Claude state. Adds .gitignore entries so a future `git add -A` in a freshly-built tree stays clean. Files untracked (still present on disk): - debian/.debhelper/ — dh build stamps - debian/cuems-engine/ + debian/cuems-engine-mock/ — staging dirs - debian/*.substvars, debian/*.debhelper, debian/debhelper-build-stamp, debian/files — per-build artifacts - .claude/ + CLAUDE.md — local IDE state The useful change from 79a9a9e (NodeEngine.py None-args fix) stays intact.

Closes the "audioplayer not observed live" gap from the 2026-04-23 Phase-5 manual round-trip test. Audio is spawned on-demand per cue, so no audio process was running during the dev-node test — only video and dmx were observed receiving --output-latency-ms 42. Two test classes, 11 tests: - TestAppendOutputLatencyFlag: helper under every cross product of (args: string | '' | None, output_latency_ms: int | 'auto' | absent). Includes the None-args + int combination that produced a literal "None" token on dev node 02 (fixed in 79a9a9e) — asserts 'None' is not in the result as a regression gate. - TestSubprocessArgvComposition: mirrors the `args.split()` loop in DmxPlayer.run() / AudioPlayer.run() to prove the helper's output survives intact into the final subprocess argv. Covers the audioplayer `-w -1` shape end-to-end. Audio's spawn path is structurally identical to dmx's (same helper, same split loop, same argv extension), so these tests guarantee by construction that a live audio cue will land `--output-latency-ms` in the correct argv position. Faster than waiting for the next audio cue to fire on a live node and covers edge cases that happen-to-work manual tests wouldn't hit.

When a post_go='go' chain reached an ActionCue with action_type='play' or 'fade_in', the chain's MTC snapshot was dropped along the dispatch path: run_actionCue(cue, mtc, frozen_mtc_ms) -> ACTION_HANDLER.execute_action(cue, mtc) [drop] -> handler(ch, target, mtc) [drop] -> ch.go(target, mtc) [drop] -> CueHandler.go captured live MTC The target then read live MTC and drifted relative to the chain's other cues by however long the chain had been running. Threads frozen_mtc_ms through every link: - run_actionCue forwards it to execute_action. - ActionHandler.execute_action and the dataclass ActionHookContext carry it. - All 8 _handle_* functions accept frozen_mtc_ms (most ignore it). - _handle_play and _handle_fade_in pass it to ch.go(target, mtc, frozen_mtc_ms). - CueHandler.execute_action wrapper accepts and forwards it too. Two unit tests pin the threading: a frozen value reaches CueHandler.go; None reaches it for standalone (non-chain) ActionCues.

adria and others added 30 commits September 11, 2025 03:42

fix: no update on OSCDevice

62b2387

test: hardcoded initial arming

a1e61d7

test: limit cue checking

b4ad6db

test: revert node endpoints

c7b8d96

test: GO as Impulse

49d7287

test: gocue from go

83ed496

First play!

dc9ffcc

add mtcmaster source file

e3c007b

feat: loop through duration

7bc8eb6

fixx loop

3d34c0d

fix: loop with now

bc90be4

fix: negative offset on loop_cue

4141e57

fix remove cue

699c821

mtcmaster opens thread, move to start

20ef6fa

add env to players so they cand find X display

2fb4fc0

feat: threading handled on cue go | dev: increase loop logging

21186d3

dev: logging post_go

e6de5e6

only one player per video dev

5cd3a4b

mtclistener start thread in start() not init

30a4a4b

format: daemon parameter to new line

8702986

feat: cursor mdc rules

d846ca7

partial osc modification

750ce71

Add communications mode CONTROLLER|NODE

dc3893a

Add Oschub in controller and None mode

HWdicovery and nodeconf request methods

c724b9d

start audiomixer in each node

035c967

coonect each player to the audiomixer

Rename files

1b9de9c

Merge pull request #6 from stagesoft/new_ossia_coms

b3eeb41

New ossia coms

move Osc_nodes_hub from utils to engine

2691482

dev: new utils release and testing.mdc for python

262bd47

move json encoding/decoding to base class

a43703b

improve task handling

ibiltari and others added 30 commits March 27, 2026 17:20

feat(shared-decoder): send load_shared for secondary video outputs

188037c

arm_videoCue() now sends /layer/load for the first output (driver) and /layer/load_shared for subsequent outputs, enabling the videocomposer to share a single decoder across all outputs of the same cue.

fix(audio): remove double volume conversion in realtime cue routing

9de0d1a

UI sliderToFloat() already sends 0.0-1.0 gain values; the engine was dividing by 100 again, which would clamp any realtime cue volume to the 0.0-0.01 range.

fix(deploy): ensure tmp directory exists before writing rsync log files

a934a3b

CuemsDeploy._create_deploy_log() failed with Permission denied when /tmp/cuems was missing or owned by root. Add os.makedirs() to create the directory with correct ownership before writing.

fix(audio): add retry logic to connect_to_jack() for startup race

5399a59

connect_to_jack() had no retry mechanism unlike connect_player_to_mixer(). Add port_exists() checks with retries so it survives race conditions when jack-volume hasn't registered its ports yet.

fix(players): downgrade 'player not found' to debug with early return

9a9523e

DmxCue and CueList types are armed but never create player entries, so remove_cue_player() logged spurious ERRORs on every disarm. Change to DEBUG and return early when no player is found.

fix(test-fixtures): update project_mappings.xml to match schema

dfcf21f

Adds required <id> on every put, <canvas_region> on every video output, and the required (possibly empty) <new_nodes> element — all required by project_mappings.xsd but missing from the fixture.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix/actionhandler frozen mtc passthrough#11

Fix/actionhandler frozen mtc passthrough#11
ibiltari wants to merge 468 commits into
mainfrom
fix/actionhandler-frozen-mtc-passthrough

ibiltari commented May 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ibiltari commented May 4, 2026

Summary

Fix

Test plan

Discovery

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants