OpenUTAU CLI #2162
Open
christianazinn wants to merge 9 commits into
Pull request overview
This PR adds a headless CLI entry point (`render`) to the existing OpenUtau executable, enabling scripted/batch project rendering to WAV without launching the GUI. It introduces a headless host to initialize the normally-UI-owned managers/scheduler and adds a phonemizer “idle wait” API so phonemization can complete before mixdown rendering.
Changes:
- Add a `render` command routed from `OpenUtau/Program.cs` to run headlessly (including console attach on Windows).
- Introduce `OpenUtau.Core.Headless` components (`HeadlessOpenUtauHost`, `HeadlessRenderCommand`, `HeadlessRenderer`, scheduler/sync context, job/options models).
- Extend `PhonemizerRunner` with `WaitForIdleAsync()` to support awaiting phonemization completion in headless rendering, plus add CLI-focused tests.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| OpenUtau/Program.cs | Routes render invocations into headless execution path before Avalonia startup; adds console attach + NetMQ cleanup refactor. |
| OpenUtau.Test/Core/Headless/HeadlessRenderCommandTest.cs | Adds tests for CLI detection/parsing, render plan expansion, and override resolution/validation helpers. |
| OpenUtau.Core/Properties/AssemblyInfo.cs | Exposes internals to the test assembly for headless/CLI testing. |
| OpenUtau.Core/Headless/RenderJob.cs | Adds CLI job + options models for rendering and preference overrides. |
| OpenUtau.Core/Headless/HeadlessTaskScheduler.cs | Provides a single-threaded task scheduler + sync context to emulate UI-thread dispatch headlessly. |
| OpenUtau.Core/Headless/HeadlessRenderException.cs | Adds a dedicated exception type for headless rendering errors. |
| OpenUtau.Core/Headless/HeadlessRenderer.cs | Implements load/override/validate/await-phonemize/render mixdown flow for one project. |
| OpenUtau.Core/Headless/HeadlessRenderCommand.cs | Implements render command parsing, batch planning, and serial execution with exit codes. |
| OpenUtau.Core/Headless/HeadlessOpenUtauHost.cs | Initializes required managers/scheduler/subscriptions for headless rendering and captures errors/progress. |
| OpenUtau.Core/Api/PhonemizerRunner.cs | Adds WaitForIdleAsync() and request accounting so headless rendering can await phonemizer completion reliably. |
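
As a rough illustration of what "a single-threaded task scheduler + sync context" can mean, here is a minimal sketch of a synchronization context that runs all posted work on one dedicated thread. This is not the code from `HeadlessTaskScheduler.cs`; the class name `SingleThreadContext` and its members are hypothetical, and error handling is deliberately omitted.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for a headless "UI thread"; not the PR's implementation.
public sealed class SingleThreadContext : SynchronizationContext, IDisposable {
    private readonly BlockingCollection<(SendOrPostCallback callback, object state)> queue =
        new BlockingCollection<(SendOrPostCallback callback, object state)>();
    private readonly Thread worker;

    public SingleThreadContext() {
        worker = new Thread(Run) { IsBackground = true, Name = "HeadlessMain" };
        worker.Start();
    }

    public int WorkerThreadId => worker.ManagedThreadId;

    // Queue work and return immediately, like a UI dispatcher's Post.
    public override void Post(SendOrPostCallback d, object state) {
        queue.Add((d, state));
    }

    // Queue work and block until it has run; run inline if already on the worker.
    public override void Send(SendOrPostCallback d, object state) {
        if (Thread.CurrentThread == worker) {
            d(state);
            return;
        }
        using (var done = new ManualResetEventSlim(false)) {
            Post(_ => { try { d(state); } finally { done.Set(); } }, null);
            done.Wait();
        }
    }

    // Worker loop: every callback executes on this one thread, in FIFO order.
    private void Run() {
        SetSynchronizationContext(this);
        foreach (var item in queue.GetConsumingEnumerable()) {
            item.callback(item.state);
        }
    }

    // Must be called from a thread other than the worker, or Join would deadlock.
    public void Dispose() {
        queue.CompleteAdding();
        worker.Join();
        queue.Dispose();
    }
}
```

Code that would normally dispatch to the UI thread can then call `Post` on such a context, which keeps the "everything mutates state on one thread" assumption intact without Avalonia running.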
Comment on lines +35 to +38

    private readonly object pendingLock = new object();
    private int pendingRequests;
    private Exception pendingException;
    private List<TaskCompletionSource<object>> idleWaiters = new List<TaskCompletionSource<object>>();
Comment on lines +239 to 244

    if (pendingRequests == 0) {
        if (pendingException != null) {
            var exception = pendingException;
            pendingException = null;
            return Task.FromException(exception);
        }
Comment on lines +254 to +257

    void CompleteRequest(Exception exception = null) {
        List<TaskCompletionSource<object>> waiters = null;
        Exception completedException = null;
        lock (pendingLock) {
Comment on lines +277 to 281

        }
    } else {
        foreach (var waiter in waiters) {
            waiter.SetResult(null);
        }
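
The excerpts above are fragments of the request accounting in `PhonemizerRunner`. For readers who want the overall shape, the following is a simplified, self-contained sketch of the same idle-wait pattern: count pending requests, remember the first failure, and complete waiters once the count drops to zero. The class `IdleTracker` and the method `BeginRequest` are hypothetical names; the field names mirror the excerpts, but the body is an assumption rather than the PR's actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical, simplified tracker; the PR keeps similar fields inside PhonemizerRunner.
public sealed class IdleTracker {
    private readonly object pendingLock = new object();
    private int pendingRequests;
    private Exception pendingException;
    private List<TaskCompletionSource<object>> idleWaiters = new List<TaskCompletionSource<object>>();

    // Call when a request is queued.
    public void BeginRequest() {
        lock (pendingLock) {
            pendingRequests++;
        }
    }

    // Call after the response for a request has been applied (or has failed).
    public void CompleteRequest(Exception exception = null) {
        List<TaskCompletionSource<object>> waiters;
        Exception completedException;
        lock (pendingLock) {
            if (pendingRequests == 0) {
                throw new InvalidOperationException("No request is pending.");
            }
            if (exception != null && pendingException == null) {
                pendingException = exception; // keep only the first failure
            }
            pendingRequests--;
            if (pendingRequests > 0 || idleWaiters.Count == 0) {
                return; // not idle yet, or nobody waiting (failure is kept for a later wait)
            }
            waiters = idleWaiters;
            idleWaiters = new List<TaskCompletionSource<object>>();
            completedException = pendingException;
            pendingException = null;
        }
        // Complete outside the lock so continuations never run while it is held.
        foreach (var waiter in waiters) {
            if (completedException != null) {
                waiter.TrySetException(completedException);
            } else {
                waiter.TrySetResult(null);
            }
        }
    }

    // Completes when no requests are pending; faults if a request failed since the last wait.
    public Task WaitForIdleAsync() {
        lock (pendingLock) {
            if (pendingRequests == 0) {
                if (pendingException != null) {
                    var exception = pendingException;
                    pendingException = null;
                    return Task.FromException(exception);
                }
                return Task.CompletedTask;
            }
            var waiter = new TaskCompletionSource<object>(TaskCreationOptions.RunContinuationsAsynchronously);
            idleWaiters.Add(waiter);
            return waiter.Task;
        }
    }
}
```

In this sketch the waiters are swapped out under the lock and completed afterwards, and `RunContinuationsAsynchronously` keeps awaiting code from running inline on the thread that completes the last request. Whether the PR makes the same choices is for the review to confirm.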
Implements command-line rendering for OpenUTAU.
This PR adds a headless `render` command to the existing `OpenUtau` executable so projects can be rendered from scripts without opening the GUI. It currently supports rendering one project to one WAV, or rendering all files in a folder to an output folder in batch processing (this is not parallelized - more on this below). Related: Discord #ideas post, #1615

Since this is a somewhat new direction for OpenUTAU and will likely need to be added to the docs, I've done my best to provide as much detail as possible in this PR description.
Motivation
OpenUTAU provides an implementation for rendering vocal synths with a variety of backends, but it's only usable through a graphical interface. This makes it difficult to render files en masse for any reason, such as for DiffSinger "gacha rendering," where by leveraging the stochastic nature of the acoustic model, one can repeatedly render a segment until the tuning sounds right. Other uses include scripting renders, data generation, and automated regression workflows.
Usage
Single project:
Batch folder render:
When `--input` is a directory, files inside that directory are rendered serially to matching `.wav` files in the output directory. Supported project extensions in directory mode are the same as in the GUI (it dispatches to the same `Formats.LoadProject` under the hood).

Note that this doesn't scan subdirectories. It also runs serially, because concurrency looks messy here. That means batch mode doesn't provide as much runtime benefit as I would like, although it should still perform better than individual CLI invocations because you skip the repeated setup/teardown cost of the headless host (more on this below). I'd like to look into parallelizing this in a later PR.
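
To make the directory-mode mapping concrete, here is a small sketch of how an input folder could be expanded into input/output pairs: top-level files only, filtered by extension, each mapped to a `.wav` with the same base name in the output folder. The `BatchPlan.Expand` helper and its signature are hypothetical, and the set of extensions is passed in by the caller rather than assumed, since the PR defers to whatever `Formats.LoadProject` accepts.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical planning helper; not the PR's HeadlessRenderCommand code.
public static class BatchPlan {
    public static List<(string Input, string Output)> Expand(
        string inputDir, string outputDir, ISet<string> extensions) {
        return Directory
            // TopDirectoryOnly: subdirectories are deliberately not scanned.
            .EnumerateFiles(inputDir, "*", SearchOption.TopDirectoryOnly)
            .Where(path => extensions.Contains(Path.GetExtension(path)))
            // Stable order so serial runs are reproducible.
            .OrderBy(path => path, StringComparer.Ordinal)
            .Select(path => (
                Input: path,
                Output: Path.Combine(outputDir, Path.GetFileNameWithoutExtension(path) + ".wav")))
            .ToList();
    }
}
```

One thing this sketch makes visible: two inputs that share a base name (for example the same name with two different project extensions) would map to the same output path, so a real implementation needs some policy for that case.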
With overrides, e.g. to change the voicebank:
Supported override options:
- `--singer <id-or-name>`
- `--renderer <CLASSIC|WORLDLINE-R|WORLDLINE-R2|ENUNU|VOGEN|DIFFSINGER|VOICEVOX>`
- `--phonemizer <name-or-type>`
- `--resampler <name>`
- `--wavtool <name>`
- `--singers-path <path>`
- `--onnx-runner <CPU|DirectML|CoreML|NNAPI>`
- `--onnx-gpu <device-index>`
- `--diffsinger-depth <value>`
- `--diffsinger-steps <count>`
- `--diffsinger-variance-steps <count>`
- `--diffsinger-pitch-steps <count>`
- `--diffsinger-tensor-cache <true|false>`

If no override is provided, rendering uses the project’s saved settings and the same fallback behavior as GUI project loading/validation.
These overrides are applied to the whole project, i.e. every track gets the same override. Per-track shenanigans seem too complicated to implement for now, and I personally use different USTs for each vocal track, so I'm not too inclined to bother.
Implementation details
Conveniently, the basic tools we need to do programmatic rendering already exist, in the form of `Formats.LoadProject` to load a project file and `PlaybackManager.RenderMixdown` to render it to WAV. Surely, then, it's just a matter of piping one into the other and setting overrides, right? Wrong! `RenderMixdown` assumes there exist a bunch of utilities and managers that are started by the Avalonia app launch path. Therefore, we need to somehow launch these, or else risk running into weird problems like voicebanks not loading properly, as in this previous attempt at a CLI.

So we need to do some scaffolding. This is accomplished in the `HeadlessOpenUtauHost`, which essentially handles the CLI context and thus allows us to implement the simple load-then-render approach. This rendering is handled through a `HeadlessRenderCommand` and routed to individual `HeadlessRenderer.RenderOneAsync` calls to render each individual project file. Details below:

Brief overview of roles of new classes
- `HeadlessOpenUtauHost`
  - `ToolsManager`, `SingerManager`, and `DocManager`
  - `PostOnUIThread` replacement for command dispatch
- `HeadlessRenderer.RenderOneAsync`
  - `Formats.ReadProject`
  - `PlaybackManager.RenderMixdown`
- `HeadlessRenderCommand`
  - `RenderJob` instances
  - `HeadlessOpenUtauHost`

The CLI is routed from `OpenUtau/Program.cs` before Avalonia startup, and the normal GUI startup path is unchanged for all other invocations, so apart from some small changes to the `PhonemizerRunner`, this should not affect the GUI app at all. Speaking of which...

Phonemization
Project validation can enqueue asynchronous phonemizer work, and for headless rendering, we need to make sure we await all phonemizer responses before starting mixdown. To do this, we add a narrow idle-wait API to `PhonemizerRunner`, in the form of `Task WaitForIdleAsync()`, and we also track the idle waiters on each queued request, which are completed only after the main scheduler runs the response application task. This path shouldn't be touched by the GUI app at all, with the exception of a few functions that have been updated to accept the new API, which should be backward-compatible.

Override behavior
CLI overrides apply to every track in the project, and in batch mode, every track in every project. Maybe later we can address tracks individually, or individual projects within a folder, but I feel that is outside the scope of this PR.
Please let me know if I have missed any pertinent config options to pass through to any of the backends.
Phonemizer resolution accepts:
`JA VCV` or `DIFFS JA`

Renderer names are validated before rendering. If the renderer exists but does not support the selected singer type, rendering fails with a clear error such as:
Batch rendering
Continued from the bottom of "Usage," batch mode reuses one headless host for all files in the command, which avoids repeating manager/plugin/singer/tool initialization for each render. However, since the existing render path uses process-wide manager, scheduler, cache, and notification state, I don't think it's safe to try rendering multiple projects concurrently in this PR. The internal renderer may still use its own existing parallelism. (To be looked into in a future PR?)
If one file fails in batch mode, the command reports the error, continues with the remaining files, and exits with code `1` at the end.

Exit codes
- `0`: render completed successfully
- `1`: render failed, or at least one batch item failed
- `2`: invalid command-line usage

Testing
I added a suite of CLI tests that I believe is fairly comprehensive, including:
I've also tested it locally on my Windows machine in both single-file and batch modes, but only on UTAU; I have not yet tried DiffSinger or ENUNU.
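
For reference when reviewing the batch behavior and exit codes described earlier, the intended semantics can be sketched as a serial loop that records failures, keeps going, and reports `1` at the end if anything failed. The `ExitCodes` class, `RunSerially`, and the `renderOne`/`reportError` delegates are hypothetical names for illustration and do not correspond to the PR's actual types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the exit-code convention; not the PR's implementation.
public static class ExitCodes {
    public const int Success = 0;       // render completed successfully
    public const int RenderFailed = 1;  // render failed, or at least one batch item failed
    public const int InvalidUsage = 2;  // invalid command-line usage (decided during parsing)

    // Renders inputs one at a time; a failure is reported but does not stop the batch.
    public static int RunSerially(
        IReadOnlyList<string> inputs, Action<string> renderOne, Action<string> reportError) {
        bool anyFailed = false;
        foreach (var input in inputs) {
            try {
                renderOne(input);
            } catch (Exception e) {
                anyFailed = true;
                reportError($"{input}: {e.Message}");
            }
        }
        return anyFailed ? RenderFailed : Success;
    }
}
```

A calling script can then treat any nonzero exit code as "inspect the log", while still getting output for every project that did render.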