Supports NEUTRINO #2136

Open
rokujyushi wants to merge 22 commits into openutau:master from rokujyushi:neutrino

@rokujyushi
Contributor

Please merge this pull request first.

This pull request introduces NEUTRINO engine support to OpenUtau, adding a new singer type, phonemizer, and related infrastructure. The core changes involve implementing the NEUTRINO singer class, phonemizer logic, and a server launcher for background processing. Additional updates ensure integration with the rendering pipeline and singer type adjustments.

NEUTRINO Engine Integration:

  • Added a new NeutrinoSinger class implementing the USinger interface, including logic to load NEUTRINO-specific voicebank data, parse configuration and table files, and provide phoneme and oto data. (OpenUtau.Core/Neutrino/NeutrinoSinger.cs)
  • Introduced a NEUTRINO-specific phonemizer (NeutrinoLabelPhonemizer), handling dictionary loading, G2P (grapheme-to-phoneme) logic, phrase adjustments, and score sending to the NEUTRINO engine, with support for different NEUTRINO versions. (OpenUtau.Core/Neutrino/NeutrinoLabelPhonemizer.cs)
  • Implemented a NeutrinoServerLauncher utility to manage NEUTRINO server processes in the background, ensuring they are started, monitored, and stopped appropriately. (OpenUtau.Core/Neutrino/NeutrinoServerLauncher.cs)

Integration and Pipeline Adjustments:

  • Updated the singer loader to recognize and instantiate the new NeutrinoSinger type when appropriate. (OpenUtau.Core/Classic/ClassicSingerLoader.cs)
  • Modified the rendering pipeline to pass the note index to RenderPhone for compatibility with engines like NEUTRINO that may require note position information. (OpenUtau.Core/Render/RenderPhrase.cs)

About Support Content

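| Version | Feature | Support |
|---|---|---|
| v2.7.x | NSF Vocoder | |
| v2.7.x | WORLD Vocoder | |
| v2.7.x | WORLDLINE | |
| v2.7.x | Style Shift | |
| v2.7.x | Formant Shift | |
| v2.7.x | Breathiness | |
| v2.7.x | Smooth Pitch | |
| v3.x.x | Diffusion Vocoder | |
| v3.x.x | Style Shift | |
| v3.x.x | Supported Models | X (Support planned in future updates) |
| v3.x.x | Smooth Pitch | |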

rokujyushi added 13 commits May 18, 2026 16:51
Changes have been implemented to support NEUTRINO type singers and renderers.

- Added USingerType.Neutrino to ClassicSingerLoader.cs.
- Added NeutrinoLabelPhonemizer.cs and implemented label generation.
- Added NeutrinoRenderer.cs and implemented audio rendering.
- Added NeutrinoServerLauncher.cs and implemented server management.
- Added NeutrinoSinger.cs and implemented NEUTRINO singers.
- Added the NEUTRINO renderer to Renderers.cs.
- Added USingerType.Neutrino to USinger.cs.
- Added locale settings and background process launching to ProcessRunner.cs.
- Added TomlData.cs and implemented TOML file reading.
- Added TomlDataTests.cs and implemented tests for TOML utilities.
- Added the NEUTRINO type option to SingersViewModel.cs.

The test assertions within the MeasureForwardBackwardAreComputedPerBar method have been fixed. The expected values for the indices e0, e1, and e2 have been modified, and the following items were updated:

- Forward Index (e10): Fixed expected values for e0[9], e1[9], and e2[9].
- Backward Index (e11): Fixed expected values for e0[10], e1[10], and e2[10].
- Forward Percent (e16): Fixed expected values for e1[15] and e2[15].
- Backward Percent (e17): Fixed expected values for e0[16] and e1[16].

As a result, the test's expected values have been updated to align with the specifications.
…nting everything in the subclasses

Change HTSLabelRenderer: SetUp method to abstract

The SetUp method has been changed from virtual to abstract.
As a result, all subclasses are now required to implement SetUp.
The original logic within the SetUp method (initialization of phoneDict, language settings, dictionary loading, etc.) has been removed, and these responsibilities are now delegated to the subclasses.

Added a step that adjusts the pitch based on the toneShift value of phrase.phones[0] when generating RenderPitchResult.
This shifts the pitch in semitone steps (12 per octave), so the pitch is calculated accurately.

In NeutrinoServerLauncher.cs, I changed the log output to string interpolation to improve readability.

In ProcessRunner.cs, I removed the GetLanguageEnvironmentValue method and set the LANG environment variable to a fixed value of "ja_JP.utf8".

Additionally, I updated the code to dynamically control RedirectStandardOutput based on the DebugSwitch.
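
The ProcessRunner.cs change described above can be illustrated roughly as follows. This is a minimal sketch, not the PR's code: the class name `BackgroundProcess`, the method `BuildStartInfo`, and the `debug` flag standing in for DebugSwitch are all hypothetical, and the direction of the switch (redirect only when debugging) is an assumption.

```csharp
using System.Diagnostics;

// Hypothetical helper; names are illustrative and not taken from the PR.
static class BackgroundProcess {
    // Builds start info for a background tool with a fixed Japanese UTF-8 locale.
    // "debug" stands in for the DebugSwitch mentioned in the commit message.
    public static ProcessStartInfo BuildStartInfo(string fileName, string args, bool debug) {
        var psi = new ProcessStartInfo(fileName, args) {
            // Required for both Environment overrides and stream redirection.
            UseShellExecute = false,
            CreateNoWindow = true,
            // Assumption: stdout is captured only when debugging.
            RedirectStandardOutput = debug,
        };
        // Fixed locale, as described in the commit message.
        psi.Environment["LANG"] = "ja_JP.utf8";
        return psi;
    }
}
```

When stdout is redirected, the caller needs to read it; otherwise a chatty child process can block once the pipe buffer fills.
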

Path handling has been unified into a platform-independent format using Path.Join. This removes Windows-style path separators and improves code readability and portability.
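
As a small illustration of the Path.Join approach, the sketch below builds a voicebank file path without hard-coded separators. The class `VoicebankPaths`, the method `InfoToml`, and the `model` subdirectory are hypothetical; only `info.toml` is a file name mentioned in this PR.

```csharp
using System.IO;

// Hypothetical helper; the directory layout is an assumption for illustration.
static class VoicebankPaths {
    // Joins path segments with the platform's separator instead of a literal "\\".
    public static string InfoToml(string voicebankRoot) {
        return Path.Join(voicebankRoot, "model", "info.toml");
    }
}
```

Unlike Path.Combine, Path.Join does not discard earlier segments when a later segment is rooted, which may or may not be the behavior wanted for user-supplied paths.
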
Copilot AI review requested due to automatic review settings May 20, 2026 05:07
Contributor

Copilot AI left a comment

Pull request overview

This PR adds NEUTRINO engine support to OpenUtau by introducing a NEUTRINO singer/renderer/phonemizer stack and by extending the HTS label infrastructure so NEUTRINO (and other HTS-label-driven engines) can integrate with the render pipeline.

Changes:

  • Add NEUTRINO integration: new NeutrinoSinger, NeutrinoRenderer, NeutrinoServerLauncher, and NEUTRINO HTS-label phonemizer.
  • Refactor/centralize HTS utilities into OpenUtau.Core/Util and add HTS/TOML unit tests.
  • Update rendering pipeline to carry note index into RenderPhone, and wire new singer type + renderer support into UI and loaders.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 10 comments.

Summary per file

| File | Description |
|---|---|
| OpenUtau/ViewModels/SingersViewModel.cs | Adds neutrino to singer type menu options. |
| OpenUtau.Test/Plugins/HtsLabelPhonemizerTest.cs | Adds tests for HTS label phonemizer pipeline and basic synthesis plumbing. |
| OpenUtau.Test/Core/Util/TomlDataTests.cs | Adds tests for new lightweight TOML parser. |
| OpenUtau.Test/Core/Util/HtsSpecTests.cs | Adds tests validating HTS note/phoneme context behavior. |
| OpenUtau.Plugin.Builtin/EnunuOnnx/HTS.cs | Removes legacy HTS implementation from plugin (moved/refactored to Core). |
| OpenUtau.Plugin.Builtin/EnunuOnnx/EnunuOnnxPhonemizer.cs | Updates EnunuOnnx phonemizer to new Core HTS types/contexts. |
| OpenUtau.Core/Util/TomlData.cs | Adds a simple TOML loader used by NEUTRINO voicebanks (info.toml). |
| OpenUtau.Core/Util/Scaler.cs | Moves scaler utility into OpenUtau.Core/Util namespace. |
| OpenUtau.Core/Util/Python.cs | Moves python exception/util namespace into Core Util. |
| OpenUtau.Core/Util/ProcessRunner.cs | Adds StartBackground process helper for launching NEUTRINO servers. |
| OpenUtau.Core/Util/Merlin.cs | Moves Merlin frontend utility into Core Util namespace. |
| OpenUtau.Core/Util/HTSLabelFile.cs | Moves HTS label IO into Core Util namespace. |
| OpenUtau.Core/Util/HTS.cs | Adds refactored HTS context builder + note/phoneme/phrase structures. |
| OpenUtau.Core/Ustx/USinger.cs | Adds Neutrino singer type and name mappings. |
| OpenUtau.Core/Render/RenderPhrase.cs | Passes noteIndex into RenderPhone for HTS-label engines. |
| OpenUtau.Core/Render/Renderers.cs | Registers NEUTRINO renderer and supported renderer list. |
| OpenUtau.Core/Neutrino/NeutrinoSinger.cs | Implements NEUTRINO voicebank loading + dummy oto/phoneme inventory. |
| OpenUtau.Core/Neutrino/NeutrinoServerLauncher.cs | Adds background server process management (start/stop/readiness). |
| OpenUtau.Core/Neutrino/NeutrinoRenderer.cs | Implements NEUTRINO rendering via NEUTRINO/WORLD/NSF tooling + pitch loading. |
| OpenUtau.Core/Neutrino/NeutrinoLabelPhonemizer.cs | Adds NEUTRINO HTS-label phonemizer and timing generation via NEUTRINO tools. |
| OpenUtau.Core/Hts/HTSLabelRenderer.cs | Introduces base renderer for HTS-label-based engines (label generation + layout). |
| OpenUtau.Core/Hts/HTSLabelPhonemizer.cs | Introduces base HTS-label phonemizer producing aligned phoneme timing. |
| OpenUtau.Core/Classic/ClassicSingerLoader.cs | Instantiates NeutrinoSinger for the new singer type. |

Comments suppressed due to low confidence (3)

OpenUtau.Core/Hts/HTSLabelRenderer.cs:462

  • Bar-length in milliseconds is computed as (60000 / bpm) * beatPerBar, but it ignores beatUnit. For non-4 denominators (e.g. 6/8), this will over/under-estimate a bar and will desync the HTS label padding/contexts. Consider multiplying by (4.0 / sigStart.beatUnit) (and similarly for end), or derive bar duration via timeAxis using the segment’s ticks-per-bar and TickPosToMsPos.
            // Set the padding to the bar length (one bar each at the start and end)
            sigStart = timeAxis.TimeSignatureAtTick(startTick);
            bpmStart = timeAxis.GetBpmAtTick(startTick);
            headMs = (int)Math.Round((60000.0 / bpmStart) * sigStart.beatPerBar);

            sigEnd = timeAxis.TimeSignatureAtTick(endTick);
            bpmEnd = timeAxis.GetBpmAtTick(endTick);
            tailMs = (int)Math.Round((60000.0 / bpmEnd) * sigEnd.beatPerBar);
            return new RenderResult() {
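
The beatUnit correction suggested in this comment can be sketched as a standalone calculation. This is an illustration under stated assumptions, not the PR's code: the class `BarLength` and method `BarMs` are hypothetical, and the tempo is assumed to count quarter notes per minute.

```csharp
using System;

// Hypothetical helper showing the arithmetic only.
static class BarLength {
    // Duration of one bar in milliseconds.
    // Assumption: bpm counts quarter notes per minute.
    public static double BarMs(double bpm, int beatPerBar, int beatUnit) {
        double quarterMs = 60000.0 / bpm;            // one quarter note
        double beatMs = quarterMs * (4.0 / beatUnit); // one beat of the given denominator
        return beatMs * beatPerBar;
    }
}
```

At 120 bpm, a 4/4 bar comes out at 2000 ms and a 6/8 bar at 1500 ms, whereas the uncorrected formula `(60000 / bpm) * beatPerBar` would give 3000 ms for 6/8.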

OpenUtau.Plugin.Builtin/EnunuOnnx/EnunuOnnxPhonemizer.cs:562

  • After switching to HTSPhrase, the loop that links htsNotes no longer assigns index, indexBackwards, and sentence duration fields for each non-padding note. HTSNote.e() uses these fields for multiple context slots, so leaving them at defaults will output xx / incorrect contexts and can degrade the ONNX model input. Restore the per-note initialization (or move it into HTSPhrase so every caller gets consistent context).
            var htsPhrase = new HTSPhrase(htsNotes.ToArray());
            htsPhrase.totalNotes = htsNotes.Count;
            htsPhrase.totalPhonemes = htsPhonemes.Count;
            //make neighborhood links between htsNotes and between htsPhonemes
            foreach (int i in Enumerable.Range(0, htsNotes.Count)) {
                htsNotes[i].parent = htsPhrase;
                if (i > 0) {
                    htsNotes[i].prev = htsNotes[i - 1];
                    htsNotes[i - 1].next = htsNotes[i];
                }

OpenUtau.Core/Hts/HTSLabelPhonemizer.cs:66

  • The code/comment says the g2p dictionary should be loaded after the phoneme dictionary, but LoadG2p(rootPath) is called before LoadDict(...) populates phoneDict. For implementations that rely on phoneDict to seed G2pDictionary, this will build an incomplete g2p and lead to missing symbol mappings. Swap the order (load dict first, then build g2p), or rebuild g2p after LoadDict completes.
            //Load g2p from enunux.yaml
            //g2p dict should be load after enunu dict
            try {
                g2p = LoadG2p(rootPath);
            } catch (Exception e) {

Comment thread OpenUtau.Core/Ustx/USinger.cs
Comment thread OpenUtau.Core/Hts/HTSLabelRenderer.cs
Comment thread OpenUtau.Core/Hts/HTSLabelRenderer.cs Outdated
Comment thread OpenUtau.Core/Neutrino/NeutrinoRenderer.cs Outdated
Comment thread OpenUtau.Core/Util/ProcessRunner.cs
Comment thread OpenUtau.Core/Hts/HTSLabelPhonemizer.cs
Comment thread OpenUtau.Core/Neutrino/NeutrinoSinger.cs
Comment thread OpenUtau.Core/Neutrino/NeutrinoLabelPhonemizer.cs Outdated
Comment thread OpenUtau.Core/Neutrino/NeutrinoLabelPhonemizer.cs
Comment thread OpenUtau.Core/Neutrino/NeutrinoRenderer.cs
rokujyushi and others added 9 commits May 20, 2026 18:50
The conditions in the IsSyllableVowelExtensionNote method have been expanded so that lyrics starting with specific symbols are recognized as vowel extension notes. Additionally, the phonemeDuration calculation in the ProcessPart method has been removed and replaced with logic that calculates startMs and endMs directly.

In phoneme timing calculations, new logic considering headMs and phrase.positionMs has been added, and a process to adjust the end time of existing monoLabels has been implemented. This prevents overlaps and inconsistencies, improving the accuracy of timing.

Furthermore, the startMs of the monoLabel at the end of a phrase has been changed to sentenceDurMs - tailMs to ensure that the timing of the entire phrase is accurately reflected.

This now allows for the regeneration of parameters when the pitch is changed.

Modified path strings within ArgParam to be enclosed in double quotes.
This fix ensures correct processing even when paths contain spaces.
Updated multiple locations within the SendScore method of NeutrinoLabelPhonemizer.cs and the Render method of NeutrinoRenderer.cs.
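
The quoting fix can be pictured with a small helper like the one below. It is a simplified sketch: the class `ArgQuoting` and method `Quote` are hypothetical, and the helper does not handle embedded quotes or a trailing backslash, which would need extra escaping on Windows.

```csharp
// Hypothetical helper; not taken from the PR.
static class ArgQuoting {
    // Wraps a path in double quotes so that spaces do not split the argument.
    // Already-quoted input is returned unchanged.
    public static string Quote(string path) {
        bool alreadyQuoted = path.Length >= 2
            && path.StartsWith("\"")
            && path.EndsWith("\"");
        return alreadyQuoted ? path : "\"" + path + "\"";
    }
}
```

An alternative that avoids manual quoting is `ProcessStartInfo.ArgumentList`, which passes each argument separately; whether it fits the ArgParam design in this PR is not something the description confirms.
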

The monoLabel struct has been changed to public, and a customizable virtual method CustomMonoLabel has been added. This allows subclasses to override the processing logic for monoLabels_.

Additionally, in NeutrinoRenderer.cs, CustomMonoLabel is overridden to implement a process that rounds label timings to 10ms increments when singerVersion is "v2.7".
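
Rounding to 10 ms increments can be sketched as below. This is a guess at the arithmetic only: the class `LabelTiming` and method `RoundTo10Ms` are hypothetical, and the rounding mode (nearest, with midpoints rounded away from zero) is an assumption, since the commit message does not say whether the PR rounds, floors, or ceils.

```csharp
using System;

// Hypothetical helper; the PR's CustomMonoLabel override may differ.
static class LabelTiming {
    // Snaps a time in milliseconds to the nearest multiple of 10 ms.
    public static double RoundTo10Ms(double ms) {
        return Math.Round(ms / 10.0, MidpointRounding.AwayFromZero) * 10.0;
    }
}
```

With this choice, 123.4 ms becomes 120 ms and 125 ms becomes 130 ms.
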

2 participants