Skip to content

add ensembler#261

Open
makhlwf wants to merge 6 commits intonomadkaraoke:mainfrom
makhlwf:add-ensemble
Open

add ensembler#261
makhlwf wants to merge 6 commits intonomadkaraoke:mainfrom
makhlwf:add-ensemble

Conversation

@makhlwf
Copy link

@makhlwf makhlwf commented Mar 11, 2026

add ensembler based on other emplemtasions
more then one ensembler method are supported
use as CLI or as a package is supported by just proviting more then one model name
added example in readme.md
added unit test
evry thing has been tested locally

Summary by CodeRabbit

  • New Features

    • Multi-model ensembling: combine outputs from multiple models with selectable algorithms (average, median, FFT-based, UVR-based, min/max variants, ensemble WAV) and optional weights.
    • Separator now supports an ensemble processing path when multiple models are provided.
    • CLI accepts multiple models, ensemble algorithm, and ensemble weights; supports multi-model downloads.
  • Documentation

    • README expanded with ensemble concepts, CLI/Python examples, supported algorithms, and new parameters.
  • Tests

    • Added unit tests for ensembling behaviors and updated CLI tests.

@coderabbitai
Copy link

coderabbitai bot commented Mar 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds multi-model ensembling: a new Ensembler class with multiple algorithms, Separator orchestration for per-model inference and ensemble combining, CLI support for multiple models and ensemble options, unit tests for ensembling, and README documentation updates.

Changes

Cohort / File(s) Summary
Documentation
README.md
Added "Ensembling Multiple Models" guidance, CLI and Python usage examples, supported algorithms list, and new ensemble_algorithm / ensemble_weights parameters.
Core Ensembler Implementation
audio_separator/separator/ensembler.py
New Ensembler class with ensemble() implementing avg/weighted_avg/median/min/max, FFT and UVR variants, STFT/ISTFT helpers, length normalization/padding, weight validation, and logging.
Separator Integration
audio_separator/separator/separator.py
Separator accepts ensemble_algorithm and ensemble_weights; load_model handles lists; adds _separate_ensemble() to run per-model separation into temp workspaces, aggregate stems, call Ensembler, and write ensembled outputs; imports librosa/numpy.
CLI Enhancements
audio_separator/utils/cli.py
-m/--model_filename accepts multiple values; added --ensemble_algorithm and --ensemble_weights flags; supports downloading multiple models and passes ensemble options to Separator.
Tests — Ensembler & CLI
tests/unit/test_ensembler.py, tests/unit/test_cli.py
Added unit tests covering Ensembler algorithms, padding/shape behavior; CLI tests updated to mock distribution, include ensemble_algorithm/ensemble_weights defaults, and expect model filename as list.
Packaging / Requirements
manifest_file, requirements.txt
Updated manifest/requirements to include new dependencies and reflect added module files.
sequenceDiagram
    participant User as User/CLI
    participant Sep as Separator
    participant Loader as Model Loader
    participant Model as Model Instance
    participant Ens as Ensembler
    participant File as File I/O

    User->>Sep: separate(audio_file, models=[...], ensemble_algorithm, ensemble_weights)
    Sep->>Sep: detect multiple models
    loop per model
        Sep->>Loader: load_model(model_filename)
        Loader-->>Model: model_instance
        Sep->>Model: separate(audio_file)
        Model-->>Sep: stems (waveforms)
        Sep->>Sep: collect stems by type
    end
    Sep->>Ens: init(algorithm, weights)
    loop per stem type
        Sep->>Ens: ensemble([wave1, wave2, ...])
        Ens->>Ens: normalize lengths / apply algorithm
        Ens-->>Sep: ensembled_waveform
    end
    Sep->>File: write ensembled outputs
    File-->>Sep: files_written
    Sep-->>User: return ensembled stems
Loading

🎯 4 (Complex) | ⏱️ ~45 minutes

🐇 I hopped through models, one by one,
Averaged, medians, FFTs for fun,
I padded tails and balanced the score,
Collected stems and stitched once more,
Ensemble complete — hop, done!

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 21.21% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'add ensembler' is vague and generic, using minimal descriptive language that does not convey meaningful information about the substantial changes in the changeset. Consider a more descriptive title such as 'Add ensemble support for multiple model outputs' or 'Implement multi-model ensembling with CLI and API support' to better communicate the scope and purpose of changes.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
tests/unit/test_ensembler.py (1)

1-79: Good test coverage for Ensembler algorithms.

The tests cover the main ensemble algorithms effectively. Consider adding edge case tests for:

  • Empty waveforms list (returns None)
  • Single waveform (returns as-is)
  • Mismatched channel counts between waveforms (potential silent corruption)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_ensembler.py` around lines 1 - 79, Add unit tests in
tests/unit/test_ensembler.py to cover the edge cases: call Ensembler(logger,
algorithm="avg_wave").ensemble([]) and assert it returns None (empty waveforms
list), call ensemble on a single waveform and assert it returns the same array
(single waveform passthrough), and add a test that passes two arrays with
different channel counts into Ensembler.ensemble and assert the function either
raises a clear error (e.g., ValueError) or handles it predictably; reference the
Ensembler class and its ensemble method when adding these tests so reviewers can
locate the behavior under test.
audio_separator/separator/ensembler.py (1)

81-95: STFT/ISTFT methods assume stereo audio.

These methods index wave[0] and wave[1] directly, which will fail if a mono waveform (1 channel) is passed. While separator.py currently converts mono to stereo before calling ensemble, this assumption is fragile.

💡 Consider adding defensive handling
def _stft(self, wave, nfft=2048, hl=1024):
    if wave.shape[0] == 1:
        wave = np.vstack([wave, wave])  # Convert mono to stereo
    wave_left = np.asfortranarray(wave[0])
    wave_right = np.asfortranarray(wave[1])
    # ... rest of method
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/separator/ensembler.py` around lines 81 - 95, _stft and
_istft assume two channels and will crash for mono input; update both methods to
defensively handle mono inputs by detecting 1D waveforms or shapes with a single
channel and converting them to two channels before indexing (e.g., if wave.ndim
== 1 or wave.shape[0] == 1, duplicate the channel using np.vstack or np.repeat
to produce a 2xN array), then proceed to compute/restore left/right as currently
done; apply the same detection/duplication logic in _istft for mono spectrograms
(shape with one channel) so the functions work whether input is mono or stereo
while keeping the existing return format.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 27-33: The ensemble() method is mutating instance state by
assigning to self.weights; change this to use a local variable weights instead:
inside Ensembler.ensemble(), compute weights = np.ones(len(waveforms)) or
weights = np.array(self.weights) and validate its length (logging and falling
back to equal weights) without assigning back to self.weights, and then update
all subsequent uses inside ensemble (the places currently referencing
self.weights) to reference the local weights variable so the Ensembler
instance's state is not modified across calls.

In `@audio_separator/separator/separator.py`:
- Around line 1234-1240: The current block assumes self.model_instance is always
present before calling self.model_instance.write_audio, so if all model
separations fail and self.model_instance is None the ensembled output will never
be written; modify the code around the block that references
self.model_instance, write_audio, ensembled_wav, output_path and output_files to
guard and handle the None case by either (a) raising a clear exception with
context (including path/output_path) so the caller can handle the failure, or
(b) providing a fallback writer that writes ensembled_wav directly to
final_output_path (using a simple WAV write path) and appends that path to
output_files; ensure audio_file_path/output_dir are still populated or logged
when using the fallback so downstream code gets consistent metadata.

In `@audio_separator/utils/cli.py`:
- Line 63: The help string variable ensemble_algorithm_help in
audio_separator/utils/cli.py contains an invalid example value; update the
example from --ensemble_algorithm=max_spec to a valid choice such as
--ensemble_algorithm=max_fft or --ensemble_algorithm=uvr_max_spec so the example
matches the documented choices (avg_wave, median_wave, min_wave, max_wave,
avg_fft, median_fft, min_fft, max_fft, uvr_max_spec, uvr_min_spec,
ensemble_wav).
- Line 192: The log call uses args.model_filename which can be a list, producing
awkward output; update the logger.info call (in the download/notify flow where
logger.info is called) to detect if args.model_filename is a list/iterable and
join its items into a clean comma-separated string (or format singular values
as-is) before inserting into the message so lines like "Model m1.ckpt, m2.ckpt
downloaded successfully." are logged instead of "Model ['m1.ckpt', 'm2.ckpt']
downloaded successfully."; specifically adjust the usage around logger.info and
args.model_filename to normalize to a string (e.g., ", ".join(...) for lists)
before formatting the message.

---

Nitpick comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 81-95: _stft and _istft assume two channels and will crash for
mono input; update both methods to defensively handle mono inputs by detecting
1D waveforms or shapes with a single channel and converting them to two channels
before indexing (e.g., if wave.ndim == 1 or wave.shape[0] == 1, duplicate the
channel using np.vstack or np.repeat to produce a 2xN array), then proceed to
compute/restore left/right as currently done; apply the same
detection/duplication logic in _istft for mono spectrograms (shape with one
channel) so the functions work whether input is mono or stereo while keeping the
existing return format.

In `@tests/unit/test_ensembler.py`:
- Around line 1-79: Add unit tests in tests/unit/test_ensembler.py to cover the
edge cases: call Ensembler(logger, algorithm="avg_wave").ensemble([]) and assert
it returns None (empty waveforms list), call ensemble on a single waveform and
assert it returns the same array (single waveform passthrough), and add a test
that passes two arrays with different channel counts into Ensembler.ensemble and
assert the function either raises a clear error (e.g., ValueError) or handles it
predictably; reference the Ensembler class and its ensemble method when adding
these tests so reviewers can locate the behavior under test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 45a1da00-ce1e-4c7d-b813-89328be0a793

📥 Commits

Reviewing files that changed from the base of the PR and between 12f8fc6 and 371de86.

📒 Files selected for processing (6)
  • README.md
  • audio_separator/separator/ensembler.py
  • audio_separator/separator/separator.py
  • audio_separator/utils/cli.py
  • tests/unit/test_cli.py
  • tests/unit/test_ensembler.py

@makhlwf makhlwf marked this pull request as draft March 11, 2026 19:37
@makhlwf makhlwf marked this pull request as ready for review March 11, 2026 19:54
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
audio_separator/utils/cli.py (1)

187-194: Minor inefficiency: Creating new Separator for each model download.

A new Separator instance is created for each model in the download loop. Consider reusing a single instance for efficiency when downloading multiple models.

♻️ Proposed fix
     if args.download_model_only:
         models_to_download = args.model_filename if isinstance(args.model_filename, list) else [args.model_filename]
+        separator = Separator(log_formatter=log_formatter, log_level=log_level, model_file_dir=args.model_file_dir)
         for model in models_to_download:
             logger.info(f"Separator version {package_version} downloading model {model} to directory {args.model_file_dir}")
-            separator = Separator(log_formatter=log_formatter, log_level=log_level, model_file_dir=args.model_file_dir)
             separator.download_model_and_data(model)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/utils/cli.py` around lines 187 - 194, The loop creates a new
Separator for every model; instead, instantiate a single Separator once (using
log_formatter, log_level, and args.model_file_dir) and reuse it to call
separator.download_model_and_data(model) for each entry in models_to_download
(compute models_to_download as you already do). Update the logging to still
report per-model download start and a final success message for models_string,
and remove the per-iteration Separator construction.
audio_separator/separator/ensembler.py (2)

39-41: Add strict=True to zip() for safety.

Per static analysis hint, adding strict=True ensures a clear error if the iterables have mismatched lengths, which is good defensive practice even though validation exists.

♻️ Proposed fix
         if self.algorithm == "avg_wave":
             ensembled = np.zeros_like(waveforms[0])
-            for w, weight in zip(waveforms, weights):
+            for w, weight in zip(waveforms, weights, strict=True):
                 ensembled += w * weight
             return ensembled / np.sum(weights)

And similarly at line 112:

-            for s, weight in zip(specs, weights):
+            for s, weight in zip(specs, weights, strict=True):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/separator/ensembler.py` around lines 39 - 41, The loop using
for w, weight in zip(waveforms, weights): should use zip(waveforms, weights,
strict=True) to raise an explicit error on length mismatch; update that
occurrence and the similar zip(...) at the later occurrence (the one around line
112 that also zips waveforms and weights) to include strict=True so mismatched
iterable lengths fail fast.

37-47: Weights are ignored for non-average algorithms.

The weights parameter is computed but only used by avg_wave (line 39-41) and avg_fft (line 112-114). The median_wave, min_wave, and max_wave algorithms silently ignore weights, which may surprise users who expect weighted behavior.

Consider either:

  1. Documenting this limitation clearly, or
  2. Logging a warning when weights are provided but ignored
📝 Proposed documentation/warning
         self.logger.debug(f"Ensembling {len(waveforms)} waveforms using algorithm {self.algorithm}")

+        if self.weights is not None and self.algorithm not in ["avg_wave", "avg_fft"]:
+            self.logger.warning(f"Weights are only used by 'avg_wave' and 'avg_fft' algorithms, ignoring weights for '{self.algorithm}'")
+
         if self.algorithm == "avg_wave":
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/separator/ensembler.py` around lines 37 - 47, The code treats
weights only for avg_wave and avg_fft while median_wave, min_wave, and max_wave
silently ignore the weights; update the ensemble logic to detect when a
non-weighted algorithm is selected (self.algorithm in "median_wave", "min_wave",
"max_wave") and weights is provided/not-uniform and emit a clear warning (e.g.,
via logging.getLogger(__name__).warning or an instance logger) stating that
weights are ignored for these algorithms, or alternatively implement weighted
versions if desired; reference the symbols self.algorithm, weights, avg_wave,
avg_fft, median_wave, min_wave, max_wave, _lambda_min and _lambda_max when
making the change so the warning check is added before returning from those
branches.
audio_separator/separator/separator.py (1)

1262-1270: Consider clearing model_instance after ensemble processing.

After ensemble processing, self.model_instance is left pointing to the last loaded model rather than being cleared. If a user calls separate() again on a single file without calling load_model() first, they might get unexpected behavior.

♻️ Proposed fix
             finally:
                 # Restore original model filenames state
                 self.model_filename = original_model_filename
                 self.model_filenames = original_model_filenames
+                # Clear model instance to avoid confusion after ensemble
+                self.model_instance = None

                 # Clean up temporary directory
                 if os.path.exists(temp_dir):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/separator/separator.py` around lines 1262 - 1270, After
ensemble processing, clear the lingering model reference by setting
self.model_instance = None in the same finally block where you restore
original_model_filename(s) and clean temp_dir; locate the finally block
surrounding the ensemble logic (references: self.model_filename,
self.model_filenames, temp_dir) and add a line to reset self.model_instance to
None so subsequent calls to separate() won't reuse the last-loaded model
unexpectedly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@audio_separator/separator/separator.py`:
- Around line 1188-1196: The stem name normalization currently uses
stem_name.capitalize() in the separator logic (the block handling lower_name and
stem_name) which mangles multi-word stems; update the normalization to preserve
multi-word capitalization by using a title-casing approach (e.g.,
stem_name.title()) or, better, map normalized lower-case keys to canonical names
from CommonSeparator (use a lookup dictionary keyed by lower_name to return the
canonical name) instead of capitalize(), and apply this change where stem_name
is reassigned in that normalization block.

---

Nitpick comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 39-41: The loop using for w, weight in zip(waveforms, weights):
should use zip(waveforms, weights, strict=True) to raise an explicit error on
length mismatch; update that occurrence and the similar zip(...) at the later
occurrence (the one around line 112 that also zips waveforms and weights) to
include strict=True so mismatched iterable lengths fail fast.
- Around line 37-47: The code treats weights only for avg_wave and avg_fft while
median_wave, min_wave, and max_wave silently ignore the weights; update the
ensemble logic to detect when a non-weighted algorithm is selected
(self.algorithm in "median_wave", "min_wave", "max_wave") and weights is
provided/not-uniform and emit a clear warning (e.g., via
logging.getLogger(__name__).warning or an instance logger) stating that weights
are ignored for these algorithms, or alternatively implement weighted versions
if desired; reference the symbols self.algorithm, weights, avg_wave, avg_fft,
median_wave, min_wave, max_wave, _lambda_min and _lambda_max when making the
change so the warning check is added before returning from those branches.

In `@audio_separator/separator/separator.py`:
- Around line 1262-1270: After ensemble processing, clear the lingering model
reference by setting self.model_instance = None in the same finally block where
you restore original_model_filename(s) and clean temp_dir; locate the finally
block surrounding the ensemble logic (references: self.model_filename,
self.model_filenames, temp_dir) and add a line to reset self.model_instance to
None so subsequent calls to separate() won't reuse the last-loaded model
unexpectedly.

In `@audio_separator/utils/cli.py`:
- Around line 187-194: The loop creates a new Separator for every model;
instead, instantiate a single Separator once (using log_formatter, log_level,
and args.model_file_dir) and reuse it to call
separator.download_model_and_data(model) for each entry in models_to_download
(compute models_to_download as you already do). Update the logging to still
report per-model download start and a final success message for models_string,
and remove the per-iteration Separator construction.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 793f9d1e-7498-458c-b4f1-e71ff051366c

📥 Commits

Reviewing files that changed from the base of the PR and between 371de86 and 3c16178.

📒 Files selected for processing (4)
  • audio_separator/separator/ensembler.py
  • audio_separator/separator/separator.py
  • audio_separator/utils/cli.py
  • tests/unit/test_ensembler.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/test_ensembler.py

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
audio_separator/separator/ensembler.py (1)

44-45: Make the zip() contract explicit.

weights is length-checked just above, so strict=True is safe here and clears Ruff's B905 warning in both loops.

💡 Suggested cleanup
-            for w, weight in zip(waveforms, weights):
+            for w, weight in zip(waveforms, weights, strict=True):
@@
-            for s, weight in zip(specs, weights):
+            for s, weight in zip(specs, weights, strict=True):
As per coding guidelines, "Use ruff for code linting and formatting checks".

Also applies to: 117-118

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/separator/ensembler.py` around lines 44 - 45, Make the zip
contract explicit by using zip(..., strict=True) in the loops inside
ensembler.py where waveforms and weights are paired (the loop that does "for w,
weight in zip(waveforms, weights)" and the similar loop around lines 117-118);
since weights are length-checked earlier, replace those zip calls with
strict=True to satisfy linting (Ruff B905) and ensure mismatched lengths raise
immediately.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 32-38: The current weights handling only checks length; validate
the actual weight values before they are used as divisors by checking that all
entries are finite (use np.isfinite) and that the total sum is not zero (and not
NaN/inf) so you never divide by a zero or non-finite sum in methods that use
weights (the local variable weights derived from self.weights and any averaging
helpers like avg_time_domain / avg_spectrogram). If the validation fails
(non-finite entries, sum == 0, or sum is NaN/Inf), log a warning on self.logger
referencing self.weights and fall back to equal weights
(np.ones(len(waveforms))) just like the length-mismatch case; keep using the
validated numpy array otherwise.
- Around line 86-108: The mono-channel duplication in _stft causes _istft to
always return two channels; update the flow so the original channel count is
captured (e.g., store original_channels in the caller or return it from _stft)
and modify _istft to accept an optional original_channels parameter and trim the
reconstructed wave to that number of channels before returning; specifically,
change callers like _ensemble_fft to pass the original channel count into _istft
and in _istft (function name _istft) if original_channels == 1 slice the output
to a single channel (keep behavior for >=2 unchanged).

---

Nitpick comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 44-45: Make the zip contract explicit by using zip(...,
strict=True) in the loops inside ensembler.py where waveforms and weights are
paired (the loop that does "for w, weight in zip(waveforms, weights)" and the
similar loop around lines 117-118); since weights are length-checked earlier,
replace those zip calls with strict=True to satisfy linting (Ruff B905) and
ensure mismatched lengths raise immediately.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ec3da7de-812b-4f82-bab4-cf12ab44f6dc

📥 Commits

Reviewing files that changed from the base of the PR and between 3c16178 and 3bac786.

📒 Files selected for processing (1)
  • audio_separator/separator/ensembler.py

@makhlwf
Copy link
Author

makhlwf commented Mar 11, 2026

If we continue in this loop, I will revert to my first commit.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
audio_separator/separator/separator.py (1)

1268-1280: Fallback writer has limited format support.

The fallback writer only supports WAV format and forces WAV extension for other formats with an error log. Consider using soundfile with format-specific subtypes, or note this as a known limitation in documentation.

💡 Consider adding FLAC support to fallback writer
                         if self.output_format.lower() == "wav":
                             import soundfile as sf
 
                             sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
+                        elif self.output_format.lower() == "flac":
+                            import soundfile as sf
+                            
+                            sf.write(final_output_path, ensembled_wav.T, self.sample_rate, format='FLAC')
                         else:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/separator/separator.py` around lines 1268 - 1280, The
fallback writer currently forces a .wav write for non-WAV formats and only logs
an error; update the logic in separator.py around the output_format check
(symbols: self.output_format, final_output_path, ensembled_wav,
self.sample_rate, self.logger) to attempt writing the requested format first by
inferring the file extension/format and calling soundfile.write with explicit
format and subtype options, catch any exceptions, and only then fall back to
writing WAV (updating final_output_path accordingly) while logging the caught
error; alternatively, if you prefer not to implement additional formats now,
update the code comment/log message to clearly state this known limitation and
add documentation referencing that CommonSeparator handles other formats.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 153-156: The _ensemble_uvr implementation doesn't preserve
original channel count: capture the original channel configuration from the
input (e.g., inspect the first entry in waveforms to determine if it was mono vs
multichannel) before calling spec_utils.wave_to_spectrogram_no_mp; after getting
the waveform back from spec_utils.spectrogram_to_wave_no_mp, if the original was
mono but the returned wave is stereo/2D, collapse it back to mono (for example
by averaging channels or taking the first channel) so mono inputs produce mono
outputs — mirror the behavior of _ensemble_fft/_istft by using original channel
info to restore shape.

In `@audio_separator/separator/separator.py`:
- Around line 737-742: The early return in load_model incorrectly treats any
list (even single-item lists) as "multiple" and prevents loading the
model_instance; update the logic in load_model (the block handling
model_filename / self.model_filenames) so that only lists with length > 1 set up
ensembling and return early, while single-item lists are converted to a
single-element list (self.model_filenames = [model_filename[0]] or leave as-is)
but continue to load and initialize self.model_instance; ensure the downstream
separate() branch (which checks len(self.model_filenames) > 1) will correctly
route to ensemble only for true multi-model cases.

In `@audio_separator/utils/cli.py`:
- Line 48: The CLI currently defines model_filename with nargs="+"
(io_params.add_argument("-m", "--model_filename", ... , nargs="+")), which
forces args.model_filename to always be a list; fix by changing the CLI to
accept a single string (remove nargs or use nargs=None) so model_filename is a
plain str for single-model runs, or alternatively keep nargs but ensure
separator.py's load_model function normalizes a single-item list to a string
before proceeding (unwrap args.model_filename if isinstance(list) and len==1) so
load_model and separation logic work with both single and multiple models.

---

Nitpick comments:
In `@audio_separator/separator/separator.py`:
- Around line 1268-1280: The fallback writer currently forces a .wav write for
non-WAV formats and only logs an error; update the logic in separator.py around
the output_format check (symbols: self.output_format, final_output_path,
ensembled_wav, self.sample_rate, self.logger) to attempt writing the requested
format first by inferring the file extension/format and calling soundfile.write
with explicit format and subtype options, catch any exceptions, and only then
fall back to writing WAV (updating final_output_path accordingly) while logging
the caught error; alternatively, if you prefer not to implement additional
formats now, update the code comment/log message to clearly state this known
limitation and add documentation referencing that CommonSeparator handles other
formats.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a83fadf3-bea2-49e5-abc0-f3301e2f5667

📥 Commits

Reviewing files that changed from the base of the PR and between 3bac786 and ac68238.

📒 Files selected for processing (3)
  • audio_separator/separator/ensembler.py
  • audio_separator/separator/separator.py
  • audio_separator/utils/cli.py

Comment on lines +153 to +156
def _ensemble_uvr(self, waveforms, uvr_algorithm):
specs = [spec_utils.wave_to_spectrogram_no_mp(w) for w in waveforms]
ense_spec = spec_utils.ensembling(uvr_algorithm, specs)
return spec_utils.spectrogram_to_wave_no_mp(ense_spec) No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check spec_utils for mono handling in ensemble functions
rg -n "def wave_to_spectrogram_no_mp|def spectrogram_to_wave_no_mp|def ensembling" audio_separator/separator/uvr_lib_v5/spec_utils.py -A 10

Repository: nomadkaraoke/python-audio-separator

Length of output: 983


🏁 Script executed:

# Find and examine _ensemble_fft and _ensemble_uvr methods
rg -n "def _ensemble_fft|def _ensemble_uvr" audio_separator/separator/ensembler.py -A 15

Repository: nomadkaraoke/python-audio-separator

Length of output: 1163


🏁 Script executed:

# Get the full ensembling function from spec_utils
rg -n "def ensembling" audio_separator/separator/uvr_lib_v5/spec_utils.py -A 30

Repository: nomadkaraoke/python-audio-separator

Length of output: 1300


🏁 Script executed:

# Check how _ensemble_uvr is called to understand expected input/output
rg -n "_ensemble_uvr" audio_separator/separator/ensembler.py -B 3 -A 3

Repository: nomadkaraoke/python-audio-separator

Length of output: 975


🏁 Script executed:

# Find _istft method to see how it uses original_channels
rg -n "def _istft" audio_separator/separator/ensembler.py -A 10

Repository: nomadkaraoke/python-audio-separator

Length of output: 612


🏁 Script executed:

# Also check the _stft method to understand the flow
rg -n "def _stft" audio_separator/separator/ensembler.py -A 10

Repository: nomadkaraoke/python-audio-separator

Length of output: 610


🏁 Script executed:

# Get the rest of the _istft method to see the original_channels handling
rg -n "def _istft" audio_separator/separator/ensembler.py -A 15

Repository: nomadkaraoke/python-audio-separator

Length of output: 734


_ensemble_uvr does not preserve mono channel shape.

Unlike _ensemble_fft, which captures the original channel count and passes it to _istft for restoration, _ensemble_uvr has no mechanism to track or restore the input channel configuration. The spec_utils.wave_to_spectrogram_no_mp function converts mono (1D) to stereo (2D) by duplicating channels, and spec_utils.spectrogram_to_wave_no_mp has no original_channels parameter to reverse this conversion. As a result, mono waveforms will be output as stereo.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/separator/ensembler.py` around lines 153 - 156, The
_ensemble_uvr implementation doesn't preserve original channel count: capture
the original channel configuration from the input (e.g., inspect the first entry
in waveforms to determine if it was mono vs multichannel) before calling
spec_utils.wave_to_spectrogram_no_mp; after getting the waveform back from
spec_utils.spectrogram_to_wave_no_mp, if the original was mono but the returned
wave is stereo/2D, collapse it back to mono (for example by averaging channels
or taking the first channel) so mono inputs produce mono outputs — mirror the
behavior of _ensemble_fft/_istft by using original channel info to restore
shape.


io_params = parser.add_argument_group("Separation I/O Params")
io_params.add_argument("-m", "--model_filename", default="model_bs_roformer_ep_317_sdr_12.9755.ckpt", help=model_filename_help)
io_params.add_argument("-m", "--model_filename", default="model_bs_roformer_ep_317_sdr_12.9755.ckpt", nargs="+", help=model_filename_help)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

nargs="+" makes model_filename always a list.

With nargs="+", args.model_filename is always a list, even for a single model. This interacts with the issue identified in separator.py where load_model returns early for lists, breaking single-model separation. The fix in separator.py (unwrapping single-item lists) will address this.

This is the CLI side of the single-model-as-list issue. Ensure the fix in separator.py:load_model is applied to handle this case.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/utils/cli.py` at line 48, The CLI currently defines
model_filename with nargs="+" (io_params.add_argument("-m", "--model_filename",
... , nargs="+")), which forces args.model_filename to always be a list; fix by
changing the CLI to accept a single string (remove nargs or use nargs=None) so
model_filename is a plain str for single-model runs, or alternatively keep nargs
but ensure separator.py's load_model function normalizes a single-item list to a
string before proceeding (unwrap args.model_filename if isinstance(list) and
len==1) so load_model and separation logic work with both single and multiple
models.

@makhlwf
Copy link
Author

makhlwf commented Mar 11, 2026

What do you think @beveradb
Should I continue with this or river to the first commit because it was working great
Unfortunately there is already a valid comments here but there's also invalid comments that leads to failing tests and unneeded changes

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
audio_separator/separator/separator.py (2)

1287-1288: Document that model_instance is cleared after ensemble separation.

Setting self.model_instance = None after ensemble processing means subsequent calls to separate() without first calling load_model() again will fail. This may be intentional but could surprise users. Consider documenting this behavior in the method docstring.

📝 Suggested docstring addition
     def _separate_ensemble(self, audio_file_path, custom_output_names=None):
         """
         Internal method to handle ensembling of multiple models.
+        
+        Note: After ensemble processing completes, the model_instance is cleared.
+        Call load_model() again before performing additional separations.
         """
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/separator/separator.py` around lines 1287 - 1288, The code
clears self.model_instance after ensemble separation which makes subsequent
calls to separate() fail unless load_model() is called again; update the
separate() method docstring to document that behavior clearly (mention that
self.model_instance is set to None after ensemble processing), state that
callers must call load_model() before calling separate() again, and give a brief
rationale or usage example referencing the separate() and load_model() methods
and the model_instance attribute so users are not surprised.

1165-1180: Consider adding per-model error handling for resilience.

If one model fails during separation, the entire ensemble operation aborts. Depending on use case, you may want to allow partial ensembling with the models that succeeded.

♻️ Optional: Add try/except around individual model separation
                 for model_filename in original_model_filenames:
                     self.logger.info(f"Processing with model: {model_filename}")

-                    # Load the model
-                    self.load_model(model_filename)
+                    try:
+                        # Load the model
+                        self.load_model(model_filename)
+                    except Exception as e:
+                        self.logger.warning(f"Failed to load model {model_filename}: {e}, skipping")
+                        continue

                     # Set temporary output directory for this model
                     original_output_dir = self.output_dir
                     self.output_dir = temp_dir
                     if self.model_instance:
                         self.model_instance.output_dir = temp_dir

                     try:
                         # Perform separation
                         model_stems = self._separate_file(path, custom_output_names)
+                    except Exception as e:
+                        self.logger.warning(f"Separation failed for model {model_filename}: {e}, skipping")
+                        continue
+                    finally:
+                        self.output_dir = original_output_dir
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/separator/separator.py` around lines 1165 - 1180, The loop
over original_model_filenames currently calls load_model and _separate_file
without per-model error handling so a single model error aborts the whole
ensemble; wrap the per-model block (load_model, setting output_dir/temp_dir,
calling _separate_file) in a try/except to catch exceptions, log the error via
self.logger.exception or self.logger.error (include model_filename), and
continue to the next model, and use a finally block to always restore
original_output_dir and self.model_instance.output_dir; also ensure you track
which model_stems succeeded (e.g., append to a list) so downstream ensembling
only uses successful outputs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@audio_separator/separator/separator.py`:
- Around line 1272-1278: In the except block that catches failures when writing
via sf.write (inside the method using final_output_path, ensembled_wav,
self.sample_rate and self.output_format), replace the current
self.logger.error(...) with a call that logs the exception and stack trace (e.g.
self.logger.exception or self.logger.error(..., exc_info=True)) before
performing the fallback to WAV; keep the same descriptive message, then update
final_output_path to the .wav variant and call sf.write(final_output_path,
ensembled_wav.T, self.sample_rate) as before.

---

Nitpick comments:
In `@audio_separator/separator/separator.py`:
- Around line 1287-1288: The code clears self.model_instance after ensemble
separation which makes subsequent calls to separate() fail unless load_model()
is called again; update the separate() method docstring to document that
behavior clearly (mention that self.model_instance is set to None after ensemble
processing), state that callers must call load_model() before calling separate()
again, and give a brief rationale or usage example referencing the separate()
and load_model() methods and the model_instance attribute so users are not
surprised.
- Around line 1165-1180: The loop over original_model_filenames currently calls
load_model and _separate_file without per-model error handling so a single model
error aborts the whole ensemble; wrap the per-model block (load_model, setting
output_dir/temp_dir, calling _separate_file) in a try/except to catch
exceptions, log the error via self.logger.exception or self.logger.error
(include model_filename), and continue to the next model, and use a finally
block to always restore original_output_dir and self.model_instance.output_dir;
also ensure you track which model_stems succeeded (e.g., append to a list) so
downstream ensembling only uses successful outputs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d1b4f626-ece3-480f-93be-3511fdd4ee16

📥 Commits

Reviewing files that changed from the base of the PR and between ac68238 and 19ff223.

📒 Files selected for processing (2)
  • audio_separator/separator/separator.py
  • audio_separator/utils/cli.py

Comment on lines +1272 to +1278
try:
self.logger.debug(f"Attempting to write ensembled audio to {final_output_path}...")
sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
except Exception as e:
self.logger.error(f"Error writing {self.output_format} format: {e}. Falling back to WAV.")
final_output_path = final_output_path.rsplit(".", 1)[0] + ".wav"
sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use logging.exception to capture stack trace on fallback writer error.

The static analysis tool correctly identifies that logging.exception would be more informative here, as it includes the stack trace automatically.

🐛 Proposed fix
                         try:
                             self.logger.debug(f"Attempting to write ensembled audio to {final_output_path}...")
                             sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
-                        except Exception as e:
-                            self.logger.error(f"Error writing {self.output_format} format: {e}. Falling back to WAV.")
+                        except OSError as e:
+                            self.logger.exception(f"Error writing {self.output_format} format. Falling back to WAV.")
                             final_output_path = final_output_path.rsplit(".", 1)[0] + ".wav"
                             sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
self.logger.debug(f"Attempting to write ensembled audio to {final_output_path}...")
sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
except Exception as e:
self.logger.error(f"Error writing {self.output_format} format: {e}. Falling back to WAV.")
final_output_path = final_output_path.rsplit(".", 1)[0] + ".wav"
sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
try:
self.logger.debug(f"Attempting to write ensembled audio to {final_output_path}...")
sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
except OSError as e:
self.logger.exception(f"Error writing {self.output_format} format. Falling back to WAV.")
final_output_path = final_output_path.rsplit(".", 1)[0] + ".wav"
sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
🧰 Tools
🪛 Ruff (0.15.5)

[warning] 1275-1275: Do not catch blind exception: Exception

(BLE001)


[warning] 1276-1276: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@audio_separator/separator/separator.py` around lines 1272 - 1278, In the
except block that catches failures when writing via sf.write (inside the method
using final_output_path, ensembled_wav, self.sample_rate and
self.output_format), replace the current self.logger.error(...) with a call that
logs the exception and stack trace (e.g. self.logger.exception or
self.logger.error(..., exc_info=True)) before performing the fallback to WAV;
keep the same descriptive message, then update final_output_path to the .wav
variant and call sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
as before.

@makhlwf
Copy link
Author

makhlwf commented Mar 11, 2026

I think every good is thing now

@beveradb
Copy link
Collaborator

Sorry haven't had a chance to look at this thoroughly yet but my general take is coderabbit is useful and often catches things I would miss, but should still be taken with a pinch of salt and have a little bit of human brainpower applied to decide what to do 😄 - seems you reached a similar conclusion yourself anyway!

My first glance at this looks good, not sure I'm wild about changing the CLI ordering though, I feel like listing the models to ensemble as a comma-separated list would be less invasive for folks who already use the CLI.

Either way I'll give it a proper test later today and hopefully we can get this over the line - thanks for contributing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants