Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds multi-model ensembling: a new Ensembler class with multiple algorithms, Separator orchestration for per-model inference and ensemble combining, CLI support for multiple models and ensemble options, unit tests for ensembling, and README documentation updates. Changes
sequenceDiagram
participant User as User/CLI
participant Sep as Separator
participant Loader as Model Loader
participant Model as Model Instance
participant Ens as Ensembler
participant File as File I/O
User->>Sep: separate(audio_file, models=[...], ensemble_algorithm, ensemble_weights)
Sep->>Sep: detect multiple models
loop per model
Sep->>Loader: load_model(model_filename)
Loader-->>Model: model_instance
Sep->>Model: separate(audio_file)
Model-->>Sep: stems (waveforms)
Sep->>Sep: collect stems by type
end
Sep->>Ens: init(algorithm, weights)
loop per stem type
Sep->>Ens: ensemble([wave1, wave2, ...])
Ens->>Ens: normalize lengths / apply algorithm
Ens-->>Sep: ensembled_waveform
end
Sep->>File: write ensembled outputs
File-->>Sep: files_written
Sep-->>User: return ensembled stems
🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
tests/unit/test_ensembler.py (1)
1-79: Good test coverage for Ensembler algorithms.The tests cover the main ensemble algorithms effectively. Consider adding edge case tests for:
- Empty waveforms list (returns
None)- Single waveform (returns as-is)
- Mismatched channel counts between waveforms (potential silent corruption)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/test_ensembler.py` around lines 1 - 79, Add unit tests in tests/unit/test_ensembler.py to cover the edge cases: call Ensembler(logger, algorithm="avg_wave").ensemble([]) and assert it returns None (empty waveforms list), call ensemble on a single waveform and assert it returns the same array (single waveform passthrough), and add a test that passes two arrays with different channel counts into Ensembler.ensemble and assert the function either raises a clear error (e.g., ValueError) or handles it predictably; reference the Ensembler class and its ensemble method when adding these tests so reviewers can locate the behavior under test.audio_separator/separator/ensembler.py (1)
81-95: STFT/ISTFT methods assume stereo audio.These methods index
wave[0]andwave[1]directly, which will fail if a mono waveform (1 channel) is passed. Whileseparator.pycurrently converts mono to stereo before calling ensemble, this assumption is fragile.💡 Consider adding defensive handling
def _stft(self, wave, nfft=2048, hl=1024): if wave.shape[0] == 1: wave = np.vstack([wave, wave]) # Convert mono to stereo wave_left = np.asfortranarray(wave[0]) wave_right = np.asfortranarray(wave[1]) # ... rest of method🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@audio_separator/separator/ensembler.py` around lines 81 - 95, _stft and _istft assume two channels and will crash for mono input; update both methods to defensively handle mono inputs by detecting 1D waveforms or shapes with a single channel and converting them to two channels before indexing (e.g., if wave.ndim == 1 or wave.shape[0] == 1, duplicate the channel using np.vstack or np.repeat to produce a 2xN array), then proceed to compute/restore left/right as currently done; apply the same detection/duplication logic in _istft for mono spectrograms (shape with one channel) so the functions work whether input is mono or stereo while keeping the existing return format.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 27-33: The ensemble() method is mutating instance state by
assigning to self.weights; change this to use a local variable weights instead:
inside Ensembler.ensemble(), compute weights = np.ones(len(waveforms)) or
weights = np.array(self.weights) and validate its length (logging and falling
back to equal weights) without assigning back to self.weights, and then update
all subsequent uses inside ensemble (the places currently referencing
self.weights) to reference the local weights variable so the Ensembler
instance's state is not modified across calls.
In `@audio_separator/separator/separator.py`:
- Around line 1234-1240: The current block assumes self.model_instance is always
present before calling self.model_instance.write_audio, so if all model
separations fail and self.model_instance is None the ensembled output will never
be written; modify the code around the block that references
self.model_instance, write_audio, ensembled_wav, output_path and output_files to
guard and handle the None case by either (a) raising a clear exception with
context (including path/output_path) so the caller can handle the failure, or
(b) providing a fallback writer that writes ensembled_wav directly to
final_output_path (using a simple WAV write path) and appends that path to
output_files; ensure audio_file_path/output_dir are still populated or logged
when using the fallback so downstream code gets consistent metadata.
In `@audio_separator/utils/cli.py`:
- Line 63: The help string variable ensemble_algorithm_help in
audio_separator/utils/cli.py contains an invalid example value; update the
example from --ensemble_algorithm=max_spec to a valid choice such as
--ensemble_algorithm=max_fft or --ensemble_algorithm=uvr_max_spec so the example
matches the documented choices (avg_wave, median_wave, min_wave, max_wave,
avg_fft, median_fft, min_fft, max_fft, uvr_max_spec, uvr_min_spec,
ensemble_wav).
- Line 192: The log call uses args.model_filename which can be a list, producing
awkward output; update the logger.info call (in the download/notify flow where
logger.info is called) to detect if args.model_filename is a list/iterable and
join its items into a clean comma-separated string (or format singular values
as-is) before inserting into the message so lines like "Model m1.ckpt, m2.ckpt
downloaded successfully." are logged instead of "Model ['m1.ckpt', 'm2.ckpt']
downloaded successfully."; specifically adjust the usage around logger.info and
args.model_filename to normalize to a string (e.g., ", ".join(...) for lists)
before formatting the message.
---
Nitpick comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 81-95: _stft and _istft assume two channels and will crash for
mono input; update both methods to defensively handle mono inputs by detecting
1D waveforms or shapes with a single channel and converting them to two channels
before indexing (e.g., if wave.ndim == 1 or wave.shape[0] == 1, duplicate the
channel using np.vstack or np.repeat to produce a 2xN array), then proceed to
compute/restore left/right as currently done; apply the same
detection/duplication logic in _istft for mono spectrograms (shape with one
channel) so the functions work whether input is mono or stereo while keeping the
existing return format.
In `@tests/unit/test_ensembler.py`:
- Around line 1-79: Add unit tests in tests/unit/test_ensembler.py to cover the
edge cases: call Ensembler(logger, algorithm="avg_wave").ensemble([]) and assert
it returns None (empty waveforms list), call ensemble on a single waveform and
assert it returns the same array (single waveform passthrough), and add a test
that passes two arrays with different channel counts into Ensembler.ensemble and
assert the function either raises a clear error (e.g., ValueError) or handles it
predictably; reference the Ensembler class and its ensemble method when adding
these tests so reviewers can locate the behavior under test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 45a1da00-ce1e-4c7d-b813-89328be0a793
📒 Files selected for processing (6)
README.mdaudio_separator/separator/ensembler.pyaudio_separator/separator/separator.pyaudio_separator/utils/cli.pytests/unit/test_cli.pytests/unit/test_ensembler.py
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
audio_separator/utils/cli.py (1)
187-194: Minor inefficiency: Creating new Separator for each model download.A new
Separatorinstance is created for each model in the download loop. Consider reusing a single instance for efficiency when downloading multiple models.♻️ Proposed fix
if args.download_model_only: models_to_download = args.model_filename if isinstance(args.model_filename, list) else [args.model_filename] + separator = Separator(log_formatter=log_formatter, log_level=log_level, model_file_dir=args.model_file_dir) for model in models_to_download: logger.info(f"Separator version {package_version} downloading model {model} to directory {args.model_file_dir}") - separator = Separator(log_formatter=log_formatter, log_level=log_level, model_file_dir=args.model_file_dir) separator.download_model_and_data(model)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@audio_separator/utils/cli.py` around lines 187 - 194, The loop creates a new Separator for every model; instead, instantiate a single Separator once (using log_formatter, log_level, and args.model_file_dir) and reuse it to call separator.download_model_and_data(model) for each entry in models_to_download (compute models_to_download as you already do). Update the logging to still report per-model download start and a final success message for models_string, and remove the per-iteration Separator construction.audio_separator/separator/ensembler.py (2)
39-41: Addstrict=Truetozip()for safety.Per static analysis hint, adding
strict=Trueensures a clear error if the iterables have mismatched lengths, which is good defensive practice even though validation exists.♻️ Proposed fix
if self.algorithm == "avg_wave": ensembled = np.zeros_like(waveforms[0]) - for w, weight in zip(waveforms, weights): + for w, weight in zip(waveforms, weights, strict=True): ensembled += w * weight return ensembled / np.sum(weights)And similarly at line 112:
- for s, weight in zip(specs, weights): + for s, weight in zip(specs, weights, strict=True):🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@audio_separator/separator/ensembler.py` around lines 39 - 41, The loop using for w, weight in zip(waveforms, weights): should use zip(waveforms, weights, strict=True) to raise an explicit error on length mismatch; update that occurrence and the similar zip(...) at the later occurrence (the one around line 112 that also zips waveforms and weights) to include strict=True so mismatched iterable lengths fail fast.
37-47: Weights are ignored for non-average algorithms.The
weightsparameter is computed but only used byavg_wave(line 39-41) andavg_fft(line 112-114). Themedian_wave,min_wave, andmax_wavealgorithms silently ignore weights, which may surprise users who expect weighted behavior.Consider either:
- Documenting this limitation clearly, or
- Logging a warning when weights are provided but ignored
📝 Proposed documentation/warning
self.logger.debug(f"Ensembling {len(waveforms)} waveforms using algorithm {self.algorithm}") + if self.weights is not None and self.algorithm not in ["avg_wave", "avg_fft"]: + self.logger.warning(f"Weights are only used by 'avg_wave' and 'avg_fft' algorithms, ignoring weights for '{self.algorithm}'") + if self.algorithm == "avg_wave":🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@audio_separator/separator/ensembler.py` around lines 37 - 47, The code treats weights only for avg_wave and avg_fft while median_wave, min_wave, and max_wave silently ignore the weights; update the ensemble logic to detect when a non-weighted algorithm is selected (self.algorithm in "median_wave", "min_wave", "max_wave") and weights is provided/not-uniform and emit a clear warning (e.g., via logging.getLogger(__name__).warning or an instance logger) stating that weights are ignored for these algorithms, or alternatively implement weighted versions if desired; reference the symbols self.algorithm, weights, avg_wave, avg_fft, median_wave, min_wave, max_wave, _lambda_min and _lambda_max when making the change so the warning check is added before returning from those branches.audio_separator/separator/separator.py (1)
1262-1270: Consider clearingmodel_instanceafter ensemble processing.After ensemble processing,
self.model_instanceis left pointing to the last loaded model rather than being cleared. If a user callsseparate()again on a single file without callingload_model()first, they might get unexpected behavior.♻️ Proposed fix
finally: # Restore original model filenames state self.model_filename = original_model_filename self.model_filenames = original_model_filenames + # Clear model instance to avoid confusion after ensemble + self.model_instance = None # Clean up temporary directory if os.path.exists(temp_dir):🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@audio_separator/separator/separator.py` around lines 1262 - 1270, After ensemble processing, clear the lingering model reference by setting self.model_instance = None in the same finally block where you restore original_model_filename(s) and clean temp_dir; locate the finally block surrounding the ensemble logic (references: self.model_filename, self.model_filenames, temp_dir) and add a line to reset self.model_instance to None so subsequent calls to separate() won't reuse the last-loaded model unexpectedly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@audio_separator/separator/separator.py`:
- Around line 1188-1196: The stem name normalization currently uses
stem_name.capitalize() in the separator logic (the block handling lower_name and
stem_name) which mangles multi-word stems; update the normalization to preserve
multi-word capitalization by using a title-casing approach (e.g.,
stem_name.title()) or, better, map normalized lower-case keys to canonical names
from CommonSeparator (use a lookup dictionary keyed by lower_name to return the
canonical name) instead of capitalize(), and apply this change where stem_name
is reassigned in that normalization block.
---
Nitpick comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 39-41: The loop using for w, weight in zip(waveforms, weights):
should use zip(waveforms, weights, strict=True) to raise an explicit error on
length mismatch; update that occurrence and the similar zip(...) at the later
occurrence (the one around line 112 that also zips waveforms and weights) to
include strict=True so mismatched iterable lengths fail fast.
- Around line 37-47: The code treats weights only for avg_wave and avg_fft while
median_wave, min_wave, and max_wave silently ignore the weights; update the
ensemble logic to detect when a non-weighted algorithm is selected
(self.algorithm in "median_wave", "min_wave", "max_wave") and weights is
provided/not-uniform and emit a clear warning (e.g., via
logging.getLogger(__name__).warning or an instance logger) stating that weights
are ignored for these algorithms, or alternatively implement weighted versions
if desired; reference the symbols self.algorithm, weights, avg_wave, avg_fft,
median_wave, min_wave, max_wave, _lambda_min and _lambda_max when making the
change so the warning check is added before returning from those branches.
In `@audio_separator/separator/separator.py`:
- Around line 1262-1270: After ensemble processing, clear the lingering model
reference by setting self.model_instance = None in the same finally block where
you restore original_model_filename(s) and clean temp_dir; locate the finally
block surrounding the ensemble logic (references: self.model_filename,
self.model_filenames, temp_dir) and add a line to reset self.model_instance to
None so subsequent calls to separate() won't reuse the last-loaded model
unexpectedly.
In `@audio_separator/utils/cli.py`:
- Around line 187-194: The loop creates a new Separator for every model;
instead, instantiate a single Separator once (using log_formatter, log_level,
and args.model_file_dir) and reuse it to call
separator.download_model_and_data(model) for each entry in models_to_download
(compute models_to_download as you already do). Update the logging to still
report per-model download start and a final success message for models_string,
and remove the per-iteration Separator construction.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 793f9d1e-7498-458c-b4f1-e71ff051366c
📒 Files selected for processing (4)
audio_separator/separator/ensembler.pyaudio_separator/separator/separator.pyaudio_separator/utils/cli.pytests/unit/test_ensembler.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/unit/test_ensembler.py
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
audio_separator/separator/ensembler.py (1)
44-45: Make thezip()contract explicit.
weightsis length-checked just above, sostrict=Trueis safe here and clears Ruff'sB905warning in both loops.As per coding guidelines, "Use ruff for code linting and formatting checks".💡 Suggested cleanup
- for w, weight in zip(waveforms, weights): + for w, weight in zip(waveforms, weights, strict=True): @@ - for s, weight in zip(specs, weights): + for s, weight in zip(specs, weights, strict=True):Also applies to: 117-118
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@audio_separator/separator/ensembler.py` around lines 44 - 45, Make the zip contract explicit by using zip(..., strict=True) in the loops inside ensembler.py where waveforms and weights are paired (the loop that does "for w, weight in zip(waveforms, weights)" and the similar loop around lines 117-118); since weights are length-checked earlier, replace those zip calls with strict=True to satisfy linting (Ruff B905) and ensure mismatched lengths raise immediately.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 32-38: The current weights handling only checks length; validate
the actual weight values before they are used as divisors by checking that all
entries are finite (use np.isfinite) and that the total sum is not zero (and not
NaN/inf) so you never divide by a zero or non-finite sum in methods that use
weights (the local variable weights derived from self.weights and any averaging
helpers like avg_time_domain / avg_spectrogram). If the validation fails
(non-finite entries, sum == 0, or sum is NaN/Inf), log a warning on self.logger
referencing self.weights and fall back to equal weights
(np.ones(len(waveforms))) just like the length-mismatch case; keep using the
validated numpy array otherwise.
- Around line 86-108: The mono-channel duplication in _stft causes _istft to
always return two channels; update the flow so the original channel count is
captured (e.g., store original_channels in the caller or return it from _stft)
and modify _istft to accept an optional original_channels parameter and trim the
reconstructed wave to that number of channels before returning; specifically,
change callers like _ensemble_fft to pass the original channel count into _istft
and in _istft (function name _istft) if original_channels == 1 slice the output
to a single channel (keep behavior for >=2 unchanged).
---
Nitpick comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 44-45: Make the zip contract explicit by using zip(...,
strict=True) in the loops inside ensembler.py where waveforms and weights are
paired (the loop that does "for w, weight in zip(waveforms, weights)" and the
similar loop around lines 117-118); since weights are length-checked earlier,
replace those zip calls with strict=True to satisfy linting (Ruff B905) and
ensure mismatched lengths raise immediately.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ec3da7de-812b-4f82-bab4-cf12ab44f6dc
📒 Files selected for processing (1)
audio_separator/separator/ensembler.py
|
If we continue in this loop, I will revert to my first commit. |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
audio_separator/separator/separator.py (1)
1268-1280: Fallback writer has limited format support.The fallback writer only supports WAV format and forces WAV extension for other formats with an error log. Consider using
soundfilewith format-specific subtypes, or note this as a known limitation in documentation.💡 Consider adding FLAC support to fallback writer
if self.output_format.lower() == "wav": import soundfile as sf sf.write(final_output_path, ensembled_wav.T, self.sample_rate) + elif self.output_format.lower() == "flac": + import soundfile as sf + + sf.write(final_output_path, ensembled_wav.T, self.sample_rate, format='FLAC') else:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@audio_separator/separator/separator.py` around lines 1268 - 1280, The fallback writer currently forces a .wav write for non-WAV formats and only logs an error; update the logic in separator.py around the output_format check (symbols: self.output_format, final_output_path, ensembled_wav, self.sample_rate, self.logger) to attempt writing the requested format first by inferring the file extension/format and calling soundfile.write with explicit format and subtype options, catch any exceptions, and only then fall back to writing WAV (updating final_output_path accordingly) while logging the caught error; alternatively, if you prefer not to implement additional formats now, update the code comment/log message to clearly state this known limitation and add documentation referencing that CommonSeparator handles other formats.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@audio_separator/separator/ensembler.py`:
- Around line 153-156: The _ensemble_uvr implementation doesn't preserve
original channel count: capture the original channel configuration from the
input (e.g., inspect the first entry in waveforms to determine if it was mono vs
multichannel) before calling spec_utils.wave_to_spectrogram_no_mp; after getting
the waveform back from spec_utils.spectrogram_to_wave_no_mp, if the original was
mono but the returned wave is stereo/2D, collapse it back to mono (for example
by averaging channels or taking the first channel) so mono inputs produce mono
outputs — mirror the behavior of _ensemble_fft/_istft by using original channel
info to restore shape.
In `@audio_separator/separator/separator.py`:
- Around line 737-742: The early return in load_model incorrectly treats any
list (even single-item lists) as "multiple" and prevents loading the
model_instance; update the logic in load_model (the block handling
model_filename / self.model_filenames) so that only lists with length > 1 set up
ensembling and return early, while single-item lists are converted to a
single-element list (self.model_filenames = [model_filename[0]] or leave as-is)
but continue to load and initialize self.model_instance; ensure the downstream
separate() branch (which checks len(self.model_filenames) > 1) will correctly
route to ensemble only for true multi-model cases.
In `@audio_separator/utils/cli.py`:
- Line 48: The CLI currently defines model_filename with nargs="+"
(io_params.add_argument("-m", "--model_filename", ... , nargs="+")), which
forces args.model_filename to always be a list; fix by changing the CLI to
accept a single string (remove nargs or use nargs=None) so model_filename is a
plain str for single-model runs, or alternatively keep nargs but ensure
separator.py's load_model function normalizes a single-item list to a string
before proceeding (unwrap args.model_filename if isinstance(list) and len==1) so
load_model and separation logic work with both single and multiple models.
---
Nitpick comments:
In `@audio_separator/separator/separator.py`:
- Around line 1268-1280: The fallback writer currently forces a .wav write for
non-WAV formats and only logs an error; update the logic in separator.py around
the output_format check (symbols: self.output_format, final_output_path,
ensembled_wav, self.sample_rate, self.logger) to attempt writing the requested
format first by inferring the file extension/format and calling soundfile.write
with explicit format and subtype options, catch any exceptions, and only then
fall back to writing WAV (updating final_output_path accordingly) while logging
the caught error; alternatively, if you prefer not to implement additional
formats now, update the code comment/log message to clearly state this known
limitation and add documentation referencing that CommonSeparator handles other
formats.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a83fadf3-bea2-49e5-abc0-f3301e2f5667
📒 Files selected for processing (3)
audio_separator/separator/ensembler.pyaudio_separator/separator/separator.pyaudio_separator/utils/cli.py
| def _ensemble_uvr(self, waveforms, uvr_algorithm): | ||
| specs = [spec_utils.wave_to_spectrogram_no_mp(w) for w in waveforms] | ||
| ense_spec = spec_utils.ensembling(uvr_algorithm, specs) | ||
| return spec_utils.spectrogram_to_wave_no_mp(ense_spec) No newline at end of file |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check spec_utils for mono handling in ensemble functions
rg -n "def wave_to_spectrogram_no_mp|def spectrogram_to_wave_no_mp|def ensembling" audio_separator/separator/uvr_lib_v5/spec_utils.py -A 10Repository: nomadkaraoke/python-audio-separator
Length of output: 983
🏁 Script executed:
# Find and examine _ensemble_fft and _ensemble_uvr methods
rg -n "def _ensemble_fft|def _ensemble_uvr" audio_separator/separator/ensembler.py -A 15Repository: nomadkaraoke/python-audio-separator
Length of output: 1163
🏁 Script executed:
# Get the full ensembling function from spec_utils
rg -n "def ensembling" audio_separator/separator/uvr_lib_v5/spec_utils.py -A 30Repository: nomadkaraoke/python-audio-separator
Length of output: 1300
🏁 Script executed:
# Check how _ensemble_uvr is called to understand expected input/output
rg -n "_ensemble_uvr" audio_separator/separator/ensembler.py -B 3 -A 3Repository: nomadkaraoke/python-audio-separator
Length of output: 975
🏁 Script executed:
# Find _istft method to see how it uses original_channels
rg -n "def _istft" audio_separator/separator/ensembler.py -A 10Repository: nomadkaraoke/python-audio-separator
Length of output: 612
🏁 Script executed:
# Also check the _stft method to understand the flow
rg -n "def _stft" audio_separator/separator/ensembler.py -A 10Repository: nomadkaraoke/python-audio-separator
Length of output: 610
🏁 Script executed:
# Get the rest of the _istft method to see the original_channels handling
rg -n "def _istft" audio_separator/separator/ensembler.py -A 15Repository: nomadkaraoke/python-audio-separator
Length of output: 734
_ensemble_uvr does not preserve mono channel shape.
Unlike _ensemble_fft, which captures the original channel count and passes it to _istft for restoration, _ensemble_uvr has no mechanism to track or restore the input channel configuration. The spec_utils.wave_to_spectrogram_no_mp function converts mono (1D) to stereo (2D) by duplicating channels, and spec_utils.spectrogram_to_wave_no_mp has no original_channels parameter to reverse this conversion. As a result, mono waveforms will be output as stereo.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@audio_separator/separator/ensembler.py` around lines 153 - 156, The
_ensemble_uvr implementation doesn't preserve original channel count: capture
the original channel configuration from the input (e.g., inspect the first entry
in waveforms to determine if it was mono vs multichannel) before calling
spec_utils.wave_to_spectrogram_no_mp; after getting the waveform back from
spec_utils.spectrogram_to_wave_no_mp, if the original was mono but the returned
wave is stereo/2D, collapse it back to mono (for example by averaging channels
or taking the first channel) so mono inputs produce mono outputs — mirror the
behavior of _ensemble_fft/_istft by using original channel info to restore
shape.
audio_separator/utils/cli.py
Outdated
|
|
||
| io_params = parser.add_argument_group("Separation I/O Params") | ||
| io_params.add_argument("-m", "--model_filename", default="model_bs_roformer_ep_317_sdr_12.9755.ckpt", help=model_filename_help) | ||
| io_params.add_argument("-m", "--model_filename", default="model_bs_roformer_ep_317_sdr_12.9755.ckpt", nargs="+", help=model_filename_help) |
There was a problem hiding this comment.
nargs="+" makes model_filename always a list.
With nargs="+", args.model_filename is always a list, even for a single model. This interacts with the issue identified in separator.py where load_model returns early for lists, breaking single-model separation. The fix in separator.py (unwrapping single-item lists) will address this.
This is the CLI side of the single-model-as-list issue. Ensure the fix in separator.py:load_model is applied to handle this case.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@audio_separator/utils/cli.py` at line 48, The CLI currently defines
model_filename with nargs="+" (io_params.add_argument("-m", "--model_filename",
... , nargs="+")), which forces args.model_filename to always be a list; fix by
changing the CLI to accept a single string (remove nargs or use nargs=None) so
model_filename is a plain str for single-model runs, or alternatively keep nargs
but ensure separator.py's load_model function normalizes a single-item list to a
string before proceeding (unwrap args.model_filename if isinstance(list) and
len==1) so load_model and separation logic work with both single and multiple
models.
|
What do you think @beveradb |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
audio_separator/separator/separator.py (2)
1287-1288: Document thatmodel_instanceis cleared after ensemble separation.Setting
self.model_instance = Noneafter ensemble processing means subsequent calls toseparate()without first callingload_model()again will fail. This may be intentional but could surprise users. Consider documenting this behavior in the method docstring.📝 Suggested docstring addition
def _separate_ensemble(self, audio_file_path, custom_output_names=None): """ Internal method to handle ensembling of multiple models. + + Note: After ensemble processing completes, the model_instance is cleared. + Call load_model() again before performing additional separations. """🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@audio_separator/separator/separator.py` around lines 1287 - 1288, The code clears self.model_instance after ensemble separation which makes subsequent calls to separate() fail unless load_model() is called again; update the separate() method docstring to document that behavior clearly (mention that self.model_instance is set to None after ensemble processing), state that callers must call load_model() before calling separate() again, and give a brief rationale or usage example referencing the separate() and load_model() methods and the model_instance attribute so users are not surprised.
1165-1180: Consider adding per-model error handling for resilience.If one model fails during separation, the entire ensemble operation aborts. Depending on use case, you may want to allow partial ensembling with the models that succeeded.
♻️ Optional: Add try/except around individual model separation
for model_filename in original_model_filenames: self.logger.info(f"Processing with model: {model_filename}") - # Load the model - self.load_model(model_filename) + try: + # Load the model + self.load_model(model_filename) + except Exception as e: + self.logger.warning(f"Failed to load model {model_filename}: {e}, skipping") + continue # Set temporary output directory for this model original_output_dir = self.output_dir self.output_dir = temp_dir if self.model_instance: self.model_instance.output_dir = temp_dir try: # Perform separation model_stems = self._separate_file(path, custom_output_names) + except Exception as e: + self.logger.warning(f"Separation failed for model {model_filename}: {e}, skipping") + continue + finally: + self.output_dir = original_output_dir🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@audio_separator/separator/separator.py` around lines 1165 - 1180, The loop over original_model_filenames currently calls load_model and _separate_file without per-model error handling so a single model error aborts the whole ensemble; wrap the per-model block (load_model, setting output_dir/temp_dir, calling _separate_file) in a try/except to catch exceptions, log the error via self.logger.exception or self.logger.error (include model_filename), and continue to the next model, and use a finally block to always restore original_output_dir and self.model_instance.output_dir; also ensure you track which model_stems succeeded (e.g., append to a list) so downstream ensembling only uses successful outputs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@audio_separator/separator/separator.py`:
- Around line 1272-1278: In the except block that catches failures when writing
via sf.write (inside the method using final_output_path, ensembled_wav,
self.sample_rate and self.output_format), replace the current
self.logger.error(...) with a call that logs the exception and stack trace (e.g.
self.logger.exception or self.logger.error(..., exc_info=True)) before
performing the fallback to WAV; keep the same descriptive message, then update
final_output_path to the .wav variant and call sf.write(final_output_path,
ensembled_wav.T, self.sample_rate) as before.
---
Nitpick comments:
In `@audio_separator/separator/separator.py`:
- Around line 1287-1288: The code clears self.model_instance after ensemble
separation which makes subsequent calls to separate() fail unless load_model()
is called again; update the separate() method docstring to document that
behavior clearly (mention that self.model_instance is set to None after ensemble
processing), state that callers must call load_model() before calling separate()
again, and give a brief rationale or usage example referencing the separate()
and load_model() methods and the model_instance attribute so users are not
surprised.
- Around line 1165-1180: The loop over original_model_filenames currently calls
load_model and _separate_file without per-model error handling so a single model
error aborts the whole ensemble; wrap the per-model block (load_model, setting
output_dir/temp_dir, calling _separate_file) in a try/except to catch
exceptions, log the error via self.logger.exception or self.logger.error
(include model_filename), and continue to the next model, and use a finally
block to always restore original_output_dir and self.model_instance.output_dir;
also ensure you track which model_stems succeeded (e.g., append to a list) so
downstream ensembling only uses successful outputs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d1b4f626-ece3-480f-93be-3511fdd4ee16
📒 Files selected for processing (2)
audio_separator/separator/separator.pyaudio_separator/utils/cli.py
| try: | ||
| self.logger.debug(f"Attempting to write ensembled audio to {final_output_path}...") | ||
| sf.write(final_output_path, ensembled_wav.T, self.sample_rate) | ||
| except Exception as e: | ||
| self.logger.error(f"Error writing {self.output_format} format: {e}. Falling back to WAV.") | ||
| final_output_path = final_output_path.rsplit(".", 1)[0] + ".wav" | ||
| sf.write(final_output_path, ensembled_wav.T, self.sample_rate) |
There was a problem hiding this comment.
Use logging.exception to capture stack trace on fallback writer error.
The static analysis tool correctly identifies that logging.exception would be more informative here, as it includes the stack trace automatically.
🐛 Proposed fix
try:
self.logger.debug(f"Attempting to write ensembled audio to {final_output_path}...")
sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
- except Exception as e:
- self.logger.error(f"Error writing {self.output_format} format: {e}. Falling back to WAV.")
+ except OSError as e:
+ self.logger.exception(f"Error writing {self.output_format} format. Falling back to WAV.")
final_output_path = final_output_path.rsplit(".", 1)[0] + ".wav"
sf.write(final_output_path, ensembled_wav.T, self.sample_rate)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try: | |
| self.logger.debug(f"Attempting to write ensembled audio to {final_output_path}...") | |
| sf.write(final_output_path, ensembled_wav.T, self.sample_rate) | |
| except Exception as e: | |
| self.logger.error(f"Error writing {self.output_format} format: {e}. Falling back to WAV.") | |
| final_output_path = final_output_path.rsplit(".", 1)[0] + ".wav" | |
| sf.write(final_output_path, ensembled_wav.T, self.sample_rate) | |
| try: | |
| self.logger.debug(f"Attempting to write ensembled audio to {final_output_path}...") | |
| sf.write(final_output_path, ensembled_wav.T, self.sample_rate) | |
| except OSError as e: | |
| self.logger.exception(f"Error writing {self.output_format} format. Falling back to WAV.") | |
| final_output_path = final_output_path.rsplit(".", 1)[0] + ".wav" | |
| sf.write(final_output_path, ensembled_wav.T, self.sample_rate) |
🧰 Tools
🪛 Ruff (0.15.5)
[warning] 1275-1275: Do not catch blind exception: Exception
(BLE001)
[warning] 1276-1276: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@audio_separator/separator/separator.py` around lines 1272 - 1278, In the
except block that catches failures when writing via sf.write (inside the method
using final_output_path, ensembled_wav, self.sample_rate and
self.output_format), replace the current self.logger.error(...) with a call that
logs the exception and stack trace (e.g. self.logger.exception or
self.logger.error(..., exc_info=True)) before performing the fallback to WAV;
keep the same descriptive message, then update final_output_path to the .wav
variant and call sf.write(final_output_path, ensembled_wav.T, self.sample_rate)
as before.
|
I think every good is thing now |
|
Sorry haven't had a chance to look at this thoroughly yet but my general take is coderabbit is useful and often catches things I would miss, but should still be taken with a pinch of salt and have a little bit of human brainpower applied to decide what to do 😄 - seems you reached a similar conclusion yourself anyway! My first glance at this looks good, not sure I'm wild about changing the CLI ordering though, I feel like listing the models to ensemble as a comma-separated list would be less invasive for folks who already use the CLI. Either way I'll give it a proper test later today and hopefully we can get this over the line - thanks for contributing! |
add ensembler based on other emplemtasions
more then one ensembler method are supported
use as CLI or as a package is supported by just proviting more then one model name
added example in readme.md
added unit test
evry thing has been tested locally
Summary by CodeRabbit
New Features
Documentation
Tests