Lastgenre: Cleanup existing genres #6317

Merged
JOJ0 merged 1 commit into beetbox:master from Nukesor:canonicalize-existing on Mar 4, 2026

Conversation

@Nukesor
Contributor

@Nukesor Nukesor commented Jan 24, 2026

Description

Fixes #6305

Implements a new cleanup_existing config flag. The added documentation should explain its behavior. If anything is unclear, we need to adjust the docs :)

To Do

  • Documentation.
  • Changelog.
  • Tests.

@Nukesor Nukesor requested a review from a team as a code owner January 24, 2026 21:35
Copilot AI review requested due to automatic review settings January 24, 2026 21:35
@Nukesor Nukesor changed the base branch from master to lastgenre_item_fallback January 24, 2026 21:35
@github-actions

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@Nukesor Nukesor changed the base branch from lastgenre_item_fallback to master January 24, 2026 21:36
@Nukesor Nukesor force-pushed the canonicalize-existing branch 2 times, most recently from 4296305 to 64a9c41 Compare January 24, 2026 21:38
Contributor

Copilot AI left a comment

Pull request overview

This PR implements a new canonicalize_existing configuration flag for the lastgenre plugin to enable whitelist canonicalization of existing genres when force: no, canonical: yes, and whitelist: yes are set.

Changes:

  • Adds a new canonicalize_existing boolean configuration option with a default value of False
  • Updates the genre resolution logic to canonicalize existing genres before returning them when the new flag is enabled
  • Adds documentation explaining the new flag's behavior and requirements

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| beetsplug/lastgenre/__init__.py | Adds the canonicalize_existing config option and implements canonicalization logic for existing genres |
| docs/plugins/lastgenre.rst | Documents the new canonicalize_existing configuration option |
| docs/changelog.rst | Adds changelog entry for the new feature |
| test/plugins/test_lastgenre.py | Adds test case to verify canonicalization of existing genres |

Comment thread beetsplug/lastgenre/__init__.py Outdated
Comment thread docs/plugins/lastgenre.rst Outdated
@codecov

codecov bot commented Jan 24, 2026

Codecov Report

❌ Patch coverage is 60.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.39%. Comparing base (c1fa0a6) to head (13fe82f).
⚠️ Report is 2 commits behind head on master.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| beetsplug/lastgenre/__init__.py | 60.00% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6317      +/-   ##
==========================================
- Coverage   69.39%   69.39%   -0.01%     
==========================================
  Files         141      141              
  Lines       18811    18816       +5     
  Branches     3066     3068       +2     
==========================================
+ Hits        13054    13057       +3     
- Misses       5111     5112       +1     
- Partials      646      647       +1     
| Files with missing lines | Coverage Δ |
| --- | --- |
| beetsplug/lastgenre/__init__.py | 70.22% <60.00%> (-0.20%) ⬇️ |

@Nukesor Nukesor changed the title Canonicalize existing [LastGenre] Canonicalize existing genres Jan 24, 2026
@Nukesor Nukesor force-pushed the canonicalize-existing branch 2 times, most recently from 1c99cf1 to c796161 Compare January 29, 2026 19:32
@JOJ0 JOJ0 force-pushed the canonicalize-existing branch from c796161 to c5eb470 Compare January 31, 2026 12:32
@JOJ0
Member

JOJ0 commented Jan 31, 2026

Hm interesting feature and useful, but I'm not sure if this should maybe be called something containing the word "cleanup". It might even include whitelist filtering without canonicalization too.

I see it as a "cleanup existing only" kind of functionality.

And are you sure this can't be included in the force branch somehow? We are moving away from force:no being what one would expect from it. Let's brainstorm some more here, please.

@JOJ0
Member

JOJ0 commented Feb 1, 2026

@Nukesor I adjusted my wording above slightly. I think it was weird. Maybe still is...

@Nukesor
Contributor Author

Nukesor commented Feb 3, 2026

I didn't read your first response, but the edited doesn't seem weird at all :D

> And are you sure this can't be included in the force branch somehow?

The force-branch already canonicalizes local genres (if keep_existing is set), so that case is already covered. Right now, it's just not possible to canonicalize genres in non-force mode.

> I see it as a "cleanup existing only" kind of functionality.

It's more of a "cleanup stuff from lastfm, but please also cleanup local stuff that isn't force-overwritten/merged" :D

Regarding "cleanup", I would be up for it, although I think that canonicalization better conveys what actually happens. If we were to change the wording, we should do it for the whole plugin though.

> We are moving away from force:no being what one would expect from it.

Neat. What would that look like?
The current flag handling is definitely a bit confusing. A different approach to configuration would probably be easier to understand and use, although I'm not sure yet what that would look like. Maybe enum-based configuration, where sensible combinations of behavior are given a dedicated name?
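
As a rough illustration of the enum-based idea, the sketch below gives each flag combination a dedicated name. The mode names, `MODE_FLAGS`, and `flags_for` are hypothetical and not part of lastgenre; only `force`, `keep_existing`, and `cleanup_existing` are options mentioned in this thread, and the mapping between modes and flags is an assumption.

```python
from enum import Enum


class GenreMode(Enum):
    """Hypothetical named modes; none of these exist in lastgenre."""

    KEEP = "keep"            # leave existing genres untouched
    CLEANUP = "cleanup"      # keep existing genres, but clean them up
    COMBINE = "combine"      # merge existing genres with fetched ones
    OVERWRITE = "overwrite"  # replace existing genres with fetched ones


# Assumed mapping from a named mode to the boolean flags discussed here.
MODE_FLAGS = {
    GenreMode.KEEP: {
        "force": False, "keep_existing": False, "cleanup_existing": False,
    },
    GenreMode.CLEANUP: {
        "force": False, "keep_existing": False, "cleanup_existing": True,
    },
    GenreMode.COMBINE: {
        "force": True, "keep_existing": True, "cleanup_existing": False,
    },
    GenreMode.OVERWRITE: {
        "force": True, "keep_existing": False, "cleanup_existing": False,
    },
}


def flags_for(mode_name: str) -> dict[str, bool]:
    """Translate a mode name from the config into individual flags.

    Raises ValueError for an unknown mode name.
    """
    return dict(MODE_FLAGS[GenreMode(mode_name)])
```

With this shape, a user would set a single mode value instead of reasoning about how several booleans interact, at the cost of not being able to express combinations that have no name.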

@JOJ0 JOJ0 changed the title [LastGenre] Canonicalize existing genres Lastgenre: Canonicalize existing genres Feb 8, 2026
@JOJ0
Member

JOJ0 commented Feb 8, 2026

> I didn't read your first response, but the edited doesn't seem weird at all :D

> > And are you sure this can't be included in the force branch somehow?

> The force-branch already canonicalizes local genres (if keep_existing is set), so that case is already covered. Right now, it's just not possible to canonicalize genres in non-force mode.

Yes you are right, my bad, I didn't think it through.

Still: We are introducing an option that allows the user / beets to manipulate something even though they said: DO NOT FORCE. That is why I want to be super careful to make this as obvious and understandable as possible.

> > I see it as a "cleanup existing only" kind of functionality.

> It's more of a "cleanup stuff from lastfm, but please also cleanup local stuff that isn't force-overwritten/merged" :D

You are right

> Regarding "cleanup", I would be up for it, although I think that canonicalization better conveys what actually happens. If we were to change the wording, we should do it for the whole plugin though.

> > We are moving away from force:no being what one would expect from it.

> Neat. What would that look like? The current flag handling is definitely a bit confusing. A different approach to configuration would probably be easier to understand and use, although I'm not sure yet what that would look like. Maybe enum-based configuration, where sensible combinations of behavior are given a dedicated name?

I agree that it could be easier, and we discussed it way back when -k was introduced and the initial force behaviour was fixed. An option like --mode overwrite, --mode combine, and so on was on the table, but I was against it because it felt tedious from a CLI usability point of view. Using beet lastgenre -f -k term or beet lastgenre -f -K seemed more practically usable.

Each way has its strengths though, I agree, but still I don't want to change the usability concept again now. It's established and well documented (I hope!) now.

What I'm trying to say with a cleanup option is something like this:

beet lastgenre -F --cleanup-existing -> whitelist filtering and/or canonicalization (and in the future normalizing of spelling / applying of regex aliases) will take place if configured.

A shortform -c would be available.

No better idea for option naming currently, please help me out! :-)

--help could state something like this:

  -c, --cleanup-existing   even with --no-force provided, existing genres get cleaned up (whitelist, canonicalization, ...)

Maybe make this help text shorter and more concise...
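
For illustration only, the proposed flag could be declared roughly like this. The sketch uses the standard library's `optparse` directly rather than the plugin's actual command setup; `build_parser` is a hypothetical helper, and the `-c` short form and help wording are just the proposal from this comment, not an existing lastgenre option.

```python
from optparse import OptionParser


def build_parser() -> OptionParser:
    """Build a stand-alone parser with the two flags under discussion."""
    parser = OptionParser(prog="beet lastgenre")
    parser.add_option(
        "-F", "--no-force",
        dest="force", action="store_false", default=False,
        help="don't modify existing genres",
    )
    # Proposed option: clean up existing genres even in no-force mode.
    parser.add_option(
        "-c", "--cleanup-existing",
        dest="cleanup_existing", action="store_true", default=False,
        help="clean up existing genres even with --no-force",
    )
    return parser


options, _args = build_parser().parse_args(["-F", "--cleanup-existing"])
```

Here `options.force` stays false while `options.cleanup_existing` becomes true, which is the combination the proposal is about.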

We should not restrict this no-force-cleanup to canonicalization only.

@JOJ0 JOJ0 added lastgenre lastgenre plugin plugin Pull requests that are plugins related labels Feb 11, 2026
@Nukesor
Contributor Author

Nukesor commented Feb 18, 2026

Damn. I started writing an answer and then got sidetracked. Sorry for the long wait.

> beet lastgenre -F --cleanup-existing -> whitelist filtering and/or canonicalization (and in the future normalizing of spelling / applying of regex aliases) will take place if configured.

I see. In that context, cleanup makes a lot more sense. I like the idea where lastgenre is heading! Rather than a "Pull genres from lastfm", it feels more like a general canonicalization/cleanup/completion tool, which is pretty much what I'm using it for :D.

Tbh. now that I know the planned changes, I'm perfectly fine with --cleanup-existing. It'll be a bit confusing in the beginning, but it'll make sense in the long term.

So just to double-check, regarding this PR my todos would be:

  • Rename the config flag
  • Anything else :D?

> We should not restrict this no-force-cleanup to canonicalization only.

What other cleanup would be there to perform? The current impl does _try_resolve_stage, which performs the other cleanup as well, right?

@Nukesor Nukesor force-pushed the canonicalize-existing branch from c5eb470 to 16d4df8 Compare February 18, 2026 16:25
@Nukesor
Contributor Author

Nukesor commented Feb 18, 2026

btw: The CI is complaining about docstring formatting.
But when I run the lint checks on my machine, everything looks perfectly fine:

docstrfmt docs README_kr.rst CONTRIBUTING.rst CODE_OF_CONDUCT.rst README.rst
121 files was checked.

I'm using docstrfmt v2 though, as that's the only thing running with python 3.14
See: #6377

Comment thread beetsplug/lastgenre/__init__.py Outdated
@snejus
Member

snejus commented Feb 18, 2026

> I'm using docstrfmt v2 though, as that's the only thing running with python 3.14

Beets does not support 3.14 yet - use 3.13 instead and install the dependencies with poe install to make sure they are in sync.

@JOJ0 JOJ0 changed the title Lastgenre: Canonicalize existing genres Lastgenre: Cleanup existing genres Feb 18, 2026
Comment thread beetsplug/lastgenre/__init__.py Outdated
Comment thread beetsplug/lastgenre/__init__.py Outdated
Comment thread beetsplug/lastgenre/__init__.py Outdated
Member

@snejus snejus left a comment

Nice work on this! One thought on the condition here - if the user has set cleanup_existing, they've already opted in, so silently doing nothing when whitelist or canonical aren't set seems like a footgun.

_try_resolve_stage already handles whether whitelist filtering or canonicalization applies based on its own config, so wouldn't the following be sufficient?

            if self.config["cleanup_existing"]:
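
To illustrate the point, here is a minimal sketch of gating on `cleanup_existing` alone. `resolve_stage` is a simplified, hypothetical stand-in for `_try_resolve_stage`, and the config dict, whitelist set, and canonical map (including the made-up "drum and bass" to "electronic" mapping) are assumptions for the example, not the plugin's real data structures.

```python
def resolve_stage(genres, config, whitelist, canonical_map):
    """Hypothetical stand-in for _try_resolve_stage.

    Applies canonicalization and whitelist filtering only if the
    corresponding options are enabled, then removes duplicates.
    """
    resolved = []
    for genre in genres:
        if config.get("canonical"):
            genre = canonical_map.get(genre, genre)
        if config.get("whitelist") and genre not in whitelist:
            continue
        if genre not in resolved:
            resolved.append(genre)
    return resolved


def existing_genres(genres, config, whitelist, canonical_map):
    """Sketch of the no-force path for existing genres."""
    if not config.get("force") and config.get("cleanup_existing"):
        # No extra check for whitelist/canonical: the stage decides itself.
        return resolve_stage(genres, config, whitelist, canonical_map)
    # Other branches (e.g. force mode) are out of scope for this sketch.
    return list(genres)


whitelist = {"electronic", "rock"}
canonical_map = {"drum and bass": "electronic"}  # made-up mapping
tags = ["drum and bass", "rock", "misc"]

both_on = {"cleanup_existing": True, "whitelist": True, "canonical": True}
both_off = {"cleanup_existing": True}

cleaned = existing_genres(tags, both_on, whitelist, canonical_map)
untouched = existing_genres(tags, both_off, whitelist, canonical_map)
```

With both features enabled the tags are canonicalized and filtered; with neither enabled the call still goes through the stage and simply returns the tags unchanged, so the outer condition does not need to repeat those checks.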

Comment thread beetsplug/lastgenre/__init__.py Outdated
@Nukesor
Contributor Author

Nukesor commented Feb 21, 2026

@JOJ0 created #6386 :)
I left this MR on top of the other one so the tests pass :)

(Did a bit of an oopsie on the rebase+cherrypick, but the history should be clean again now)

@Nukesor Nukesor force-pushed the canonicalize-existing branch 7 times, most recently from acaf88a to ef58634 Compare February 28, 2026 13:39
@Nukesor
Contributor Author

Nukesor commented Feb 28, 2026

@JOJ0 I rebased onto master and resolved any conflicts with the recent lastgenre merges :)
Anything else to do in here (or the other MR) or is this waiting for #6368 to land (just asking whether I should keep checking/rebasing or wait until you ping me :D )?

@JOJ0
Member

JOJ0 commented Mar 1, 2026

> @JOJ0 I rebased onto master and resolved any conflicts with the recent lastgenre merges :) Anything else to do in here (or the other MR) or is this waiting for #6368 to land (just asking whether I should keep checking/rebasing or wait until you ping me :D )?

Thanks. Nothing to do here. LGTM, but yes, I would like to wait for the test thing. Best if I ping you once that is done so you don't have to keep doing boring rebase work :-) Thanks!!!

snejus added a commit that referenced this pull request Mar 2, 2026
Fixes a bug in the lastgenre plugin, where test state bled into the
following fixtures.

Each plugin has a view to the global persisted beets.config field. As a
result, config variables that aren't explicitly overwritten are
persisted in that global config view.

This commit exposes the lastgenre default config as a static method and
uses that default config to reset the state in between fixture calls.

There were 3 tests that depended on `count: 10` being set on previous
test fixtures, which I adjusted accordingly.

Discovered and discussed in #6317, see
#6317 (comment)
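
The state bleed and the reset described in this commit message can be sketched without beets. Everything here is a stand-in: `GLOBAL_CONFIG`, `SketchPlugin`, and `reset_plugin_config` are hypothetical names, and the default values are only illustrative.

```python
# Stand-in for the globally persisted config that every plugin views.
GLOBAL_CONFIG: dict[str, dict] = {}


class SketchPlugin:
    """Hypothetical plugin whose config is a view into GLOBAL_CONFIG."""

    @staticmethod
    def default_config() -> dict:
        # Exposed as a static method so tests can reset to these values.
        return {"count": 1, "force": False}

    def __init__(self, **overrides):
        view = GLOBAL_CONFIG.setdefault("lastgenre", {})
        # Defaults only fill in keys that are missing, so values written
        # by an earlier instance survive.
        for key, value in self.default_config().items():
            view.setdefault(key, value)
        view.update(overrides)
        self.config = view


def reset_plugin_config() -> None:
    """Restore the shared view to the plugin defaults between fixtures."""
    GLOBAL_CONFIG["lastgenre"] = SketchPlugin.default_config()


SketchPlugin(count=10)                        # first "test" overrides count
leaked = SketchPlugin().config["count"]       # second "test" sees the override
reset_plugin_config()
after_reset = SketchPlugin().config["count"]  # back to the default
```

In this sketch the second instance inherits `count` from the first until the reset runs, which mirrors how tests could silently depend on values set by an earlier fixture.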
@Nukesor Nukesor force-pushed the canonicalize-existing branch from ef58634 to 92b9392 Compare March 3, 2026 08:50
Comment thread docs/changelog.rst Outdated
@Nukesor Nukesor force-pushed the canonicalize-existing branch 2 times, most recently from 49c62b0 to 8124de7 Compare March 3, 2026 18:09
@snejus
Member

snejus commented Mar 3, 2026

@JOJ0 back to you!

@JOJ0
Member

JOJ0 commented Mar 4, 2026

lastgenre: Alves - Episode 05 - 01 - Dark Era Of Amnesia (Resurrection) (2016)  [] $genre
lastgenre: raw last.fm tags: []
lastgenre: existing genres taken into account: ['drum and bass']
lastgenre: Resolved (keep + original, whitelist): ['Drum And Bass']

"keep + original, whitelist" is the log label we get now, right?

When running without --force we usually get "keep any, no-force".

Hmm, let's review what the purpose of this logging label was, as it's "kind of" documented in the docstring of _get_genre:

        A `(genres, label)` pair is returned, where `label` is a string used for
        logging. For example, "keep + artist, whitelist" indicates that existing
        genres were combined with new last.fm genres and whitelist filtering was
        applied, while "artist, any" means only new last.fm genres are included
        and the whitelist feature was disabled.

In the fallback stage we have a pretty similar logic and these are the possible outcomes:

original fallback
keep + original fallback, whitelist
keep + original fallback, any

This is all very confusing to me already. Haha. Hmm, I'm just thinking out loud here, trying to justify that the label we get now, "keep + original, whitelist", is good enough. Or would it make more sense to pass original cleanup to _try_resolve_stage? That way we would get something like this:

keep + original cleanup, whitelist

which is also kind of weird because it says: we keep original genres and combine them with original cleaned up genres.

Maybe the keep + logic in try_resolve_genre is just too dumb to clearly help create a log label that makes sense for this case.
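
The label pattern discussed here can be sketched as a small helper. `build_label` is a hypothetical function written for this illustration, not the plugin's code; it only reproduces the "keep + <stage>, whitelist/any" shape quoted from the docstring.

```python
def build_label(stage: str, keep_existing: bool, whitelist: bool) -> str:
    """Compose a log label in the "keep + <stage>, whitelist|any" style."""
    prefix = "keep + " if keep_existing else ""
    suffix = "whitelist" if whitelist else "any"
    return f"{prefix}{stage}, {suffix}"


current = build_label("original", keep_existing=True, whitelist=True)
candidate = build_label("original cleanup", keep_existing=True, whitelist=True)
fetched_only = build_label("artist", keep_existing=False, whitelist=False)
```

Because the prefix is derived purely from whether existing genres were kept, any stage name passed in gets "keep + " prepended, which is why a stage called "original cleanup" reads as if originals were combined with cleaned-up originals. Labels such as "keep any, no-force" do not fit this pattern and would need separate handling.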

Help me think here @Nukesor and sorry for another "brainstorming" delay! We are almost done here! Thanks for your patience!

One thing I like very much about how this feature turned out: in the beginning I thought we should have this option as --cleanup-existing too, to be able to use it on a per-run basis. Now I think it is good that we make it a "once and for all" decision, so it doesn't further complicate the already rather complex combinations of UI options! And I suppose deciding this once will suffice for most users.

@Nukesor
Contributor Author

Nukesor commented Mar 4, 2026

Haha, yep. I was pretty confused when picking the label as well 😅. It's a bit obscure.

> which is also kind of weird because it says: we keep original genres and combine them with original cleaned up genres.

Maybe a keep + cleanup, whitelist? IMO, that conveys the meaning pretty well: keep (keep the originals) + cleanup (but clean them up), whitelist (and apply whitelist logic).
I changed it to "cleanup" for now :)

> One thing I like very much about how this feature turned out: in the beginning I thought we should have this option as --cleanup-existing too, to be able to use it on a per-run basis. Now I think it is good that we make it a "once and for all" decision, so it doesn't further complicate the already rather complex combinations of UI options! And I suppose deciding this once will suffice for most users.

Yep, I agree :D

@Nukesor Nukesor force-pushed the canonicalize-existing branch 3 times, most recently from 0b25c83 to f5e89e5 Compare March 4, 2026 15:22
Introduce a new lastgenre `cleanup_existing` flag.

It handles the case where canonicalization is desired on existing tags.
The new logic triggers if:
- `force`: False
- `cleanup_existing`: True

Depending on whether `whitelist: True` or `canonical: True`, the genres
are then canonicalized and/or whitelisting is applied.
@JOJ0 JOJ0 force-pushed the canonicalize-existing branch from f5e89e5 to 13fe82f Compare March 4, 2026 20:01
@JOJ0 JOJ0 enabled auto-merge March 4, 2026 20:04
@JOJ0 JOJ0 merged commit edbf737 into beetbox:master Mar 4, 2026
14 checks passed
@JOJ0 JOJ0 removed the plugin Pull requests that are plugins related label Apr 6, 2026
Labels

lastgenre lastgenre plugin

Development

Successfully merging this pull request may close these issues.

Allow canonicalization of existing genres

5 participants