From 66e5819177c4d0604861676d492312dce588bff6 Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Thu, 28 May 2026 17:11:44 +0800 Subject: [PATCH 01/17] docs: update scraper subsystem documentation --- docs/scraper.md | 160 ++++++++++++++++++++++++++++++------------------ 1 file changed, 101 insertions(+), 59 deletions(-) diff --git a/docs/scraper.md b/docs/scraper.md index 6bb99c01..e95898d5 100644 --- a/docs/scraper.md +++ b/docs/scraper.md @@ -1,102 +1,102 @@ # Scraper Subsystem -The scraper subsystem enriches existing MediaDB records with metadata from external sources. The filesystem scanner owns record creation; scrapers only update records that already exist. +The scraper subsystem enriches existing MediaDB records with metadata from external sources. The filesystem scanner owns record creation; scrapers update records that already exist. -The first scraper implementation is `gamelist.xml`, which imports EmulationStation metadata such as developer, publisher, genre, rating, player count, descriptions, artwork paths, videos, manuals, and ScreenScraper game IDs. +The only current scraper implementation is `gamelist.xml`, which imports EmulationStation metadata such as developer, publisher, genre, rating, player count, descriptions, artwork paths, videos, manuals, and ScreenScraper game IDs. ## Code Layout | Path | Purpose | |---|---| -| `pkg/database/scraper/` | Public scraper interface, shared types, and generic run loop | -| `pkg/database/scraper/gamelistxml/` | EmulationStation `gamelist.xml` implementation | -| `pkg/database/mediadb/sql_scraper.go` | MediaDB tag/property reads and writes used by scrapers | -| `pkg/api/methods/media_scrape.go` | JSON-RPC scrape start/cancel handlers and scraper listing | -| `pkg/api/methods/media_image.go` | JSON-RPC image lookup from scraped properties | +| `pkg/database/scraper/` | Shared scrape types (`ScrapeOptions`, `ScrapeUpdate`), sentinel helper, and small channel startup helper | +| `pkg/database/scraper/gamelistxml/` | EmulationStation `gamelist.xml` scraper loop, matcher, mapper, and companion-entry handling | +| `pkg/platforms/*` | Platform scraper registration through `Platform.Scrapers` | +| `pkg/database/mediadb/sql_scraper.go` | MediaDB scraper read/write helpers, property/blob helpers, and metadata graph queries | +| `pkg/api/methods/media_scrape.go` | JSON-RPC scrape start/status/cancel/resume handlers and scraper listing | | `pkg/api/methods/media_meta.go` | JSON-RPC metadata graph lookup for media rows | +| `pkg/api/methods/media_image.go` | JSON-RPC image lookup from scraped properties | -## Interfaces +## Registration And API Lifecycle -All metadata scrapers implement `scraper.Scraper`: +Platforms expose available scrapers with: ```go -type Scraper interface { - ID() string - Name() string - SupportedSystems() []string - Scrape(ctx context.Context, opts ScrapeOptions) (<-chan ScrapeUpdate, error) -} +Scrapers(*config.Instance) map[string]platforms.Scraper ``` -`ID` is stable API and is also used in sentinel tag names. `SupportedSystems` returns an empty slice when the scraper supports all systems. - -Concrete scrapers plug into the shared loop by implementing `ScraperLoop[T]`: +`platforms.Scraper` carries `ID`, `Name`, `SupportedSystemIDs`, optional `CustomOpts`, and a `Scrape` callback. The callback receives context, config, platform, filesystem, database, shared scrape options, custom options, and an update channel. -```go -type ScraperLoop[T any] interface { - ID() string - LoadRecords(ctx context.Context, system ScrapeSystem) ([]T, error) - Match(ctx context.Context, record T, system ScrapeSystem, db database.MediaDBI) (*MatchResult, error) - MapToDB(record T) MapResult -} -``` +`media.scrape` looks up the requested `scraperId` from `env.Platform.Scrapers(env.Config)`, rejects the request if media indexing or another scrape is active, creates an app-scoped cancelable context, starts the scraper in the background, tracks it as a MediaDB background operation, and publishes `media.scraping` notifications. -The API layer registers scraper instances in `RequestEnv.Scrapers`. `media.scrape` looks up the scraper by `scraperId` and runs it as a background operation. +`media.scrape.status` returns the latest in-memory status snapshot plus a fresh scraped-count query. `media.scrape.cancel` cancels the active scrape context. `media.scrape.resume` resumes the shared scrape pauser. Scraping and indexing are mutually exclusive. ## Run Loop -`RunScraper` in `pkg/database/scraper/run.go` is generic over the source record type. The caller must pass resolved systems with `ID`, `DBID`, and `ROMPaths` populated. +There is no generic source-record scrape loop. `pkg/database/scraper/run.go` only provides a small helper for wrapping callback/channel startup. The `gamelist.xml` implementation owns its loop in `GamelistXMLScraper.scrapeLoop`. -For each system, the loop: +For each system, the normal loop: -1. Emits an initial update with an unknown total. -2. Loads source records with `LoadRecords`. -3. Emits the total record count. -4. Matches each source record to an existing `Media` and `MediaTitle` row. -5. Skips already-scraped records unless `force` is set. -6. Maps matched records to tag and property writes. -7. Writes media tags, title tags, title properties, and media properties. -8. Writes the scraper sentinel tag last. +1. Resolves target systems from indexed MediaDB systems and platform launcher paths. +2. Runs ZaparooCompanion processing first. This is a special path; see [ZaparooCompanion Entries](#zaparoocompanion-entries). +3. Builds candidate titles: + - `force=true`: all titles for the system. + - `force=false`: titles without sentinel tag `scraper.gamelist.xml:scraped`. +4. Loads `gamelist.xml` from each ROM root. +5. Resolves each `` path under its ROM root and derives a scanner-compatible title slug, using `` as provided name when present. +6. Matches the slug to an existing `MediaTitle` row. +7. Chooses the first Media DBID for the matched title as the sentinel write target. +8. Maps XML fields to title-level tags and properties. +9. Writes metadata through `MediaDB.ApplyScrapeResult`. +10. Writes the scraper sentinel tag last inside the same transaction. +11. Emits progress updates and a final done update. -The sentinel tag format is `scraper.:scraped`, for example `scraper.gamelist.xml:scraped`. Writing it last is intentional: if a run fails halfway through a record, the missing sentinel leaves that record eligible for retry on the next run. +The sentinel tag format is `scraper.:scraped`, for example `scraper.gamelist.xml:scraped`. Writing it last is intentional: if a normal record write fails, the transaction rolls back and the missing sentinel leaves that title eligible for retry. -`Match` failures and write failures are non-fatal per-record errors. They increment `Skipped`, emit `Err`, and the loop continues. `LoadRecords` failures are fatal unless they are caused by context cancellation. The channel always closes after a terminal `Done` or `FatalErr` update. +Per-record write failures are non-fatal: they increment `Skipped`, emit `Err`, and continue. Fatal setup/load/database errors end the run with a terminal update unless caused by context cancellation. ## Tags And Properties -Scrapers write normal MediaDB tags and properties: +The DB supports tags/properties at both media and title scope, but the normal `gamelist.xml` scraper currently writes title metadata only. -| Storage | Scope | Examples | +| Storage | Scope | Current normal `gamelist.xml` use | |---|---|---| -| `MediaTags` | ROM-level variant metadata | language, region, scraper sentinel | -| `MediaTitleTags` | Title-level shared metadata | developer, publisher, year, rating, genre, players | +| `MediaTags` | ROM-level variant metadata | Sentinel tag only for normal entries; companion child `region`/`lang` only | +| `MediaTitleTags` | Title-level shared metadata | developer, publisher, year, rating, genre, players, arcadeboard, gamefamily | | `MediaTitleProperties` | Title-level static content | description, artwork paths, video path, manual path, XML game ID | -| `MediaProperties` | ROM-level static content | Supported by the loop, but not currently written by `gamelist.xml` | +| `MediaProperties` | ROM-level static content | Supported by DB helpers, not currently written by normal `gamelist.xml` entries | -Tag exclusivity is controlled by the canonical `TagTypes.IsExclusive` flag. Exclusive tag types replace existing values for that type; additive tag types accumulate distinct values. The database does not enforce this directly. The scraper write path groups tags by type and applies the correct behavior in `upsertTags`. +Tag exclusivity is controlled by `TagTypes.IsExclusive`. Exclusive types replace existing values for that type; additive types accumulate distinct values. The scraper write path groups tags by type and applies that behavior in `upsertTags`. -Property rows are keyed by entity and property type tag. Re-scraping the same property type updates the row in place and preserves the row DBID. +Property rows are keyed by entity and property type tag. Re-scraping the same property type updates the row in place and preserves row DBID. + +Path-backed properties persist their text path and optional `BlobDBID`; the property tables do not persist the `ContentType` computed by the mapper for path values. Blob-backed properties expose content type from `MediaBlobs`. API responses can infer extensions from content type or text path, but path-backed `contentType` may be empty. + +## Title-level Metadata Invariant + +Normal `gamelist.xml` metadata is shared by title. The sentinel is written to one Media row associated with the matched title, and `FindMediaTitlesWithoutSentinel` skips a title if any associated Media row has that sentinel. + +This is safe while the normal scraper writes only `MediaTitleTags` and `MediaTitleProperties`. Future per-ROM metadata imports need a different sentinel strategy, such as tagging every media row or splitting title-level and media-level sentinel tags. ## gamelist.xml Behavior -`GamelistXMLScraper` scans each system ROM root for `gamelist.xml`. Each `` entry is resolved to an absolute path and matched against existing media with `FindMediaBySystemAndPathFold`. +`GamelistXMLScraper` scans each system ROM root for `gamelist.xml`. Regular `` entries are resolved to absolute paths under the system ROM root, converted to scanner-compatible slugs, and matched to existing `MediaTitle` rows. Scrapers do not create `Media` or `MediaTitle` rows. Path handling: | Input | Behavior | |---|---| | `./relative` or `relative` | Resolved under the system ROM root and rejected if it escapes that root | -| `~/...` | Resolved under the current user's home directory | -| Absolute path | Cleaned and used as-is | +| `~/...` | Resolved under the current user's home directory, then rejected unless still under the system ROM root | +| Absolute path | Cleaned and rejected unless under the system ROM root | -Source fields are cleaned before mapping: HTML entities are unescaped, control whitespace is collapsed, and surrounding whitespace is trimmed. +Source fields are cleaned before mapping: HTML entities are unescaped, tab/newline/carriage-return characters become spaces, and surrounding whitespace is trimmed. ### Field Mapping +Regular ES `lang` and `region` fields are not currently imported for normal game entries. + | ES field | Destination | Notes | |---|---|---| -| `lang` | `MediaTags: lang` | CSV split, lowercased, additive | -| `region` | `MediaTags: region` | CSV split, lowercased, additive | | `developer` | `MediaTitleTags: developer` | Exclusive | | `publisher` | `MediaTitleTags: publisher` | Exclusive | | `releasedate` | `MediaTitleTags: year` | First four characters when present | @@ -109,17 +109,46 @@ Source fields are cleaned before mapping: HTML entities are unescaped, control w | ScreenScraper game ID | `MediaTitleProperties: property:xml-game-id` | From XML attribute or element value | | `image` | `MediaTitleProperties: property:image-image` | XML path or filesystem fallback | | `thumbnail` | `MediaTitleProperties: property:image-thumbnail` | Cover/thumbnail path in most ES forks | +| `boxart2d` | `MediaTitleProperties: property:image-boxart` | XML path or filesystem fallback | +| `boxart3d` | `MediaTitleProperties: property:image-boxart3d` | XML path or filesystem fallback | +| `screenshot` | `MediaTitleProperties: property:image-screenshot` | XML path or filesystem fallback | | `video` | `MediaTitleProperties: property:video` | Filesystem path | | `marquee` | `MediaTitleProperties: property:image-marquee` | XML path or filesystem fallback | -| `wheel` | `MediaTitleProperties: property:image-wheel` | XML path or filesystem fallback | +| `logo` / `wheel` | `MediaTitleProperties: property:image-wheel` | `logo` takes priority over `wheel`; XML path or filesystem fallback | | `fanart` | `MediaTitleProperties: property:image-fanart` | XML path or filesystem fallback | -| `titleshot` | `MediaTitleProperties: property:image-titleshot` | XML path or filesystem fallback | +| `titlescreen` / `titleshot` | `MediaTitleProperties: property:image-titleshot` | `titlescreen` takes priority over `titleshot`; XML path or filesystem fallback | | `map` | `MediaTitleProperties: property:image-map` | XML path or filesystem fallback | | `manual` | `MediaTitleProperties: property:manual` | PDF path | +Filesystem fallback searches known subdirectories under `/media/` for a `.png` file when an XML path is absent. Side/back box art are filesystem-fallback only. + `gamelist.xml` deliberately does not scrape user-state fields such as favorite, hidden, or kidgame. It also does not overwrite filename-parser-owned fields such as disc and track. -For image properties, the scraper first uses the XML path if present. If the XML field is empty, it searches known media subdirectories under `/media/` for a file matching the ROM filename stem. `image-boxart` and `image-screenshot` are filesystem-fallback only because they do not have dedicated EmulationStation XML fields. +## ZaparooCompanion Entries + +`gamelist.xml` has a special path for entries marked with `source="ZaparooCompanion"` as either a `source` attribute or `` element. + +Companion records are split into: + +- Parent entries: have an ID attribute and no path. They carry shared title metadata. +- Child entries: have `parentid` and path. They reference parent metadata. + +Child matching: + +- Paths ending in `.slug` match an existing title by slug. +- Other child paths match media rows by filename suffix with `FindMediaBySystemAndPathSuffix`. + +For matched children, parent metadata is upserted onto the child title. Child `region` and `lang` are upserted to the child Media row as media-level tags. + +Current caveats: + +- Companion processing runs before normal sentinel filtering. +- Companion-only writes do not write scraper sentinel tags. +- Companion writes are not included in scrape progress totals. +- Companion title metadata and child media tags are separate DB writes, not one `ApplyScrapeResult` transaction. +- Filename-suffix matching can update multiple same-basename media rows. + +These caveats document current behavior, not necessarily desired long-term behavior. ## API Surface @@ -129,8 +158,9 @@ JSON-RPC methods: |---|---| | `scrapers` | Lists registered scrapers with ID, name, and supported systems | | `media.scrape` | Starts a scraper run as a background operation | -| `media.scrape.status` | Returns the latest scraper status snapshot | +| `media.scrape.status` | Returns latest in-memory scraper status plus current DB scraped count | | `media.scrape.cancel` | Cancels the active scraper run | +| `media.scrape.resume` | Resumes a paused scraper run | | `media.meta` | Returns tags and metadata-only properties for one or more media rows and their titles | | `media.image` | Returns the best matching image property as base64 data for one media row | | `media.clean.orphans` | Removes missing media rows and orphaned related data | @@ -155,13 +185,25 @@ Progress is queryable with `media.scrape.status` and broadcast as `media.scrapin "total": 100, "matched": 38, "skipped": 4, + "totalScraped": 1000, "scraping": true, - "done": false + "done": false, + "paused": false } ``` +`totalScraped` is derived from scraper sentinel tags in the database, not from the current run's `matched` count. + Only one scraper can run at a time, and scraping is mutually exclusive with media indexing. -`media.meta` returns the full metadata graph for media rows: media-level tags and properties, title-level tags and properties, and the stored system identity. Single requests accept `mediaId` or `system`/`path` and keep the single-response shape; batch requests use `items` and return per-item results. Binary property bytes are not included; clients should use `media.image` for image data. +`media.meta` returns the metadata graph for media rows: media-level tags and properties, title-level tags and properties, and stored system identity. Single requests accept `mediaId` or `system`/`path` and keep the single-response shape; batch requests use `items` and return per-item results. Binary property bytes are not included; clients should use `media.image` for image data. -`media.image` accepts one media ref plus image type preferences such as `image`, `boxart`, `screenshot`, `wheel`, `titleshot`, `map`, `marquee`, and `fanart`. These resolve to canonical image property tags; for example `boxart` becomes `property:image-boxart` and `image` becomes `property:image-image`. Media-level properties are preferred over title-level properties for the same type. Stale file paths are removed automatically and lookup falls through to the next available source. +`media.image` accepts one media ref plus image type preferences such as `image`, `boxart`, `boxart3d`, `screenshot`, `wheel`, `titleshot`, `map`, `marquee`, and `fanart`. These resolve to canonical image property tags; for example `boxart` becomes `property:image-boxart` and `image` becomes `property:image-image`. Media-level properties are preferred over title-level properties for the same type. Stale file path properties are removed automatically and lookup falls through to the next available source. + +## Useful Focused Tests + +```bash +go test ./pkg/database/scraper/... +go test ./pkg/database/mediadb/ -run 'Scrape|Property|Blob|Sentinel|MediaImage' +go test ./pkg/api/methods/ -run 'Scrape|MediaImage|MediaMeta' +``` From 81592bcaff839c045cb2121f5a9527da747cd2db Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Thu, 28 May 2026 17:30:12 +0800 Subject: [PATCH 02/17] fix(scraper): tighten companion entry handling --- docs/scraper.md | 16 +- pkg/database/scraper/gamelistxml/scraper.go | 248 ++++++++++-------- .../scraper/gamelistxml/scraper_test.go | 162 +++++++++--- 3 files changed, 274 insertions(+), 152 deletions(-) diff --git a/docs/scraper.md b/docs/scraper.md index e95898d5..121997a8 100644 --- a/docs/scraper.md +++ b/docs/scraper.md @@ -135,18 +135,18 @@ Companion records are split into: Child matching: -- Paths ending in `.slug` match an existing title by slug. -- Other child paths match media rows by filename suffix with `FindMediaBySystemAndPathSuffix`. +- Paths ending in `.slug` match an existing title by slug, then use the first Media row for that title as the write target. +- Other child paths first try an exact case-insensitive media path lookup. +- If exact lookup fails, the scraper falls back to filename suffix matching with `FindMediaBySystemAndPathSuffix`. +- Ambiguous suffix matches are skipped instead of updating multiple same-basename media rows. -For matched children, parent metadata is upserted onto the child title. Child `region` and `lang` are upserted to the child Media row as media-level tags. +For matched children, parent metadata is written onto the child title, child `region` and `lang` are written to the child Media row as media-level tags, and the scraper sentinel is written to that child Media row. These writes use `ApplyScrapeResult`, so title metadata, child tags, and the sentinel are committed together. Current caveats: -- Companion processing runs before normal sentinel filtering. -- Companion-only writes do not write scraper sentinel tags. -- Companion writes are not included in scrape progress totals. -- Companion title metadata and child media tags are separate DB writes, not one `ApplyScrapeResult` transaction. -- Filename-suffix matching can update multiple same-basename media rows. +- Companion processing still runs before normal title filtering. +- With `force=false`, child media rows that already have the `scraper.gamelist.xml:scraped` sentinel are skipped. +- Companion processed/matched/skipped counts contribute to run counters, but companion entries do not have a separate total in status updates. These caveats document current behavior, not necessarily desired long-term behavior. diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index 520a735e..02ce37c2 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -323,7 +323,10 @@ func (g *GamelistXMLScraper) scrapeLoop( } // Process ZaparooCompanion parent/child entries before regular slug scrape. - g.processCompanionEntries(ctx, system, mdb) + companion := g.processCompanionEntries(ctx, opts, system, mdb) + totalProcessed += companion.Processed + totalMatched += companion.Matched + totalSkipped += companion.Skipped // Build slug → MediaTitle map: only unscraped titles unless Force. titlesBySlug := make(map[string]database.MediaTitle) @@ -941,6 +944,12 @@ type companionChild struct { Lang string } +type companionStats struct { + Processed int + Matched int + Skipped int +} + // loadCompanionEntries scans all gamelist.xml files for the system and separates // entries with source="ZaparooCompanion" into parent records (id attr, no path) and // child records (parentid attr, has path). @@ -1051,13 +1060,14 @@ func (g *GamelistXMLScraper) mapCompanionParentToResult(p *companionParent) scra // to the child Media row. func (g *GamelistXMLScraper) processCompanionEntries( ctx context.Context, + opts scraper.ScrapeOptions, system scraper.ScrapeSystem, mdb database.MediaDBI, -) { +) companionStats { parents, children := g.loadCompanionEntries(ctx, system) if len(parents) == 0 && len(children) == 0 { log.Debug().Msg("gamelistxml: companion entries not found") - return + return companionStats{} } log.Info(). @@ -1078,128 +1088,160 @@ func (g *GamelistXMLScraper) processCompanionEntries( Int("props", len(parentMeta[p.GameID].TitleProps)). Msg("gamelistxml: companion: parent mapped") } + if len(children) == 0 { + return companionStats{} + } + + mediaByTitleDBID, mediaErr := companionMediaByTitle(system.ID, mdb) + if mediaErr != nil { + log.Warn().Err(mediaErr).Str("system", system.ID). + Msg("gamelistxml: companion: failed to load media map, skipping companion entries") + return companionStats{Processed: len(children), Skipped: len(children)} + } + + sentinel := scraper.SentinelTagInfo("gamelist.xml") + sentinelTag := sentinel.Type + ":" + sentinel.Tag + var stats companionStats - // Phase 2: enrich each child's existing MediaTitle with parent metadata. - // Child paths are root-relative (./file.rom) so match by filename suffix. - seenTitles := make(map[int64]bool) for _, c := range children { + stats.Processed++ meta, ok := parentMeta[c.ParentGameID] if !ok { log.Debug().Str("parentGameID", c.ParentGameID). Msg("gamelistxml: companion: parent not found for child, skipping") + stats.Skipped++ continue } - filename := filepath.Base(c.ResolvedPath) - - if filepath.Ext(filename) == ".slug" { - slug := strings.TrimSuffix(filename, ".slug") - title, titleErr := mdb.FindMediaTitleBySystemAndSlug(ctx, system.DBID, slug) - if titleErr != nil { - log.Debug().Err(titleErr).Str("slug", slug). - Msg("gamelistxml: companion: error looking up child title by slug") - continue - } - if title == nil { - log.Debug().Str("slug", slug). - Msg("gamelistxml: companion: no indexed title found for child slug, skipping") - continue - } - if !seenTitles[title.DBID] { - titleSuccess := true - if len(meta.TitleTags) > 0 { - if tagsErr := mdb.UpsertMediaTitleTags(ctx, title.DBID, meta.TitleTags); tagsErr != nil { - log.Warn().Err(tagsErr).Int64("mediaTitleDBID", title.DBID). - Msg("gamelistxml: companion: upsert parent tags on slug-matched title failed") - titleSuccess = false - } - } - if len(meta.TitleProps) > 0 { - if propsErr := mdb.UpsertMediaTitleProperties(ctx, title.DBID, meta.TitleProps); propsErr != nil { - log.Warn().Err(propsErr).Int64("mediaTitleDBID", title.DBID). - Msg("gamelistxml: companion: upsert parent props on slug-matched title failed") - titleSuccess = false - } - } - if titleSuccess { - seenTitles[title.DBID] = true - } - } else { - log.Debug().Int64("mediaTitleDBID", title.DBID). - Msg("gamelistxml: companion: slug-matched title already enriched, skipping") - } - continue - } - - matched, mediaErr := mdb.FindMediaBySystemAndPathSuffix(ctx, system.DBID, filename) - if mediaErr != nil { - log.Debug().Err(mediaErr).Str("filename", filename). - Msg("gamelistxml: companion: error looking up child by filename") - continue - } + matched := g.matchCompanionChildMedia(ctx, system, c, mediaByTitleDBID, mdb) if len(matched) == 0 { - log.Debug().Str("filename", filename). - Msg("gamelistxml: companion: no indexed media found for child filename, skipping") + stats.Skipped++ continue } - log.Debug(). - Str("filename", filename). - Int("matches", len(matched)). - Msg("gamelistxml: companion: child filename matches") - - var childTags []database.TagInfo - if c.Region != "" { - childTags = append(childTags, database.TagInfo{Type: string(tags.TagTypeRegion), Tag: c.Region}) - } - if c.Lang != "" { - childTags = append(childTags, database.TagInfo{Type: string(tags.TagTypeLang), Tag: c.Lang}) - } for _, media := range matched { - log.Debug(). - Str("path", media.Path). - Int64("mediaDBID", media.DBID). - Int64("mediaTitleDBID", media.MediaTitleDBID). - Msg("gamelistxml: companion: enriching child media title with parent metadata") - - if !seenTitles[media.MediaTitleDBID] { - titleSuccess := true - if len(meta.TitleTags) > 0 { - if tagsErr := mdb.UpsertMediaTitleTags(ctx, media.MediaTitleDBID, meta.TitleTags); tagsErr != nil { - log.Warn().Err(tagsErr).Int64("mediaTitleDBID", media.MediaTitleDBID). - Msg("gamelistxml: companion: upsert parent tags on child title failed") - titleSuccess = false - } - } - if len(meta.TitleProps) > 0 { - propsErr := mdb.UpsertMediaTitleProperties(ctx, media.MediaTitleDBID, meta.TitleProps) - if propsErr != nil { - log.Warn().Err(propsErr).Int64("mediaTitleDBID", media.MediaTitleDBID). - Msg("gamelistxml: companion: upsert parent props on child title failed") - titleSuccess = false - } + if !opts.Force { + scraped, tagErr := mdb.MediaHasTag(ctx, media.DBID, sentinelTag) + if tagErr != nil { + log.Warn().Err(tagErr).Int64("mediaDBID", media.DBID). + Msg("gamelistxml: companion: sentinel check failed, skipping child media") + stats.Skipped++ + continue } - if titleSuccess { - seenTitles[media.MediaTitleDBID] = true + if scraped { + log.Debug().Int64("mediaDBID", media.DBID). + Msg("gamelistxml: companion: child media already scraped, skipping") + stats.Skipped++ + continue } - } else { - log.Debug().Int64("mediaTitleDBID", media.MediaTitleDBID). - Msg("gamelistxml: companion: title already enriched, skipping") } - if len(childTags) > 0 { - log.Debug(). + writeErr := mdb.ApplyScrapeResult(ctx, media.DBID, media.MediaTitleDBID, &database.ScrapeWrite{ + Sentinel: sentinel, + MediaTags: companionChildTags(c), + TitleTags: meta.TitleTags, + TitleProps: meta.TitleProps, + }) + if writeErr != nil { + log.Warn().Err(writeErr). Int64("mediaDBID", media.DBID). - Str("region", c.Region). - Str("lang", c.Lang). - Msg("gamelistxml: companion: writing child region/lang tags") - if tagsErr := mdb.UpsertMediaTags(ctx, media.DBID, childTags); tagsErr != nil { - log.Warn().Err(tagsErr).Int64("mediaDBID", media.DBID). - Msg("gamelistxml: companion: upsert child tags failed") - } + Int64("mediaTitleDBID", media.MediaTitleDBID). + Msg("gamelistxml: companion: write failed") + stats.Skipped++ + continue + } + stats.Matched++ + } + } + return stats +} + +func companionMediaByTitle(systemID string, mdb database.MediaDBI) (map[int64]database.Media, error) { + allMedia, err := mdb.GetMediaBySystemID(systemID) + if err != nil { + return nil, err + } + mediaByTitleDBID := make(map[int64]database.Media, len(allMedia)) + for _, m := range allMedia { + if _, exists := mediaByTitleDBID[m.MediaTitleDBID]; !exists { + mediaByTitleDBID[m.MediaTitleDBID] = database.Media{ + DBID: m.DBID, + MediaTitleDBID: m.MediaTitleDBID, + Path: m.Path, } } } + return mediaByTitleDBID, nil +} + +func (g *GamelistXMLScraper) matchCompanionChildMedia( + ctx context.Context, + system scraper.ScrapeSystem, + child companionChild, + mediaByTitleDBID map[int64]database.Media, + mdb database.MediaDBI, +) []database.Media { + filename := filepath.Base(child.ResolvedPath) + if filepath.Ext(filename) == ".slug" { + slug := strings.TrimSuffix(filename, ".slug") + title, titleErr := mdb.FindMediaTitleBySystemAndSlug(ctx, system.DBID, slug) + if titleErr != nil { + log.Debug().Err(titleErr).Str("slug", slug). + Msg("gamelistxml: companion: error looking up child title by slug") + return nil + } + if title == nil { + log.Debug().Str("slug", slug). + Msg("gamelistxml: companion: no indexed title found for child slug, skipping") + return nil + } + media, ok := mediaByTitleDBID[title.DBID] + if !ok || media.DBID == 0 { + log.Debug().Int64("mediaTitleDBID", title.DBID). + Msg("gamelistxml: companion: no media row for slug-matched title, skipping") + return nil + } + return []database.Media{media} + } + + resolvedPath := filepath.ToSlash(child.ResolvedPath) + exact, exactErr := mdb.FindMediaBySystemAndPathFold(ctx, system.DBID, resolvedPath) + if exactErr != nil { + log.Debug().Err(exactErr).Str("path", resolvedPath). + Msg("gamelistxml: companion: exact child path lookup failed") + } + if exact != nil { + return []database.Media{*exact} + } + + matched, mediaErr := mdb.FindMediaBySystemAndPathSuffix(ctx, system.DBID, filename) + if mediaErr != nil { + log.Debug().Err(mediaErr).Str("filename", filename). + Msg("gamelistxml: companion: error looking up child by filename") + return nil + } + if len(matched) == 0 { + log.Debug().Str("filename", filename). + Msg("gamelistxml: companion: no indexed media found for child filename, skipping") + return nil + } + if len(matched) > 1 { + log.Warn().Str("filename", filename).Int("matches", len(matched)). + Msg("gamelistxml: companion: ambiguous child filename matches, skipping") + return nil + } + return matched +} + +func companionChildTags(c companionChild) []database.TagInfo { + var childTags []database.TagInfo + if c.Region != "" { + childTags = append(childTags, database.TagInfo{Type: string(tags.TagTypeRegion), Tag: c.Region}) + } + if c.Lang != "" { + childTags = append(childTags, database.TagInfo{Type: string(tags.TagTypeLang), Tag: c.Lang}) + } + return childTags } // mimeFromExt returns a MIME type based on file extension. diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index b9ad2dbb..00bb91a0 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -1479,6 +1479,14 @@ func TestLoadCompanionEntries_ChildPathTraversalRejected(t *testing.T) { // --- processCompanionEntries --- +func companionWriteMatcher( + mediaTags []database.TagInfo, +) any { + return mock.MatchedBy(func(w *database.ScrapeWrite) bool { + return w != nil && w.Sentinel == scraper.SentinelTagInfo("gamelist.xml") && assert.ObjectsAreEqual(mediaTags, w.MediaTags) + }) +} + func TestProcessCompanionEntries_NoEntries(t *testing.T) { t.Parallel() root := t.TempDir() @@ -1490,26 +1498,32 @@ func TestProcessCompanionEntries_NoEntries(t *testing.T) { mockDB := helpers.NewMockMediaDBI() s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} - s.processCompanionEntries(context.Background(), system, mockDB) - // No companion entries → no DB calls. + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{}, stats) mockDB.AssertExpectations(t) } -func TestProcessCompanionEntries_ChildByFilename(t *testing.T) { +func TestProcessCompanionEntries_ChildByExactPath(t *testing.T) { t.Parallel() root := t.TempDir() require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(companionXML), 0o600)) + resolvedPath := filepath.ToSlash(filepath.Join(root, "child.rom")) + childTags := []database.TagInfo{ + {Type: string(tags.TagTypeRegion), Tag: "usa"}, + {Type: string(tags.TagTypeLang), Tag: "en"}, + } mockDB := helpers.NewMockMediaDBI() - mockDB.On("FindMediaBySystemAndPathSuffix", mock.Anything, int64(1), "child.rom"). - Return([]database.Media{{DBID: 10, MediaTitleDBID: 20}}, nil) - mockDB.On("UpsertMediaTitleTags", mock.Anything, int64(20), mock.Anything).Return(nil) - mockDB.On("UpsertMediaTitleProperties", mock.Anything, int64(20), mock.Anything).Return(nil) - mockDB.On("UpsertMediaTags", mock.Anything, int64(10), mock.Anything).Return(nil) + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), resolvedPath). + Return(&database.Media{DBID: 10, MediaTitleDBID: 20}, nil) + mockDB.On("MediaHasTag", mock.Anything, int64(10), "scraper.gamelist.xml:scraped").Return(false, nil) + mockDB.On("ApplyScrapeResult", mock.Anything, int64(10), int64(20), companionWriteMatcher(childTags)).Return(nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} - s.processCompanionEntries(context.Background(), system, mockDB) + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{Processed: 1, Matched: 1}, stats) mockDB.AssertExpectations(t) } @@ -1528,14 +1542,53 @@ func TestProcessCompanionEntries_ChildBySlugFile(t *testing.T) { title := &database.MediaTitle{DBID: 30} mockDB := helpers.NewMockMediaDBI() + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{{DBID: 40, MediaTitleDBID: 30}}, nil) mockDB.On("FindMediaTitleBySystemAndSlug", mock.Anything, int64(5), "myslug").Return(title, nil) - mockDB.On("UpsertMediaTitleTags", mock.Anything, int64(30), mock.Anything).Return(nil) - mockDB.On("UpsertMediaTitleProperties", mock.Anything, int64(30), mock.Anything).Return(nil) - // No region/lang on child → UpsertMediaTags not called. + mockDB.On("MediaHasTag", mock.Anything, int64(40), "scraper.gamelist.xml:scraped").Return(false, nil) + mockDB.On("ApplyScrapeResult", mock.Anything, int64(40), int64(30), companionWriteMatcher(nil)).Return(nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 5} - s.processCompanionEntries(context.Background(), system, mockDB) + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{Processed: 1, Matched: 1}, stats) + mockDB.AssertExpectations(t) +} + +func TestProcessCompanionEntries_SkipsAlreadyScrapedUnlessForce(t *testing.T) { + t.Parallel() + root := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(companionXML), 0o600)) + + resolvedPath := filepath.ToSlash(filepath.Join(root, "child.rom")) + mockDB := helpers.NewMockMediaDBI() + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), resolvedPath). + Return(&database.Media{DBID: 10, MediaTitleDBID: 20}, nil) + mockDB.On("MediaHasTag", mock.Anything, int64(10), "scraper.gamelist.xml:scraped").Return(true, nil) + + s := &GamelistXMLScraper{db: mockDB} + system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{Processed: 1, Skipped: 1}, stats) + mockDB.AssertExpectations(t) +} + +func TestProcessCompanionEntries_ForceRewritesAlreadyScraped(t *testing.T) { + t.Parallel() + root := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(companionXML), 0o600)) + + resolvedPath := filepath.ToSlash(filepath.Join(root, "child.rom")) + mockDB := helpers.NewMockMediaDBI() + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), resolvedPath). + Return(&database.Media{DBID: 10, MediaTitleDBID: 20}, nil) + mockDB.On("ApplyScrapeResult", mock.Anything, int64(10), int64(20), mock.Anything).Return(nil) + + s := &GamelistXMLScraper{db: mockDB} + system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{Force: true}, system, mockDB) + assert.Equal(t, companionStats{Processed: 1, Matched: 1}, stats) mockDB.AssertExpectations(t) } @@ -1553,20 +1606,20 @@ func TestProcessCompanionEntries_SlugNotIndexed(t *testing.T) { `), 0o600)) mockDB := helpers.NewMockMediaDBI() - // FindMediaTitleBySystemAndSlug returns nil → child skipped. + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) mockDB.On("FindMediaTitleBySystemAndSlug", mock.Anything, int64(5), "missing"). Return((*database.MediaTitle)(nil), nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 5} - s.processCompanionEntries(context.Background(), system, mockDB) + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{Processed: 1, Skipped: 1}, stats) mockDB.AssertExpectations(t) } func TestProcessCompanionEntries_ParentNotFoundForChild(t *testing.T) { t.Parallel() root := t.TempDir() - // Child references parentid "99" but no parent entry exists. require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` ./child.rom @@ -1574,17 +1627,17 @@ func TestProcessCompanionEntries_ParentNotFoundForChild(t *testing.T) { `), 0o600)) mockDB := helpers.NewMockMediaDBI() - // Parent not found → child skipped → no DB calls. + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} - s.processCompanionEntries(context.Background(), system, mockDB) + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{Processed: 1, Skipped: 1}, stats) mockDB.AssertExpectations(t) } -func TestProcessCompanionEntries_SeenTitleDedup(t *testing.T) { +func TestProcessCompanionEntries_EachChildUsesAtomicScrapeWrite(t *testing.T) { t.Parallel() root := t.TempDir() - // Two children from same parent; both map to MediaTitleDBID=20. require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` Test Game @@ -1602,28 +1655,29 @@ func TestProcessCompanionEntries_SeenTitleDedup(t *testing.T) { `), 0o600)) + child1Path := filepath.ToSlash(filepath.Join(root, "child1.rom")) + child2Path := filepath.ToSlash(filepath.Join(root, "child2.rom")) mockDB := helpers.NewMockMediaDBI() - mockDB.On("FindMediaBySystemAndPathSuffix", mock.Anything, int64(1), "child1.rom"). - Return([]database.Media{{DBID: 10, MediaTitleDBID: 20}}, nil) - mockDB.On("FindMediaBySystemAndPathSuffix", mock.Anything, int64(1), "child2.rom"). - Return([]database.Media{{DBID: 11, MediaTitleDBID: 20}}, nil) - // Title-level tags and props written only once (seenTitles dedup). - mockDB.On("UpsertMediaTitleTags", mock.Anything, int64(20), mock.Anything).Return(nil).Once() - mockDB.On("UpsertMediaTitleProperties", mock.Anything, int64(20), mock.Anything).Return(nil).Once() - // Media-level (region/lang) tags written for each child Media row. - mockDB.On("UpsertMediaTags", mock.Anything, int64(10), mock.Anything).Return(nil) - mockDB.On("UpsertMediaTags", mock.Anything, int64(11), mock.Anything).Return(nil) + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), child1Path). + Return(&database.Media{DBID: 10, MediaTitleDBID: 20}, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), child2Path). + Return(&database.Media{DBID: 11, MediaTitleDBID: 20}, nil) + mockDB.On("MediaHasTag", mock.Anything, int64(10), "scraper.gamelist.xml:scraped").Return(false, nil) + mockDB.On("MediaHasTag", mock.Anything, int64(11), "scraper.gamelist.xml:scraped").Return(false, nil) + mockDB.On("ApplyScrapeResult", mock.Anything, int64(10), int64(20), mock.Anything).Return(nil) + mockDB.On("ApplyScrapeResult", mock.Anything, int64(11), int64(20), mock.Anything).Return(nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} - s.processCompanionEntries(context.Background(), system, mockDB) + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{Processed: 2, Matched: 2}, stats) mockDB.AssertExpectations(t) } -func TestProcessCompanionEntries_NoRegionLangNoChildTags(t *testing.T) { +func TestProcessCompanionEntries_NoRegionLangStillWritesSentinel(t *testing.T) { t.Parallel() root := t.TempDir() - // Child has no region or lang → childTags empty → UpsertMediaTags NOT called. require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` Game @@ -1634,16 +1688,38 @@ func TestProcessCompanionEntries_NoRegionLangNoChildTags(t *testing.T) { `), 0o600)) + gamePath := filepath.ToSlash(filepath.Join(root, "game.rom")) mockDB := helpers.NewMockMediaDBI() - mockDB.On("FindMediaBySystemAndPathSuffix", mock.Anything, int64(1), "game.rom"). - Return([]database.Media{{DBID: 5, MediaTitleDBID: 6}}, nil) - mockDB.On("UpsertMediaTitleTags", mock.Anything, int64(6), mock.Anything).Return(nil) - mockDB.On("UpsertMediaTitleProperties", mock.Anything, int64(6), mock.Anything).Return(nil) - // UpsertMediaTags must NOT be called (no region or lang). + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), gamePath). + Return(&database.Media{DBID: 5, MediaTitleDBID: 6}, nil) + mockDB.On("MediaHasTag", mock.Anything, int64(5), "scraper.gamelist.xml:scraped").Return(false, nil) + mockDB.On("ApplyScrapeResult", mock.Anything, int64(5), int64(6), companionWriteMatcher(nil)).Return(nil) + + s := &GamelistXMLScraper{db: mockDB} + system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{Processed: 1, Matched: 1}, stats) + mockDB.AssertExpectations(t) +} + +func TestProcessCompanionEntries_AmbiguousSuffixMatchSkipped(t *testing.T) { + t.Parallel() + root := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(companionXML), 0o600)) + + resolvedPath := filepath.ToSlash(filepath.Join(root, "child.rom")) + mockDB := helpers.NewMockMediaDBI() + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), resolvedPath). + Return((*database.Media)(nil), nil) + mockDB.On("FindMediaBySystemAndPathSuffix", mock.Anything, int64(1), "child.rom"). + Return([]database.Media{{DBID: 10, MediaTitleDBID: 20}, {DBID: 11, MediaTitleDBID: 21}}, nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} - s.processCompanionEntries(context.Background(), system, mockDB) + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{Processed: 1, Skipped: 1}, stats) mockDB.AssertExpectations(t) } @@ -1652,14 +1728,18 @@ func TestProcessCompanionEntries_FilenameNotIndexed(t *testing.T) { root := t.TempDir() require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(companionXML), 0o600)) + resolvedPath := filepath.ToSlash(filepath.Join(root, "child.rom")) mockDB := helpers.NewMockMediaDBI() - // FindMediaBySystemAndPathSuffix returns empty → child silently skipped. + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), resolvedPath). + Return((*database.Media)(nil), nil) mockDB.On("FindMediaBySystemAndPathSuffix", mock.Anything, int64(1), "child.rom"). Return([]database.Media{}, nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} - s.processCompanionEntries(context.Background(), system, mockDB) + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{Processed: 1, Skipped: 1}, stats) mockDB.AssertExpectations(t) } From 0983a5d929faba9f13db5b8e0bdbfa91bfd56a1f Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Thu, 28 May 2026 17:47:15 +0800 Subject: [PATCH 03/17] fix(api): infer content types for media paths --- pkg/api/methods/media_content.go | 11 ++++++++++ pkg/api/methods/media_image.go | 2 +- pkg/api/methods/media_image_test.go | 33 +++++++++++++++++++++++++++++ pkg/api/methods/media_meta.go | 5 +++-- pkg/api/methods/media_meta_test.go | 14 ++++++++++++ 5 files changed, 62 insertions(+), 3 deletions(-) diff --git a/pkg/api/methods/media_content.go b/pkg/api/methods/media_content.go index 2d8326f1..8affa918 100644 --- a/pkg/api/methods/media_content.go +++ b/pkg/api/methods/media_content.go @@ -46,6 +46,17 @@ func mediaContentExtension(contentType, text string) *string { return nil } +func mediaContentType(contentType, text string) string { + if strings.TrimSpace(contentType) != "" { + return contentType + } + ext := extensionFromTextPath(text) + if ext == "" { + return "" + } + return mime.TypeByExtension("." + ext) +} + func extensionFromContentType(contentType string) string { mediaType, _, err := mime.ParseMediaType(contentType) if err != nil { diff --git a/pkg/api/methods/media_image.go b/pkg/api/methods/media_image.go index ce2204e0..0c199577 100644 --- a/pkg/api/methods/media_image.go +++ b/pkg/api/methods/media_image.go @@ -274,7 +274,7 @@ func loadMediaImageProperty( maxBytes int64, ) (*models.MediaImageResponse, bool, error) { var binary []byte - contentType := prop.ContentType + contentType := mediaContentType(prop.ContentType, prop.Text) switch { case len(prop.Binary) > 0: diff --git a/pkg/api/methods/media_image_test.go b/pkg/api/methods/media_image_test.go index c0dea9ef..715bcdaa 100644 --- a/pkg/api/methods/media_image_test.go +++ b/pkg/api/methods/media_image_test.go @@ -182,6 +182,39 @@ func TestHandleMediaImage_BinaryNil_LoadsFromFile(t *testing.T) { // TestHandleMediaImage_NoMatchFound verifies that an error is returned when no // image property matches the preference list. +func TestHandleMediaImage_FilePathInfersContentType(t *testing.T) { + t.Parallel() + + dir := t.TempDir() + imgPath := filepath.Join(dir, "boxart.png") + fileContents := []byte("real-png-data") + require.NoError(t, os.WriteFile(imgPath, fileContents, 0o600)) + + mockDB := testhelpers.NewMockMediaDBI() + row := makeMediaFullRow(33, 330) + expectMediaImageResolve(mockDB, row) + mockDB.On("GetMediaProperties", mock.Anything, int64(33)). + Return([]database.MediaProperty{}, nil) + mockDB.On("GetMediaTitleProperties", mock.Anything, int64(330)). + Return([]database.MediaProperty{ + {TypeTag: "property:image-boxart", Text: imgPath, Binary: nil}, + }, nil) + + env := makeMediaImageEnv(t, mockDB, mediaImageParams(row, `"imageTypes": ["boxart"]`)) + result, err := HandleMediaImage(env) + require.NoError(t, err) + + resp, ok := result.(models.MediaImageResponse) + require.True(t, ok) + assert.Equal(t, "image/png", resp.ContentType) + assert.NotNil(t, resp.Extension) + assert.Equal(t, "png", *resp.Extension) + decoded, decErr := base64.StdEncoding.DecodeString(resp.Data) + require.NoError(t, decErr) + assert.Equal(t, fileContents, decoded) + mockDB.AssertExpectations(t) +} + func TestHandleMediaImage_NoMatchFound(t *testing.T) { t.Parallel() diff --git a/pkg/api/methods/media_meta.go b/pkg/api/methods/media_meta.go index c0adf3de..900802da 100644 --- a/pkg/api/methods/media_meta.go +++ b/pkg/api/methods/media_meta.go @@ -174,10 +174,11 @@ func buildMediaMetaResponse( func mapMediaProperties(props []database.MediaProperty) map[string]models.MediaMetaPropertyItem { m := make(map[string]models.MediaMetaPropertyItem, len(props)) for _, p := range props { + contentType := mediaContentType(p.ContentType, p.Text) item := models.MediaMetaPropertyItem{ Text: p.Text, - ContentType: p.ContentType, - Extension: mediaContentExtension(p.ContentType, p.Text), + ContentType: contentType, + Extension: mediaContentExtension(contentType, p.Text), BlobSize: p.BlobSize, } m[p.TypeTag] = item diff --git a/pkg/api/methods/media_meta_test.go b/pkg/api/methods/media_meta_test.go index ca39a203..8cf7d20f 100644 --- a/pkg/api/methods/media_meta_test.go +++ b/pkg/api/methods/media_meta_test.go @@ -100,6 +100,20 @@ func TestHandleMediaMeta_FullResult(t *testing.T) { mockDB.AssertExpectations(t) } +func TestMapMediaProperties_InfersPathContentType(t *testing.T) { + t.Parallel() + + props := mapMediaProperties([]database.MediaProperty{{ + TypeTag: "property:image-boxart", + Text: filepath.Join("media", "boxart", "mario.png"), + }}) + + prop := props["property:image-boxart"] + assert.Equal(t, "image/png", prop.ContentType) + require.NotNil(t, prop.Extension) + assert.Equal(t, "png", *prop.Extension) +} + func TestHandleMediaMeta_SecondarySlug(t *testing.T) { t.Parallel() From 4edd549255e82dc2d04a7d025535d253a6ebdbb3 Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Thu, 28 May 2026 18:41:04 +0800 Subject: [PATCH 04/17] fix(scraper): import gamelist region and language tags --- docs/scraper.md | 36 ++-- pkg/database/scraper/gamelistxml/scraper.go | 138 ++++++-------- .../scraper/gamelistxml/scraper_test.go | 180 +++++------------- 3 files changed, 124 insertions(+), 230 deletions(-) diff --git a/docs/scraper.md b/docs/scraper.md index 121997a8..259a4767 100644 --- a/docs/scraper.md +++ b/docs/scraper.md @@ -38,29 +38,27 @@ For each system, the normal loop: 1. Resolves target systems from indexed MediaDB systems and platform launcher paths. 2. Runs ZaparooCompanion processing first. This is a special path; see [ZaparooCompanion Entries](#zaparoocompanion-entries). -3. Builds candidate titles: - - `force=true`: all titles for the system. - - `force=false`: titles without sentinel tag `scraper.gamelist.xml:scraped`. -4. Loads `gamelist.xml` from each ROM root. -5. Resolves each `` path under its ROM root and derives a scanner-compatible title slug, using `` as provided name when present. -6. Matches the slug to an existing `MediaTitle` row. -7. Chooses the first Media DBID for the matched title as the sentinel write target. -8. Maps XML fields to title-level tags and properties. +3. Loads indexed media rows for the system. +4. With `force=false`, removes media rows that already have sentinel tag `scraper.gamelist.xml:scraped`. +5. Loads `gamelist.xml` from each ROM root. +6. Resolves each `` path under its ROM root. +7. Matches the resolved path to an existing Media row with case-insensitive path matching. +8. Maps XML fields to media-level tags plus title-level tags/properties. 9. Writes metadata through `MediaDB.ApplyScrapeResult`. -10. Writes the scraper sentinel tag last inside the same transaction. +10. Writes the scraper sentinel tag to the matched Media row last inside the same transaction. 11. Emits progress updates and a final done update. -The sentinel tag format is `scraper.:scraped`, for example `scraper.gamelist.xml:scraped`. Writing it last is intentional: if a normal record write fails, the transaction rolls back and the missing sentinel leaves that title eligible for retry. +The sentinel tag format is `scraper.:scraped`, for example `scraper.gamelist.xml:scraped`. Writing it last is intentional: if a normal record write fails, the transaction rolls back and the missing sentinel leaves that media row eligible for retry. Per-record write failures are non-fatal: they increment `Skipped`, emit `Err`, and continue. Fatal setup/load/database errors end the run with a terminal update unless caused by context cancellation. ## Tags And Properties -The DB supports tags/properties at both media and title scope, but the normal `gamelist.xml` scraper currently writes title metadata only. +The DB supports tags/properties at both media and title scope. Normal `gamelist.xml` scraping writes per-ROM `region`/`lang` tags and shared title metadata. | Storage | Scope | Current normal `gamelist.xml` use | |---|---|---| -| `MediaTags` | ROM-level variant metadata | Sentinel tag only for normal entries; companion child `region`/`lang` only | +| `MediaTags` | ROM-level variant metadata | region, lang, scraper sentinel | | `MediaTitleTags` | Title-level shared metadata | developer, publisher, year, rating, genre, players, arcadeboard, gamefamily | | `MediaTitleProperties` | Title-level static content | description, artwork paths, video path, manual path, XML game ID | | `MediaProperties` | ROM-level static content | Supported by DB helpers, not currently written by normal `gamelist.xml` entries | @@ -69,17 +67,17 @@ Tag exclusivity is controlled by `TagTypes.IsExclusive`. Exclusive types replace Property rows are keyed by entity and property type tag. Re-scraping the same property type updates the row in place and preserves row DBID. -Path-backed properties persist their text path and optional `BlobDBID`; the property tables do not persist the `ContentType` computed by the mapper for path values. Blob-backed properties expose content type from `MediaBlobs`. API responses can infer extensions from content type or text path, but path-backed `contentType` may be empty. +Path-backed properties persist their text path and optional `BlobDBID`; the property tables do not persist the `ContentType` computed by the mapper for path values. Blob-backed properties expose content type from `MediaBlobs`. API responses infer path-backed content type and extension from the stored path when DB content type is empty. -## Title-level Metadata Invariant +## Media-level Sentinel Invariant -Normal `gamelist.xml` metadata is shared by title. The sentinel is written to one Media row associated with the matched title, and `FindMediaTitlesWithoutSentinel` skips a title if any associated Media row has that sentinel. +Normal `gamelist.xml` entries are matched to concrete Media rows by path. The sentinel is written to the same Media row that receives ROM-level tags such as `region` and `lang`. -This is safe while the normal scraper writes only `MediaTitleTags` and `MediaTitleProperties`. Future per-ROM metadata imports need a different sentinel strategy, such as tagging every media row or splitting title-level and media-level sentinel tags. +Title metadata remains shared by `MediaTitleDBID`, so multiple ROM variants can write the same title-level tags/properties. Rewrites are idempotent: exclusive title tags replace same-type values, additive tags are inserted-or-ignored, and properties upsert by type. ## gamelist.xml Behavior -`GamelistXMLScraper` scans each system ROM root for `gamelist.xml`. Regular `` entries are resolved to absolute paths under the system ROM root, converted to scanner-compatible slugs, and matched to existing `MediaTitle` rows. Scrapers do not create `Media` or `MediaTitle` rows. +`GamelistXMLScraper` scans each system ROM root for `gamelist.xml`. Regular `` entries are resolved to absolute paths under the system ROM root and matched to existing Media rows by case-insensitive path. Scrapers do not create `Media` or `MediaTitle` rows. Path handling: @@ -93,10 +91,10 @@ Source fields are cleaned before mapping: HTML entities are unescaped, tab/newli ### Field Mapping -Regular ES `lang` and `region` fields are not currently imported for normal game entries. - | ES field | Destination | Notes | |---|---|---| +| `lang` | `MediaTags: lang` | CSV split, trimmed, lowercased, additive | +| `region` | `MediaTags: region` | CSV split, trimmed, lowercased, additive | | `developer` | `MediaTitleTags: developer` | Exclusive | | `publisher` | `MediaTitleTags: publisher` | Exclusive | | `releasedate` | `MediaTitleTags: year` | First four characters when present | diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index 02ce37c2..ea4683e0 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -193,20 +193,13 @@ func resolveSystemsFromPlatform( } // LoadRecords iterates gamelist.xml files found under each ROM root path for -// the given system. For each game entry whose slug matches a key in -// titlesBySlug, a [GamelistRecord] is emitted and the slug is removed from -// titlesBySlug (first match wins across all gamelists for the system). -// titlesBySlug is mutated by this call — callers must not reuse it. -// -// mediaByTitleDBID supplies the Media DBID to use as the sentinel write target: -// it must map each MediaTitle DBID to any one of its associated Media DBIDs. -// A record whose title DBID has no entry in mediaByTitleDBID will have -// MatchedMediaDBID = 0 and will be skipped by the scrape loop. +// the given system. Each game entry is resolved to an absolute path and matched +// to an existing Media row by case-insensitive path. The matched Media row +// supplies both the per-ROM write target and parent MediaTitle DBID. func (g *GamelistXMLScraper) LoadRecords( ctx context.Context, system scraper.ScrapeSystem, - titlesBySlug map[string]database.MediaTitle, - mediaByTitleDBID map[int64]int64, + mediaByPathFold map[string]database.Media, ) ([]*GamelistRecord, error) { var records []*GamelistRecord @@ -243,15 +236,7 @@ outer: continue } - pf := mediascanner.GetPathFragments(&mediascanner.PathFragmentParams{ - Config: g.cfg, - Path: resolved, - SystemID: system.ID, - NoExt: true, - ProvidedName: gl.Games[i].Name, - }) - - title, ok := titlesBySlug[pf.Slug] + media, ok := mediaByPathFold[pathFoldKey(resolved)] if !ok { continue } @@ -260,11 +245,11 @@ outer: SystemRootPath: rootPath, AvailableMediaDirs: availableMediaDirs, Game: gl.Games[i], - MatchedTitleDBID: title.DBID, - MatchedMediaDBID: mediaByTitleDBID[title.DBID], + MatchedTitleDBID: media.MediaTitleDBID, + MatchedMediaDBID: media.DBID, }) - delete(titlesBySlug, pf.Slug) - if len(titlesBySlug) == 0 { + delete(mediaByPathFold, pathFoldKey(resolved)) + if len(mediaByPathFold) == 0 { break outer } } @@ -328,78 +313,57 @@ func (g *GamelistXMLScraper) scrapeLoop( totalMatched += companion.Matched totalSkipped += companion.Skipped - // Build slug → MediaTitle map: only unscraped titles unless Force. - titlesBySlug := make(map[string]database.MediaTitle) - if opts.Force { - allTitles, titlesErr := mdb.GetTitlesBySystemID(system.ID) - if titlesErr != nil { - if errors.Is(titlesErr, context.Canceled) || errors.Is(titlesErr, context.DeadlineExceeded) { - ch <- scraper.ScrapeUpdate{ - Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, - } - return - } + allMedia, mediaErr := mdb.GetMediaBySystemID(system.ID) + if mediaErr != nil { + if errors.Is(mediaErr, context.Canceled) || errors.Is(mediaErr, context.DeadlineExceeded) { ch <- scraper.ScrapeUpdate{ - SystemID: system.ID, FatalErr: titlesErr, - Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, + SystemID: system.ID, Done: true, Processed: totalProcessed, + Matched: totalMatched, Skipped: totalSkipped, } return } - for _, t := range allTitles { - titlesBySlug[t.Slug] = database.MediaTitle{ - DBID: t.DBID, SystemDBID: t.SystemDBID, Slug: t.Slug, Name: t.Name, - } + ch <- scraper.ScrapeUpdate{ + SystemID: system.ID, FatalErr: mediaErr, + Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, } - } else { - sentinel := scraper.SentinelTagInfo(id) - sentinelTag := sentinel.Type + ":" + sentinel.Tag - unscraped, titlesErr := mdb.FindMediaTitlesWithoutSentinel(ctx, system.DBID, sentinelTag) - if titlesErr != nil { - if errors.Is(titlesErr, context.Canceled) || errors.Is(titlesErr, context.DeadlineExceeded) { + return + } + scrapedIDs := map[int64]struct{}{} + if !opts.Force { + var scrapedErr error + scrapedIDs, scrapedErr = mdb.GetScrapedMediaIDs(ctx, id, system.DBID) + if scrapedErr != nil { + if errors.Is(scrapedErr, context.Canceled) || errors.Is(scrapedErr, context.DeadlineExceeded) { ch <- scraper.ScrapeUpdate{ - Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, + SystemID: system.ID, Done: true, Processed: totalProcessed, + Matched: totalMatched, Skipped: totalSkipped, } return } ch <- scraper.ScrapeUpdate{ - SystemID: system.ID, FatalErr: titlesErr, + SystemID: system.ID, FatalErr: scrapedErr, Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, } return } - for _, t := range unscraped { - titlesBySlug[t.Slug] = t - } - } - - if len(titlesBySlug) == 0 { - continue } - // Build MediaTitle DBID → first Media DBID map for sentinel writes. - allMedia, mediaErr := mdb.GetMediaBySystemID(system.ID) - if mediaErr != nil { - if errors.Is(mediaErr, context.Canceled) || errors.Is(mediaErr, context.DeadlineExceeded) { - ch <- scraper.ScrapeUpdate{ - SystemID: system.ID, Done: true, Processed: totalProcessed, - Matched: totalMatched, Skipped: totalSkipped, - } - return + mediaByPathFold := make(map[string]database.Media, len(allMedia)) + for _, m := range allMedia { + if _, scraped := scrapedIDs[m.DBID]; scraped { + continue } - ch <- scraper.ScrapeUpdate{ - SystemID: system.ID, FatalErr: mediaErr, - Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, + mediaByPathFold[pathFoldKey(m.Path)] = database.Media{ + DBID: m.DBID, + MediaTitleDBID: m.MediaTitleDBID, + Path: m.Path, } - return } - mediaByTitleDBID := make(map[int64]int64, len(allMedia)) - for _, m := range allMedia { - if _, exists := mediaByTitleDBID[m.MediaTitleDBID]; !exists { - mediaByTitleDBID[m.MediaTitleDBID] = m.DBID - } + if len(mediaByPathFold) == 0 { + continue } - records, loadErr := g.LoadRecords(ctx, system, titlesBySlug, mediaByTitleDBID) + records, loadErr := g.LoadRecords(ctx, system, mediaByPathFold) if loadErr != nil { if errors.Is(loadErr, context.Canceled) || errors.Is(loadErr, context.DeadlineExceeded) { ch <- scraper.ScrapeUpdate{ @@ -492,6 +456,7 @@ func (g *GamelistXMLScraper) scrapeLoop( ctx, record.MatchedMediaDBID, record.MatchedTitleDBID, &database.ScrapeWrite{ Sentinel: scraper.SentinelTagInfo(id), + MediaTags: mapped.MediaTags, TitleTags: mapped.TitleTags, TitleProps: mapped.TitleProps, }) @@ -528,9 +493,9 @@ func (g *GamelistXMLScraper) scrapeLoop( } // MapToDB converts a GamelistRecord into the tag and property writes to apply -// to the matched MediaTitle row. Media-level tags are not written by this -// scraper; only title-level tags and properties are populated. +// to the matched Media and MediaTitle rows. func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { + var mediaTags []database.TagInfo var titleTags []database.TagInfo var titleProps []database.MediaProperty game := record.Game @@ -545,6 +510,8 @@ func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { game.Players = cleanField(game.Players) game.ArcadeSystemName = cleanField(game.ArcadeSystemName) game.Family = cleanField(game.Family) + game.Region = cleanField(game.Region) + game.Lang = cleanField(game.Lang) game.Desc = cleanField(game.Desc) game.Image = cleanField(game.Image) game.Thumbnail = cleanField(game.Thumbnail) @@ -561,6 +528,11 @@ func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { game.Boxart3D = cleanField(game.Boxart3D) game.Logo = cleanField(game.Logo) + // --- MediaTags: ROM-level variant metadata --- + + mediaTags = appendCSVTags(mediaTags, string(tags.TagTypeRegion), game.Region) + mediaTags = appendCSVTags(mediaTags, string(tags.TagTypeLang), game.Lang) + // --- MediaTitleTags: title-level, shared across all ROMs --- if game.Developer != "" { @@ -668,6 +640,7 @@ func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { } return scraper.MapResult{ + MediaTags: mediaTags, TitleTags: titleTags, TitleProps: titleProps, } @@ -809,6 +782,13 @@ func splitCSV(s string) []string { return result } +func appendCSVTags(tagInfos []database.TagInfo, tagType, raw string) []database.TagInfo { + for _, value := range splitCSV(raw) { + tagInfos = append(tagInfos, database.TagInfo{Type: tagType, Tag: strings.ToLower(value)}) + } + return tagInfos +} + // pathProp resolves esPath to an absolute path and returns a MediaProperty for // the given typeTag. Returns nil if the path cannot be resolved (skipped cleanly). func pathProp(typeTag, esPath, systemRootPath string) *database.MediaProperty { @@ -909,6 +889,10 @@ func findMediaFilePropFS( return nil } +func pathFoldKey(path string) string { + return strings.ToLower(filepath.ToSlash(filepath.Clean(path))) +} + func readGameListXMLFS(fs afero.Fs, path string) (*esapi.GameList, error) { data, err := afero.ReadFile(fs, path) if err != nil { diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index 00bb91a0..f75c3992 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -263,9 +263,15 @@ func TestMimeFromExt_Unknown(t *testing.T) { // --- LoadRecords --- -// TestLoadRecords_SlugMatch verifies that a gamelist entry whose slug matches a -// key in titlesBySlug produces a GamelistRecord with the correct DB IDs. -func TestLoadRecords_SlugMatch(t *testing.T) { +func mediaByPath(rows ...database.Media) map[string]database.Media { + result := make(map[string]database.Media, len(rows)) + for _, row := range rows { + result[pathFoldKey(row.Path)] = row + } + return result +} + +func TestLoadRecords_PathMatch(t *testing.T) { t.Parallel() root := t.TempDir() @@ -276,21 +282,12 @@ func TestLoadRecords_SlugMatch(t *testing.T) { ./zelda.nesZelda `), 0o600)) - marioSlug := slugFor("nes", filepath.Join(root, "mario.nes")) - - titlesBySlug := map[string]database.MediaTitle{ - marioSlug: {DBID: 22, Slug: marioSlug}, - } - mediaByTitleDBID := map[int64]int64{22: 11} - records, err := (&GamelistXMLScraper{}).LoadRecords( context.Background(), scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, - titlesBySlug, - mediaByTitleDBID, + mediaByPath(database.Media{DBID: 11, MediaTitleDBID: 22, Path: filepath.Join(root, "mario.nes")}), ) require.NoError(t, err) - // Only mario has a matching MediaTitle slug; zelda is silently skipped. require.Len(t, records, 1) assert.Equal(t, root, records[0].SystemRootPath) assert.Equal(t, "./mario.nes", records[0].Game.Path) @@ -300,13 +297,7 @@ func TestLoadRecords_SlugMatch(t *testing.T) { assert.Equal(t, filepath.Join(root, "media", "image"), records[0].AvailableMediaDirs["image"]) } -// TestLoadRecords_NameDerivedSlugMatch verifies the scraper matches titles -// whose stored slug derives from a scanner-provided display name rather than -// the filename. This is a regression test for the fix where the indexer uses -// ScanResult.Name (e.g. NeoGeo AltName, gamelist.xml ) to build the -// MediaTitle slug, requiring the scraper to also derive its lookup slug from -// gamelist.xml's rather than the filename. -func TestLoadRecords_NameDerivedSlugMatch(t *testing.T) { +func TestLoadRecords_TitleNameDoesNotNeedSlugMatch(t *testing.T) { t.Parallel() root := t.TempDir() @@ -315,41 +306,18 @@ func TestLoadRecords_NameDerivedSlugMatch(t *testing.T) { ./mslug.zipMetal Slug `), 0o600)) - // Indexer would have stored the slug derived from ProvidedName="Metal Slug", - // not from the filename "mslug". Build titlesBySlug accordingly. - nameSlug := mediascanner.GetPathFragments(&mediascanner.PathFragmentParams{ - SystemID: "NeoGeo", - Path: filepath.Join(root, "mslug.zip"), - NoExt: true, - ProvidedName: "Metal Slug", - }).Slug - - // Confirm the test setup actually exercises the regression: the - // name-derived slug must differ from a filename-derived one. - filenameSlug := slugFor("NeoGeo", filepath.Join(root, "mslug.zip")) - require.NotEqual(t, filenameSlug, nameSlug, - "test precondition: name-derived slug must differ from filename-derived") - - titlesBySlug := map[string]database.MediaTitle{ - nameSlug: {DBID: 42, Slug: nameSlug, Name: "Metal Slug"}, - } - mediaByTitleDBID := map[int64]int64{42: 7} - records, err := (&GamelistXMLScraper{}).LoadRecords( context.Background(), scraper.ScrapeSystem{ID: "NeoGeo", ROMPaths: []string{root}}, - titlesBySlug, - mediaByTitleDBID, + mediaByPath(database.Media{DBID: 7, MediaTitleDBID: 42, Path: filepath.Join(root, "mslug.zip")}), ) require.NoError(t, err) - require.Len(t, records, 1, "scraper must match the title via name-derived slug") + require.Len(t, records, 1, "path match should work even when title slug derives from display name") assert.Equal(t, "Metal Slug", records[0].Game.Name) assert.Equal(t, int64(42), records[0].MatchedTitleDBID) assert.Equal(t, int64(7), records[0].MatchedMediaDBID) } -// TestLoadRecords_SkipsMissingAndMalformedGameLists verifies that ROM roots -// without a gamelist.xml and roots with a malformed file are silently skipped. func TestLoadRecords_SkipsMissingAndMalformedGameLists(t *testing.T) { t.Parallel() @@ -367,20 +335,13 @@ func TestLoadRecords_SkipsMissingAndMalformedGameLists(t *testing.T) { ./mario.nesMario `), 0o600)) - marioSlug := slugFor("nes", filepath.Join(validRoot, "mario.nes")) - titlesBySlug := map[string]database.MediaTitle{ - marioSlug: {DBID: 5, Slug: marioSlug}, - } - mediaByTitleDBID := map[int64]int64{5: 3} - records, err := (&GamelistXMLScraper{}).LoadRecords( context.Background(), scraper.ScrapeSystem{ ID: "nes", ROMPaths: []string{missingRoot, malformedRoot, validRoot}, }, - titlesBySlug, - mediaByTitleDBID, + mediaByPath(database.Media{DBID: 3, MediaTitleDBID: 5, Path: filepath.Join(validRoot, "mario.nes")}), ) require.NoError(t, err) require.Len(t, records, 1) @@ -388,9 +349,7 @@ func TestLoadRecords_SkipsMissingAndMalformedGameLists(t *testing.T) { assert.Equal(t, int64(5), records[0].MatchedTitleDBID) } -// TestLoadRecords_FirstWins verifies that when two gamelist roots contain an -// entry with the same slug, only the first is recorded. -func TestLoadRecords_FirstWins(t *testing.T) { +func TestLoadRecords_FirstPathWins(t *testing.T) { t.Parallel() root1 := t.TempDir() @@ -405,56 +364,16 @@ func TestLoadRecords_FirstWins(t *testing.T) { writeGamelist(root1) writeGamelist(root2) - slug := slugFor("nes", filepath.Join(root1, "game.nes")) - titlesBySlug := map[string]database.MediaTitle{ - slug: {DBID: 1, Slug: slug}, - } - mediaByTitleDBID := map[int64]int64{1: 10} - records, err := (&GamelistXMLScraper{}).LoadRecords( context.Background(), scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root1, root2}}, - titlesBySlug, - mediaByTitleDBID, + mediaByPath(database.Media{DBID: 10, MediaTitleDBID: 1, Path: filepath.Join(root1, "game.nes")}), ) require.NoError(t, err) - require.Len(t, records, 1, "second root's duplicate slug must be skipped") + require.Len(t, records, 1, "second root's duplicate path must be skipped after first match") assert.Equal(t, root1, records[0].SystemRootPath, "first root wins") } -// TestLoadRecords_NoMediaForTitle verifies that a slug match with no -// corresponding Media row yields MatchedMediaDBID = 0. The scrape loop will -// skip such records via its zero-ID guard. -func TestLoadRecords_NoMediaForTitle(t *testing.T) { - t.Parallel() - - root := t.TempDir() - require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` - - ./mario.nesMario -`), 0o600)) - - slug := slugFor("nes", filepath.Join(root, "mario.nes")) - titlesBySlug := map[string]database.MediaTitle{ - slug: {DBID: 7, Slug: slug}, - } - // Intentionally omit title DBID 7 from mediaByTitleDBID. - mediaByTitleDBID := map[int64]int64{} - - records, err := (&GamelistXMLScraper{}).LoadRecords( - context.Background(), - scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, - titlesBySlug, - mediaByTitleDBID, - ) - require.NoError(t, err) - require.Len(t, records, 1) - assert.Equal(t, int64(7), records[0].MatchedTitleDBID) - assert.Equal(t, int64(0), records[0].MatchedMediaDBID) -} - -// TestLoadRecords_ContextCancellation verifies that a cancelled context causes -// LoadRecords to return context.Canceled immediately. func TestLoadRecords_ContextCancellation(t *testing.T) { t.Parallel() @@ -470,14 +389,11 @@ func TestLoadRecords_ContextCancellation(t *testing.T) { _, err := (&GamelistXMLScraper{}).LoadRecords( ctx, scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, - map[string]database.MediaTitle{}, - map[int64]int64{}, + map[string]database.Media{}, ) require.ErrorIs(t, err, context.Canceled) } -// TestLoadRecords_PathTraversalSkipped verifies that gamelist entries whose -// paths escape the system root are silently skipped. func TestLoadRecords_PathTraversalSkipped(t *testing.T) { t.Parallel() @@ -488,16 +404,10 @@ func TestLoadRecords_PathTraversalSkipped(t *testing.T) { ./mario.nesMario `), 0o600)) - slug := slugFor("nes", filepath.Join(root, "mario.nes")) - titlesBySlug := map[string]database.MediaTitle{ - slug: {DBID: 1, Slug: slug}, - } - records, err := (&GamelistXMLScraper{}).LoadRecords( context.Background(), scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, - titlesBySlug, - map[int64]int64{1: 9}, + mediaByPath(database.Media{DBID: 9, MediaTitleDBID: 1, Path: filepath.Join(root, "mario.nes")}), ) require.NoError(t, err) require.Len(t, records, 1, "traversal entry must be dropped; mario must still match") @@ -554,8 +464,8 @@ func TestMapToDB_FullGame(t *testing.T) { result := (&GamelistXMLScraper{}).MapToDB(&rec) - // Media-level tags: not written by this scraper. - assert.Empty(t, result.MediaTags, "gamelistxml scraper writes no media-level tags") + assert.Contains(t, result.MediaTags, database.TagInfo{Type: string(tags.TagTypeLang), Tag: "en"}) + assert.Contains(t, result.MediaTags, database.TagInfo{Type: string(tags.TagTypeRegion), Tag: "usa"}) // Title-level tags assert.Contains(t, result.TitleTags, database.TagInfo{Type: string(tags.TagTypeDeveloper), Tag: "Nintendo"}) @@ -1762,20 +1672,19 @@ func TestScrapeLoop_NormalMode_Success(t *testing.T) { ./mario.nesMario `), 0o600)) - slug := slugFor("nes", filepath.Join(root, "mario.nes")) const ( titleDBID = int64(1) mediaDBID = int64(10) systemDBID = int64(100) ) - sentinel := scraper.SentinelTagInfo("gamelist.xml") - sentinelTag := sentinel.Type + ":" + sentinel.Tag mockDB := helpers.NewMockMediaDBI() - mockDB.On("FindMediaTitlesWithoutSentinel", mock.Anything, systemDBID, sentinelTag). - Return([]database.MediaTitle{{DBID: titleDBID, SystemDBID: systemDBID, Slug: slug}}, nil) mockDB.On("GetMediaBySystemID", "nes"). - Return([]database.MediaWithFullPath{{DBID: mediaDBID, MediaTitleDBID: titleDBID}}, nil) + Return([]database.MediaWithFullPath{{ + DBID: mediaDBID, MediaTitleDBID: titleDBID, Path: filepath.Join(root, "mario.nes"), + }}, nil) + mockDB.On("GetScrapedMediaIDs", mock.Anything, "gamelist.xml", systemDBID). + Return(map[int64]struct{}{}, nil) mockDB.On("ApplyScrapeResult", mock.Anything, mediaDBID, titleDBID, mock.Anything).Return(nil) s := &GamelistXMLScraper{db: mockDB} @@ -1808,7 +1717,6 @@ func TestScrapeLoop_ForceMode_Success(t *testing.T) { ./sonic.mdSonic `), 0o600)) - slug := slugFor("genesis", filepath.Join(root, "sonic.md")) const ( titleDBID = int64(2) mediaDBID = int64(20) @@ -1816,10 +1724,10 @@ func TestScrapeLoop_ForceMode_Success(t *testing.T) { ) mockDB := helpers.NewMockMediaDBI() - mockDB.On("GetTitlesBySystemID", "genesis"). - Return([]database.TitleWithSystem{{DBID: titleDBID, SystemDBID: systemDBID, Slug: slug}}, nil) mockDB.On("GetMediaBySystemID", "genesis"). - Return([]database.MediaWithFullPath{{DBID: mediaDBID, MediaTitleDBID: titleDBID}}, nil) + Return([]database.MediaWithFullPath{{ + DBID: mediaDBID, MediaTitleDBID: titleDBID, Path: filepath.Join(root, "sonic.md"), + }}, nil) mockDB.On("ApplyScrapeResult", mock.Anything, mediaDBID, titleDBID, mock.Anything).Return(nil) s := &GamelistXMLScraper{db: mockDB} @@ -1851,20 +1759,19 @@ func TestScrapeLoop_WriteError_RecordSkipped(t *testing.T) { ./mario.nesMario `), 0o600)) - slug := slugFor("nes", filepath.Join(root, "mario.nes")) const ( titleDBID = int64(3) mediaDBID = int64(30) systemDBID = int64(300) ) - sentinel := scraper.SentinelTagInfo("gamelist.xml") - sentinelTag := sentinel.Type + ":" + sentinel.Tag mockDB := helpers.NewMockMediaDBI() - mockDB.On("FindMediaTitlesWithoutSentinel", mock.Anything, systemDBID, sentinelTag). - Return([]database.MediaTitle{{DBID: titleDBID, SystemDBID: systemDBID, Slug: slug}}, nil) mockDB.On("GetMediaBySystemID", "nes"). - Return([]database.MediaWithFullPath{{DBID: mediaDBID, MediaTitleDBID: titleDBID}}, nil) + Return([]database.MediaWithFullPath{{ + DBID: mediaDBID, MediaTitleDBID: titleDBID, Path: filepath.Join(root, "mario.nes"), + }}, nil) + mockDB.On("GetScrapedMediaIDs", mock.Anything, "gamelist.xml", systemDBID). + Return(map[int64]struct{}{}, nil) mockDB.On("ApplyScrapeResult", mock.Anything, mediaDBID, titleDBID, mock.Anything). Return(assert.AnError) @@ -1889,7 +1796,7 @@ func TestScrapeLoop_WriteError_RecordSkipped(t *testing.T) { mockDB.AssertExpectations(t) } -func TestScrapeLoop_EmptyTitles_SkipsSystem(t *testing.T) { +func TestScrapeLoop_AllMediaScraped_SkipsSystem(t *testing.T) { t.Parallel() root := t.TempDir() require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` @@ -1897,14 +1804,19 @@ func TestScrapeLoop_EmptyTitles_SkipsSystem(t *testing.T) { ./mario.nesMario `), 0o600)) - const systemDBID = int64(400) - sentinel := scraper.SentinelTagInfo("gamelist.xml") - sentinelTag := sentinel.Type + ":" + sentinel.Tag + const ( + mediaDBID = int64(40) + titleDBID = int64(4) + systemDBID = int64(400) + ) mockDB := helpers.NewMockMediaDBI() - // All titles already scraped → empty result → GetMediaBySystemID and ApplyScrapeResult never called. - mockDB.On("FindMediaTitlesWithoutSentinel", mock.Anything, systemDBID, sentinelTag). - Return([]database.MediaTitle{}, nil) + mockDB.On("GetMediaBySystemID", "nes"). + Return([]database.MediaWithFullPath{{ + DBID: mediaDBID, MediaTitleDBID: titleDBID, Path: filepath.Join(root, "mario.nes"), + }}, nil) + mockDB.On("GetScrapedMediaIDs", mock.Anything, "gamelist.xml", systemDBID). + Return(map[int64]struct{}{mediaDBID: {}}, nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: systemDBID} From 64454a9638becf291953b6efefbee1e4fb62eb79 Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Thu, 28 May 2026 19:21:32 +0800 Subject: [PATCH 05/17] fix(scraper): aggregate companion progress counts --- pkg/database/scraper/gamelistxml/scraper.go | 9 +- .../scraper/gamelistxml/scraper_test.go | 88 ++++++++++++++++--- 2 files changed, 80 insertions(+), 17 deletions(-) diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index ea4683e0..6c6ed8c2 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -400,7 +400,10 @@ func (g *GamelistXMLScraper) scrapeLoop( return true } update.SystemID = system.ID - update.Total = totalRecords + update.Total = totalProcessed + totalRecords + update.Processed += totalProcessed + update.Matched += totalMatched + update.Skipped += totalSkipped select { case <-ctx.Done(): ch <- scraper.ScrapeUpdate{ @@ -1143,7 +1146,7 @@ func (g *GamelistXMLScraper) processCompanionEntries( func companionMediaByTitle(systemID string, mdb database.MediaDBI) (map[int64]database.Media, error) { allMedia, err := mdb.GetMediaBySystemID(systemID) if err != nil { - return nil, err + return nil, fmt.Errorf("load media by system %q: %w", systemID, err) } mediaByTitleDBID := make(map[int64]database.Media, len(allMedia)) for _, m := range allMedia { @@ -1158,7 +1161,7 @@ func companionMediaByTitle(systemID string, mdb database.MediaDBI) (map[int64]da return mediaByTitleDBID, nil } -func (g *GamelistXMLScraper) matchCompanionChildMedia( +func (*GamelistXMLScraper) matchCompanionChildMedia( ctx context.Context, system scraper.ScrapeSystem, child companionChild, diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index f75c3992..36e66b3b 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -27,7 +27,6 @@ import ( "testing" "github.com/ZaparooProject/zaparoo-core/v2/pkg/database" - "github.com/ZaparooProject/zaparoo-core/v2/pkg/database/mediascanner" "github.com/ZaparooProject/zaparoo-core/v2/pkg/database/scraper" "github.com/ZaparooProject/zaparoo-core/v2/pkg/database/tags" "github.com/ZaparooProject/zaparoo-core/v2/pkg/helpers/syncutil" @@ -39,16 +38,6 @@ import ( "github.com/stretchr/testify/require" ) -// slugFor computes the MediaTitle slug that would be stored for a ROM path -// using the same parameters as the scraper's LoadRecords call. -func slugFor(systemID, path string) string { - return mediascanner.GetPathFragments(&mediascanner.PathFragmentParams{ - SystemID: systemID, - Path: path, - NoExt: true, - }).Slug -} - // --- cleanField --- func TestCleanField_TrimSpace(t *testing.T) { @@ -1393,7 +1382,9 @@ func companionWriteMatcher( mediaTags []database.TagInfo, ) any { return mock.MatchedBy(func(w *database.ScrapeWrite) bool { - return w != nil && w.Sentinel == scraper.SentinelTagInfo("gamelist.xml") && assert.ObjectsAreEqual(mediaTags, w.MediaTags) + return w != nil && + w.Sentinel == scraper.SentinelTagInfo("gamelist.xml") && + assert.ObjectsAreEqual(mediaTags, w.MediaTags) }) } @@ -1545,6 +1536,68 @@ func TestProcessCompanionEntries_ParentNotFoundForChild(t *testing.T) { mockDB.AssertExpectations(t) } +func TestProcessCompanionEntries_StatsAggregateWithNormalProgress(t *testing.T) { + t.Parallel() + root := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + Companion Game + Dev Corp + + + ./companion.rom + usa + en + + ./normal.romNormal Game +`), 0o600)) + + const ( + companionTitleDBID = int64(20) + companionMediaDBID = int64(10) + normalTitleDBID = int64(21) + normalMediaDBID = int64(11) + systemDBID = int64(1) + ) + companionPath := filepath.ToSlash(filepath.Join(root, "companion.rom")) + mockDB := helpers.NewMockMediaDBI() + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{ + {DBID: normalMediaDBID, MediaTitleDBID: normalTitleDBID, Path: filepath.Join(root, "normal.rom")}, + }, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, systemDBID, companionPath). + Return(&database.Media{DBID: companionMediaDBID, MediaTitleDBID: companionTitleDBID}, nil) + mockDB.On("MediaHasTag", mock.Anything, companionMediaDBID, "scraper.gamelist.xml:scraped").Return(false, nil) + mockDB.On("ApplyScrapeResult", mock.Anything, companionMediaDBID, companionTitleDBID, mock.Anything).Return(nil) + mockDB.On("GetScrapedMediaIDs", mock.Anything, "gamelist.xml", systemDBID). + Return(map[int64]struct{}{}, nil) + mockDB.On("ApplyScrapeResult", mock.Anything, normalMediaDBID, normalTitleDBID, mock.Anything).Return(nil) + + s := &GamelistXMLScraper{db: mockDB} + system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: systemDBID} + ch := make(chan scraper.ScrapeUpdate, 128) + + s.scrapeLoop(context.Background(), scraper.ScrapeOptions{Pauser: syncutil.NewPauser()}, + []scraper.ScrapeSystem{system}, mockDB, ch) + + updates := drainChannel(ch) + var progress, done scraper.ScrapeUpdate + for _, u := range updates { + if u.Done { + done = u + continue + } + if u.Processed == 2 && u.Matched == 2 { + progress = u + } + } + require.Equal(t, 2, progress.Processed, "normal progress should include companion counts") + assert.Equal(t, 2, progress.Total) + assert.Equal(t, 2, progress.Matched) + assert.Equal(t, 2, done.Processed) + assert.Equal(t, 2, done.Matched) + mockDB.AssertExpectations(t) +} + func TestProcessCompanionEntries_EachChildUsesAtomicScrapeWrite(t *testing.T) { t.Parallel() root := t.TempDir() @@ -1669,7 +1722,7 @@ func TestScrapeLoop_NormalMode_Success(t *testing.T) { root := t.TempDir() require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` - ./mario.nesMario + ./mario.nesMariousa, euren `), 0o600)) const ( @@ -1685,7 +1738,14 @@ func TestScrapeLoop_NormalMode_Success(t *testing.T) { }}, nil) mockDB.On("GetScrapedMediaIDs", mock.Anything, "gamelist.xml", systemDBID). Return(map[int64]struct{}{}, nil) - mockDB.On("ApplyScrapeResult", mock.Anything, mediaDBID, titleDBID, mock.Anything).Return(nil) + mediaTagMatcher := mock.MatchedBy(func(w *database.ScrapeWrite) bool { + return w != nil && assert.ElementsMatch(t, []database.TagInfo{ + {Type: string(tags.TagTypeRegion), Tag: "usa"}, + {Type: string(tags.TagTypeRegion), Tag: "eur"}, + {Type: string(tags.TagTypeLang), Tag: "en"}, + }, w.MediaTags) + }) + mockDB.On("ApplyScrapeResult", mock.Anything, mediaDBID, titleDBID, mediaTagMatcher).Return(nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: systemDBID} From 4fd618cacddcda28bc78d4cac29e95216043a043 Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Thu, 28 May 2026 19:46:56 +0800 Subject: [PATCH 06/17] test(scraper): cover companion default artwork selection --- pkg/api/methods/media_image_test.go | 77 +++++++++++++++++++ .../scraper/gamelistxml/scraper_test.go | 61 +++++++++++++++ 2 files changed, 138 insertions(+) diff --git a/pkg/api/methods/media_image_test.go b/pkg/api/methods/media_image_test.go index 715bcdaa..a73847ac 100644 --- a/pkg/api/methods/media_image_test.go +++ b/pkg/api/methods/media_image_test.go @@ -115,6 +115,83 @@ func TestHandleMediaImage_DefaultPrefs_TitleBlobFound(t *testing.T) { mockDB.AssertExpectations(t) } +func TestHandleMediaImage_DefaultPrefs_PathBackedCompanionArtwork(t *testing.T) { + t.Parallel() + + dir := t.TempDir() + files := map[string][]byte{ + "boxart.png": []byte("boxart-2d"), + "boxart3d.png": []byte("boxart-3d"), + "screenshot.png": []byte("screenshot"), + "wheel.png": []byte("wheel"), + "titleshot.png": []byte("titleshot"), + } + for name, data := range files { + require.NoError(t, os.WriteFile(filepath.Join(dir, name), data, 0o600)) + } + + mockDB := testhelpers.NewMockMediaDBI() + row := makeMediaFullRow(11, 110) + expectMediaImageResolve(mockDB, row) + mockDB.On("GetMediaProperties", mock.Anything, int64(11)). + Return([]database.MediaProperty{}, nil) + mockDB.On("GetMediaTitleProperties", mock.Anything, int64(110)). + Return([]database.MediaProperty{ + {TypeTag: "property:image-screenshot", Text: filepath.Join(dir, "screenshot.png")}, + {TypeTag: "property:image-titleshot", Text: filepath.Join(dir, "titleshot.png")}, + {TypeTag: "property:image-boxart", Text: filepath.Join(dir, "boxart.png")}, + {TypeTag: "property:image-boxart3d", Text: filepath.Join(dir, "boxart3d.png")}, + {TypeTag: "property:image-wheel", Text: filepath.Join(dir, "wheel.png")}, + }, nil) + + env := makeMediaImageEnv(t, mockDB, mediaImageParams(row, "")) + result, err := HandleMediaImage(env) + require.NoError(t, err) + + resp, ok := result.(models.MediaImageResponse) + require.True(t, ok) + assert.Equal(t, "property:image-boxart", resp.TypeTag) + assert.Equal(t, "image/png", resp.ContentType) + assert.NotNil(t, resp.Extension) + assert.Equal(t, "png", *resp.Extension) + assert.Equal(t, base64.StdEncoding.EncodeToString(files["boxart.png"]), resp.Data) + mockDB.AssertExpectations(t) +} + +func TestHandleMediaImage_DefaultPrefs_FallsBackToNextCompanionArtwork(t *testing.T) { + t.Parallel() + + dir := t.TempDir() + boxart3D := []byte("boxart-3d") + screenshot := []byte("screenshot") + require.NoError(t, os.WriteFile(filepath.Join(dir, "boxart3d.png"), boxart3D, 0o600)) + require.NoError(t, os.WriteFile(filepath.Join(dir, "screenshot.png"), screenshot, 0o600)) + + mockDB := testhelpers.NewMockMediaDBI() + row := makeMediaFullRow(12, 120) + expectMediaImageResolve(mockDB, row) + mockDB.On("GetMediaProperties", mock.Anything, int64(12)). + Return([]database.MediaProperty{}, nil) + mockDB.On("GetMediaTitleProperties", mock.Anything, int64(120)). + Return([]database.MediaProperty{ + {TypeTag: "property:image-screenshot", Text: filepath.Join(dir, "screenshot.png")}, + {TypeTag: "property:image-boxart3d", Text: filepath.Join(dir, "boxart3d.png")}, + }, nil) + + env := makeMediaImageEnv(t, mockDB, mediaImageParams(row, "")) + result, err := HandleMediaImage(env) + require.NoError(t, err) + + resp, ok := result.(models.MediaImageResponse) + require.True(t, ok) + assert.Equal(t, "property:image-boxart3d", resp.TypeTag) + assert.Equal(t, "image/png", resp.ContentType) + assert.NotNil(t, resp.Extension) + assert.Equal(t, "png", *resp.Extension) + assert.Equal(t, base64.StdEncoding.EncodeToString(boxart3D), resp.Data) + mockDB.AssertExpectations(t) +} + // TestHandleMediaImage_ExplicitPrefs_MediaLevelPriority verifies that media-level // properties take priority over title-level properties for the same TypeTag. func TestHandleMediaImage_ExplicitPrefs_MediaLevelPriority(t *testing.T) { diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index 36e66b3b..09f6efc4 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -1388,6 +1388,25 @@ func companionWriteMatcher( }) } +func companionArtworkWriteMatcher(t *testing.T, expected map[string]string) any { + t.Helper() + return mock.MatchedBy(func(w *database.ScrapeWrite) bool { + if w == nil || w.Sentinel != scraper.SentinelTagInfo("gamelist.xml") { + return false + } + got := make(map[string]string, len(w.TitleProps)) + for _, p := range w.TitleProps { + got[p.TypeTag] = p.Text + } + for key, path := range expected { + if !assert.Equal(t, filepath.ToSlash(path), got[key], "companion artwork %s", key) { + return false + } + } + return true + }) +} + func TestProcessCompanionEntries_NoEntries(t *testing.T) { t.Parallel() root := t.TempDir() @@ -1536,6 +1555,48 @@ func TestProcessCompanionEntries_ParentNotFoundForChild(t *testing.T) { mockDB.AssertExpectations(t) } +func TestProcessCompanionEntries_MapsIssue161CompanionArtwork(t *testing.T) { + t.Parallel() + root := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + Doom + ./media/screenshot/Doom.png + ./media/titlescreen/Doom.png + ./media/box2d/Doom.png + ./media/box3d/Doom.png + ./media/logo/Doom.png + + + ./Doom.rom + +`), 0o600)) + + childPath := filepath.ToSlash(filepath.Join(root, "Doom.rom")) + propPrefix := string(tags.TagTypeProperty) + ":" + expectedArtwork := map[string]string{ + propPrefix + string(tags.TagPropertyImageScreenshot): filepath.Join(root, "media", "screenshot", "Doom.png"), + propPrefix + string(tags.TagPropertyImageTitleshot): filepath.Join(root, "media", "titlescreen", "Doom.png"), + propPrefix + string(tags.TagPropertyImageBoxart): filepath.Join(root, "media", "box2d", "Doom.png"), + propPrefix + string(tags.TagPropertyImageBoxart3D): filepath.Join(root, "media", "box3d", "Doom.png"), + propPrefix + string(tags.TagPropertyImageWheel): filepath.Join(root, "media", "logo", "Doom.png"), + } + mockDB := helpers.NewMockMediaDBI() + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), childPath). + Return(&database.Media{DBID: 10, MediaTitleDBID: 20}, nil) + mockDB.On("MediaHasTag", mock.Anything, int64(10), "scraper.gamelist.xml:scraped").Return(false, nil) + mockDB.On( + "ApplyScrapeResult", mock.Anything, int64(10), int64(20), companionArtworkWriteMatcher(t, expectedArtwork), + ).Return(nil) + + s := &GamelistXMLScraper{db: mockDB} + system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{Processed: 1, Matched: 1}, stats) + mockDB.AssertExpectations(t) +} + func TestProcessCompanionEntries_StatsAggregateWithNormalProgress(t *testing.T) { t.Parallel() root := t.TempDir() From ee7515a8780bc1a4e3cdbd6496fcd8515d3fce9e Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Thu, 28 May 2026 22:51:03 +0800 Subject: [PATCH 07/17] fix(scraper): match zip-as-dir gamelist entries --- pkg/database/scraper/gamelistxml/scraper.go | 42 ++++- .../scraper/gamelistxml/scraper_test.go | 143 ++++++++++++++++++ 2 files changed, 183 insertions(+), 2 deletions(-) diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index 6c6ed8c2..d2fb8798 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -236,7 +236,7 @@ outer: continue } - media, ok := mediaByPathFold[pathFoldKey(resolved)] + media, matchedKey, ok := matchMediaByResolvedPath(mediaByPathFold, resolved) if !ok { continue } @@ -248,7 +248,7 @@ outer: MatchedTitleDBID: media.MediaTitleDBID, MatchedMediaDBID: media.DBID, }) - delete(mediaByPathFold, pathFoldKey(resolved)) + delete(mediaByPathFold, matchedKey) if len(mediaByPathFold) == 0 { break outer } @@ -892,6 +892,44 @@ func findMediaFilePropFS( return nil } +func matchMediaByResolvedPath( + mediaByPathFold map[string]database.Media, + resolved string, +) (database.Media, string, bool) { + key := pathFoldKey(resolved) + if media, ok := mediaByPathFold[key]; ok { + return media, key, true + } + + if !strings.EqualFold(filepath.Ext(resolved), ".zip") { + return database.Media{}, "", false + } + + prefix := key + "/" + var matchedMedia database.Media + var matchedKey string + matches := 0 + for mediaKey, media := range mediaByPathFold { + if !strings.HasPrefix(mediaKey, prefix) { + continue + } + matches++ + if matches == 1 { + matchedMedia = media + matchedKey = mediaKey + } + } + + if matches == 1 { + return matchedMedia, matchedKey, true + } + if matches > 1 { + log.Warn().Str("path", resolved).Int("matches", matches). + Msg("gamelistxml: zip-as-dir path matched multiple indexed media rows, skipping") + } + return database.Media{}, "", false +} + func pathFoldKey(path string) string { return strings.ToLower(filepath.ToSlash(filepath.Clean(path))) } diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index 09f6efc4..43e417a5 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -307,6 +307,75 @@ func TestLoadRecords_TitleNameDoesNotNeedSlugMatch(t *testing.T) { assert.Equal(t, int64(7), records[0].MatchedMediaDBID) } +func TestLoadRecords_ZipAsDirChildMatch(t *testing.T) { + t.Parallel() + + root := t.TempDir() + zipPath := filepath.Join(root, "Japan", "10-Yard Fight (Japan) (Rev 1).zip") + innerPath := filepath.Join(zipPath, "10-Yard Fight (Japan) (Rev 1).nes") + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + ./Japan/10-Yard Fight (Japan) (Rev 1).zip10-Yard Fight +`), 0o600)) + + records, err := (&GamelistXMLScraper{}).LoadRecords( + context.Background(), + scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, + mediaByPath(database.Media{DBID: 70, MediaTitleDBID: 80, Path: innerPath}), + ) + require.NoError(t, err) + require.Len(t, records, 1) + assert.Equal(t, "10-Yard Fight", records[0].Game.Name) + assert.Equal(t, int64(70), records[0].MatchedMediaDBID) + assert.Equal(t, int64(80), records[0].MatchedTitleDBID) +} + +func TestLoadRecords_ZipAsDirAmbiguousChildrenSkipped(t *testing.T) { + t.Parallel() + + root := t.TempDir() + zipPath := filepath.Join(root, "Japan", "Multi Game.zip") + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + ./Japan/Multi Game.zipMulti Game +`), 0o600)) + + records, err := (&GamelistXMLScraper{}).LoadRecords( + context.Background(), + scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, + mediaByPath( + database.Media{DBID: 71, MediaTitleDBID: 81, Path: filepath.Join(zipPath, "one.nes")}, + database.Media{DBID: 72, MediaTitleDBID: 82, Path: filepath.Join(zipPath, "two.nes")}, + ), + ) + require.NoError(t, err) + assert.Empty(t, records) +} + +func TestLoadRecords_ZipAsDirExactMatchWins(t *testing.T) { + t.Parallel() + + root := t.TempDir() + zipPath := filepath.Join(root, "Japan", "Game.zip") + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + ./Japan/Game.zipGame +`), 0o600)) + + records, err := (&GamelistXMLScraper{}).LoadRecords( + context.Background(), + scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, + mediaByPath( + database.Media{DBID: 73, MediaTitleDBID: 83, Path: zipPath}, + database.Media{DBID: 74, MediaTitleDBID: 84, Path: filepath.Join(zipPath, "inner.nes")}, + ), + ) + require.NoError(t, err) + require.Len(t, records, 1) + assert.Equal(t, int64(73), records[0].MatchedMediaDBID) + assert.Equal(t, int64(83), records[0].MatchedTitleDBID) +} + func TestLoadRecords_SkipsMissingAndMalformedGameLists(t *testing.T) { t.Parallel() @@ -1830,6 +1899,80 @@ func TestScrapeLoop_NormalMode_Success(t *testing.T) { mockDB.AssertExpectations(t) } +func TestScrapeLoop_Issue794ZipAsDirMedia(t *testing.T) { + t.Parallel() + root := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + + ./Japan/10-Yard Fight (Japan) (Rev 1).zip + 10-Yard Fight + Football game. + 0.4 + 19850830T000000 + Irem + Nintendo + Sports / Football (American)-Sports + 1-2 + ./media/images/Japan/10-Yard Fight (Japan) (Rev 1).png + ./media/box2dfront/Japan/10-Yard Fight (Japan) (Rev 1).png + +`), 0o600)) + + const ( + mediaDBID = int64(7940) + titleDBID = int64(7941) + systemDBID = int64(7942) + ) + zipPath := filepath.Join(root, "Japan", "10-Yard Fight (Japan) (Rev 1).zip") + innerPath := filepath.Join(zipPath, "10-Yard Fight (Japan) (Rev 1).nes") + propPrefix := string(tags.TagTypeProperty) + ":" + imageProp := propPrefix + string(tags.TagPropertyImageImage) + thumbnailProp := propPrefix + string(tags.TagPropertyImageThumbnail) + writeMatcher := mock.MatchedBy(func(w *database.ScrapeWrite) bool { + if w == nil || w.Sentinel != scraper.SentinelTagInfo("gamelist.xml") { + return false + } + got := make(map[string]string, len(w.TitleProps)) + for _, p := range w.TitleProps { + got[p.TypeTag] = p.Text + } + return assert.Equal(t, filepath.ToSlash(filepath.Join( + root, "media", "images", "Japan", "10-Yard Fight (Japan) (Rev 1).png", + )), got[imageProp]) && assert.Equal(t, filepath.ToSlash(filepath.Join( + root, "media", "box2dfront", "Japan", "10-Yard Fight (Japan) (Rev 1).png", + )), got[thumbnailProp]) + }) + + mockDB := helpers.NewMockMediaDBI() + mockDB.On("GetMediaBySystemID", "nes"). + Return([]database.MediaWithFullPath{{DBID: mediaDBID, MediaTitleDBID: titleDBID, Path: innerPath}}, nil) + mockDB.On("GetScrapedMediaIDs", mock.Anything, "gamelist.xml", systemDBID). + Return(map[int64]struct{}{}, nil) + mockDB.On("ApplyScrapeResult", mock.Anything, mediaDBID, titleDBID, writeMatcher).Return(nil) + + s := &GamelistXMLScraper{db: mockDB} + system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: systemDBID} + ch := make(chan scraper.ScrapeUpdate, 128) + + s.scrapeLoop(context.Background(), scraper.ScrapeOptions{ + Pauser: syncutil.NewPauser(), + }, []scraper.ScrapeSystem{system}, mockDB, ch) + + updates := drainChannel(ch) + var done scraper.ScrapeUpdate + for _, u := range updates { + if u.Done { + done = u + } + } + require.True(t, done.Done) + assert.Equal(t, 1, done.Processed) + assert.Equal(t, 1, done.Matched) + assert.Equal(t, 0, done.Skipped) + mockDB.AssertExpectations(t) +} + func TestScrapeLoop_ForceMode_Success(t *testing.T) { t.Parallel() root := t.TempDir() From b961f5239e768aa267ea13953be9ebd42093ffe0 Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Fri, 29 May 2026 00:02:03 +0800 Subject: [PATCH 08/17] fix(scraper): show artwork for nested games --- docs/scraper.md | 6 +- pkg/api/methods/media_image.go | 3 +- pkg/api/methods/media_image_test.go | 53 ++++++ pkg/database/scraper/gamelistxml/scraper.go | 71 ++++++-- .../scraper/gamelistxml/scraper_test.go | 154 ++++++++++++++++++ 5 files changed, 270 insertions(+), 17 deletions(-) diff --git a/docs/scraper.md b/docs/scraper.md index 259a4767..92c88f2f 100644 --- a/docs/scraper.md +++ b/docs/scraper.md @@ -118,7 +118,9 @@ Source fields are cleaned before mapping: HTML entities are unescaped, tab/newli | `map` | `MediaTitleProperties: property:image-map` | XML path or filesystem fallback | | `manual` | `MediaTitleProperties: property:manual` | PDF path | -Filesystem fallback searches known subdirectories under `/media/` for a `.png` file when an XML path is absent. Side/back box art are filesystem-fallback only. +Filesystem fallback searches known subdirectories under `/media/` when an XML path is absent. For games in subfolders, it searches the mirrored ROM-relative path before the flat filename; for example `./Japan/Game.nes` checks `media/images/Japan/Game.png` before `media/images/Game.png`. Side/back box art are filesystem-fallback only. + +Only `/gamelist.xml` files are loaded. Nested files such as `/Japan/gamelist.xml` are not read by the current scraper. `gamelist.xml` deliberately does not scrape user-state fields such as favorite, hidden, or kidgame. It also does not overwrite filename-parser-owned fields such as disc and track. @@ -160,7 +162,7 @@ JSON-RPC methods: | `media.scrape.cancel` | Cancels the active scraper run | | `media.scrape.resume` | Resumes a paused scraper run | | `media.meta` | Returns tags and metadata-only properties for one or more media rows and their titles | -| `media.image` | Returns the best matching image property as base64 data for one media row | +| `media.image` | Returns the best matching image property as base64 data for one media row, including thumbnail art | | `media.clean.orphans` | Removes missing media rows and orphaned related data | `media.scrape` params: diff --git a/pkg/api/methods/media_image.go b/pkg/api/methods/media_image.go index 0c199577..49beb5dc 100644 --- a/pkg/api/methods/media_image.go +++ b/pkg/api/methods/media_image.go @@ -45,12 +45,13 @@ const ( // defaultImageTypes is the preference order used when no imageTypes param is provided. var defaultImageTypes = []string{ - "image", "boxart", "boxart3d", "screenshot", "wheel", "titleshot", "map", + "image", "thumbnail", "boxart", "boxart3d", "screenshot", "wheel", "titleshot", "map", "marquee", "fanart", } var imageTypeTags = map[string]string{ "image": tags.PropertyTypeTag(tags.TagPropertyImageImage), + "thumbnail": tags.PropertyTypeTag(tags.TagPropertyImageThumbnail), "boxart": tags.PropertyTypeTag(tags.TagPropertyImageBoxart), "boxart3d": tags.PropertyTypeTag(tags.TagPropertyImageBoxart3D), "boxartside": tags.PropertyTypeTag(tags.TagPropertyImageBoxartSide), diff --git a/pkg/api/methods/media_image_test.go b/pkg/api/methods/media_image_test.go index a73847ac..6d4874aa 100644 --- a/pkg/api/methods/media_image_test.go +++ b/pkg/api/methods/media_image_test.go @@ -158,6 +158,59 @@ func TestHandleMediaImage_DefaultPrefs_PathBackedCompanionArtwork(t *testing.T) mockDB.AssertExpectations(t) } +func TestHandleMediaImage_DefaultPrefs_TitleThumbnailFound(t *testing.T) { + t.Parallel() + + mockDB := testhelpers.NewMockMediaDBI() + thumbnailData := []byte("thumbnail-png-bytes") + + row := makeMediaFullRow(13, 130) + expectMediaImageResolve(mockDB, row) + mockDB.On("GetMediaProperties", mock.Anything, int64(13)). + Return([]database.MediaProperty{}, nil) + mockDB.On("GetMediaTitleProperties", mock.Anything, int64(130)). + Return([]database.MediaProperty{ + {TypeTag: "property:image-thumbnail", ContentType: "image/png", Binary: thumbnailData}, + }, nil) + + env := makeMediaImageEnv(t, mockDB, mediaImageParams(row, "")) + result, err := HandleMediaImage(env) + require.NoError(t, err) + + resp, ok := result.(models.MediaImageResponse) + require.True(t, ok) + assert.Equal(t, "property:image-thumbnail", resp.TypeTag) + assert.Equal(t, "image/png", resp.ContentType) + assert.Equal(t, base64.StdEncoding.EncodeToString(thumbnailData), resp.Data) + mockDB.AssertExpectations(t) +} + +func TestHandleMediaImage_ExplicitThumbnailType(t *testing.T) { + t.Parallel() + + mockDB := testhelpers.NewMockMediaDBI() + thumbnailData := []byte("thumbnail-png-bytes") + + row := makeMediaFullRow(14, 140) + expectMediaImageResolve(mockDB, row) + mockDB.On("GetMediaProperties", mock.Anything, int64(14)). + Return([]database.MediaProperty{}, nil) + mockDB.On("GetMediaTitleProperties", mock.Anything, int64(140)). + Return([]database.MediaProperty{ + {TypeTag: "property:image-thumbnail", ContentType: "image/png", Binary: thumbnailData}, + }, nil) + + env := makeMediaImageEnv(t, mockDB, mediaImageParams(row, `"imageTypes": ["thumbnail"]`)) + result, err := HandleMediaImage(env) + require.NoError(t, err) + + resp, ok := result.(models.MediaImageResponse) + require.True(t, ok) + assert.Equal(t, "property:image-thumbnail", resp.TypeTag) + assert.Equal(t, base64.StdEncoding.EncodeToString(thumbnailData), resp.Data) + mockDB.AssertExpectations(t) +} + func TestHandleMediaImage_DefaultPrefs_FallsBackToNextCompanionArtwork(t *testing.T) { t.Parallel() diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index d2fb8798..d6e14bf1 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -65,12 +65,12 @@ type GamelistRecord struct { // expected filename wins. var mediaDirCandidates = map[string][]string{ string(tags.TagPropertyImageImage): {"image", "images"}, - string(tags.TagPropertyImageBoxart): {"boxart", "boxart2d", "boxart2dfront"}, + string(tags.TagPropertyImageBoxart): {"boxart", "boxart2d", "boxart2dfront", "box2dfront"}, string(tags.TagPropertyImageBoxart3D): {"boxart3d"}, string(tags.TagPropertyImageBoxartSide): {"boxart2dside"}, string(tags.TagPropertyImageBoxartBack): {"boxart2dback"}, string(tags.TagPropertyImageScreenshot): {"screenshot", "screenshots"}, - string(tags.TagPropertyImageThumbnail): {"thumbnail", "thumbnails", "supporttexture"}, + string(tags.TagPropertyImageThumbnail): {"thumbnail", "thumbnails", "box2dfront", "boxart2dfront", "supporttexture"}, string(tags.TagPropertyImageMarquee): {"marquee", "marquees"}, string(tags.TagPropertyImageWheel): {"wheel", "wheels"}, string(tags.TagPropertyImageFanart): {"fanart", "fanarts"}, @@ -579,9 +579,9 @@ func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { propType := string(tags.TagTypeProperty) root := record.SystemRootPath - // stem is the ROM filename without extension, used to locate matching + // fallbackNames are ROM-relative PNG filenames used to locate matching // artwork files under media/ sub-directories. - stem := strings.TrimSuffix(filepath.Base(game.Path), filepath.Ext(game.Path)) + fallbackNames := artworkFallbackNames(game.Path, record.SystemRootPath) if game.Desc != "" { titleProps = append(titleProps, @@ -603,7 +603,7 @@ func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { p := pathProp(key, xmlPath, root) if p == nil { p = findMediaFilePropFS( - g.filesystem(), key, stem, + g.filesystem(), key, fallbackNames, mediaDirCandidates[string(propValue)], record.AvailableMediaDirs, ) } @@ -863,16 +863,53 @@ func findMediaFileProp( candidates []string, availableDirs map[string]string, ) *database.MediaProperty { - return findMediaFilePropFS(afero.NewOsFs(), typeTag, stem, candidates, availableDirs) + if stem == "" || stem == "." { + return nil + } + return findMediaFilePropFS(afero.NewOsFs(), typeTag, []string{stem + ".png"}, candidates, availableDirs) +} + +func artworkFallbackNames(gamePath, systemRootPath string) []string { + resolved := resolveESPath(gamePath, systemRootPath) + if resolved == "" { + return nil + } + + rootAbs, err := filepath.Abs(systemRootPath) + if err != nil { + return nil + } + rel, err := filepath.Rel(filepath.Clean(rootAbs), filepath.Clean(resolved)) + if err != nil || rel == "." || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) { + return nil + } + + stem := strings.TrimSuffix(filepath.Base(rel), filepath.Ext(rel)) + if stem == "" || stem == "." { + return nil + } + + flat := stem + ".png" + dir := filepath.Dir(rel) + if dir == "." || dir == "" { + return []string{flat} + } + + nested := filepath.Join(dir, flat) + if nested == flat { + return []string{flat} + } + return []string{nested, flat} } func findMediaFilePropFS( fs afero.Fs, - typeTag, stem string, + typeTag string, + fallbackNames []string, candidates []string, availableDirs map[string]string, ) *database.MediaProperty { - if stem == "" || stem == "." { + if len(fallbackNames) == 0 { return nil } for _, dir := range candidates { @@ -880,12 +917,18 @@ func findMediaFilePropFS( if !ok { continue } - candidate := filepath.Join(dirPath, stem+".png") - if exists, err := afero.Exists(fs, candidate); err == nil && exists { - return &database.MediaProperty{ - TypeTag: typeTag, - Text: filepath.ToSlash(candidate), - ContentType: "image/png", + for _, name := range fallbackNames { + cleanName := filepath.Clean(name) + if name == "" || cleanName == "." || cleanName == ".." || strings.HasPrefix(cleanName, ".."+string(filepath.Separator)) { + continue + } + candidate := filepath.Join(dirPath, name) + if exists, err := afero.Exists(fs, candidate); err == nil && exists { + return &database.MediaProperty{ + TypeTag: typeTag, + Text: filepath.ToSlash(candidate), + ContentType: "image/png", + } } } } diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index 43e417a5..8a8320a6 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -286,6 +286,47 @@ func TestLoadRecords_PathMatch(t *testing.T) { assert.Equal(t, filepath.Join(root, "media", "image"), records[0].AvailableMediaDirs["image"]) } +func TestLoadRecords_SubfolderPathMatch(t *testing.T) { + t.Parallel() + + root := t.TempDir() + require.NoError(t, os.MkdirAll(filepath.Join(root, "Japan"), 0o750)) + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + ./Japan/Game.nesGame +`), 0o600)) + + records, err := (&GamelistXMLScraper{}).LoadRecords( + context.Background(), + scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, + mediaByPath(database.Media{DBID: 12, MediaTitleDBID: 23, Path: filepath.Join(root, "Japan", "Game.nes")}), + ) + require.NoError(t, err) + require.Len(t, records, 1) + assert.Equal(t, "./Japan/Game.nes", records[0].Game.Path) + assert.Equal(t, int64(12), records[0].MatchedMediaDBID) + assert.Equal(t, int64(23), records[0].MatchedTitleDBID) +} + +func TestLoadRecords_DoesNotReadNestedGameList(t *testing.T) { + t.Parallel() + + root := t.TempDir() + require.NoError(t, os.MkdirAll(filepath.Join(root, "Japan"), 0o750)) + require.NoError(t, os.WriteFile(filepath.Join(root, "Japan", "gamelist.xml"), []byte(` + + ./Game.nesGame +`), 0o600)) + + records, err := (&GamelistXMLScraper{}).LoadRecords( + context.Background(), + scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, + mediaByPath(database.Media{DBID: 13, MediaTitleDBID: 24, Path: filepath.Join(root, "Japan", "Game.nes")}), + ) + require.NoError(t, err) + assert.Empty(t, records) +} + func TestLoadRecords_TitleNameDoesNotNeedSlugMatch(t *testing.T) { t.Parallel() @@ -779,6 +820,31 @@ func TestFindMediaFileProp_EmptyStem(t *testing.T) { // --- MapToDB filesystem fallback --- +func TestMapToDB_NestedExplicitImagePath(t *testing.T) { + t.Parallel() + root := t.TempDir() + + rec := GamelistRecord{ + SystemRootPath: root, + Game: esapi.Game{ + Path: "./Japan/Game.nes", + Image: "./media/images/Japan/Game.png", + }, + } + + result := (&GamelistXMLScraper{}).MapToDB(&rec) + + propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) + var found bool + for _, p := range result.TitleProps { + if p.TypeTag == propKey { + found = true + assert.Equal(t, filepath.ToSlash(filepath.Join(root, "media", "images", "Japan", "Game.png")), p.Text) + } + } + assert.True(t, found, "nested explicit image property missing") +} + func TestMapToDB_FilesystemFallback_Image(t *testing.T) { t.Parallel() root := t.TempDir() @@ -810,6 +876,94 @@ func TestMapToDB_FilesystemFallback_Image(t *testing.T) { assert.True(t, found, "filesystem fallback image property missing") } +func TestMapToDB_FilesystemFallback_NestedGamePath(t *testing.T) { + t.Parallel() + root := t.TempDir() + + imgDir := filepath.Join(root, "media", "images") + imgPath := filepath.Join(imgDir, "Japan", "Game.png") + require.NoError(t, os.MkdirAll(filepath.Dir(imgPath), 0o750)) + require.NoError(t, os.WriteFile(imgPath, []byte{}, 0o600)) + + rec := GamelistRecord{ + SystemRootPath: root, + AvailableMediaDirs: map[string]string{"images": imgDir}, + Game: esapi.Game{ + Path: "./Japan/Game.nes", + }, + } + + result := (&GamelistXMLScraper{}).MapToDB(&rec) + + propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) + var found bool + for _, p := range result.TitleProps { + if p.TypeTag == propKey { + found = true + assert.Equal(t, filepath.ToSlash(imgPath), p.Text) + } + } + assert.True(t, found, "nested filesystem fallback image property missing") +} + +func TestMapToDB_FilesystemFallback_NestedWinsBeforeFlat(t *testing.T) { + t.Parallel() + root := t.TempDir() + + imgDir := filepath.Join(root, "media", "images") + nestedPath := filepath.Join(imgDir, "Japan", "Game.png") + flatPath := filepath.Join(imgDir, "Game.png") + require.NoError(t, os.MkdirAll(filepath.Dir(nestedPath), 0o750)) + require.NoError(t, os.WriteFile(nestedPath, []byte{}, 0o600)) + require.NoError(t, os.WriteFile(flatPath, []byte{}, 0o600)) + + rec := GamelistRecord{ + SystemRootPath: root, + AvailableMediaDirs: map[string]string{"images": imgDir}, + Game: esapi.Game{Path: "./Japan/Game.nes"}, + } + + result := (&GamelistXMLScraper{}).MapToDB(&rec) + + propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) + var found bool + for _, p := range result.TitleProps { + if p.TypeTag == propKey { + found = true + assert.Equal(t, filepath.ToSlash(nestedPath), p.Text) + } + } + assert.True(t, found, "nested filesystem fallback image property missing") +} + +func TestMapToDB_FilesystemFallback_ThumbnailBox2DFrontAlias(t *testing.T) { + t.Parallel() + root := t.TempDir() + + thumbDir := filepath.Join(root, "media", "box2dfront") + thumbPath := filepath.Join(thumbDir, "Game.png") + require.NoError(t, os.MkdirAll(thumbDir, 0o750)) + require.NoError(t, os.WriteFile(thumbPath, []byte{}, 0o600)) + + rec := GamelistRecord{ + SystemRootPath: root, + AvailableMediaDirs: map[string]string{"box2dfront": thumbDir}, + Game: esapi.Game{Path: "./Game.nes"}, + } + + result := (&GamelistXMLScraper{}).MapToDB(&rec) + + propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageThumbnail) + var found bool + for _, p := range result.TitleProps { + if p.TypeTag == propKey { + found = true + assert.Equal(t, filepath.ToSlash(thumbPath), p.Text) + } + } + assert.True(t, found, "box2dfront thumbnail fallback property missing") +} + func TestMapToDB_FilesystemFallback_Boxart(t *testing.T) { t.Parallel() root := t.TempDir() From dde6bcc59991b838d448fdc61cff666655042aff Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Fri, 29 May 2026 00:03:20 +0800 Subject: [PATCH 09/17] style(scraper): wrap nested artwork fallback lines --- pkg/database/scraper/gamelistxml/scraper.go | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index d6e14bf1..23216fcc 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -70,12 +70,14 @@ var mediaDirCandidates = map[string][]string{ string(tags.TagPropertyImageBoxartSide): {"boxart2dside"}, string(tags.TagPropertyImageBoxartBack): {"boxart2dback"}, string(tags.TagPropertyImageScreenshot): {"screenshot", "screenshots"}, - string(tags.TagPropertyImageThumbnail): {"thumbnail", "thumbnails", "box2dfront", "boxart2dfront", "supporttexture"}, - string(tags.TagPropertyImageMarquee): {"marquee", "marquees"}, - string(tags.TagPropertyImageWheel): {"wheel", "wheels"}, - string(tags.TagPropertyImageFanart): {"fanart", "fanarts"}, - string(tags.TagPropertyImageTitleshot): {"titleshot", "titleshots", "screenshottitle"}, - string(tags.TagPropertyImageMap): {"map", "maps"}, + string(tags.TagPropertyImageThumbnail): { + "thumbnail", "thumbnails", "box2dfront", "boxart2dfront", "supporttexture", + }, + string(tags.TagPropertyImageMarquee): {"marquee", "marquees"}, + string(tags.TagPropertyImageWheel): {"wheel", "wheels"}, + string(tags.TagPropertyImageFanart): {"fanart", "fanarts"}, + string(tags.TagPropertyImageTitleshot): {"titleshot", "titleshots", "screenshottitle"}, + string(tags.TagPropertyImageMap): {"map", "maps"}, } // GamelistXMLScraper loads and maps EmulationStation gamelist.xml records. @@ -919,7 +921,8 @@ func findMediaFilePropFS( } for _, name := range fallbackNames { cleanName := filepath.Clean(name) - if name == "" || cleanName == "." || cleanName == ".." || strings.HasPrefix(cleanName, ".."+string(filepath.Separator)) { + if name == "" || cleanName == "." || cleanName == ".." || + strings.HasPrefix(cleanName, ".."+string(filepath.Separator)) { continue } candidate := filepath.Join(dirPath, name) From 3c3a1cccaf51d2cd408500dc113a424eccd2146f Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Fri, 29 May 2026 06:15:38 +0800 Subject: [PATCH 10/17] fix(scraper): normalize companion child tags --- pkg/api/methods/media_content.go | 5 +- pkg/api/methods/media_content_test.go | 32 +++++++++++++ pkg/database/scraper/gamelistxml/scraper.go | 20 ++++---- .../scraper/gamelistxml/scraper_test.go | 47 +++++++++++++++++-- 4 files changed, 86 insertions(+), 18 deletions(-) create mode 100644 pkg/api/methods/media_content_test.go diff --git a/pkg/api/methods/media_content.go b/pkg/api/methods/media_content.go index 8affa918..7b3054d7 100644 --- a/pkg/api/methods/media_content.go +++ b/pkg/api/methods/media_content.go @@ -47,8 +47,9 @@ func mediaContentExtension(contentType, text string) *string { } func mediaContentType(contentType, text string) string { - if strings.TrimSpace(contentType) != "" { - return contentType + trimmedContentType := strings.TrimSpace(contentType) + if trimmedContentType != "" { + return trimmedContentType } ext := extensionFromTextPath(text) if ext == "" { diff --git a/pkg/api/methods/media_content_test.go b/pkg/api/methods/media_content_test.go new file mode 100644 index 00000000..39436649 --- /dev/null +++ b/pkg/api/methods/media_content_test.go @@ -0,0 +1,32 @@ +// Zaparoo Core +// Copyright (c) 2026 The Zaparoo Project Contributors. +// SPDX-License-Identifier: GPL-3.0-or-later +// +// This file is part of Zaparoo Core. +// +// Zaparoo Core is free software: you can redistribute it and/or modify +// it under the terms of the GNU General Public License as published by +// the Free Software Foundation, either version 3 of the License, or +// (at your option) any later version. +// +// Zaparoo Core is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License +// along with Zaparoo Core. If not, see . + +package methods + +import ( + "testing" + + "github.com/stretchr/testify/assert" +) + +func TestMediaContentType_TrimsProvidedContentType(t *testing.T) { + t.Parallel() + + assert.Equal(t, "image/png", mediaContentType(" image/png ", "cover.jpg")) +} diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index 23216fcc..a627a33d 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -88,6 +88,12 @@ type GamelistXMLScraper struct { cfg *config.Instance } +type companionStats struct { + Processed int + Matched int + Skipped int +} + func (g *GamelistXMLScraper) filesystem() afero.Fs { if g == nil || g.fs == nil { return afero.NewOsFs() @@ -1015,12 +1021,6 @@ type companionChild struct { Lang string } -type companionStats struct { - Processed int - Matched int - Skipped int -} - // loadCompanionEntries scans all gamelist.xml files for the system and separates // entries with source="ZaparooCompanion" into parent records (id attr, no path) and // child records (parentid attr, has path). @@ -1306,12 +1306,8 @@ func (*GamelistXMLScraper) matchCompanionChildMedia( func companionChildTags(c companionChild) []database.TagInfo { var childTags []database.TagInfo - if c.Region != "" { - childTags = append(childTags, database.TagInfo{Type: string(tags.TagTypeRegion), Tag: c.Region}) - } - if c.Lang != "" { - childTags = append(childTags, database.TagInfo{Type: string(tags.TagTypeLang), Tag: c.Lang}) - } + childTags = appendCSVTags(childTags, string(tags.TagTypeRegion), c.Region) + childTags = appendCSVTags(childTags, string(tags.TagTypeLang), c.Lang) return childTags } diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index 8a8320a6..a91494d9 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -240,6 +240,19 @@ func TestSplitCSV_Empty(t *testing.T) { assert.Empty(t, got) } +func TestCompanionChildTags_NormalizesCSV(t *testing.T) { + t.Parallel() + + got := companionChildTags(companionChild{Region: "USA, EUR", Lang: "EN, JA"}) + + assert.Equal(t, []database.TagInfo{ + {Type: string(tags.TagTypeRegion), Tag: "usa"}, + {Type: string(tags.TagTypeRegion), Tag: "eur"}, + {Type: string(tags.TagTypeLang), Tag: "en"}, + {Type: string(tags.TagTypeLang), Tag: "ja"}, + }, got) +} + // --- mimeFromExt --- func TestMimeFromExt_PNG(t *testing.T) { assert.Equal(t, "image/png", mimeFromExt("art.PNG")) } @@ -1601,13 +1614,27 @@ func TestLoadCompanionEntries_ChildPathTraversalRejected(t *testing.T) { // --- processCompanionEntries --- +func companionXMLGameIDProps(id string) []database.MediaProperty { + return []database.MediaProperty{ + { + TypeTag: string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyXMLGameID), + Text: id, + ContentType: "text/plain", + }, + } +} + func companionWriteMatcher( mediaTags []database.TagInfo, + titleTags []database.TagInfo, + titleProps []database.MediaProperty, ) any { return mock.MatchedBy(func(w *database.ScrapeWrite) bool { return w != nil && w.Sentinel == scraper.SentinelTagInfo("gamelist.xml") && - assert.ObjectsAreEqual(mediaTags, w.MediaTags) + assert.ObjectsAreEqual(mediaTags, w.MediaTags) && + assert.ObjectsAreEqual(titleTags, w.TitleTags) && + assert.ObjectsAreEqual(titleProps, w.TitleProps) }) } @@ -1656,12 +1683,14 @@ func TestProcessCompanionEntries_ChildByExactPath(t *testing.T) { {Type: string(tags.TagTypeRegion), Tag: "usa"}, {Type: string(tags.TagTypeLang), Tag: "en"}, } + titleTags := []database.TagInfo{{Type: string(tags.TagTypeDeveloper), Tag: "Dev Corp"}} mockDB := helpers.NewMockMediaDBI() mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), resolvedPath). Return(&database.Media{DBID: 10, MediaTitleDBID: 20}, nil) mockDB.On("MediaHasTag", mock.Anything, int64(10), "scraper.gamelist.xml:scraped").Return(false, nil) - mockDB.On("ApplyScrapeResult", mock.Anything, int64(10), int64(20), companionWriteMatcher(childTags)).Return(nil) + mockDB.On("ApplyScrapeResult", mock.Anything, int64(10), int64(20), + companionWriteMatcher(childTags, titleTags, companionXMLGameIDProps("42"))).Return(nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} @@ -1688,7 +1717,12 @@ func TestProcessCompanionEntries_ChildBySlugFile(t *testing.T) { mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{{DBID: 40, MediaTitleDBID: 30}}, nil) mockDB.On("FindMediaTitleBySystemAndSlug", mock.Anything, int64(5), "myslug").Return(title, nil) mockDB.On("MediaHasTag", mock.Anything, int64(40), "scraper.gamelist.xml:scraped").Return(false, nil) - mockDB.On("ApplyScrapeResult", mock.Anything, int64(40), int64(30), companionWriteMatcher(nil)).Return(nil) + mockDB.On("ApplyScrapeResult", mock.Anything, int64(40), int64(30), + companionWriteMatcher( + nil, + []database.TagInfo{{Type: string(tags.TagTypeDeveloper), Tag: "Dev"}}, + companionXMLGameIDProps("99"), + )).Return(nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 5} @@ -1941,7 +1975,12 @@ func TestProcessCompanionEntries_NoRegionLangStillWritesSentinel(t *testing.T) { mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), gamePath). Return(&database.Media{DBID: 5, MediaTitleDBID: 6}, nil) mockDB.On("MediaHasTag", mock.Anything, int64(5), "scraper.gamelist.xml:scraped").Return(false, nil) - mockDB.On("ApplyScrapeResult", mock.Anything, int64(5), int64(6), companionWriteMatcher(nil)).Return(nil) + mockDB.On("ApplyScrapeResult", mock.Anything, int64(5), int64(6), + companionWriteMatcher( + nil, + []database.TagInfo{{Type: string(tags.TagTypeDeveloper), Tag: "Dev"}}, + companionXMLGameIDProps("1"), + )).Return(nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} From 96a2a407df60ac53277e24f0996504904b03ecab Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Fri, 29 May 2026 06:37:10 +0800 Subject: [PATCH 11/17] fix(scraper): repair companion image metadata --- pkg/api/methods/media_image.go | 46 +++++++++++++++- pkg/api/methods/media_image_test.go | 4 ++ pkg/database/scraper/gamelistxml/scraper.go | 54 ++++++++++++------- .../scraper/gamelistxml/scraper_test.go | 36 +++++++++---- 4 files changed, 111 insertions(+), 29 deletions(-) diff --git a/pkg/api/methods/media_image.go b/pkg/api/methods/media_image.go index 49beb5dc..a683868b 100644 --- a/pkg/api/methods/media_image.go +++ b/pkg/api/methods/media_image.go @@ -27,6 +27,7 @@ import ( "errors" "fmt" "os" + "sort" "github.com/ZaparooProject/zaparoo-core/v2/pkg/api/models" "github.com/ZaparooProject/zaparoo-core/v2/pkg/api/models/requests" @@ -94,6 +95,27 @@ func buildPropsMap(props []database.MediaProperty) map[string]database.MediaProp return m } +func imagePropertyTypeTags(props []database.MediaProperty) []string { + known := make(map[string]struct{}, len(imageTypeTags)) + for _, typeTag := range imageTypeTags { + known[typeTag] = struct{}{} + } + + seen := make(map[string]struct{}) + for _, p := range props { + if _, ok := known[p.TypeTag]; ok { + seen[p.TypeTag] = struct{}{} + } + } + + result := make([]string, 0, len(seen)) + for typeTag := range seen { + result = append(result, typeTag) + } + sort.Strings(result) + return result +} + // HandleMediaImage returns a single best-match image for a media record as a // base64-encoded blob. func HandleMediaImage(env requests.RequestEnv) (any, error) { //nolint:gocritic // single-use parameter in API handler @@ -259,8 +281,18 @@ func selectMediaImage( } } + log.Debug(). + Str("system", row.System.SystemID). + Str("path", row.Path). + Int64("mediaDBID", row.DBID). + Int64("titleDBID", row.Title.DBID). + Strs("prefs", prefs). + Strs("mediaImageProps", imagePropertyTypeTags(mediaProps)). + Strs("titleImageProps", imagePropertyTypeTags(titleProps)). + Msg("media.image: no image found") + return models.MediaImageResponse{}, models.ClientErrf( - "no image found for media: %s/%s", row.System.SystemID, row.Path, + "no image found for media: system=%q path=%q", row.System.SystemID, row.Path, ) } @@ -372,13 +404,25 @@ func deleteStaleMediaImageProperty( if delErr := db.DeleteMediaTitleProperty(ctx, row.Title.DBID, prop.TypeTagDBID); delErr != nil { log.Warn().Err(delErr).Int64("titleDBID", row.Title.DBID).Str("typeTag", typeTag). Msg("media.image: failed to delete stale title property") + return } + log.Debug(). + Int64("titleDBID", row.Title.DBID). + Str("typeTag", typeTag). + Str("text", prop.Text). + Msg("media.image: deleted stale image property") return } if delErr := db.DeleteMediaProperty(ctx, row.DBID, prop.TypeTagDBID); delErr != nil { log.Warn().Err(delErr).Int64("mediaDBID", row.DBID).Str("typeTag", typeTag). Msg("media.image: failed to delete stale media property") + return } + log.Debug(). + Int64("mediaDBID", row.DBID). + Str("typeTag", typeTag). + Str("text", prop.Text). + Msg("media.image: deleted stale image property") } func mediaImageMaxBytes(pl platforms.Platform) int64 { diff --git a/pkg/api/methods/media_image_test.go b/pkg/api/methods/media_image_test.go index 6d4874aa..6001aaf7 100644 --- a/pkg/api/methods/media_image_test.go +++ b/pkg/api/methods/media_image_test.go @@ -350,6 +350,7 @@ func TestHandleMediaImage_NoMatchFound(t *testing.T) { mockDB := testhelpers.NewMockMediaDBI() row := makeMediaFullRow(4, 40) + row.Path = string(filepath.Separator) + filepath.Join("games", "missing.rom") expectMediaImageResolve(mockDB, row) mockDB.On("GetMediaProperties", mock.Anything, int64(4)). Return([]database.MediaProperty{}, nil) @@ -362,6 +363,9 @@ func TestHandleMediaImage_NoMatchFound(t *testing.T) { var clientErr *models.ClientError require.ErrorAs(t, err, &clientErr) + assert.Contains(t, err.Error(), `system="NES"`) + assert.Contains(t, err.Error(), `path="/games/missing.rom"`) + assert.NotContains(t, err.Error(), "NES//games") mockDB.AssertExpectations(t) } diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index a627a33d..6209057d 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -210,6 +210,8 @@ func (g *GamelistXMLScraper) LoadRecords( mediaByPathFold map[string]database.Media, ) ([]*GamelistRecord, error) { var records []*GamelistRecord + candidateMedia := len(mediaByPathFold) + var gamelistFiles, gamelistEntries, companionEntriesSkipped, invalidPaths, unmatchedPaths int outer: for _, rootPath := range system.ROMPaths { @@ -231,6 +233,8 @@ outer: continue } + gamelistFiles++ + gamelistEntries += len(gl.Games) log.Info(). Str("path", gamelistPath). Int("entries", len(gl.Games)). @@ -239,13 +243,20 @@ outer: availableMediaDirs := statMediaDirsFS(g.filesystem(), rootPath) for i := range gl.Games { + if isCompanionGame(&gl.Games[i]) { + companionEntriesSkipped++ + continue + } + resolved := resolveESPath(gl.Games[i].Path, rootPath) if resolved == "" { + invalidPaths++ continue } media, matchedKey, ok := matchMediaByResolvedPath(mediaByPathFold, resolved) if !ok { + unmatchedPaths++ continue } @@ -265,6 +276,14 @@ outer: log.Info(). Str("system", system.ID). + Int("candidate_media", candidateMedia). + Int("gamelist_files", gamelistFiles). + Int("gamelist_entries", gamelistEntries). + Int("companion_entries_skipped", companionEntriesSkipped). + Int("invalid_paths", invalidPaths). + Int("unmatched_paths", unmatchedPaths). + Int("matched_records", len(records)). + Int("remaining_unmatched_media", len(mediaByPathFold)). Int("total_records", len(records)). Msg("gamelistxml: finished loading records for system") @@ -1001,6 +1020,10 @@ func readGameListXMLFS(fs afero.Fs, path string) (*esapi.GameList, error) { // companionSource is the XML source attribute value that marks ZaparooCompanion entries. const companionSource = "ZaparooCompanion" +func isCompanionGame(game *esapi.Game) bool { + return game != nil && (game.Source == companionSource || game.SourceAttr == companionSource) +} + // companionParent holds a ZaparooCompanion parent meta record parsed from a gamelist.xml. // Parent records carry full metadata but no ROM path; they represent the canonical game // title shared by multiple regional ROM releases. @@ -1057,7 +1080,7 @@ func (g *GamelistXMLScraper) loadCompanionEntries( default: } game := gl.Games[i] - if game.Source != companionSource && game.SourceAttr != companionSource { + if !isCompanionGame(&game) { log.Debug().Str("source", game.Source).Msg("source not companion") continue } @@ -1171,9 +1194,20 @@ func (g *GamelistXMLScraper) processCompanionEntries( } sentinel := scraper.SentinelTagInfo("gamelist.xml") - sentinelTag := sentinel.Type + ":" + sentinel.Tag var stats companionStats + defer func() { + log.Info(). + Str("system", system.ID). + Int("parents", len(parents)). + Int("children", len(children)). + Int("processed", stats.Processed). + Int("matched", stats.Matched). + Int("skipped", stats.Skipped). + Bool("force", opts.Force). + Msg("gamelistxml: companion: finished entries") + }() + for _, c := range children { stats.Processed++ meta, ok := parentMeta[c.ParentGameID] @@ -1191,22 +1225,6 @@ func (g *GamelistXMLScraper) processCompanionEntries( } for _, media := range matched { - if !opts.Force { - scraped, tagErr := mdb.MediaHasTag(ctx, media.DBID, sentinelTag) - if tagErr != nil { - log.Warn().Err(tagErr).Int64("mediaDBID", media.DBID). - Msg("gamelistxml: companion: sentinel check failed, skipping child media") - stats.Skipped++ - continue - } - if scraped { - log.Debug().Int64("mediaDBID", media.DBID). - Msg("gamelistxml: companion: child media already scraped, skipping") - stats.Skipped++ - continue - } - } - writeErr := mdb.ApplyScrapeResult(ctx, media.DBID, media.MediaTitleDBID, &database.ScrapeWrite{ Sentinel: sentinel, MediaTags: companionChildTags(c), diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index a91494d9..e1656cb0 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -321,6 +321,29 @@ func TestLoadRecords_SubfolderPathMatch(t *testing.T) { assert.Equal(t, int64(23), records[0].MatchedTitleDBID) } +func TestLoadRecords_SkipsCompanionChildren(t *testing.T) { + t.Parallel() + + root := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + + Parent + + + ./child.nes + +`), 0o600)) + + records, err := (&GamelistXMLScraper{}).LoadRecords( + context.Background(), + scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, + mediaByPath(database.Media{DBID: 12, MediaTitleDBID: 23, Path: filepath.Join(root, "child.nes")}), + ) + require.NoError(t, err) + assert.Empty(t, records, "companion children must be handled only by companion processing") +} + func TestLoadRecords_DoesNotReadNestedGameList(t *testing.T) { t.Parallel() @@ -1688,7 +1711,6 @@ func TestProcessCompanionEntries_ChildByExactPath(t *testing.T) { mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), resolvedPath). Return(&database.Media{DBID: 10, MediaTitleDBID: 20}, nil) - mockDB.On("MediaHasTag", mock.Anything, int64(10), "scraper.gamelist.xml:scraped").Return(false, nil) mockDB.On("ApplyScrapeResult", mock.Anything, int64(10), int64(20), companionWriteMatcher(childTags, titleTags, companionXMLGameIDProps("42"))).Return(nil) @@ -1716,7 +1738,6 @@ func TestProcessCompanionEntries_ChildBySlugFile(t *testing.T) { mockDB := helpers.NewMockMediaDBI() mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{{DBID: 40, MediaTitleDBID: 30}}, nil) mockDB.On("FindMediaTitleBySystemAndSlug", mock.Anything, int64(5), "myslug").Return(title, nil) - mockDB.On("MediaHasTag", mock.Anything, int64(40), "scraper.gamelist.xml:scraped").Return(false, nil) mockDB.On("ApplyScrapeResult", mock.Anything, int64(40), int64(30), companionWriteMatcher( nil, @@ -1731,7 +1752,7 @@ func TestProcessCompanionEntries_ChildBySlugFile(t *testing.T) { mockDB.AssertExpectations(t) } -func TestProcessCompanionEntries_SkipsAlreadyScrapedUnlessForce(t *testing.T) { +func TestProcessCompanionEntries_RewritesAlreadyScraped(t *testing.T) { t.Parallel() root := t.TempDir() require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(companionXML), 0o600)) @@ -1741,12 +1762,12 @@ func TestProcessCompanionEntries_SkipsAlreadyScrapedUnlessForce(t *testing.T) { mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), resolvedPath). Return(&database.Media{DBID: 10, MediaTitleDBID: 20}, nil) - mockDB.On("MediaHasTag", mock.Anything, int64(10), "scraper.gamelist.xml:scraped").Return(true, nil) + mockDB.On("ApplyScrapeResult", mock.Anything, int64(10), int64(20), mock.Anything).Return(nil) s := &GamelistXMLScraper{db: mockDB} system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) - assert.Equal(t, companionStats{Processed: 1, Skipped: 1}, stats) + assert.Equal(t, companionStats{Processed: 1, Matched: 1}, stats) mockDB.AssertExpectations(t) } @@ -1842,7 +1863,6 @@ func TestProcessCompanionEntries_MapsIssue161CompanionArtwork(t *testing.T) { mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), childPath). Return(&database.Media{DBID: 10, MediaTitleDBID: 20}, nil) - mockDB.On("MediaHasTag", mock.Anything, int64(10), "scraper.gamelist.xml:scraped").Return(false, nil) mockDB.On( "ApplyScrapeResult", mock.Anything, int64(10), int64(20), companionArtworkWriteMatcher(t, expectedArtwork), ).Return(nil) @@ -1884,7 +1904,6 @@ func TestProcessCompanionEntries_StatsAggregateWithNormalProgress(t *testing.T) }, nil) mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, systemDBID, companionPath). Return(&database.Media{DBID: companionMediaDBID, MediaTitleDBID: companionTitleDBID}, nil) - mockDB.On("MediaHasTag", mock.Anything, companionMediaDBID, "scraper.gamelist.xml:scraped").Return(false, nil) mockDB.On("ApplyScrapeResult", mock.Anything, companionMediaDBID, companionTitleDBID, mock.Anything).Return(nil) mockDB.On("GetScrapedMediaIDs", mock.Anything, "gamelist.xml", systemDBID). Return(map[int64]struct{}{}, nil) @@ -1944,8 +1963,6 @@ func TestProcessCompanionEntries_EachChildUsesAtomicScrapeWrite(t *testing.T) { Return(&database.Media{DBID: 10, MediaTitleDBID: 20}, nil) mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), child2Path). Return(&database.Media{DBID: 11, MediaTitleDBID: 20}, nil) - mockDB.On("MediaHasTag", mock.Anything, int64(10), "scraper.gamelist.xml:scraped").Return(false, nil) - mockDB.On("MediaHasTag", mock.Anything, int64(11), "scraper.gamelist.xml:scraped").Return(false, nil) mockDB.On("ApplyScrapeResult", mock.Anything, int64(10), int64(20), mock.Anything).Return(nil) mockDB.On("ApplyScrapeResult", mock.Anything, int64(11), int64(20), mock.Anything).Return(nil) @@ -1974,7 +1991,6 @@ func TestProcessCompanionEntries_NoRegionLangStillWritesSentinel(t *testing.T) { mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), gamePath). Return(&database.Media{DBID: 5, MediaTitleDBID: 6}, nil) - mockDB.On("MediaHasTag", mock.Anything, int64(5), "scraper.gamelist.xml:scraped").Return(false, nil) mockDB.On("ApplyScrapeResult", mock.Anything, int64(5), int64(6), companionWriteMatcher( nil, From 5621e37e9c198fc1c33c2ff3ff432da3063aaf97 Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Fri, 29 May 2026 07:11:33 +0800 Subject: [PATCH 12/17] fix(api): preserve image metadata on read --- pkg/api/methods/media_image.go | 42 ++++++++-------------- pkg/api/methods/media_image_test.go | 34 ++++++------------ pkg/api/methods/media_meta.go | 55 +++++++++++++++++++++++------ pkg/api/methods/media_meta_test.go | 41 +++++++++++++++++++++ pkg/api/models/responses.go | 30 ++++++++-------- 5 files changed, 128 insertions(+), 74 deletions(-) diff --git a/pkg/api/methods/media_image.go b/pkg/api/methods/media_image.go index a683868b..287eab4a 100644 --- a/pkg/api/methods/media_image.go +++ b/pkg/api/methods/media_image.go @@ -331,11 +331,11 @@ func loadMediaImageProperty( return nil, false, fmt.Errorf("media.image: read image blob %d: %w", *prop.BlobDBID, err) } if len(binary) == 0 { - deleteStaleMediaImageProperty(ctx, db, row, prop, src.isMedia, typeTag) + logStaleMediaImageProperty(row, prop, src.isMedia, typeTag, "empty blob data") return nil, true, nil } case prop.Text != "": - data, stale, err := loadMediaImageFile(ctx, fs, db, row, prop, src.isMedia, typeTag, maxBytes) + data, stale, err := loadMediaImageFile(fs, row, prop, src.isMedia, typeTag, maxBytes) if stale || err != nil { return nil, stale, err } @@ -353,9 +353,7 @@ func loadMediaImageProperty( } func loadMediaImageFile( - ctx context.Context, fs afero.Fs, - db database.MediaDBI, row *database.MediaFullRow, prop *database.MediaProperty, isMedia bool, @@ -365,7 +363,7 @@ func loadMediaImageFile( info, err := fs.Stat(prop.Text) if err != nil { if errors.Is(err, os.ErrNotExist) { - deleteStaleMediaImageProperty(ctx, db, row, prop, isMedia, typeTag) + logStaleMediaImageProperty(row, prop, isMedia, typeTag, err.Error()) return nil, true, nil } return nil, false, mediaImageReadError(prop.Text, err) @@ -384,7 +382,7 @@ func loadMediaImageFile( data, readErr := readMediaBinaryFile(fs, prop.Text, maxBytes) if readErr != nil { if errors.Is(readErr, os.ErrNotExist) { - deleteStaleMediaImageProperty(ctx, db, row, prop, isMedia, typeTag) + logStaleMediaImageProperty(row, prop, isMedia, typeTag, readErr.Error()) return nil, true, nil } return nil, false, mediaImageReadError(prop.Text, readErr) @@ -392,37 +390,27 @@ func loadMediaImageFile( return data, false, nil } -func deleteStaleMediaImageProperty( - ctx context.Context, - db database.MediaDBI, +func logStaleMediaImageProperty( row *database.MediaFullRow, prop *database.MediaProperty, isMedia bool, typeTag string, + reason string, ) { - if !isMedia { - if delErr := db.DeleteMediaTitleProperty(ctx, row.Title.DBID, prop.TypeTagDBID); delErr != nil { - log.Warn().Err(delErr).Int64("titleDBID", row.Title.DBID).Str("typeTag", typeTag). - Msg("media.image: failed to delete stale title property") - return - } - log.Debug(). - Int64("titleDBID", row.Title.DBID). - Str("typeTag", typeTag). - Str("text", prop.Text). - Msg("media.image: deleted stale image property") - return - } - if delErr := db.DeleteMediaProperty(ctx, row.DBID, prop.TypeTagDBID); delErr != nil { - log.Warn().Err(delErr).Int64("mediaDBID", row.DBID).Str("typeTag", typeTag). - Msg("media.image: failed to delete stale media property") - return + level := "title" + if isMedia { + level = "media" } log.Debug(). + Str("system", row.System.SystemID). + Str("path", row.Path). Int64("mediaDBID", row.DBID). + Int64("titleDBID", row.Title.DBID). Str("typeTag", typeTag). Str("text", prop.Text). - Msg("media.image: deleted stale image property") + Str("source", level). + Str("reason", reason). + Msg("media.image: stale image property ignored") } func mediaImageMaxBytes(pl platforms.Platform) int64 { diff --git a/pkg/api/methods/media_image_test.go b/pkg/api/methods/media_image_test.go index 6001aaf7..bafb04ef 100644 --- a/pkg/api/methods/media_image_test.go +++ b/pkg/api/methods/media_image_test.go @@ -449,11 +449,10 @@ func TestHandleMediaImage_ItemsWithMediaIDReturnsClientError(t *testing.T) { mockDB.AssertExpectations(t) } -// TestHandleMediaImage_FileReadError_DeletesAndContinues verifies that when a -// property's Text path is unreadable, the stale property is deleted (DB + in-memory) -// and the handler continues to the next preference. With no remaining image the -// final result is a ClientError. -func TestHandleMediaImage_FileReadError_DeletesAndContinues(t *testing.T) { +// TestHandleMediaImage_FileReadError_IgnoresStaleProperty verifies that when a +// property's Text path is unreadable, the stale property is ignored in-memory +// without mutating the DB. With no remaining image the final result is a ClientError. +func TestHandleMediaImage_FileReadError_IgnoresStaleProperty(t *testing.T) { t.Parallel() // Use a path inside a real temp dir so the dir exists, but the file does not. @@ -471,17 +470,13 @@ func TestHandleMediaImage_FileReadError_DeletesAndContinues(t *testing.T) { Binary: nil, TypeTagDBID: 42, } - // Properties are fetched once; the iterative loop deletes from the in-memory map. + // Properties are fetched once; the iterative loop removes stale entries from the in-memory map only. mockDB.On("GetMediaTitleProperties", mock.Anything, int64(60)). Return([]database.MediaProperty{staleProp}, nil) mockDB.On("GetMediaProperties", mock.Anything, int64(6)). Return([]database.MediaProperty{}, nil) - // Expect the stale title-level property to be deleted from the DB. - mockDB.On("DeleteMediaTitleProperty", mock.Anything, int64(60), int64(42)). - Return(nil) - env := makeMediaImageEnv(t, mockDB, mediaImageParams(row, `"imageTypes": ["boxart"]`)) _, err := HandleMediaImage(env) require.Error(t, err) @@ -492,9 +487,9 @@ func TestHandleMediaImage_FileReadError_DeletesAndContinues(t *testing.T) { } // TestHandleMediaImage_StaleMedia_FallsBackToTitle verifies that when a -// media-level property has a stale file path, the handler deletes the stale -// entry and falls back to the title-level property for the same TypeTag before -// moving on to the next preference in the list. +// media-level property has a stale file path, the handler ignores the stale +// entry in-memory and falls back to the title-level property for the same TypeTag +// before moving on to the next preference in the list. func TestHandleMediaImage_StaleMedia_FallsBackToTitle(t *testing.T) { t.Parallel() @@ -525,10 +520,6 @@ func TestHandleMediaImage_StaleMedia_FallsBackToTitle(t *testing.T) { mockDB.On("GetMediaTitleProperties", mock.Anything, int64(80)). Return([]database.MediaProperty{titleProp}, nil) - // Expect only the stale media-level property to be deleted. - mockDB.On("DeleteMediaProperty", mock.Anything, int64(8), int64(77)). - Return(nil) - env := makeMediaImageEnv(t, mockDB, mediaImageParams(row, `"imageTypes": ["boxart"]`)) result, err := HandleMediaImage(env) require.NoError(t, err) @@ -541,9 +532,9 @@ func TestHandleMediaImage_StaleMedia_FallsBackToTitle(t *testing.T) { } // TestHandleMediaImage_FileReadError_FallsBackToNextPref verifies that when a -// title-level property's file is unreadable, the stale entry is deleted from the -// DB and the in-memory map, and the handler continues to find the next-preference -// image successfully without a second round-trip. +// title-level property's file is unreadable, the stale entry is ignored in-memory, +// and the handler continues to find the next-preference image successfully without +// a second round-trip or DB mutation. func TestHandleMediaImage_FileReadError_FallsBackToNextPref(t *testing.T) { t.Parallel() @@ -580,9 +571,6 @@ func TestHandleMediaImage_FileReadError_FallsBackToNextPref(t *testing.T) { mockDB.On("GetMediaProperties", mock.Anything, int64(7)). Return([]database.MediaProperty{}, nil) - mockDB.On("DeleteMediaTitleProperty", mock.Anything, int64(70), int64(55)). - Return(nil) - env := makeMediaImageEnv(t, mockDB, mediaImageParams(row, `"imageTypes": ["boxart", "screenshot"]`)) result, err := HandleMediaImage(env) require.NoError(t, err) diff --git a/pkg/api/methods/media_meta.go b/pkg/api/methods/media_meta.go index 900802da..fb12dde2 100644 --- a/pkg/api/methods/media_meta.go +++ b/pkg/api/methods/media_meta.go @@ -21,6 +21,7 @@ package methods import ( "fmt" + "sort" "github.com/ZaparooProject/zaparoo-core/v2/pkg/api/models" "github.com/ZaparooProject/zaparoo-core/v2/pkg/api/models/requests" @@ -148,17 +149,19 @@ func buildMediaMetaResponse( } return models.MediaMetaResponse{Media: models.MediaMetaMediaResponse{ - Path: row.Path, - ParentDir: row.ParentDir, - IsMissing: row.IsMissing, - Tags: mediaTags, - Properties: mapMediaProperties(mediaProps), + Path: row.Path, + ParentDir: row.ParentDir, + IsMissing: row.IsMissing, + Tags: mediaTags, + Properties: mapMediaProperties(mediaProps), + AvailableImageTypes: availableImageTypes(mediaProps), Title: models.MediaMetaTitleResponse{ - Slug: row.Title.Slug, - SecondarySlug: secondarySlug, - Name: row.Title.Name, - SlugLength: row.Title.SlugLength, - SlugWordCount: row.Title.SlugWordCount, + Slug: row.Title.Slug, + SecondarySlug: secondarySlug, + Name: row.Title.Name, + SlugLength: row.Title.SlugLength, + SlugWordCount: row.Title.SlugWordCount, + AvailableImageTypes: availableImageTypes(titleProps), System: models.MediaMetaSystemResponse{ ID: row.System.SystemID, Name: row.System.Name, @@ -169,6 +172,38 @@ func buildMediaMetaResponse( }} } +func availableImageTypes(props []database.MediaProperty) []string { + typesByTag := make(map[string]string, len(imageTypeTags)) + for imageType, typeTag := range imageTypeTags { + typesByTag[typeTag] = imageType + } + + seen := make(map[string]struct{}) + for _, p := range props { + if imageType, ok := typesByTag[p.TypeTag]; ok { + seen[imageType] = struct{}{} + } + } + if len(seen) == 0 { + return nil + } + + result := make([]string, 0, len(seen)) + for _, imageType := range defaultImageTypes { + if _, ok := seen[imageType]; ok { + result = append(result, imageType) + delete(seen, imageType) + } + } + + remaining := make([]string, 0, len(seen)) + for imageType := range seen { + remaining = append(remaining, imageType) + } + sort.Strings(remaining) + return append(result, remaining...) +} + // mapMediaProperties converts a []database.MediaProperty slice into a map keyed // by TypeTag (e.g. "property:description"). Binary payloads are not included. func mapMediaProperties(props []database.MediaProperty) map[string]models.MediaMetaPropertyItem { diff --git a/pkg/api/methods/media_meta_test.go b/pkg/api/methods/media_meta_test.go index 8cf7d20f..a071ef85 100644 --- a/pkg/api/methods/media_meta_test.go +++ b/pkg/api/methods/media_meta_test.go @@ -114,6 +114,47 @@ func TestMapMediaProperties_InfersPathContentType(t *testing.T) { assert.Equal(t, "png", *prop.Extension) } +func TestAvailableImageTypes_UsesDefaultOrder(t *testing.T) { + t.Parallel() + + got := availableImageTypes([]database.MediaProperty{ + {TypeTag: "property:image-wheel"}, + {TypeTag: "property:image-boxart3d"}, + {TypeTag: "property:description"}, + {TypeTag: "property:image-boxart"}, + {TypeTag: "property:image-boxartback"}, + }) + + assert.Equal(t, []string{"boxart", "boxart3d", "wheel", "boxartback"}, got) +} + +func TestHandleMediaMeta_AvailableImageTypes(t *testing.T) { + t.Parallel() + + mockDB := testhelpers.NewMockMediaDBI() + row := makeMediaFullRow(12, 120) + expectMediaMetaResolve(mockDB, row) + mockDB.On("GetMediaTagsByMediaDBID", mock.Anything, int64(12)).Return([]database.TagInfo{}, nil) + mockDB.On("GetMediaTitleTagsByMediaTitleDBID", mock.Anything, int64(120)).Return([]database.TagInfo{}, nil) + mockDB.On("GetMediaProperties", mock.Anything, int64(12)).Return([]database.MediaProperty{ + {TypeTag: "property:image-screenshot"}, + }, nil) + mockDB.On("GetMediaTitleProperties", mock.Anything, int64(120)).Return([]database.MediaProperty{ + {TypeTag: "property:image-wheel"}, + {TypeTag: "property:image-boxart"}, + }, nil) + + env := makeMediaMetaEnv(t, mockDB, mediaMetaParams(row)) + result, err := HandleMediaMeta(env) + require.NoError(t, err) + + resp, ok := result.(models.MediaMetaResponse) + require.True(t, ok) + assert.Equal(t, []string{"screenshot"}, resp.Media.AvailableImageTypes) + assert.Equal(t, []string{"boxart", "wheel"}, resp.Media.Title.AvailableImageTypes) + mockDB.AssertExpectations(t) +} + func TestHandleMediaMeta_SecondarySlug(t *testing.T) { t.Parallel() diff --git a/pkg/api/models/responses.go b/pkg/api/models/responses.go index e3620c4f..1c90730b 100644 --- a/pkg/api/models/responses.go +++ b/pkg/api/models/responses.go @@ -245,24 +245,26 @@ type MediaMetaSystemResponse struct { // MediaMetaTitleResponse is the MediaTitle sub-object within a media.meta response, // with its own level-separated tags and properties. type MediaMetaTitleResponse struct { - SecondarySlug *string `json:"secondarySlug,omitempty"` - Properties map[string]MediaMetaPropertyItem `json:"properties"` - System MediaMetaSystemResponse `json:"system"` - Slug string `json:"slug"` - Name string `json:"name"` - Tags []database.TagInfo `json:"tags"` - SlugLength int `json:"slugLength"` - SlugWordCount int `json:"slugWordCount"` + SecondarySlug *string `json:"secondarySlug,omitempty"` + Properties map[string]MediaMetaPropertyItem `json:"properties"` + System MediaMetaSystemResponse `json:"system"` + Slug string `json:"slug"` + Name string `json:"name"` + Tags []database.TagInfo `json:"tags"` + AvailableImageTypes []string `json:"availableImageTypes,omitempty"` + SlugLength int `json:"slugLength"` + SlugWordCount int `json:"slugWordCount"` } // MediaMetaMediaResponse is the top-level Media object in a media.meta response. type MediaMetaMediaResponse struct { - Properties map[string]MediaMetaPropertyItem `json:"properties"` - Path string `json:"path"` - ParentDir string `json:"parentDir"` - Tags []database.TagInfo `json:"tags"` - Title MediaMetaTitleResponse `json:"title"` - IsMissing bool `json:"isMissing"` + Properties map[string]MediaMetaPropertyItem `json:"properties"` + Path string `json:"path"` + ParentDir string `json:"parentDir"` + Tags []database.TagInfo `json:"tags"` + AvailableImageTypes []string `json:"availableImageTypes,omitempty"` + Title MediaMetaTitleResponse `json:"title"` + IsMissing bool `json:"isMissing"` } // MediaMetaResponse is the response envelope for the media.meta method. From 0cfaee29b90907ae3d7cef7f45b251dd3bf45cfd Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Fri, 29 May 2026 08:00:44 +0800 Subject: [PATCH 13/17] fix(scraper): restore hybrid gamelist matching --- docs/scraper.md | 57 ++--- pkg/database/scraper/gamelistxml/scraper.go | 211 +++++++++++++++--- .../scraper/gamelistxml/scraper_test.go | 157 +++++++++---- 3 files changed, 323 insertions(+), 102 deletions(-) diff --git a/docs/scraper.md b/docs/scraper.md index 92c88f2f..7c7ccb67 100644 --- a/docs/scraper.md +++ b/docs/scraper.md @@ -38,15 +38,18 @@ For each system, the normal loop: 1. Resolves target systems from indexed MediaDB systems and platform launcher paths. 2. Runs ZaparooCompanion processing first. This is a special path; see [ZaparooCompanion Entries](#zaparoocompanion-entries). -3. Loads indexed media rows for the system. -4. With `force=false`, removes media rows that already have sentinel tag `scraper.gamelist.xml:scraped`. -5. Loads `gamelist.xml` from each ROM root. -6. Resolves each `` path under its ROM root. -7. Matches the resolved path to an existing Media row with case-insensitive path matching. -8. Maps XML fields to media-level tags plus title-level tags/properties. -9. Writes metadata through `MediaDB.ApplyScrapeResult`. -10. Writes the scraper sentinel tag to the matched Media row last inside the same transaction. -11. Emits progress updates and a final done update. +3. Loads eligible indexed titles for slug matching (`force=true` loads all titles; otherwise titles without the scraper sentinel are loaded). +4. Loads indexed media rows for the system. +5. With `force=false`, removes media rows that already have sentinel tag `scraper.gamelist.xml:scraped` from path fallback candidates. +6. Loads `gamelist.xml` from each ROM root. +7. Resolves each `` path under its ROM root. +8. Computes the same display-name slug used by the original scraper and prefers that title match. +9. Uses the resolved path to select the concrete Media row for the slug-matched title when possible; otherwise falls back to the first Media row for that title. +10. If slug matching fails, falls back to case-insensitive path matching so otherwise missed records can still scrape. +11. Maps XML fields to media-level tags/properties plus title-level shared tags/properties. +12. Writes metadata through `MediaDB.ApplyScrapeResult`. +13. Writes the scraper sentinel tag to the selected Media row last inside the same transaction. +14. Emits progress updates and a final done update. The sentinel tag format is `scraper.:scraped`, for example `scraper.gamelist.xml:scraped`. Writing it last is intentional: if a normal record write fails, the transaction rolls back and the missing sentinel leaves that media row eligible for retry. @@ -60,8 +63,8 @@ The DB supports tags/properties at both media and title scope. Normal `gamelist. |---|---|---| | `MediaTags` | ROM-level variant metadata | region, lang, scraper sentinel | | `MediaTitleTags` | Title-level shared metadata | developer, publisher, year, rating, genre, players, arcadeboard, gamefamily | -| `MediaTitleProperties` | Title-level static content | description, artwork paths, video path, manual path, XML game ID | -| `MediaProperties` | ROM-level static content | Supported by DB helpers, not currently written by normal `gamelist.xml` entries | +| `MediaTitleProperties` | Title-level shared static content | description, XML game ID | +| `MediaProperties` | ROM-level static content | artwork paths, video path, manual path for normal `gamelist.xml` entries | Tag exclusivity is controlled by `TagTypes.IsExclusive`. Exclusive types replace existing values for that type; additive types accumulate distinct values. The scraper write path groups tags by type and applies that behavior in `upsertTags`. @@ -69,15 +72,17 @@ Property rows are keyed by entity and property type tag. Re-scraping the same pr Path-backed properties persist their text path and optional `BlobDBID`; the property tables do not persist the `ContentType` computed by the mapper for path values. Blob-backed properties expose content type from `MediaBlobs`. API responses infer path-backed content type and extension from the stored path when DB content type is empty. +Normal gamelist artwork is media-level so regional or language variants can carry different image paths while sharing one title. `media.image` checks media-level properties before title-level properties. + ## Media-level Sentinel Invariant -Normal `gamelist.xml` entries are matched to concrete Media rows by path. The sentinel is written to the same Media row that receives ROM-level tags such as `region` and `lang`. +Normal `gamelist.xml` entries prefer slug/title matching, then use path matching to select the concrete Media row when possible. If no slug match exists, path-only fallback can still select a Media row. The sentinel is written to the same Media row that receives ROM-level tags such as `region` and `lang` plus ROM-level file properties such as artwork. -Title metadata remains shared by `MediaTitleDBID`, so multiple ROM variants can write the same title-level tags/properties. Rewrites are idempotent: exclusive title tags replace same-type values, additive tags are inserted-or-ignored, and properties upsert by type. +Title metadata remains shared by `MediaTitleDBID`, so multiple ROM variants can write the same title-level tags/properties. Rewrites are idempotent: exclusive title tags replace same-type values, additive tags are inserted-or-ignored, and properties upsert by type. Media-level properties upsert per concrete Media row, preventing regional artwork variants from overwriting each other. ## gamelist.xml Behavior -`GamelistXMLScraper` scans each system ROM root for `gamelist.xml`. Regular `` entries are resolved to absolute paths under the system ROM root and matched to existing Media rows by case-insensitive path. Scrapers do not create `Media` or `MediaTitle` rows. +`GamelistXMLScraper` scans each system ROM root for `gamelist.xml`. Regular `` entries are resolved to absolute paths under the system ROM root. The scraper first matches the entry to an existing title by the original display-name slug behavior, then uses the resolved path to choose the concrete Media row for that title when possible. If no slug match exists, it falls back to case-insensitive path matching. Scrapers do not create `Media` or `MediaTitle` rows. Path handling: @@ -105,18 +110,18 @@ Source fields are cleaned before mapping: HTML entities are unescaped, tab/newli | `family` | `MediaTitleTags: gamefamily` | Additive | | `desc` | `MediaTitleProperties: property:description` | Plain text | | ScreenScraper game ID | `MediaTitleProperties: property:xml-game-id` | From XML attribute or element value | -| `image` | `MediaTitleProperties: property:image-image` | XML path or filesystem fallback | -| `thumbnail` | `MediaTitleProperties: property:image-thumbnail` | Cover/thumbnail path in most ES forks | -| `boxart2d` | `MediaTitleProperties: property:image-boxart` | XML path or filesystem fallback | -| `boxart3d` | `MediaTitleProperties: property:image-boxart3d` | XML path or filesystem fallback | -| `screenshot` | `MediaTitleProperties: property:image-screenshot` | XML path or filesystem fallback | -| `video` | `MediaTitleProperties: property:video` | Filesystem path | -| `marquee` | `MediaTitleProperties: property:image-marquee` | XML path or filesystem fallback | -| `logo` / `wheel` | `MediaTitleProperties: property:image-wheel` | `logo` takes priority over `wheel`; XML path or filesystem fallback | -| `fanart` | `MediaTitleProperties: property:image-fanart` | XML path or filesystem fallback | -| `titlescreen` / `titleshot` | `MediaTitleProperties: property:image-titleshot` | `titlescreen` takes priority over `titleshot`; XML path or filesystem fallback | -| `map` | `MediaTitleProperties: property:image-map` | XML path or filesystem fallback | -| `manual` | `MediaTitleProperties: property:manual` | PDF path | +| `image` | `MediaProperties: property:image-image` | XML path or filesystem fallback | +| `thumbnail` | `MediaProperties: property:image-thumbnail` | Cover/thumbnail path in most ES forks | +| `boxart2d` | `MediaProperties: property:image-boxart` | XML path or filesystem fallback | +| `boxart3d` | `MediaProperties: property:image-boxart3d` | XML path or filesystem fallback | +| `screenshot` | `MediaProperties: property:image-screenshot` | XML path or filesystem fallback | +| `video` | `MediaProperties: property:video` | Filesystem path | +| `marquee` | `MediaProperties: property:image-marquee` | XML path or filesystem fallback | +| `logo` / `wheel` | `MediaProperties: property:image-wheel` | `logo` takes priority over `wheel`; XML path or filesystem fallback | +| `fanart` | `MediaProperties: property:image-fanart` | XML path or filesystem fallback | +| `titlescreen` / `titleshot` | `MediaProperties: property:image-titleshot` | `titlescreen` takes priority over `titleshot`; XML path or filesystem fallback | +| `map` | `MediaProperties: property:image-map` | XML path or filesystem fallback | +| `manual` | `MediaProperties: property:manual` | PDF path | Filesystem fallback searches known subdirectories under `/media/` when an XML path is absent. For games in subfolders, it searches the mirrored ROM-relative path before the flat filename; for example `./Japan/Game.nes` checks `media/images/Japan/Game.png` before `media/images/Game.png`. Side/back box art are filesystem-fallback only. diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index 6209057d..049d8f91 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -94,6 +94,12 @@ type companionStats struct { Skipped int } +type loadRecordIndexes struct { + TitlesBySlug map[string]database.MediaTitle + MediaByPathFold map[string]database.Media + MediaByTitleDBID map[int64][]database.Media +} + func (g *GamelistXMLScraper) filesystem() afero.Fs { if g == nil || g.fs == nil { return afero.NewOsFs() @@ -201,17 +207,19 @@ func resolveSystemsFromPlatform( } // LoadRecords iterates gamelist.xml files found under each ROM root path for -// the given system. Each game entry is resolved to an absolute path and matched -// to an existing Media row by case-insensitive path. The matched Media row -// supplies both the per-ROM write target and parent MediaTitle DBID. +// the given system. It prefers the original slug/title match, uses the XML path +// to select the concrete Media row for that title when possible, and falls back +// to path-only matching for records whose slug is not indexed. func (g *GamelistXMLScraper) LoadRecords( ctx context.Context, system scraper.ScrapeSystem, - mediaByPathFold map[string]database.Media, + indexes loadRecordIndexes, ) ([]*GamelistRecord, error) { var records []*GamelistRecord - candidateMedia := len(mediaByPathFold) - var gamelistFiles, gamelistEntries, companionEntriesSkipped, invalidPaths, unmatchedPaths int + candidateMedia := len(indexes.MediaByPathFold) + candidateTitles := len(indexes.TitlesBySlug) + var gamelistFiles, gamelistEntries, companionEntriesSkipped, invalidPaths int + var slugMatches, slugPathSelections, slugFirstMediaFallbacks, pathOnlyFallbacks, unmatchedRecords int outer: for _, rootPath := range system.ROMPaths { @@ -243,32 +251,78 @@ outer: availableMediaDirs := statMediaDirsFS(g.filesystem(), rootPath) for i := range gl.Games { - if isCompanionGame(&gl.Games[i]) { + game := &gl.Games[i] + if isCompanionGame(game) { companionEntriesSkipped++ continue } - resolved := resolveESPath(gl.Games[i].Path, rootPath) + resolved := resolveESPath(game.Path, rootPath) if resolved == "" { invalidPaths++ continue } - media, matchedKey, ok := matchMediaByResolvedPath(mediaByPathFold, resolved) - if !ok { - unmatchedPaths++ - continue + pf := mediascanner.GetPathFragments(&mediascanner.PathFragmentParams{ + Config: g.cfg, + Path: resolved, + SystemID: system.ID, + NoExt: true, + ProvidedName: game.Name, + }) + + if title, ok := indexes.TitlesBySlug[pf.Slug]; ok { + media, matchedKey := selectMediaForSlugMatch(indexes, title.DBID, resolved) + if media.DBID == 0 { + log.Debug(). + Str("system", system.ID). + Str("slug", pf.Slug). + Int64("mediaTitleDBID", title.DBID). + Msg("gamelistxml: slug matched title but no media row found, skipping") + unmatchedRecords++ + delete(indexes.TitlesBySlug, pf.Slug) + continue + } + + slugMatches++ + if matchedKey != "" { + slugPathSelections++ + delete(indexes.MediaByPathFold, matchedKey) + } else { + slugFirstMediaFallbacks++ + } + records = append(records, &GamelistRecord{ + SystemRootPath: rootPath, + AvailableMediaDirs: availableMediaDirs, + Game: *game, + MatchedTitleDBID: title.DBID, + MatchedMediaDBID: media.DBID, + }) + delete(indexes.TitlesBySlug, pf.Slug) + } else if media, matchedKey, ok := matchMediaByResolvedPath(indexes.MediaByPathFold, resolved); ok { + pathOnlyFallbacks++ + log.Debug(). + Str("system", system.ID). + Str("path", game.Path). + Str("resolved", resolved). + Str("name", game.Name). + Str("slug", pf.Slug). + Int64("mediaDBID", media.DBID). + Int64("mediaTitleDBID", media.MediaTitleDBID). + Msg("gamelistxml: path-only fallback matched record") + records = append(records, &GamelistRecord{ + SystemRootPath: rootPath, + AvailableMediaDirs: availableMediaDirs, + Game: *game, + MatchedTitleDBID: media.MediaTitleDBID, + MatchedMediaDBID: media.DBID, + }) + delete(indexes.MediaByPathFold, matchedKey) + } else { + unmatchedRecords++ } - records = append(records, &GamelistRecord{ - SystemRootPath: rootPath, - AvailableMediaDirs: availableMediaDirs, - Game: gl.Games[i], - MatchedTitleDBID: media.MediaTitleDBID, - MatchedMediaDBID: media.DBID, - }) - delete(mediaByPathFold, matchedKey) - if len(mediaByPathFold) == 0 { + if len(indexes.TitlesBySlug) == 0 && len(indexes.MediaByPathFold) == 0 { break outer } } @@ -276,14 +330,20 @@ outer: log.Info(). Str("system", system.ID). + Int("candidate_titles", candidateTitles). Int("candidate_media", candidateMedia). Int("gamelist_files", gamelistFiles). Int("gamelist_entries", gamelistEntries). Int("companion_entries_skipped", companionEntriesSkipped). Int("invalid_paths", invalidPaths). - Int("unmatched_paths", unmatchedPaths). + Int("slug_matches", slugMatches). + Int("slug_path_selections", slugPathSelections). + Int("slug_first_media_fallbacks", slugFirstMediaFallbacks). + Int("path_only_fallbacks", pathOnlyFallbacks). + Int("unmatched_records", unmatchedRecords). Int("matched_records", len(records)). - Int("remaining_unmatched_media", len(mediaByPathFold)). + Int("remaining_unmatched_titles", len(indexes.TitlesBySlug)). + Int("remaining_unmatched_media", len(indexes.MediaByPathFold)). Int("total_records", len(records)). Msg("gamelistxml: finished loading records for system") @@ -340,6 +400,51 @@ func (g *GamelistXMLScraper) scrapeLoop( totalMatched += companion.Matched totalSkipped += companion.Skipped + titlesBySlug := make(map[string]database.MediaTitle) + if opts.Force { + allTitles, titlesErr := mdb.GetTitlesBySystemID(system.ID) + if titlesErr != nil { + if errors.Is(titlesErr, context.Canceled) || errors.Is(titlesErr, context.DeadlineExceeded) { + ch <- scraper.ScrapeUpdate{ + SystemID: system.ID, Done: true, Processed: totalProcessed, + Matched: totalMatched, Skipped: totalSkipped, + } + return + } + ch <- scraper.ScrapeUpdate{ + SystemID: system.ID, FatalErr: titlesErr, + Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, + } + return + } + for _, t := range allTitles { + titlesBySlug[t.Slug] = database.MediaTitle{ + DBID: t.DBID, SystemDBID: t.SystemDBID, Slug: t.Slug, Name: t.Name, + } + } + } else { + sentinel := scraper.SentinelTagInfo(id) + sentinelTag := sentinel.Type + ":" + sentinel.Tag + unscraped, titlesErr := mdb.FindMediaTitlesWithoutSentinel(ctx, system.DBID, sentinelTag) + if titlesErr != nil { + if errors.Is(titlesErr, context.Canceled) || errors.Is(titlesErr, context.DeadlineExceeded) { + ch <- scraper.ScrapeUpdate{ + SystemID: system.ID, Done: true, Processed: totalProcessed, + Matched: totalMatched, Skipped: totalSkipped, + } + return + } + ch <- scraper.ScrapeUpdate{ + SystemID: system.ID, FatalErr: titlesErr, + Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, + } + return + } + for _, t := range unscraped { + titlesBySlug[t.Slug] = t + } + } + allMedia, mediaErr := mdb.GetMediaBySystemID(system.ID) if mediaErr != nil { if errors.Is(mediaErr, context.Canceled) || errors.Is(mediaErr, context.DeadlineExceeded) { @@ -375,22 +480,27 @@ func (g *GamelistXMLScraper) scrapeLoop( } } - mediaByPathFold := make(map[string]database.Media, len(allMedia)) + indexes := loadRecordIndexes{ + TitlesBySlug: titlesBySlug, + MediaByPathFold: make(map[string]database.Media, len(allMedia)), + MediaByTitleDBID: make(map[int64][]database.Media, len(allMedia)), + } for _, m := range allMedia { - if _, scraped := scrapedIDs[m.DBID]; scraped { - continue - } - mediaByPathFold[pathFoldKey(m.Path)] = database.Media{ + media := database.Media{ DBID: m.DBID, MediaTitleDBID: m.MediaTitleDBID, Path: m.Path, } + if _, scraped := scrapedIDs[m.DBID]; !scraped { + indexes.MediaByPathFold[pathFoldKey(m.Path)] = media + } + indexes.MediaByTitleDBID[m.MediaTitleDBID] = append(indexes.MediaByTitleDBID[m.MediaTitleDBID], media) } - if len(mediaByPathFold) == 0 { + if len(indexes.TitlesBySlug) == 0 && len(indexes.MediaByPathFold) == 0 { continue } - records, loadErr := g.LoadRecords(ctx, system, mediaByPathFold) + records, loadErr := g.LoadRecords(ctx, system, indexes) if loadErr != nil { if errors.Is(loadErr, context.Canceled) || errors.Is(loadErr, context.DeadlineExceeded) { ch <- scraper.ScrapeUpdate{ @@ -487,6 +597,7 @@ func (g *GamelistXMLScraper) scrapeLoop( &database.ScrapeWrite{ Sentinel: scraper.SentinelTagInfo(id), MediaTags: mapped.MediaTags, + MediaProps: mapped.MediaProps, TitleTags: mapped.TitleTags, TitleProps: mapped.TitleProps, }) @@ -528,6 +639,7 @@ func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { var mediaTags []database.TagInfo var titleTags []database.TagInfo var titleProps []database.MediaProperty + var mediaProps []database.MediaProperty game := record.Game // Normalise all string fields before mapping: unescape HTML entities, @@ -601,7 +713,7 @@ func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { titleTags = append(titleTags, database.TagInfo{Type: string(tags.TagTypeGameFamily), Tag: game.Family}) } - // --- MediaTitleProperties: title-level static content --- + // --- MediaTitleProperties: title-level shared static content --- propType := string(tags.TagTypeProperty) root := record.SystemRootPath @@ -635,7 +747,7 @@ func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { ) } if p != nil { - titleProps = append(titleProps, *p) + mediaProps = append(mediaProps, *p) } } @@ -663,14 +775,15 @@ func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { appendImageProp(tags.TagPropertyImageMap, game.Map) if p := pathProp(propType+":"+string(tags.TagPropertyVideo), game.Video, root); p != nil { - titleProps = append(titleProps, *p) + mediaProps = append(mediaProps, *p) } if p := pathProp(propType+":"+string(tags.TagPropertyManual), game.Manual, root); p != nil { - titleProps = append(titleProps, *p) + mediaProps = append(mediaProps, *p) } return scraper.MapResult{ MediaTags: mediaTags, + MediaProps: mediaProps, TitleTags: titleTags, TitleProps: titleProps, } @@ -963,6 +1076,31 @@ func findMediaFilePropFS( return nil } +func selectMediaForSlugMatch( + indexes loadRecordIndexes, + mediaTitleDBID int64, + resolved string, +) (media database.Media, matchedKey string) { + media, matchedKey, ok := matchMediaByResolvedPath(indexes.MediaByPathFold, resolved) + if ok && media.MediaTitleDBID == mediaTitleDBID { + return media, matchedKey + } + if ok { + log.Debug(). + Str("resolved", resolved). + Int64("pathMediaDBID", media.DBID). + Int64("pathMediaTitleDBID", media.MediaTitleDBID). + Int64("slugMediaTitleDBID", mediaTitleDBID). + Msg("gamelistxml: slug match path points at different title, using slug title media") + } + + mediaRows := indexes.MediaByTitleDBID[mediaTitleDBID] + if len(mediaRows) == 0 { + return database.Media{}, "" + } + return mediaRows[0], "" +} + func matchMediaByResolvedPath( mediaByPathFold map[string]database.Media, resolved string, @@ -1136,11 +1274,14 @@ func (g *GamelistXMLScraper) loadCompanionEntries( // record. MapToDB is safe with an empty Game.Path; the stem becomes "." which is // rejected by findMediaFilePropFS, so filesystem fallbacks are skipped cleanly. func (g *GamelistXMLScraper) mapCompanionParentToResult(p *companionParent) scraper.MapResult { - return g.MapToDB(&GamelistRecord{ + result := g.MapToDB(&GamelistRecord{ SystemRootPath: p.SystemRootPath, AvailableMediaDirs: p.AvailableMediaDirs, Game: p.Game, }) + result.TitleProps = append(result.TitleProps, result.MediaProps...) + result.MediaProps = nil + return result } // processCompanionEntries handles ZaparooCompanion-sourced entries in gamelist.xml. diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index e1656cb0..6547b0ad 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -265,12 +265,23 @@ func TestMimeFromExt_Unknown(t *testing.T) { // --- LoadRecords --- -func mediaByPath(rows ...database.Media) map[string]database.Media { - result := make(map[string]database.Media, len(rows)) +func mediaByPath(rows ...database.Media) loadRecordIndexes { + indexes := loadRecordIndexes{ + TitlesBySlug: map[string]database.MediaTitle{}, + MediaByPathFold: make(map[string]database.Media, len(rows)), + MediaByTitleDBID: make(map[int64][]database.Media, len(rows)), + } for _, row := range rows { - result[pathFoldKey(row.Path)] = row + indexes.MediaByPathFold[pathFoldKey(row.Path)] = row + indexes.MediaByTitleDBID[row.MediaTitleDBID] = append(indexes.MediaByTitleDBID[row.MediaTitleDBID], row) } - return result + return indexes +} + +func mediaBySlugAndPath(slug string, title *database.MediaTitle, rows ...database.Media) loadRecordIndexes { + indexes := mediaByPath(rows...) + indexes.TitlesBySlug[slug] = *title + return indexes } func TestLoadRecords_PathMatch(t *testing.T) { @@ -375,15 +386,63 @@ func TestLoadRecords_TitleNameDoesNotNeedSlugMatch(t *testing.T) { records, err := (&GamelistXMLScraper{}).LoadRecords( context.Background(), scraper.ScrapeSystem{ID: "NeoGeo", ROMPaths: []string{root}}, - mediaByPath(database.Media{DBID: 7, MediaTitleDBID: 42, Path: filepath.Join(root, "mslug.zip")}), + mediaBySlugAndPath("metal-slug", &database.MediaTitle{DBID: 42, Slug: "metal-slug"}, + database.Media{DBID: 7, MediaTitleDBID: 42, Path: filepath.Join(root, "mslug.zip")}), ) require.NoError(t, err) - require.Len(t, records, 1, "path match should work even when title slug derives from display name") + require.Len(t, records, 1, "slug match should work even when file basename differs from display name") assert.Equal(t, "Metal Slug", records[0].Game.Name) assert.Equal(t, int64(42), records[0].MatchedTitleDBID) assert.Equal(t, int64(7), records[0].MatchedMediaDBID) } +func TestLoadRecords_SlugMatchSelectsExactMediaPath(t *testing.T) { + t.Parallel() + + root := t.TempDir() + usaPath := filepath.Join(root, "USA", "Game.nes") + japanPath := filepath.Join(root, "Japan", "Game.nes") + require.NoError(t, os.MkdirAll(filepath.Dir(usaPath), 0o750)) + require.NoError(t, os.MkdirAll(filepath.Dir(japanPath), 0o750)) + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + ./Japan/Game.nesGame +`), 0o600)) + + records, err := (&GamelistXMLScraper{}).LoadRecords( + context.Background(), + scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, + mediaBySlugAndPath("game", &database.MediaTitle{DBID: 50, Slug: "game"}, + database.Media{DBID: 60, MediaTitleDBID: 50, Path: usaPath}, + database.Media{DBID: 61, MediaTitleDBID: 50, Path: japanPath}), + ) + require.NoError(t, err) + require.Len(t, records, 1) + assert.Equal(t, int64(50), records[0].MatchedTitleDBID) + assert.Equal(t, int64(61), records[0].MatchedMediaDBID) +} + +func TestLoadRecords_PathOnlyFallbackWhenSlugMissing(t *testing.T) { + t.Parallel() + + root := t.TempDir() + gamePath := filepath.Join(root, "odd-name.nes") + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + ./odd-name.nesUnindexed Display Name +`), 0o600)) + + records, err := (&GamelistXMLScraper{}).LoadRecords( + context.Background(), + scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, + mediaByPath(database.Media{DBID: 70, MediaTitleDBID: 80, Path: gamePath}), + ) + require.NoError(t, err) + require.Len(t, records, 1) + assert.Equal(t, int64(80), records[0].MatchedTitleDBID) + assert.Equal(t, int64(70), records[0].MatchedMediaDBID) +} + func TestLoadRecords_ZipAsDirChildMatch(t *testing.T) { t.Parallel() @@ -524,7 +583,7 @@ func TestLoadRecords_ContextCancellation(t *testing.T) { _, err := (&GamelistXMLScraper{}).LoadRecords( ctx, scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, - map[string]database.Media{}, + mediaByPath(), ) require.ErrorIs(t, err, context.Canceled) } @@ -612,14 +671,21 @@ func TestMapToDB_FullGame(t *testing.T) { // Title-level properties descPropKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyDescription) - imgPropKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) - videoPropKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyVideo) - var foundDesc, foundImg, foundVideo bool + var foundDesc bool for _, p := range result.TitleProps { - switch p.TypeTag { - case descPropKey: + if p.TypeTag == descPropKey { foundDesc = true assert.Equal(t, "A classic platformer.", p.Text) + } + } + assert.True(t, foundDesc, "description property missing") + + // Media-level properties + imgPropKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) + videoPropKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyVideo) + var foundImg, foundVideo bool + for _, p := range result.MediaProps { + switch p.TypeTag { case imgPropKey: foundImg = true assert.Equal(t, filepath.ToSlash(filepath.Join(root, "media", "images", "mario.png")), p.Text) @@ -628,11 +694,8 @@ func TestMapToDB_FullGame(t *testing.T) { assert.Equal(t, filepath.ToSlash(filepath.Join(root, "media", "videos", "mario.mp4")), p.Text) } } - assert.True(t, foundDesc, "description property missing") assert.True(t, foundImg, "image property missing") assert.True(t, foundVideo, "video property missing") - - assert.Empty(t, result.MediaProps, "gamelistxml scraper writes no media-level properties") } func TestMapToDB_EmptyGame_NoTags(t *testing.T) { @@ -759,9 +822,9 @@ func TestMapToDB_ContentType_Image(t *testing.T) { SystemRootPath: root, Game: esapi.Game{Image: "./images/mario.png"}, } - titleProps := (&GamelistXMLScraper{}).MapToDB(&rec).TitleProps + mediaProps := (&GamelistXMLScraper{}).MapToDB(&rec).MediaProps propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) - for _, p := range titleProps { + for _, p := range mediaProps { if p.TypeTag == propKey { assert.Equal(t, "image/png", p.ContentType) return @@ -872,7 +935,7 @@ func TestMapToDB_NestedExplicitImagePath(t *testing.T) { propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) var found bool - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, filepath.ToSlash(filepath.Join(root, "media", "images", "Japan", "Game.png")), p.Text) @@ -902,7 +965,7 @@ func TestMapToDB_FilesystemFallback_Image(t *testing.T) { propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) var found bool - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, filepath.ToSlash(filepath.Join(imgDir, "mario.png")), p.Text) @@ -933,7 +996,7 @@ func TestMapToDB_FilesystemFallback_NestedGamePath(t *testing.T) { propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) var found bool - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, filepath.ToSlash(imgPath), p.Text) @@ -963,7 +1026,7 @@ func TestMapToDB_FilesystemFallback_NestedWinsBeforeFlat(t *testing.T) { propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) var found bool - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, filepath.ToSlash(nestedPath), p.Text) @@ -991,7 +1054,7 @@ func TestMapToDB_FilesystemFallback_ThumbnailBox2DFrontAlias(t *testing.T) { propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageThumbnail) var found bool - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, filepath.ToSlash(thumbPath), p.Text) @@ -1018,7 +1081,7 @@ func TestMapToDB_FilesystemFallback_Boxart(t *testing.T) { propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageBoxart) var found bool - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, filepath.ToSlash(filepath.Join(boxartDir, "sonic.png")), p.Text) @@ -1054,7 +1117,7 @@ func TestMapToDB_FilesystemFallback_XMLWins(t *testing.T) { propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) var count int - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { if p.TypeTag == propKey { count++ assert.Equal(t, filepath.ToSlash(xmlImg), p.Text, "XML path should take priority over filesystem fallback") @@ -1097,7 +1160,7 @@ func TestMapToDB_Boxart3D_XMLPath(t *testing.T) { propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageBoxart3D) var found bool - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, filepath.ToSlash(img), p.Text) @@ -1131,7 +1194,7 @@ func TestMapToDB_Boxart2D_And_Boxart3D_AreIndependent(t *testing.T) { key2d := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageBoxart) key3d := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageBoxart3D) found2d, found3d := false, false - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { switch p.TypeTag { case key2d: found2d = true @@ -1163,7 +1226,7 @@ func TestMapToDB_FilesystemFallback_Boxart3D(t *testing.T) { propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageBoxart3D) var found bool - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, filepath.ToSlash(filepath.Join(dir, "sonic.png")), p.Text) @@ -1190,7 +1253,7 @@ func TestMapToDB_FilesystemFallback_BoxartSide(t *testing.T) { propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageBoxartSide) var found bool - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, filepath.ToSlash(filepath.Join(dir, "sonic.png")), p.Text) @@ -1217,7 +1280,7 @@ func TestMapToDB_FilesystemFallback_BoxartBack(t *testing.T) { propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageBoxartBack) var found bool - for _, p := range result.TitleProps { + for _, p := range result.MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, filepath.ToSlash(filepath.Join(dir, "sonic.png")), p.Text) @@ -1299,10 +1362,10 @@ func TestMapToDB_Manual(t *testing.T) { SystemRootPath: root, Game: esapi.Game{Manual: "./manuals/game.pdf"}, } - titleProps := (&GamelistXMLScraper{}).MapToDB(&rec).TitleProps + mediaProps := (&GamelistXMLScraper{}).MapToDB(&rec).MediaProps propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyManual) var found bool - for _, p := range titleProps { + for _, p := range mediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, "application/pdf", p.ContentType) @@ -1324,7 +1387,7 @@ func TestMapToDB_WheelXMLLogoTakesPriority(t *testing.T) { } propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageWheel) var found bool - for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).TitleProps { + for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).MediaProps { if p.TypeTag == propKey { found = true assert.Contains(t, p.Text, "logo_source", "Logo field should take priority over Wheel") @@ -1346,7 +1409,7 @@ func TestMapToDB_WheelXMLFromWheelWhenNoLogo(t *testing.T) { } propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageWheel) var found bool - for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).TitleProps { + for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).MediaProps { if p.TypeTag == propKey { found = true assert.Contains(t, p.Text, "wheel_source") @@ -1368,7 +1431,7 @@ func TestMapToDB_TitleShotXMLFromTitleScreen(t *testing.T) { } propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageTitleshot) var found bool - for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).TitleProps { + for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).MediaProps { if p.TypeTag == propKey { found = true assert.Contains(t, p.Text, "titlescreen_source", "TitleScreen should take priority over TitleShot") @@ -1390,7 +1453,7 @@ func TestMapToDB_TitleShotXMLFromTitleShotWhenNoTitleScreen(t *testing.T) { } propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageTitleshot) var found bool - for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).TitleProps { + for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).MediaProps { if p.TypeTag == propKey { found = true assert.Contains(t, p.Text, "titleshot_source") @@ -1408,7 +1471,7 @@ func TestMapToDB_MarqueeXMLPath(t *testing.T) { } propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageMarquee) var found bool - for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).TitleProps { + for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, "image/png", p.ContentType) @@ -1426,7 +1489,7 @@ func TestMapToDB_FanArtXMLPath(t *testing.T) { } propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageFanart) var found bool - for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).TitleProps { + for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, "image/png", p.ContentType) @@ -1444,7 +1507,7 @@ func TestMapToDB_MapXMLPath(t *testing.T) { } propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageMap) var found bool - for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).TitleProps { + for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).MediaProps { if p.TypeTag == propKey { found = true } @@ -1461,7 +1524,7 @@ func TestMapToDB_ScreenshotXMLPath(t *testing.T) { } propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageScreenshot) var found bool - for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).TitleProps { + for _, p := range (&GamelistXMLScraper{}).MapToDB(&rec).MediaProps { if p.TypeTag == propKey { found = true assert.Equal(t, "image/png", p.ContentType) @@ -1899,6 +1962,10 @@ func TestProcessCompanionEntries_StatsAggregateWithNormalProgress(t *testing.T) ) companionPath := filepath.ToSlash(filepath.Join(root, "companion.rom")) mockDB := helpers.NewMockMediaDBI() + mockDB.On("FindMediaTitlesWithoutSentinel", mock.Anything, systemDBID, "scraper.gamelist.xml:scraped"). + Return([]database.MediaTitle{{ + DBID: normalTitleDBID, SystemDBID: systemDBID, Slug: "normal-game", Name: "Normal Game", + }}, nil) mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{ {DBID: normalMediaDBID, MediaTitleDBID: normalTitleDBID, Path: filepath.Join(root, "normal.rom")}, }, nil) @@ -2071,6 +2138,8 @@ func TestScrapeLoop_NormalMode_Success(t *testing.T) { ) mockDB := helpers.NewMockMediaDBI() + mockDB.On("FindMediaTitlesWithoutSentinel", mock.Anything, systemDBID, "scraper.gamelist.xml:scraped"). + Return([]database.MediaTitle{{DBID: titleDBID, SystemDBID: systemDBID, Slug: "mario", Name: "Mario"}}, nil) mockDB.On("GetMediaBySystemID", "nes"). Return([]database.MediaWithFullPath{{ DBID: mediaDBID, MediaTitleDBID: titleDBID, Path: filepath.Join(root, "mario.nes"), @@ -2142,8 +2211,8 @@ func TestScrapeLoop_Issue794ZipAsDirMedia(t *testing.T) { if w == nil || w.Sentinel != scraper.SentinelTagInfo("gamelist.xml") { return false } - got := make(map[string]string, len(w.TitleProps)) - for _, p := range w.TitleProps { + got := make(map[string]string, len(w.MediaProps)) + for _, p := range w.MediaProps { got[p.TypeTag] = p.Text } return assert.Equal(t, filepath.ToSlash(filepath.Join( @@ -2154,6 +2223,8 @@ func TestScrapeLoop_Issue794ZipAsDirMedia(t *testing.T) { }) mockDB := helpers.NewMockMediaDBI() + mockDB.On("FindMediaTitlesWithoutSentinel", mock.Anything, systemDBID, "scraper.gamelist.xml:scraped"). + Return([]database.MediaTitle{}, nil) mockDB.On("GetMediaBySystemID", "nes"). Return([]database.MediaWithFullPath{{DBID: mediaDBID, MediaTitleDBID: titleDBID, Path: innerPath}}, nil) mockDB.On("GetScrapedMediaIDs", mock.Anything, "gamelist.xml", systemDBID). @@ -2239,6 +2310,8 @@ func TestScrapeLoop_WriteError_RecordSkipped(t *testing.T) { ) mockDB := helpers.NewMockMediaDBI() + mockDB.On("FindMediaTitlesWithoutSentinel", mock.Anything, systemDBID, "scraper.gamelist.xml:scraped"). + Return([]database.MediaTitle{{DBID: titleDBID, SystemDBID: systemDBID, Slug: "mario", Name: "Mario"}}, nil) mockDB.On("GetMediaBySystemID", "nes"). Return([]database.MediaWithFullPath{{ DBID: mediaDBID, MediaTitleDBID: titleDBID, Path: filepath.Join(root, "mario.nes"), @@ -2284,6 +2357,8 @@ func TestScrapeLoop_AllMediaScraped_SkipsSystem(t *testing.T) { ) mockDB := helpers.NewMockMediaDBI() + mockDB.On("FindMediaTitlesWithoutSentinel", mock.Anything, systemDBID, "scraper.gamelist.xml:scraped"). + Return([]database.MediaTitle{}, nil) mockDB.On("GetMediaBySystemID", "nes"). Return([]database.MediaWithFullPath{{ DBID: mediaDBID, MediaTitleDBID: titleDBID, Path: filepath.Join(root, "mario.nes"), From 460176cd3a242923b9c0b64bff9df70262c45630 Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Fri, 29 May 2026 08:53:19 +0800 Subject: [PATCH 14/17] fix(scraper): allow mister external assets --- docs/scraper.md | 6 +- pkg/database/scraper/gamelistxml/scraper.go | 108 +++++++++++++--- .../scraper/gamelistxml/scraper_test.go | 119 +++++++++++++++++- 3 files changed, 210 insertions(+), 23 deletions(-) diff --git a/docs/scraper.md b/docs/scraper.md index 7c7ccb67..6b22d2c9 100644 --- a/docs/scraper.md +++ b/docs/scraper.md @@ -84,7 +84,7 @@ Title metadata remains shared by `MediaTitleDBID`, so multiple ROM variants can `GamelistXMLScraper` scans each system ROM root for `gamelist.xml`. Regular `` entries are resolved to absolute paths under the system ROM root. The scraper first matches the entry to an existing title by the original display-name slug behavior, then uses the resolved path to choose the concrete Media row for that title when possible. If no slug match exists, it falls back to case-insensitive path matching. Scrapers do not create `Media` or `MediaTitle` rows. -Path handling: +Path handling for `` stays strict: | Input | Behavior | |---|---| @@ -92,6 +92,10 @@ Path handling: | `~/...` | Resolved under the current user's home directory, then rejected unless still under the system ROM root | | Absolute path | Cleaned and rejected unless under the system ROM root | +Asset path handling for artwork/video/manual uses the same root-bound behavior by default. On MiSTer and MiSTeX only, absolute or `~/...` asset paths may also resolve under platform root directories from `RootDirs(cfg)`, covering SD, USB, CIFS, network, and configured index roots. This applies only to file-backed asset fields; game paths remain bound to the ROM root. Path traversal outside the ROM root or approved platform roots is rejected. + +Zip-as-directory paths are supported for matching XML entries such as `./Japan/Game.zip` to indexed media stored under that zip path, while nested artwork paths such as `./media/images/Japan/Game.png` remain resolved as asset paths. + Source fields are cleaned before mapping: HTML entities are unescaped, tab/newline/carriage-return characters become spaces, and surrounding whitespace is trimmed. ### Field Mapping diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index 049d8f91..a9bd83e0 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -43,6 +43,7 @@ import ( "github.com/ZaparooProject/zaparoo-core/v2/pkg/database/systemdefs" "github.com/ZaparooProject/zaparoo-core/v2/pkg/database/tags" "github.com/ZaparooProject/zaparoo-core/v2/pkg/platforms" + "github.com/ZaparooProject/zaparoo-core/v2/pkg/platforms/ids" "github.com/ZaparooProject/zaparoo-core/v2/pkg/platforms/shared/esapi" "github.com/rs/zerolog/log" "github.com/spf13/afero" @@ -83,9 +84,10 @@ var mediaDirCandidates = map[string][]string{ // GamelistXMLScraper loads and maps EmulationStation gamelist.xml records. // Use [NewPlatformScraper] to obtain a configured [platforms.Scraper]. type GamelistXMLScraper struct { - db database.MediaDBI - fs afero.Fs - cfg *config.Instance + db database.MediaDBI + fs afero.Fs + cfg *config.Instance + externalAssetRoots []string } type companionStats struct { @@ -107,6 +109,18 @@ func (g *GamelistXMLScraper) filesystem() afero.Fs { return g.fs } +func externalAssetRootsForPlatform(cfg *config.Instance, pl platforms.Platform) []string { + if pl == nil { + return nil + } + switch pl.ID() { + case ids.Mister, ids.Mistex: + return pl.RootDirs(cfg) + default: + return nil + } +} + // NewPlatformScraper returns a [platforms.Scraper] backed by EmulationStation // gamelist.xml files. Systems are resolved at scrape time from the platform // and media database; no state is captured at construction. @@ -129,7 +143,12 @@ func NewPlatformScraper() platforms.Scraper { if err != nil { return fmt.Errorf("gamelistxml: resolve systems: %w", err) } - s := &GamelistXMLScraper{db: db.MediaDB, fs: fs, cfg: cfg} + s := &GamelistXMLScraper{ + db: db.MediaDB, + fs: fs, + cfg: cfg, + externalAssetRoots: externalAssetRootsForPlatform(cfg, pl), + } go s.scrapeLoop(ctx, opts, systems, db.MediaDB, ch) return nil }, @@ -739,7 +758,7 @@ func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { // the pre-stated media sub-directories for a matching .png file. appendImageProp := func(propValue tags.TagValue, xmlPath string) { key := propType + ":" + string(propValue) - p := pathProp(key, xmlPath, root) + p := pathProp(key, xmlPath, root, g.externalAssetRoots) if p == nil { p = findMediaFilePropFS( g.filesystem(), key, fallbackNames, @@ -774,10 +793,10 @@ func (g *GamelistXMLScraper) MapToDB(record *GamelistRecord) scraper.MapResult { appendImageProp(tags.TagPropertyImageTitleshot, titleshotXML) appendImageProp(tags.TagPropertyImageMap, game.Map) - if p := pathProp(propType+":"+string(tags.TagPropertyVideo), game.Video, root); p != nil { + if p := pathProp(propType+":"+string(tags.TagPropertyVideo), game.Video, root, g.externalAssetRoots); p != nil { mediaProps = append(mediaProps, *p) } - if p := pathProp(propType+":"+string(tags.TagPropertyManual), game.Manual, root); p != nil { + if p := pathProp(propType+":"+string(tags.TagPropertyManual), game.Manual, root, g.externalAssetRoots); p != nil { mediaProps = append(mediaProps, *p) } @@ -934,11 +953,11 @@ func appendCSVTags(tagInfos []database.TagInfo, tagType, raw string) []database. // pathProp resolves esPath to an absolute path and returns a MediaProperty for // the given typeTag. Returns nil if the path cannot be resolved (skipped cleanly). -func pathProp(typeTag, esPath, systemRootPath string) *database.MediaProperty { +func pathProp(typeTag, esPath, systemRootPath string, externalAssetRoots []string) *database.MediaProperty { if esPath == "" { return nil } - abs := filepath.ToSlash(resolveESAssetPath(esPath, systemRootPath)) + abs := filepath.ToSlash(resolveESAssetPath(esPath, systemRootPath, externalAssetRoots)) if abs == "" { return nil } @@ -949,19 +968,72 @@ func pathProp(typeTag, esPath, systemRootPath string) *database.MediaProperty { } } -// bound retrieved paths to child dirs of media -func resolveESAssetPath(esPath, systemRootPath string) string { - abs := resolveESPath(esPath, systemRootPath) - if abs == "" { +// resolveESAssetPath resolves a gamelist asset path. Relative asset paths stay +// bound to the system ROM root. Absolute paths can also resolve under configured +// external asset roots for platforms whose storage routes intentionally overlap. +func resolveESAssetPath(esPath, systemRootPath string, externalAssetRoots []string) string { + abs, ok := resolveESPathAbs(esPath, systemRootPath) + if !ok { return "" } + if pathWithinRoot(abs, systemRootPath) { + return abs + } + if filepath.IsAbs(esPath) || strings.HasPrefix(esPath, "~/") { + for _, root := range externalAssetRoots { + if pathWithinRoot(abs, root) { + return abs + } + } + } + return "" +} - root := filepath.Clean(systemRootPath) + string(filepath.Separator) - cleanAbs := filepath.Clean(abs) + string(filepath.Separator) - if !strings.HasPrefix(cleanAbs, root) { - return "" +func resolveESPathAbs(esPath, systemRootPath string) (string, bool) { + if esPath == "" { + return "", false } - return abs + rootAbs, err := filepath.Abs(systemRootPath) + if err != nil { + return "", false + } + rootAbs = filepath.Clean(rootAbs) + + var abs string + switch { + case strings.HasPrefix(esPath, "~/"): + home, homeErr := os.UserHomeDir() + if homeErr != nil { + return "", false + } + abs = filepath.Join(home, esPath[2:]) + case filepath.IsAbs(esPath): + abs = filepath.Clean(esPath) + default: + rel := strings.TrimPrefix(esPath, "./") + abs = filepath.Join(rootAbs, rel) + } + + abs, err = filepath.Abs(abs) + if err != nil || !filepath.IsAbs(abs) { + return "", false + } + return filepath.Clean(abs), true +} + +func pathWithinRoot(path, root string) bool { + pathAbs, err := filepath.Abs(path) + if err != nil { + return false + } + rootAbs, err := filepath.Abs(root) + if err != nil { + return false + } + pathAbs = filepath.Clean(pathAbs) + rootAbs = filepath.Clean(rootAbs) + rel, err := filepath.Rel(rootAbs, pathAbs) + return err == nil && rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator)) } // textProp creates a plain-text MediaProperty. diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index 6547b0ad..3a283674 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -396,6 +396,28 @@ func TestLoadRecords_TitleNameDoesNotNeedSlugMatch(t *testing.T) { assert.Equal(t, int64(7), records[0].MatchedMediaDBID) } +func TestLoadRecords_PokemonAccentMatchesNonAccentedSlug(t *testing.T) { + t.Parallel() + + root := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + ./pkruby.gbaPokémon Ruby Version +`), 0o600)) + + records, err := (&GamelistXMLScraper{}).LoadRecords( + context.Background(), + scraper.ScrapeSystem{ID: "gba", ROMPaths: []string{root}}, + mediaBySlugAndPath("pokemonruby", &database.MediaTitle{DBID: 856, Slug: "pokemonruby"}, + database.Media{DBID: 857, MediaTitleDBID: 856, Path: filepath.Join(root, "indexed", "Pokemon Ruby.gba")}), + ) + require.NoError(t, err) + require.Len(t, records, 1) + assert.Equal(t, "Pokémon Ruby Version", records[0].Game.Name) + assert.Equal(t, int64(856), records[0].MatchedTitleDBID) + assert.Equal(t, int64(857), records[0].MatchedMediaDBID) +} + func TestLoadRecords_SlugMatchSelectsExactMediaPath(t *testing.T) { t.Parallel() @@ -808,7 +830,7 @@ func TestMapToDB_ScreenScraperID_NeitherSet(t *testing.T) { func TestPathProp_NormalizesSlashes(t *testing.T) { t.Parallel() root := t.TempDir() - p := pathProp("prop:image", "./images/mario.png", root) + p := pathProp("prop:image", "./images/mario.png", root, nil) require.NotNil(t, p, "expected non-nil property") if strings.Contains(p.Text, "\\") { t.Errorf("pathProp returned backslashes in path: %q", p.Text) @@ -1315,23 +1337,79 @@ func TestResolveESPath_HomeRelativeInsideRoot(t *testing.T) { func TestResolveESAssetPath_InsideRoot(t *testing.T) { t.Parallel() root := t.TempDir() - got := resolveESAssetPath("./images/art.png", root) + got := resolveESAssetPath("./images/art.png", root, nil) assert.Equal(t, filepath.Join(root, "images", "art.png"), got) } func TestResolveESAssetPath_OutsideRoot(t *testing.T) { t.Parallel() root := t.TempDir() - got := resolveESAssetPath("../../etc/passwd", root) + got := resolveESAssetPath("../../etc/passwd", root, nil) assert.Empty(t, got) } func TestResolveESAssetPath_EmptyPath(t *testing.T) { t.Parallel() - got := resolveESAssetPath("", t.TempDir()) + got := resolveESAssetPath("", t.TempDir(), nil) + assert.Empty(t, got) +} + +func TestResolveESAssetPath_ExternalRootAccepted(t *testing.T) { + t.Parallel() + romRoot := t.TempDir() + assetRoot := t.TempDir() + assetPath := filepath.Join(assetRoot, "art", "game.png") + + got := resolveESAssetPath(assetPath, romRoot, []string{assetRoot}) + + assert.Equal(t, assetPath, got) +} + +func TestResolveESAssetPath_ExternalRootRejectedWhenUnconfigured(t *testing.T) { + t.Parallel() + romRoot := t.TempDir() + assetRoot := t.TempDir() + assetPath := filepath.Join(assetRoot, "art", "game.png") + + got := resolveESAssetPath(assetPath, romRoot, nil) + assert.Empty(t, got) } +func TestResolveESAssetPath_ExternalRootEscapeRejected(t *testing.T) { + t.Parallel() + romRoot := t.TempDir() + assetRoot := t.TempDir() + outsideRoot := t.TempDir() + escaped := filepath.Join(assetRoot, "..", filepath.Base(outsideRoot), "art.png") + + got := resolveESAssetPath(escaped, romRoot, []string{assetRoot}) + + assert.Empty(t, got) +} + +func TestMapToDB_ExternalAssetRootImage(t *testing.T) { + t.Parallel() + romRoot := t.TempDir() + assetRoot := t.TempDir() + assetPath := filepath.Join(assetRoot, "art", "game.png") + + rec := GamelistRecord{ + SystemRootPath: romRoot, + Game: esapi.Game{Image: assetPath}, + } + mediaProps := (&GamelistXMLScraper{externalAssetRoots: []string{assetRoot}}).MapToDB(&rec).MediaProps + propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) + var found bool + for _, p := range mediaProps { + if p.TypeTag == propKey { + found = true + assert.Equal(t, filepath.ToSlash(assetPath), p.Text) + } + } + assert.True(t, found, "external asset root image property missing") +} + // --- mimeFromExt additional --- func TestMimeFromExt_JPEG(t *testing.T) { assert.Equal(t, "image/jpeg", mimeFromExt("photo.jpeg")) } @@ -1937,6 +2015,39 @@ func TestProcessCompanionEntries_MapsIssue161CompanionArtwork(t *testing.T) { mockDB.AssertExpectations(t) } +func TestProcessCompanionEntries_ExternalCompanionArtwork(t *testing.T) { + t.Parallel() + root := t.TempDir() + assetRoot := t.TempDir() + assetPath := filepath.Join(assetRoot, "covers", "Doom.png") + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + Doom + `+assetPath+` + + + ./Doom.rom + +`), 0o600)) + + childPath := filepath.ToSlash(filepath.Join(root, "Doom.rom")) + propKey := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyImageImage) + mockDB := helpers.NewMockMediaDBI() + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{}, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(1), childPath). + Return(&database.Media{DBID: 10, MediaTitleDBID: 20}, nil) + mockDB.On( + "ApplyScrapeResult", mock.Anything, int64(10), int64(20), + companionArtworkWriteMatcher(t, map[string]string{propKey: assetPath}), + ).Return(nil) + + s := &GamelistXMLScraper{db: mockDB, externalAssetRoots: []string{assetRoot}} + system := scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}, DBID: 1} + stats := s.processCompanionEntries(context.Background(), scraper.ScrapeOptions{}, system, mockDB) + assert.Equal(t, companionStats{Processed: 1, Matched: 1}, stats) + mockDB.AssertExpectations(t) +} + func TestProcessCompanionEntries_StatsAggregateWithNormalProgress(t *testing.T) { t.Parallel() root := t.TempDir() From c4faed9d3f93534b2d9d32cca7325169a60f9d66 Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Fri, 29 May 2026 09:32:39 +0800 Subject: [PATCH 15/17] fix(scraper): clarify gamelist scrape semantics --- docs/scraper.md | 4 +- pkg/database/scraper/gamelistxml/scraper.go | 241 +++++++++++++----- .../scraper/gamelistxml/scraper_test.go | 215 ++++++++++++++++ 3 files changed, 389 insertions(+), 71 deletions(-) diff --git a/docs/scraper.md b/docs/scraper.md index 6b22d2c9..14ea757d 100644 --- a/docs/scraper.md +++ b/docs/scraper.md @@ -72,7 +72,7 @@ Property rows are keyed by entity and property type tag. Re-scraping the same pr Path-backed properties persist their text path and optional `BlobDBID`; the property tables do not persist the `ContentType` computed by the mapper for path values. Blob-backed properties expose content type from `MediaBlobs`. API responses infer path-backed content type and extension from the stored path when DB content type is empty. -Normal gamelist artwork is media-level so regional or language variants can carry different image paths while sharing one title. `media.image` checks media-level properties before title-level properties. +Normal gamelist artwork is media-level so regional or language variants can carry different image paths while sharing one title. `media.image` checks media-level properties before title-level properties. If an older scrape left title-level artwork behind, run a force scrape to refresh media-level artwork; no migration removes old title properties. ## Media-level Sentinel Invariant @@ -82,7 +82,7 @@ Title metadata remains shared by `MediaTitleDBID`, so multiple ROM variants can ## gamelist.xml Behavior -`GamelistXMLScraper` scans each system ROM root for `gamelist.xml`. Regular `` entries are resolved to absolute paths under the system ROM root. The scraper first matches the entry to an existing title by the original display-name slug behavior, then uses the resolved path to choose the concrete Media row for that title when possible. If no slug match exists, it falls back to case-insensitive path matching. Scrapers do not create `Media` or `MediaTitle` rows. +`GamelistXMLScraper` scans each system ROM root for `gamelist.xml`. Regular `` entries are resolved to absolute paths under the system ROM root. The scraper first matches the entry to an existing title by the original display-name slug behavior, then uses the resolved path to choose the concrete Media row for that title when possible. If slug matching finds a title but the path does not identify a Media row for that title, only title-level metadata is written. If no known title slug exists, it falls back to case-insensitive path matching. Scrapers do not create `Media` or `MediaTitle` rows. Path handling for `` stays strict: diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index a9bd83e0..e53f9ad2 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -53,11 +53,28 @@ import ( // root path and the DB identifiers of the matched MediaTitle and one of its // Media rows (used as the sentinel write target). type GamelistRecord struct { - AvailableMediaDirs map[string]string - SystemRootPath string - Game esapi.Game - MatchedMediaDBID int64 - MatchedTitleDBID int64 + AvailableMediaDirs map[string]string + SystemRootPath string + Game esapi.Game + MatchKind gamelistMatchKind + MatchedMediaDBID int64 + MatchedTitleDBID int64 + MediaLevelWriteSafe bool +} + +type gamelistMatchKind string + +const ( + gamelistMatchSlugPath gamelistMatchKind = "slug_path" + gamelistMatchSlugOnly gamelistMatchKind = "slug_only" + gamelistMatchSlugConflict gamelistMatchKind = "slug_conflict" + gamelistMatchPathOnly gamelistMatchKind = "path_only" +) + +type slugMediaSelection struct { + media database.Media + matchKind gamelistMatchKind + key string } // mediaDirCandidates maps each TagPropertyImage value to the ordered list of @@ -98,6 +115,7 @@ type companionStats struct { type loadRecordIndexes struct { TitlesBySlug map[string]database.MediaTitle + AllTitlesBySlug map[string]database.MediaTitle MediaByPathFold map[string]database.Media MediaByTitleDBID map[int64][]database.Media } @@ -291,8 +309,8 @@ outer: }) if title, ok := indexes.TitlesBySlug[pf.Slug]; ok { - media, matchedKey := selectMediaForSlugMatch(indexes, title.DBID, resolved) - if media.DBID == 0 { + selection := selectMediaForSlugMatch(indexes, title.DBID, resolved) + if selection.media.DBID == 0 { log.Debug(). Str("system", system.ID). Str("slug", pf.Slug). @@ -304,20 +322,33 @@ outer: } slugMatches++ - if matchedKey != "" { + mediaLevelWriteSafe := false + if selection.key != "" && selection.matchKind == gamelistMatchSlugPath { slugPathSelections++ - delete(indexes.MediaByPathFold, matchedKey) + mediaLevelWriteSafe = true + delete(indexes.MediaByPathFold, selection.key) } else { slugFirstMediaFallbacks++ } records = append(records, &GamelistRecord{ - SystemRootPath: rootPath, - AvailableMediaDirs: availableMediaDirs, - Game: *game, - MatchedTitleDBID: title.DBID, - MatchedMediaDBID: media.DBID, + SystemRootPath: rootPath, + AvailableMediaDirs: availableMediaDirs, + Game: *game, + MatchKind: selection.matchKind, + MatchedTitleDBID: title.DBID, + MatchedMediaDBID: selection.media.DBID, + MediaLevelWriteSafe: mediaLevelWriteSafe, }) delete(indexes.TitlesBySlug, pf.Slug) + } else if titleSlugKnown(indexes, pf.Slug) { + unmatchedRecords++ + log.Debug(). + Str("system", system.ID). + Str("path", game.Path). + Str("resolved", resolved). + Str("name", game.Name). + Str("slug", pf.Slug). + Msg("gamelistxml: slug exists for another or already-scraped title, skipping path-only fallback") } else if media, matchedKey, ok := matchMediaByResolvedPath(indexes.MediaByPathFold, resolved); ok { pathOnlyFallbacks++ log.Debug(). @@ -330,11 +361,13 @@ outer: Int64("mediaTitleDBID", media.MediaTitleDBID). Msg("gamelistxml: path-only fallback matched record") records = append(records, &GamelistRecord{ - SystemRootPath: rootPath, - AvailableMediaDirs: availableMediaDirs, - Game: *game, - MatchedTitleDBID: media.MediaTitleDBID, - MatchedMediaDBID: media.DBID, + SystemRootPath: rootPath, + AvailableMediaDirs: availableMediaDirs, + Game: *game, + MatchKind: gamelistMatchPathOnly, + MatchedTitleDBID: media.MediaTitleDBID, + MatchedMediaDBID: media.DBID, + MediaLevelWriteSafe: true, }) delete(indexes.MediaByPathFold, matchedKey) } else { @@ -389,9 +422,9 @@ func (g *GamelistXMLScraper) scrapeLoop( if waitErr := opts.Pauser.Wait(ctx); waitErr != nil { ch <- scraper.ScrapeUpdate{ SystemID: systemID, - Processed: totalProcessed + processed, - Matched: totalMatched + matched, - Skipped: totalSkipped + skipped, + Processed: processed, + Matched: matched, + Skipped: skipped, Done: true, } return false @@ -415,31 +448,31 @@ func (g *GamelistXMLScraper) scrapeLoop( // Process ZaparooCompanion parent/child entries before regular slug scrape. companion := g.processCompanionEntries(ctx, opts, system, mdb) - totalProcessed += companion.Processed - totalMatched += companion.Matched - totalSkipped += companion.Skipped titlesBySlug := make(map[string]database.MediaTitle) + allTitlesBySlug := make(map[string]database.MediaTitle) if opts.Force { allTitles, titlesErr := mdb.GetTitlesBySystemID(system.ID) if titlesErr != nil { if errors.Is(titlesErr, context.Canceled) || errors.Is(titlesErr, context.DeadlineExceeded) { ch <- scraper.ScrapeUpdate{ - SystemID: system.ID, Done: true, Processed: totalProcessed, - Matched: totalMatched, Skipped: totalSkipped, + SystemID: system.ID, Done: true, Processed: companion.Processed, + Matched: companion.Matched, Skipped: companion.Skipped, } return } ch <- scraper.ScrapeUpdate{ SystemID: system.ID, FatalErr: titlesErr, - Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, + Done: true, Processed: companion.Processed, Matched: companion.Matched, Skipped: companion.Skipped, } return } for _, t := range allTitles { - titlesBySlug[t.Slug] = database.MediaTitle{ + title := database.MediaTitle{ DBID: t.DBID, SystemDBID: t.SystemDBID, Slug: t.Slug, Name: t.Name, } + titlesBySlug[t.Slug] = title + allTitlesBySlug[t.Slug] = title } } else { sentinel := scraper.SentinelTagInfo(id) @@ -448,34 +481,54 @@ func (g *GamelistXMLScraper) scrapeLoop( if titlesErr != nil { if errors.Is(titlesErr, context.Canceled) || errors.Is(titlesErr, context.DeadlineExceeded) { ch <- scraper.ScrapeUpdate{ - SystemID: system.ID, Done: true, Processed: totalProcessed, - Matched: totalMatched, Skipped: totalSkipped, + SystemID: system.ID, Done: true, Processed: companion.Processed, + Matched: companion.Matched, Skipped: companion.Skipped, } return } ch <- scraper.ScrapeUpdate{ SystemID: system.ID, FatalErr: titlesErr, - Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, + Done: true, Processed: companion.Processed, Matched: companion.Matched, Skipped: companion.Skipped, } return } for _, t := range unscraped { titlesBySlug[t.Slug] = t } + allTitles, allTitlesErr := mdb.GetTitlesBySystemID(system.ID) + if allTitlesErr != nil { + if errors.Is(allTitlesErr, context.Canceled) || errors.Is(allTitlesErr, context.DeadlineExceeded) { + ch <- scraper.ScrapeUpdate{ + SystemID: system.ID, Done: true, Processed: companion.Processed, + Matched: companion.Matched, Skipped: companion.Skipped, + } + return + } + ch <- scraper.ScrapeUpdate{ + SystemID: system.ID, FatalErr: allTitlesErr, + Done: true, Processed: companion.Processed, Matched: companion.Matched, Skipped: companion.Skipped, + } + return + } + for _, t := range allTitles { + allTitlesBySlug[t.Slug] = database.MediaTitle{ + DBID: t.DBID, SystemDBID: t.SystemDBID, Slug: t.Slug, Name: t.Name, + } + } } allMedia, mediaErr := mdb.GetMediaBySystemID(system.ID) if mediaErr != nil { if errors.Is(mediaErr, context.Canceled) || errors.Is(mediaErr, context.DeadlineExceeded) { ch <- scraper.ScrapeUpdate{ - SystemID: system.ID, Done: true, Processed: totalProcessed, - Matched: totalMatched, Skipped: totalSkipped, + SystemID: system.ID, Done: true, Processed: companion.Processed, + Matched: companion.Matched, Skipped: companion.Skipped, } return } ch <- scraper.ScrapeUpdate{ SystemID: system.ID, FatalErr: mediaErr, - Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, + Done: true, Processed: companion.Processed, Matched: companion.Matched, Skipped: companion.Skipped, } return } @@ -486,14 +539,14 @@ func (g *GamelistXMLScraper) scrapeLoop( if scrapedErr != nil { if errors.Is(scrapedErr, context.Canceled) || errors.Is(scrapedErr, context.DeadlineExceeded) { ch <- scraper.ScrapeUpdate{ - SystemID: system.ID, Done: true, Processed: totalProcessed, - Matched: totalMatched, Skipped: totalSkipped, + SystemID: system.ID, Done: true, Processed: companion.Processed, + Matched: companion.Matched, Skipped: companion.Skipped, } return } ch <- scraper.ScrapeUpdate{ SystemID: system.ID, FatalErr: scrapedErr, - Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, + Done: true, Processed: companion.Processed, Matched: companion.Matched, Skipped: companion.Skipped, } return } @@ -501,6 +554,7 @@ func (g *GamelistXMLScraper) scrapeLoop( indexes := loadRecordIndexes{ TitlesBySlug: titlesBySlug, + AllTitlesBySlug: allTitlesBySlug, MediaByPathFold: make(map[string]database.Media, len(allMedia)), MediaByTitleDBID: make(map[int64][]database.Media, len(allMedia)), } @@ -512,10 +566,22 @@ func (g *GamelistXMLScraper) scrapeLoop( } if _, scraped := scrapedIDs[m.DBID]; !scraped { indexes.MediaByPathFold[pathFoldKey(m.Path)] = media + indexes.MediaByTitleDBID[m.MediaTitleDBID] = append(indexes.MediaByTitleDBID[m.MediaTitleDBID], media) } - indexes.MediaByTitleDBID[m.MediaTitleDBID] = append(indexes.MediaByTitleDBID[m.MediaTitleDBID], media) } if len(indexes.TitlesBySlug) == 0 && len(indexes.MediaByPathFold) == 0 { + if companion.Processed > 0 { + ch <- scraper.ScrapeUpdate{ + SystemID: system.ID, + Total: companion.Processed, + Processed: companion.Processed, + Matched: companion.Matched, + Skipped: companion.Skipped, + } + } + totalProcessed += companion.Processed + totalMatched += companion.Matched + totalSkipped += companion.Skipped continue } @@ -523,30 +589,37 @@ func (g *GamelistXMLScraper) scrapeLoop( if loadErr != nil { if errors.Is(loadErr, context.Canceled) || errors.Is(loadErr, context.DeadlineExceeded) { ch <- scraper.ScrapeUpdate{ - SystemID: system.ID, Done: true, Processed: totalProcessed, Matched: totalMatched, - Skipped: totalSkipped, + SystemID: system.ID, Done: true, Processed: companion.Processed, Matched: companion.Matched, + Skipped: companion.Skipped, } return } ch <- scraper.ScrapeUpdate{ SystemID: system.ID, FatalErr: loadErr, - Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, + Done: true, Processed: companion.Processed, Matched: companion.Matched, Skipped: companion.Skipped, } return } + var processed, matched, skipped int + totalRecords := len(records) + systemTotal := companion.Processed + totalRecords + select { case <-ctx.Done(): ch <- scraper.ScrapeUpdate{ Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped, } return - case ch <- scraper.ScrapeUpdate{SystemID: system.ID, Total: len(records)}: + case ch <- scraper.ScrapeUpdate{ + SystemID: system.ID, + Total: systemTotal, + Processed: companion.Processed, + Matched: companion.Matched, + Skipped: companion.Skipped, + }: } - - var processed, matched, skipped int - totalRecords := len(records) - if !waitForResume(system.ID, processed, matched, skipped) { + if !waitForResume(system.ID, companion.Processed+processed, companion.Matched+matched, companion.Skipped+skipped) { return } @@ -556,17 +629,17 @@ func (g *GamelistXMLScraper) scrapeLoop( return true } update.SystemID = system.ID - update.Total = totalProcessed + totalRecords - update.Processed += totalProcessed - update.Matched += totalMatched - update.Skipped += totalSkipped + update.Total = systemTotal + update.Processed += companion.Processed + update.Matched += companion.Matched + update.Skipped += companion.Skipped select { case <-ctx.Done(): ch <- scraper.ScrapeUpdate{ SystemID: system.ID, - Processed: totalProcessed + processed, - Matched: totalMatched + matched, - Skipped: totalSkipped + skipped, + Processed: companion.Processed + processed, + Matched: companion.Matched + matched, + Skipped: companion.Skipped + skipped, Done: true, } return false @@ -577,16 +650,16 @@ func (g *GamelistXMLScraper) scrapeLoop( } for _, record := range records { - if !waitForResume(system.ID, processed, matched, skipped) { + if !waitForResume(system.ID, companion.Processed+processed, companion.Matched+matched, companion.Skipped+skipped) { return } select { case <-ctx.Done(): ch <- scraper.ScrapeUpdate{ SystemID: system.ID, - Processed: totalProcessed + processed, - Matched: totalMatched + matched, - Skipped: totalSkipped + skipped, + Processed: companion.Processed + processed, + Matched: companion.Matched + matched, + Skipped: companion.Skipped + skipped, Done: true, } return @@ -607,7 +680,20 @@ func (g *GamelistXMLScraper) scrapeLoop( } mapped := g.MapToDB(record) - if !waitForResume(system.ID, processed, matched, skipped) { + if !record.MediaLevelWriteSafe { + if len(mapped.MediaTags) > 0 || len(mapped.MediaProps) > 0 { + log.Debug(). + Str("system", system.ID). + Str("path", record.Game.Path). + Str("matchKind", string(record.MatchKind)). + Int64("mediaDBID", record.MatchedMediaDBID). + Int64("mediaTitleDBID", record.MatchedTitleDBID). + Msg("gamelistxml: omitting media-level metadata for slug-only match") + } + mapped.MediaTags = nil + mapped.MediaProps = nil + } + if !waitForResume(system.ID, companion.Processed+processed, companion.Matched+matched, companion.Skipped+skipped) { return } @@ -644,9 +730,9 @@ func (g *GamelistXMLScraper) scrapeLoop( if !emitProgress(scraper.ScrapeUpdate{Processed: processed, Matched: matched, Skipped: skipped}, true) { return } - totalProcessed += processed - totalMatched += matched - totalSkipped += skipped + totalProcessed += companion.Processed + processed + totalMatched += companion.Matched + matched + totalSkipped += companion.Skipped + skipped } ch <- scraper.ScrapeUpdate{Done: true, Processed: totalProcessed, Matched: totalMatched, Skipped: totalSkipped} @@ -1148,29 +1234,41 @@ func findMediaFilePropFS( return nil } +func titleSlugKnown(indexes loadRecordIndexes, slug string) bool { + if indexes.AllTitlesBySlug != nil { + _, ok := indexes.AllTitlesBySlug[slug] + return ok + } + _, ok := indexes.TitlesBySlug[slug] + return ok +} + func selectMediaForSlugMatch( indexes loadRecordIndexes, mediaTitleDBID int64, resolved string, -) (media database.Media, matchedKey string) { +) slugMediaSelection { media, matchedKey, ok := matchMediaByResolvedPath(indexes.MediaByPathFold, resolved) if ok && media.MediaTitleDBID == mediaTitleDBID { - return media, matchedKey + return slugMediaSelection{media: media, matchKind: gamelistMatchSlugPath, key: matchedKey} } + + matchKind := gamelistMatchSlugOnly if ok { - log.Debug(). + matchKind = gamelistMatchSlugConflict + log.Warn(). Str("resolved", resolved). Int64("pathMediaDBID", media.DBID). Int64("pathMediaTitleDBID", media.MediaTitleDBID). Int64("slugMediaTitleDBID", mediaTitleDBID). - Msg("gamelistxml: slug match path points at different title, using slug title media") + Msg("gamelistxml: slug match path points at different title, writing title metadata only") } mediaRows := indexes.MediaByTitleDBID[mediaTitleDBID] if len(mediaRows) == 0 { - return database.Media{}, "" + return slugMediaSelection{matchKind: matchKind} } - return mediaRows[0], "" + return slugMediaSelection{media: mediaRows[0], matchKind: matchKind} } func matchMediaByResolvedPath( @@ -1528,7 +1626,12 @@ func (*GamelistXMLScraper) matchCompanionChildMedia( return nil } if len(matched) > 1 { - log.Warn().Str("filename", filename).Int("matches", len(matched)). + log.Warn(). + Str("system", system.ID). + Str("path", child.ResolvedPath). + Str("filename", filename). + Str("parentGameID", child.ParentGameID). + Int("matches", len(matched)). Msg("gamelistxml: companion: ambiguous child filename matches, skipping") return nil } diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index 3a283674..1832edce 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -416,6 +416,8 @@ func TestLoadRecords_PokemonAccentMatchesNonAccentedSlug(t *testing.T) { assert.Equal(t, "Pokémon Ruby Version", records[0].Game.Name) assert.Equal(t, int64(856), records[0].MatchedTitleDBID) assert.Equal(t, int64(857), records[0].MatchedMediaDBID) + assert.Equal(t, gamelistMatchSlugOnly, records[0].MatchKind) + assert.False(t, records[0].MediaLevelWriteSafe) } func TestLoadRecords_SlugMatchSelectsExactMediaPath(t *testing.T) { @@ -442,6 +444,32 @@ func TestLoadRecords_SlugMatchSelectsExactMediaPath(t *testing.T) { require.Len(t, records, 1) assert.Equal(t, int64(50), records[0].MatchedTitleDBID) assert.Equal(t, int64(61), records[0].MatchedMediaDBID) + assert.Equal(t, gamelistMatchSlugPath, records[0].MatchKind) + assert.True(t, records[0].MediaLevelWriteSafe) +} + +func TestLoadRecords_SkipsPathOnlyFallbackWhenSlugKnown(t *testing.T) { + t.Parallel() + + root := t.TempDir() + gamePath := filepath.Join(root, "conflict.nes") + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + ./conflict.nesKnown Title +`), 0o600)) + + indexes := mediaByPath(database.Media{DBID: 70, MediaTitleDBID: 80, Path: gamePath}) + indexes.AllTitlesBySlug = map[string]database.MediaTitle{ + "knowntitle": {DBID: 90, Slug: "knowntitle", Name: "Known Title"}, + } + + records, err := (&GamelistXMLScraper{}).LoadRecords( + context.Background(), + scraper.ScrapeSystem{ID: "nes", ROMPaths: []string{root}}, + indexes, + ) + require.NoError(t, err) + assert.Empty(t, records) } func TestLoadRecords_PathOnlyFallbackWhenSlugMissing(t *testing.T) { @@ -463,6 +491,8 @@ func TestLoadRecords_PathOnlyFallbackWhenSlugMissing(t *testing.T) { require.Len(t, records, 1) assert.Equal(t, int64(80), records[0].MatchedTitleDBID) assert.Equal(t, int64(70), records[0].MatchedMediaDBID) + assert.Equal(t, gamelistMatchPathOnly, records[0].MatchKind) + assert.True(t, records[0].MediaLevelWriteSafe) } func TestLoadRecords_ZipAsDirChildMatch(t *testing.T) { @@ -2234,6 +2264,129 @@ func drainChannel(ch chan scraper.ScrapeUpdate) []scraper.ScrapeUpdate { return updates } +func TestScrapeLoop_ProgressIsPerSystem(t *testing.T) { + t.Parallel() + + root1 := t.TempDir() + root2 := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(root1, "gamelist.xml"), []byte(` + + ./first.nesFirst +`), 0o600)) + games := []string{"one", "two", "three"} + var builder strings.Builder + builder.WriteString("\n") + for _, name := range games { + builder.WriteString(" ./" + name + ".sfc" + name + "\n") + } + builder.WriteString("") + require.NoError(t, os.WriteFile(filepath.Join(root2, "gamelist.xml"), []byte(builder.String()), 0o600)) + + mockDB := helpers.NewMockMediaDBI() + mockDB.On("GetTitlesBySystemID", "nes").Return([]database.TitleWithSystem{{ + DBID: 1, SystemDBID: 10, Slug: "first", Name: "First", + }}, nil) + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{{ + DBID: 11, MediaTitleDBID: 1, Path: filepath.Join(root1, "first.nes"), + }}, nil) + mockDB.On("GetTitlesBySystemID", "snes").Return([]database.TitleWithSystem{ + {DBID: 2, SystemDBID: 20, Slug: "one", Name: "one"}, + {DBID: 3, SystemDBID: 20, Slug: "two", Name: "two"}, + {DBID: 4, SystemDBID: 20, Slug: "three", Name: "three"}, + }, nil) + mockDB.On("GetMediaBySystemID", "snes").Return([]database.MediaWithFullPath{ + {DBID: 12, MediaTitleDBID: 2, Path: filepath.Join(root2, "one.sfc")}, + {DBID: 13, MediaTitleDBID: 3, Path: filepath.Join(root2, "two.sfc")}, + {DBID: 14, MediaTitleDBID: 4, Path: filepath.Join(root2, "three.sfc")}, + }, nil) + mockDB.On("ApplyScrapeResult", mock.Anything, mock.AnythingOfType("int64"), mock.AnythingOfType("int64"), mock.Anything). + Return(nil) + + s := &GamelistXMLScraper{db: mockDB} + ch := make(chan scraper.ScrapeUpdate, 128) + s.scrapeLoop(context.Background(), scraper.ScrapeOptions{ + Pauser: syncutil.NewPauser(), + Force: true, + }, []scraper.ScrapeSystem{ + {ID: "nes", ROMPaths: []string{root1}, DBID: 10}, + {ID: "snes", ROMPaths: []string{root2}, DBID: 20}, + }, mockDB, ch) + + updates := drainChannel(ch) + var lastSNES scraper.ScrapeUpdate + var done scraper.ScrapeUpdate + for _, update := range updates { + if update.SystemID == "snes" { + lastSNES = update + } + if update.Done { + done = update + } + } + + assert.Equal(t, 3, lastSNES.Total) + assert.Equal(t, 3, lastSNES.Processed) + assert.Equal(t, 3, lastSNES.Matched) + assert.Equal(t, 0, lastSNES.Skipped) + require.True(t, done.Done) + assert.Equal(t, 4, done.Processed) + assert.Equal(t, 4, done.Matched) + mockDB.AssertExpectations(t) +} + +func TestScrapeLoop_ProgressIncludesCompanionBaseline(t *testing.T) { + t.Parallel() + + root := t.TempDir() + companionPath := filepath.Join(root, "companion.nes") + normalPath := filepath.Join(root, "normal.nes") + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + + Companion Parent + Parent metadata + + + ./companion.nes + + ./normal.nesNormal +`), 0o600)) + + mockDB := helpers.NewMockMediaDBI() + mockDB.On("GetMediaBySystemID", "nes").Return([]database.MediaWithFullPath{ + {DBID: 10, MediaTitleDBID: 1000, Path: companionPath}, + {DBID: 11, MediaTitleDBID: 1001, Path: normalPath}, + }, nil) + mockDB.On("FindMediaBySystemAndPathFold", mock.Anything, int64(100), filepath.ToSlash(companionPath)). + Return(&database.Media{DBID: 10, MediaTitleDBID: 1000, Path: companionPath}, nil) + mockDB.On("GetTitlesBySystemID", "nes").Return([]database.TitleWithSystem{{ + DBID: 1001, SystemDBID: 100, Slug: "normal", Name: "Normal", + }}, nil) + mockDB.On("ApplyScrapeResult", mock.Anything, mock.AnythingOfType("int64"), mock.AnythingOfType("int64"), mock.Anything). + Return(nil) + + s := &GamelistXMLScraper{db: mockDB} + ch := make(chan scraper.ScrapeUpdate, 128) + s.scrapeLoop(context.Background(), scraper.ScrapeOptions{ + Pauser: syncutil.NewPauser(), + Force: true, + }, []scraper.ScrapeSystem{{ID: "nes", ROMPaths: []string{root}, DBID: 100}}, mockDB, ch) + + updates := drainChannel(ch) + var lastNES scraper.ScrapeUpdate + for _, update := range updates { + if update.SystemID == "nes" { + lastNES = update + } + } + + assert.Equal(t, 2, lastNES.Total) + assert.Equal(t, 2, lastNES.Processed) + assert.Equal(t, 2, lastNES.Matched) + assert.Equal(t, 0, lastNES.Skipped) + mockDB.AssertExpectations(t) +} + func TestScrapeLoop_NormalMode_Success(t *testing.T) { t.Parallel() root := t.TempDir() @@ -2364,6 +2517,68 @@ func TestScrapeLoop_Issue794ZipAsDirMedia(t *testing.T) { mockDB.AssertExpectations(t) } +func TestScrapeLoop_SlugOnlyMatchWritesTitleMetadataOnly(t *testing.T) { + t.Parallel() + root := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(root, "gamelist.xml"), []byte(` + + + ./xml-path.nes + Mario + Title metadata + usa + ./media/images/mario.png + +`), 0o600)) + + const ( + titleDBID = int64(2) + mediaDBID = int64(20) + systemDBID = int64(200) + ) + + mockDB := helpers.NewMockMediaDBI() + mockDB.On("FindMediaTitlesWithoutSentinel", mock.Anything, systemDBID, "scraper.gamelist.xml:scraped"). + Return([]database.MediaTitle{{DBID: titleDBID, SystemDBID: systemDBID, Slug: "mario", Name: "Mario"}}, nil) + mockDB.On("GetMediaBySystemID", "nes"). + Return([]database.MediaWithFullPath{{ + DBID: mediaDBID, MediaTitleDBID: titleDBID, Path: filepath.Join(root, "indexed-path.nes"), + }}, nil) + mockDB.On("GetScrapedMediaIDs", mock.Anything, "gamelist.xml", systemDBID). + Return(map[int64]struct{}{}, nil) + writeMatcher := mock.MatchedBy(func(w *database.ScrapeWrite) bool { + if w == nil { + return false + } + descTypeTag := string(tags.TagTypeProperty) + ":" + string(tags.TagPropertyDescription) + return assert.Empty(t, w.MediaTags) && + assert.Empty(t, w.MediaProps) && + assert.Contains(t, w.TitleProps, database.MediaProperty{ + TypeTag: descTypeTag, + Text: "Title metadata", + ContentType: "text/plain", + }) + }) + mockDB.On("ApplyScrapeResult", mock.Anything, mediaDBID, titleDBID, writeMatcher).Return(nil) + + s := &GamelistXMLScraper{db: mockDB} + ch := make(chan scraper.ScrapeUpdate, 128) + s.scrapeLoop(context.Background(), scraper.ScrapeOptions{ + Pauser: syncutil.NewPauser(), + }, []scraper.ScrapeSystem{{ID: "nes", ROMPaths: []string{root}, DBID: systemDBID}}, mockDB, ch) + + updates := drainChannel(ch) + var done scraper.ScrapeUpdate + for _, u := range updates { + if u.Done { + done = u + } + } + assert.Equal(t, 1, done.Processed) + assert.Equal(t, 1, done.Matched) + mockDB.AssertExpectations(t) +} + func TestScrapeLoop_ForceMode_Success(t *testing.T) { t.Parallel() root := t.TempDir() From 7dd20bbe8c31cd3a9422aa0b133a9f5b98c97570 Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Fri, 29 May 2026 09:39:52 +0800 Subject: [PATCH 16/17] fix(scraper): satisfy gamelist lint checks --- pkg/database/scraper/gamelistxml/scraper.go | 25 +++++++++++++++---- .../scraper/gamelistxml/scraper_test.go | 18 +++++++------ 2 files changed, 31 insertions(+), 12 deletions(-) diff --git a/pkg/database/scraper/gamelistxml/scraper.go b/pkg/database/scraper/gamelistxml/scraper.go index e53f9ad2..ca130d8b 100644 --- a/pkg/database/scraper/gamelistxml/scraper.go +++ b/pkg/database/scraper/gamelistxml/scraper.go @@ -55,8 +55,8 @@ import ( type GamelistRecord struct { AvailableMediaDirs map[string]string SystemRootPath string - Game esapi.Game MatchKind gamelistMatchKind + Game esapi.Game MatchedMediaDBID int64 MatchedTitleDBID int64 MediaLevelWriteSafe bool @@ -72,9 +72,9 @@ const ( ) type slugMediaSelection struct { - media database.Media matchKind gamelistMatchKind key string + media database.Media } // mediaDirCandidates maps each TagPropertyImage value to the ordered list of @@ -619,7 +619,12 @@ func (g *GamelistXMLScraper) scrapeLoop( Skipped: companion.Skipped, }: } - if !waitForResume(system.ID, companion.Processed+processed, companion.Matched+matched, companion.Skipped+skipped) { + if !waitForResume( + system.ID, + companion.Processed+processed, + companion.Matched+matched, + companion.Skipped+skipped, + ) { return } @@ -650,7 +655,12 @@ func (g *GamelistXMLScraper) scrapeLoop( } for _, record := range records { - if !waitForResume(system.ID, companion.Processed+processed, companion.Matched+matched, companion.Skipped+skipped) { + if !waitForResume( + system.ID, + companion.Processed+processed, + companion.Matched+matched, + companion.Skipped+skipped, + ) { return } select { @@ -693,7 +703,12 @@ func (g *GamelistXMLScraper) scrapeLoop( mapped.MediaTags = nil mapped.MediaProps = nil } - if !waitForResume(system.ID, companion.Processed+processed, companion.Matched+matched, companion.Skipped+skipped) { + if !waitForResume( + system.ID, + companion.Processed+processed, + companion.Matched+matched, + companion.Skipped+skipped, + ) { return } diff --git a/pkg/database/scraper/gamelistxml/scraper_test.go b/pkg/database/scraper/gamelistxml/scraper_test.go index 1832edce..589df37a 100644 --- a/pkg/database/scraper/gamelistxml/scraper_test.go +++ b/pkg/database/scraper/gamelistxml/scraper_test.go @@ -2275,11 +2275,13 @@ func TestScrapeLoop_ProgressIsPerSystem(t *testing.T) { `), 0o600)) games := []string{"one", "two", "three"} var builder strings.Builder - builder.WriteString("\n") + _, _ = builder.WriteString("\n") for _, name := range games { - builder.WriteString(" ./" + name + ".sfc" + name + "\n") + _, _ = builder.WriteString( + " ./" + name + ".sfc" + name + "\n", + ) } - builder.WriteString("") + _, _ = builder.WriteString("") require.NoError(t, os.WriteFile(filepath.Join(root2, "gamelist.xml"), []byte(builder.String()), 0o600)) mockDB := helpers.NewMockMediaDBI() @@ -2299,8 +2301,9 @@ func TestScrapeLoop_ProgressIsPerSystem(t *testing.T) { {DBID: 13, MediaTitleDBID: 3, Path: filepath.Join(root2, "two.sfc")}, {DBID: 14, MediaTitleDBID: 4, Path: filepath.Join(root2, "three.sfc")}, }, nil) - mockDB.On("ApplyScrapeResult", mock.Anything, mock.AnythingOfType("int64"), mock.AnythingOfType("int64"), mock.Anything). - Return(nil) + mockDB.On( + "ApplyScrapeResult", mock.Anything, mock.AnythingOfType("int64"), mock.AnythingOfType("int64"), mock.Anything, + ).Return(nil) s := &GamelistXMLScraper{db: mockDB} ch := make(chan scraper.ScrapeUpdate, 128) @@ -2362,8 +2365,9 @@ func TestScrapeLoop_ProgressIncludesCompanionBaseline(t *testing.T) { mockDB.On("GetTitlesBySystemID", "nes").Return([]database.TitleWithSystem{{ DBID: 1001, SystemDBID: 100, Slug: "normal", Name: "Normal", }}, nil) - mockDB.On("ApplyScrapeResult", mock.Anything, mock.AnythingOfType("int64"), mock.AnythingOfType("int64"), mock.Anything). - Return(nil) + mockDB.On( + "ApplyScrapeResult", mock.Anything, mock.AnythingOfType("int64"), mock.AnythingOfType("int64"), mock.Anything, + ).Return(nil) s := &GamelistXMLScraper{db: mockDB} ch := make(chan scraper.ScrapeUpdate, 128) From 06bdafcd908545bb2915ded556cbc5853454909d Mon Sep 17 00:00:00 2001 From: Callan Barrett Date: Fri, 29 May 2026 10:19:58 +0800 Subject: [PATCH 17/17] docs(scraper): clarify stale image lookup --- docs/scraper.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/scraper.md b/docs/scraper.md index 14ea757d..9631a1ca 100644 --- a/docs/scraper.md +++ b/docs/scraper.md @@ -207,7 +207,7 @@ Only one scraper can run at a time, and scraping is mutually exclusive with medi `media.meta` returns the metadata graph for media rows: media-level tags and properties, title-level tags and properties, and stored system identity. Single requests accept `mediaId` or `system`/`path` and keep the single-response shape; batch requests use `items` and return per-item results. Binary property bytes are not included; clients should use `media.image` for image data. -`media.image` accepts one media ref plus image type preferences such as `image`, `boxart`, `boxart3d`, `screenshot`, `wheel`, `titleshot`, `map`, `marquee`, and `fanart`. These resolve to canonical image property tags; for example `boxart` becomes `property:image-boxart` and `image` becomes `property:image-image`. Media-level properties are preferred over title-level properties for the same type. Stale file path properties are removed automatically and lookup falls through to the next available source. +`media.image` accepts one media ref plus image type preferences such as `image`, `boxart`, `boxart3d`, `screenshot`, `wheel`, `titleshot`, `map`, `marquee`, and `fanart`. These resolve to canonical image property tags; for example `boxart` becomes `property:image-boxart` and `image` becomes `property:image-image`. Media-level properties are preferred over title-level properties for the same type. For stale image properties in these canonical tags, such as missing file paths for `property:image-boxart` or `property:image-image`, `media.image` logs the stale property in memory only and does not delete DB rows; lookup falls through to the next available source. ## Useful Focused Tests