silkyclouds/PMDA

PMDA (pronounced "pimda")

Files-first music library matching, cleanup, playback, and discovery for large self-hosted collections.

Files-first library | Multi-provider matching | Trusted batch import | Duplicate and incomplete review | OCR and optional AI | PostgreSQL and Redis | Docker and Unraid

Docker Hub | User Guide | Configuration | Discord

[Image: PMDA interface overview]

PMDA (pronounced "pimda") is a self-hosted music library application and processing pipeline. It scans folders, reads audio tags and artwork, matches albums against local and external evidence, detects duplicates and incomplete releases when needed, publishes a fast browseable library, and provides a web player with history, likes, playlists, recommendations, artist pages, and admin review tools.

Quick Start

PMDA ships as one Docker container with the backend, React frontend, PostgreSQL, Redis, OCR tooling, and media tooling. MusicBrainz mirrors and AI runtimes such as Ollama, OpenAI, Anthropic, and Gemini are configured as external providers that you manage outside PMDA.

docker run -d \
  --name pmda \
  --restart unless-stopped \
  -p 5005:5005 \
  -e PMDA_AUTH_ENABLED=1 \
  -e PMDA_MEDIA_CACHE_ROOT=/cache \
  -v /srv/pmda/config:/config \
  -v /srv/pmda/cache:/cache \
  -v /srv/music:/music:rw \
  -v /srv/pmda/review:/dupes:rw \
  -v /srv/pmda/export:/export:rw \
  meaning/pmda:latest

Open http://localhost:5005, create the admin account, then configure:

Area | What to set
Sources | One or more standard source folders, plus optional incoming folders
Review targets | Duplicate and incomplete destinations, usually outside source roots
Library serving | Existing curated destination or PMDA generated library root
Export | Hardlink, symlink, copy, or move into a clean downstream tree
Providers | Metadata, artwork, web-search, concert, and AI providers you want enabled
Users | Admins, normal users, download/statistics/AI permissions, sharing preferences

Docker tags:

  • meaning/pmda:latest: normal release tag
  • meaning/pmda:beta: beta rollout tag; during testing it is often built from the same code as the current release

What PMDA Does

Capability | Details
Scan and group albums | Walks configured roots, groups audio by folder/release structure, reads tags, durations, formats, covers, and provider IDs.
Match identities | Combines MusicBrainz, Discogs, Last.fm, Bandcamp, iTunes/Apple Music, Deezer, Spotify, Qobuz, TIDAL, TheAudioDB, AcoustID, OCR, web search, and optional AI.
Preserve trusted metadata | Keeps existing MusicBrainz release/release-group IDs, Discogs release IDs, Last.fm album MBIDs, Bandcamp URLs, PMDA match tags, cover-provider tags, and local covers when importing trusted folders.
Detect duplicates | Chooses preferred editions using format, tags, provider identity, track structure, and policy; losing editions can stay review-only or move to the duplicate target.
Detect incompletes | Flags broken or incomplete albums and can quarantine them with reversible move records.
Publish a fast library | Writes PostgreSQL browse rows, track rows, artist/genre/label facets, cached artwork, and a published snapshot used during heavy scans.
Play music | Streams tracks from the library, supports album playback, queues, mobile player behavior, likes, ratings, notes, playlists, and downloads when permitted.
Remember playback | Provides a History page with period, text, artist, genre, and label filters, separate from the Home page's recently played surface.
Enrich artist pages | Builds artist profiles, images, biographies, facts, similar artists, collaborations, concerts, and full discography grids.
Export downstream | Generates a clean library tree for Plex, Jellyfin, Navidrome, or direct filesystem use using hardlinks, symlinks, copies, or moves.
Operate at large scale | Uses PostgreSQL, Redis, provider caches, checkpointed jobs, scan-safe browse fallbacks, and optional Unraid disk-aware power saver mode.
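
The four export strategies named in the "Export downstream" capability differ mainly in what happens to the source file and whether the two paths must share a filesystem. The following is a minimal sketch of that choice, not PMDA's implementation: the function name `place_file` and the strategy strings are hypothetical, and the real values accepted by EXPORT_LINK_STRATEGY may differ.

```python
import os
import shutil
from pathlib import Path

# Hypothetical strategy names used only for this illustration.
STRATEGIES = ("hardlink", "symlink", "copy", "move")


def place_file(src: Path, dst: Path, strategy: str) -> Path:
    """Materialize src at dst using one of the four export strategies."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    dst.parent.mkdir(parents=True, exist_ok=True)
    if strategy == "hardlink":
        # Hardlinks only work when src and dst are on the same filesystem.
        os.link(src, dst)
    elif strategy == "symlink":
        # Link to the absolute source path so the link does not depend on cwd.
        os.symlink(src.resolve(), dst)
    elif strategy == "copy":
        # copy2 also preserves timestamps where the platform allows it.
        shutil.copy2(src, dst)
    else:
        # Move removes the file from the source tree.
        shutil.move(str(src), str(dst))
    return dst
```

Hardlinks and symlinks cost almost no extra disk space, which matters for large collections, while copy and move produce a tree that is independent of the source roots.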

Visual Tour

Library Home And Discovery

[Screenshot: PMDA library home]

Trusted Batch Import

[Screenshot: PMDA trusted library import]

Full Artist Pages

[Screenshot: PMDA artist page]

Album Detail And Playback

[Screenshot: PMDA album page]

Workflows

Full Scan

Use a full scan when PMDA should inspect source folders, group album candidates, match them, detect duplicates/incompletes, publish the library, and optionally export or refresh downstream players.

Scan jobs can be started, paused, resumed, stopped, and cleared. Each job records scan history, pipeline traces, move audit rows, AI cost rollups, and progress snapshots. Long scans are designed to keep library browse pages usable through published snapshots and short-timeout browse fallbacks.

Trusted Library Import

Use Library -> Import trusted for an already curated destination, for example /music/Music_matched after SongKong or a previous PMDA materialization.

Trusted import:

  • indexes folders incrementally by batch
  • defaults to batch size 250
  • checkpoints completed folders
  • resumes safely with Resume on and Reset off
  • never clears unrelated library rows
  • skips provider lookups and AI calls
  • reads local tags, track data, covers, and trusted provider IDs
  • upserts albums, tracks, and published browse rows as it goes

Closing the browser tab does not stop the import. Restarting the container stops the in-memory worker, but already completed batches remain checkpointed and can be resumed from the same root.
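
The batch, checkpoint, and resume behavior described above can be sketched as follows. This is an illustration under stated assumptions, not PMDA's code: the `import_checkpoint` table, the `import_trusted` function, and the `index_folder` callback are hypothetical names, and only the default batch size of 250 comes from the list above.

```python
import sqlite3
from typing import Callable, Iterable, Iterator, List


def batches(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def import_trusted(
    db: sqlite3.Connection,
    folders: Iterable[str],
    index_folder: Callable[[str], None],  # hypothetical per-folder indexer
    batch_size: int = 250,
    reset: bool = False,
) -> int:
    """Index folders in batches and return how many were indexed in this run."""
    db.execute("CREATE TABLE IF NOT EXISTS import_checkpoint (folder TEXT PRIMARY KEY)")
    if reset:
        db.execute("DELETE FROM import_checkpoint")
    db.commit()
    done = {row[0] for row in db.execute("SELECT folder FROM import_checkpoint")}
    pending = [f for f in folders if f not in done]
    indexed = 0
    for batch in batches(pending, batch_size):
        for folder in batch:
            index_folder(folder)
        # Checkpoint only after the whole batch succeeded.
        db.executemany(
            "INSERT OR IGNORE INTO import_checkpoint (folder) VALUES (?)",
            [(f,) for f in batch],
        )
        db.commit()
        indexed += len(batch)
    return indexed
```

In this sketch a crash in the middle of a batch means that batch is indexed again on resume, which is harmless only if indexing is an upsert, as the list above says it is.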

Review And Recovery

Admin review surfaces include:

  • duplicate groups and chosen winners
  • incomplete/broken albums
  • moved album history
  • restore actions
  • scan pipeline trace
  • scan history details
  • provider match details
  • cover-cache repair
  • publication reconcile
  • strict-match backlog export
  • smart provider promotion
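
Moved album history and restore actions both depend on recording where each album came from. A minimal sketch of such a reversible move record is shown below; the function names, the JSON-lines audit file, and the record fields are assumptions made for illustration and do not describe PMDA's actual audit schema.

```python
import json
import shutil
import time
from pathlib import Path


def quarantine(album_dir: Path, target_root: Path, audit_log: Path, reason: str) -> dict:
    """Move an album folder to a review target and append a restore record."""
    dest = target_root / album_dir.name
    if dest.exists():
        raise FileExistsError(f"review target already contains {dest}")
    dest.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "source": str(album_dir),
        "dest": str(dest),
        "reason": reason,
        "moved_at": time.time(),
    }
    shutil.move(str(album_dir), str(dest))
    # The record is appended only after the move succeeded.
    with audit_log.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def restore(record: dict) -> Path:
    """Undo a quarantine by moving the album back to its original path."""
    source, dest = Path(record["source"]), Path(record["dest"])
    if source.exists():
        raise FileExistsError(f"refusing to overwrite {source}")
    source.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(dest), str(source))
    return source
```

Refusing to overwrite an existing path in both directions keeps the operation reversible: a restore never destroys whatever has appeared at the original location since the move.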

Provider Inventory

PMDA has several provider categories. Some providers are identity sources, some are fallback metadata/artwork sources, some are web or concert sources, and some are AI runtimes.

Metadata, Identity, And Artwork Providers

Provider | Used for | Credential
MusicBrainz | Core album/release-group identity, artist IDs, release labels, tracklists, classical metadata; can use public API or local mirror | Contact email; optional local mirror
Discogs | Release variants, labels, catalog numbers, covers, artist profiles, collection-style signals | User token
iTunes / Apple Music | Album search, title/year/label hints, tracklists, high-resolution cover fallback | No key
Deezer | Album search, tracklists, genres, labels, cover fallback | No key
Spotify | Public album page metadata and artwork evidence | No key
Qobuz | Public store album metadata, edition hints, artwork fallback | No key
TIDAL | Public album page metadata and artwork fallback | No key
Last.fm | Album/artist metadata, genres, artist images, scrobbles/listeners, now playing, scrobbling, loved-track sync | API key and secret; user session for scrobbling
Bandcamp | Independent release fallback, album/artist pages, artwork, supporter/community signals | No key
TheAudioDB | Optional album and artist artwork/profile fallback | API key
Fanart.tv | Optional MBID-based artist artwork fallback | API key
Wikipedia / Wikidata / Wikimedia Commons | Artist biography, page image, Commons/Wikidata media fallback, contextual enrichment | No key
AcoustID / Chromaprint | Fingerprint-based identification when tags are missing or weak | AcoustID API key; fpcalc in container
Serper | Dedicated web-search backend for MusicBrainz identity evidence and enrichment search | API key
Local / media cache | Embedded tags, sidecar artwork, folder covers, PMDA cached images and derived media | Local files

The provider gateway rate-limits and caches external lookups for Discogs, Last.fm, iTunes, Deezer, Spotify, Qobuz, TIDAL, TheAudioDB, and Bandcamp.
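
To illustrate what "rate-limits and caches" can mean in practice, here is a small sketch of a gateway that enforces a minimum interval per provider and remembers earlier answers. The class name `ProviderGateway`, the single `min_interval` setting, and the in-memory cache are assumptions for this example; PMDA's real gateway and its limits are not described in this document.

```python
import time
from typing import Callable, Dict, Tuple


class ProviderGateway:
    """Per-provider minimum call interval plus an in-memory result cache."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Dict[str, float] = {}
        self._cache: Dict[Tuple[str, str], object] = {}

    def lookup(self, provider: str, query: str, fetch: Callable[[str], object]) -> object:
        key = (provider, query)
        if key in self._cache:
            # Cached answers never touch the network or the rate limiter.
            return self._cache[key]
        last = self._last_call.get(provider)
        if last is not None:
            wait = self.min_interval - (self._clock() - last)
            if wait > 0:
                self._sleep(wait)
        self._last_call[provider] = self._clock()
        result = fetch(query)
        self._cache[key] = result
        return result
```

The clock and sleep functions are injectable so the limiter can be exercised in tests without real delays. Each provider is throttled independently, so a slow limit on one source does not hold back lookups against another.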

Concert And Map Providers

Provider | Used for | Credential
Songkick | Default upcoming concert source for artist pages, fetched from public pages | No key
Bandsintown | Explicit alternative concert provider | No key; PMDA uses an app id and defaults to pmda
OpenStreetMap / Nominatim | Best-effort city geocoding for concert maps | No key

AI Providers

Provider | Runtime ID | Used for
OpenAI API | openai-api | Text, vision, matching disambiguation, summaries, assistant tasks
OpenAI Codex OAuth | openai-codex | OAuth-backed OpenAI runtime and Codex-assisted tasks
Anthropic Claude | anthropic | External AI fallback/provider choice
Google Gemini | google | External AI fallback/provider choice
Ollama | ollama | Local-first AI through an external Ollama endpoint, with selectable fast and complex models

AI is an escalation path, not the first matching step. PMDA uses deterministic tags, track structure, provider IDs, provider cross-checks, OCR, and caches first, then uses AI for ambiguous identities, cover checks, summaries, facts, reviews, richer profile enrichment, and curator review proposals. PMDA can list models from configured AI providers, but it does not install or pull models itself.
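
The "escalation path" idea can be expressed as an ordered chain in which deterministic resolvers run first and the AI resolver is consulted only when all of them fail. The sketch below is an assumption-laden illustration: `resolve_identity`, the resolver signature, and the album dictionary shape are invented for this example and are not PMDA's matching API.

```python
from typing import Callable, Optional, Sequence, Tuple

# A resolver inspects an album description and returns an identity or None.
Resolver = Callable[[dict], Optional[str]]


def resolve_identity(
    album: dict,
    deterministic: Sequence[Tuple[str, Resolver]],
    ai_resolver: Optional[Resolver] = None,
) -> Tuple[Optional[str], str]:
    """Return (identity, source). AI runs only if every cheaper step fails."""
    for name, resolver in deterministic:
        identity = resolver(album)
        if identity is not None:
            return identity, name
    if ai_resolver is not None:
        identity = ai_resolver(album)
        if identity is not None:
            return identity, "ai"
    return None, "unresolved"
```

Returning the name of the step that produced the identity lets later stages treat an AI-derived match differently from a tag-derived one, for example by requiring review before any destructive operation.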

Library UI

PMDA's frontend is a React/Vite app with a dark responsive interface.

Available user-facing areas:

  • Home and discovery feed
  • Albums, Artists, Genres, Labels
  • Full artist pages with all album categories shown as grids, not hidden carousels
  • Album detail pages with tracks, artwork, match sources, badges, ratings, notes, and playback
  • History with period/search/artist/genre/label filters
  • Liked items
  • Recommendations
  • Playlists and playlist detail
  • Global search and suggestions
  • Mobile bottom navigation and installable PWA shell

Admin areas:

  • Scan control and progress
  • Tools, duplicates, incompletes, broken albums
  • Statistics and listening/library statistics
  • Settings and onboarding wizard
  • Users and permissions
  • MCP access settings
  • Logs, task notifications, provider preflights, runtime status

Curator MCP

PMDA exposes a local stdio MCP bridge for non-destructive curator work. A Codex/AI curator can read batches of albums missing reviews, inspect duplicate and incomplete candidates, compare an artist against provider release-groups, and create review proposals. It does not move files directly; proposals remain pending until a human validates them in PMDA.

The installable Codex skill/plugin plus Claude, Cursor, and OpenAI MCP examples live in integrations/pmda-agent-toolkit. For Codex Desktop, PMDA can also generate a private preconfigured plugin ZIP from Settings -> MCP agent access, so users can import it through the Codex UI without a terminal.

Matching Model

PMDA does not trust one provider blindly. It scores and classifies evidence into confidence tiers:

Tier | Meaning
strict_mb | Strict MusicBrainz or equivalent release identity with move-safe proof
strong_provider | Strict verified non-MusicBrainz provider identity
soft_provider | Useful trusted provider ID or metadata, publishable but not automatically move-safe
ai_review | AI-assisted identity that still requires review for destructive operations
unresolved | Not enough trusted evidence

Only strict/strong evidence is considered safe for automatic filesystem materialization. Softer identities can still improve browsing, cover recovery, and review surfaces without allowing unsafe moves by default.
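
The rule that only strict and strong evidence allows automatic materialization can be written as a small gate. The tier values below are taken from the table above; the `Tier` enum and the `can_materialize` function are hypothetical names used only to illustrate the rule.

```python
from enum import Enum


class Tier(str, Enum):
    STRICT_MB = "strict_mb"
    STRONG_PROVIDER = "strong_provider"
    SOFT_PROVIDER = "soft_provider"
    AI_REVIEW = "ai_review"
    UNRESOLVED = "unresolved"


# Only these tiers permit automatic filesystem materialization by default.
MOVE_SAFE = frozenset({Tier.STRICT_MB, Tier.STRONG_PROVIDER})


def can_materialize(tier: Tier) -> bool:
    """True when the tier is safe for automatic moves, links, or copies."""
    return tier in MOVE_SAFE
```

For example, `can_materialize(Tier("soft_provider"))` is false, so an album matched at that tier could still be browsed and reviewed but would not be moved automatically.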

Classical music receives special handling for composers, works, conductors, orchestras, ensembles, soloists, performers, labels, catalog numbers, disc counts, track counts, and duration consistency.

Configuration Reference

Most settings can be edited in the UI. Important environment/config keys include:

Group | Keys
Auth | PMDA_AUTH_ENABLED, PMDA_AUTH_ALLOW_PUBLIC_BOOTSTRAP, PMDA_AUTH_TRUST_PROXY_HEADERS, PMDA_AUTH_SESSION_COOKIE_SECURE, PMDA_AUTH_SESSION_COOKIE_SAMESITE
Storage | PMDA_CONFIG_DIR, PMDA_MEDIA_CACHE_ROOT, LIBRARY_MODE, FILES_ROOTS, LIBRARY_SOURCE_ROOTS, LIBRARY_INTAKE_ROOTS, LIBRARY_SERVING_ROOT, EXPORT_ROOT
Scan pipeline | SCAN_THREADS, PIPELINE_ENABLE_MATCH_FIX, PIPELINE_ENABLE_DEDUPE, PIPELINE_ENABLE_INCOMPLETE_MOVE, PIPELINE_ENABLE_EXPORT, PIPELINE_ENABLE_PLAYER_SYNC, PIPELINE_POST_SCAN_ASYNC
Matching providers | USE_MUSICBRAINZ, MUSICBRAINZ_EMAIL, MUSICBRAINZ_BASE_URL, USE_DISCOGS, DISCOGS_USER_TOKEN, USE_ITUNES, USE_DEEZER, USE_SPOTIFY, USE_QOBUZ, USE_TIDAL, USE_LASTFM, LASTFM_API_KEY, LASTFM_API_SECRET, USE_BANDCAMP, USE_ACOUSTID, ACOUSTID_API_KEY
Artwork/web providers | FANART_API_KEY, THEAUDIODB_API_KEY, USE_WEB_SEARCH_FOR_MB, WEB_SEARCH_PROVIDER, SERPER_API_KEY, USE_AI_WEB_SEARCH_FALLBACK
AI | AI_PROVIDER, SCAN_AI_POLICY, SCAN_PAID_PROVIDER_ORDER, OPENAI_API_KEY, OPENAI_ENABLE_API_KEY_MODE, OPENAI_ENABLE_CODEX_OAUTH_MODE, OPENAI_MODEL, OPENAI_MODEL_FALLBACKS, OPENAI_VISION_MODEL, ANTHROPIC_API_KEY, GOOGLE_API_KEY, OLLAMA_URL, OLLAMA_MODEL, OLLAMA_COMPLEX_MODEL
Export/materialization | EXPORT_LINK_STRATEGY, LIBRARY_MATERIALIZATION_MODE, LIBRARY_WINNER_PLACEMENT_STRATEGY, EXPORT_NAMING_TEMPLATE, EXPORT_INCLUDE_ALBUM_FORMAT_IN_FOLDER, EXPORT_INCLUDE_ALBUM_TYPE_IN_FOLDER
Player refresh | PIPELINE_PLAYER_TARGET, PLEX_HOST, PLEX_TOKEN, JELLYFIN_URL, JELLYFIN_API_KEY, NAVIDROME_URL, NAVIDROME_USERNAME, NAVIDROME_PASSWORD, NAVIDROME_API_KEY
Unraid power saver | STORAGE_POWER_SAVER_ENABLED, STORAGE_PROVIDER, UNRAID_HOST_MNT_ROOT, UNRAID_USER_SHARE_HOST_ROOT, UNRAID_CONTAINER_SHARE_ROOT, STORAGE_MAX_ACTIVE_DEVICES, STORAGE_SPINDOWN_POLICY
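
As an illustration of how environment keys like these are commonly consumed, the sketch below reads a few of them into a typed settings object. Only the key names come from the table above. The accepted boolean spellings, the default values, and the `Settings` and `load_settings` names are assumptions and may not match how PMDA parses its configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumed truthy spellings; the Quick Start example uses "1".
_TRUE = {"1", "true", "yes", "on"}


def env_bool(env: Mapping[str, str], key: str, default: bool = False) -> bool:
    raw = env.get(key)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in _TRUE


@dataclass(frozen=True)
class Settings:
    auth_enabled: bool
    media_cache_root: str
    scan_threads: int
    use_musicbrainz: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from a mapping; every default here is an assumption."""
    return Settings(
        auth_enabled=env_bool(env, "PMDA_AUTH_ENABLED"),
        media_cache_root=env.get("PMDA_MEDIA_CACHE_ROOT", "/cache"),
        scan_threads=int(env.get("SCAN_THREADS", "4")),
        use_musicbrainz=env_bool(env, "USE_MUSICBRAINZ", default=True),
    )
```

Passing the environment in as a mapping, rather than reading `os.environ` directly inside each function, makes the parsing easy to check against a plain dictionary.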

Unraid Notes

PMDA works well on Unraid with network=host and persistent appdata. Recommended mappings:

Container path | Purpose
/config | PMDA settings, PostgreSQL data, Redis state, auth state, logs
/cache | SSD/NVMe artwork and media cache
/music | Library source roots and/or trusted matched library
/dupes | Duplicate and incomplete review target
/export | Optional generated clean library
/host_mnt | Optional read-only bind of host /mnt for disk-aware power saver scans

The optional Unraid disk-aware mode scans one physical disk bucket at a time while preserving canonical /music/... paths in PMDA. Keep /host_mnt read-only.
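
The idea of scanning one disk at a time while keeping canonical paths can be sketched as a path-mapping step. The example assumes the usual Unraid layout in which each array disk appears as /mnt/diskN/<share>/..., bound read-only at /host_mnt inside the container, and a share named music. The function names and that share name are hypothetical, and PMDA's actual bucket logic may work differently.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Tuple


def split_physical(
    path: str,
    host_mnt: str = "/host_mnt",
    share: str = "music",          # assumed share name
    canonical_root: str = "/music",
) -> Tuple[str, str]:
    """Map /host_mnt/<disk>/<share>/<rest> to (<disk>, /music/<rest>)."""
    rel = PurePosixPath(path).relative_to(host_mnt)
    disk, share_name, *rest = rel.parts
    if share_name != share:
        raise ValueError(f"{path} is not inside share {share!r}")
    return disk, str(PurePosixPath(canonical_root, *rest))


def bucket_by_disk(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group canonical album paths by the physical disk that holds them."""
    buckets: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        disk, canonical = split_physical(path)
        buckets[disk].append(canonical)
    return dict(buckets)
```

A scanner could then process the buckets one after another, so that only one disk needs to be spun up at a time while every stored path still starts with /music.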

Official template: PMDA Unraid Community Apps template

Architecture

PMDA is now modularized. pmda.py remains the process bootstrap and compatibility wiring layer, while feature logic lives in focused packages:

Package | Role
pmda_api | Flask blueprints and API route boundaries
pmda_core | config, auth, state DB, scheduling, scan progress, library browse helpers, runtime utilities
pmda_scan | discovery, resume, queueing, progress, reconciliation, history, pipeline trace
pmda_discovery | filesystem walks, tag extraction, audio inspection, file watchers, storage buckets
pmda_matching | MusicBrainz, Discogs, Bandcamp, public provider fallbacks, confidence, arbitration
pmda_publication | PostgreSQL schema, index rebuild, published snapshots, trusted import, cover cache
pmda_library | browse/detail/profile/recommendation/catalog runtime behavior
pmda_enrichment | artwork, profiles, Wikipedia/Wikidata, media cache, image utilities
pmda_ai | provider config, model probing, AI calls, guardrails, assistant/RAG support
pmda_materialization | export, move/copy/link policy, audit helpers
pmda_dedupe and pmda_incompletes | review, detection, move/restore flows
pmda_integrations | Last.fm and post-publication player refresh
pmda_mcp | MCP server/admin integration

Frontend stack:

  • React 18
  • TypeScript
  • Vite
  • Tailwind CSS
  • shadcn/Radix UI components
  • TanStack Query
  • lucide icons
  • Leaflet for maps

Runtime stack:

  • Python 3.11
  • Flask
  • PostgreSQL 15
  • Redis
  • SQLite state/checkpoint DB
  • FFmpeg / ffprobe
  • Tesseract OCR
  • Chromaprint / AcoustID
  • Mutagen tag fallback

Development

Useful checks:

python3 scripts/pipeline_audit_gate.py
python3 scripts/legacy_cleanup_gate.py
python3 -m py_compile pmda.py
cd frontend && npm run build

The autonomous refactor guard is available for deeper validation:

python3 scripts/autonomous_refactor_guard.py --phase settings_config_blueprint --full

Documentation

About

PMDA (pronounced pimda) is a self-hosted music library app for matching, cleanup, playback, discovery, trusted imports, exports, and optional dedupe at scale.
