A CLI/TUI toolkit for music collectors who manage their own libraries. Lattice handles library visualization, integrity verification, cover art extraction, and metadata auditing, built on mutagen and tqdm, with flac and ffmpeg shelled out for integrity checks.
Lattice is read-only. It reads tags and decodes audio, and it writes only reports, playlists, and extracted cover art. It never modifies the metadata inside your audio files. The optional companion scripts in `scripts/` are the deliberate exception: they do modify files (tags, rating bytes, folder layout) and must be used with caution. See Companion scripts.
Note: This is considered completed software. It is feature complete; bugs will be fixed as they are reported, but no new features are planned. It has been thoroughly tested and is known to be fully functional on the primary development environment: Fedora Linux 44 (Workstation Edition), kernel 7.0.9-205.fc44.x86_64, on Python 3.14, with `flac` and `ffmpeg` from the Fedora repositories. While it is pure Python and should be cross-platform, this specific setup is the only officially tested environment.
- Why this exists
- Features · Sample output
- Installation · Requirements
- Usage
- Modes: AI library export · Genre wings · Multi-root scanning · Integrity checks · Library statistics · Cover art extraction · Color output · Supported formats
- Architecture
- Full help output
- Companion scripts (destructive): retag.py · genre_tidy.py · rerate.py · cleaner.py · genre_foldermap.py · replaygain.py · apestrip.py
- Credits & Acknowledgements · Support
Modern music players often hide your library behind proprietary databases. Lattice is built for collectors who treat the filesystem as the source of truth. It reads tags directly via mutagen, ensuring your library is portable and player-agnostic.
| Mode | Flag | Description |
|---|---|---|
| Library tree | `--library` | Builds a formatted text tree with artist/album/track/rating/genre |
| AI library export | `--ai-library` | Token-efficient flat export for LLM recommendation prompts |
| Genre wings | `--all-wings` | Generates a separate library tree file for each genre |
| AI wings | `--ai-wings` | Generates separate AI-friendly flat library files per genre |
| Smart Playlist | `--playlist` | Generates an .m3u playlist based on a dynamic rule (e.g. rating >= 4) |
| Library statistics | `--stats` | Library-wide statistics: format breakdown, bitrate, ratings, genres, top artists |
| FLAC integrity | `--testFLAC` | Verifies FLAC via `flac -t` (authoritative) or FFmpeg; sorts files into severity tiers |
| MP3 integrity | `--testMP3` | Decodes MP3 through FFmpeg (demuxer forced); sorts files into severity tiers |
| Opus integrity | `--testOpus` | Decodes Opus through FFmpeg; sorts files into severity tiers |
| WAV integrity | `--testWAV` | Decodes WAV through FFmpeg; sorts files into severity tiers |
| WMA integrity | `--testWMA` | Decodes WMA through FFmpeg; sorts files into severity tiers |
| Cover art extraction | `--extractArt` | Extracts embedded art to cover.jpg with format priority ranking |
| Missing art report | `--missingArt` | Lists directories with no cover art (folder or embedded) to text |
| Art quality audit | `--auditArtQuality` | Reports extracted/folder covers below a resolution threshold |
| Duplicate detection | `--duplicates` | Four-section report: exact album dupes across directories, within-folder multi-format pairs, fuzzy similar-name candidates, and track-level dupes filtered by duration |
| Tag audit | `--auditTags` | Reports files missing title, artist, track number, or genre to text |
| Bitrate audit | `--auditBitrate` | Reports files falling below a minimum bitrate floor |
| ReplayGain audit | `--auditReplayGain` | Reports per-album ReplayGain coverage (missing, partial, no album gain, OK); Opus R128 gain counts as tagged |
| Version | `--version` | Prints version and exits |
Running with no arguments launches an interactive TUI: a full-screen curses interface with arrow-key navigation, color-coded section groups (Library, Integrity, Artwork, Metadata), and a highlighted selection cursor. Menus, parameter prompts, and pause screens all render inside styled Unicode boxes for a consistent experience. Library tree, AI export, and genre wings live in a dedicated submenu. Falls back to typed input if curses is unavailable.
ARTIST: Ólafur Arnalds
├── ALBUM: Found Songs (Neo-Classical)
├── SONG: 01. Ólafur Arnalds — Erla's Waltz (flac) [★★★★★ 5.0/5]
├── SONG: 02. Ólafur Arnalds — Raein (flac) [★★★★★ 5.0/5]
├── SONG: 03. Ólafur Arnalds — Romance (flac) [★★★★★ 5.0/5]
├── SONG: 04. Ólafur Arnalds — Allt varð hljótt (flac) [★★★★★ 5.0/5]
├── SONG: 05. Ólafur Arnalds — Lost Song (flac) [★★★★★ 5.0/5]
├── SONG: 06. Ólafur Arnalds — Faun (flac) [★★★★★ 5.0/5]
└── SONG: 07. Ólafur Arnalds — Ljósið (flac) [★★★★★ 5.0/5]
Genre tags are optional (--genres). If your genre metadata is inconsistent, leave them off; the tree gets unwieldy fast.
Lattice installs as a Python package, or compiles into a standalone binary (PyInstaller, hatch run build-bin).
Option 1: pipx (recommended)
pipx install .
# now you can run `lattice` globally

Option 2: pip (virtual environment)
python -m venv .venv
source .venv/bin/activate
pip install .

Runtime dependencies are mutagen and tqdm (installed automatically). The integrity modes shell out to system tools:
- `flac`: used by `--testFLAC` (preferred)
- `ffmpeg`: used by `--testMP3`, `--testOpus`, `--testWAV`, `--testWMA`, and as a fallback for `--testFLAC`
# Fedora/RHEL
sudo dnf install flac ffmpeg-free
# Debian/Ubuntu
sudo apt install flac ffmpeg
# Windows
winget install flac ffmpeg

Tests. The suite is stdlib unittest (no extra dependencies): pure-helper unit tests plus integration tests that run the report modes, and the companion scripts, against a committed fixture library. Run it from the repo root:
python -m unittest discover

Lattice remembers your library location. On first run (TUI or CLI) it asks for your music library path and saves it to ~/.config/lattice/config.json; after that, --root is optional. Repeat --root to scan several libraries together in one pass (see Multi-root scanning).
# Build a library tree with genre tags
lattice --library --output library.txt --genres
# Export library for AI/LLM recommendation prompts
lattice --ai-library --output library_ai.txt
# Generate per-genre library files (add --genres to label each album)
lattice --all-wings --output wings/
# Generate per-genre AI-friendly library files
lattice --ai-wings --output wings_ai/
# Library statistics (prints to screen, or --output for file)
lattice --stats
lattice --stats --output library_stats.txt
# Verify FLAC integrity (4 parallel workers)
lattice --testFLAC --output flac_errors.txt --workers 4
# Verify MP3s for decode errors
lattice --testMP3 --output mp3_errors.txt --workers 4
# Verify Opus files for decode errors
lattice --testOpus --output opus_errors.txt --workers 4
# Extract cover art (FLAC > Opus > M4A > MP3 priority)
lattice --extractArt
# Preview art extraction without writing files
lattice --extractArt --dry-run
# Report directories missing cover art
lattice --missingArt --output missing_art.txt
# Find duplicates: exact, multi-format, similar-name, track-level
lattice --duplicates --output duplicates.txt
# Scan two libraries together (repeat --root); surfaces cross-library duplicates
lattice --duplicates --root ~/Music --root /mnt/usb/Albums --output duplicates.txt
# Audit tags for missing metadata
lattice --auditTags --output tag_audit.txt
# Audit ReplayGain coverage per album (add --verbose to also list fully-tagged albums)
lattice --auditReplayGain --output replaygain_audit.txt

The --ai-library mode generates a flat, pipe-delimited summary designed to fit inside an LLM context window for music recommendations:
Artist | Album | Genre | Rating | Tracks
--------------------------------------------------
Converge | Jane Doe | Metalcore | 4.8 | 12
Ólafur Arnalds | Found Songs | Neo-Classical | 5.0 | 7
Rating is the average across all rated tracks. Tracks is the number of audio files in the album directory. If you've culled 3-star-and-below tracks from disk, this is your survivor count. Paste the output into a prompt and ask for recommendations against your actual library.
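If you want to filter the export before pasting it into a prompt, it can be read back with a few lines of Python. This is a minimal sketch, assuming exactly the five pipe-delimited columns shown above, a header row, and a dashed separator line; `parse_ai_export` and `AlbumRow` are hypothetical names, not part of Lattice, and an artist or album name that itself contains `|` would not parse correctly.

```python
from typing import NamedTuple, Optional


class AlbumRow(NamedTuple):
    artist: str
    album: str
    genre: str
    rating: Optional[float]  # None when the rating column is not a number
    tracks: int


def parse_ai_export(text):
    """Parse the flat export into rows (hypothetical helper, not Lattice API)."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Skip the header row, the dashed separator, and anything malformed.
        if len(parts) != 5 or parts[0] == "Artist":
            continue
        artist, album, genre, rating, tracks = parts
        try:
            rating_value = float(rating)
        except ValueError:
            rating_value = None
        rows.append(AlbumRow(artist, album, genre, rating_value, int(tracks)))
    return rows


SAMPLE = """Artist | Album | Genre | Rating | Tracks
--------------------------------------------------
Converge | Jane Doe | Metalcore | 4.8 | 12
Ólafur Arnalds | Found Songs | Neo-Classical | 5.0 | 7
"""

# Keep only albums rated 4.9 or higher.
favourites = [r for r in parse_ai_export(SAMPLE) if r.rating and r.rating >= 4.9]
```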
--all-wings groups albums by genre and writes one library tree file per genre into the output directory:
lattice --all-wings --root ~/Music --output wings/

Produces Alternative_Rock_Library.txt, East_Coast_Rap_Library.txt, and so on; untagged albums land in Uncategorized_Library.txt. Add --genres to label each album header.
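The naming pattern in those examples can be reproduced as follows. This is only a sketch inferred from the file names shown (whitespace becomes underscores, `_Library.txt` is appended, untagged albums go to `Uncategorized`); `wing_filename` is a hypothetical helper, and the real tool may sanitize other characters differently.

```python
def wing_filename(genre):
    """Guess the wing file name for a genre (assumption, not Lattice's code)."""
    name = (genre or "").strip() or "Uncategorized"
    # Collapse runs of whitespace into single underscores.
    return "_".join(name.split()) + "_Library.txt"
```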
--root is repeatable, so a single invocation can span more than one library:
lattice --duplicates --root ~/Music --root /mnt/usb/Albums --output duplicates.txt

Every mode aggregates across the roots: combined statistics, one merged library tree, genre wings that span both, and so on. A path passed twice is de-duped. The payoff for --duplicates is cross-library detection: an album that lives in both libraries is grouped as a single exact duplicate, and each entry is prefixed by its root's basename (Music/… vs Albums/…) so you can tell the copies apart.
To make several roots permanent, add a library_roots array to ~/.config/lattice/config.json:
{ "library_roots": ["/home/you/Music", "/mnt/usb/Albums"] }

The first-run prompt still saves only the single library_root, so a throwaway --root is never written to config.
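How the roots might be resolved can be sketched in Python. The precedence used here (command-line roots first, then `library_roots`, then the single `library_root`) is an assumption for illustration, as is the `resolve_roots` name; the source only states that both config keys exist and that a path passed twice is de-duped.

```python
import os


def resolve_roots(cli_roots, config):
    """Pick the roots to scan and drop repeats (hypothetical helper).

    cli_roots: list of paths given via repeated --root (may be empty).
    config:    dict parsed from config.json.
    """
    roots = list(cli_roots)
    if not roots:
        # Assumed order: the array wins over the single saved root.
        roots = list(config.get("library_roots") or [])
        if not roots and config.get("library_root"):
            roots = [config["library_root"]]

    seen = set()
    unique = []
    for root in roots:
        # Normalize so "~/Music" and "~/Music/" count as the same path.
        key = os.path.normpath(os.path.expanduser(root))
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```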
The integrity modes (--testFLAC, --testMP3, --testOpus, --testWAV, --testWMA) decode every file and sort the results into four tiers rather than a flat pass/fail, because a decoder complaint is not by itself proof of damaged audio:
- CORRUPT: the file could not be decoded to the end, or a FLAC is truncated before its declared length.
- SUSPECT: decoded to the end but the tool complained (these usually still play), or a FLAC with trailing data after a complete stream.
- METADATA: only tag/container parse warnings; the audio is fine.
- OK: clean decode.
CORRUPT and SUSPECT are always listed in the report; METADATA and OK are summarized and listed only with --verbose. The exit code is 1 only when something is CORRUPT, so a clean-but-chatty library still exits 0. FFmpeg is invoked with the demuxer forced from the file extension (so a large ID3v2 tag is never mis-read as a corrupt container) and with embedded cover art skipped.
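The tier and exit-code rules above can be summarized in a short sketch. The inputs to `classify` (whether the decode reached the end, and separate counts of audio and metadata warnings) are hypothetical; how Lattice actually tells those warning classes apart is not described here, and the FLAC-specific truncation and trailing-data cases are left out.

```python
from enum import IntEnum


class Tier(IntEnum):
    OK = 0
    METADATA = 1
    SUSPECT = 2
    CORRUPT = 3


def classify(decoded_to_end, audio_warnings, metadata_warnings):
    """Map one decode result to a tier (illustrative inputs, see lead-in)."""
    if not decoded_to_end:
        return Tier.CORRUPT
    if audio_warnings:
        return Tier.SUSPECT    # decoded fully, but the tool complained
    if metadata_warnings:
        return Tier.METADATA   # only tag/container parse warnings
    return Tier.OK


def listed_in_report(tier, verbose=False):
    """CORRUPT and SUSPECT are always listed; the rest only with --verbose."""
    return verbose or tier >= Tier.SUSPECT


def exit_code(tiers):
    """Exit 1 only when at least one file is CORRUPT."""
    return 1 if any(t == Tier.CORRUPT for t in tiers) else 0
```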
--stats reports file counts, total size and duration, a per-format breakdown, a bitrate summary, the rating distribution, top genres, and top artists. It prints to the screen by default; pass --output to save the report to a file.
--extractArt writes embedded art to cover.jpg, pulling from the highest-quality source in each directory (FLAC → Opus/OGG → M4A → MP3) and preferring the "Front Cover" picture type. It checks for existing covers case-insensitively (cover/folder/front/album in .jpg/.jpeg/.png), so it won't duplicate art. Reads FLAC pictures, Opus/OGG METADATA_BLOCK_PICTURE, M4A covr atoms, and MP3 APIC frames.
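The existing-cover check and the source ranking can be sketched like this. It is a simplified illustration: the names are hypothetical, and the ranking looks only at file extensions, whereas the real mode also has to confirm that a file actually carries embedded art and prefers the "Front Cover" picture type.

```python
import os

COVER_STEMS = {"cover", "folder", "front", "album"}
COVER_EXTS = {".jpg", ".jpeg", ".png"}
# Lower rank = preferred source (FLAC, then Opus/OGG, then M4A, then MP3).
SOURCE_RANK = {".flac": 0, ".opus": 1, ".ogg": 1, ".m4a": 2, ".mp3": 3}


def has_existing_cover(filenames):
    """Case-insensitive check for cover/folder/front/album images."""
    for name in filenames:
        stem, ext = os.path.splitext(name)
        if stem.lower() in COVER_STEMS and ext.lower() in COVER_EXTS:
            return True
    return False


def pick_art_source(filenames):
    """Return the best-ranked audio file in a directory, or None."""
    candidates = [
        n for n in filenames
        if os.path.splitext(n)[1].lower() in SOURCE_RANK
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda n: SOURCE_RANK[os.path.splitext(n)[1].lower()])
```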
The status summary that each integrity mode prints is colorized: green for an all-clear, yellow for suspect counts, red for corrupt counts. Color appears only on an interactive terminal. It is suppressed inside the TUI, when output is piped or redirected, and when NO_COLOR is set, so report files and pipes stay clean.
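The three suppression rules amount to a small predicate. This is a sketch rather than Lattice's own function: `use_color` and its `in_tui` parameter are hypothetical, and treating only a non-empty `NO_COLOR` as "set" follows the common NO_COLOR convention, which is an assumption about the tool.

```python
import os
import sys


def use_color(stream=sys.stdout, in_tui=False, environ=os.environ):
    """Return True only for an interactive terminal with color allowed."""
    if in_tui:
        return False                      # the TUI draws its own styling
    if environ.get("NO_COLOR"):
        return False                      # user opted out
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())      # False when piped or redirected


GREEN, RESET = "\033[32m", "\033[0m"


def paint(text, code, enabled):
    """Wrap text in an ANSI color code when color is enabled."""
    return f"{code}{text}{RESET}" if enabled else text
```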
.mp3 · .flac · .ogg · .opus · .m4a · .wav · .wma · .aac
Lattice is a modular Python package under src/lattice/:
- `tags.py`: unified abstraction layer for format-agnostic metadata extraction (returns a `TagBundle` from a single `mutagen` open).
- `modes/`: per-mode implementation of auditing and visualization logic (library, integrity, artwork, audit, stats, playlists).
- `cli.py` / `tui.py`: the argparse dispatch and the full-screen curses interface; both call the same mode functions.
The filesystem is the source of truth: Lattice walks the tree on every invocation and keeps no index or database.
Full lattice --help
usage: lattice [-h] [--version] [--library | --ai-library | --all-wings | --ai-wings | --testFLAC | --testMP3 | --testOpus | --testWAV |
--testWMA | --extractArt | --missingArt | --auditArtQuality | --duplicates | --auditTags | --auditBitrate | --auditReplayGain |
--playlist | --stats]
[--root DIR] [--output OUTPUT] [--rule RULE] [--layout LAYOUT] [--min-art-res MIN_ART_RES] [--min-bitrate MIN_BITRATE]
[--workers WORKERS] [--prefer {flac,ffmpeg}] [--quiet] [--genres] [--paths] [--dry-run] [--only-errors | --no-only-errors]
[--ffmpeg FFMPEG] [--verbose]
[pos_root]
Music library toolkit: tree, integrity, art, duplicates, tag audit
positional arguments:
pos_root Root directory (positional fallback)
options:
-h, --help show this help message and exit
--version show program's version number and exit
--library Generate library tree
--ai-library Generate token-efficient library for AI recommendations
--all-wings Generate separate library files for each genre
--ai-wings Generate separate AI-friendly library files for each genre
--testFLAC Verify FLAC files
--testMP3 Verify MP3 files
--testOpus Verify Opus files via FFmpeg decode
--testWAV Verify WAV files via FFmpeg decode
--testWMA Verify WMA files via FFmpeg decode
--extractArt Extract embedded cover art to folder
--missingArt Report directories missing cover art
--auditArtQuality Report extracted/folder covers below a resolution threshold
--duplicates Four-section dupe report: exact albums, within-folder multi-format, similar names, track-level
--auditTags Report files with incomplete tags
--auditBitrate Report files below a certain bitrate floor
--auditReplayGain Report per-album ReplayGain coverage (missing, partial, no album gain)
--playlist Generate a smart .m3u playlist based on a rule
--stats Library-wide statistics summary
--root DIR Root directory; repeat --root to scan several libraries
together (default: read from config or current dir)
--output OUTPUT Output path
--rule RULE Smart playlist rule (e.g. "rating >= 4 and genre == 'Jazz'")
--layout LAYOUT Directory structure pattern for extracting tags from path (default: the `layout` config key,
or {artist}/{album}). Use {genre}/{artist}/{album} for a genre-first library.
--min-art-res MIN_ART_RES
Minimum resolution in pixels for --auditArtQuality (default: 500)
--min-bitrate MIN_BITRATE
Minimum bitrate in kbps for --auditBitrate (default: 192)
--workers WORKERS Parallel workers (integrity modes)
--prefer {flac,ffmpeg}
Preferred tool (FLAC mode)
--quiet Minimize output
--genres Include album genres in library tree
--paths Include absolute directory paths at the album level
--dry-run Preview changes without writing (extractArt)
--only-errors, --no-only-errors
Write only errors/warns (MP3/Opus modes)
--ffmpeg FFMPEG Path to ffmpeg
--verbose Verbose output
The scripts/ directory holds seven standalone maintenance tools. They are not part of the lattice package and deliberately sit outside its read-only contract: unlike Lattice itself, they modify your files in place, rewriting tags, rewriting rating bytes, or moving and renaming folders. Run them directly with python3.
Use them with caution. Have a backup or snapshot first, always preview with --dry-run, and read the log before applying. Each writes an append-only timestamped log and is idempotent, so a second run on an already-clean library is a no-op.
| Script | What it changes | Scope |
|---|---|---|
| `retag.py` | Genre tags on one album directory | manual, per-album |
| `genre_tidy.py` | Genre tags library-wide (through `retag.py`) | policy map, then apply |
| `rerate.py` | MP3 POPM rating bytes | reconcile DeaDBeeF / foobar |
| `cleaner.py` | Folder names and layout (moves, merges, renames) | filesystem |
| `genre_foldermap.py` | Restructures the tree into Genre/Artist/Album | filesystem |
| `replaygain.py` | Writes ReplayGain 2.0 gain/peak tags (via `rsgain`) | album-by-album |
| `apestrip.py` | Removes stray APEv2 tags from MP3s (`--keep-metadata` to migrate first) | recursive, MP3-only |
Destructive: writes genre tags in place. Always preview with `--dry-run`; pass `--log` to keep an append-only record.
A universal genre tagger designed to work directly with the --all-wings --paths output.
Audio metadata formats handle multiple genres entirely differently (ID3 uses null bytes or slashes, Vorbis uses multiple GENRE= pairs, Apple uses specific custom atoms). retag.py abstracts this container chaos away, allowing you to safely hard-overwrite genres on an entire album directory simultaneously.
The Workflow:
- Generate your wings with paths:
lattice --all-wings --root ~/Music --output wings/ --paths
(If you are using the compiled binary, replace `lattice` with `./dist/lattice`.)
- Open a generated wing (e.g., `Uncategorized_Library.txt`) and copy the bracketed `[/path/to/album]` from an album header.
- Preview the change first with `--dry-run` (prints `old -> new` per file, writes nothing):
./scripts/retag.py "/mnt/SharedData/Music/Kanye West/Yeezus" "Alternative Rap" "Industrial" --dry-run
- When it looks right, drop `--dry-run` to apply it:
./scripts/retag.py "/mnt/SharedData/Music/Kanye West/Yeezus" "Alternative Rap" "Industrial"
Destructive on `apply`. `build` is read-only; `apply` rewrites genre tags through `retag.py`. Preview `apply` with `--dry-run` first.
A two-phase tool for libraries whose genre tags have drifted: it builds an artist to genre authority map, then collapses any album that disagrees with it. It pairs Lattice with retag.py: the build phase only reads (through lattice's scanner), and the apply phase does every write through retag.py. It imports lattice, so it needs the package importable: installed via pip/pipx, or run from a checkout with PYTHONPATH=src.
This is aimed at the messy general library, not a meticulously tagged one. Because build records every genre an artist already uses, apply does nothing until you edit the map; on a cleanly tagged library it reports everything compliant.
The map. build writes an editable tab-separated file (default <library>/genre_map.tsv), one line per artist listing every genre that artist is allowed to carry:
Artist<TAB>Genre<TAB>Second Genre<TAB>...
- Every genre on the line is allowed: albums tagged with any of them are left untouched.
- The first genre is the fix target: `apply` retags any of that artist's albums whose genre is not on the line to this first genre.
- `build` seeds the line with all the genres the artist currently uses (most-common first), so the map starts as a faithful snapshot and `apply` is a no-op. To tidy, remove a stray genre from a line; its albums then collapse to the first genre. Reorder the line to change which genre is the target.
- Leave only the artist (nothing after it) to skip that artist entirely.
- Multi-genre artists get a `#` comment above their line with the per-genre counts, so low-count strays worth trimming stand out (e.g. `# Eminem: 3 genres: Hardcore Hip Hop×13, Boom Bap×1, Horrorcore×1`).
- Compilations are excluded. An album whose album-artist is `Various Artists` (or `VA`/`Various`) gets a flagged `EXCLUDED` comment, never an enforceable row, and `apply` always skips it: a compilation collects unrelated tracks with no single canonical genre, so there is nothing to enforce.
Matching is by the artist tag (normalized for quote, dash, and case variants), not the folder name. Lattice's tag layer prefers the album-artist, so a compilation is keyed under its Various Artists album-artist and caught by the exclusion above.
Safety. Seeding the map from the library's current state means apply changes nothing you have not asked for: a retag happens only where you removed a genre from a line. apply is otherwise guarded like the other companions: --dry-run previews every retag.py call and writes nothing (log lines prefixed [DRY]), an append-only timestamped log records every decision (default <library>/genre_tidy.log), and the operation is idempotent (a second apply is all no-ops). Re-running build over an existing map preserves your edits and only appends artists new to the library.
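The map rules reduce to a small decision function. The sketch below is an illustration of the rules as described, not the script's code: `parse_map_line` and `retag_target` are hypothetical names, genres are compared as exact strings, and the compilation exclusion and artist-name normalization are left out.

```python
def parse_map_line(line):
    """Return (artist, allowed_genres), or None for blank and # lines."""
    if not line.strip() or line.lstrip().startswith("#"):
        return None
    artist, *genres = [field.strip() for field in line.rstrip("\n").split("\t")]
    return artist, [g for g in genres if g]


def retag_target(album_genre, allowed):
    """Genre an album should be retagged to, or None to leave it alone."""
    if not allowed:
        return None            # artist-only line: skip this artist entirely
    if album_genre in allowed:
        return None            # every genre on the line is allowed
    return allowed[0]          # the first genre is the fix target


# Horrorcore was removed from the line, so that album collapses to the target.
artist, allowed = parse_map_line("Eminem\tHardcore Hip Hop\tBoom Bap\n")
target = retag_target("Horrorcore", allowed)
```

Because a freshly built line lists every genre the artist already uses, `retag_target` returns `None` for all of that artist's albums until the line is edited, which is the no-op behavior described above.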
The Workflow:
- Build the map (read-only):
./scripts/genre_tidy.py build /mnt/SharedData/Music
- Open `genre_map.tsv`. Each line lists an artist's current genres; remove the strays you consider mistakes (the `#`-commented lines with low counts are the usual suspects), reorder to change a fix target, or blank a line to leave an artist alone.
- Preview the changes (writes nothing):
./scripts/genre_tidy.py apply /mnt/SharedData/Music --dry-run
- Inspect `genre_tidy.log`; every retag it would perform is recorded with `[DRY]`.
- Apply for real:
./scripts/genre_tidy.py apply /mnt/SharedData/Music
Relationship to retag.py. retag.py is the manual, one-album tool; genre_tidy.py is the library-wide policy layer on top of it, calling it once per album you have tidied out of compliance. Reach for retag.py for a one-off fix, genre_tidy.py to enforce a whole-collection rule.
A real, build-generated map from a roughly 877-artist library ships at artist_genre_defaults.tsv in the repo root. It doubles as a worked example of the format (single- and multi-genre lines, the #-flagged counts, the blank-to-skip pattern) and as a maintained authority: point the tool at it with --map artist_genre_defaults.tsv. Keep it current by re-running build, which appends artists new to the library under a dated marker while preserving every line you have edited; hand-edit a line to accept a new genre for an existing artist.
Destructive: rewrites MP3 rating bytes in place. Preview with
--dry-run; every change is logged, so a run is reversible.
Reconciles MP3 star ratings between DeaDBeeF and foobar2000. Both store ratings in an ID3 POPM frame (a 0–255 byte), but on different scales, so a rating set in one reads shifted in the other. Measured on a real library:
- DeaDBeeF 2★ writes byte `127`, which foobar reads as 3★.
- DeaDBeeF 4★ writes byte `254`, which foobar reads as 5★.
foobar's own values are read the same by both players (byte 196 shows 4★ in DeaDBeeF and foobar alike). So rerate.py rewrites DeaDBeeF's odd bytes to the equivalent foobar value, making the two agree without changing what DeaDBeeF shows: 127 → 64 (both 2★) and 254 → 196 (both 4★).
It touches only those exact bytes. foobar's canonical values, MusicBee's bytes (186/242, which already read correctly), unrated files, and every non-MP3 file are left alone; Vorbis/Opus ratings are clean 0–5 integers and are unaffected. It writes an append-only timestamped log (default <directory>/rerate.log) recording every old -> new change, so a run is fully auditable and reversible. Idempotent.
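The byte mapping itself is tiny. The sketch below uses only the two pairs named above; the script's actual `REMAP` table may contain other entries, and `remap_popm` is a hypothetical name. Reading and writing the POPM frame is omitted.

```python
# DeaDBeeF byte -> equivalent foobar byte (only the pairs stated in the text).
REMAP = {127: 64, 254: 196}


def remap_popm(rating_byte):
    """Return the reconciled POPM byte; unknown bytes pass through unchanged."""
    return REMAP.get(rating_byte, rating_byte)


def describe_change(path, rating_byte):
    """Format a log line in the `<file>: old -> new` style, or None if unchanged."""
    new_byte = remap_popm(rating_byte)
    if new_byte == rating_byte:
        return None
    return f"{path}: {rating_byte} -> {new_byte}"
```

None of the target values is itself a key in the table, so applying the mapping twice gives the same result as applying it once, which matches the idempotence described above.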
The Workflow:
- Preview:
./scripts/rerate.py /mnt/SharedData/Music --dry-run
- Inspect `rerate.log` (each change is logged as `<file>: 254 -> 196`).
- Apply:
./scripts/rerate.py /mnt/SharedData/Music
Scope. rerate.py is MP3/POPM-only and remaps a fixed set of byte values (REMAP in the script). The diagnosis behind it is simply "which byte does each player read as which star"; if your players use a different scale, edit that map.
Destructive: moves, merges, and renames folders. Preview with `--dry-run` and read the log before applying.
A one-shot consolidator for album folders that have fragmented across two paths because of inconsistent metadata. The pattern looks like this:
Music/Modern Baseball/You're Gonna Miss It All/ ← 3 mp3s (straight quote)
Music/Modern Baseball/You’re Gonna Miss It All/ ← 3 opus (curly quote)
Same album, no track overlap, scattered between two folders by filesystem accident. The same artifact shows up at the artist level (Jay-Z & Kanye West/ vs JAY‐Z & Kanye West/, different hyphen codepoints) and across casing variants (BONES/ vs Bones/).
cleaner.py walks the library, finds every sibling pair of folders whose names normalize to the same key (after folding curly→straight quotes, en/em-dashes→ASCII hyphen, NFKC, lowercase, strip), and merges the smaller into the larger.
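A matching key along those lines might look like the sketch below. The exact set of characters folded is an assumption (curly quotes, the U+2010/U+2011 hyphens, and en/em dashes), and `match_key` and `duplicate_groups` are hypothetical names; this key is only for comparing names, not for renaming folders.

```python
import unicodedata
from collections import defaultdict

# Characters folded to ASCII before comparison (assumed set, see lead-in).
_FOLD = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes / apostrophes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2010": "-", "\u2011": "-",   # Unicode hyphen, non-breaking hyphen
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
})


def match_key(name):
    """Fold quotes and dashes, apply NFKC, lowercase, and strip."""
    folded = unicodedata.normalize("NFKC", name.translate(_FOLD))
    return folded.lower().strip()


def duplicate_groups(sibling_names):
    """Group sibling folder names that share a key; keep groups of 2+."""
    groups = defaultdict(list)
    for name in sibling_names:
        groups[match_key(name)].append(name)
    return [names for names in groups.values() if len(names) > 1]
```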
Safety contract.
- `mv` only on the same filesystem: an atomic rename, so audio bytes are never read or rewritten.
- Audio collisions never auto-delete. If a track of the same name exists in both folders with different file sizes, the source copy is kept under a `<stem>.from-fragment.<ext>` suffix instead of being overwritten. Identical-size copies (true duplicates) are dropped from the source.
- Cover-art collisions keep the better image. When a `.jpg`/`.png` exists in both folders, the higher-resolution file wins (ties, or images it cannot parse, fall back to the larger byte size). Other non-audio collisions (`.nfo`, `.cue`) drop the source; the canonical copy wins.
- The survivor is normalized. The folder with the most files becomes canonical, so its name can be the less-standard variant; after merging, the survivor is renamed to its normalized form (broken hyphens, curly quotes/apostrophes folded to ASCII; en/em dashes, the ellipsis glyph, and prime marks preserved so names stay legal on NTFS/exFAT).
- Conservative matching. Only sibling folders whose normalized names match are merged. Cases like `Domestica` vs `Cursive's Domestica (Deluxe Edition)` (different prefix, not just quote variation) are left alone for manual review.
- The `--dry-run` flag previews every action without touching the filesystem (log lines prefixed `[DRY]`) and faithfully predicts the real run, including which folders get removed.
- Per-file logging to `<directory>/cleanup.log` (or `--log` override): every move, drop, collision, rename, and `rmdir` is timestamped and audit-trailed.
- Idempotent: running on an already-clean library is a no-op.
The Workflow:
- Preview first:
./scripts/cleaner.py /mnt/SharedData/Music --dry-run
- Inspect `/mnt/SharedData/Music/cleanup.log`; every action it would take is recorded with `[DRY]` prefixes.
- If the plan looks right, apply for real:
./scripts/cleaner.py /mnt/SharedData/Music
- Re-run `lattice --duplicates` afterward to confirm the consolidated state.
Two passes. Pass 1 collapses artist-folder duplicates (e.g., merges JAY‐Z & Kanye West/ into Jay-Z & Kanye West/). Pass 2 then runs album-level consolidation inside each artist folder. The order matters: collapsing the artist split first means album-level matching can find pairs that would otherwise be hidden under the duplicate artist directory.
Layout note. The passes are depth-fixed: the given root's children are treated as artists, and their children as albums. On a genre-first library (Genre/Artist/Album, the shape genre_foldermap.py builds), point cleaner.py at each genre folder rather than the library root; run against the root it would consolidate at the genre and artist levels but never reach the album folders.
Normalizing lone folders (--normalize-names). The merge passes only touch duplicate folders. Libraries often also carry lone, non-duplicate folders whose names use non-standard characters (e.g. At the Drive‐In with a unicode hyphen, or a curly apostrophe). With --normalize-names, a third pass renames every folder whose name differs from its normalized form, folding the same classes as the survivor rename (broken hyphens, curly quotes/apostrophes; en/em dashes and the ellipsis preserved). It is off by default and can touch many folders at once, so preview with --dry-run first.
What it does not do. cleaner.py is intentionally narrow. It does not:
- Rewrite tags (use `retag.py` for that)
- Re-encode or transcode audio (filesystem operations only)
- Match albums by tag content (folder name only, by design, so the operation is auditable from the log alone)
- Touch the source-of-truth import pipeline. If the same fragmentation pattern keeps reappearing, the upstream tagger or downloader needs a curly-quote normalization rule.
Destructive: moves folders. Dry-run is the default; nothing moves until you pass
--apply. Every move is recorded to a manifest that--revertreplays in reverse.
Tidy your genre tags first. Placement uses each album's dominant genre, so an album whose tracks disagree on genre lands under whichever value wins the count, and the rest are not reflected in the tree. For predictable results, run strict tag hygiene before this script: a single, consistent genre per album is ideal.
`genre_tidy.py` is built for exactly that (enforce one canonical genre per artist/album), so a sensible order is `genre_tidy.py` first, then `genre_foldermap.py`.
Restructures a flat Artist/Album/Song library into Genre/Artist/Album/Song, moving each album folder under a top-level genre directory. The genre is the album's dominant embedded genre tag, read through Lattice's scanner (the same aggregation every library/wing mode uses), so placement matches what Lattice reports. Folder names are preserved verbatim; nothing is retagged. It imports lattice, so it needs the package importable: installed via pip/pipx, or run from a checkout with PYTHONPATH=src.
Two directory shapes are handled:
- `Artist/Album` → `Genre/Artist/Album` (the whole album directory is moved).
- An artist folder with loose tracks sitting directly inside (no album subfolder) → `Genre/Artist/Singles/`. Only the loose files move; any album subfolders are separate albums placed under their own genre.
Artist-level sidecar files (e.g. an Artist/cover.jpg beside the album subfolders) follow the artist to its dominant genre, so they are never orphaned in an emptied folder.
Placement is gated by the genres your library already uses. The tool first learns your genre vocabulary from the folders that already hold a Genre/Artist/Album tree, then only files a stray into one of those existing genres. An album tagged with a genre the library doesn't already use is flagged, not given a brand-new top-level folder (so a typo'd or junk tag can't spawn a stray genre directory). Pass --allow-new-genre to lift the gate and create new folders. On a flat library with no genre folders yet, the vocabulary is empty and the gate is off, so tags are trusted; this is the normal first-time Artist/Album → Genre/Artist/Album conversion.
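The gate can be expressed as a one-line rule. This is a sketch of the behavior described above, with hypothetical names; genres are compared as exact strings here, and learning the vocabulary from existing folders is not shown.

```python
def plan_placement(album_genre, vocabulary, allow_new_genre=False):
    """Return "move" or "flag" for one album.

    vocabulary: set of genre folder names the library already uses.
    An empty vocabulary (a flat library) turns the gate off.
    """
    gate_off = allow_new_genre or not vocabulary
    if gate_off or album_genre in vocabulary:
        return "move"
    return "flag"   # unknown genre: report it, create no new top-level folder
```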
Albums already at Genre/Artist/Album are left in place. If such an album's folder genre disagrees with its tags, that is reported as a NOTE for your review and the album is not silently re-filed.
Safety contract.
- `mv` only on the same filesystem: an atomic rename, so audio bytes (and embedded tags/ratings) are never read or rewritten.
- Dry-run by default. Without `--apply` the tool only prints the plan; `--apply` performs it and writes the manifest.
- Reversible. Every move is appended to a manifest TSV (`src<TAB>dst<TAB>time`); `genre_foldermap.py --revert <manifest>` undoes the run.
- Never overwrites. A destination that already exists is reported and skipped; collisions are flagged before anything moves.
- Wrong-root guard. A directory deeper than `Genre/Artist/Album` is flagged `TOO DEEP` and skipped, never collapsed to its last two components. Aim the tool at the parent of an already-organized library by mistake and it flags everything rather than relocating the whole tree. (Point it at the library root itself, e.g. `…/Music`, not its parent.)
- Genre names are folded to a filesystem-legal form (Windows/NTFS-forbidden characters become spaces), so a stray `:` or `/` in a tag can't break the tree.
- Idempotent: running on an already-organized library is a no-op.
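Replaying a manifest in reverse can be sketched as follows. It assumes only the `src<TAB>dst<TAB>time` row layout given above and that the last move should be undone first; `revert_plan` is a hypothetical helper that builds the plan without touching the filesystem, and the timestamp column is ignored because its format is not specified here.

```python
def revert_plan(manifest_text):
    """Return (current_path, original_path) pairs, newest move first."""
    moves = []
    for line in manifest_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[0] and fields[1]:
            moves.append((fields[0], fields[1]))   # (src, dst) as recorded
    # Undo in reverse order, swapping dst back to src.
    return [(dst, src) for src, dst in reversed(moves)]
```

A real revert would additionally need to refuse any step whose original path already exists, in keeping with the never-overwrite rule.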
The Workflow:
- Preview the full plan (writes nothing):
./scripts/genre_foldermap.py /mnt/SharedData/Music
- Smoke-test one genre, verify it landed, then do the rest:
./scripts/genre_foldermap.py /mnt/SharedData/Music --only-genre "Comedy Rock" --apply
./scripts/genre_foldermap.py /mnt/SharedData/Music --apply --log ~/foldermap.manifest.tsv
- If you change your mind, replay the manifest in reverse:
./scripts/genre_foldermap.py --revert ~/foldermap.manifest.tsv
- Point Lattice at the new shape by setting `"layout": "{genre}/{artist}/{album}"` in `~/.config/lattice/config.json` (or pass `--layout`), then regenerate your wings.
Destructive: writes ReplayGain tags in place. Preview with `--dry-run`; a real run prints the worklist and asks for confirmation before writing (skip with `--yes`). Every album scanned, and the exact values written, are logged.
The companion to the --auditReplayGain audit: where the audit reports which albums lack ReplayGain, replaygain.py writes it. It wraps rsgain (libebur128, ReplayGain 2.0, the -18 LUFS / 89 dB reference foobar2000 uses) to do what foobar's "Scan selection as album" does: compute one album gain plus album peak per album folder and a per-track gain plus peak, then write them into the files. rsgain leaves the audio stream untouched; only metadata changes. It imports lattice for the format-aware ReplayGain reader, so it needs the package importable (installed via pip/pipx, or run from a checkout with PYTHONPATH=src).
Requires rsgain. It is not bundled. On Fedora: sudo dnf install rsgain. Other platforms: see the rsgain releases.
Cross-platform. The script is pure, portable Python (paths via os.path, no shell invocation, UTF-8 logging) and runs on Linux, macOS, and Windows. It needs the same three things everywhere: Python 3.14+ (it imports lattice), mutagen, and rsgain on PATH (macOS: brew install rsgain; Windows: winget install rsgain, scoop, or choco). One Windows-only edge: --target-lufs passes each track's path to rsgain as an argument, so an unusually large folder (hundreds of files) could exceed Windows' ~32K command-line limit; a normal album is nowhere near it, and the default mode (no --target-lufs) is unaffected because it passes only the folder.
Safety contract.
- Album = one folder. The whole folder is rescanned together so album gain is correct.
- No half-scanned albums. A partial album is rescanned in full; --skip-tagged skips an already-fully-tagged album as a unit (skipping only its tagged tracks would compute album gain over a subset and corrupt it).
- --dry-run lists every album and its current coverage without invoking rsgain at all.
- Confirmation before writing. A real run shows the worklist and prompts, unless --yes is passed or stdin is not a TTY.
- Read-back logging. After each album, the tags just written are read back and logged, so the log is a record of exactly what landed on disk.
- Format-aware through rsgain: MP3 (TXXX), FLAC/Ogg (Vorbis), Opus (the R128_*_GAIN convention), M4A, WMA, WAV.
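The album-as-a-unit rule and the confirmation rule can be sketched as below. This is a simplified model, not the logic in replaygain.py: `Album`, `plan`, and `needs_prompt` are hypothetical names, and the per-album tag counts are assumed to come from whatever reader inspects the files.

```python
from dataclasses import dataclass


@dataclass
class Album:
    folder: str
    tagged: int  # tracks that already carry ReplayGain tags
    total: int   # audio tracks found in the folder


def plan(albums: list[Album], skip_tagged: bool = False) -> list[str]:
    """Return the folders to scan. Each entry means the whole folder."""
    work = []
    for album in albums:
        if album.total == 0:
            continue  # nothing to scan
        fully_tagged = album.tagged == album.total
        # Only a completely tagged album may be skipped; a partial one is
        # rescanned in full so album gain covers every track.
        if skip_tagged and fully_tagged:
            continue
        work.append(album.folder)
    return work


def needs_prompt(assume_yes: bool, stdin_is_tty: bool) -> bool:
    """Prompt only for an interactive run without --yes."""
    return not assume_yes and stdin_is_tty
```

In a real script `stdin_is_tty` would typically come from `sys.stdin.isatty()`.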
The Workflow:
- See what is missing (read-only, from the package):
lattice --auditReplayGain --root /mnt/SharedData/Music --output rg_audit.txt
- Preview the scan plan (writes nothing):
./scripts/replaygain.py /mnt/SharedData/Music --dry-run
- Apply, skipping already-tagged albums and giving rsgain 4 scan threads per album:
./scripts/replaygain.py /mnt/SharedData/Music --skip-tagged --threads 4
Going louder than the standard (--target-lufs). The default target is the 89 dB / -18 LUFS ReplayGain 2.0 reference, which most players (and the rest of your library) assume. If you want a louder result, --target-lufs N sets a different target loudness in LUFS; each 1 LUFS is 1 dB, so a higher target attenuates loud masters less:
| Target | ≈ dB | vs 89 dB |
|---|---|---|
| -18 | 89 dB | standard (default) |
| -16 | 91 dB | +2 dB, gentle |
| -14 | 93 dB | +4 dB, streaming-loud (Spotify/YouTube range) |
./scripts/replaygain.py /mnt/SharedData/Music --target-lufs -14
This switches rsgain to custom mode and writes standard replaygain_* tags for every format, Opus included (the R128 convention is fixed at -23 LUFS and cannot carry a custom target, so it is not used here). Two caveats: keep one target across the whole library or albums will not be evenly normalized, and a louder target gives some tracks positive gain (clip protection stays on, but there is less headroom). For a louder result on one device only (e.g. weak laptop speakers), prefer your player's ReplayGain pre-amp instead: it is non-destructive, per-device, and leaves the portable 89 dB tags intact.
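The arithmetic behind the target table fits in a few lines. This is an illustration, not code from replaygain.py or rsgain: the function names are hypothetical, and `track_gain` uses the usual ReplayGain 2.0 relation (gain equals target loudness minus measured loudness).

```python
REFERENCE_LUFS = -18.0  # ReplayGain 2.0 reference loudness
REFERENCE_DB = 89.0     # the same reference on the legacy dB scale


def target_db(target_lufs: float) -> float:
    """Map a LUFS target to the dB figure in the table (1 LUFS = 1 dB)."""
    return REFERENCE_DB + (target_lufs - REFERENCE_LUFS)


def track_gain(measured_lufs: float, target_lufs: float = REFERENCE_LUFS) -> float:
    """Gain in dB that brings a track from its measured loudness to the target."""
    return target_lufs - measured_lufs


# A loud master measured at -9 LUFS:
print(track_gain(-9.0))          # -9.0 dB at the standard target
print(track_gain(-9.0, -14.0))   # -5.0 dB at the louder target
# A quiet track measured at -16 LUFS gets positive gain at -14 LUFS:
print(track_gain(-16.0, -14.0))  # 2.0 dB
```

The last line is the positive-gain case the caveat above refers to: once the target sits above a track's measured loudness, the tag boosts rather than attenuates.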
Destructive: removes APEv2 tags from MP3s in place. Always preview with --dry-run; a confirmation prompt guards the real run, and a --log is written by default.
Some MP3s (commonly torrent rips) carry a hidden APEv2 tag in addition to their ID3 tags. Players that read APEv2 on MP3, including foobar2000 and DeaDBeeF, merge the APE values over the ID3 ones. So a stray APE Genre like Trash Metal keeps reappearing as Trash Metal, Metal no matter how many times you fix the ID3 genre, and ordinary tag editors never touch the APEv2 block, so it looks unkillable. retag.py removes APEv2 only as a side effect of rewriting the genre; apestrip.py is the general stripper.
By default it just deletes the APEv2 block and leaves ID3 untouched. That is the point: the stray APE values (the genre most of all) are what you want gone, so copying them back into ID3 would defeat the tool. APE Genre and Rating are always reported so you can see exactly what is being dropped.
--keep-metadata opts in to migration. With that flag, before deleting the APEv2 tag, every APE field whose value is not already present in ID3 is copied into the correct ID3 frame:
| APE field | Migrates to |
|---|---|
| Year / Date | TDRC |
| Title / Artist / Album | TIT2 / TPE1 / TALB |
| Album Artist / Band | TPE2 |
| Track / Disc | TRCK / TPOS |
| Composer / Publisher | TCOM / TPUB |
| Comment | COMM |
| Cover Art (Front) | APIC (front) |
| Unsynced lyrics | USLT |
| sort orders | TSOP / TSO2 / TSOT / TSOA |
| anything else (MusicBrainz IDs, ISRC, barcode, ReplayGain, ...) | TXXX:<key> passthrough |
Two fields are handled deliberately, never migrated even under --keep-metadata:
- Genre is never migrated. ID3 stays authoritative; the APE genre is exactly the value you want gone. If a file has no ID3 genre at all, it is reported (left blank), never invented from the APE value.
- Rating is never written. APE and ID3 (POPM) use different rating scales, so an auto-conversion would corrupt star counts (the same hazard rerate.py exists to fix). Any APE Rating is reported so you can apply it deliberately.
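The migration rules can be modelled as a lookup plus two exclusions. The sketch below is not apestrip.py's implementation: it works on plain dictionaries rather than mutagen objects, covers only the text fields from the table, and `APE_TO_ID3`, `NEVER_MIGRATED`, and `plan_migration` are hypothetical names. It also assumes that a frame already present in ID3 is left alone; how the real script treats a conflicting value is not specified here.

```python
# Subset of the table above; keys are compared case-insensitively (assumption).
APE_TO_ID3 = {
    "year": "TDRC", "date": "TDRC",
    "title": "TIT2", "artist": "TPE1", "album": "TALB",
    "album artist": "TPE2", "band": "TPE2",
    "track": "TRCK", "disc": "TPOS",
    "composer": "TCOM", "publisher": "TPUB",
    "comment": "COMM",
}
NEVER_MIGRATED = {"genre", "rating"}


def plan_migration(ape: dict[str, str], id3: dict[str, str]):
    """Return (frames to write, fields reported but not migrated)."""
    migrate: dict[str, str] = {}
    reported: dict[str, str] = {}
    for key, value in ape.items():
        folded = key.lower()
        if folded in NEVER_MIGRATED:
            reported[key] = value  # shown to the user, never written
            continue
        # Unknown keys fall through to a TXXX:<key> passthrough frame.
        frame = APE_TO_ID3.get(folded, f"TXXX:{key}")
        if frame in id3:
            continue  # ID3 already has this frame; leave it alone
        migrate[frame] = value
    return migrate, reported
```

For a file whose APE tag holds Genre, Rating, Year, Title, and ISRC, and whose ID3 tag already has TIT2, this plans TDRC and TXXX:ISRC for writing and reports Genre and Rating without migrating them.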
The Workflow:
- Preview the whole library first (writes nothing):
./scripts/apestrip.py "/mnt/SharedData/Music" --dry-run
The worklist shows, per file, whether it is a plain strip or (under --keep-metadata) which APE fields are redundant and which will be migrated and to where, plus any reported ratings and genre warnings.
- When it looks right, drop --dry-run and confirm at the prompt:
./scripts/apestrip.py "/mnt/SharedData/Music"
A plain strip leaves the ID3 frames untouched, byte for byte; only the APEv2 block is removed. When --keep-metadata actually migrates a field, the ID3 is re-saved as ID3v2.3 with a refreshed ID3v1 (the same player-compatible save retag.py uses). The run writes an append-only timestamped log (default <directory>/apestrip.log) and is idempotent: a file with no APEv2 tag is left untouched, so a second run on a clean library is a no-op. MP3-only, since the APEv2-over-ID3 conflict is specific to MP3; other formats carry their own authoritative tags and are skipped. Pass --yes to skip the prompt (it is auto-skipped when stdin is not a TTY).
Malformed tags (--repair-malformed). Some rips carry an APEv2 tag that is structurally broken (for example a footer with the IS_HEADER bit wrongly set, or junk bytes between the footer and a trailing ID3v1). mutagen refuses to load these, so the normal path cannot strip them. By default apestrip reports such files (malformed APEv2 tag (mutagen cannot parse)) and leaves them alone rather than silently calling them clean. Pass --repair-malformed to fix them: apestrip parses the tag straight from the bytes, but only after proving the footer sits exactly where the header's size field points (so the cut boundary is a real tag edge, not a chance signature in the audio), then excises the APE block (migrating sole-source fields into ID3 first only if --keep-metadata is also given; genre still never migrated, ratings still report-only). The result is written to a temp file, verified (it still decodes and no APE signature survives), and atomically swapped in. The audio frames and the trailing ID3v1 are preserved byte for byte; if any check fails the original is left untouched.
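The boundary proof described above can be sketched against the APEv2 layout, where header and footer are 32-byte blocks starting with `APETAGEX` and the size field counts the items plus the footer but not the header. This is a minimal illustration, not apestrip's parser: `parse_block`, `find_ape_span`, and `strip_ape` are hypothetical names, tags without a header are simply rejected, and the temp-file write, decode check, and atomic swap are left out.

```python
import struct

PREAMBLE = b"APETAGEX"
BLOCK_LEN = 32  # header and footer are both 32 bytes


def parse_block(data: bytes, pos: int):
    """Return (size, flags) of the header/footer block at pos, or None."""
    chunk = data[pos:pos + BLOCK_LEN]
    if len(chunk) < BLOCK_LEN or not chunk.startswith(PREAMBLE):
        return None
    _version, size, _count, flags = struct.unpack_from("<4I", chunk, 8)
    return size, flags


def find_ape_span(data: bytes):
    """Return (start, end) of the tag only if header and footer agree."""
    footer_pos = data.rfind(PREAMBLE)
    footer = parse_block(data, footer_pos) if footer_pos >= 0 else None
    if footer is None:
        return None
    size, _flags = footer  # footer flags are not trusted (may be malformed)
    header_pos = footer_pos - size
    if size < BLOCK_LEN or header_pos < 0:
        return None
    header = parse_block(data, header_pos)
    # The footer must sit exactly where the header's size field points.
    if header is None or header_pos + header[0] != footer_pos:
        return None
    return header_pos, footer_pos + BLOCK_LEN


def strip_ape(data: bytes) -> bytes:
    """Cut out a verified APEv2 block; return data unchanged otherwise."""
    span = find_ape_span(data)
    if span is None:
        return data
    start, end = span
    return data[:start] + data[end:]
```

Because the cut is made only when both ends of the tag confirm each other, a stray `APETAGEX` byte sequence inside the audio does not produce a span, and everything before and after the block (audio frames, trailing ID3v1) is carried over unchanged.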
Lattice is built upon several excellent open-source libraries and tools:
- Mutagen: Handles all audio metadata extraction and tagging logic.
- tqdm: Powers the extensible progress bars for library scanning and integrity checks.
- FFmpeg: The heavy lifter for multi-format audio decoding and integrity verification.
- FLAC: Used for high-speed native FLAC verification.
If Lattice's useful to you and you'd like to chip in:
bc1qkge6zr45tzqfwfmvma2ylumt6mg7wlwmhr05yv