audacious-uade-tools

⚠️ Repository history was recently rewritten with git-filter-repo to reduce size, so existing clones must be recloned. Audio fingerprint TSVs are now separate downloads. Sorry for any inconvenience.

This repo contains Scala CLI scripts for generating songdb TSV files used by audacious-uade.

An experimental Shazam-like tool is also included (see Audio Matching) for identifying music from audio files or via microphone.

The database contains songlengths and module infos for around 440000 unique MD5s, and metadata (authors/album/publishers/year) for 350000, coming from 1.6M+ files in over 100 sources.

Directories

songdb/ - Scala CLI, SQL scripts and raw source TSVs to generate the final processed TSV files
tsv/encoded/ - the songdb TSV files used by audacious-uade. The files are "encoded" to an almost binary format to optimize for size and fast in-memory songdb initialization.
tsv/pretty/ - pretty printed / clear text versions of the TSV files. See TSV Format Specification.
misc/ - misc bash scripts

Hashing

There are two alternative hashing methods provided and separate TSVs for each under md5 and xxh32 subfolders. Hashes are calculated from decompressed files, even if the original source files were compressed.

MD5 - 48-bits (MSB) as hex, hash calculated from whole file
XXH32+filesize - 48-bits as hex (32-bit + 16-bit). Calculated+concatenated as hex(XXH32(file)) + hex(filesize & 0xFFFF). XXH32 is calculated from at most the first 256k bytes of the file; filesize is the full file size.
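
The two key layouts above can be illustrated with a minimal Python sketch. It is not the code used by the repo. The function names are hypothetical, "256k" is assumed to mean 256 * 1024 bytes, and both hex parts are assumed to be zero-padded lowercase. XXH32 itself is not in the Python standard library, so the 32-bit digest is taken as an input computed elsewhere.

```python
import hashlib

# Assumption: "256k" in the text means 256 * 1024 bytes.
XXH32_MAX_BYTES = 256 * 1024


def md5_key(data: bytes) -> str:
    """First 48 bits (12 hex digits) of the MD5 of the whole decompressed file."""
    return hashlib.md5(data).hexdigest()[:12]


def xxh32_input(data: bytes) -> bytes:
    """Portion of the file that is fed to XXH32: at most the first 256k bytes."""
    return data[:XXH32_MAX_BYTES]


def xxh32_key(xxh32_digest: int, filesize: int) -> str:
    """Concatenate hex(XXH32) with hex(filesize & 0xFFFF).

    xxh32_digest is the 32-bit XXH32 value of xxh32_input(data), computed
    elsewhere. filesize is the full size of the decompressed file.
    Assumption: both parts are zero-padded lowercase hex (8 + 4 digits).
    """
    return f"{xxh32_digest & 0xFFFFFFFF:08x}{filesize & 0xFFFF:04x}"
```

Both functions return 12 hex digits, matching the width of the hashes in the example TSV lines.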

Songdb TSV Files

tsv/pretty/*/songlengths.tsv - subsong and songlengths info
tsv/pretty/*/modinfos.tsv - module file format and channel info
tsv/pretty/*/metadata.tsv - all metadata from different sources distilled to single TSV

Extra TSV Files

tsv/pretty/*/amp.tsv - author/album metadata sourced from AMP
tsv/pretty/*/demozoo.tsv - author/publisher/album/year metadata sourced from Demozoo
tsv/pretty/*/fujiology.tsv - author/publisher/album/year metadata sourced from Fujiology
tsv/pretty/*/modland.tsv - author/album metadata sourced from Modland
tsv/pretty/*/modsanthology.tsv - author/publisher/album/year metadata sourced from Mods Anthology
tsv/pretty/*/oldexotica.tsv - author/publisher/album/year metadata sourced from ExoticA (old)
tsv/pretty/*/unexotica.tsv - author/publisher/album/year metadata sourced from UnExoticA
tsv/pretty/*/wantedteam.tsv - author/publisher/album/year metadata sourced from Wanted Team

Raw TSV Source Files

songdb/sources/*/*.tsv - module infos and songlengths for each site/source
songdb/sources/site/demozoo_leftovers.tsv - module infos and songlengths for Demozoo downloads (excluding duplicated sources). Link list is generated with songdb/scripts/sql/demozoo_leftovers.sql
songdb/sources/metadata/demozoo_music.tsv - Demozoo metadata generated with SQL query (songdb/scripts/sql/demozoo_music.sql) from Demozoo postgres database dump
songdb/sources/metadata/demozoo_prods.tsv - Demozoo metadata generated with SQL query (songdb/scripts/sql/demozoo_prods.sql) from Demozoo postgres database dump
songdb/sources/audio/*.tsv - audio fingerprints (chromaprint), separate download. See scripts/sources/audio.sc for format.

The module infos and songlength TSVs are generated by running the precalc binary and script from audacious-uade against my local copies/mirrors/snapshots of the various sites and sources.

Note: Audio fingerprint files must be downloaded separately from https://github.com/mvtiaine/audacious-uade-tools/releases/tag/audio (see Audio Matching for setup).

Note: Running the Scala songdb.sc script requires additional files that are not included in GitHub, specifically local mirrors of some of the source web pages and/or database files.

Note: Only files playable by audacious-uade are included in the database. The script runs completely locally and does not download anything from internet.

TSV Format Specification

Here are example snippets and a short spec for the pretty printed TSVs. Example parsing code can be found in songdb/scripts/pretty.sc.

songlengths.tsv

ff5c7b3227e0	0	65920,p 65920,p,!
fffd7a7d8547	1	250840,p+s
fffdc1d765c3	0	40880,l 117860,l 8780,s 79340,l 8080,s 19000,s

Format: [hash]<TAB>[minsubsong]<TAB>[[songlength(ms),songend[,!]]<SPACE>[songlength(ms),songend[,!]]<SPACE>[...]]

Duplicate subsongs are denoted by !
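
As an illustration of the songlengths.tsv format, here is a minimal Python sketch of a line parser. It is not the repo's parser (that is songdb/scripts/pretty.sc). The names are hypothetical, the songend codes are kept as opaque strings, and numbering subsongs as minsubsong plus list position is an assumption.

```python
from typing import List, NamedTuple, Tuple


class SubsongLength(NamedTuple):
    subsong: int      # assumption: minsubsong + position in the list
    length_ms: int    # song length in milliseconds
    songend: str      # opaque code such as "p", "l", "s" or "p+s"
    duplicate: bool   # True when the entry carries the "!" marker


def parse_songlengths_line(line: str) -> Tuple[str, List[SubsongLength]]:
    """Return (hash, [SubsongLength, ...]) for one songlengths.tsv line."""
    file_hash, minsubsong, entries = line.rstrip("\n").split("\t")
    result = []
    for i, entry in enumerate(entries.split()):
        fields = entry.split(",")
        result.append(SubsongLength(
            subsong=int(minsubsong) + i,
            length_ms=int(fields[0]),
            songend=fields[1],
            duplicate=len(fields) > 2 and fields[2] == "!",
        ))
    return file_hash, result
```

For the first example line above, this yields two entries of 65920 ms, the second flagged as a duplicate.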

modinfos.tsv

fffdc1d765c3	CustomPlay	
fffdd3c2bef3	Scream Tracker 3.2x (GUS)	8
fffe869a7f8d	AHX v2

Format: [hash]<TAB>[format]<TAB>[channels]
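
A modinfos.tsv line could be parsed along these lines. This is a hedged Python sketch with hypothetical names, not the repo's code. It assumes the channels column may be empty or missing, as in the CustomPlay and AHX examples above.

```python
from typing import NamedTuple, Optional


class ModInfo(NamedTuple):
    file_hash: str
    module_format: str
    channels: Optional[int]  # None when the column is empty or absent


def parse_modinfos_line(line: str) -> ModInfo:
    """Parse one modinfos.tsv line: [hash]<TAB>[format]<TAB>[channels]."""
    cols = line.rstrip("\n").split("\t")
    # Pad so that a missing trailing column is treated like an empty one.
    cols += [""] * (3 - len(cols))
    file_hash, module_format, channels = cols[:3]
    return ModInfo(file_hash, module_format, int(channels) if channels else None)
```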

metadata.tsv

feaa9d2a4869	Scorpik	Alchemy	Toxic Ziemniak	1992
feaba2f4c992	Jazz			
feabaabf8a62	Mantronix~Tip	Blue House Productions~Rebels~Sonic Projects	Blue House 2	1991

Format: [hash]<TAB>[authors]<TAB>[publishers]<TAB>[album]<TAB>[year]

Multiple authors or publishers are separated by ~
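
The metadata.tsv layout, including the ~ separator, can be sketched in Python as follows. The names are hypothetical and this is not the repo's parser. The year is kept as a string because the source does not state that it is always numeric.

```python
from typing import List, NamedTuple


class Metadata(NamedTuple):
    file_hash: str
    authors: List[str]
    publishers: List[str]
    album: str
    year: str  # kept as text; empty string when unknown


def split_multi(value: str) -> List[str]:
    """Split a ~ separated column, dropping empty items."""
    return [v for v in value.split("~") if v]


def parse_metadata_line(line: str) -> Metadata:
    """Parse [hash]<TAB>[authors]<TAB>[publishers]<TAB>[album]<TAB>[year]."""
    cols = line.rstrip("\n").split("\t")
    # Pad in case trailing empty columns were stripped.
    cols += [""] * (5 - len(cols))
    file_hash, authors, publishers, album, year = cols[:5]
    return Metadata(file_hash, split_multi(authors), split_multi(publishers), album, year)
```

When reading the files, open them with encoding="utf-8" so that non-ASCII author and album names survive.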

The TSV files use UTF-8 encoding.

Note: I reserve the right to change the format or location in GitHub of any of the TSV or other files at any time.

Audio Matching

Identify Amiga exotic modules and tracker music from audio files or via microphone.

The tool uses a simple brute-force approach for chroma similarity matching. On an M4 Max a match takes about 10 seconds, depending on input length. All CPU cores are utilized.

A proper implementation should use something like https://github.com/acoustid/acoustid-index or https://github.com/acoustid/pg_acoustid
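
To give a rough idea of what brute-force matching means here, the following Python sketch compares a query against every database entry. It is an illustration under assumptions, not the scoring used by audio_match.sc. Fingerprints are assumed to be already decoded into sequences of 32-bit integers, similarity is taken to be the fraction of equal bits, and all names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of equal bits between two equally long runs of 32-bit frames."""
    if not a or len(a) != len(b):
        return 0.0
    differing = sum(bin((x ^ y) & 0xFFFFFFFF).count("1") for x, y in zip(a, b))
    return 1.0 - differing / (32 * len(a))


def best_alignment_score(query: Sequence[int], candidate: Sequence[int]) -> float:
    """Slide the shorter fingerprint over the longer one and keep the best score."""
    short, long_ = sorted((query, candidate), key=len)
    if not short:
        return 0.0
    n = len(short)
    return max(similarity(short, long_[off:off + n])
               for off in range(len(long_) - n + 1))


def brute_force_match(query: Sequence[int],
                      db: Dict[str, Sequence[int]],
                      top: int = 3) -> List[Tuple[str, float]]:
    """Score the query against every entry and return the best `top` matches."""
    scored = [(key, best_alignment_score(query, fp)) for key, fp in db.items()]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)[:top]
```

Every query touches every fingerprint at every offset, which is why the cost grows with both database size and input length, and why an inverted index is the better long-term design.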

It's recommended to record at least 30s of audio, but the more the better. Accuracy can depend on many factors, like audio quality and the unique audio features available. For best results use fpcalc and audio_match.sc directly with a chromaprint generated from the original audio file (like a YouTube rip), instead of using the microphone.

Requirements: scala-cli (https://scala-cli.virtuslab.org/), chromaprint (fpcalc), 8GB+ of memory. For microphone support: sox and, on macOS, microphone permission for the terminal. Also make sure the mic input volume is high enough.

Setup:

Download and decompress audio fingerprint files:

mkdir -p songdb/sources/audio
cd songdb/sources/audio
rm -f audio_*.zst
for i in {0..9} {a..f}; do wget https://github.com/mvtiaine/audacious-uade-tools/releases/download/audio/audio_$i.tsv.zst; done
zstd -d -f --rm audio_*.zst

Fetch dependencies (run from the repository root):

cd songdb
./audio_match.sc

Usage:

./audio_match.sc                                 # Prints usage
./audio_match.sc AQAAC1EShUokRcMfoT-OX8RfNKH...  # Match specific chromaprint
fpcalc -plain somefile.wav | ./audio_match.sc -  # Calculate and match chromaprint from audiofile
./record.sh                                      # Prints usage
./record.sh 0                                    # Interactive recording and matching using microphone
./record.sh 30                                   # Record and match 30 seconds using microphone

See songdb/audio_match.sc and songdb/record.sh sources for more details.

Note: The audio TSV files and the git repo must be in sync.

Note: Run ./audio_match.sc once before running ./record.sh. It will fetch the Scala dependencies on first run, which takes a while.

Note: Only tested on macOS and Linux.

Output:

Score | MD5          | Sub | Authors    | Album                 | Publishers                 | Year | Filenames             
----------------------------------------------------------------------------------------------------------------------------
0,935 | 98d24339316c | 1   | Interphace | The Co-Operation Demo | Andromeda & Infernal Minds | 1990 | MOD.dawn, dawn.mod    
0,787 | a241710e5f1f | 1   |            |                       |                            |      | lords of the boards.xm
0,781 | d3a158c9db44 | 0   | Slammy     |                       |                            |      | final voyage.it

The output lists the top matched entries with match score, MD5, subsong and some metadata from songdb. You can grep the MD5 from songdb/sources/*/*.tsv and tsv/pretty/md5/*.tsv to locate the matching mod file and all available metadata.
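
The grep step can also be scripted. The Python sketch below uses a hypothetical helper name and mirrors a plain grep: it reports every line that contains the hash anywhere, since the column layout of the raw source TSVs is not assumed here.

```python
import glob
from typing import Iterator, Tuple


def find_hash(pattern: str, file_hash: str) -> Iterator[Tuple[str, str]]:
    """Yield (path, line) for every line in the matching files that contains file_hash."""
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as f:
            for line in f:
                if file_hash in line:
                    yield path, line.rstrip("\n")
```

For example, iterating over find_hash("tsv/pretty/md5/*.tsv", "98d24339316c") from the repository root would list the entries for the top match shown above in each pretty-printed TSV that contains it.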

License

The Scala and SQL scripts are licensed under GPL-2.0-or-later.

For any applicable sui generis rights or copyrights I may have over the database files, they are provided under CC BY-NC-SA 4.0 license.

Sources

See sources.md for sources used for the database.

Used By

This database is also used by:

HippoPlayer - https://github.com/koobo/HippoPlayer

Contact

My email address is [firstname].[lastname][at]aalto.fi

The old address mvtiaine@cc.hut.fi no longer works.
