@NebularNerd
Contributor

@NebularNerd NebularNerd commented Nov 15, 2025

MPEG Audio Scanner Version 1

This is my first pass at an MP3 scanner. It ended up being a whole lot more than that: it can scan basically any valid MPEG-1 audio stream. In short, if it's an .mp1, .mp2 or .mp3, this should understand what it is.

The decoder grew in scope far beyond what PureMagic is aimed at, so I'm going to eventually release a full-featured decoder under its own repo as a standalone tool. This is not to compete with PureMagic, but it will provide features outside of the PureMagic goals (e.g. tag recovery/conversion for obscure formats, data stream checking, etc.). As I develop either this or that, code enhancements will pass back and forth, so this scanner will see updates.

Deepscan

I've altered the Best Match line to print Deepscan Match if a scanner returns a positive confidence=1 result; this makes it clearer to users that we're 100% certain the file is what it is. If the file fails the test we politely return None and let a regular magic_data match offer a best match.
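
The fallback described above can be sketched roughly as follows. This is only an illustration, not the PR's code: the `Match` record here is a simplified stand-in, and `best_match`, `deep_scan` and `magic_match` are hypothetical names.

```python
from typing import Callable, NamedTuple, Optional


class Match(NamedTuple):
    # Simplified, hypothetical stand-in for the library's match record.
    extension: str
    name: str
    confidence: float


def best_match(
    deep_scan: Callable[[bytes], Optional[Match]],
    magic_match: Match,
    head: bytes,
) -> tuple[str, Match]:
    """Prefer a fully confident scanner result, otherwise fall back."""
    result = deep_scan(head)
    if result is not None and result.confidence == 1.0:
        # The scanner is certain, so label the result accordingly.
        return "Deepscan Match", result
    # The scanner returned None (or was unsure): use the magic_data match.
    return "Best Match", magic_match
```

The label is returned alongside the match so the caller can print either heading without re-checking the confidence.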

magic_data

A few changes have been made to accommodate matches for all valid byte combinations of .mp1, .mp2 and .mp3.
Changed extension on ffbb from the less common .mpga to .mp3

MPEG Audio Scanner:

Overview

To test fully whether the file is a true MP3, I've ended up building something close to a full-featured MP3 decoder:

  • If the decoder fails it returns None and matching falls back to the .json, as there is a chance it's not an MPEG Audio file (or is highly corrupted).
  • If the decoder returns a match then you can be certain it's a bona fide MPEG Audio file. It also provides a few details similar to MediaInfo, such as bitrate, sample rate, tag types and CBR/VBR.
  • This decoder does not trust the Xing/Info header, as it can give false results. For VBR/CBR testing we compare the bitrate across a few frames: if it changes it's VBR/ABR; if not, it's most likely CBR.
  • This will also scan for and check pretty much every tag/metadata style out there; if it finds one it adds it to the details.
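
As a rough illustration of the frame-by-frame bitrate check, here is a minimal sketch that only handles MPEG-1 Layer III headers. It is not the scanner's actual code; `parse_header` and `bitrate_mode` are hypothetical names, and the number of frames sampled is an arbitrary assumption.

```python
import struct

# MPEG-1 Layer III bitrates in kbit/s, indexed by the 4-bit bitrate field.
# Index 0 ("free") and index 15 ("bad") are treated as invalid here.
L3_BITRATES = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0]
# MPEG-1 sample rates in Hz, indexed by the 2-bit sample rate field.
SAMPLE_RATES = [44100, 48000, 32000, 0]


def parse_header(header: bytes):
    """Return (bitrate_kbps, frame_length) for an MPEG-1 Layer III header, or None."""
    if len(header) < 4:
        return None
    (word,) = struct.unpack(">I", header[:4])
    if word >> 21 != 0x7FF:          # 11-bit frame sync
        return None
    if (word >> 19) & 0b11 != 0b11:  # MPEG version 1
        return None
    if (word >> 17) & 0b11 != 0b01:  # Layer III
        return None
    bitrate = L3_BITRATES[(word >> 12) & 0xF]
    sample_rate = SAMPLE_RATES[(word >> 10) & 0b11]
    padding = (word >> 9) & 1
    if not bitrate or not sample_rate:
        return None
    return bitrate, 144 * bitrate * 1000 // sample_rate + padding


def bitrate_mode(data: bytes, offset: int = 0, frames: int = 10):
    """Walk up to `frames` frames; return 'CBR', 'VBR', or None if no frame parses."""
    seen = set()
    for _ in range(frames):
        parsed = parse_header(data[offset:offset + 4])
        if parsed is None:
            break
        bitrate, length = parsed
        seen.add(bitrate)
        offset += length  # jump to where the next frame header should be
    if not seen:
        return None
    return "CBR" if len(seen) == 1 else "VBR"
```

A changing bitrate cannot distinguish VBR from ABR on its own, so this sketch reports both as "VBR".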

Features

Deep scans any MPEG Audio file. This is a pretty complex scanner, as we have to account for:

  • Understanding MP1 (Layer 1), MP2 (Layer 2) or MP3 files (Layer 3)
  • MP3 CBR files that are not LAME encoded with/without v2 tags
  • MP3 VBR files with/without v2 tags at the start
  • MP3 LAME Xing VBR/ABR or Info CBR encoded files
  • MP3 VBR files based on VBRI encoding
  • All the various End of File TAG formats (see Tags tested below)
  • Decoding various bits of info, such as bitrate, to confirm it's a real MPEG audio file

Issues and Limitations

  • MP2 (MPEG-1 Layer 2) files can be encoded with wonky frame data, which causes them to be misidentified as VBR. For now I have a bodge that overrides this until I can create a 100% reliable frame check for MP2 files with that quirk.
  • End of file tags: These are a nightmare. They can push each other around, so they fall outside their spec-defined locations yet still remain partly valid. We hunt for them as best we can, but if they are out of position a player would likely not see them.
  • End of file tags: Most of these seem to have issues with their specs (padding, wrong size calculations, etc.) even when not being pushed around, so we have to use broad searches to overcome their quirks.
  • Refactoring: This scanner is pretty solid but I know it's not as efficient or easy to read as it should be. I wanted to get feedback from real-world use. As mentioned, I'm going to be developing a standalone tool, so I shall start the decoder from scratch to compartmentalise checks and improve things across the board.

MPEG Audio quirks

MP3s are not quite as standard as one would think. There are a few common assumptions that are wrong, and I learned a lot about the format working on this one. Some examples:

  • That they all start with ID3: if the file has v2 tags then ID3 will be present at byte 0; otherwise it starts with the MPEG frame header, hex ffeX or fffX.
  • Xing means VBR and Info means CBR: an MP3 can be encoded with Info and still be a VBR. 🤦 MediaInfo (an awesome tool) checks these flags and bases VBR/CBR-ness on them; if you hex edit Info to Xing or vice versa it changes its report.
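
The first point can be illustrated with a small check on the opening bytes. This is a sketch only; `classify_start` is a hypothetical helper, not a function from the scanner.

```python
def classify_start(head: bytes) -> str:
    """Classify the first bytes of a candidate MPEG audio file."""
    if head[:3] == b"ID3":
        return "id3v2"
    # Frame sync is eleven set bits: 0xFF followed by a byte whose top
    # three bits are set, which gives the ffeX / fffX patterns.
    if len(head) >= 2 and head[0] == 0xFF and head[1] & 0xE0 == 0xE0:
        return "frame"
    return "unknown"
```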

Tags Tested

ID3v1.x

TAG 128 bytes from EOF. The original MP3 tags, limited but everything knows what to do with them.
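
A minimal sketch of reading this tag, assuming the commonly documented fixed layout (3-byte marker, 30-byte title, artist and album, 4-byte year, 30-byte comment, 1-byte genre). `read_id3v1` is a hypothetical name, and the comment field is skipped for brevity.

```python
def read_id3v1(data: bytes):
    """Return the basic ID3v1 fields, or None if no tag sits at EOF - 128."""
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    tag = data[-128:]

    def text(raw: bytes) -> str:
        # Fields are fixed width and padded with NULs or spaces.
        return raw.split(b"\x00")[0].decode("latin-1").rstrip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "genre": tag[127],
    }
```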

ID3v2.x

ID3 at start of file. These are the current standard: big lumps of data for all sorts of info. v2.2, v2.3 and v2.4 all differ slightly but are handled.
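
To find where the audio starts after an ID3v2 tag, the 10-byte header carries a 28-bit "syncsafe" size (four bytes, seven bits each). A sketch of that calculation, with `id3v2_total_size` as a hypothetical helper name:

```python
def id3v2_total_size(head: bytes):
    """Return the full size of a leading ID3v2 tag in bytes, or None."""
    if len(head) < 10 or head[:3] != b"ID3":
        return None
    size_bytes = head[6:10]
    # Syncsafe integers never set the top bit of any byte.
    if any(b & 0x80 for b in size_bytes):
        return None
    size = 0
    for b in size_bytes:
        size = (size << 7) | b
    # v2.4 may append a 10-byte footer, signalled by flag bit 0x10.
    footer = 10 if head[3] == 4 and head[5] & 0x10 else 0
    return 10 + size + footer
```

The returned value is the offset at which the first MPEG frame header would be expected, assuming no padding or junk follows the tag.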

APE Tag

APETAGEX at the absolute end of file or just before ID3v1 TAG. The APE tag is a competitor to the standard ID3 tag. There are two versions and both are detected.
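
A sketch of locating the 32-byte APE footer in those two positions, assuming the usual footer layout (8-byte preamble, then little-endian version, tag size and item count). `find_ape_footer` is a hypothetical name, and tags that have been pushed elsewhere in the file are not handled here.

```python
import struct


def find_ape_footer(data: bytes):
    """Return version, size and item count from an APE footer, or None."""
    end = len(data)
    # Step back over an ID3v1 tag if one occupies the last 128 bytes.
    if end >= 128 and data[-128:-125] == b"TAG":
        end -= 128
    footer = data[end - 32:end] if end >= 32 else b""
    if footer[:8] != b"APETAGEX":
        return None
    version, size, count = struct.unpack("<III", footer[8:20])
    # Version is 1000 for APEv1 and 2000 for APEv2.
    return {"version": version, "size": size, "items": count}
```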

ID3v1.2 Enhanced Tag

EXT: 256 bytes from EOF. A niche standard from the late 1990s, invented by BirdCageSoft. Their software supports it but I can't find anything else that does. It was designed to overcome the limits of ID3v1 by offering extra tacked-on space for tags.

ID3v1 Enhanced Tag

TAG+ at 227(ish) bytes from EOF. Another niche standard aimed at addressing similar shortfalls in ID3v1 tags as EXT.
One tool created by the spec creators, called SpeedTag, exists on the WaybackMachine (linked below) for those wanting to play.
Interestingly, the SpeedTag tool does not follow the spec on their site and seems to place the TAG+ data slightly earlier in the file. We can still test for it reliably and without affecting overall speed.
There is also a later tool, MP3Manager (see LYRICS), created by one or more of the SpeedTag/TAG+ authors. It may have supported TAG+ in an earlier form, but its latest version seems to ignore them.

Additional notes regarding TAG+ and EXT

Due to the nature of these tags it is entirely possible for entries to be corrupted easily by other tag editors. In addition to getting pushed out of the byte window by other EOF tags, there is the possibility of a regular ID3v1 tag editor altering the base TAG without touching these two in any way.

    Both TAG+ and EXT work in the same way, say you have a Title longer than 30 characters (the limit of v1) like:
    Neon Reflections of a Thousand Forgotten Summer Dreams
                                  ^
    The ^ represents where this title would be carried over into the `TAG+` or `EXT` data, but if an ID3v1 editor were to change it to this:
    Neon Reflections of Summer     Forgotten Summer Dreams
                                  ^
    Now we have a corrupted title for `TAG+` or `EXT` as the editor only handles the data before ^.

This is what really put the nail in the coffin for these extended formats. They were a great idea at the time, but splitting the tags between two data fields caused weird or short names on devices that could not read them, and they could be easily corrupted by other tag editors.
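
The title example above can be reproduced with a couple of lines. This is only an illustration of the split: `split_title` and `join_title` are hypothetical helpers, and padding the base field with spaces is an assumption made to match the example.

```python
V1_LIMIT = 30  # ID3v1 title field width


def split_title(title: str):
    """Split a long title into the ID3v1 part and the overflow part."""
    return title[:V1_LIMIT], title[V1_LIMIT:]


def join_title(base: str, extra: str) -> str:
    """Rebuild the full title the way an extended-tag reader would see it."""
    return base.ljust(V1_LIMIT) + extra


base, extra = split_title("Neon Reflections of a Thousand Forgotten Summer Dreams")
# A plain ID3v1 editor rewrites only the base field and leaves `extra` behind.
corrupted = join_title("Neon Reflections of Summer", extra)
```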

LYRICS

Large block before ID3v1 TAG prefixed by LYRICSBEGIN. Created to address both the shortfalls of ID3v1 tags and add lyrics to your song. Seems to have been created in part or whole by some of the TAG+ developers. Lyrics3 (v1 and v2) became one of the first widely used standards to successfully add lyric information to MP3s. Lyrics3v2 upgrades allowed for timestamped lyrics for karaoke and other enhancements.

The spec for v2 has a size field for calculating the tag size. However, the official tool MP3Manager is bugged, by design or accident, and breaks its own spec by miscalculating the size. Some other tools seem to follow the spec, going by some files in my library. I've created a rudimentary check that is not ideal but functional. If I develop a more rigid check later for my own decoder project I'll backport it to PureMagic.
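
One way to tolerate a wrong size field is to trust it only when it points at LYRICSBEGIN and to search backwards otherwise. The sketch below assumes the Lyrics3v2 layout as commonly documented (a six-digit size followed by LYRICS200, sitting just before the ID3v1 tag); `find_lyrics3v2` is a hypothetical name and this is not the PR's actual check.

```python
def find_lyrics3v2(data: bytes):
    """Return (start, end) offsets of a Lyrics3v2 block, or None."""
    end = len(data)
    # The block normally sits directly before an ID3v1 tag.
    if end >= 128 and data[-128:-125] == b"TAG":
        end -= 128
    if end < 15 or data[end - 9:end] != b"LYRICS200":
        return None
    size_field = data[end - 15:end - 9]
    if size_field.isdigit():
        # The size covers LYRICSBEGIN and the fields, but not the
        # six size digits or the LYRICS200 end marker.
        start = end - 15 - int(size_field)
        if start >= 0 and data[start:start + 11] == b"LYRICSBEGIN":
            return start, end
    # Size missing or miscalculated: fall back to a broad backwards search.
    start = data.rfind(b"LYRICSBEGIN", 0, end)
    return (start, end) if start != -1 else None
```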

3DI Tag

3DI 10 bytes before the ID3v1 TAG. This is a super niche tag. According to the Library of Congress link, it was meant to be placed 10 bytes before the ID3v1 TAG marker, or 10 bytes before the end of the file if there is no ID3v1 tag.
Its purpose, as summarised by Google Gemini (about the only source of information I could find on what it was for):

While the structure varied slightly across different early applications, the 10-byte extension most commonly broke down like this, focused entirely on track information:
   Bytes 0-2: Identifier "3DI" (3 bytes).
   Byte 3: Track Number (1 byte, typically 1 to 255). This was the most important piece of data.
   Byte 4: Disc Number (1 byte, typically 1 to 255).
   Bytes 5-9: Reserved/Padding (5 bytes). These were often left empty or used inconsistently for things like a simple file checksum by specific tagging programs.

Once ID3v1.1 came along this became less relevant, and obviously ID3v2 killed any need for it stone dead. I have no test files, so this is a theoretical implementation, but there is no reason for it not to work.

Sample files

mp3

For testing, all VBR files found in test\resources\audio are based on the 3-second synth melody from https://samplelib.com/sample-mp3.html; the files are free of any use restrictions:

  • VBR-Xing-128k-NoTags.mp3 VBR file with Xing header and No Tags
  • VBR-Info-128k-NoTags.mp3 VBR file with Info header and No Tags, MediaInfo will incorrectly call this a CBR
  • VBR-Xing-128k-v1tag-tagplus.mp3 VBR file with Xing header, V1 TAGS and ID3v1 Enhanced Tag (TAG+). Almost nothing now can read the TAG+ part and will simply ignore it.
  • VBR-Xing-128k-v1tag-ext.mp3 VBR file with Xing header, V1 TAGS and ID3v1.2 Enhanced Tag (EXT). Almost nothing now will read the EXT part and will simply ignore it.
  • VBR-Xing-128k-v1tag-lyrics3v2.mp3 VBR file with Xing header, V1 TAGS and Lyrics3v2.
    These abominations are purely to stretch the decoder to its limit; these combinations could exist if someone modified an old file they found buried somewhere.
  • VBR-Xing-128k-v1tag-tagplus-ape.mp3 VBR file with Xing header, V1 TAGS, ID3v1 Enhanced Tag and APE Tags. SpeedTag will no longer be able to see the TAG+ as it's pushed out of the byte window it looks at.
  • VBR-Xing-128k-v1tag-ape-tagplus.mp3 VBR file with Xing header, V1 TAGS, APE Tags and ID3v1 Enhanced Tag. SpeedTag pushes the APE tags out of their byte window, causing them to no longer be seen.
    I've not added V2-tagged samples as they do not affect the logic needed to test the end of the file. Equally, CBRs not encoded via LAME and VBRIs would behave the same.

mp2 and mp1

For testing:

Example outputs:

These all come from real files.

'M:\Music\Music-Dump\Chiptunes_Change_Christmas\03-TORIENA_-_Cockscomb_Jingle_Bells.mp3' : .mp3
Total Possible Matches: 1

        Deepscan Match
        Name: MPEG-1 Audio Layer 3 (MP3) audio file [320k 44.1Khz Joint-Stereo CBR ID3v2.4 ID3v1]
        Confidence: 100%
        Extension: .mp3
        Mime Type: audio/mpeg
        Byte Match: b'ID3'
        Offset: 0
'M:\Music\Music-Dump\[ORAL002]_V_A__-_A_Tribute_To_Evangelion___Rebuild_2_22\01-sieg_heilman-cruel_angels_thesis.mp3' : .mp3
Total Possible Matches: 1

        Deepscan Match
        Name: MPEG-1 Audio Layer 3 (MP3) audio file [320k 44.1Khz Joint-Stereo CBR ID3v2.4 APEv2 ID3v1]
        Confidence: 100%
        Extension: .mp3
        Mime Type: audio/mpeg
        Byte Match: b'ID3'
        Offset: 0
'M:\Downloads\Symphony No.6 (1st movement).mp2' : .mp2
Total Possible Matches: 1

        Deepscan Match
        Name: MPEG-1 Audio Layer 2 (MP2) audio file [384k 44.1Khz Stereo CBR]
        Confidence: 100%
        Extension: .mp2
        Mime Type: audio/mpeg
        Byte Match: b'\xff\xfd'
        Offset: 0
'R:\dump\mp1-sample.mp1' : .mp1
Total Possible Matches: 1

        Deepscan Match
        Name: MPEG-1 Audio Layer 1 (MP1) audio file [384k 32.0Khz Stereo CBR]
        Confidence: 100%
        Extension: .mp1
        Mime Type: audio/mpeg
        Byte Match: b'\xff\xfe'
        Offset: 0

Deep scans MPEG Audio files
Supports MP1, MP2, MP3
@NebularNerd
Contributor Author

NebularNerd commented Nov 16, 2025

Mmmmm tests are failing, is this because I built against the SNDHDR branch?

EDIT: I might refactor this before we merge, it literally kept me awake last night. It would make things a lot clearer for expansion/testing/debugging later.

Comment on lines +409 to +429
case (
mpeg_audio_scanner.mp3_id3_match_bytes
| mpeg_audio_scanner.raw_mp3_match_bytes
| mpeg_audio_scanner.fffe_match_bytes
| mpeg_audio_scanner.ffff_match_bytes
| mpeg_audio_scanner.fffc_match_bytes
| mpeg_audio_scanner.fffd_match_bytes
| mpeg_audio_scanner.fffa_match_bytes
| mpeg_audio_scanner.fff6_match_bytes
| mpeg_audio_scanner.fff7_match_bytes
| mpeg_audio_scanner.fff4_match_bytes
| mpeg_audio_scanner.fff5_match_bytes
| mpeg_audio_scanner.fff2_match_bytes
| mpeg_audio_scanner.fff3_match_bytes
| mpeg_audio_scanner.ffe6_match_bytes
| mpeg_audio_scanner.ffe7_match_bytes
| mpeg_audio_scanner.ffe4_match_bytes
| mpeg_audio_scanner.ffe5_match_bytes
| mpeg_audio_scanner.ffe2_match_bytes
| mpeg_audio_scanner.ffe3_match_bytes
):
Owner

Suggested change
case (
mpeg_audio_scanner.mp3_id3_match_bytes
| mpeg_audio_scanner.raw_mp3_match_bytes
| mpeg_audio_scanner.fffe_match_bytes
| mpeg_audio_scanner.ffff_match_bytes
| mpeg_audio_scanner.fffc_match_bytes
| mpeg_audio_scanner.fffd_match_bytes
| mpeg_audio_scanner.fffa_match_bytes
| mpeg_audio_scanner.fff6_match_bytes
| mpeg_audio_scanner.fff7_match_bytes
| mpeg_audio_scanner.fff4_match_bytes
| mpeg_audio_scanner.fff5_match_bytes
| mpeg_audio_scanner.fff2_match_bytes
| mpeg_audio_scanner.fff3_match_bytes
| mpeg_audio_scanner.ffe6_match_bytes
| mpeg_audio_scanner.ffe7_match_bytes
| mpeg_audio_scanner.ffe4_match_bytes
| mpeg_audio_scanner.ffe5_match_bytes
| mpeg_audio_scanner.ffe2_match_bytes
| mpeg_audio_scanner.ffe3_match_bytes
):
case mpeg_bytes if mpeg_bytes in mpeg_audio_scanner.mpeg_audio_signatures:

Would just put those all in a single array

# The first match wins
for scanner in (pdf_scanner, python_scanner, json_scanner):
result = scanner.main(filename, head, foot)
result = scanner.main(filename, head)
Owner

Suggested change
result = scanner.main(filename, head)
result = scanner.main(filename, head, foot)

Need to pass in all three of these. Some other scanners may use foot or expect exactly three inputs.

Owner

@NebularNerd this may be the testing issue

cached_data = {"path": None, "matched": False, "name_end": [], "name_format": []}


class mp3_decoding:
Owner

Suggested change
class mp3_decoding:
class MP3Decoding:

Standard class naming should be camel case

b"WXXX",
]
self.lyric3_tags = [b"IND", b"LYR", b"INF", b"AUT", b"EAL", b"EAR", b"ETT", b"IMG", b"GRE"]
"""Temporary variables are stored here."""
Owner

@cdgriffith cdgriffith Nov 17, 2025

Suggested change
"""Temporary variables are stored here."""
# Temporary variables are stored here.

Not everything here may be used in outputs at this time.
"""

def __init__(self, file_path: os.PathLike | str):
Owner

I assume this class is basically a copy-paste from your repo?

There isn't really a reason to use a class here otherwise, but if it's just an easy drop-in it can stay that way for ease!

Contributor Author

I was using it to make things clearer, but then it got messier as I went on.

return Match(extension=cached_data["ext"], name=cached_data["name"], mime_type="audio/mpeg", confidence=1.0)


def main(file_path: os.PathLike | str, head: bytes) -> Optional[Match]:
Owner

Suggested change
def main(file_path: os.PathLike | str, head: bytes) -> Optional[Match]:
def main(file_path: os.PathLike | str, head: bytes, _) -> Optional[Match]:

@NebularNerd
Contributor Author

Thanks for the suggestions, I'll integrate these into a new PR.

I'm going to close this PR for now as it's a bit messy and I want to refactor it into something a bit easier to deal with. There will still be classes, purely to help keep everything apart (EOF Tags, v2 Tags, VBR and Stream); this should make it easier for anyone else reading the code to follow what does what.

Hope you like the scanner so far 😎

@NebularNerd
Contributor Author

OK, the rewrite is making good progress: the EOF tag scanner now correctly identifies and calculates the size of all tag styles (my math skills took a beating). The code has a much more logical layout, and less temporary data is being stored as well.

@cdgriffith
Owner

Hope you like the scanner so far 😎

Absolutely, this is a lot of great work, thank you for the effort!

2 participants