Fix inconsistent fingerprints #90

semohr · 2026-01-07T15:58:39Z

Depending on the Python version and hardware, io.DEFAULT_BUFFER_SIZE varies. We use this value to determine the length of the blocks we feed to the chromaprint library. This can influence the generated fingerprints whenever the block boundaries do not line up with the maximum number of samples (which they hardly ever do), because the last block then carries more data than expected.

Decision needed:
How do we want to approach this? The fix can be considered a bugfix but also a breaking change. Fingerprints generated previously (using too much data from the last block) will not match the ones generated after the change. In my opinion this is a bug, although fixing it might introduce issues for some users.

closes #89
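To illustrate the idea behind the fix, here is a minimal sketch (not the actual patch). The helper names limit_blocks and split, the 16-bit sample width, and the two block sizes are assumptions made for this example only. It shows that truncating the last block makes the consumed data independent of the block size:

```python
from typing import Iterable, Iterator, List

BYTES_PER_SAMPLE = 2  # assumption for this sketch: 16-bit PCM samples


def limit_blocks(blocks: Iterable[bytes], max_samples: int) -> Iterator[bytes]:
    """Yield blocks, truncating the last one so that at most
    max_samples samples are passed on in total (hypothetical helper)."""
    remaining = max_samples * BYTES_PER_SAMPLE
    for block in blocks:
        if remaining <= 0:
            break
        if len(block) > remaining:
            # Only keep the part of the block that is still expected.
            block = block[:remaining]
        remaining -= len(block)
        yield block


def split(data: bytes, block_size: int) -> List[bytes]:
    """Cut data into consecutive blocks of block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


pcm = bytes(range(256)) * 100  # 25600 bytes of dummy audio data
max_samples = 5000             # i.e. 10000 bytes should be consumed

# Two illustrative block sizes standing in for different buffer sizes.
small = b"".join(limit_blocks(split(pcm, 8192), max_samples))
large = b"".join(limit_blocks(split(pcm, 131072), max_samples))

# Both runs consume exactly the same bytes, regardless of block size.
assert small == large == pcm[:10000]
```

Without the truncation, the first run would stop after two full blocks (16384 bytes) and the second after the whole input (25600 bytes), so the data fed to the fingerprinter would depend on the block size.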


grawlinson · 2026-01-12T08:12:00Z

Package maintainer for Arch Linux here. If you do not fix it now, you're just kicking the can down the road: there are already issues popping up, and there will be more as Python 3.1{3,4} becomes more widespread. Just my 2 cents.

snejus · 2026-01-14T12:01:03Z

How does this affect matching fingerprints that have already been uploaded to AcousticBrainz?

semohr · 2026-01-14T12:31:37Z

It shouldn't matter too much. They will be different, of course, but comparing should be done with a distance measure anyway, and results should be closer to the true distribution now. We should test this though; I'm not sure how the implementation looks on their side.

Effectively, only the last few bytes changed (because that's how the algorithm works), and only for long songs (longer than 120 s).

The only issue I see is in beets: I'm not sure how and where we do fingerprint comparisons, but if they are equality checks rather than distance-based checks with a cutoff, this will introduce issues.

As a side note, the fingerprints still differ from the ones generated by the fpcalc CLI tool. But the average error in my local testing seems to be lower, i.e. they are closer by distance.

As another note the fingerprints will also differ slightly if you use another spectrum extraction backend.
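To make the equality vs. distance point concrete, here is a rough sketch. It assumes raw fingerprints are available as lists of 32-bit integers and uses a bit error rate as the distance. The function names, the example values, and the 0.1 cutoff are made up for illustration and are not taken from beets or any server-side implementation:

```python
from typing import List


def bit_error_rate(fp_a: List[int], fp_b: List[int]) -> float:
    """Fraction of differing bits over the overlapping part of two
    raw fingerprints (lists of 32-bit integers)."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        raise ValueError("need two non-empty fingerprints")
    # Mask to 32 bits so signed values are handled as well.
    differing = sum(
        bin((a ^ b) & 0xFFFFFFFF).count("1") for a, b in zip(fp_a, fp_b)
    )
    return differing / (32 * n)


def is_match(fp_a: List[int], fp_b: List[int], cutoff: float = 0.1) -> bool:
    """Distance-based check; the cutoff is an arbitrary example value."""
    return bit_error_rate(fp_a, fp_b) <= cutoff


# Made-up fingerprints that differ by a single bit in the last item.
old = [0x0F0F0F0F, 0x12345678, 0xFFFFFFFF]
new = [0x0F0F0F0F, 0x12345678, 0xFFFFFFFE]

print(old == new)                # False: an equality check fails
print(bit_error_rate(old, new))  # 1 differing bit out of 96
print(is_match(old, new))        # True: within the example cutoff
```

An equality check would treat these two as different tracks, while a distance-based check with a cutoff still matches them.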

Depending on python version and hardware the DEFAULT_BUFFER_SIZE which we use to ingest/feed in fingerprinting can influence the generated fingerprints. This fixes the issue by only consuming the expected samples.

dc1af3b

semohr mentioned this pull request Jan 13, 2026: Support for python 3.14 beetbox/beets#6267 (Open)

4 participants
