Replies: 2 comments
-
It's the hallucination problem (#679); the cause is, more or less as you said, the training data (#928). For now the workaround is to extract vocals using source separation models like Spleeter or Demucs. If you want more accurate sound classification, it requires fine-tuning.
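As a minimal sketch of the workaround above, assuming the Demucs CLI is installed (`pip install demucs`): build the two-stem separation command, run it, then feed the resulting vocals stem to Whisper. The output-path layout and default model name are assumptions about recent Demucs releases; check your install.

```python
# Sketch: isolate vocals with Demucs before transcription, so Whisper
# never sees the instrumental bed it tends to hallucinate over.
import subprocess
from pathlib import Path


def demucs_command(audio_path: str, out_dir: str = "separated") -> list[str]:
    # --two-stems vocals splits the file into vocals + accompaniment only.
    return ["demucs", "--two-stems", "vocals", "-o", out_dir, audio_path]


def extract_vocals(audio_path: str, out_dir: str = "separated") -> Path:
    """Run Demucs and return the expected path of the vocals stem."""
    subprocess.run(demucs_command(audio_path, out_dir), check=True)
    # Assumed layout: <out_dir>/<model>/<track>/vocals.wav, with the
    # default model named "htdemucs" in recent releases.
    return Path(out_dir) / "htdemucs" / Path(audio_path).stem / "vocals.wav"
```

You would then transcribe `vocals.wav` instead of the original mix.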
-
Update: you can try https://github.com/YuanGongND/whisper-at
-
Just a quick signpost, having run this across a few thousand hours of mixed material: whenever orchestral instrumental music is present, Whisper has a strong bias toward inferring incorrect titles for a handful of works, especially Edward Elgar's "Pomp and Circumstance." I assume there are some graduation ceremonies mixed into the training set, as the hallucinations include a fair number of variations on the piece: different movements, keys, and named attributions to Elgar.
These have been my biggest offenders, but there are hundreds more, YMMV. They are typically (but not always) identifiable in a post-process regex search for text surrounded by brackets.
A generic (music playing) sound-atmosphere normalizer would be a super helpful addition, and/or, a boy can dream, a genre recognizer: (jazz music playing), (orchestral music playing), (vocal music playing). In the same general space, inclusion of non-verbal atmospherics such as (phone ringing), (siren), and (car horn), for which I assume there must be data floating around, would be a welcome option as well.
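The post-process regex search described above can be sketched as follows. The pattern and the suspect-title list are illustrative assumptions, not an exhaustive catalogue of what Whisper hallucinates.

```python
# Sketch: flag likely hallucinated music annotations in a Whisper
# transcript by scanning bracketed/parenthesised spans for known
# problem titles (list here is illustrative, extend as needed).
import re

SUSPECT_TITLES = ["Pomp and Circumstance", "Elgar"]

# Matches the shortest span inside (...) or [...].
BRACKETED = re.compile(r"[\[\(]([^\]\)]+)[\]\)]")


def flag_suspect_spans(transcript: str) -> list[str]:
    """Return bracketed spans that mention a known-problem title."""
    hits = []
    for match in BRACKETED.finditer(transcript):
        span = match.group(1)
        if any(title.lower() in span.lower() for title in SUSPECT_TITLES):
            hits.append(span)
    return hits
```

Spans like `[music playing]` pass through untouched; only spans naming a suspect work are flagged for review or replacement with a generic marker.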