ground truth transcriptions for fine-tuning #1940

vjdtao · 2024-01-04T21:45:19Z


I realize that high-quality ground-truth transcriptions are critical for fine-tuning Whisper models. A crucial question arises: should manual English transcriptions be written as normalized text? To address this overall question, I have broken it down into the specific aspects below. Comments and suggestions on any of them would be greatly appreciated!

1. Regarding punctuation, such as commas, periods, and hyphens, should it be retained in the transcriptions?
2. In terms of capitalization, should the case mirror the original written form? This includes capitalizing the first word of each sentence and proper nouns.
3. When transcribing numbers, should they be spelled out as words, like "twenty-two," or written as digits, as in "22"? Similarly, should an expression like "nine one one" be transcribed as spoken or as "911"?
4. For abbreviations, should the full form be used, for instance transcribing "Avenue" instead of "Ave." when the speaker mentions an address?
5. For acronyms written as a single word but pronounced letter by letter (e.g., "UPS"), should the transcription use the written form ("UPS") or the individual letters as spoken ("U P S")?
6. If a speaker breaks off a word midway, should the transcription include the fragment of the incomplete word?
7. Should the transcriptions account for ambient noise, crosstalk, or other unintelligible sounds in the recording, and label them with a tag such as [noise]?
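
To make "normalized text" concrete for the items above, here is a minimal sketch of what a normalization pass over a verbatim transcript could look like. It is only an illustration using the Python standard library: the `normalize` function, the `TAG_PATTERN` name, and the `[noise]`-style tag format are assumptions made for this example, and it is not Whisper's own normalizer.

```python
import re
import string

# Assumed tag format for non-speech events, e.g. "[noise]" (hypothetical convention).
TAG_PATTERN = re.compile(r"\[[^\]]*\]")


def normalize(text: str) -> str:
    """Drop bracketed tags, lowercase, strip punctuation, collapse whitespace."""
    text = TAG_PATTERN.sub(" ", text)
    text = text.lower()
    # Turn hyphens into spaces so "twenty-two" becomes "twenty two".
    text = text.replace("-", " ")
    # Remove the remaining ASCII punctuation (commas, periods, apostrophes, ...).
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


verbatim = "Call 911, then go to Fifth Avenue. [noise] Twenty-two people are waiting."
print(normalize(verbatim))
# prints: call 911 then go to fifth avenue twenty two people are waiting
```

Note that this sketch leaves digits and spelled-out numbers as they were written, so it does not settle the "twenty-two" versus "22" question; it only shows how punctuation, case, and noise tags could be stripped from a verbatim transcript after the fact.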

tqtifnypmb · 2024-01-16T09:28:38Z


Regarding cases 4 and 6: I think "transcription" means you should write down whatever you hear.

If you hear "Avenue", then the transcript shouldn't be "Ave."
