Need for a verbatim / raw transcription mode for forced alignment #2712
bvegliantestudioeco
started this conversation in
Ideas
Replies: 0 comments
Whisper produces transcriptions optimized for readability, which works very well
for most use cases.
However, for professional workflows involving forced alignment (e.g. Aeneas,
subtitle timing, e-learning pipelines), this creates a limitation.
What seems to be missing is a verbatim / raw transcription mode that:
This would not require changing the default behavior.
A separate flag or decoding mode (e.g. --verbatim) could explicitly separate
transcription from normalization.
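To make the gap concrete, here is a minimal sketch of how a readable transcript can drift from what was actually spoken, which is the mismatch a forced aligner then has to cope with. It uses only Python's standard difflib; the helper name token_diff and both example strings are hypothetical illustrations, not real Whisper output or part of any Whisper API.

```python
import difflib


def token_diff(spoken, transcript):
    """Return (op, spoken_tokens, transcript_tokens) for each span that differs."""
    a = spoken.lower().split()
    b = transcript.lower().split()
    # autojunk=False so frequent short words are never treated as junk.
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical strings: what the speaker said vs. a readability-oriented transcript.
spoken = "um so we we have twenty five slides"
readable = "so we have 25 slides"

diffs = token_diff(spoken, readable)
for op, said, written in diffs:
    print(op, said, written)
```

Every span reported here is a place where the audio contains tokens the text lacks (fillers, repetitions) or where the written form differs from the spoken one (numbers), so an aligner has nothing reliable to anchor to at those points.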
Today users are forced to choose between:
Curious to hear if this is something the team has considered,
or if there are recommended approaches for alignment-critical workflows.