import stable_whisper

model = stable_whisper.load_faster_whisper("large-v3-turbo")
result = model.transcribe("1-1.mp3", language="ja", word_timestamps=False)
result.to_srt_vtt("audio-T.srt") # SRT
With word_timestamps=False, the SRT file has the correct content:
1
00:00:01,080 --> 00:00:07,840
世界で一番有名な富士山の絵 葛飾北斎
2
00:00:07,840 --> 00:00:23,280
1998年にアメリカの雑誌ライフが この1000年の間の世界のすごい人100人を選びました
result = model.transcribe("1-1.mp3", language="ja", word_timestamps=True)
or
result = model.align("1-1.mp3", text, language="ja", aligner="new")
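With word_timestamps=True (or when using align), the SRT file has incorrect content. Each segment is split into one cue per word, with the current word wrapped in a <font> tag: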
1
00:00:01,080 --> 00:00:01,380
<font color="#00ff00">世界</font>で一番有名な
2
00:00:01,380 --> 00:00:01,660
世界<font color="#00ff00">で</font>一番有名な
3
00:00:01,660 --> 00:00:01,860
世界で一番有名な
4
00:00:01,860 --> 00:00:01,960
世界で<font color="#00ff00">一</font>番有名な
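The output above looks like word-level highlighting rather than corrupted text: every cue repeats the same segment text and only the highlighted word changes. As far as I recall, `to_srt_vtt` accepts a `word_level` argument that can turn this off (for example `word_level=False`), but treat that as an assumption to verify against the stable-ts documentation. Independently of the library, the sketch below post-processes such an SRT file using only the standard library. The function name `merge_highlighted_cues` is hypothetical and not part of stable-ts. It strips `<font>` tags and merges consecutive cues whose plain text is identical.

```python
import re

# Matches opening and closing <font ...> tags used for word highlighting.
FONT_TAG = re.compile(r"</?font[^>]*>")


def merge_highlighted_cues(srt_text):
    """Strip <font> tags and merge consecutive cues with identical plain text.

    Hypothetical helper for illustration; not part of stable-ts.
    """
    merged = []  # each entry is [start, end, text]
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip blocks that are not well-formed cues
        start, end = (part.strip() for part in lines[1].split("-->"))
        text = FONT_TAG.sub("", "\n".join(lines[2:]))
        if merged and merged[-1][2] == text:
            merged[-1][1] = end  # same text as previous cue: extend its end time
        else:
            merged.append([start, end, text])
    # Renumber the cues from 1 and rebuild the SRT text.
    return "\n\n".join(
        f"{i}\n{start} --> {end}\n{text}"
        for i, (start, end, text) in enumerate(merged, 1)
    ) + "\n"
```

Note that this merges any adjacent cues with the same text, so two genuinely separate segments that happen to repeat the same line would also be joined.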