When "word_timestamps=True", the SRT file has incorrect content #472

@surfincanoy

import stable_whisper

model = stable_whisper.load_faster_whisper("large-v3-turbo")
result = model.transcribe("1-1.mp3", language="ja", word_timestamps=False)
result.to_srt_vtt("audio-T.srt")  # SRT

With "word_timestamps=False", the SRT file has the correct content:

1
00:00:01,080 --> 00:00:07,840
世界で一番有名な富士山の絵 葛飾北斎

2
00:00:07,840 --> 00:00:23,280
1998年にアメリカの雑誌ライフが この1000年の間の世界のすごい人100人を選びました

result = model.transcribe("1-1.mp3", language="ja", word_timestamps=True)
or
result = model.align("1-1.mp3", text, language="ja", aligner="new")

With "word_timestamps=True", the SRT file has incorrect content:

1
00:00:01,080 --> 00:00:01,380
<font color="#00ff00">世界</font>で一番有名な

2
00:00:01,380 --> 00:00:01,660
世界<font color="#00ff00">で</font>一番有名な

3
00:00:01,660 --> 00:00:01,860
世界で一番有名な

4
00:00:01,860 --> 00:00:01,960
世界で<font color="#00ff00">一</font>番有名な
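The cues above look like word-level highlighting: each cue repeats the segment text and wraps one word in a <font> tag. A minimal sketch of two possible ways to get plain segment-level cues follows. It assumes that to_srt_vtt accepts a word_level flag, as the stable-ts README describes, and that this flag is what produces the highlighted cues here; that has not been confirmed for this case. The helper names strip_highlight_tags and write_segment_level_srt are hypothetical and used only for illustration.

```python
import re

# Matches the <font ...> and </font> tags seen in the word-highlighted SRT above.
FONT_TAG = re.compile(r"</?font\b[^>]*>")


def strip_highlight_tags(cue_text: str) -> str:
    """Remove <font> highlight tags from one SRT cue, leaving the plain text."""
    return FONT_TAG.sub("", cue_text)


def write_segment_level_srt(audio_path: str, srt_path: str, language: str = "ja") -> None:
    """Hypothetical helper: keep word timestamps but export one cue per segment."""
    # Imported lazily so strip_highlight_tags can be used without stable-ts installed.
    import stable_whisper

    model = stable_whisper.load_faster_whisper("large-v3-turbo")
    result = model.transcribe(audio_path, language=language, word_timestamps=True)
    # Assumption: word_level=False turns off the per-word highlighted cues.
    result.to_srt_vtt(srt_path, word_level=False)


if __name__ == "__main__":
    # Post-processing route: clean a cue from an already exported file.
    print(strip_highlight_tags('世界<font color="#00ff00">で</font>一番有名な'))
```

The first helper only cleans text that has already been exported, so the short per-word cue timings would remain; the second asks the exporter for segment-level cues directly, which is likely closer to the output shown for word_timestamps=False.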
