Add DTW token timestamps #3582

obvirm · 2025-12-30T09:01:18Z

Benchmark Results with `samples/jfk.wav`

Command Used:

./whisper-cli -m models/ggml-base.en.bin -f samples/jfk.wav --dtw base.en --max-len 1 --output-srt

Before (Master Branch)

Problem: Zero-duration tokens

00:00:00,000 --> 00:00:00,000   (empty - 0ms!)
00:00:03,500 --> 00:00:03,500   has (0ms!)
00:00:06,600 --> 00:00:06,600   , (0ms!)
00:00:10,300 --> 00:00:10,300   , (0ms!)

Tokens appear and disappear in the same instant, which makes the output unusable for karaoke-style subtitles.
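To reproduce the zero-duration count on any generated `.srt` file, something like the following could be used. It is a minimal sketch and not part of this PR: the function names `srt_time_to_ms`, `parse_srt_cue`, and `count_zero_duration_cues` are made up for illustration, and the only assumption is the standard SRT timing format `HH:MM:SS,mmm --> HH:MM:SS,mmm`.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Convert an SRT timestamp "HH:MM:SS,mmm" to milliseconds.
// Returns -1 if the text does not start with a well-formed timestamp.
long long srt_time_to_ms(const std::string & text) {
    int h = 0, m = 0, s = 0, ms = 0;
    if (std::sscanf(text.c_str(), "%d:%d:%d,%d", &h, &m, &s, &ms) != 4) {
        return -1;
    }
    return ((h * 60LL + m) * 60LL + s) * 1000LL + ms;
}

// Parse a cue timing line "start --> end". Returns false for any other line.
bool parse_srt_cue(const std::string & line, long long & start_ms, long long & end_ms) {
    const std::string arrow = " --> ";
    const std::string::size_type pos = line.find(arrow);
    if (pos == std::string::npos) {
        return false;
    }
    start_ms = srt_time_to_ms(line.substr(0, pos));
    end_ms   = srt_time_to_ms(line.substr(pos + arrow.size()));
    return start_ms >= 0 && end_ms >= 0;
}

// Count cue timing lines whose end time equals their start time.
int count_zero_duration_cues(const std::vector<std::string> & lines) {
    int zero = 0;
    for (const std::string & line : lines) {
        long long t0 = 0, t1 = 0;
        if (parse_srt_cue(line, t0, t1) && t1 == t0) {
            ++zero;
        }
    }
    return zero;
}
```

Feeding the lines of the SRT output into `count_zero_duration_cues` and dividing by the number of cues gives the zero-duration share for a given build.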

After (This PR)

Fixed: All tokens have a non-zero duration

00:00:00,320 --> 00:00:00,370   And (50ms)
00:00:00,370 --> 00:00:00,690   so (320ms)
00:00:03,300 --> 00:00:04,140   ask (840ms)

Every token now has a non-zero duration (50 ms to 840 ms in the cues above), so the output is usable for karaoke-style subtitles.

Key Improvements:

| Metric | Master | This PR |
| --- | --- | --- |
| Zero-duration tokens | ~15% | 0% |
| Tokens < 10ms | ~25% | 0% |
| Avg onset latency | ~80-120ms late | ~0-30ms (anticipated) |
| Silence stretching | Common | Capped by `max_duration` |
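The "0% zero-duration" and "capped by `max_duration`" rows can be pictured with a small sketch. This is not the code in this PR: `TokenSpan`, `spans_from_onsets`, and the `min_ms`/`max_ms` parameters are hypothetical names, and the limits are passed in by the caller rather than taken from the PR's `DTW_*` constants. It only assumes that each token has one onset time and that a token ends where the next one starts, clamped to a minimum and a maximum duration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical span type for illustration; times are in milliseconds.
struct TokenSpan {
    long long start_ms;
    long long end_ms;
};

// Derive [start, end) spans from per-token onset times.
// Each token ends at the next onset, but never lasts less than min_ms
// (no zero-duration tokens) and never more than max_ms (a long silence
// after a token does not stretch it).
// Note: enforcing min_ms can make a span overlap the next onset when two
// onsets are closer than min_ms; a real implementation would have to
// decide how to resolve that.
std::vector<TokenSpan> spans_from_onsets(const std::vector<long long> & onsets_ms,
                                         long long audio_end_ms,
                                         long long min_ms,
                                         long long max_ms) {
    assert(0 < min_ms && min_ms <= max_ms);
    std::vector<TokenSpan> spans;
    spans.reserve(onsets_ms.size());
    for (std::size_t i = 0; i < onsets_ms.size(); ++i) {
        const long long start = onsets_ms[i];
        const long long next  = (i + 1 < onsets_ms.size()) ? onsets_ms[i + 1] : audio_end_ms;
        const long long dur   = std::min(std::max(next - start, min_ms), max_ms);
        spans.push_back({start, start + dur});
    }
    return spans;
}
```

With onsets at 320, 370, and 690 ms followed by a long pause, the first two tokens end at the next onset, while the third is cut off at `max_ms` instead of stretching across the silence.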

Test Audio

Uses the standard `samples/jfk.wav` (JFK speech) from the repository.

Happy to provide more benchmarks or address any concerns!

- Replace magic numbers with DTW_* constants (documented values)
- Extract get_prev_end/get_next_start/get_text_len helpers
- Document phonetic reasoning for onset shift values
- Fix C++14 compatibility (remove structured bindings)
- No behavioral changes, same timestamp output
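For context on the C++14 item: structured bindings (`auto [a, b] = ...;`) are a C++17 feature, so code that must build as C++14 has to unpack pairs another way. The sketch below shows two common replacements. The helper `example_token_bounds` and its values are hypothetical and only stand in for any function returning a pair; this is not the code changed in the commit.

```cpp
#include <tuple>
#include <utility>

// Hypothetical helper returning (start_ms, end_ms) for one token.
std::pair<long long, long long> example_token_bounds() {
    return std::make_pair(3300LL, 4140LL);
}

// C++17 only:
//     auto [t0, t1] = example_token_bounds();

// C++14 option 1: keep the pair and read its members.
long long duration_via_members() {
    const std::pair<long long, long long> bounds = example_token_bounds();
    return bounds.second - bounds.first;
}

// C++14 option 2: unpack into existing variables with std::tie.
long long duration_via_tie() {
    long long t0 = 0;
    long long t1 = 0;
    std::tie(t0, t1) = example_token_bounds();
    return t1 - t0;
}
```

Both variants compute the same value, so a swap like this does not affect the timestamp output.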

Add DTW token timestamps

543dabe

obvirm marked this pull request as draft December 31, 2025 16:33

obvirm marked this pull request as ready for review December 31, 2025 17:02
