feat: Implement OpenAI style local API server for audio transcription #509

Yorick-Ryu · 2026-01-02T14:49:25Z

Before Submitting This PR

Please confirm you have done the following:

I have searched existing issues and pull requests (including closed ones) to ensure this isn't a duplicate
I have read CONTRIBUTING.md

If this is a feature or change that was previously closed/rejected:

I have explained in the description below why this should be reconsidered
I have gathered community feedback (link to discussion below)

Human Written Description

I implemented a local STT API that follows the OpenAI Whisper format. Currently, the Whisper model is only accessible within Handy; however, many users want to leverage this functionality for external tasks like subtitle transcription without loading multiple model instances. This change exposes the speech-to-text capability as a standardized service, allowing users to do more with limited system memory.

Related Issues/Discussions

Fixes # None
Discussion: #241

Community Feedback

#241

Testing

Environment:

Tested on: macOS 26.2 (Apple Silicon M1 Pro)
Status: Functional on macOS. Need help testing on Windows and Linux platforms to ensure consistent behavior.

Test Cases:

Features: Tested by calling the API with curl and with the "convert MP3 to SRT" demo.
On-demand Loading: Verified via curl that calling the /v1/audio/transcriptions endpoint correctly triggers the model loading process in the background.
Waiting Mechanism: Confirmed the API response waits until the model is fully loaded before processing the transcription, preventing "Model not loaded" errors.
Verified Limitations: Tested various audio formats and confirmed only MP3 currently works reliably; documented this behavior and added a "welcome PRs" note in LOCAL_API.md to guide future contributors.
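For anyone who wants to help test on Windows or Linux, the curl check above can be approximated with a small standard-library Python client. This is a minimal sketch, not code from the PR: the host and port (`127.0.0.1:8000`) are placeholders, and the form field names (`file`, `model`) and the `whisper-1` value follow the OpenAI transcription API convention and are assumed, not confirmed, to be what this server expects. Only the `/v1/audio/transcriptions` path comes from the test cases above.

```python
import json
import os
import urllib.request
import uuid

# Placeholder address: the PR thread does not state the host or port.
BASE_URL = "http://127.0.0.1:8000"
# Endpoint path named in the test cases above.
ENDPOINT = "/v1/audio/transcriptions"


def build_multipart(fields, file_field, filename, file_bytes,
                    content_type="audio/mpeg", boundary=None):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"\r\n'
         f"Content-Type: {content_type}\r\n\r\n").encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(audio_bytes, filename="sample.mp3", base_url=BASE_URL):
    """Build (but do not send) a POST request for the transcription endpoint."""
    # "model" and "file" are assumed field names, taken from the OpenAI API.
    body, ctype = build_multipart({"model": "whisper-1"}, "file",
                                  filename, audio_bytes)
    return urllib.request.Request(
        base_url + ENDPOINT,
        data=body,
        headers={"Content-Type": ctype},
        method="POST",
    )


def transcribe(path, timeout=600):
    """Send an MP3 file and return the parsed JSON response.

    The long timeout leaves room for the server to load the model on demand
    before it answers.
    """
    with open(path, "rb") as f:
        req = build_request(f.read(), filename=os.path.basename(path))
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `transcribe("sample.mp3")` with the local API enabled should exercise both the on-demand loading and the waiting behavior described above. MP3 is used here because it is the only format reported to work reliably.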

Screenshots/Videos (if applicable)

Yorick-Ryu · 2026-01-05T13:24:36Z

@cjpais Hello, is there any issue with this PR?

cjpais · 2026-01-05T23:23:04Z

@Yorick-Ryu please be patient, I have not had time to review it yet

I'm currently traveling and haven't been able to look at my laptop much. Handy was featured in Wired, which has brought in a lot of new issues and discussions that I respond to every day.

Yorick-Ryu · 2026-01-06T06:58:32Z

I saw this: https://www.wired.com/story/handy-free-speech-to-text-app/
Congratulations!

samiulazam · 2026-01-15T06:02:53Z

@cjpais any updates?

cjpais · 2026-01-15T08:31:13Z

Patience is the key :)

Aaryan-Kapoor · 2026-01-22T03:39:02Z

+1

FSerg · 2026-01-27T20:21:26Z

+1


Yorick-Ryu · 2026-01-28T09:28:41Z

@cjpais Conflicts are fixed. Is it a good time to merge?

cjpais · 2026-01-28T13:45:10Z

@Yorick-Ryu Please be patient. I will merge when I have time.

Yorick-Ryu added 2 commits January 28, 2026 17:02

feat: Implement local API server for audio transcription with detailed segment output (5f97085)

feat: Add CORS to the API server and introduce local API enablement and port settings. (87fb4df)

Yorick-Ryu force-pushed the main branch from e7a5876 to 87fb4df Compare January 28, 2026 09:26
