Add ElevenLabs as a Provider for Speech to Text #991

Merged
iamdharmesh merged 8 commits into develop from feature/986 on Sep 8, 2025

@dkotter (Collaborator) commented Aug 27, 2025

Description of the Change

The Audio Transcripts Generation Feature (Speech to Text) currently has a single Provider, OpenAI. This PR adds ElevenLabs as a second Provider.

This Feature has no functional changes; it simply supports another Provider now. One benefit of ElevenLabs is a free tier that includes 2 hours 30 minutes of Speech to Text each month, so if your transcription needs are limited, you may be able to use ElevenLabs at no cost.

They also seem to support multiple languages better than OpenAI, so that could also be a good reason to use them.

Note: it is probably worth a follow-up to add ElevenLabs as a Text to Speech Provider once this is merged, as that should be pretty easy at that point.

Closes #986

How to test the Change

  1. If you don't have an ElevenLabs account, sign up for one (it is free)
  2. In your account, create an API key and ensure it has the Speech to Text endpoint enabled and Read access to the Models information (which we use to validate the connection)
  3. Turn on Audio Transcripts Generation, select ElevenLabs and enter your API key
  4. Save settings and ensure no error is shown
  5. If desired, change the model it uses (there are only two available right now)
  6. Upload a new audio file and ensure the transcription is generated and stored in the Description field
  7. For an existing audio file, ensure you can generate a transcription from both the media modal and the single edit view
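
If saving settings in step 4 fails, it can help to check the API key outside the plugin. The following is a minimal sketch of that kind of check, not the code in this PR. It assumes the public ElevenLabs API lists models at `GET https://api.elevenlabs.io/v1/models` and authenticates with an `xi-api-key` header; the function names `build_models_request` and `validate_connection` are hypothetical.

```python
import urllib.error
import urllib.request

# Assumption: base URL of the public ElevenLabs API.
API_BASE = "https://api.elevenlabs.io"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request that lists the available models."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key must not be empty")
    return urllib.request.Request(
        f"{API_BASE}/v1/models",
        # Assumption: the API key is passed in the xi-api-key header.
        headers={"xi-api-key": key, "Accept": "application/json"},
        method="GET",
    )


def validate_connection(api_key: str, opener=urllib.request.urlopen) -> bool:
    """Return True if the models endpoint answers with HTTP 200 for this key."""
    try:
        with opener(build_models_request(api_key), timeout=10) as response:
            return response.status == 200
    except urllib.error.URLError:
        # Covers HTTP errors (for example a rejected key) and network failures.
        return False
```

Calling `validate_connection("your-api-key")` performs a real network request. A key without Read access to the Models information would be expected to fail this check even if the Speech to Text endpoint is enabled, which matches the permissions listed in step 2.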

Changelog Entry

Added - ElevenLabs as a new Provider for the Audio Transcripts Generation Feature (Speech to Text)
Changed - Moved some methods from the OpenAI Speech to Text Provider class to the Audio Transcripts Generation Feature class, to avoid code duplication. If relying on those methods, please update your code

Credits

Props @dkotter, @jeffpaul

@dkotter added this to the 3.7.0 milestone Aug 27, 2025
@dkotter self-assigned this Aug 27, 2025
@dkotter requested review from a team and jeffpaul as code owners August 27, 2025 22:23
@github-actions bot added the needs:code-review ("This requires code review.") label Aug 27, 2025

@iamdharmesh (Member) left a comment

Code looks great and tests well. Thanks for adding this @dkotter

@iamdharmesh merged commit 6f7845f into develop Sep 8, 2025
20 checks passed
@iamdharmesh deleted the feature/986 branch September 8, 2025 12:36

Development

Successfully merging this pull request may close these issues.

Add ElevenLabs' Scribe as a Speech-to-Text service provider
