Commit 6f7845f

Merge pull request #991 from 10up/feature/986
Add ElevenLabs as a Provider for Speech to Text
2 parents 3d147dd + f8c35c5 commit 6f7845f

15 files changed

+837
-96
lines changed

README.md

Lines changed: 32 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -22,7 +22,7 @@ Tap into leading cloud-based services like [OpenAI](https://openai.com/), [Micro
2222
* Expand or condense text content using [OpenAI's ChatGPT API](https://platform.openai.com/docs/guides/chat), [Microsoft Azure's OpenAI service](https://azure.microsoft.com/en-us/products/ai-services/openai-service), [Google's Gemini API](https://ai.google.dev/docs/gemini_api_overview), [xAI's Grok](https://x.ai/) or locally using [Ollama](https://ollama.com/)
2323
* Draft a full length article using [OpenAI's ChatGPT API](https://platform.openai.com/docs/guides/chat), [Microsoft Azure's OpenAI service](https://azure.microsoft.com/en-us/products/ai-services/openai-service) or locally using [Ollama](https://ollama.com/)
2424
* Generate new images on demand to use in-content or as a featured image using [OpenAI's Image Generation API](https://platform.openai.com/docs/guides/images-vision), [Google AI's Imagen API](https://ai.google.dev/gemini-api/docs/image-generation#imagen), [Together AI's API](https://docs.together.ai/docs/images-overview) or locally using [Stable Diffusion](https://github.com/AUTOMATIC1111/stable-diffusion-webui/)
25-
* Generate transcripts of audio files using [OpenAI's Audio Transcription API](https://platform.openai.com/docs/guides/speech-to-text)
25+
* Generate transcripts of audio files using [OpenAI's Audio Transcription API](https://platform.openai.com/docs/guides/speech-to-text) or [ElevenLabs Speech to Text API](https://elevenlabs.io/docs/capabilities/speech-to-text)
2626
* Moderate incoming comments for sensitive content using [OpenAI's Moderation API](https://platform.openai.com/docs/guides/moderation)
2727
* Convert text content into audio and output a "read-to-me" feature on the front-end to play this audio using [Microsoft Azure's Text to Speech API](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/text-to-speech), [Amazon Polly](https://aws.amazon.com/polly/) or [OpenAI's Text to Speech API](https://platform.openai.com/docs/guides/text-to-speech)
2828
* Classify post content using [IBM Watson's Natural Language Understanding API](https://www.ibm.com/watson/services/natural-language-understanding/), [OpenAI's Embedding API](https://platform.openai.com/docs/guides/embeddings), [Microsoft Azure's OpenAI service](https://azure.microsoft.com/en-us/products/ai-services/openai-service) or locally using [Ollama](https://ollama.com/)
@@ -61,6 +61,7 @@ Tap into leading cloud-based services like [OpenAI](https://openai.com/), [Micro
6161
* [WordPress](http://wordpress.org) 6.7+
6262
* To utilize the NLU Language Processing functionality, you will need an active [IBM Watson](https://cloud.ibm.com/registration) account.
6363
* To utilize the ChatGPT, Embeddings, Text to Speech or Speech to Text Language Processing functionality or Image Generation functionality, you will need an active [OpenAI](https://platform.openai.com/signup) account.
64+
* To utilize the ElevenLabs Speech to Text Language Processing functionality, you will need an active [ElevenLabs](https://elevenlabs.io/sign-up) account.
6465
* To utilize the Azure AI Vision Image Processing functionality or Text to Speech Language Processing functionality, you will need an active [Microsoft Azure](https://signup.azure.com/signup) account.
6566
* To utilize the Azure OpenAI Language Processing functionality, you will need an active [Microsoft Azure](https://signup.azure.com/signup) account and you will need to [apply](https://aka.ms/oai/access) for OpenAI access.
6667
* To utilize the Google Gemini Language Processing functionality or Image Generation functionality, you will need an active [Google Gemini](https://ai.google.dev/tutorials/setup) account.
@@ -371,6 +372,36 @@ Note that [OpenAI](https://platform.openai.com/docs/guides/speech-to-text) can c
371372
* Upload a new audio file.
372373
* Check to make sure the transcript was stored in the Description field.
373374

375+
## Set Up Audio Transcripts Generation (via ElevenLabs Speech to Text)
376+
377+
Note that [ElevenLabs](https://elevenlabs.io/docs/capabilities/speech-to-text) can create a transcript for audio files that meet the following requirements:
378+
379+
* The file must be presented in mp3, mp4, mpeg, wav, or ogg format
380+
* The file size must be less than 100 megabytes (MB)
381+
382+
### 1. Sign up for ElevenLabs
383+
384+
* [Sign up for an ElevenLabs account](https://elevenlabs.io/sign-up) or sign into your existing one.
385+
* Log into your account and go to the [API key page](https://elevenlabs.io/app/developers/api-keys).
386+
* Click `Create Key` to create a new API key, then enable access to the Speech to Text endpoint and Read access to the Models endpoint.
387+
388+
### 2. Configure ElevenLabs API Keys under Tools > ClassifAI > Language Processing > Audio Transcripts Generation > Settings
389+
390+
* Select **ElevenLabs Audio Transcription** in the Provider dropdown.
391+
* Enter your API Key copied from the above step into the `API Key` field.
392+
* After saving and verifying the connection, select the model you want to use for the transcription.
393+
394+
### 3. Enable specific features
395+
396+
* Choose to enable the ability to automatically generate transcripts from supported audio files.
397+
* Choose which user roles have access to this ability.
398+
* Save settings. An error will show if API authentication fails.
399+
400+
### 4. Upload a new audio file
401+
402+
* Upload a new audio file.
403+
* Check to make sure the transcript was stored in the Description field.
404+
374405
## Set Up Text to Speech (via Microsoft Azure)
375406

376407
### 1. Sign up for Azure services

includes/Classifai/Features/AudioTranscriptsGeneration.php

Lines changed: 88 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -3,14 +3,17 @@
33
namespace Classifai\Features;
44

55
use Classifai\Services\LanguageProcessing;
6-
use Classifai\Providers\OpenAI\SpeechToText;
6+
use Classifai\Providers\OpenAI\SpeechToText as OpenAISpeechToText;
7+
use Classifai\Providers\ElevenLabs\SpeechToText as ElevenLabsSpeechToText;
78
use WP_Error;
89
use WP_REST_Server;
910
use WP_REST_Request;
1011

1112
use function Classifai\get_asset_info;
1213
use function Classifai\clean_input;
1314
use function Classifai\safe_wp_remote_get;
15+
use function Classifai\is_remote_url;
16+
use function Classifai\is_local_path;
1417

1518
/**
1619
* Class AudioTranscriptsGeneration
@@ -34,7 +37,8 @@ public function __construct() {
3437

3538
// Contains just the providers this feature supports.
3639
$this->supported_providers = [
37-
SpeechToText::ID => __( 'OpenAI Audio Transcription', 'classifai' ),
40+
OpenAISpeechToText::ID => __( 'OpenAI Audio Transcription', 'classifai' ),
41+
ElevenLabsSpeechToText::ID => __( 'ElevenLabs Audio Transcription', 'classifai' ),
3842
];
3943
}
4044

@@ -271,7 +275,7 @@ public function get_enable_description(): string {
271275
*/
272276
public function get_feature_default_settings(): array {
273277
return [
274-
'provider' => SpeechToText::ID,
278+
'provider' => OpenAISpeechToText::ID,
275279
];
276280
}
277281

@@ -470,6 +474,87 @@ public static function remote_url_to_path( string $url ) {
470474
return $temp_file_path;
471475
}
472476

477+
/**
478+
* Generates a transcript from a given attachment ID.
479+
*
480+
* Validates that the current user can edit the attachment,
481+
* ensures the feature is enabled, and checks whether the attachment
482+
* meets the processing criteria (e.g., correct file type and size).
483+
*
484+
* @param int $attachment_id Attachment post ID.
485+
* @param array $args Optional arguments to pass to the route.
486+
* @return string|WP_Error Transcription result on success, or WP_Error on failure.
487+
*/
488+
public function transcribe_from_attachment( int $attachment_id = 0, array $args = [] ) {
489+
if ( $attachment_id && ! current_user_can( 'edit_post', $attachment_id ) && ( ! defined( 'WP_CLI' ) || ! WP_CLI ) ) {
490+
return new WP_Error( 'no_permission', esc_html__( 'User does not have permission to edit this attachment.', 'classifai' ) );
491+
}
492+
493+
if ( ! $this->is_feature_enabled() ) {
494+
return new WP_Error( 'not_enabled', esc_html__( 'Transcript generation is disabled. Please check your settings.', 'classifai' ) );
495+
}
496+
497+
if ( ! $this->should_process( $attachment_id ) ) {
498+
return new WP_Error( 'process_error', esc_html__( 'Attachment does not meet processing requirements. Ensure the file type and size meet requirements.', 'classifai' ) );
499+
}
500+
501+
$settings = $this->get_settings();
502+
$provider_id = $settings['provider'];
503+
$provider_instance = $this->get_feature_provider_instance( $provider_id );
504+
505+
if ( ! $provider_instance || ! method_exists( $provider_instance, 'transcribe_audio' ) ) {
506+
return new WP_Error( 'provider_error', esc_html__( 'Provider instance not found.', 'classifai' ) );
507+
}
508+
509+
return $provider_instance->transcribe_audio(
510+
get_attached_file( $attachment_id ),
511+
array_merge( $args, array( 'attachment_id' => $attachment_id ) )
512+
);
513+
}
514+
515+
/**
516+
* Generates a transcript from a file path or remote URL.
517+
*
518+
* If the path is a remote URL, it is downloaded to a temporary
519+
* location and deleted after processing. If it's a local path
520+
* and the file exists, it is processed directly.
521+
*
522+
* @param string $path Absolute local path or remote URL to an audio file.
523+
* @param array $args Optional arguments to pass to the route.
524+
* @return string|WP_Error Transcription result on success, or WP_Error on failure.
525+
*/
526+
public function transcribe_from_path( string $path, array $args = [] ) {
527+
$settings = $this->get_settings();
528+
$provider_id = $settings['provider'];
529+
$provider_instance = $this->get_feature_provider_instance( $provider_id );
530+
531+
if ( ! $provider_instance || ! method_exists( $provider_instance, 'transcribe_audio' ) ) {
532+
return new WP_Error( 'provider_error', esc_html__( 'Provider instance not found.', 'classifai' ) );
533+
}
534+
535+
$result = '';
536+
537+
if ( is_remote_url( $path ) ) {
538+
$temp_file_path = self::remote_url_to_path( $path );
539+
540+
if ( is_wp_error( $temp_file_path ) ) {
541+
return $temp_file_path;
542+
}
543+
544+
$result = $provider_instance->transcribe_audio( $temp_file_path, $args );
545+
wp_delete_file( $temp_file_path );
546+
} elseif ( is_local_path( $path ) ) {
547+
if ( file_exists( $path ) ) {
548+
return $provider_instance->transcribe_audio( $path, $args );
549+
550+
} else {
551+
return $result;
552+
}
553+
}
554+
555+
return $result;
556+
}
557+
473558
/**
474559
* Generates feature setting data required for migration from
475560
* ClassifAI < 3.0.0 to 3.0.0
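
The remote-versus-local dispatch described for `transcribe_from_path()` can be sketched outside WordPress as follows. This is a minimal illustration, not the plugin's code: `sketch_is_remote_url()`, `sketch_transcribe_from_path()`, and the two callables are hypothetical stand-ins for `is_remote_url()`, `remote_url_to_path()`, and the provider's `transcribe_audio()`, whose implementations are not shown in this diff. The scheme check in particular is an assumption.

```php
<?php
// Hypothetical stand-alone sketch; none of these names exist in ClassifAI.

// Assumption: a path counts as remote when it has an http or https scheme.
function sketch_is_remote_url( string $path ): bool {
	$scheme = parse_url( $path, PHP_URL_SCHEME );

	return in_array( $scheme, [ 'http', 'https' ], true );
}

/**
 * @param string   $path       Local path or remote URL.
 * @param callable $transcribe Stand-in for the provider call; receives a local path.
 * @param callable $download   Stand-in for the download step; returns a temp file path.
 * @return string Transcript, or an empty string when a local file is missing.
 */
function sketch_transcribe_from_path( string $path, callable $transcribe, callable $download ): string {
	if ( sketch_is_remote_url( $path ) ) {
		$temp_file_path = $download( $path );
		$result         = $transcribe( $temp_file_path );

		// Remove the temporary copy once the provider has seen it.
		if ( is_file( $temp_file_path ) ) {
			unlink( $temp_file_path );
		}

		return $result;
	}

	// Local paths are processed directly, but only when the file exists.
	return is_file( $path ) ? $transcribe( $path ) : '';
}
```

As in the diff, a missing local file yields an empty string rather than an error, so a caller has to treat an empty result as "nothing was transcribed".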
Lines changed: 213 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,213 @@
1+
<?php
2+
/**
3+
* ElevenLabs shared functionality
4+
*/
5+
6+
namespace Classifai\Providers\ElevenLabs;
7+
8+
use WP_Error;
9+
10+
use function Classifai\safe_wp_remote_get;
11+
use function Classifai\safe_wp_remote_post;
12+
13+
trait ElevenLabs {
14+
15+
/**
16+
* ElevenLabs base API URL
17+
*
18+
* @var string
19+
*/
20+
protected $api_url = 'https://api.elevenlabs.io/v1';
21+
22+
/**
23+
* ElevenLabs model path
24+
*
25+
* @var string
26+
*/
27+
protected $model_path = 'models';
28+
29+
/**
30+
* Build the API URL.
31+
*
32+
* @param string $path The path to the API endpoint.
33+
* @return string
34+
*/
35+
public function get_api_url( string $path = '' ): string {
36+
/**
37+
* Filter the ElevenLabs API URL.
38+
*
39+
* @since x.x.x
40+
* @hook classifai_elevenlabs_api_url
41+
*
42+
* @param string $url The default API URL.
43+
* @param string $path The path to the API endpoint.
44+
*
45+
* @return string The API URL.
46+
*/
47+
return apply_filters( 'classifai_elevenlabs_api_url', trailingslashit( $this->api_url ) . $path, $path );
48+
}
49+
50+
/**
51+
* Make a request to the ElevenLabs API.
52+
*
53+
* Note: instead of adding a new APIRequest class as we do elsewhere,
54+
* this method is a lightweight version of one. The goal is to
55+
* replace it with a more robust APIRequest class in the future,
56+
* based on the PHP AI SDK.
57+
*
58+
* @param string $url The URL for the request.
59+
* @param string $api_key The API key.
60+
* @param string $type The type of request.
61+
* @param array $options The options for the request.
62+
* @return array|WP_Error
63+
*/
64+
public function request( string $url, string $api_key = '', string $type = 'post', array $options = [] ) {
65+
/**
66+
* Filter the URL for the request.
67+
*
68+
* @since x.x.x
69+
* @hook classifai_elevenlabs_api_request_url
70+
*
71+
* @param string $url The URL for the request.
72+
* @param array $options The options for the request.
73+
*
74+
* @return string The URL for the request.
75+
*/
76+
$url = apply_filters( 'classifai_elevenlabs_api_request_url', $url, $options );
77+
78+
// Set our default options.
79+
$options = wp_parse_args(
80+
$options,
81+
[
82+
'timeout' => 90, // phpcs:ignore WordPressVIPMinimum.Performance.RemoteRequestTimeout.timeout_timeout
83+
]
84+
);
85+
86+
/**
87+
* Filter the options for the request.
88+
*
89+
* @since x.x.x
90+
* @hook classifai_elevenlabs_api_request_options
91+
*
92+
* @param array $options The options for the request.
93+
* @param string $url The URL for the request.
94+
*
95+
* @return array The options for the request.
96+
*/
97+
$options = apply_filters( 'classifai_elevenlabs_api_request_options', $options, $url );
98+
99+
// Set our default headers.
100+
if ( empty( $options['headers'] ) ) {
101+
$options['headers'] = [];
102+
}
103+
104+
if ( ! isset( $options['headers']['xi-api-key'] ) ) {
105+
$options['headers']['xi-api-key'] = $api_key;
106+
}
107+
108+
if ( ! isset( $options['headers']['Content-Type'] ) ) {
109+
$options['headers']['Content-Type'] = 'application/json';
110+
}
111+
112+
// Make the request.
113+
if ( 'post' === $type ) {
114+
$response = safe_wp_remote_post( $url, $options );
115+
} else {
116+
$response = safe_wp_remote_get( $url, $options );
117+
}
118+
119+
// Parse out the response.
120+
if ( is_wp_error( $response ) ) {
121+
return $response;
122+
}
123+
124+
$body = wp_remote_retrieve_body( $response );
125+
$code = wp_remote_retrieve_response_code( $response );
126+
$json = json_decode( $body, true );
127+
128+
if ( 200 !== $code ) {
129+
if ( isset( $json['detail']['message'] ) ) {
130+
return new WP_Error( $json['detail']['status'] ?? $code, $json['detail']['message'] ?? esc_html__( 'An error occurred', 'classifai' ) );
131+
} else {
132+
return new WP_Error( $code, esc_html__( 'An error occurred', 'classifai' ) );
133+
}
134+
}
135+
136+
if ( json_last_error() === JSON_ERROR_NONE ) {
137+
if ( empty( $json['error'] ) ) {
138+
return $json;
139+
} else {
140+
$message = $json['error']['message'] ?? esc_html__( 'An error occurred', 'classifai' );
141+
return new WP_Error( $code, $message );
142+
}
143+
} elseif ( ! empty( wp_remote_retrieve_response_message( $response ) ) ) {
144+
return new WP_Error( $code, wp_remote_retrieve_response_message( $response ) );
145+
} else {
146+
return new WP_Error( 'invalid_json', 'Invalid JSON: ' . json_last_error_msg(), $body );
147+
}
148+
}
149+
150+
/**
151+
* Sanitize the API key, showing an error message if needed.
152+
*
153+
* @param array $new_settings Incoming settings, if any.
154+
* @param array $settings Current settings, if any.
155+
* @return array
156+
*/
157+
public function sanitize_api_key_settings( array $new_settings = [], array $settings = [] ): array {
158+
$models = $this->get_models( $new_settings[ static::ID ]['api_key'] ?? '' );
159+
160+
$new_settings[ static::ID ]['authenticated'] = $settings[ static::ID ]['authenticated'] ?? false;
161+
$new_settings[ static::ID ]['models'] = $settings[ static::ID ]['models'] ?? [];
162+
163+
if ( is_wp_error( $models ) ) {
164+
$new_settings[ static::ID ]['authenticated'] = false;
165+
$new_settings[ static::ID ]['models'] = [];
166+
$error_message = $models->get_error_message();
167+
168+
add_settings_error(
169+
'api_key',
170+
'classifai-auth',
171+
$error_message,
172+
'error'
173+
);
174+
} else {
175+
$new_settings[ static::ID ]['authenticated'] = true;
176+
$new_settings[ static::ID ]['models'] = $models;
177+
}
178+
179+
$new_settings[ static::ID ]['api_key'] = sanitize_text_field( $new_settings[ static::ID ]['api_key'] ?? $settings[ static::ID ]['api_key'] );
180+
181+
return $new_settings;
182+
}
183+
184+
/**
185+
* Get the available models.
186+
*
187+
* @param string $api_key The API key.
188+
* @return array|WP_Error
189+
*/
190+
protected function get_models( string $api_key = '' ) {
191+
// Check that we have credentials before hitting the API.
192+
if ( empty( $api_key ) ) {
193+
return new WP_Error( 'auth', esc_html__( 'Please enter your ElevenLabs API key.', 'classifai' ) );
194+
}
195+
196+
$response = $this->request( $this->get_api_url( $this->model_path ), $api_key, 'get' );
197+
198+
if ( is_wp_error( $response ) ) {
199+
return $response;
200+
}
201+
202+
// Get the model data we need.
203+
$models = array_map(
204+
fn( $model ) => [
205+
'id' => $model['model_id'] ?? '',
206+
'display_name' => $model['name'] ?? '',
207+
],
208+
$response
209+
);
210+
211+
return $models;
212+
}
213+
}
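
The order in which `request()` classifies a response (status code first, then JSON validity, then an `error` key in the payload) can be sketched without WordPress. This is a rough illustration under stated assumptions, not the trait's implementation: `sketch_resolve_response()` is a hypothetical name, a plain array stands in for `WP_Error`, and the `invalid_json` code is chosen for the sketch.

```php
<?php
// Hypothetical stand-alone sketch; WP_Error is replaced by a plain array.

function sketch_resolve_response( int $code, string $body, string $status_message = '' ): array {
	$json       = json_decode( $body, true );
	$json_valid = JSON_ERROR_NONE === json_last_error();
	$fallback   = 'An error occurred';

	// 1. Any non-200 status is an error; prefer the API's own detail message.
	if ( 200 !== $code ) {
		if ( isset( $json['detail']['message'] ) ) {
			return [
				'ok'      => false,
				'code'    => $json['detail']['status'] ?? $code,
				'message' => $json['detail']['message'],
			];
		}

		return [ 'ok' => false, 'code' => $code, 'message' => $fallback ];
	}

	// 2. A 200 with valid JSON succeeds unless the payload carries an error key.
	if ( $json_valid ) {
		if ( empty( $json['error'] ) ) {
			return [ 'ok' => true, 'data' => $json ];
		}

		return [ 'ok' => false, 'code' => $code, 'message' => $json['error']['message'] ?? $fallback ];
	}

	// 3. A 200 with an unparseable body falls back to the HTTP status message.
	if ( '' !== $status_message ) {
		return [ 'ok' => false, 'code' => $code, 'message' => $status_message ];
	}

	// Assumption: a short 'invalid_json' code chosen for this sketch.
	return [ 'ok' => false, 'code' => 'invalid_json', 'message' => json_last_error_msg() ];
}
```

For example, `sketch_resolve_response( 200, '{"text":"hello"}' )` takes the second branch and returns the decoded payload under `data`, while a 401 whose body carries a `detail.message` takes the first branch and surfaces that message.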
