Would be a great addition! It would also be great to see a nice DSL for creating SSML, something like RubyLLM::Schema. Feel free to open an issue for each so we can discuss the design before jumping into the code. The speak interface otherwise looks nice, though it would be better to autodetect SSML so there's no need to specify a format!
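A minimal sketch of the auto-detection idea, assuming a hypothetical `detect_speech_format` helper (not part of RubyLLM) and treating input as SSML only when its first element is `<speak>`:

```ruby
# Sketch only: guesses whether TTS input is SSML or plain text.
# `detect_speech_format` is a hypothetical helper, not an existing RubyLLM API.
def detect_speech_format(input)
  # Input counts as SSML when, after optional whitespace and an optional
  # XML declaration, the first element is <speak> or <speak ...>.
  ssml_start = /\A\s*(<\?xml[^>]*\?>\s*)?<speak[\s>]/i
  input.match?(ssml_start) ? :ssml : :text
end

detect_speech_format("<speak>Hello</speak>")   # SSML document
detect_speech_format("Say <speak> out loud")   # plain text that mentions the tag
```

Anchoring the check at the start of the input keeps plain text that merely mentions `<speak>` from being misread; an explicit format option could still override the guess.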
RubyLLM already supports audio transcription (speech-to-text) via RubyLLM.transcribe, which is great. I'm wondering if there are any plans to add the reverse direction: text-to-speech (TTS).
Both OpenAI and Azure expose TTS APIs that follow patterns similar to the chat/embedding APIs RubyLLM already wraps. Right now I have to drop out of RubyLLM to generate speech audio, using either the ruby-openai gem or raw Faraday calls depending on the provider.
OpenAI TTS (via ruby-openai)
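A minimal sketch of what this kind of call can look like, assuming the ruby-openai gem's `client.audio.speech` method. The `synthesize_with_openai` helper name, the `tts-1` model, and the `alloy` voice are illustrative assumptions:

```ruby
# Sketch only: thin wrapper around ruby-openai's audio.speech call.
# `synthesize_with_openai` is a hypothetical helper name; the model and
# voice defaults are assumed example values.
def synthesize_with_openai(client, text, voice: "alloy", model: "tts-1")
  # Returns whatever the client returns (the raw audio bytes with ruby-openai).
  client.audio.speech(
    parameters: {
      model: model,
      input: text,
      voice: voice,
      response_format: "mp3"
    }
  )
end

# Usage (needs the ruby-openai gem and an API key):
#   require "openai"
#   client = OpenAI::Client.new(access_token: ENV.fetch("OPENAI_API_KEY"))
#   File.binwrite("hello.mp3", synthesize_with_openai(client, "Hello from Ruby"))
```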
Azure Speech Services (via raw Faraday)
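A minimal sketch of the raw-HTTP route, assuming the Azure Speech REST endpoint and its SSML request body. The `azure_tts_request` helper name, the region, the voice, and the output format are illustrative assumptions:

```ruby
# Sketch only: builds the URL, headers, and SSML body for an Azure Speech
# text-to-speech request. `azure_tts_request` is a hypothetical helper name;
# the voice and output format are assumed example values.
def azure_tts_request(text, region:, key:, voice: "en-US-JennyNeural")
  # Escape &, <, and > so arbitrary text stays valid inside the SSML.
  ssml = "<speak version='1.0' xml:lang='en-US'>" \
         "<voice name='#{voice}'>#{text.encode(xml: :text)}</voice>" \
         "</speak>"
  {
    url: "https://#{region}.tts.speech.microsoft.com/cognitiveservices/v1",
    headers: {
      "Ocp-Apim-Subscription-Key" => key,
      "Content-Type" => "application/ssml+xml",
      "X-Microsoft-OutputFormat" => "audio-24khz-48kbitrate-mono-mp3",
      "User-Agent" => "tts-sketch"
    },
    body: ssml
  }
end

# Usage (needs the faraday gem and a Speech resource key):
#   require "faraday"
#   req = azure_tts_request("Hello", region: "westeurope", key: ENV.fetch("AZURE_SPEECH_KEY"))
#   res = Faraday.post(req[:url], req[:body], req[:headers])
#   File.binwrite("hello.mp3", res.body) if res.success?
```

Keeping the request construction separate from the HTTP call makes the SSML easy to inspect before anything is sent.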
What a RubyLLM API could look like
Something like this would be a natural complement to transcribe:
Why SSML matters
Both Azure and OpenAI support SSML to varying degrees, and it's essential for any non-trivial TTS use case: podcast-style multi-voice audio, controlled pacing, emphasis, etc. Azure in particular expects SSML as its native input format. Supporting a format: :ssml option (or auto-detecting <speak> tags) would make RubyLLM viable for these real-world scenarios without falling back to raw HTTP.
Context
I'm using RubyLLM for chat (works great with both OpenAI and Azure Foundry), but for TTS I still need a separate client. It would be nice to consolidate everything under one library, especially since RubyLLM already handles the provider abstraction, configuration, and token tracking so well.
Is this something on the roadmap? Happy to help if there's interest.