feat: integrate Google Cloud and Azure Text-to-Speech services #6828

roomote · 2025-08-07T23:28:03Z

This PR implements integration with Google Cloud Text-to-Speech and Microsoft Azure Speech Services as requested in #6827.

Summary of Changes

Core Implementation

Created a provider-based architecture for TTS services with a common interface
Implemented three TTS providers:
- Native (existing OS-based TTS using the say package)
- Google Cloud Text-to-Speech
- Microsoft Azure Speech Services
Added TtsManager to coordinate between different providers and handle provider switching
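
As a rough illustration of the provider-based design described above, here is a minimal sketch. The names `TtsProvider`, `TtsOptions`, `isConfigured`, `register`, and `getActiveProviderId` are assumptions made for this example, and the interface actually shipped in the PR may differ.

```typescript
// Hypothetical shapes for illustration only; not the PR's actual code.
type TtsProviderId = "native" | "google-cloud" | "azure"

interface TtsOptions {
	voice?: string
	speed?: number
}

// The common interface every provider implements.
interface TtsProvider {
	readonly id: TtsProviderId
	isConfigured(): boolean
	speak(text: string, options?: TtsOptions): Promise<void>
	stop(): void
}

class TtsManager {
	private providers = new Map<TtsProviderId, TtsProvider>()
	private activeId: TtsProviderId = "native" // native stays the default

	register(provider: TtsProvider): void {
		this.providers.set(provider.id, provider)
	}

	// Switch providers, refusing ones that are unknown or not configured.
	setActiveProvider(id: TtsProviderId): void {
		const provider = this.providers.get(id)
		if (!provider) throw new Error(`Unknown TTS provider: ${id}`)
		if (!provider.isConfigured()) throw new Error(`TTS provider not configured: ${id}`)
		this.activeId = id
	}

	getActiveProviderId(): TtsProviderId {
		return this.activeId
	}

	// Delegate to whichever provider is currently active.
	async speak(text: string, options?: TtsOptions): Promise<void> {
		const provider = this.providers.get(this.activeId)
		if (!provider) throw new Error(`No provider registered for: ${this.activeId}`)
		await provider.speak(text, options)
	}
}
```

Because a failed switch throws before `activeId` is changed, the previously active provider stays selected.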

UI Updates

Added provider selection dropdown in notification settings
Created configuration UI components for Google Cloud and Azure settings
Added secure input fields for API keys and configuration parameters

Settings & Configuration

Extended global settings to include TTS provider configuration
Added support for storing API keys and provider-specific settings
Updated message handlers to process new TTS-related settings

Key Features

Backward Compatibility: Default to native TTS, ensuring existing functionality remains intact
Provider Flexibility: Users can switch between providers based on their needs
Secure Configuration: API keys are handled securely through VSCode's state management
Voice Selection: Support for listing and selecting available voices from cloud providers
Error Handling: Graceful fallback to native TTS if cloud providers fail

Testing

All existing tests pass
Linting and type checking completed successfully
Manual testing recommended for cloud provider integration

Configuration Required

For Google Cloud TTS:

Enable Text-to-Speech API in Google Cloud Console
Create an API key
Enter credentials in settings

For Azure Speech Services:

Create Speech Services resource in Azure Portal
Copy subscription key and region
Enter credentials in settings
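
To show where these credentials end up, here is a minimal sketch that builds (but does not send) one request per service against the public REST endpoints. It assumes the REST APIs are used; the PR may call the vendor SDKs instead. The helper names, the `HttpRequest` shape, and the `User-Agent` value are made up for this example.

```typescript
// Client-independent description of an HTTP request (hypothetical shape).
interface HttpRequest {
	url: string
	method: "POST"
	headers: Record<string, string>
	body: string
}

// Google Cloud Text-to-Speech: v1 text:synthesize, authenticated with an API key.
function buildGoogleTtsRequest(apiKey: string, text: string, voiceName: string, languageCode: string): HttpRequest {
	return {
		url: `https://texttospeech.googleapis.com/v1/text:synthesize?key=${encodeURIComponent(apiKey)}`,
		method: "POST",
		headers: { "Content-Type": "application/json" },
		body: JSON.stringify({
			input: { text },
			voice: { languageCode, name: voiceName },
			audioConfig: { audioEncoding: "MP3" },
		}),
	}
}

// Escape characters that are significant in XML so user text cannot break the SSML.
function escapeXml(value: string): string {
	return value
		.replace(/&/g, "&amp;")
		.replace(/</g, "&lt;")
		.replace(/>/g, "&gt;")
		.replace(/"/g, "&quot;")
		.replace(/'/g, "&apos;")
}

// Azure Speech Services: regional endpoint, subscription key header, SSML body.
function buildAzureTtsRequest(subscriptionKey: string, region: string, text: string, voiceName: string, lang: string): HttpRequest {
	const ssml =
		`<speak version="1.0" xml:lang="${escapeXml(lang)}">` +
		`<voice name="${escapeXml(voiceName)}">${escapeXml(text)}</voice></speak>`
	return {
		url: `https://${region}.tts.speech.microsoft.com/cognitiveservices/v1`,
		method: "POST",
		headers: {
			"Ocp-Apim-Subscription-Key": subscriptionKey,
			"Content-Type": "application/ssml+xml",
			"X-Microsoft-OutputFormat": "audio-16khz-128kbitrate-mono-mp3",
			"User-Agent": "tts-sketch", // placeholder application name
		},
		body: ssml,
	}
}
```

The Google response is JSON carrying base64 audio in `audioContent`, while the Azure response body is the audio itself, so the two providers need different decoding steps before playback.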

Fixes #6827

Important

Integrates Google Cloud and Azure TTS services with a provider-based architecture, UI updates for configuration, and secure handling of API keys.

Core Implementation:
- Introduces a provider-based architecture for TTS services with a common interface.
- Implements NativeTtsProvider, GoogleCloudTtsProvider, and AzureTtsProvider in TtsManager.
- Adds TtsManager to manage providers and handle TTS operations.
UI Updates:
- Adds TTS provider selection dropdown in NotificationSettings.tsx.
- Implements GoogleCloudTtsSettings.tsx and AzureTtsSettings.tsx for provider-specific configurations.
Settings & Configuration:
- Extends global-settings.ts to include TTS provider configurations and API keys.
- Updates webviewMessageHandler.ts to handle TTS-related settings and provider initialization.
Key Features:
- Ensures backward compatibility by defaulting to native TTS.
- Allows users to switch between TTS providers.
- Securely handles API keys using VSCode's state management.
- Supports voice selection from cloud providers.
- Provides error handling with fallback to native TTS if cloud providers fail.

This description was created by ellipsis-dev for 4112aa6. It will automatically update as commits are pushed.

- Add TTS provider interface and implementations for native, Google Cloud, and Azure
- Create TtsManager to coordinate between different TTS providers
- Update UI to allow provider selection and configuration
- Add settings for API keys and provider-specific configuration
- Maintain backward compatibility with existing native TTS functionality

Fixes #6827

ellipsis-dev · 2025-08-07T23:30:43Z

src/core/webview/ClineProvider.ts

+			} = state
+
+			// Initialize TTS manager with provider configuration
+			await initializeTts({


In the TTS initialization block, consider adding additional logging or error-handling around the initializeTts() call so that any configuration issues (e.g. missing API keys) are clearly logged.
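
One way to surface configuration issues before calling initializeTts() is a small pre-flight check. This is only a sketch: the `TtsConfig` field names (`azureApiKey`, `azureRegion`, and so on) and both helper functions are assumptions, not names taken from the PR.

```typescript
type TtsProviderId = "native" | "google-cloud" | "azure"

// Hypothetical config shape; the real one passed to initializeTts() may differ.
interface TtsConfig {
	provider: TtsProviderId
	googleCloudApiKey?: string
	azureApiKey?: string
	azureRegion?: string
}

// Returns human-readable problems; an empty list means the config looks usable.
function findTtsConfigProblems(config: TtsConfig): string[] {
	const problems: string[] = []
	if (config.provider === "google-cloud" && !config.googleCloudApiKey?.trim()) {
		problems.push("Google Cloud TTS is selected but no API key is set")
	}
	if (config.provider === "azure") {
		if (!config.azureApiKey?.trim()) problems.push("Azure TTS is selected but no subscription key is set")
		if (!config.azureRegion?.trim()) problems.push("Azure TTS is selected but no region is set")
	}
	return problems
}

// Logs every problem and reports whether initialization should proceed.
function reportTtsConfigProblems(config: TtsConfig, log: (message: string) => void): boolean {
	const problems = findTtsConfigProblems(config)
	problems.forEach((problem) => log(`[TTS] ${problem}`))
	return problems.length === 0
}
```

The native provider needs no credentials, so it never produces a problem here.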

ellipsis-dev · 2025-08-07T23:30:44Z

src/core/webview/webviewMessageHandler.ts

+			await updateGlobalState("googleCloudTtsApiKey", googleCloudApiKey)
+			// Re-initialize TTS with new config
+			const gcState = await provider.getState()
+			await initializeTts({


For new TTS configuration message handlers (e.g. for 'ttsProvider', 'googleCloudTtsApiKey', etc.), consider wrapping the re-initialization calls to initializeTts() in try/catch blocks. This will help prevent an unhandled error from crashing the handler and will allow a user‐friendly error message.
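
A minimal sketch of such a wrapper, assuming initializeTts() returns a promise and rejects on bad configuration. The `reinitializeTts` helper, the `ReinitResult` type, and the message text are hypothetical.

```typescript
// Result the handler can turn into a notification instead of crashing.
type ReinitResult = { ok: true } | { ok: false; userMessage: string }

async function reinitializeTts<C>(
	config: C,
	initializeTts: (config: C) => Promise<void>,
	log: (message: string) => void,
): Promise<ReinitResult> {
	try {
		await initializeTts(config)
		return { ok: true }
	} catch (error) {
		// Keep the technical detail in the log, not in the user-facing text.
		const detail = error instanceof Error ? error.message : String(error)
		log(`TTS re-initialization failed: ${detail}`)
		return {
			ok: false,
			userMessage: "Could not apply the text-to-speech settings. Check the credentials and try again.",
		}
	}
}
```

Each handler case ('ttsProvider', 'googleCloudTtsApiKey', and so on) could then share this one helper rather than repeating its own try/catch.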

roomote

I reviewed my own code and found bugs I introduced 5 minutes ago. Classic.

roomote · 2025-08-07T23:32:51Z

packages/types/src/global-settings.ts

 	ttsSpeed: z.number().optional(),
+	ttsProvider: z.enum(["native", "google-cloud", "azure"]).optional(),
+	ttsVoice: z.string().optional(),
+	googleCloudTtsApiKey: z.string().optional(),


Security concern: These API keys should be stored using VSCode's SecretStorage API instead of global state. Storing sensitive credentials in global state could expose them through settings sync or exports.

Consider moving these to SecretStorage instead of adding them to the global settings schema.
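
A minimal sketch of the SecretStorage approach. In an extension, `context.secrets` (a `vscode.SecretStorage`) provides `get`, `store`, and `delete`; the local `SecretStore` interface below mirrors only those three methods so the sketch compiles without the `vscode` module. The helper names and the secret name are assumptions.

```typescript
// Subset of the vscode.SecretStorage surface used here.
interface SecretStore {
	get(key: string): PromiseLike<string | undefined>
	store(key: string, value: string): PromiseLike<void>
	delete(key: string): PromiseLike<void>
}

// Hypothetical secret name, reusing the settings key from the diff.
const GOOGLE_TTS_SECRET = "googleCloudTtsApiKey"

// Store a non-empty key; treat empty or undefined input as "remove the key".
async function saveApiKey(secrets: SecretStore, name: string, value: string | undefined): Promise<void> {
	const trimmed = value?.trim()
	if (trimmed) {
		await secrets.store(name, trimmed)
	} else {
		await secrets.delete(name)
	}
}

async function loadApiKey(secrets: SecretStore, name: string): Promise<string | undefined> {
	return await secrets.get(name)
}
```

With this split, the zod schema would keep only non-sensitive fields such as `ttsProvider` and `ttsVoice`, and the handler would call `saveApiKey(context.secrets, GOOGLE_TTS_SECRET, value)` instead of `updateGlobalState`.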

roomote · 2025-08-07T23:32:52Z

src/services/tts/TtsManager.ts

+
+		try {
+			if (!this.activeProvider) {
+				await this.setActiveProvider("native")


Missing error recovery here. If the active provider is null and setActiveProvider fails, we should have a fallback mechanism. Could we try falling back to native TTS instead of silently failing?
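
A minimal sketch of an explicit fallback path, assuming each provider exposes an async `speak`. The `Speaker` interface and `speakWithFallback` are hypothetical names, not the PR's code.

```typescript
interface Speaker {
	speak(text: string): Promise<void>
}

// Try the active provider first; on failure, log and fall back to native.
// Errors from the native provider propagate so they are never silently dropped.
async function speakWithFallback(
	text: string,
	active: Speaker | undefined,
	native: Speaker,
	log: (message: string) => void,
): Promise<"active" | "native"> {
	if (active && active !== native) {
		try {
			await active.speak(text)
			return "active"
		} catch (error) {
			const detail = error instanceof Error ? error.message : String(error)
			log(`TTS provider failed, falling back to native: ${detail}`)
		}
	}
	await native.speak(text)
	return "native"
}
```

Returning which provider actually spoke lets the caller decide whether to notify the user that the cloud provider was skipped.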

roomote · 2025-08-07T23:32:52Z

src/core/webview/webviewMessageHandler.ts

 			await provider.postStateToWebview()
 			if (message.value !== undefined) {
-				Terminal.setShellIntegrationTimeout(message.value)
+				Terminal.setShellIntegrationTimeout(Number(message.value))


Type coercion without validation could result in NaN values. Should we validate the input before converting to ensure we don't store invalid numbers?
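
A minimal sketch of validating before converting. The `parseTimeoutMs` helper and its bounds are assumptions chosen for illustration. Note that `Number("")` and `Number(null)` both evaluate to `0`, so those inputs are rejected explicitly rather than coerced.

```typescript
// Returns a finite number within [min, max], or undefined for invalid input.
function parseTimeoutMs(value: unknown, min = 0, max = 600000): number | undefined {
	let parsed: number
	if (typeof value === "number") {
		parsed = value
	} else if (typeof value === "string" && value.trim() !== "") {
		parsed = Number(value)
	} else {
		return undefined // null, undefined, empty strings, objects, booleans
	}
	if (!Number.isFinite(parsed) || parsed < min || parsed > max) return undefined
	return parsed
}
```

The handler would then call `Terminal.setShellIntegrationTimeout` only when the result is not `undefined`.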

roomote · 2025-08-07T23:32:52Z

src/services/tts/providers/google-cloud.ts

+				input: { text },
+				voice: {
+					languageCode: "en-US",
+					name: options?.voice || "en-US-Neural2-F",


Hardcoded default voice might not be available. Should we validate against available voices first or handle the error if this voice doesn't exist?
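
A minimal sketch of resolving the voice against a previously fetched list. The simplified `Voice` shape and the `resolveVoice` helper are assumptions; the voice objects returned by the real service carry more fields.

```typescript
// Simplified voice entry (hypothetical shape).
interface Voice {
	name: string
	languageCode: string
}

// Pick the requested voice if it exists, then the preferred default,
// then any voice for the language. Returns undefined when nothing matches.
function resolveVoice(
	requested: string | undefined,
	available: Voice[],
	languageCode: string,
	preferredDefault: string,
): string | undefined {
	const names = new Set(available.map((voice) => voice.name))
	if (requested && names.has(requested)) return requested
	if (names.has(preferredDefault)) return preferredDefault
	const sameLanguage = available.find((voice) => voice.languageCode === languageCode)
	return sameLanguage?.name
}
```

An `undefined` result gives the caller a clear signal to report the problem or fall back to native TTS rather than sending a request that is likely to fail.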

roomote · 2025-08-07T23:32:52Z

src/services/tts/providers/google-cloud.ts

+				const path = require("path")
+				const os = require("os")
+
+				const tempFile = path.join(os.tmpdir(), `tts-${Date.now()}.mp3`)


Potential temp file leak: if an error occurs between creating the temp file and deleting it, the file is left behind in the OS temp directory. Consider using a try-finally block to ensure cleanup happens even if playback fails.
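
A minimal sketch of the try-finally pattern. To keep it self-contained, file access goes through a small `FileOps` interface (an assumption for this example); Node's `fs.promises` has `writeFile` and `unlink` methods compatible with it. The `playViaTempFile` helper and the `play` callback are hypothetical.

```typescript
// The two file operations the sketch needs.
interface FileOps {
	writeFile(path: string, data: Uint8Array): Promise<void>
	unlink(path: string): Promise<void>
}

// Write audio to a temp file, play it, and always attempt cleanup afterwards.
async function playViaTempFile(
	files: FileOps,
	tempPath: string,
	audio: Uint8Array,
	play: (path: string) => Promise<void>,
): Promise<void> {
	try {
		await files.writeFile(tempPath, audio)
		await play(tempPath)
	} finally {
		// Best-effort cleanup: a failed unlink must not mask the original error.
		await files.unlink(tempPath).catch(() => undefined)
	}
}
```

Errors from writing or playback still reach the caller; only cleanup failures are swallowed.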

roomote · 2025-08-07T23:32:52Z

webview-ui/src/components/settings/GoogleCloudTtsSettings.tsx

+
+	return (
+		<div className="flex flex-col gap-4 p-4 border border-vscode-panel-border rounded">
+			<h4 className="text-sm font-semibold">Google Cloud TTS Configuration</h4>


Missing i18n support. These hardcoded strings should use the translation system for consistency with the rest of the application.

roomote · 2025-08-07T23:32:52Z

webview-ui/src/components/settings/AzureTtsSettings.tsx

+
+	return (
+		<div className="flex flex-col gap-4 p-4 border border-vscode-panel-border rounded">
+			<h4 className="text-sm font-semibold">Azure Speech Services Configuration</h4>


Missing i18n support here as well. All user-facing strings should be translatable.
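
A minimal sketch of key-based lookup for these headings, covering both settings components. This is a plain lookup table, not the project's actual translation system; the key names, the `createTranslator` helper, and the German strings are assumptions for illustration.

```typescript
type Translations = Record<string, string>

// Hypothetical keys; English acts as the fallback locale.
const en: Translations = {
	"settings:tts.googleCloud.title": "Google Cloud TTS Configuration",
	"settings:tts.azure.title": "Azure Speech Services Configuration",
}

const de: Translations = {
	"settings:tts.googleCloud.title": "Google Cloud TTS-Konfiguration",
}

// Look up a key in the active locale, then the fallback, then return the key itself.
function createTranslator(active: Translations, fallback: Translations): (key: string) => string {
	return (key) => active[key] ?? fallback[key] ?? key
}
```

In the components, the heading would then render `{t("settings:tts.googleCloud.title")}` instead of the literal string, with `t` coming from whatever translation hook the webview already uses.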

daniel-lxs · 2025-08-11T17:07:27Z

Closing, the author of the issue will implement it

roomote bot requested review from cte, jr and mrubens as code owners August 7, 2025 23:28

github-project-automation bot added this to Roo Code Roadmap and Roo Code Roadmap Aug 7, 2025

github-project-automation bot moved this to New in Roo Code Roadmap Aug 7, 2025

github-project-automation bot moved this to Triage in Roo Code Roadmap Aug 7, 2025

dosubot bot added the size:XXL (This PR changes 1000+ lines, ignoring generated files.), enhancement (New feature or request), and UI/UX (UI/UX related or focused) labels Aug 7, 2025

roomote bot mentioned this pull request Aug 7, 2025 in Feature: Integrate Google Cloud and Microsoft Azure Text-to-Speech Services #6827 (Closed, 4 tasks)

ellipsis-dev bot reviewed Aug 7, 2025

roomote bot commented Aug 7, 2025

hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Aug 8, 2025

daniel-lxs closed this Aug 11, 2025

github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Aug 11, 2025

github-project-automation bot moved this from New to Done in Roo Code Roadmap Aug 11, 2025
