-
Notifications
You must be signed in to change notification settings - Fork 2.6k
feat: integrate Google Cloud and Azure Text-to-Speech services #6828
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add TTS provider interface and implementations for native, Google Cloud, and Azure - Create TtsManager to coordinate between different TTS providers - Update UI to allow provider selection and configuration - Add settings for API keys and provider-specific configuration - Maintain backward compatibility with existing native TTS functionality Fixes #6827
| } = state | ||
|
|
||
| // Initialize TTS manager with provider configuration | ||
| await initializeTts({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the TTS initialization block, consider adding additional logging or error-handling around the initializeTts() call so that any configuration issues (e.g. missing API keys) are clearly logged.
| await updateGlobalState("googleCloudTtsApiKey", googleCloudApiKey) | ||
| // Re-initialize TTS with new config | ||
| const gcState = await provider.getState() | ||
| await initializeTts({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For new TTS configuration message handlers (e.g. for 'ttsProvider', 'googleCloudTtsApiKey', etc.), consider wrapping the re-initialization calls to initializeTts() in try/catch blocks. This will help prevent an unhandled error from crashing the handler and will allow a user‐friendly error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed my own code and found bugs I introduced 5 minutes ago. Classic.
| ttsSpeed: z.number().optional(), | ||
| ttsProvider: z.enum(["native", "google-cloud", "azure"]).optional(), | ||
| ttsVoice: z.string().optional(), | ||
| googleCloudTtsApiKey: z.string().optional(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Security concern: These API keys should be stored using VSCode's SecretStorage API instead of global state. Storing sensitive credentials in global state could expose them through settings sync or exports.
Consider moving these to SecretStorage instead of adding them to the global settings schema.
|
|
||
| try { | ||
| if (!this.activeProvider) { | ||
| await this.setActiveProvider("native") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing error recovery here. If the active provider is null and setActiveProvider fails, we should have a fallback mechanism. Could we try falling back to native TTS instead of silently failing?
| await provider.postStateToWebview() | ||
| if (message.value !== undefined) { | ||
| Terminal.setShellIntegrationTimeout(message.value) | ||
| Terminal.setShellIntegrationTimeout(Number(message.value)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type coercion without validation could result in NaN values. Should we validate the input before converting to ensure we don't store invalid numbers?
| input: { text }, | ||
| voice: { | ||
| languageCode: "en-US", | ||
| name: options?.voice || "en-US-Neural2-F", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded default voice might not be available. Should we validate against available voices first or handle the error if this voice doesn't exist?
| const path = require("path") | ||
| const os = require("os") | ||
|
|
||
| const tempFile = path.join(os.tmpdir(), `tts-${Date.now()}.mp3`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential memory leak if an error occurs between creating the temp file and deleting it. Consider using a try-finally block to ensure cleanup happens even if playback fails.
|
|
||
| return ( | ||
| <div className="flex flex-col gap-4 p-4 border border-vscode-panel-border rounded"> | ||
| <h4 className="text-sm font-semibold">Google Cloud TTS Configuration</h4> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing i18n support. These hardcoded strings should use the translation system for consistency with the rest of the application.
|
|
||
| return ( | ||
| <div className="flex flex-col gap-4 p-4 border border-vscode-panel-border rounded"> | ||
| <h4 className="text-sm font-semibold">Azure Speech Services Configuration</h4> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing i18n support here as well. All user-facing strings should be translatable.
|
Closing, the author of the issue will implement it |
This PR implements integration with Google Cloud Text-to-Speech and Microsoft Azure Speech Services as requested in #6827.
Summary of Changes
Core Implementation
saypackage)TtsManagerto coordinate between different providers and handle provider switchingUI Updates
Settings & Configuration
Key Features
Testing
Configuration Required
For Google Cloud TTS:
For Azure Speech Services:
Fixes #6827
Important
Integrates Google Cloud and Azure TTS services with a provider-based architecture, UI updates for configuration, and secure handling of API keys.
NativeTtsProvider,GoogleCloudTtsProvider, andAzureTtsProviderinTtsManager.TtsManagerto manage providers and handle TTS operations.NotificationSettings.tsx.GoogleCloudTtsSettings.tsxandAzureTtsSettings.tsxfor provider-specific configurations.global-settings.tsto include TTS provider configurations and API keys.webviewMessageHandler.tsto handle TTS-related settings and provider initialization.This description was created by
for 4112aa6. You can customize this summary. It will automatically update as commits are pushed.