feat(stage-ui): add abstract base interfaces for transcription and sp…#961
lockrush-dev wants to merge 48 commits into moeru-ai:main from …
…eech providers
- Add BaseTranscriptionProviderDefinition and BaseSpeechProviderDefinition interfaces
- Implement provider adapters to convert between ProviderMetadata and base interfaces
- Add converter functions to integrate base providers with existing providers store
- Implement OpenAI and OpenAI Compatible providers using new base interfaces
- Refactor OpenAI providers to use structured provider definitions
This architectural improvement provides a consistent contract for transcription and speech providers, making it easier to add new providers and reduce code duplication in validation and configuration logic.
Summary of Changes
This pull request significantly refactors the provider architecture within the …
Code Review
This pull request introduces a solid architectural improvement by creating base interfaces for transcription and speech providers. The refactoring of OpenAI and OpenAI-compatible providers to use these new interfaces significantly reduces code duplication in the main providers.ts store and improves consistency. The use of converter functions to bridge the new provider definitions with the existing metadata-based system is a good integration strategy.
My review includes a few suggestions to further enhance maintainability and type safety:
- Removing newly added but unused adapter files.
- Replacing `as any` type assertions with more specific types in the converter functions to improve type safety.
- Consolidating duplicated URL normalization logic into a shared utility.
packages/stage-ui/src/libs/providers/providers/openai-compatible/speech.ts
Resolved conflicts by keeping the new base provider interface implementation using convertSpeechProviderToMetadata and convertTranscriptionProviderToMetadata. Updated provider implementations to include the latest models and voices from upstream:
- Added gpt-4o-mini-tts-2025-12-15 model to OpenAI Speech
- Added gpt-4o-mini-transcribe-2025-12-15 and gpt-4o-transcribe-diarize to OpenAI Transcription
- Updated voice compatibleModels to include the new model
- Improved model descriptions
Remove createSpeechProviderAdapter and createTranscriptionProviderAdapter as they are not used anywhere in the codebase. These adapters were intended to convert from ProviderMetadata to BaseProviderDefinition, but all current providers are implemented directly using the base interfaces. The adapters can be re-added later if needed for migrating legacy providers.
- Extract normalizeBaseUrl to shared utility to reduce duplication
- Fix type safety in converters.ts by using specific types instead of 'as any'
- Remove redundant async/await wrappers in converter functions
- Update all provider implementations to use shared normalizeBaseUrl utility
This addresses code review feedback about code duplication and type safety.
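For readers following the thread, a minimal sketch of what a shared `normalizeBaseUrl` helper could look like. The behavior shown (trim whitespace, guarantee exactly one trailing slash) is an assumption based on the default `https://api.openai.com/v1/` mentioned later in this PR, not the actual implementation.

```typescript
// Hypothetical sketch: the PR's actual normalizeBaseUrl may behave differently.
// Assumed behavior: trim whitespace and guarantee exactly one trailing slash.
function normalizeBaseUrl(baseUrl: string): string {
  const trimmed = baseUrl.trim()
  // Leave empty input empty so callers can still detect "not configured".
  if (trimmed === '')
    return ''
  // Collapse any run of trailing slashes, then append a single one.
  return `${trimmed.replace(/\/+$/, '')}/`
}

console.log(normalizeBaseUrl(' https://api.openai.com/v1 '))
```

Keeping this in one place means every provider resolves relative endpoint paths against the same URL shape.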
Since you've completed so many tasks on the speech & transcription APIs, would you like to try extending this work to use the https://github.com/n1n-api/airi/tree/feat/n1n-provider/packages/stage-ui/src/libs/providers/providers pattern to …
I can certainly try! I'll try to take care of that as part of this pull request.
…ription providers
- Create registry-speech.ts and registry-transcription.ts with defineSpeechProvider() and defineTranscriptionProvider() helpers
- Refactor OpenAI and OpenAI-compatible speech/transcription providers to use new registry pattern
- Export registry functions (listSpeechProviders, listTranscriptionProviders, etc.) for programmatic discovery
- Follows same pattern as existing defineProvider() for chat providers
This provides a consistent architecture across all provider types and enables centralized discovery of speech and transcription providers.
- Fix undefined title type errors in settings.vue layout (WindowTitleBar and PageHeader)
- Ensure title is always a string in settings index page (IconItem)
- Remove duplicate 'li' property in docs/uno.config.ts
These fixes resolve CI build failures in stage-tamagotchi typecheck.
- Ensure routeHeaderMetadata.title is always a string when routeHeaderMetadata exists
- Add conditional check in template to only render PageHeader when title exists
- Fixes 'string | undefined' is not assignable to type 'string' error
- Keep both 'ul' and 'li' styles in docs/uno.config.ts
- Auto-merged settings.vue and index.vue files (conflicts resolved automatically)
- All TypeScript fixes preserved
…fineProvider pattern
Refactor OpenAI and OpenAI-compatible speech/transcription providers to use the unified `defineProvider()` pattern (matching PR moeru-ai#968 n1n provider pattern) instead of separate registries. This unifies all providers under a single consistent API.
Changes:
- Refactor OpenAI speech/transcription providers to use `defineProvider()` with `tasks` and `extraMethods` instead of `defineSpeechProvider`/`defineTranscriptionProvider`
- Refactor OpenAI-compatible speech/transcription providers similarly
- Add `convertProviderDefinitionToMetadata()` converter function to bridge unified pattern with existing store's `ProviderMetadata` format
- Update store to use new converter function instead of old separate converters
- Mark old base interfaces and registry exports as deprecated for backward compatibility
- Fix settings index page to filter out routes with empty titles (preventing empty menu items from rendering)
BREAKING CHANGE: Speech and transcription providers now use the unified `defineProvider()` pattern. Old `defineSpeechProvider` and `defineTranscriptionProvider` functions are deprecated but still available for backward compatibility.
Refs: PR moeru-ai#961, PR moeru-ai#968
66beb66 to c58ccbd
- Fix t function parameter type to use ComposerTranslation
- Add wrapper function to adapt ComposerTranslation signature
- Fix createProvider return type casting
- Remove unused imports
…in converters
- Add explicit type cast for contextOptions in getValidatorsOfProvider call
- Ensure t is properly typed as ComposerTranslation when passed to validators
Prefill base URL for openai-audio-speech and openai-audio-transcription so settings pages load with https://api.openai.com/v1/ by default.
Document that hard-coded default models in speech/hearing stores should be moved to provider-owned defaults (metadata/schema).
@gemini-code-assist review
Code Review
This pull request is an excellent and significant refactoring that unifies speech and transcription providers under the defineProvider() pattern. This greatly improves architectural consistency, reduces code duplication, and enhances maintainability. The changes are well-structured, and the use of a converter function to bridge the new provider definitions with the existing store is a smart approach for gradual migration.
My review includes a few suggestions for improvement: one for code simplification in a Vue component, and a high-severity comment on a potential fragility in the new provider converter that could affect future development.
Avoid passing '{}' into extraMethods and simplify settings PageHeader bindings.
…nified defineProvider pattern
- Migrate all 7 speech providers to unified defineProvider() pattern:
  - elevenlabs
  - deepgram-tts
  - microsoft-speech
  - index-tts-vllm
  - alibaba-cloud-model-studio
  - volcengine
  - player2-speech
- Migrate all 2 transcription providers to unified defineProvider() pattern:
  - browser-web-speech-api
  - aliyun-nls-transcription
- Fix convertProviderDefinitionToMetadata bug where nameKey/descriptionKey were storing translated strings instead of i18n keys. Use identity function (keyExtractor) instead of translator (tWrapper) to extract i18n key strings.
- Update providers.ts to use convertProviderDefinitionToMetadata for all migrated providers instead of old pattern.
- Register all migrated providers in providers/index.ts
This completes the migration requested by Niko in PR moeru-ai#968 to unify provider patterns across the codebase.
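The nameKey/descriptionKey fix above can be illustrated with a small sketch. It assumes provider definitions expose their display name as a function of a translate callback; the types, the `toMetadata` helper, and the i18n key below are hypothetical stand-ins, not the project's real signatures.

```typescript
// All names here are illustrative; the real converter in the PR has a richer signature.
type Translate = (key: string) => string

interface LocalizedDefinition {
  id: string
  // Definitions resolve their display name through whatever `t` they are given.
  name: (t: Translate) => string
}

// Passing the identity function as `t` makes the definition return the i18n key itself.
const keyExtractor: Translate = key => key

function toMetadata(def: LocalizedDefinition): { id: string, nameKey: string } {
  // Using a real translator here would store a translated string in nameKey (the bug).
  return { id: def.id, nameKey: def.name(keyExtractor) }
}

// Hypothetical i18n key and translation table, for demonstration only.
const translations: Record<string, string> = { 'providers.elevenlabs.title': 'ElevenLabs' }
const translator: Translate = key => translations[key] ?? key

const elevenlabs: LocalizedDefinition = {
  id: 'elevenlabs',
  name: t => t('providers.elevenlabs.title'),
}

console.log(toMetadata(elevenlabs).nameKey, '|', elevenlabs.name(translator))
```

Storing the key rather than the translated string lets the UI re-translate when the locale changes.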
- Fix import paths: change ProviderValidationResult from base-types to types
- Update validator signatures to accept contextOptions parameter
- Fix error format: use Array<{error, errorKey?}> and include reasonKey
- Fix type casting: use 'as unknown as' for VoiceProviderWithExtraOptions
- Fix Zod enum: use literal array instead of readonly const with errorMap
- Fix unused parameters: prefix with underscore (_config, _contextOptions)
- Fix type names: ElevenLabsConfig, DeepgramConfig, etc.
- Fix Microsoft Speech region handling with default value
@gemini-code-assist review
Code Review
This pull request significantly refactors the speech and transcription providers to align with the unified defineProvider() pattern, enhancing architectural consistency and reducing code duplication. The introduction of convertProviderDefinitionToMetadata() effectively bridges the new pattern with the existing store's ProviderMetadata format. Additionally, the changes correctly address the issue of empty menu items in the settings UI and improve model/voice selection consistency when switching providers. The overall direction of these changes is positive, leading to a more maintainable and extensible provider system.
…xy server guidance
- Update default base URL from unspeech.hyp3r.link to api.elevenlabs.io/v1/
- Add migration logic to fix old incorrect base URLs (unspeech.hyp3r.link, api.elevenlabs.io/v2/)
- Add informational Alert component explaining proxy server requirements for web browsers
- Disable voice dropdown until API key is configured
- Improve apiKeyConfigured validation to check for non-empty trimmed strings
- Add documentation about CORS limitations and proxy server usage
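A rough sketch of the base URL migration and the API key check described in this commit. The function names, the `https://` scheme, the substring matching, and the fallback for empty values are all assumptions for illustration; only the host names and the "non-empty trimmed string" rule come from the commit message.

```typescript
// Assumed default; the commit message gives the host and path, the scheme is a guess.
const ELEVENLABS_DEFAULT_BASE_URL = 'https://api.elevenlabs.io/v1/'

// Values the commit message names as outdated; matching by substring is an assumption.
const LEGACY_BASE_URL_MARKERS = ['unspeech.hyp3r.link', 'api.elevenlabs.io/v2']

// Hypothetical helper: replace legacy or missing base URLs with the current default.
function migrateElevenLabsBaseUrl(baseUrl: string | undefined): string {
  const value = (baseUrl ?? '').trim()
  if (value === '' || LEGACY_BASE_URL_MARKERS.some(marker => value.includes(marker)))
    return ELEVENLABS_DEFAULT_BASE_URL
  // Anything else (for example a user-supplied proxy) is left untouched.
  return value
}

// Hypothetical helper: a key counts as configured only if it is a non-empty trimmed string.
function isApiKeyConfigured(apiKey: unknown): boolean {
  return typeof apiKey === 'string' && apiKey.trim().length > 0
}

console.log(migrateElevenLabsBaseUrl('https://unspeech.hyp3r.link/v1/'), isApiKeyConfigured('  '))
```

A check like `isApiKeyConfigured` is also a natural gate for disabling the voice dropdown until a key is present.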
…youts
- Migrate comet-api-transcription and openai-compatible-audio-transcription to use TranscriptionProviderSettings wrapper
- Ensure all providers use consistent side-by-side layout (settings 40%, playground 60%)
- Move validation alerts to advanced-settings slot for better organization
- Simplify credential management by leveraging wrapper components
- All speech providers already use SpeechProviderSettings wrapper consistently
- Maintain custom layouts for browser-web-speech-api and aliyun-nls-transcription due to specialized UIs
…m/lockrush-dev/airi into feat/provider-abstract-interfaces
…API model lists
- Fix Live2D scale reactivity: restore scale from storeToRefs to maintain reactivity to store updates while allowing prop overrides
- Fix icon field handling: restore ?? '' fallback in settings index pages to prevent undefined icon values
- Add static model lists for CometAPI providers:
  - Speech: 9 TTS models (TTS, Kling TTS, GPT-4o variants, etc.)
  - Transcription: 12 STT models (Gemini variants, Whisper-1, GPT-4o variants)
- Update CometAPI UI: add model dropdowns to both speech and transcription settings pages (replacing manual input for transcription)
- Refactor: create useProviderConfig composable to eliminate code duplication across provider settings pages
- Update all provider pages to use useProviderConfig composable for consistent API key/base URL validation logic
…m/lockrush-dev/airi into feat/provider-abstract-interfaces
Refactor All Speech and Transcription Providers to Unified `defineProvider()` Pattern
Summary
This PR refactors all speech and transcription providers to use the unified `defineProvider()` pattern (as established in PR #968), improving architectural consistency, reducing code duplication, and enabling consistent provider discovery across all provider types.
Changes
Core Architecture
- … `defineProvider()` pattern instead of separate registries
- … `convertProviderDefinitionToMetadata()` to bridge the unified `defineProvider()` pattern with the existing store's `ProviderMetadata` format
- … `tasks` array (e.g., `['text-to-speech', 'speech']` for speech providers, `['speech-to-text', 'automatic-speech-recognition', 'asr', 'stt']` for transcription providers)
- … `listModels` and `listVoices` are now defined in `extraMethods` within the unified pattern
- … `BaseSpeechProviderDefinition` and `BaseTranscriptionProviderDefinition` interfaces as deprecated (kept for backward compatibility)
- … `normalizeBaseUrl` utility function to eliminate code duplication across provider implementations
Provider Implementations
Speech Providers (9 total) - All Migrated ✅
- … `defineProvider()` with `tasks: ['text-to-speech', 'speech']` and `extraMethods` for `listModels` and `listVoices`
- … `defineProvider()` with `tasks: ['text-to-speech', 'speech']` and `extraMethods` for `listModels` and `listVoices`
Transcription Providers (4 total) - All Migrated ✅
- … `defineProvider()` with `tasks: ['speech-to-text', 'automatic-speech-recognition', 'asr', 'stt']` and `extraMethods` for `listModels`
- … `defineProvider()` with `tasks: ['speech-to-text', 'automatic-speech-recognition', 'asr', 'stt']` and `extraMethods` for `listModels`
All providers are now automatically registered when imported via the unified registry and can be discovered via `listProviders()` and `getDefinedProvider()`.
Code Quality Improvements
- … `ComposerTranslation` types, validator signatures, and provider instance type casting
- … `reasonKey` support and proper `contextOptions` parameter handling
Settings UI & Store Improvements
Benefits
- … `defineProvider()` pattern
- … `defineProvider()` pattern
- … `listProviders()` and `getDefinedProvider()` functions
Technical Details
Provider Definition Pattern
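A minimal sketch of the pattern, assuming a simplified registry. The `ProviderDefinition` shape and the bodies of `defineProvider()`, `listProviders()`, and `getDefinedProvider()` are hypothetical stand-ins written for illustration; only the function names, the `tasks` values, the `extraMethods` hooks, and the provider and model ids come from this PR. The voice entry is illustrative.

```typescript
// Hypothetical, simplified stand-ins for the project's defineProvider() API.
interface NamedItem { id: string, name: string }

interface ProviderDefinition {
  id: string
  tasks: string[]
  extraMethods?: {
    listModels?: () => Promise<NamedItem[]>
    listVoices?: () => Promise<NamedItem[]>
  }
}

// A single registry shared by every provider type.
const registry = new Map<string, ProviderDefinition>()

// Defining a provider also registers it, so importing the module is enough for discovery.
function defineProvider(definition: ProviderDefinition): ProviderDefinition {
  registry.set(definition.id, definition)
  return definition
}

function listProviders(): ProviderDefinition[] {
  return Array.from(registry.values())
}

function getDefinedProvider(id: string): ProviderDefinition | undefined {
  return registry.get(id)
}

defineProvider({
  id: 'openai-audio-speech',
  tasks: ['text-to-speech', 'speech'],
  extraMethods: {
    listModels: async () => [{ id: 'gpt-4o-mini-tts-2025-12-15', name: 'GPT-4o mini TTS' }],
    listVoices: async () => [{ id: 'alloy', name: 'Alloy' }], // illustrative voice entry
  },
})

// Speech providers are discovered by filtering on the declared tasks.
const speechProviders = listProviders().filter(p => p.tasks.includes('text-to-speech'))
console.log(speechProviders.map(p => p.id))
```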
Store Integration
The `convertProviderDefinitionToMetadata()` function converts unified provider definitions to the store's `ProviderMetadata` format, enabling seamless integration with existing provider store logic.
Validator Pattern
All validators follow a consistent pattern with proper error handling:
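A minimal sketch of such a validator, assuming the result shape implied by the commit messages in this PR (an `errors` array of `{ error, errorKey? }` entries plus `reasonKey`, and a `contextOptions` parameter). The type definitions, the `validateApiKey` function, and the i18n key are hypothetical; the project's real types may differ.

```typescript
// Shapes inferred from the commit messages in this PR; the real types live in the project.
interface ProviderValidationResult {
  errors: Array<{ error: Error, errorKey?: string }>
  reason: string
  reasonKey: string
  valid: boolean
}

interface ValidatorContextOptions {
  t: (key: string) => string
}

// Hypothetical validator: checks that an API key is a non-empty trimmed string.
function validateApiKey(
  config: { apiKey?: unknown },
  contextOptions: ValidatorContextOptions,
): ProviderValidationResult {
  const { apiKey } = config
  if (typeof apiKey === 'string' && apiKey.trim().length > 0)
    return { errors: [], reason: '', reasonKey: '', valid: true }

  const reasonKey = 'validation.api-key-required' // hypothetical i18n key
  return {
    errors: [{ error: new Error('API key is required'), errorKey: reasonKey }],
    reason: contextOptions.t(reasonKey),
    reasonKey,
    valid: false,
  }
}

const result = validateApiKey({ apiKey: '' }, { t: key => key })
console.log(result.valid, result.reasonKey)
```

Returning both the translated `reason` and the raw `reasonKey` lets the store keep the key while the UI shows localized text.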
Testing
- `pnpm run typecheck`
- `pnpm lint`
- `pnpm -F @proj-airi/stage-tamagotchi build`
Migration Notes
- `defineSpeechProvider()` and `defineTranscriptionProvider()` functions are deprecated but still exported for backward compatibility
- `BaseSpeechProviderDefinition` and `BaseTranscriptionProviderDefinition` interfaces are deprecated but still available
- … `defineProvider()` pattern
Related
- … `defineProvider()` pattern to all speech and transcription providers