Azure Speech To Text impl. #3

@stone-alex

Description

Add Microsoft Azure Cognitive Services Speech-to-Text as a second STT backend.
The app will automatically detect the provider from the API key format (no user dropdown needed).
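A minimal sketch of what format-based detection could look like. The class name, enum, and both key patterns are assumptions for illustration only: the real `KeyDetector.detectProvider` is not shown in this issue, and the key formats (Google keys starting with `AIza`, Azure keys as 32 hexadecimal characters) should be verified against each provider before relying on them.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not the project's actual KeyDetector.
final class KeyFormatSketch {
    enum Provider { GOOGLE, AZURE, UNKNOWN }

    // Assumption: Google API keys are "AIza" followed by 35 URL-safe characters.
    private static final Pattern GOOGLE_KEY = Pattern.compile("^AIza[0-9A-Za-z_-]{35}$");
    // Assumption: Azure Speech keys are 32 hexadecimal characters.
    private static final Pattern AZURE_KEY = Pattern.compile("^[0-9a-fA-F]{32}$");

    static Provider detect(String apiKey) {
        if (apiKey == null) {
            return Provider.UNKNOWN;
        }
        String key = apiKey.trim();
        if (GOOGLE_KEY.matcher(key).matches()) {
            return Provider.GOOGLE;
        }
        if (AZURE_KEY.matcher(key).matches()) {
            return Provider.AZURE;
        }
        // Unrecognized format: the caller decides how to fall back.
        return Provider.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(detect("AIza" + "x".repeat(35)));
        System.out.println(detect("0123456789abcdef0123456789abcdef"));
        System.out.println(detect("not-a-key"));
    }
}
```

Returning an explicit `UNKNOWN` rather than guessing keeps existing Google users unaffected if a key does not match either pattern.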

Why

  • Gives players a choice between Google and Azure
  • Zero config for existing users
  • Fully backwards compatible

Acceptance criteria

  • Implement elite.intel.ai.ears.EarsInterface in new class AzureSTT
  • Must be singleton, zero DI – follow exact pattern of GoogleSTT
  • Loaded dynamically via elite.intel.ai.ApiFactory using elite.intel.ai.KeyDetector.detectProvider(apiKey, "STT")
  • Reuse or adapt the existing RMS gate, VAD, and audio calibration logic (see GoogleSTT)
  • Use existing EventBusManager.register(this); / @subscribe pattern to listen to events
  • Interrupt STT processing while vocalization is in progress (see GoogleSTT / IsSpeakingEvent) and resume after vocalization ends
  • No Python, no JNI, no unsigned DLLs, no native packages or libraries → pure Azure Java SDK only
  • Fully works on Linux and Windows
  • Project compiles and runs with Gradle
  • App runs from fat-jar
  • Provide a temporary key to the project maintainer for testing
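To make the singleton, RMS gate, and pause-on-vocalization criteria concrete, here is a rough sketch. Everything in it is hypothetical: the class name, the threshold value, and `onSpeakingChanged` are stand-ins, and the real logic should be taken from GoogleSTT rather than from this example. It assumes 16-bit little-endian PCM audio; in the app, the pause flag would be driven by the `IsSpeakingEvent` handler instead of a direct call.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; not the project's GoogleSTT or AzureSTT code.
final class SttGateSketch {
    // Eagerly created singleton, no dependency injection.
    private static final SttGateSketch INSTANCE = new SttGateSketch();

    private final AtomicBoolean paused = new AtomicBoolean(false);
    // Assumed placeholder; a calibration step would normally set this.
    private volatile double rmsThreshold = 500.0;

    private SttGateSketch() {
    }

    static SttGateSketch getInstance() {
        return INSTANCE;
    }

    // Stand-in for the event handler: pause while the app is vocalizing.
    void onSpeakingChanged(boolean speaking) {
        paused.set(speaking);
    }

    void setRmsThreshold(double threshold) {
        rmsThreshold = threshold;
    }

    // Root mean square of 16-bit little-endian PCM samples.
    static double rms(byte[] pcm, int length) {
        int samples = length / 2;
        if (samples == 0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (int i = 0; i < samples; i++) {
            int lo = pcm[2 * i] & 0xFF;   // low byte, unsigned
            int hi = pcm[2 * i + 1];      // high byte, keeps the sign
            int sample = (hi << 8) | lo;
            sumSquares += (double) sample * sample;
        }
        return Math.sqrt(sumSquares / samples);
    }

    // Forward audio to the recognizer only when not paused and loud enough.
    boolean shouldForward(byte[] pcm, int length) {
        return !paused.get() && rms(pcm, length) >= rmsThreshold;
    }

    public static void main(String[] args) {
        SttGateSketch gate = SttGateSketch.getInstance();
        byte[] silence = new byte[320];
        byte[] loud = new byte[320];
        for (int i = 0; i < loud.length; i += 2) {
            loud[i + 1] = 0x40; // every sample is 0x4000
        }
        System.out.println(gate.shouldForward(silence, silence.length));
        System.out.println(gate.shouldForward(loud, loud.length));
        gate.onSpeakingChanged(true);
        System.out.println(gate.shouldForward(loud, loud.length));
        gate.onSpeakingChanged(false);
    }
}
```

Keeping the gate in front of the recognizer means paused or silent chunks are never sent to the provider, which is the behavior the criteria above ask to carry over from GoogleSTT.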

Useful links

Difficulty: Medium (follow GoogleSTT pattern + async streams)

Metadata

Assignees

No one assigned

    Labels

    help wanted