Implement Local Speech To Text

### Level – advanced.

## Requirements
- Pure Java integration: no DLLs, no JNI.
- Minimum Java version: 21.
- Linux first; Windows is optional but nice to have.
- Either use a Java API for a local STT model or implement a WebSocket integration with an STT server (running locally or on a home lab).
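
For the WebSocket option, a minimal sketch using only the JDK's `java.net.http.WebSocket` client (so no JNI) might look like the following. The wire protocol here is an assumption, not something this issue specifies: the server is assumed to accept one binary message of PCM audio and reply with one text message containing the transcript. `TranscriptListener`, `LocalSttClient`, and the 10-second timeout are hypothetical names and values chosen for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.TimeUnit;

/** Hypothetical listener: joins text frames into one transcript string. */
class TranscriptListener implements WebSocket.Listener {
    private final StringBuilder partial = new StringBuilder();
    private final CompletableFuture<String> transcript = new CompletableFuture<>();

    @Override
    public CompletionStage<?> onText(WebSocket ws, CharSequence data, boolean last) {
        partial.append(data);
        if (last) {
            // The final frame of the message arrived: publish the full text.
            transcript.complete(partial.toString());
        }
        ws.request(1); // ask the socket for the next frame
        return null;   // the CharSequence may be reclaimed immediately
    }

    @Override
    public void onError(WebSocket ws, Throwable error) {
        transcript.completeExceptionally(error);
    }

    CompletableFuture<String> transcript() {
        return transcript;
    }
}

/** Hypothetical client for an assumed "binary audio in, text out" STT server. */
class LocalSttClient {
    /** Sends one utterance of PCM audio and waits for the server's text reply. */
    static String transcribe(URI server, byte[] pcm) throws Exception {
        TranscriptListener listener = new TranscriptListener();
        WebSocket ws = HttpClient.newHttpClient()
                .newWebSocketBuilder()
                .buildAsync(server, listener)
                .join();
        ws.sendBinary(ByteBuffer.wrap(pcm), true).join();
        String text = listener.transcript().get(10, TimeUnit.SECONDS);
        ws.sendClose(WebSocket.NORMAL_CLOSURE, "done").join();
        return text;
    }
}
```

A call would look like `LocalSttClient.transcribe(URI.create("ws://localhost:2700"), pcmBytes)`, where the URL is a placeholder. A real server will likely define its own framing (for example JSON results or partial hypotheses), so the listener would need to be adapted to that protocol.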

## Pipeline
- Detect the user's audio hardware (use the existing AudioFormatDetector class).
- Implement audio calibration (use the existing AudioCalibrator class).
- Implement as a true singleton and run speech recognition in its own thread.
- Open the audio stream and build the VOD.
    - Check the RMS.
    - If the RMS falls below an upper threshold and stays there for ~1 second, stop recording and send the VOD to STT.
- Sanitize the received transcript with STTSanitizer.getInstance().correctMistakes(fullTranscript).
- Send the sanitized transcript to the LLM by publishing a UserInputEvent (text and confidence) and a TTSInterruptEvent.
- Honor streaming mode. When streaming mode is on, only send transcripts that contain the word "computer".
- Place the implementation in the elite.intel.ai.ears.local package.
- (See the GoogleSTT class for inspiration.)
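
The RMS check and the "stay below the threshold for ~1 second" rule from the pipeline can be sketched as below. This is an illustration under stated assumptions, not the project's implementation: audio is assumed to be signed 16-bit little-endian mono PCM, the threshold is assumed to come from calibration, and `SilenceEndpointer` is a hypothetical class name. It deliberately does not touch AudioFormatDetector or AudioCalibrator, whose APIs are not shown in this issue.

```java
/** Hypothetical helper: RMS of 16-bit PCM plus a silence hold timer. */
class SilenceEndpointer {
    private final double threshold; // RMS below this counts as silence (assumed to come from calibration)
    private final long holdMillis;  // how long the RMS must stay low, e.g. 1000
    private long silentMillis = 0;  // length of the current run of silence

    SilenceEndpointer(double threshold, long holdMillis) {
        this.threshold = threshold;
        this.holdMillis = holdMillis;
    }

    /** RMS of signed 16-bit little-endian mono samples in buf[0..len). */
    static double rms(byte[] buf, int len) {
        int samples = len / 2;
        if (samples == 0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (int i = 0; i < samples; i++) {
            int lo = buf[2 * i] & 0xFF; // low byte, unsigned
            int hi = buf[2 * i + 1];    // high byte, keeps the sign
            int sample = (hi << 8) | lo;
            sumSquares += (double) sample * sample;
        }
        return Math.sqrt(sumSquares / samples);
    }

    /**
     * Feeds the RMS of one chunk and that chunk's duration.
     * Returns true once the RMS has stayed below the threshold for holdMillis,
     * which is the point to stop recording and hand the audio to STT.
     */
    boolean accept(double chunkRms, long chunkMillis) {
        if (chunkRms < threshold) {
            silentMillis += chunkMillis;
        } else {
            silentMillis = 0; // any loud chunk restarts the timer
        }
        return silentMillis >= holdMillis;
    }
}
```

In a capture loop, each chunk read from a `javax.sound.sampled.TargetDataLine` would be appended to the recording buffer, passed through `rms`, and then through `accept`; when `accept` returns true the buffer is sent for transcription. Tracking elapsed audio time per chunk rather than wall-clock time keeps the behaviour independent of scheduling jitter on the recognition thread.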

## Acceptance Criteria
- High-accuracy transcriptions.
- Low latency.
- Pure Java, no JNI.
- Runs on Linux.
- Gradle builds a fat JAR.
- Project runs in IntelliJ IDEA without errors.
