Soniox integration #90

MyButtermilk · 2025-08-21T08:33:10Z

@anej-soniox:
Here is a PR to test Soniox on the Open ASR benchmark. I tried running it, and it worked fine until transcription stopped after around seven hours. I suspect something blocks when too many async API calls are in flight. Maybe you can run it on your side, on your hardware. It would be beneficial for Soniox to be present in the quite popular Open ASR benchmark.
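If the stall really is caused by too many concurrent async calls (that is only a guess, I have not confirmed it), one common mitigation is to cap the number of in-flight requests and put a timeout on each one. A minimal sketch using only the standard library; `transcribe_one`, `MAX_CONCURRENT` and `CALL_TIMEOUT_S` are hypothetical names for illustration, not code from this PR or from the Soniox SDK:

```python
import asyncio

MAX_CONCURRENT = 8     # assumed cap; tune to whatever the provider tolerates
CALL_TIMEOUT_S = 5.0   # assumed per-request timeout


async def transcribe_one(sample_id: int) -> str:
    """Hypothetical stand-in for one async transcription request."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"transcript-{sample_id}"


async def bounded_call(sem: asyncio.Semaphore, sample_id: int):
    # The semaphore keeps at most MAX_CONCURRENT requests in flight.
    async with sem:
        try:
            return await asyncio.wait_for(
                transcribe_one(sample_id), timeout=CALL_TIMEOUT_S
            )
        except asyncio.TimeoutError:
            # Return None instead of hanging forever; the caller can retry.
            return None


async def run_all(sample_ids):
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    tasks = [bounded_call(sem, s) for s in sample_ids]
    # gather() preserves input order in its result list.
    return await asyncio.gather(*tasks)


results = asyncio.run(run_all(range(20)))
```

With a timeout in place, a request that never returns shows up as a `None` entry that can be retried or logged, rather than silently halting the whole run.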

This commit introduces support for Soniox's speech-to-text API for both English and multilingual benchmarks. A new `soniox/` directory has been added, containing the necessary scripts to run the evaluations:

- `run_eval.py`: for English benchmarks (async and real-time).
- `run_eval_ml.py`: for multilingual benchmarks (async and real-time).
- `run_soniox.sh` and `run_soniox_ml.sh`: shell scripts to run the evaluations.
- `requirements.txt`: specifies the dependencies for the Soniox integration.

- Fix audio decoding by disabling automatic torchcodec dependency and manually loading from bytes
- Add proper PYTHONPATH to shell script to locate normalizer module
- Ensure compatibility with datasets library without requiring torchcodec installation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
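To illustrate what "manually loading from bytes" can look like: when a `datasets` audio column is cast with `Audio(decode=False)`, each sample arrives as a dict with `bytes` and `path` keys instead of a decoded array, so the caller has to decode it. The sketch below is not the code from this PR. It uses the standard-library `wave` module so that it runs without extra dependencies, which means it only handles mono 16-bit PCM WAV; a real run would presumably need a decoder for other formats as well. `make_wav_bytes` and `decode_audio` are hypothetical helper names.

```python
import io
import struct
import wave


def make_wav_bytes(samples, sample_rate=16000):
    """Build a small mono 16-bit PCM WAV in memory (fixture for this example)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


def decode_audio(record):
    """Decode an undecoded audio record into (float samples, sample rate).

    `record` has the shape {"bytes": ..., "path": ...}, as produced by an
    audio column that was cast with decode=False.
    """
    raw = record["bytes"]
    if raw is None:
        # Fall back to the file on disk when no bytes are embedded.
        with open(record["path"], "rb") as f:
            raw = f.read()
    with wave.open(io.BytesIO(raw), "rb") as w:
        if w.getnchannels() != 1 or w.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit PCM WAV")
        n = w.getnframes()
        pcm = struct.unpack(f"<{n}h", w.readframes(n))
        rate = w.getframerate()
    # Scale 16-bit integers to floats in [-1.0, 1.0).
    return [s / 32768.0 for s in pcm], rate


record = {"bytes": make_wav_bytes([0, 16384, -16384]), "path": None}
samples, rate = decode_audio(record)
```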

- Remove hardcoded SONIOX_API_KEY from shell scripts for security
- Add environment variable checks with helpful error messages
- Fix run_eval.py dataset loading to avoid torchcodec dependency
- Update data_utils.prepare_data() to support undecoded audio
- Add comprehensive requirements.txt with all dependencies
- Create .env.example template for API key configuration

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
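The environment variable check in this commit lives in the shell scripts. For anyone calling the evaluation from Python directly, roughly the same guard could look like the sketch below. Only the variable name `SONIOX_API_KEY` comes from the PR; `require_api_key` and `MissingApiKeyError` are hypothetical names, and the error text is illustrative.

```python
import os


class MissingApiKeyError(RuntimeError):
    """Raised when the API key is absent from the environment."""


def require_api_key(env=None, name="SONIOX_API_KEY"):
    """Return the API key, or fail early with an actionable message."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise MissingApiKeyError(
            f"{name} is not set. Export it before running, "
            f"for example: export {name}=<your key>"
        )
    return key
```

Passing `env` explicitly keeps the function testable without touching the real process environment; calling `require_api_key()` with no arguments reads `os.environ`.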

Deep-unlearning · 2025-08-21T08:36:21Z

Hi @MyButtermilk
Thanks for the PR, we will look into that!

MyButtermilk · 2025-08-21T09:29:06Z

@Deep-unlearning

Thanks for looking into it. Maybe it works when you try again?

You get 200 USD in free credits at Soniox for subscribing. That should easily be enough to run the whole benchmark, both English-only and multilingual, considering that the price is only 0.10 USD per hour for async and 0.12 USD per hour for real-time transcription.
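As a quick sanity check on that budget, here is how many audio hours the credits would cover at the prices quoted above. The total audio duration of the benchmark is not stated in this thread, so this only computes the upper bound; amounts are in cents to avoid floating-point rounding.

```python
CREDITS_CENTS = 200 * 100       # 200 USD in free credits
ASYNC_CENTS_PER_HOUR = 10       # 0.10 USD per hour (async)
REALTIME_CENTS_PER_HOUR = 12    # 0.12 USD per hour (real-time)

# Whole hours of audio covered if all credits go to a single mode.
async_hours = CREDITS_CENTS // ASYNC_CENTS_PER_HOUR
realtime_hours = CREDITS_CENTS // REALTIME_CENTS_PER_HOUR

print(async_hours)     # 2000
print(realtime_hours)  # 1666
```

Running both modes on the same audio costs 0.22 USD per audio hour, so the credits would cover roughly 909 hours of audio processed in both modes.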

MyButtermilk · 2025-08-29T05:22:13Z

@Deep-unlearning Did you look at the PR?

MyButtermilk · 2025-11-13T07:04:59Z

@Deep-unlearning any update on the pr?

google-labs-jules bot and others added 3 commits August 20, 2025 09:47

MyButtermilk mentioned this pull request Aug 21, 2025

Add Soniox #78

Open


2 participants