@MyButtermilk

@anej-soniox:
Here is a PR to test Soniox on the Open ASR benchmark. I tried running it, and it worked fine until transcription stopped after around seven hours. I think something is blocking the run when too many async API calls are made. Maybe you can run it on your hardware. It would be beneficial for Soniox to be present in the quite popular Open ASR benchmark.
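One common way to keep a long run from stalling when many async calls are in flight is to cap concurrency and retry failed calls with backoff. The following is a minimal sketch under stated assumptions, not the code in this PR: `transcribe_one` is a hypothetical coroutine standing in for whatever function sends one audio sample to the API, and `with_retries`, `transcribe_all`, `max_concurrency`, and the delay values are illustrative names and defaults.

```python
import asyncio
import random


async def with_retries(make_call, max_attempts=5, base_delay=1.0):
    """Await make_call(), retrying with exponential backoff on any exception."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_call()
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # Double the wait on each retry and add a little jitter.
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay / 10))


async def transcribe_all(items, transcribe_one, max_concurrency=8, **retry_kwargs):
    """Run transcribe_one(item) for every item with at most max_concurrency in flight.

    transcribe_one is a hypothetical coroutine function (an assumption,
    not a Soniox API). Results are returned in the same order as items.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(item):
        async with semaphore:
            return await with_retries(lambda: transcribe_one(item), **retry_kwargs)

    return await asyncio.gather(*(worker(item) for item in items))
```

It could be called as `asyncio.run(transcribe_all(samples, transcribe_one, max_concurrency=4))`. Whether a concurrency limit is actually the cause of the stall is not confirmed here, so this is only one thing to try.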

google-labs-jules bot and others added 3 commits August 20, 2025 09:47
This commit introduces support for Soniox's speech-to-text API for both English and multilingual benchmarks.

A new `soniox/` directory has been added, containing the necessary scripts to run the evaluations:
- `run_eval.py`: for English benchmarks (async and real-time).
- `run_eval_ml.py`: for multilingual benchmarks (async and real-time).
- `run_soniox.sh` and `run_soniox_ml.sh`: shell scripts to run the evaluations.
- `requirements.txt`: specifies the dependencies for the Soniox integration.
- Fix audio decoding by disabling automatic torchcodec dependency and manually loading from bytes
- Add proper PYTHONPATH to shell script to locate normalizer module
- Ensure compatibility with datasets library without requiring torchcodec installation
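As an illustration of what "manually loading from bytes" can look like, here is a minimal standard-library sketch, not the code in this commit. It assumes each example exposes raw encoded audio bytes (as Hugging Face `datasets` does when the audio column is cast with `Audio(decode=False)`), and it assumes those bytes are 16-bit PCM WAV. Other encodings such as FLAC would need an audio library instead. The function name `decode_wav_bytes` is hypothetical.

```python
import io
import sys
import wave
from array import array


def decode_wav_bytes(audio_bytes):
    """Decode 16-bit PCM WAV bytes into (mono float samples, sampling rate)."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    pcm = array("h")  # signed 16-bit integers
    pcm.frombytes(frames)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian

    # Average the interleaved channels to mono and scale to [-1.0, 1.0).
    samples = [
        sum(pcm[i:i + channels]) / (channels * 32768.0)
        for i in range(0, len(pcm), channels)
    ]
    return samples, rate
```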

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Remove hardcoded SONIOX_API_KEY from shell scripts for security
- Add environment variable checks with helpful error messages
- Fix run_eval.py dataset loading to avoid torchcodec dependency
- Update data_utils.prepare_data() to support undecoded audio
- Add comprehensive requirements.txt with all dependencies
- Create .env.example template for API key configuration
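The environment variable check described above could look roughly like the following on the Python side. This is a sketch under assumptions, not the code in this commit: the helper name `require_env` and its `environ` and `hint` parameters are hypothetical, and only the variable name `SONIOX_API_KEY` and the `.env.example` file come from the commit message.

```python
import os


def require_env(name, environ=None, hint=""):
    """Return the value of an environment variable or fail with a clear message.

    environ defaults to os.environ; passing a dict makes the check easy to test.
    """
    environ = os.environ if environ is None else environ
    value = (environ.get(name) or "").strip()
    if not value:
        message = f"{name} is not set."
        if hint:
            message += f" {hint}"
        raise RuntimeError(message)
    return value
```

A script might call `require_env("SONIOX_API_KEY", hint="Copy .env.example to .env and add your key.")` before starting any transcription, so that a missing key fails immediately instead of partway through a run.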

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@Deep-unlearning
Collaborator

Hi @MyButtermilk
Thanks for the PR, we will look into that!

@MyButtermilk
Author

@Deep-unlearning

Thanks for looking into it. Maybe it works when you try again?

You get 200 USD in free credits at Soniox when you subscribe. That should easily be enough to run the whole benchmark, both English-only and multilingual, given that the price is only 0.10 USD per hour for async and 0.12 USD per hour for real-time transcription.
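To make the arithmetic concrete, here is a small sketch using only the figures quoted above (200 USD of credit, 0.10 USD per hour async, 0.12 USD per hour real-time). The function names are illustrative, and the total audio duration of the benchmark is not stated in this thread, so it is left as an input.

```python
def hours_covered(credit_usd, price_per_hour_usd):
    """Hours of audio that a given credit pays for at a flat hourly price."""
    if price_per_hour_usd <= 0:
        raise ValueError("price_per_hour_usd must be positive")
    return credit_usd / price_per_hour_usd


def cost_usd(audio_hours, price_per_hour_usd):
    """Cost of transcribing audio_hours at a flat hourly price."""
    return audio_hours * price_per_hour_usd


async_hours = hours_covered(200, 0.10)     # about 2000 hours of audio
realtime_hours = hours_covered(200, 0.12)  # about 1667 hours of audio
```

So the credit covers roughly 2000 hours of async transcription or roughly 1667 hours of real-time transcription, before any retries or repeated runs.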

@MyButtermilk mentioned this pull request Aug 21, 2025
@MyButtermilk
Author

@Deep-unlearning Did you look at the PR?

@MyButtermilk
Author

@Deep-unlearning Any update on the PR?
