Update start_stt.sh to match start_tts.sh #149
Conversation
Thanks! The reason the text-to-speech needs Python is that the actual model is called from Python and the Rust code is a thin wrapper, whereas the speech-to-text is all Rust. But we still need libpython because it's the same executable, so if you don't have a Python installation at all, it'll fail. Maybe just add a note about this, perhaps:
# We need libpython because the TTS uses a Python component. STT and TTS have the same executable, so we need
# to have libpython even if we don't end up using it. For simplicity, we use the same code as for TTS, even though
# you don't need to install any of these Python packages if you're only using the STT.
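To illustrate why libpython must be findable even when only the STT server runs, here is a minimal sketch of putting a Python library directory on the dynamic loader's search path before launching the shared executable. The helper name `prepend_ld_library_path` is hypothetical and this is not the actual content of start_tts.sh; it only shows the general idea under the assumption that the binary locates libpython through `LD_LIBRARY_PATH`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): prepend a directory to
# LD_LIBRARY_PATH so the dynamic loader can find libpython at startup.
prepend_ld_library_path() {
    local dir="$1"
    # Keep any existing entries; avoid a trailing colon when the variable is empty.
    export LD_LIBRARY_PATH="${dir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
}
```

One plausible way to obtain the directory is `python3 -c 'import sysconfig; print(sysconfig.get_config_var("LIBDIR"))'`, run with the virtualenv activated, and to pass its output to the helper before starting the server.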
Maybe Moshi is looking for a particular version of Python? Looks like it expects 3.12:
The default Python installed on the machine seems to be Python 3.11.
I don't understand the Rust codebase that deeply, but I don't think it's hardcoded. Searching for "3.12" doesn't reveal anything Rust-related. Could it be that you built the
So for example, if you first run this command with a virtualenv, you'll get a binary that assumes that version of Python, and then if you try to run it without that virtualenv, it'll fail. I'm happy to merge this fix though; it doesn't hurt, and it seems like it can make things more reliable in some cases.
That's probably what happened. I may have been doing some things inside the virtualenv and some things outside of it, which caused a mismatch of dependencies. I do think this change might make the behavior more deterministic, since it forces everything to happen within the context of the virtualenv.
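The mismatch described above (a binary built against one Python, then run against another) could be spotted with a quick version comparison. The following is only a sketch: the function names `python_minor_version` and `versions_match` are hypothetical, and it assumes the version strings have the form printed by `python3 --version`, such as "Python 3.11.4".

```shell
#!/usr/bin/env bash
# Hypothetical helper: reduce a string like "Python 3.11.4" to "3.11".
python_minor_version() {
    local full="${1#Python }"   # strip the leading "Python " if present
    local major="${full%%.*}"   # text before the first dot
    local rest="${full#*.}"     # text after the first dot
    local minor="${rest%%.*}"   # text before the next dot, if any
    printf '%s.%s\n' "$major" "$minor"
}

# Hypothetical helper: succeed only if both strings share major.minor.
versions_match() {
    [ "$(python_minor_version "$1")" = "$(python_minor_version "$2")" ]
}
```

For instance, one could record `python3 --version` inside the virtualenv at build time and compare it with the output at launch time, warning when `versions_match` fails.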
Makes sense. Could you just add the note I mentioned above to clarify it's needed because of the TTS and not the STT? Thanks for all the PRs!
No problem! Excited to get it running; it's such an incredibly powerful tool. Yes, I'll update the comment, and I'll also remove anything that's not needed (but I'll need to retest it first). I think this might be the only line needed:
I think you'll need at least
Updated comments for clarity on Python dependencies and environment setup.
Thank you!
Checklist
I, @tleyden, confirm that I have read and understood the terms of the CLA of Kyutai-labs, as outlined in the repository's CONTRIBUTING.md, and I agree to be bound by these terms. The full CLA is provided as follows:
I, @tleyden, hereby grant to Kyutai-labs a perpetual, worldwide, non-exclusive, royalty-free, irrevocable license to use, modify, distribute, and sublicense my Contributions. I understand and accept that Contributions are limited to modifications, improvements, or changes to the project’s source code submitted via pull requests. I accept that Kyutai-labs has full discretion to review, accept, reject, or request changes to any Contributions I submit, and that submitting a pull request does not guarantee its inclusion in the project. By submitting a Contribution, I grant Kyutai-labs a perpetual, worldwide license to use, modify, reproduce, distribute, and create derivative works based on my Contributions. I also agree to assign all patent rights for any inventions or improvements that arise from my Contributions, giving the Kyutai-labs full rights to file for and enforce patents. I understand that the Kyutai-labs may commercialize, relicense, or exploit the project and my Contributions without further notice or obligation to me. I confirm that my Contributions are original and that I have the legal right to grant this license. If my Contributions include third-party materials, I will ensure that I have the necessary permissions and will disclose this information. I accept that once my Contributions are integrated, they may be altered or removed at the Kyutai-labs’s discretion. I acknowledge that I am making these Contributions voluntarily and will not receive any compensation. Furthermore, I understand that all Contributions, including mine, are provided on an "as-is" basis, with no warranties. By submitting a pull request, I agree to be bound by these terms.
PR Description
While running this on runpod in a dockerless manner, I hit this error:
I noticed that the start_stt.sh script was missing some of the setup that start_tts.sh uses to source the Python env, so this PR just copied that in.
It fixed the issue for me, and now the STT server seems to be working: