Official code for "Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis"
# Install
pip install habibi-tts
# Launch the GUI TTS interface
habibi-tts_infer-gradio
Important
Read the F5-TTS documentation for (1) detailed installation guidance, (2) best practices for inference, and more.
# By default, use the Unified model (recommended)
habibi-tts_infer-cli \
--ref_audio "assets/MSA.mp3" \
--ref_text "كان اللعيب حاضرًا في العديد من الأنشطة والفعاليات المرتبطة بكأس العالم، مما سمح للجماهير بالتفاعل معه والتقاط الصور التذكارية." \
--gen_text "أهلًا، يبدو أن هناك بعض التعقيدات، لكن لا تقلق، سأرشدك بطريقة سلسة وواضحة خطوة بخطوة."
# Assign the dialect ID explicitly instead of inferring it from the given reference prompt (default: UNK)
# (works best when the dialectal content matches the ID: MSA, SAU, UAE, ALG, IRQ, EGY, MAR, OMN, TUN, LEV, SDN, LBY)
habibi-tts_infer-cli --dialect MSA
# Alternatively, use a `.toml` file for configuration; see `src/habibi_tts/infer/example.toml`
habibi-tts_infer-cli -c YOUR_CUSTOM.toml
# Check more CLI features with
habibi-tts_infer-cli --help
Note
Some dialectal audio samples are provided under src/habibi_tts/assets; see the relevant README.md for usage and more details.
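When trying several dialects, it can help to assemble each command from a dialect ID and a matching reference prompt before running anything. The following is a minimal dry-run sketch: it only prints the command line. It assumes that `--dialect` can be combined with `--ref_audio`, `--ref_text`, and `--gen_text` (confirm with `habibi-tts_infer-cli --help`); `print_infer_cmd` and the upper-case placeholder strings are hypothetical and not part of habibi-tts.

```shell
# Dry-run sketch: prints a habibi-tts_infer-cli command instead of running it.
# Assumption: --dialect may be combined with --ref_audio/--ref_text/--gen_text;
# check `habibi-tts_infer-cli --help` before relying on this.
# print_infer_cmd is a hypothetical helper name, not part of habibi-tts.
print_infer_cmd() {
  dialect=$1; ref_audio=$2; ref_text=$3; gen_text=$4
  printf 'habibi-tts_infer-cli --dialect %s --ref_audio "%s" --ref_text "%s" --gen_text "%s"\n' \
    "$dialect" "$ref_audio" "$ref_text" "$gen_text"
}

# Example: an MSA reference prompt paired with the MSA dialect ID.
# The two upper-case strings are placeholders for real Arabic text.
print_infer_cmd MSA "assets/MSA.mp3" "REFERENCE_TRANSCRIPT" "TEXT_TO_SYNTHESIZE"
```

Printing first makes it easy to check that each dialect ID is paired with reference audio and text in the same dialect before the printed command is actually run.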
# SOON
All code is released under MIT License.
The unified, SAU, and UAE models are licensed under CC-BY-NC-SA-4.0, restricted by SADA and Mixat.
The remaining specialized models (ALG, EGY, IRQ, MAR, MSA) are released under the Apache 2.0 license.
