Izwi Bug Report Info
System
- OS: Ubuntu 24.04.4 LTS
- Kernel: Linux 6.8.0-100-generic
- CPU: 12 cores
- RAM: 31 GB (13 GB available)
- GPU: AMD Radeon RX 470/480/570/580/590 (Ellesmere)
- No NVIDIA GPU - running on CPU only
Version
- Izwi: v0.1.0-alpha-11
- CLI & Server
Steps to Reproduce
- Start server:
./izwi-server
- Transcribe any audio file (tested with 30 sec, 10 min, 53 min files)
- Command:
izwi transcribe <file> --language ru -f json
- Result: The command returns the same short text (~600 characters) regardless of input file length
Expected Behavior
Full transcription of entire audio file.
Actual Behavior
Only the first ~600 characters of the transcript are returned; the rest of the audio is ignored. The result is the same for:
- 30 second audio
- 10 minute audio
- 53 minute audio
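To make the comparison repeatable, here is a minimal sketch that runs the command from Steps to Reproduce on several files and reports the size of each output. It assumes `izwi` is on PATH and the server is already running. Because the JSON schema of the output is not documented in this report, the script compares raw stdout instead of parsing specific fields. The function names are hypothetical helpers, not part of Izwi.

```python
import subprocess
import sys


def transcribe(path):
    """Run the command from Steps to Reproduce and return its raw stdout."""
    result = subprocess.run(
        ["izwi", "transcribe", path, "--language", "ru", "-f", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def compare_outputs(paths, run=transcribe):
    """Return (path, output length, identical to first output?) per file.

    `run` is injectable so the comparison can be exercised without Izwi.
    """
    outputs = [(path, run(path)) for path in paths]
    first = outputs[0][1] if outputs else ""
    return [(path, len(out), out == first) for path, out in outputs]


if __name__ == "__main__":
    # Pass the audio files as arguments, shortest first.
    for path, length, same in compare_outputs(sys.argv[1:]):
        print(f"{path}: {length} chars, identical to first output: {same}")
```

If the bug is present, the reported length should stay at roughly 600 characters for every file, however long the audio is.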
Logs
The server runs fine and the transcription completes, but the returned text is truncated.
Tested Models
- Qwen3-ASR-0.6B
- Qwen3-ASR-1.7B
Both return the same truncated output.
Additional Notes
- Model files download successfully (~4.7GB for 1.7B)
- The server doesn't crash on alpha-11 (unlike in earlier tests)
- CPU inference works (real-time factor, RTF, ~0.9 for 30 s audio)
- The output text always corresponds to roughly the first 10-15 seconds of the audio
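
For reference, the RTF figure above follows the usual definition of real-time factor: processing time divided by audio duration, so values below 1.0 mean faster than real time. A minimal sketch of the arithmetic follows; the 27-second timing is an illustrative number chosen to match RTF ~0.9, not a measurement from this report.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = processing time / audio duration (below 1.0 is faster than real time)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Illustrative numbers only: 27 s of processing for a 30 s clip.
print(round(real_time_factor(27.0, 30.0), 2))
```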