BiCodec training apparently never converges.

Greetings.
I tried training BiCodec on more than 1K hours of multilingual data with the default 32 kHz config. After 10K-20K steps, it produces intelligible output, but acoustic quality and speaker similarity never improve beyond that point (I have waited two days, 72K steps).

Here's the log:

[20250801_102602.log](https://github.com/user-attachments/files/21559032/20250801_102602.log)
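
A plateau like this can be confirmed numerically instead of by ear. The following is a minimal sketch, not part of the BiCodec code: `has_plateaued`, its parameters, and the `losses` values are all hypothetical, and it assumes a lower-is-better metric (for example a validation reconstruction loss) has already been extracted from the log at regular intervals.

```python
from statistics import mean


def has_plateaued(values, window=5, rel_tol=0.01):
    """Return True when the metric has stopped improving.

    Compares the mean of the last `window` values with the mean of the
    `window` values before them. Assumes lower is better. `rel_tol` is the
    minimum relative improvement that still counts as progress.
    """
    if len(values) < 2 * window:
        raise ValueError("need at least 2 * window values")
    previous = mean(values[-2 * window:-window])
    recent = mean(values[-window:])
    if previous == 0:
        # Cannot compute a relative change; treat no decrease as a plateau.
        return recent >= previous
    improvement = (previous - recent) / abs(previous)
    return improvement < rel_tol


# Hypothetical values for illustration only, not taken from the attached log.
losses = [2.10, 1.60, 1.30, 1.21, 1.20, 1.20, 1.19, 1.20, 1.20, 1.19]
print(has_plateaued(losses, window=3))
```

Averaging over a window rather than comparing single checkpoints keeps step-to-step noise from masking or faking a trend.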

Here's a sample at step 72,000:
[rec_0.wav](https://voca.ro/16SZQKHaKlIP), [ground_truth.wav](https://voca.ro/1cgzR3GTV6DN)
I appreciate your input.
Thanks!
