Add kokoro support by daavoo · Pull Request #94 · mozilla-ai/document-to-podcast

daavoo · 2025-01-16T10:28:14Z

Test in https://colab.research.google.com/github/mozilla-ai/document-to-podcast/blob/text-to-speech-model/demo/notebook.ipynb

stefanfrench · 2025-01-16T14:48:10Z

@daavoo When testing locally, I get ModuleNotFoundError: No module named 'phonemizer'

Do we need to update the dependencies?
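
One way to surface this kind of failure earlier is to check for the extra modules before loading the model. The following is a minimal sketch, not code from this PR: the helper name `missing_modules` and the list `KOKORO_EXTRA_MODULES` are hypothetical, and the list only contains `phonemizer`, the one module named in the error above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list: only `phonemizer` is confirmed by the error above.
KOKORO_EXTRA_MODULES = ["phonemizer"]

if __name__ == "__main__":
    missing = missing_modules(KOKORO_EXTRA_MODULES)
    if missing:
        print(f"Missing modules for kokoro: {', '.join(missing)}")
    else:
        print("All kokoro extra modules found.")
```

A check like this only covers Python modules; a system package such as espeak-ng would still need to be installed separately.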

Kostis-S-Z · 2025-01-16T14:48:44Z

Awesome stuff! I don't have many comments on the code.

As Stefan mentioned, there is the issue of updating the dependencies, especially how to handle the espeak-ng dependency.

And then there is a bigger question: how "fully" do we want to integrate kokoro? Or, to put it better: are we moving away from Oute entirely and over to kokoro? I am asking because of the following issues:

What should we do regarding dependencies? Should we make outetts optional? Should we completely remove outetts support, as we did with parler and bark?
What should we do with testing? Should we add kokoro to test_model_loaders and test_text_to_speech?
Should we update all the docs references, specifically in step-by-step to reference kokoro instead of oute?

I don't think it makes sense for me to meticulously review the inference/kokoro/ files, as I am hoping we will soon replace them with the pip package 🤞

Kostis-S-Z · 2025-01-16T14:58:52Z

Also, kokoro seems to pad each audio clip with at least 1 second of silence, so the final complete podcast has a bit too much silence between speakers. I would update the call here like this:

        audio_np = stack_audio_segments(
            st.session_state.audio, speech_model.sample_rate, silence_pad=0.0
        )
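
To illustrate why `silence_pad=0.0` helps, here is a minimal sketch of how a stacking helper with a silence pad could behave. It is an assumption, not the project's `stack_audio_segments` implementation: the function name `stack_with_silence` is hypothetical, and plain Python lists stand in for audio arrays so the example has no dependencies.

```python
def stack_with_silence(segments, sample_rate, silence_pad=1.0):
    """Concatenate audio segments, appending `silence_pad` seconds of zeros after each one."""
    silence = [0.0] * int(silence_pad * sample_rate)
    stacked = []
    for segment in segments:
        stacked.extend(segment)
        stacked.extend(silence)
    return stacked


if __name__ == "__main__":
    sample_rate = 4  # tiny rate so the lengths are easy to read
    clips = [[0.1, 0.2], [0.3]]
    print(len(stack_with_silence(clips, sample_rate, silence_pad=1.0)))
    print(len(stack_with_silence(clips, sample_rate, silence_pad=0.0)))
```

If the model already pads each clip with silence, as described above, any extra pad added while stacking only lengthens the gap between speakers, which is the motivation for passing 0.0.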

daavoo · 2025-01-16T15:03:39Z

What I am thinking is:

* We integrate kokoro only for demo purposes (until a stable version is released on PyPI).
  * We handle dependencies either in the notebook or the Dockerfile used for the HF Spaces.
  * We don't update README, docs or tests until the PyPI release.
  * We do update references in the demo notebook and app.
* We keep outetts the default for the rest of the cases.

WDYT @stefanfrench @Kostis-S-Z

stefanfrench · 2025-01-16T16:04:18Z

Okay, I'm comfortable with that.

Kostis-S-Z · 2025-01-17T10:06:03Z

Works for me! So you need to revert the changes in demo/app.py, example_data/config.yaml, src/document_to_podcast/cli.py, src/document_to_podcast/config.py and maybe also push this and then merge?

daavoo · 2025-01-17T11:14:29Z

demo/app.py is used in the HF Space, so it needs to use kokoro there. The rest is done.
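
A minimal sketch of how the demo app could pick the model depending on where it runs, in the spirit of the `HF_SPACE` commit in this PR. The function name `choose_tts_model`, the exact check, and the model identifiers are assumptions for illustration, not the code that was merged.

```python
import os


def choose_tts_model(env=None):
    """Return "kokoro" when running in the HF Space, "outetts" otherwise."""
    env = os.environ if env is None else env
    # Assumption: any non-empty HF_SPACE value means "running in the HF Space".
    if env.get("HF_SPACE"):
        return "kokoro"  # placeholder identifier
    return "outetts"  # placeholder identifier; stays the default elsewhere


if __name__ == "__main__":
    print(choose_tts_model())
```

Keeping the choice in one function means the rest of the app does not need to know which environment it is running in.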

stefanfrench

Thanks for making the changes.

Tested on Colab, local app, and CLI - works as expected. Approved.

daavoo requested a review from a team January 16, 2025 12:16

daavoo marked this pull request as ready for review January 16, 2025 12:16

daavoo force-pushed the text-to-speech-model branch from 7dd5199 to 5145045 Compare January 16, 2025 14:58

daavoo added 3 commits January 17, 2025 12:06

Add kokoro support

fc53de8

Update demo to use kokoro

94f6fc1

Use am_michael instead of am_adam

44849c8

daavoo force-pushed the text-to-speech-model branch from 5145045 to 44849c8 Compare January 17, 2025 11:06

daavoo added 2 commits January 17, 2025 12:10

Install kokoro deps in dockerfile

fdf5d4a

Revert changes for outetts

e22e5f3

Use 0.0 as pad. Add if

a9eb96f

daavoo self-assigned this Jan 17, 2025

enh(demo): Use HF_SPACE env var in order to chose model

b1d6f17

stefanfrench approved these changes Jan 17, 2025

daavoo merged commit 3e265b1 into main Jan 17, 2025
4 checks passed

daavoo deleted the text-to-speech-model branch January 17, 2025 14:42

daavoo merged 7 commits into main from text-to-speech-model