Suggestion for generation of long text like podcast #1205

@ILG2021

Checks

  • This template is only for research questions, not usage problems, feature requests or bug reports.
  • I have thoroughly reviewed the project documentation and read the related paper(s).
  • I have searched for existing issues, including closed ones, and found no similar questions.
  • I am using English to submit this issue to facilitate community communication.

Question details

In a single-speaker podcast, the emotion and tone of voice change constantly, so I have to keep switching the reference audio. Is there a way to avoid this? Can F5-TTS generate emotion and tone based on the text itself?
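
To make the current workaround concrete, here is a minimal sketch of what switching the reference audio per segment looks like. It is only an illustration: the `[style]` tags, the `REF_AUDIO` mapping, the file paths, and the helper names `split_by_style` and `plan_generation` are all hypothetical and are not part of F5-TTS. The actual synthesis call is left out on purpose.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mapping from a style tag to a reference clip (placeholder paths).
REF_AUDIO: Dict[str, str] = {
    "neutral": "refs/neutral.wav",
    "excited": "refs/excited.wav",
    "sad": "refs/sad.wav",
}

# Matches hypothetical inline tags such as [excited] or [sad].
TAG = re.compile(r"\[(\w+)\]")


def split_by_style(script: str, default: str = "neutral") -> List[Tuple[str, str]]:
    """Split a script on [style] tags into (style, text) segments."""
    segments: List[Tuple[str, str]] = []
    style = default
    pos = 0
    for match in TAG.finditer(script):
        text = script[pos:match.start()].strip()
        if text:
            segments.append((style, text))
        style = match.group(1)  # the tag applies to the text that follows it
        pos = match.end()
    tail = script[pos:].strip()
    if tail:
        segments.append((style, tail))
    return segments


def plan_generation(script: str) -> List[Tuple[str, str]]:
    """Pair each segment with a reference clip; unknown styles fall back to neutral."""
    return [
        (REF_AUDIO.get(style, REF_AUDIO["neutral"]), text)
        for style, text in split_by_style(script)
    ]


if __name__ == "__main__":
    script = "Welcome back. [excited] We have big news! [sad] But first, a farewell."
    for ref_path, text in plan_generation(script):
        # Each pair would be passed to the TTS model as (reference audio, text to speak).
        print(ref_path, "->", text)
```

Each segment still needs its own hand-picked reference clip, which is the manual step I would like to avoid.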
