
[Feature Request] Kokoro-style Voice Mixing #127

@OwenTyme


In the past, I used a CLI fork of Kokoro that can blend voices together to produce unique mixtures. I always found that fascinating and very useful, and I've missed it since I started using Pocket in its place.

Here's the fork I'm talking about. The link goes to the section on command-line options, where the feature is spelled out:
https://github.com/nazdridoy/kokoro-tts?tab=readme-ov-file#options

It's based on percentages, like so: 'voice1:30,voice2:70'.
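
To make the idea concrete, here is a minimal Python sketch of how such a spec could be parsed and applied. It is not the fork's code and not anything Pocket provides: it assumes each voice is available as a fixed-length embedding vector and that "blending" means a weighted average of those vectors. The names `parse_mix` and `blend_voices` and the toy two-element vectors are hypothetical, made up for illustration.

```python
def parse_mix(spec: str) -> dict[str, float]:
    """Parse a spec like 'voice1:30,voice2:70' into weights that sum to 1."""
    weights: dict[str, float] = {}
    for part in spec.split(","):
        name, sep, value = part.strip().partition(":")
        if not name or not sep:
            raise ValueError(f"expected 'name:percent', got {part!r}")
        pct = float(value)
        if pct < 0:
            raise ValueError(f"percentage must not be negative: {part!r}")
        # Repeated names accumulate rather than overwrite.
        weights[name] = weights.get(name, 0.0) + pct
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("percentages must sum to a positive number")
    # Normalize, so '30,70' and '3,7' describe the same mixture.
    return {name: pct / total for name, pct in weights.items()}


def blend_voices(embeddings: dict[str, list[float]],
                 weights: dict[str, float]) -> list[float]:
    """Weighted average of voice vectors (assumed representation)."""
    names = list(weights)
    dim = len(embeddings[names[0]])
    mixed = [0.0] * dim
    for name in names:
        vec = embeddings[name]
        if len(vec) != dim:
            raise ValueError(f"voice {name!r} has a different vector length")
        for i, x in enumerate(vec):
            mixed[i] += weights[name] * x
    return mixed


# Toy vectors standing in for real voice data (hypothetical values).
voices = {"voice1": [1.0, 0.0], "voice2": [0.0, 1.0]}
mix = blend_voices(voices, parse_mix("voice1:30,voice2:70"))
print(mix)
```

Normalizing the percentages means the numbers only need to express a ratio. Whether Pocket's voice representation can be averaged this way at all is exactly the open question of this request.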

I don't fully understand how it works and have no idea whether the same thing is possible with Pocket, but I'd love to have it as a feature. It lets you tone down a harshly accented sample into something milder, or tweak a voice with a little hint of another.

My understanding is that af_heart from Kokoro is a synthetic voice produced in this fashion. If I remember right, it's a blend of af_bella and one other voice.

So, is there any way to do this with Pocket? Again, I'd love this to be a feature.

Thank you.
