Emilia fully annotated with EmoNet / Empathic Insight Voice emotion captions & scores - ready for training :) #237

@christophschuhmann

Hey, we ran our state-of-the-art emotion captioning and emotion score estimation models over all of Emilia. The results are distributed across these five repositories:

https://huggingface.co/datasets/laion/Emilia-with-Emotion-Annotations
https://huggingface.co/datasets/laion/Emilia-with-Emotion-Annotations2
https://huggingface.co/datasets/laion/Emilia-with-Emotion-Annotations3
https://huggingface.co/datasets/laion/Emilia-with-Emotion-Annotations4
https://huggingface.co/datasets/laion/Emilia-with-Emotion-Annotations5

Speaker embedding model: https://huggingface.co/Orange/Speaker-wavLM-tbr

I would suggest fine-tuning with additional conditioning on these speaker embeddings, because they capture the time-independent attributes of a voice that make up a speaker's identity without conflating them with emotion and arousal. This could eventually enable voice cloning without needing a pair for the reference audio: just use the target's embedding as conditioning, together with the emotion scores.
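
To make the conditioning idea concrete, here is a minimal sketch of how a speaker embedding and a set of emotion scores could be combined into a single conditioning vector. Everything in it is an assumption for illustration: the emotion names in `EMOTION_KEYS`, the toy two-dimensional embedding, and the helper `build_conditioning` are hypothetical, and the real annotation schema and embedding size should be taken from the repositories above.

```python
import math

# Hypothetical, fixed ordering of emotion score names (an assumption;
# the real annotation schema should be read from the dataset itself).
EMOTION_KEYS = ["arousal", "valence", "amusement"]


def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def build_conditioning(speaker_embedding, emotion_scores, emotion_keys):
    """Concatenate a unit-length speaker embedding with emotion scores.

    Emotion scores are placed in the order given by `emotion_keys`;
    a score missing from `emotion_scores` defaults to 0.0.
    """
    speaker_part = l2_normalize(speaker_embedding)
    emotion_part = [float(emotion_scores.get(key, 0.0)) for key in emotion_keys]
    return speaker_part + emotion_part


# Toy values only: a real speaker embedding has many more dimensions.
speaker_embedding = [3.0, 4.0]
emotion_scores = {"valence": 0.2, "arousal": 0.9}

conditioning = build_conditioning(speaker_embedding, emotion_scores, EMOTION_KEYS)
print(conditioning)
```

Keeping the emotion scores in a fixed key order means every position in the vector has the same meaning for every sample, which is what a model conditioned on this vector would need. Whether plain concatenation is the best way to inject the conditioning is an open design choice, not something the annotations prescribe.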

Have fun! :)
