Future Work - Models #67

@AI-Guru

Hi!

I am very curious about the future work part of the paper.

The paper makes a few suggestions there. I would like to ask about two of them.

1. Use perceptual losses.

You have just merged a PR that allows for loss customization. Which perceptual loss did you have in mind when you wrote the suggestion?
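To make the question concrete: one family of losses often described as perceptually motivated is the multi-resolution log-spectral distance. The sketch below only illustrates the arithmetic in plain Python. It is a guess at what such a loss could look like, not something the paper or this repository states. The function names, the resolutions, and the test signals are all hypothetical, and a loss used for training would need a differentiable STFT from the training framework instead of this naive DFT.

```python
import cmath
import math


def magnitude_frames(signal, n_fft, hop):
    """Magnitudes of Hann-windowed frames via a naive DFT (illustration only)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(n_fft)]
        frames.append([
            abs(sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                    for n in range(n_fft)))
            for k in range(n_fft // 2 + 1)
        ])
    return frames


def log_spectral_distance(x, y, n_fft, hop, eps=1e-7):
    """Mean absolute difference of log magnitudes at one resolution.

    Both signals must contain at least n_fft samples.
    """
    total, count = 0.0, 0
    for fx, fy in zip(magnitude_frames(x, n_fft, hop), magnitude_frames(y, n_fft, hop)):
        for mx, my in zip(fx, fy):
            total += abs(math.log(mx + eps) - math.log(my + eps))
            count += 1
    return total / count


def multi_resolution_loss(x, y, resolutions=((32, 8), (64, 16), (128, 32))):
    """Average the log-spectral distance over several (n_fft, hop) pairs (assumed values)."""
    return sum(log_spectral_distance(x, y, n, h) for n, h in resolutions) / len(resolutions)


# Hypothetical test signals: two sine tones at an assumed 8 kHz sample rate.
sample_rate = 8000
tone_a = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(256)]
tone_b = [math.sin(2 * math.pi * 880 * n / sample_rate) for n in range(256)]

same = multi_resolution_loss(tone_a, tone_a)       # identical signals
different = multi_resolution_loss(tone_a, tone_b)  # different pitch
```

Comparing at several window sizes trades time resolution against frequency resolution, which is the usual argument for this kind of loss over a plain waveform distance.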

2. Use mel spectrograms instead of magnitude spectrograms as input.

dmae1d-ATC64-v2 uses the magnitude spectrogram.

What would be a good mel feature extractor?

I have come across this one a few times, and I would like to know what you think of it:

encoder=MelE1d(  # The encoder used, in this case a mel-spectrogram encoder
    in_channels=in_channels,
    channels=512,
    multipliers=[1, 1],
    factors=[2],
    num_blocks=[12],
    out_channels=32,
    mel_channels=80,
    mel_sample_rate=48000,
    mel_normalize_log=True,
    bottleneck=TanhBottleneck(),
),

I believe it extracts a lot of features, which puts a strain on the GPU.
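For reference on the feature count, here is a minimal sketch of what a mel front end does to a single magnitude frame. It projects the frequency bins onto a smaller set of triangular mel bands and takes the log. The 80 bands and the 48 kHz sample rate match `mel_channels` and `mel_sample_rate` in the configuration above. The FFT size of 1024, the HTK-style mel formula, and all function names are assumptions for illustration. This is not how `MelE1d` is actually implemented, and the library may use a different mel convention.

```python
import math


def hz_to_mel(hz):
    """HTK-style mel scale (one common convention among several)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters mapping n_fft // 2 + 1 magnitude bins to n_mels bands."""
    n_bins = n_fft // 2 + 1
    # n_mels + 2 edge frequencies, evenly spaced on the mel scale from 0 Hz to Nyquist.
    mel_max = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        # Rising slope from lo to mid, falling slope from mid to hi, zero elsewhere.
        bank.append([
            max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
            for f in bin_hz
        ])
    return bank


def to_log_mel(magnitudes, bank, eps=1e-5):
    """Project one magnitude frame onto the filterbank and take the log."""
    return [math.log(sum(w * x for w, x in zip(row, magnitudes)) + eps) for row in bank]


# n_fft=1024 is an assumed value; the frame is a flat dummy magnitude spectrum.
bank = mel_filterbank(n_mels=80, n_fft=1024, sample_rate=48000)
frame = [1.0] * (1024 // 2 + 1)
log_mel = to_log_mel(frame, bank)
print(len(frame), "->", len(log_mel))  # prints: 513 -> 80
```

Under these assumptions the mel projection itself shrinks each frame from 513 bins to 80 channels, so any GPU strain would more plausibly come from the encoder blocks that follow than from the mel features.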

Curious what you have to say about 1 and 2.

Cheers,
Tristan
