Future Work - Models #67

Hi!

I am very curious about the future work part of the paper.

The paper makes a few suggestions. I would like to ask about two of them.

## 1. Use perceptual losses.

You have just merged a PR that allows for loss customization. Which perceptual loss did you have in mind when you wrote the suggestion?
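To make the question concrete, one common candidate is a multi-resolution STFT loss. The sketch below is only an illustration of that family, not necessarily the loss the paper had in mind. The function names and the three `(n_fft, hop)` resolutions are assumptions chosen for the example, and how such a function would plug into the new loss customization is not shown. It expects mono waveforms shaped `[batch, samples]`; stereo input would need its channels folded into the batch dimension first.

```python
import torch
import torch.nn.functional as F


def stft_magnitude(x: torch.Tensor, n_fft: int, hop: int) -> torch.Tensor:
    """Magnitude STFT of waveforms shaped [batch, samples]."""
    window = torch.hann_window(n_fft, device=x.device, dtype=x.dtype)
    spec = torch.stft(
        x, n_fft=n_fft, hop_length=hop, window=window, return_complex=True
    )
    # Clamp so the log below stays finite on silent bins.
    return spec.abs().clamp(min=1e-7)


def multi_resolution_stft_loss(
    pred: torch.Tensor,
    target: torch.Tensor,
    resolutions=((512, 128), (1024, 256), (2048, 512)),  # assumed values
) -> torch.Tensor:
    """Average of spectral-convergence and log-magnitude L1 terms."""
    loss = pred.new_zeros(())
    for n_fft, hop in resolutions:
        p = stft_magnitude(pred, n_fft, hop)
        t = stft_magnitude(target, n_fft, hop)
        # Spectral convergence: relative Frobenius distance of magnitudes.
        sc = (t - p).pow(2).sum().sqrt() / t.pow(2).sum().sqrt()
        # Log-magnitude distance: emphasizes low-energy regions.
        log_mag = F.l1_loss(torch.log(p), torch.log(t))
        loss = loss + sc + log_mag
    return loss / len(resolutions)
```

Comparing magnitudes at several window sizes trades time resolution against frequency resolution, which is the usual argument for this loss over a plain waveform L1 or L2 loss.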

## 2. Use mel spectrograms instead of magnitude spectrograms as input.

[dmae1d-ATC64-v2](https://huggingface.co/archinetai/dmae1d-ATC64-v2/tree/main) uses the magnitude spectrogram.

What would be a good mel feature extractor? 

I have come across this one a few times, and I would like to know what you think of it:

```python
encoder=MelE1d(  # The encoder used, in this case a mel-spectrogram encoder
    in_channels=in_channels,
    channels=512,
    multipliers=[1, 1],
    factors=[2],
    num_blocks=[12],
    out_channels=32,
    mel_channels=80,
    mel_sample_rate=48000,
    mel_normalize_log=True,
    bottleneck=TanhBottleneck(),
),
```

I believe it extracts a lot of features (80 mel channels feeding 512 hidden channels), which puts a strain on the GPU.
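For comparison, here is a minimal hand-rolled log-mel front end in plain PyTorch. It is a sketch under stated assumptions, not how `MelE1d` computes its features internally: `n_fft=1024`, `hop=256`, the HTK mel formula, and the function names are all choices made for this example. Only `n_mels=80` and `sample_rate=48000` mirror the config above.

```python
import math

import torch


def hz_to_mel(freq_hz: float) -> float:
    """HTK-style mel scale (one of several conventions)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_filterbank(n_mels: int, n_fft: int, sample_rate: int) -> torch.Tensor:
    """Triangular mel filters, shape [n_mels, n_fft // 2 + 1]."""
    n_freqs = n_fft // 2 + 1
    freqs = torch.linspace(0.0, sample_rate / 2, n_freqs)
    # n_mels + 2 points evenly spaced on the mel scale, mapped back to Hz.
    mel_pts = torch.linspace(
        hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_mels + 2
    )
    hz_pts = 700.0 * (10.0 ** (mel_pts / 2595.0) - 1.0)
    lower = hz_pts[:-2, None]
    center = hz_pts[1:-1, None]
    upper = hz_pts[2:, None]
    rising = (freqs[None, :] - lower) / (center - lower)
    falling = (upper - freqs[None, :]) / (upper - center)
    return torch.minimum(rising, falling).clamp(min=0.0)


def log_mel(
    wave: torch.Tensor,
    sample_rate: int = 48000,
    n_fft: int = 1024,  # assumed, not taken from MelE1d
    hop: int = 256,     # assumed, not taken from MelE1d
    n_mels: int = 80,
) -> torch.Tensor:
    """Log-mel spectrogram of [batch, samples] -> [batch, n_mels, frames]."""
    window = torch.hann_window(n_fft, device=wave.device, dtype=wave.dtype)
    spec = torch.stft(
        wave, n_fft=n_fft, hop_length=hop, window=window, return_complex=True
    ).abs()  # [batch, n_fft // 2 + 1, frames]
    fb = mel_filterbank(n_mels, n_fft, sample_rate).to(spec)
    mel = torch.matmul(fb, spec)  # broadcasts over the batch dimension
    return torch.log(mel.clamp(min=1e-5))
```

With these assumed settings, one second of 48 kHz audio becomes 188 frames of 80 mel bins, whereas the underlying magnitude spectrogram has 513 linear-frequency bins per frame. Under those assumptions the mel projection reduces the per-frame input size rather than increasing it.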

I am curious what you have to say about both points.

Cheers,
Tristan

