How to train a model that covers the full 44100 Hz frequency range #35

I want to train a 2-stem model.

I noticed that some parameters in each model's yaml configuration affect the final frequency cutoff. It seems that multigpu_drums.yaml can handle the full 44100 Hz range, but it also uses fewer blocks (num_blocks 11 => 9), so the model size decreases accordingly (29 MB => 21 MB).
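For reference, here is a minimal sketch of the arithmetic that usually sets the cutoff in an STFT-based model. It assumes the config also has `n_fft` and `dim_f` fields and that the model keeps only the lowest `dim_f` frequency bins. Neither field is named above, and the values below are purely illustrative. Note that audio sampled at 44100 Hz contains frequencies only up to 22050 Hz (Nyquist), so "full range" means a cutoff of 22050 Hz.

```python
def cutoff_hz(dim_f: int, n_fft: int, sample_rate: int = 44100) -> float:
    """Upper frequency covered when only the lowest dim_f STFT bins are kept.

    Assumption: the model crops the spectrogram to its first dim_f bins.
    """
    bin_width = sample_rate / n_fft  # width of one STFT bin in Hz
    # The cutoff can never exceed the Nyquist frequency.
    return min(dim_f * bin_width, sample_rate / 2)


# Hypothetical parameter values, not taken from any specific yaml file.
full_band = cutoff_hz(dim_f=2048, n_fft=4096)
reduced_band = cutoff_hz(dim_f=2048, n_fft=6144)
print(full_band)     # 22050.0
print(reduced_band)  # 14700.0
```

Under these assumptions, covering the full band means `dim_f / n_fft` must be at least 1/2. That can be done by lowering `n_fft` (coarser frequency resolution) or raising `dim_f` (a larger input to the network).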

So a config like multigpu_drums.yaml can handle the full 44100 Hz range, but the model is smaller. Does this affect the final accuracy?

It seems that dim_t, hop_length, overlap, and num_blocks are tuned to complement each other in a way I can't understand. Maybe this balance was designed for the competition (where the output is mixed with demucs), but I want to apply this in the real world without demucs (only mdx-net; after some testing, I think mdx-net has more potential than demucs).
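One possible reading of how these parameters interact, as a rough sketch only. It assumes a U-Net style network with `num_blocks // 2` downsampling stages that each halve both the frequency and time axes, and a chunk length of `hop_length * (dim_t - 1)` samples. Both are assumptions about the architecture, not something confirmed here. `check_shapes` is a hypothetical helper, and `dim_f` is an assumed config field.

```python
def check_shapes(dim_f: int, dim_t: int, num_blocks: int,
                 hop_length: int, sample_rate: int = 44100) -> dict:
    """Report the shapes implied by a config under the stated assumptions."""
    n_down = num_blocks // 2   # assumed number of halving stages
    factor = 2 ** n_down       # total downsampling factor per axis
    return {
        "n_down": n_down,
        # Both axes must divide evenly, or the skip connections will not line up.
        "divisible": dim_f % factor == 0 and dim_t % factor == 0,
        "bottleneck": (dim_f // factor, dim_t // factor),
        "chunk_seconds": hop_length * (dim_t - 1) / sample_rate,
    }


# Hypothetical values: same input size, different depth.
shallow = check_shapes(dim_f=2048, dim_t=256, num_blocks=9, hop_length=1024)
deep = check_shapes(dim_f=2048, dim_t=256, num_blocks=11, hop_length=1024)
print(shallow["bottleneck"])  # (128, 16)
print(deep["bottleneck"])     # (64, 8)
```

If this reading is right, changing num_blocks changes every layer shape below the first stage, so the model would need to be retrained from scratch with the new value, and dim_t would need to stay divisible by the new downsampling factor.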

When I change num_blocks from 9 to 11, the inference results have overlapping and broken voices... Do you have any parameter recommendations for training a model that covers the **full 44100 Hz** range **without loss of accuracy** (i.e. without the model shrinking)?
