Skip to content

Genrated song is very close to the song before conversion. #208

@STuoX

Description

@STuoX

I successfully trained the model using this code. Now the problem is that after inference, the generated song vocals do not closely resemble the desired singer. A song originally sung by a female can be successfully converted to a male voice, but the differences between different male voices are not significant. Additionally, a song originally sung by a male is not easily converted to a female voice. I trained the model 430 epochs with the following parameters. Should I increase learning rate?

train:
model: "sovits"
seed: 1234
epochs: 430
learning_rate: 5e-5
betas: [0.8, 0.99]
lr_decay: 0.999875
eps: 1e-9
batch_size: 6
accum_step: 2
c_stft: 9
c_mel: 1.
c_kl: 0.2
port: 8001
pretrain: "./vits_pretrain/sovits5.0.pretrain.pth"
#############################
data:
training_files: "files/train.txt"
validation_files: "files/valid.txt"
segment_size: 8000 # WARNING: base on hop_length
max_wav_value: 32768.0
sampling_rate: 32000
filter_length: 1024
hop_length: 320
win_length: 1024
mel_channels: 100
mel_fmin: 50.0
mel_fmax: 16000.0
#############################
vits:
ppg_dim: 1280
vec_dim: 256
spk_dim: 256
gin_channels: 256
inter_channels: 192
hidden_channels: 192
filter_channels: 640
#############################
gen:
upsample_input: 192
upsample_rates: [5,4,4,2,2]
upsample_kernel_sizes: [15,8,8,4,4]
upsample_initial_channel: 320
resblock_kernel_sizes: [3,7,11]
resblock_dilation_sizes: [[1,3,5], [1,3,5], [1,3,5]]
#############################
mpd:
periods: [2,3,5,7,11]
kernel_size: 5
stride: 3
use_spectral_norm: False
lReLU_slope: 0.2
#############################
mrd:
resolutions: "[(1024, 120, 600), (2048, 240, 1200), (4096, 480, 2400), (512, 50, 240)]" # (filter_length, hop_length, win_length)
use_spectral_norm: False
lReLU_slope: 0.2
#############################
log:
info_interval: 200
eval_interval: 100
save_interval: 10
num_audio: 6
pth_dir: 'chkpt'
log_dir: 'logs'
keep_ckpts: 0
#############################
dist_config:
dist_backend: "nccl"
dist_url: "tcp://localhost:54321"
world_size: 1

Image

](url)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions