👇 You can join my Discord server below ( RVC / AI audio friendly ) 👇
👆 To stay up to date with advancements, hang out or get support 👆
You could say.. a more advanced, feature-rich Applio ~ with my lil twist.
If you have any ideas, want to open a PR or collaborate, feel free to do so!
1. Datasets must be processed properly:
- Peak or RMS compression if necessary! ( This step isn't covered by the fork's preprocessing, btw. )
- Silence truncation ( Absolutely necessary. )
- 'simple' method chosen for preprocessing ( Even 3 sec segments. )
- Enable Loudness Normalization in the UI.
- Enable the automatic LUFS range finder for Loudness Normalization.
Expect issues with PESQ and data alignment if the above requirements are not met.
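To make the "even 3 sec segments" and level-checking ideas a bit more concrete, here is a minimal sketch in plain Python. It is NOT the fork's preprocessing code: the function names `split_even_segments` and `rms_dbfs` are made up for illustration, dropping the short tail is an assumption, and RMS in dBFS is only a rough stand-in for a real LUFS measurement.

```python
import math


def split_even_segments(samples, sample_rate, segment_seconds=3.0):
    """Cut a mono signal into equal-length segments.

    Assumption for this sketch: a leftover tail shorter than one
    segment is dropped, so every returned segment has the same length.
    """
    seg_len = int(sample_rate * segment_seconds)
    if seg_len <= 0:
        raise ValueError("segment length must be positive")
    count = len(samples) // seg_len
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(count)]


def rms_dbfs(samples):
    """RMS level in dBFS for float samples in the range [-1.0, 1.0].

    Note: this is plain RMS, not LUFS. It is only a quick sanity check
    for spotting segments that are far quieter or louder than the rest.
    """
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square <= 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)
```

For example, `split_even_segments(audio, 48000)` would return a list of 144000-sample chunks, and comparing `rms_dbfs` across those chunks gives a rough idea of whether compression is needed before training.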
2. Experimental things are experimental for a reason:
- If you don't understand what it does, what it brings or how it works, preferably don't use it.
- Certain features / currently chosen params can be unstable or broken and are subject to change.
- Not all experimental features are going to reach "stable" status ( There's only so much I can test / ablate on my own. )
- Some experimental things might disappear at some point if deemed too unstable / not worth it.
3. Clarification on pretrained models, architectures & vocoders:
- Each architecture / vocoder requires its own dedicated pretrains.
- The original architecture ( HiFi-GAN + MPD, MSD ):
  - Its pretrained models are auto-downloaded during the first launch.
  - Available for sample rates: 48, 40 and 32 kHz.
  - Models made with this arch are cross-compatible: RVC, Applio and codename-rvc-fork-4.
- Custom architecture ( MRF-HiFi-GAN / RefineGAN + MPD, MSD ):
  - Not 100% sure on the status of pretrains yet. Once I get more info, I will update this entry.
  - Models made with this arch have LIMITED cross-compatibility: codename-rvc-fork-4 and Applio.
- Custom architecture ( RingFormer + MPD, MSD, MRD ):
  - There are no available pretrained models for it yet.
  - Planned supported sample rates: 48 kHz ( and maybe 24 kHz, but that's up to dr87 ).
  - Models made with this arch ARE NOT cross-compatible: they work in codename-rvc-fork-4 only.
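The compatibility notes above can be summed up as a small lookup. This is just an illustrative sketch built from the list above, not something the fork ships: `COMPATIBILITY` and `can_load` are hypothetical names, and "LIMITED" compatibility is treated here as plain "listed", which glosses over whatever the limits are.

```python
# Which apps can use a model trained with a given vocoder,
# as stated in the notes above. Names are for illustration only.
COMPATIBILITY = {
    "HiFi-GAN": frozenset({"RVC", "Applio", "codename-rvc-fork-4"}),
    # Listed as LIMITED cross-compatibility in the notes above.
    "MRF-HiFi-GAN": frozenset({"Applio", "codename-rvc-fork-4"}),
    "RefineGAN": frozenset({"Applio", "codename-rvc-fork-4"}),
    # Not cross-compatible: works in this fork only.
    "RingFormer": frozenset({"codename-rvc-fork-4"}),
}


def can_load(vocoder, app):
    """Return True if `app` is listed as able to use `vocoder` models."""
    return app in COMPATIBILITY.get(vocoder, frozenset())
```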
- Hold-out type validation mechanism during training ( L1 mel, mrSTFT, PESQ, SI-SDR ), run in between epochs.
- BF16-AMP, TF32 and FP32 training modes available.
  BF16 & TF32 require Ampere or newer GPUs.
  BF16 and TF32 can be used simultaneously for extra speed gains.
  NOTE: BF16 is used by default ( along with bf16-AdamW ). If unsupported hardware is detected, training switches back to FP32. Inference is FP32 only.
- Support for the 'Spin' embedder.
- Ability to choose an optimizer.
  ( Supported: AdamW, AdamW_BF16, RAdam, Ranger21, DiffGrad, Prodigy )
- (EXP) Double-update strategy for the Discriminator.
- Support for custom input samples used during training for a live preview of the model's reconstruction performance.
- Mel spectrogram %-based similarity metric.
- Support for multi-scale, classic L1 mel and (EXP) multi-resolution STFT spectral losses.
- Support for the following vocoders: HiFi-GAN, MRF-HiFi-GAN, RefineGAN, RingFormer.
  The RingFormer architecture consists of: RingFormer ( Conformer + RingAttention, snake activations ) + MPD, MSD, MRD discs.
- Checkpointing and various speed / memory optimizations compared to og RVC.
- New logging mechanism for losses: the average loss per epoch is logged as the standard loss, and a rolling average loss over 50 steps is logged to evaluate general trends and the model's performance over time.
- From-UI quick tweaks: lr for g/d, schedulers, linear warmup, KL loss annealing and more.
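For the loss logging idea in the list above ( per-epoch average plus a rolling average over 50 steps ), here is a minimal sketch of how such a tracker could look. This is an assumption-laden illustration, not the fork's actual logging code: the `LossTracker` class and its method names are made up.

```python
from collections import deque


class LossTracker:
    """Tracks a rolling average over the last N steps and a per-epoch average."""

    def __init__(self, window=50):
        # deque with maxlen drops the oldest value automatically.
        self.recent = deque(maxlen=window)
        self.epoch_sum = 0.0
        self.epoch_steps = 0

    def update(self, loss):
        """Record one step's loss and return the current rolling average."""
        self.recent.append(loss)
        self.epoch_sum += loss
        self.epoch_steps += 1
        return sum(self.recent) / len(self.recent)

    def end_epoch(self):
        """Return the average loss for the epoch and reset the epoch counters.

        The rolling window is kept on purpose so the trend line
        carries over across epoch boundaries.
        """
        if self.epoch_steps == 0:
            raise ValueError("no steps recorded in this epoch")
        average = self.epoch_sum / self.epoch_steps
        self.epoch_sum = 0.0
        self.epoch_steps = 0
        return average
```

The idea is that the per-epoch number is the "standard" loss to compare between epochs, while the rolling value smooths out step-to-step noise so general trends are easier to see.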
✨ to-do list ✨
- None
💡 Ideas / concepts 💡
- Currently none. Open to your ideas ~
Run the installation script:
- Double-click run-install.bat.
Start Applio using:
- Double-click run-fork.bat.
This launches the Gradio interface in your default browser.
To monitor training or visualize data:
- Run the "run_tensorboard_in_model_folder.bat" file from the logs folder and paste in the path to your model's folder ( the one containing the 'eval' folder or tfevents file/s ).
If it doesn't work for you due to a blocked port, open CMD with admin rights and use this command:
netsh advfirewall firewall add rule name="Open Port 25565" dir=in action=allow protocol=TCP localport=25565
- Alternatively, if the above method fails, run TensorBoard manually in cmd:
tensorboard --logdir="path/to/your/model/folder" --bind_all
( PS. Make sure you have TensorBoard installed. In cmd: pip install tensorboard )
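If TensorBoard opens but shows nothing, the path is usually the culprit. Below is a small sketch for checking a folder before launching. It is not part of the fork: `looks_like_tensorboard_logdir` and `tensorboard_command` are hypothetical helper names, and the check assumes event files follow TensorBoard's usual `events.out.tfevents.*` naming.

```python
from pathlib import Path


def looks_like_tensorboard_logdir(folder):
    """Return True if the folder has an 'eval' subfolder or any tfevents file."""
    root = Path(folder)
    if not root.is_dir():
        return False
    if (root / "eval").is_dir():
        return True
    # Search recursively, since event files may sit in subfolders.
    return any(root.rglob("events.out.tfevents.*"))


def tensorboard_command(folder):
    """Build the same manual command as shown above, as an argument list."""
    return ["tensorboard", "--logdir", str(Path(folder)), "--bind_all"]
```

The list returned by `tensorboard_command` can be passed to `subprocess.run` once the folder check passes.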
The creators, maintainers, and contributors of the original Applio repository, as well as the creator of this fork (Codename;0), which is based on Applio, and the contributors of this fork, are not liable for any legal issues, damages, or consequences arising from the use of this repository or any content generated from it. By using this fork, you acknowledge and accept the following terms:
- The use of this fork is at your own risk.
- This repository is intended solely for educational and experimental purposes.
- Any misuse, including but not limited to illegal activities or violation of third-party rights, is not the responsibility of the original creators, contributors, or this fork’s maintainer.
- You willingly agree to comply with this repository's Terms of Use.