A complete reconstruction of EmiyaEngine using neural networks: an upsampling/restoration model suited to common lossy audio.
*(Comparison images: lossy AAC input vs. upsampled lossless FLAC output.)*
EmiyaEngineNN is a high-fidelity broadband audio upsampling model based on a modification
of BAE-Net.
Compared to the original design, the network capacity has been increased to roughly three times the original by significantly widening the FFT window (1576 -> 3072), changing the channel counts of the intermediate layers, and similar modifications.
This helps the model handle the more complex spectra of general lossy audio, rather than just VCTK speech.
On the engineering side, the design references kokoro: the STFT/iSTFT are built into the network, so the model computes end to end and avoids the alignment cost of separate pre- and post-processing.
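As an illustration of that idea, here is a minimal sketch of a wrapper that runs the STFT and iSTFT inside `forward`. The class name, hop size, and the `core` module are illustrative assumptions, not the actual EmiyaEngineNN code:

```python
import torch
import torch.nn as nn

class SpectralWrapper(nn.Module):
    """Waveform in, waveform out: the STFT/iSTFT live inside the graph."""
    def __init__(self, core: nn.Module, n_fft: int = 3072):
        super().__init__()
        self.core = core                       # spectral-domain network (assumed)
        self.n_fft = n_fft
        self.hop = n_fft // 4
        self.register_buffer("window", torch.hann_window(n_fft))

    def forward(self, wav: torch.Tensor) -> torch.Tensor:  # wav: (batch, samples)
        spec = torch.stft(wav, self.n_fft, self.hop,
                          window=self.window, return_complex=True)
        spec = self.core(spec)                 # enhance the complex spectrum
        return torch.istft(spec, self.n_fft, self.hop,
                           window=self.window, length=wav.shape[-1])

# Smoke test with an identity core: output is a near-perfect reconstruction.
out = SpectralWrapper(nn.Identity())(torch.randn(1, 32000))
```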
The environment uses Python 3.12 + PyTorch 2.7.1 + ONNX 1.18.0.
Prepare a directory named `dataset`, put the audio files into it, and run `train_aio.py` to begin training.
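The CLI below consumes an ONNX model, so the trained checkpoint has to be exported first. A minimal sketch of such an export follows, assuming a (batch, samples) waveform input; this is not the repository's actual export code, and note that ONNX opset 17 provides a native STFT operator but no iSTFT, so a real export may need to handle that step differently:

```python
import torch

def export_onnx(model: torch.nn.Module, path: str = "model.onnx") -> None:
    """Export a trained waveform-to-waveform model with a dynamic time axis."""
    model.eval()
    dummy = torch.randn(1, 32000)  # one second of 32 kHz audio (illustrative)
    torch.onnx.export(
        model, (dummy,), path,
        input_names=["waveform"], output_names=["restored"],
        dynamic_axes={"waveform": {1: "samples"}, "restored": {1: "samples"}},
        opset_version=17,
    )
```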
If you just want to hear the results, you can download the prebuilt binary from the Releases page.
It accepts common lossy audio formats as input (e.g., MP3, AAC, Opus); the output is always losslessly compressed FLAC.
```
zansei.exe model.onnx input.mp3 output.flac
```
Note that the program internally downsamples the audio to 32 kHz to discard the empty upper spectrum and improve output quality.
Feeding it lossless audio, or any input that carries information above this range, may therefore degrade the result.
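For reference, the preprocessing amounts to something like the following. This is a sketch using torchaudio, not the binary's actual internals:

```python
import torchaudio

TARGET_SR = 32000  # the model's working rate

waveform, sr = torchaudio.load("input.mp3")  # (channels, samples)
if sr != TARGET_SR:
    # Everything above 16 kHz is discarded; for lossy sources that band is
    # mostly empty anyway, and the model regenerates it.
    waveform = torchaudio.functional.resample(waveform, sr, TARGET_SR)
```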
Training used 226 stereo recordings randomly selected from a personal music library and ran for about 90 hours, interrupted once by a machine failure and restart.
At the last checkpoint, the MS-STFT weighted loss was about 8.1 and the discriminator loss about 0.33; other metrics were lost in the failure.
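For context, an MS-STFT loss compares magnitudes at several FFT resolutions. A generic sketch follows; the FFT sizes, weighting, and epsilon are illustrative, and the exact loss behind the 8.1 figure is not documented here:

```python
import torch
import torch.nn.functional as F

def ms_stft_loss(pred: torch.Tensor, target: torch.Tensor,
                 fft_sizes=(512, 1024, 2048)) -> torch.Tensor:
    """Spectral-convergence + log-magnitude L1, averaged over resolutions.
    pred/target: (batch, samples) waveforms."""
    total = pred.new_zeros(())
    for n_fft in fft_sizes:
        window = torch.hann_window(n_fft, device=pred.device)
        P = torch.stft(pred, n_fft, n_fft // 4, window=window,
                       return_complex=True).abs()
        T = torch.stft(target, n_fft, n_fft // 4, window=window,
                       return_complex=True).abs()
        sc = torch.linalg.norm(T - P) / torch.linalg.norm(T)  # spectral convergence
        mag = F.l1_loss(torch.log(P + 1e-7), torch.log(T + 1e-7))
        total = total + sc + mag
    return total / len(fft_sizes)
```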