@@ -35,11 +35,17 @@ The latest version of the source is available from
3535https://gitlab.xiph.org/xiph/rnnoise . The GitHub repository
3636is a convenience copy.
3737
38- == TRAINING ==
38+ == Training ==
39+
40+ The models distributed with RNNoise are now trained using only the publicly
41+ available datasets listed below and using the training procedure described
42+ here. Exact results will still depend on the exact mix of data used,
43+ on how long the training is performed and on the various random seeds involved.
3944
4045To train an RNNoise model, you need both clean speech data, and noise data.
4146Both need to be sampled at 48 kHz, in 16-bit PCM format (machine endian).
42- Clean speech data can be obtained from
47+ Clean speech data can be obtained from the datasets listed in the datasets.txt
48+ file, or by downloading the already-concatenated copy of those files from
4349https://media.xiph.org/rnnoise/data/tts_speech_48k.sw
4450For noise data, we suggest concatenating the 48 kHz noise data from DEMAND at
4551https://zenodo.org/records/1227121
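
The input format described above (48 kHz, 16-bit PCM, machine endian) can be
produced from WAV files with a short script. The following is only a sketch,
not part of RNNoise: the function name wav_to_raw_pcm and the file names are
hypothetical, and the mono check is an assumption that the text above does not
state. Opening the output in append mode means that calling it once per input
file concatenates the data.

```python
import array
import sys
import wave


def wav_to_raw_pcm(wav_path, raw_path):
    """Append the samples of one WAV file to a raw PCM file.

    Hypothetical helper: returns the number of samples written.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getframerate() != 48000:
            raise ValueError("expected 48 kHz input")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        # Assumption: mono input (not stated in the README text).
        if wav.getnchannels() != 1:
            raise ValueError("expected mono input")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    # WAV stores samples little-endian; swap on big-endian hosts so the
    # output is in the machine's native byte order.
    if sys.byteorder == "big":
        samples.byteswap()

    with open(raw_path, "ab") as out:  # append, i.e. concatenate
        samples.tofile(out)
    return len(samples)
```

For example, calling wav_to_raw_pcm("a.wav", "noise.sw") and then
wav_to_raw_pcm("b.wav", "noise.sw") would leave both recordings back to back
in noise.sw (all three file names are placeholders).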
@@ -78,12 +84,30 @@ concatenate the output to a single file.
7884Once the feature file is computed, you can start the training with:
7985% python3 train_rnnoise.py features.f32 output_directory
8086
81- The training will produce .pth files, e.g. rnnoise_200.pth
87+ Choose a number of epochs (using --epochs) that leads to about 75000 weight
88+ updates. The training will produce .pth files, e.g. rnnoise_50.pth .
8289The next step is to convert the model to C files using:
8390
84- % python3 dump_rnnoise_weights.py --quantize rnnoise_200.pth rnnoise_c
91+ % python3 dump_rnnoise_weights.py --quantize rnnoise_50.pth rnnoise_c
8592
8693which will produce the rnnoise_data.c and rnnoise_data.h files in the
8794rnnoise_c directory.
8895
8996Copy these files to src/ and then build RNNoise using the instructions above.
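
As a rough aid for the --epochs guidance above (about 75000 weight updates),
the arithmetic can be sketched as follows. This assumes one weight update per
batch and that a final partial batch still counts as an update; neither is
confirmed by the text, and the sequence count and batch size in the example
are made-up values, not the defaults of train_rnnoise.py.

```python
import math


def epochs_for_updates(num_sequences, batch_size, target_updates=75000):
    """Estimate the epoch count that gives roughly target_updates updates.

    Assumption: one weight update per batch, partial last batch included.
    """
    updates_per_epoch = math.ceil(num_sequences / batch_size)
    return max(1, round(target_updates / updates_per_epoch))


# Hypothetical numbers: 192000 training sequences, batch size 128.
# That is 1500 updates per epoch, so 75000 / 1500 = 50 epochs.
print(epochs_for_updates(192000, 128))  # prints 50
```

Check the actual batch size and number of sequences used by your training
run before relying on the result.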
97+
98+ For slightly better results, a trained model can be used to remove any noise
99+ from the "clean" training speech, before restarting the denoising process
100+ again (no need to do that more than once).
101+
102+ == Loadable Models ==
103+
104+ The model format has changed since v0.1.1. Models now use a binary
105+ "machine endian" format. To output a model in that format, build RNNoise
106+ with that model and use the dump_weights_blob executable to output a
107+ weights_blob.bin binary file. That file can then be used with the
108+ rnnoise_model_from_file() API call. Note that the model object MUST NOT
109+ be deleted while the RNNoise state is active and the file MUST NOT
110+ be closed.
111+
112+ To avoid including the default model in the build (e.g. to reduce download
113+ size) and rely only on model loading, add -DUSE_WEIGHTS_FILE to the CFLAGS.