Commit 31a4a23

Updated README
1 parent c7845a3 commit 31a4a23

1 file changed, 28 additions and 4 deletions
README

Lines changed: 28 additions & 4 deletions
@@ -35,11 +35,17 @@ The latest version of the source is available from
 https://gitlab.xiph.org/xiph/rnnoise . The GitHub repository
 is a convenience copy.
 
-== TRAINING ==
+== Training ==
+
+The models distributed with RNNoise are now trained using only the publicly
+available datasets listed below and using the training procedure described
+here. Exact results will still depend on the exact mix of data used,
+on how long the training is performed and on the various random seeds involved.
 
 To train an RNNoise model, you need both clean speech data, and noise data.
 Both need to be sampled at 48 kHz, in 16-bit PCM format (machine endian).
-Clean speech data can be obtained from
+Clean speech data can be obtained from the datasets listed in the datasets.txt
+file, or by downloading the already-concatenated version of those files from
 https://media.xiph.org/rnnoise/data/tts_speech_48k.sw
 For noise data, we suggest concatenating the 48 kHz noise data from DEMAND at
 https://zenodo.org/records/1227121
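
The data format required in this hunk (48 kHz, 16-bit PCM, machine endian) is headerless, so concatenating recordings amounts to appending their bytes. The following is a minimal sketch of that idea and is not part of the RNNoise tooling. The helper names, the file names and the test tone are hypothetical, and mono audio is an assumption that the README text shown here does not state.

```python
import array
import math
import os
import tempfile

SAMPLE_RATE = 48000  # Hz, the rate the README asks for


def write_raw_pcm(path, samples):
    """Write integer samples as headerless 16-bit PCM in native byte order."""
    pcm = array.array("h", samples)  # "h" = signed 16-bit, machine endian
    with open(path, "wb") as f:
        pcm.tofile(f)


def concatenate_raw(paths, out_path):
    """Concatenate headerless PCM files by appending their bytes in order."""
    with open(out_path, "wb") as out:
        for p in paths:
            with open(p, "rb") as f:
                out.write(f.read())


def duration_seconds(path):
    """Duration implied by the file size, assuming mono 16-bit samples."""
    return os.path.getsize(path) / (2 * SAMPLE_RATE)


def demo():
    """Write two one-second test tones, join them, and return the duration."""
    tone = [int(10000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
            for n in range(SAMPLE_RATE)]
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a.sw")  # hypothetical file names
        b = os.path.join(tmp, "b.sw")
        joined = os.path.join(tmp, "joined.sw")
        write_raw_pcm(a, tone)
        write_raw_pcm(b, tone)
        concatenate_raw([a, b], joined)
        return duration_seconds(joined)
```

Files that carry a container header (for example WAV) would need to be decoded to raw samples first; appending them byte for byte would embed the headers in the audio.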
@@ -78,12 +84,30 @@ concatenate the output to a single file.
 Once the feature file is computed, you can start the training with:
 % python3 train_rnnoise.py features.f32 output_directory
 
-The training will produce .pth files, e.g. rnnoise_200.pth
+Choose a number of epochs (using --epochs) that leads to about 75000 weight
+updates. The training will produce .pth files, e.g. rnnoise_50.pth .
 The next step is to convert the model to C files using:
 
-% python3 dump_rnnoise_weights.py --quantize rnnoise_200.pth rnnoise_c
+% python3 dump_rnnoise_weights.py --quantize rnnoise_50.pth rnnoise_c
 
 which will produce the rnnoise_data.c and rnnoise_data.h files in the
 rnnoise_c directory.
 
 Copy these files to src/ and then build RNNoise using the instructions above.
+
+For slightly better results, a trained model can be used to remove any noise
+from the "clean" training speech, before restarting the denoising process
+again (no need to do that more than once).
+
+== Loadable Models ==
+
+The model format has changed since v0.1.1. Models now use a binary
+"machine endian" format. To output a model in that format, build RNNoise
+with that model and use the dump_weights_blob executable to output a
+weights_blob.bin binary file. That file can then be used with the
+rnnoise_model_from_file() API call. Note that the model object MUST NOT
+be deleted while the RNNoise state is active and the file MUST NOT
+be closed.
+
+To avoid including the default model in the build (e.g. to reduce download
+size) and rely only on model loading, add -DUSE_WEIGHTS_FILE to the CFLAGS.
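
The training note in this hunk asks for a number of epochs that gives about 75000 weight updates. A rough way to pick the --epochs value is sketched below. It assumes one weight update per batch and that a final partial batch still counts as an update; neither detail is confirmed by the README. The sequence count and batch size are hypothetical inputs that depend on your feature file and training settings.

```python
import math

TARGET_UPDATES = 75000  # figure suggested in the README


def updates_per_epoch(num_sequences, batch_size):
    """Weight updates in one epoch, assuming one update per batch and
    that a final partial batch is still used (an assumption)."""
    return math.ceil(num_sequences / batch_size)


def epochs_for_target(num_sequences, batch_size, target=TARGET_UPDATES):
    """Epoch count whose total number of updates is closest to the target."""
    per_epoch = updates_per_epoch(num_sequences, batch_size)
    return max(1, round(target / per_epoch))


if __name__ == "__main__":
    # Hypothetical example: 200000 training sequences, batch size 128.
    print(epochs_for_target(200000, 128))
```

With these hypothetical numbers, each epoch performs 1563 updates, so 48 epochs give 75024 updates in total. The resulting model file name would then follow the epoch count, in the same way as the rnnoise_50.pth example in the diff.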
