
Commit eeed4f9

Prevent viewers from getting confused by the README
1 parent b978cc6 commit eeed4f9

File tree

1 file changed (+6, -6 lines)


README.MD

Lines changed: 6 additions & 6 deletions
@@ -1,4 +1,4 @@
-# Real time monaural source separation base on fully convolutional neural network operates on Time-frequency domain
+# Real-time monaural source separation based on a fully convolutional neural network operating in the time-frequency domain
 An AI source separator written in C running a U-Net model trained by Deezer; it separates your audio input into drums, bass, accompaniment and vocals/speech with the Spleeter model.
 
 ## Network overview
@@ -12,7 +12,7 @@ Batch normalization and activation is followed by the output of each convolution
 
 The decoder uses transposed convolutions with stride = 2 for upsampling, and their inputs are concatenated with the output of each encoder Conv2D pair.
 
-Worth notice, batch normalization and activation isn't the output of each encoder layers we are going to concatenate. The decoder side concatenates just the convolution output of the layers of an encoder.
+Note that the skip connections do not carry the batch-normalized, activated output of each encoder layer; the decoder concatenates only the raw convolution output of each encoder layer.
 
 ## Real time system design
 Deep learning inference is all about GEMM, so we have to implement an im2col() function with stride, padding and dilation that can handle TensorFlow-style or even PyTorch-style convolutional layers.
@@ -25,7 +25,7 @@ I don't plan to use libtensorflow, I'll explain why.
 
 Deep learning functions in the existing code: im2col(), col2im(), gemm(), conv_out_dim(), transpconv_out_dim()
 
-We have to initialize a buck of memory and spawn some threads before processing begins, we allow developers to adjust the number of frequency bins and time frames for the neural network to inference, the __official__ Spleeter set FFTLength = 4096, Flim = 1024 and T = 512 for default CNN input, then the neural network will predict mask up to 11kHz and take about 11 secs.
+We have to initialize a chunk of memory and spawn some threads before processing begins. Developers can adjust the number of frequency bins and time frames the neural network runs inference on; the __official__ Spleeter configuration sets FFTLength = 4096, Flim = 1024 and T = 512 for the default CNN input, so the network predicts a mask up to 11kHz and takes about 10 secs.
 
 This means the real-world latency of the default setting with the __official__ model will cost you 11 secs plus the overlap-add sample latency; no matter how fast your CPU gets, the sample latency is intrinsic.

@@ -76,7 +76,7 @@ We got 4 sources to demix, we run 4 CNN in parallel, each convolutional layer ge
 
 ## System Requirements and Installation
 
 Currently, the UI is implemented with JUCE and no parameters can be adjusted.
 
-Any compilable audio plugin host or the standalone program will run the program.
+Any audio plugin host for which the JUCE project can be compiled will run the program.
 
 The Win32 API is used to find the user profile directory in order to fread() the deep learning model.

@@ -100,7 +100,7 @@ You need to write a Python program, you will going to split the checkpoint of 4
 
 2. The audio processor is so slow, slower than the Python version on the same hardware.
 
-A: Not really, the plugin isn't like __official__ Spleeter, we can't do everything in offline, there's a big no to write a real-time signal processor that run in offline mode.
+A: Not really. The plugin isn't like the __official__ Spleeter; we can't do everything offline, and a real-time signal processor should never run in offline mode. Online separation is what gives this repository meaning.
 
 The audio processor's buffering system costs extra overhead compared to the offline Python program.

@@ -112,6 +112,6 @@ Different audio plugin host or streaming system have different buffer size, the
 Aside from the project's main components being GPL-licensed, I don't know much about Intel MKL's license.
 
 ## Credit
-Deezer, of source, this processor won't happen without their great model.
+Deezer, of course; this repository wouldn't exist without their great model.
 
 Intel MKL; without MKL, the convolution operations run 40x slower.
