This repository was archived by the owner on Dec 21, 2023. It is now read-only.
userguide/sound_classifier/advanced-usage.md
+2 -2 (2 additions & 2 deletions)
@@ -35,11 +35,11 @@ This allows us to perform 5-fold cross validation more than twice as fast. For l


 #### Tune Custom Neural Network Configuration
-The [custom neural network](how-it-works.html#custom-neural-network-stage) used by the Sound Classifier is made up of a series of dense layers. Using more layers or more units in each layer can have a significant affect on accuracy. This will also affect the size of your model.
+The [custom neural network](how-it-works.html#custom-neural-network-stage) used by the Sound Classifier is made up of a series of dense layers. Using more layers or more units in each layer can have a significant effect on accuracy. This will also affect the size of your model.

 The `custom_layer_sizes` parameter allows you specify how many layers and the number of units in each layer. The default values for this parameter is `[100, 100]` which corresponds to two dense layers with a 100 units each.

-Using a smaller number of overall units will result in a smaller model, potentially with minimal affects on accuracy. If you have a large amount of training data, you should get better accuracy by using more layers and/or more units.
+Using a smaller number of overall units will result in a smaller model, potentially with minimal effects on accuracy. If you have a large amount of training data, you should get better accuracy by using more layers and/or more units.

 The code below tries several different neural network configurations and reports the validation accuracy for each one:
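
The guide's own example sits outside this hunk. As a separate, rough illustration of the size trade-off described above, the sketch below counts the weights and biases in a stack of dense layers. It assumes the 12,288-value VGGish feature vector from how-it-works.md as the input and a hypothetical 10-class output layer. `dense_param_count` is an illustrative helper, not part of the Sound Classifier API, and the real model's size on disk may differ.

```python
def dense_param_count(input_size, layer_sizes, num_classes):
    """Count weights and biases in a fully connected stack (illustrative helper)."""
    total = 0
    fan_in = input_size
    # Hidden layers first, then the output layer with one unit per class.
    for units in list(layer_sizes) + [num_classes]:
        total += fan_in * units + units  # weight matrix plus one bias per unit
        fan_in = units
    return total


FEATURE_LENGTH = 12288  # VGGish feature vector length stated in how-it-works.md
NUM_CLASSES = 10        # assumption: a hypothetical 10-class problem

# Compare the default [100, 100] against a shallower and a narrower layout.
for sizes in ([100, 100], [100], [50, 50, 50]):
    print(sizes, dense_param_count(FEATURE_LENGTH, sizes, NUM_CLASSES))
```

Under these assumptions the first dense layer accounts for nearly all of the parameters, because it connects every one of the 12,288 inputs to each of its units. Reducing the first entry of `custom_layer_sizes` is therefore likely to shrink the model far more than removing a later layer.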
userguide/sound_classifier/how-it-works.md
+8 -6 (8 additions & 6 deletions)
@@ -1,10 +1,12 @@
-# How Does this Work?
+# How Does This Work?

 Training and making predictions for a sound classifier model is a three
 stage process:
-1 - Signal preprocessing
-2 - A pretrained neural network is used to extract deep features
-3 - A custom neural network is used to make the predictions
+
+1. Signal preprocessing
+2. A pretrained neural network is used to extract deep features
+3. A custom neural network is used to make the predictions
+
 Details below about each stage.

 ## Signal Preprocessing Pipeline Stage
@@ -31,7 +33,7 @@ input length depends on sample rate) and produces an array of shape

 ## VGGish Feature Extraction Stage
 VGGish is a pretrained [Convolutional Neural Network](https://en.wikipedia.org/wiki/Convolutional_neural_network) from Google,
-see [their paper](https://ai.google/research/pubs/pub45611) and [their GitHub page](https://github.com/tensorflow/models/tree/master/research/audioset) for more details. As the name suggests, the architecture of
+see [their paper](https://ai.google/research/pubs/pub45611) and [their GitHub page](https://github.com/tensorflow/models/tree/master/research/audioset/vggish) for more details. As the name suggests, the architecture of
 this network is inspired by the famous VGG networks used for image
 classification. The network consists of a series of convolution and
 activation layers, optionally followed by a max pooling layer.
@@ -42,7 +44,7 @@ last three layers of the original VGGish model. We use the widest
 layer, from the original network, as our input data for the final
 stage. This modified VGGish model outputs a double vector of length
 12,288. On non-Linux systems, the model has also been eight bit
-quantized, to reduce its size.
+quantized to reduce its size.

 ## Custom Neural Network Stage
 This is the only stage which is updated based on the input data.