report/report.md (9 additions, 7 deletions)
@@ -408,10 +408,11 @@ Neural Networks are trained through numerical optimization of an objective funct
 For supervised learning the standard method is mini-batch Gradient Descent with Backpropagation.
 
 For classification the cross-entropy (log loss) function is often applied.
-As predicted probability of the true class gets close to zero, the (negative) log-loss goes towards infinity.
-Figure \ref{fig:log-loss}
+As the predicted probability of the true class gets close to zero, the log-loss goes towards infinity.
+This penalizes wrong predictions heavily, see Figure \ref{fig:log-loss}.
+
+{ width=100% }
 
-`TODO: picture of loss in binary cross entropy`
 Categorical cross-entropy is an extension of binary cross-entropy to multiple classes.
 Other loss functions are Logistic Loss, Mean Squared Error and Mean Absolute Error.
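The behaviour of the log-loss near zero can be illustrated in a few lines of numpy. This is a hedged sketch, not code from the report; the `binary_cross_entropy` helper and the `eps` clipping are illustrative choices.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Negative log-likelihood of the true class, averaged over samples.

    Illustrative helper, not part of the report's codebase.
    """
    p = np.clip(p_pred, eps, 1 - eps)  # clip to avoid log(0)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Loss grows without bound as the predicted probability
# of the true class approaches zero
for p in [0.9, 0.5, 0.1, 0.001]:
    print(p, binary_cross_entropy(np.array([1.0]), np.array([p])))
```

Categorical cross-entropy follows the same pattern, summing `-log` of the predicted probability over the one-hot true class across multiple classes.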
@@ -428,10 +429,9 @@ This is computed as the partial derivative of the function.
 <!--
 MAYBE: mention momentum
 [@SaddlePointNeuralNetworks]
--->
-
 `TODO: image of 1-D loss landscape and Gradient Descent`
+-->
 
 The key to calculating the gradients in multi-layer neural networks
 is *backpropagation*[@BackpropagationNeuralNetworks].
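Gradient Descent on a 1-D loss landscape can be sketched as follows. The loss `f(w) = (w - 3)**2` and the learning rate are illustrative assumptions, not taken from the report; backpropagation generalizes this by computing the partial derivatives layer by layer via the chain rule.

```python
# Gradient descent on f(w) = (w - 3)**2, which has its minimum at w = 3.
# The gradient df/dw = 2*(w - 3) points uphill, so we step against it.
w = 0.0
learning_rate = 0.1  # illustrative choice
for step in range(100):
    grad = 2 * (w - 3)        # partial derivative of the loss w.r.t. w
    w -= learning_rate * grad  # step in the direction of steepest descent
print(w)  # converges close to 3.0
```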
@@ -754,7 +754,7 @@ It however adds compression artifacts, and is best avoided for machine learning
 Recordings can have multiple channels of audio but for machine learning on audio
 single-channel data (mono-aural) is still common.
 
-### Spectrograms
+### Spectrogram
 
 Sounds of interest often have characteristic patterns not just in time (temporal signature)
 but also in frequency content (spectral signature).
@@ -780,8 +780,10 @@ For speech a typical choice of window length is 20 ms.
 Similar frame lengths are often adopted for acoustic events.
 The STFT returns complex numbers describing phase and magnitude of each frequency bin.
 A spectrogram squares the absolute value of the magnitude, and discards the phase information.
+This is called a *linear spectrogram* or sometimes just spectrogram.
 The lack of phase information means that the spectrogram is not strictly invertible,
 though estimations exist[@GriffinLimSpectrogramInversion][@MCNNSpectrogramInversion].
+A linear spectrogram can be seen at the top of Figure \ref{fig:spectrograms}.
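The computation can be sketched with `scipy.signal.stft`. The sample rate, tone frequency and 20 ms window length below are illustrative assumptions, not parameters from the report.

```python
import numpy as np
from scipy.signal import stft

sr = 16000                       # sample rate in Hz (illustrative)
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 1000 * t)  # one second of a 1 kHz tone

# 20 ms analysis windows, as is typical for speech
f, times, Zxx = stft(x, fs=sr, nperseg=int(0.020 * sr))
S = np.abs(Zxx) ** 2             # squared magnitude: the linear spectrogram
print(f[np.argmax(S.mean(axis=1))])  # dominant frequency bin, near 1000 Hz
```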
### Mel-spectrogram
@@ -800,7 +802,7 @@ See Figure \ref{figure:filterbanks}.
 The Mel-scaled filter-bank is commonly used for audio classification. <!-- TODO: reference -->
 The spectrogram that results from applying a Mel-scale filter-bank is often called a Mel-spectrogram.
 
-`TODO: image of mel-spectrogram`
+{short-caption="Different spectrograms" width=100%}
 
 Mel-Frequency Cepstral Coefficients (MFCC) is a feature representation
 computed by performing a Discrete Cosine Transform (DCT) on a mel-spectrogram.
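The DCT step can be sketched as follows. This is a hedged sketch: the mel-spectrogram `M` is a random stand-in rather than real audio features, and the choice of 40 mel bands and 13 retained coefficients is illustrative.

```python
import numpy as np
from scipy.fft import dct

# Stand-in for a mel-spectrogram: 40 mel bands x 100 frames of
# strictly positive band energies (random, for illustration only).
rng = np.random.default_rng(0)
M = rng.random((40, 100)) + 1e-6

log_M = np.log(M)  # log compresses the dynamic range before the DCT
# DCT-II along the frequency axis; keep the first 13 coefficients (MFCCs)
mfcc = dct(log_M, type=2, axis=0, norm='ortho')[:13]
print(mfcc.shape)  # (13, 100): 13 coefficients per frame
```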