
Commit 7473435

Tweaky tweak
1 parent cd8368e commit 7473435

File tree

13 files changed: +125 −169 lines


TODO.md

Lines changed: 11 additions & 93 deletions
@@ -1,17 +1,8 @@
 
-Feedback needed
-
-- Results/Discussion/Conclusion
-
 ## TODO
 
 ### Final 1
 
-Reprod
-
-- Tag a branch for submitted thesis
-- Results are from git commit `b49efa5dde48f9fd72a32eff4c751d9d0c0de712`
-
 Materials
 
 - Materials: Add images of compared models
@@ -20,15 +11,19 @@ Background
 
 - Make missing images
 
-Report
-
-- Add short captions to figures, so List of Figures looks nice
+Checking
 
+- Do a spell checking pass
+- Do a grammar checking pass. LanguageTool + Grammarly
+- Do a figure/table captions pass.
+Do they explain the figure setup/contents OK?
+- Make sure final page is EVEN number
 
-### Draft 5
+### Final 2
 
 Results
 
+- Include grouped evaluation
 - Include error analysis
 - Plot performance of models relative to fold
 
@@ -39,102 +34,25 @@ Add Acknowledgements?
 - Marianna
 - John
 
-Final
-
-- Do a figure/table captions pass.
-Do they explain the figure setup/contents OK?
-Short captions for the table of contents.
-Remove dots at the end. 2.2, 2.6, 2.16 etc
-- Check references all valid
-- Do a spell checking pass
-- Do a grammar checking pass. LanguageTool + Grammarly
-- Fix Github link, to be to final branch
-- Make sure final page is EVEN number
-
 ### After report
 
-
 Dissemination
 
 - Image of overall project/system
 - Project image, title page
 - Record a demo video
-- Write a blogpost
 - Publish on Arxiv? cs.LG cs.SD eess.AS stat.ML
+- Write a blogpost
 
 Related
 
 - STM32AI: Test different FFT/mel sizes
 - STM32AI: Report/fix melspec preprocessing bug
 https://community.st.com/s/topic/0TO0X0000003iUqWAI/stm32-machine-learning-ai
-- Test USB audio input for classifying on device
+- Test USB audio input for systematic on-device testing of classification
 
 Experiment
 
-- MAYBE: Fix train and validation generators to be single-pass?
-
-Code quality
-
-- Add end2end tests
-- Check windowing functions, esp last frame and padding
-
-
-## Done
-
-- Investigated why MobileNets etc use much more RAM than SB-CNN.
-For SB-CNN (Conv2d->MaxPooling2d), X-CUBE-AI fuses in the MaxPooling op, and reduces RAM usage by the pooling factor (4-9x).
-For MobileNet this optimization breaks down, because pooling is not used.
-Layer 2 is then typically too large.
-Instead one can pre-scale down using strided convolutions.
-When done from layer 1, this brings RAM usage under control
-- Fixed CUDA issue with SB-CNN. Can run 5x train at same time with minibatch 100,
-however am still CPU bound and GPU utilization only 30%. Also small batches seem to perform worse.
-With 400 batches and 3 processes, GPU utilization only 20%
-- Tested SystemPerformance tool on STM32.
-Standalone tool works nicely, gives performance for entire network.
-Interactive profiler "Validation tool" did not work, STMCubeMX fails to communicate with firmware.
-Firmware seems to work fine, says "ready to receive host command".
-Validation tool seems to be only tool that can give per-layer inference times.
-- Test GPU training on GTX2060.
-20 seconds instead of 170 seconds per epoch on mobilenets. 8.5x speedup
-1 model only utilizing 33% of GPU power. Can theoretically run multiple models in parallell, for over 20x speedup
-Under 30 minutes per experiment on all 10 folds.
-However SB-CNN fails with cudaNN error.
-https://github.com/tensorflow/tensorflow/issues/24828
-https://github.com/keras-team/keras/issues/1538
-- STM32AI. Made tool for updating window functions.
-https://github.com/jonnor/emlearn/blob/master/examples/window-function.py
-- Test 16k30 SB-CNN model.
-No compression. 3168k MACC CNN. 200kB flash, 27kB RAM. 396-367 ms
-4 bit compression. 144kB flash, 398ms. Approx 8M MACCS/second
-- Ran FastGRNN example USPS dataset
-- Tested DenseNet for Urbansound8k
-- Sent email for info from dilated conv authors
-- Sent email for info from LD-CNN authors
-- Tested multiple-instance learning for Urbansound8k
-- Test Dilated CNN for Urbansound8k
-- Test a SB-CNN model for Urbansound8k
-- Test a trivial audio custom model with SMT32CubeAI.
-First crack detection.
-9000 MACC, 2 ms classifier. 8 frames, under 15 ms log-mel preprocessing.
-Approx 4M MACCS/second.
-- Test CNN performance on STM32AI. 80Mhz.
-float32 1024bin FFT,30mels,32frames log-mel preprocessing under 68ms.
-float32 517k MACC CNN classification 78ms. Approx 6M MACCS/second.
-- Trigger LED change to reflect model predictions
-- Check how the neural networks are implemented in STM32CubeAI
-- Tool for extracting MACC from Keras/TensorFlow model. `./experiments/speechcommands/featurecomplexity.py`
-* Tensorflow speechcommand, test to change from MFCC to mel-spec
-* Run Tensorflow speechcommand examples, check perf against published
-- Test standard models examples on STM32 devkits.
-AudioLoop has USB Audio out, useful for recording test data.
-ST BlueSensor Android app useful for testing.
-Built-in example also had BT audio out (but locked at 8kHz?)
-- Move project to dedicated git repo
-- Setup skeleton of report Latex/Markdown
-- Setup Travis CI
-- Installed STM32Cube AI toolchain, and build STM32 AI examples (HAR)
-- Make a shortlist of datasets to consider
-- Order STM32 devkits
+- Use multi-instance learning to get bigger batches and improve GPU utilization
 
 

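The RAM note in the removed Done list (X-CUBE-AI fusing Conv2d->MaxPooling2d, strided convolutions as an alternative) comes down to feature-map buffer sizes. A self-contained sanity check; the shapes and filter counts here are hypothetical illustrations, not the thesis models:

```python
# Rough arithmetic behind the Done-list note: with Conv2d->MaxPooling2d fused,
# only the pooled output needs to be buffered in RAM; a 3x3 convolution with
# stride 2 gives a similar reduction without a separate pooling op.

def feature_map_bytes(height, width, channels, dtype_bytes=4):
    # RAM to hold one layer's output feature map (float32 by default)
    return height * width * channels * dtype_bytes

full = feature_map_bytes(60, 41, 24)     # stride-1 conv output buffered as-is
pooled = feature_map_bytes(30, 20, 24)   # same conv with a fused 2x2 max-pool
strided = feature_map_bytes(30, 20, 24)  # or: 3x3 conv with stride 2

print(full // 1024, "kB unfused")        # 230 kB
print(pooled // 1024, "kB fused/strided")  # 56 kB, roughly the 4x pooling factor
```

The 4-9x figure from the note corresponds to 2x2 and 3x3 pooling factors; MobileNet-style blocks without pooling keep the full-resolution buffer, which is why their layer-2 activations dominate RAM.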
experiments/ldcnn20k60.yaml

Lines changed: 1 addition & 0 deletions
@@ -14,3 +14,4 @@ val_samples: 5000
 learning_rate: 0.005
 voting: 'mean'
 voting_overlap: 0.0
+nesterov_momentum: 0.9

microesc/settings.py

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@
     val_samples=3000,
     augment=0,
     learning_rate=0.01,
+    nesterov_momentum=0.9,
 )
 
 default_model_settings = dict(

microesc/stm32convert.py

Lines changed: 2 additions & 13 deletions
@@ -2,7 +2,7 @@
 """
 Convert a Keras/Lasagne/Caffe model to C for STM32 microcontrollers using ST X-CUBE-AI
 
-Wrapped around the 'generatecode' tool used in STM32CubeMX from the X-CUBE-AI addon
+Wrapper around the 'generatecode' tool used in STM32CubeMX from the X-CUBE-AI addon
 """
 
 import pathlib
@@ -91,13 +91,11 @@ def test_ram_use():
 
     ]
 
-
     for input, expected in examples:
         out = extract_ram_use(input)
-
         assert out == expected, out
 
-# TODO: also extract AI_NETWORK_DATA_ACTIVATIONS_SIZE and AI_NETWORK_DATA_WEIGHTS_SIZE
+
 def extract_ram_use(str):
     regex = r"AI_ARRAY_OBJ_DECLARE\(([^)]*)\)"
     matches = re.finditer(regex, str, re.MULTILINE)
@@ -113,7 +111,6 @@ def extract_ram_use(str):
 
 
 def generatecode(model_path, out_path, name, model_type, compression):
-
     # Path to CLI tool
     home = str(pathlib.Path.home())
     version = os.environ.get('XCUBEAI_VERSION', '3.4.0')
@@ -135,7 +132,6 @@ def generatecode(model_path, out_path, name, model_type, compression):
     with open(config_path, 'w') as f:
        f.write(config)
 
-
     # Run generatecode
     args = [
        cmd_path,
@@ -144,8 +140,6 @@ def generatecode(model_path, out_path, name, model_type, compression):
     ]
     stdout = subprocess.check_output(args, stderr=subprocess.STDOUT)
 
-    # TODO: detect NOT IMPLEMENTED
-
     # Parse MACCs / params from stdout
     stats = extract_stats(stdout)
     assert len(stats.keys()), 'No model output. Stdout: {}'.format(stdout)
@@ -185,7 +179,6 @@ def main():
 
     test_ram_use()
 
-
     stats = generatecode(args.model, args.out,
            name=args.name,
            model_type=args.type,
@@ -196,7 +189,3 @@ def main():
 if __name__ == '__main__':
     main()
 
-
-
-
-
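The `extract_ram_use` function kept in this diff scans the generated C header for `AI_ARRAY_OBJ_DECLARE(...)` declarations with a regex. A minimal standalone sketch of that approach; the sample header text and the field-splitting are assumptions for illustration, not the exact parser in `stm32convert.py`:

```python
import re

def declared_arrays(header_text):
    # Same regex as in stm32convert.py: capture everything between the parens
    regex = r"AI_ARRAY_OBJ_DECLARE\(([^)]*)\)"
    return [m.group(1) for m in re.finditer(regex, header_text, re.MULTILINE)]

# Hypothetical excerpt of an X-CUBE-AI generated header
sample = """
AI_ARRAY_OBJ_DECLARE(input_output_array, AI_ARRAY_FORMAT_FLOAT, NULL, NULL, 1920)
AI_ARRAY_OBJ_DECLARE(conv1_output_array, AI_ARRAY_FORMAT_FLOAT, NULL, NULL, 57600)
"""

for decl in declared_arrays(sample):
    fields = [f.strip() for f in decl.split(',')]
    print(fields[0], fields[-1])  # array name, element count
```

Summing the element counts (times element size) is one way to estimate activation RAM, which is what the removed TODO about `AI_NETWORK_DATA_ACTIVATIONS_SIZE` hints at.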

microesc/train.py

Lines changed: 2 additions & 1 deletion
@@ -98,8 +98,9 @@ def train_model(out_dir, train, val, model,
     def top3(y_true, y_pred):
         return keras.metrics.top_k_categorical_accuracy(y_true, y_pred, k=3)
 
+    optimizer = keras.optimizers.SGD(lr=learning_rate, momentum=settings['nesterov_momentum'], nesterov=True)
     model.compile(loss='categorical_crossentropy',
-                  optimizer=keras.optimizers.SGD(lr=learning_rate, momentum=0.9, nesterov=True),
+                  optimizer=optimizer,
                   metrics=['accuracy'])
 
     model_path = os.path.join(out_dir, 'e{epoch:02d}-v{val_loss:.2f}.t{loss:.2f}.model.hdf5')
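The change above replaces the hardcoded `momentum=0.9` with a value read from the experiment settings, so each YAML experiment can override it. A sketch of that wiring with plain dicts standing in for the Keras objects (so it runs without TensorFlow); the helper name is illustrative:

```python
# Defaults mirror microesc/settings.py; any experiment YAML can override them
default_settings = dict(
    learning_rate=0.01,
    nesterov_momentum=0.9,
)

def sgd_kwargs(settings):
    # The keyword arguments that would be passed to keras.optimizers.SGD
    return dict(
        lr=settings['learning_rate'],
        momentum=settings['nesterov_momentum'],
        nesterov=True,
    )

# e.g. experiments/ldcnn20k60.yaml overrides the learning rate
experiment = dict(default_settings, learning_rate=0.005)
print(sgd_kwargs(experiment))
```

Keeping the optimizer hyperparameters in settings also lets them flow into the generated experiment-settings table in the report, which is why `nesterov_momentum` appears in the pyincludes changes below.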

report/Makefile

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ includes: pyincludes/urbansound8k-classes.tex \
 
 
 report.pdf: report.md includes
-	pandoc --include-before-body=cover.latex --include-before-body=abstract.latex --include-after-body=end.latex --bibliography=references.bib -V papersize:a4 -V geometry:margin=1.0in -V fontsize=12pt -H preamble.tex --csl ieee.csl --toc -Vlof -Vlot --pdf-engine-opt=-shell-escape --number-sections -s report.md -o report.pdf
+	pandoc --include-before-body=cover.latex --include-before-body=abstract.latex --include-after-body=end.latex --bibliography=references.bib -V papersize:a4 -V geometry:margin=1.0in -V fontsize=12pt -H preamble.tex --csl ieee.csl --toc -Vlof -Vlot --pdf-engine-opt=-shell-escape --number-sections --lua-filter=short-captions.lua -s report.md -o report.pdf
 
 status.pdf: status.md
 	pandoc -t beamer -s status.md -o status.pdf --slide-level=2 --mathml

report/no-figure-floats.tex

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+\usepackage{float}
+\let\origfigure\figure
+\let\endorigfigure\endfigure
+\renewenvironment{figure}[1][2] {
+    \expandafter\origfigure\expandafter[H]
+} {
+    \endorigfigure
+}

report/pyincludes/experiment-settings.py

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@
     'train_samples': 'Training samples/epoch',
     'val_samples': 'Validation samples/epoch',
     'learning_rate': 'Learning rate',
+    'nesterov_momentum': 'Nesterov momentum',
 }
 
 table = settings.loc[list(names.keys())]

report/pyincludes/experiment-settings.tex

Lines changed: 1 addition & 0 deletions
@@ -13,5 +13,6 @@
 Training samples/epoch & 30000 \\
 Validation samples/epoch & 5000 \\
 Learning rate & 0.005 \\
+Nesterov momentum & NaN \\
 \bottomrule
 \end{tabular}

report/pyincludes/urbansound8k-classes.py

Lines changed: 3 additions & 3 deletions
@@ -10,10 +10,10 @@
 
 table = pandas.DataFrame({
     'Samples': by_class.count()['classID'],
-    'Average duration': by_class.apply(lambda r: '%.2fs' % (r.end-r.start).mean()),
-    'In foreground': [ "{}%".format(int(100*r)) for r in foreground_ratio ]
+    'Duration (avg)': by_class.apply(lambda r: '%.2f s' % (r.end-r.start).mean()),
+    'In foreground': [ "{} %".format(int(100*r)) for r in foreground_ratio ]
 })
-out = table.to_latex(header=True, index=True)
+out = table.to_latex(header=True, index=True, column_format="lrrr")
 print(out)
 
 outpath = sys.argv[1]
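The formatting tweaks above insert a space before each unit (`'%.2fs'` becomes `'%.2f s'`, `'{}%'` becomes `'{} %'`) and fix the column alignment via `column_format="lrrr"` (left-aligned index, three right-aligned columns). A standalone illustration of the two format strings, with made-up values:

```python
# Hypothetical per-class mean duration and foreground ratio, for illustration only
mean_duration = 3.9714   # seconds
foreground_ratio = 0.75

old_style = '%.2fs' % mean_duration
new_style = '%.2f s' % mean_duration
print(old_style, '->', new_style)  # 3.97s -> 3.97 s

print("{} %".format(int(100 * foreground_ratio)))  # 75 %
```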
