@@ -1,17 +1,8 @@
 
-Feedback needed
-
-- Results/Discussion/Conclusion
-
 ## TODO
 
 ### Final 1
 
-Reproducibility
-
-- Tag a branch for the submitted thesis
-- Results are from git commit `b49efa5dde48f9fd72a32eff4c751d9d0c0de712`
-
 Materials
 
 - Materials: Add images of compared models
@@ -20,15 +11,19 @@ Background
 
 - Make missing images
 
-Report
-
-- Add short captions to figures, so the List of Figures looks nice
+Checking
 
+- Do a spell-checking pass
+- Do a grammar-checking pass. LanguageTool + Grammarly
+- Do a figure/table captions pass.
+  Do they explain the figure setup/contents OK?
+- Make sure the final page is an EVEN number
 
-### Draft 5
+### Final 2
 
 Results
 
+- Include grouped evaluation
 - Include error analysis
 - Plot performance of models relative to fold
 
@@ -39,102 +34,25 @@ Add Acknowledgements?
 - Marianna
 - John
 
-Final
-
-- Do a figure/table captions pass.
-  Do they explain the figure setup/contents OK?
-  Short captions for the table of contents.
-  Remove dots at the end: 2.2, 2.6, 2.16, etc.
-- Check that all references are valid
-- Do a spell-checking pass
-- Do a grammar-checking pass. LanguageTool + Grammarly
-- Fix the GitHub link to point to the final branch
-- Make sure the final page is an EVEN number
-
 ### After report
 
-
 Dissemination
 
 - Image of overall project/system
 - Project image, title page
 - Record a demo video
-- Write a blogpost
 - Publish on arXiv? cs.LG cs.SD eess.AS stat.ML
+- Write a blogpost
 
 Related
 
 - STM32AI: Test different FFT/mel sizes
 - STM32AI: Report/fix melspec preprocessing bug
   https://community.st.com/s/topic/0TO0X0000003iUqWAI/stm32-machine-learning-ai
-- Test USB audio input for classifying on device
+- Test USB audio input for systematic on-device testing of classification
 
 Experiment
 
-- MAYBE: Fix train and validation generators to be single-pass?
-
-Code quality
-
-- Add end-to-end tests
-- Check windowing functions, especially the last frame and padding
-
-
-## Done
-
-- Investigated why MobileNets etc. use much more RAM than SB-CNN.
-  For SB-CNN (Conv2D -> MaxPooling2D), X-CUBE-AI fuses in the MaxPooling op and reduces RAM usage by the pooling factor (4-9x).
-  For MobileNet this optimization breaks down, because pooling is not used.
-  Layer 2 is then typically too large.
-  Instead one can pre-scale down using strided convolutions.
-  When done from layer 1, this brings RAM usage under control.
-- Fixed CUDA issue with SB-CNN. Can run 5 training processes at the same time with minibatch 100;
-  however, still CPU-bound and GPU utilization only 30%. Also, small batches seem to perform worse.
-  With 400 batches and 3 processes, GPU utilization is only 20%.
-- Tested the SystemPerformance tool on STM32.
-  The standalone tool works nicely, gives performance for the entire network.
-  The interactive profiler "Validation tool" did not work; STM32CubeMX fails to communicate with the firmware.
-  The firmware seems to work fine, says "ready to receive host command".
-  The Validation tool seems to be the only tool that can give per-layer inference times.
-- Test GPU training on GTX2060.
-  20 seconds instead of 170 seconds per epoch on MobileNets: an 8.5x speedup.
-  One model only utilizes 33% of GPU power; multiple models could theoretically run in parallel, for over 20x speedup.
-  Under 30 minutes per experiment on all 10 folds.
-  However, SB-CNN fails with a cuDNN error.
-  https://github.com/tensorflow/tensorflow/issues/24828
-  https://github.com/keras-team/keras/issues/1538
-- STM32AI: Made a tool for updating window functions.
-  https://github.com/jonnor/emlearn/blob/master/examples/window-function.py
-- Test 16k30 SB-CNN model.
-  No compression: 3168k MACC CNN, 200 kB flash, 27 kB RAM, 396-367 ms.
-  4-bit compression: 144 kB flash, 398 ms. Approx 8M MACCs/second.
-- Ran the FastGRNN example on the USPS dataset
-- Tested DenseNet for Urbansound8k
-- Sent email asking for info from the dilated conv authors
-- Sent email asking for info from the LD-CNN authors
-- Tested multiple-instance learning for Urbansound8k
-- Test Dilated CNN for Urbansound8k
-- Test an SB-CNN model for Urbansound8k
-- Test a trivial custom audio model with STM32CubeAI.
-  First crack detection.
-  9000 MACC, 2 ms classifier. 8 frames, under 15 ms log-mel preprocessing.
-  Approx 4M MACCs/second.
-- Test CNN performance on STM32AI at 80 MHz.
-  float32, 1024-bin FFT, 30 mels, 32 frames: log-mel preprocessing under 68 ms.
-  float32, 517k MACC CNN classification: 78 ms. Approx 6M MACCs/second.
-- Trigger LED change to reflect model predictions
-- Check how the neural networks are implemented in STM32CubeAI
-- Tool for extracting MACCs from a Keras/TensorFlow model: `./experiments/speechcommands/featurecomplexity.py`
-- TensorFlow speechcommands: test changing from MFCC to mel-spectrogram
-- Run the TensorFlow speechcommands examples, check performance against published results
-- Test standard model examples on STM32 devkits.
-  AudioLoop has USB Audio out, useful for recording test data.
-  The ST BlueSensor Android app is useful for testing.
-  The built-in example also had BT audio out (but locked at 8 kHz?)
-- Move project to a dedicated git repo
-- Set up skeleton of report (LaTeX/Markdown)
-- Set up Travis CI
-- Installed the STM32Cube AI toolchain and built the STM32 AI examples (HAR)
-- Make a shortlist of datasets to consider
-- Order STM32 devkits
+- Use multi-instance learning to get bigger batches and improve GPU utilization
 
 
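The "Done" note on MobileNet RAM usage describes the trick in terms of the first layers: a Conv2D followed by MaxPooling2D lets X-CUBE-AI fuse the pooling so the full-resolution feature map never has to be stored, while a pooling-free network can instead downscale with a strided convolution from layer 1. Below is a minimal Keras sketch of the two first-layer options; the input shape (60 mel bands x 31 frames) and filter count are illustrative assumptions, not the models compared in the thesis.

```python
# Two ways to keep early-layer activation RAM down on a microcontroller.
# Shapes and filter counts are illustrative assumptions.
from tensorflow import keras
from tensorflow.keras import layers

def conv_pool_block(inputs):
    # Conv2D followed by MaxPooling2D: X-CUBE-AI can fuse the pooling into the
    # convolution, so the full-resolution feature map need not be kept in RAM.
    x = layers.Conv2D(24, (5, 5), padding='same', activation='relu')(inputs)
    return layers.MaxPooling2D((4, 2))(x)

def strided_conv_block(inputs):
    # Pooling-free (MobileNet-style) alternative: downscale directly with a
    # strided convolution, so the first feature map is already small.
    return layers.Conv2D(24, (5, 5), strides=(4, 2),
                         padding='same', activation='relu')(inputs)

inputs = keras.Input(shape=(60, 31, 1))   # mel bands x frames x channels
model_pool = keras.Model(inputs, conv_pool_block(inputs))
model_strided = keras.Model(inputs, strided_conv_block(inputs))

for name, m in [('conv+pool', model_pool), ('strided conv', model_strided)]:
    print(name, 'output shape:', m.output.shape)
```

Both variants hand the next layer a feature map of roughly the same reduced size; the difference is whether the reduction happens through a fused pooling op or through the convolution stride itself.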
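The MACC-extraction item refers to `./experiments/speechcommands/featurecomplexity.py`. The sketch below is not that script, only an illustration of the usual estimate (output positions times kernel volume for Conv2D, weight count for Dense); `estimate_maccs` and the stand-in model are hypothetical.

```python
# Rough MACC estimate for a Keras model; handles only plain Conv2D and Dense layers.
from tensorflow import keras
from tensorflow.keras import layers

def estimate_maccs(model):
    total = 0
    for layer in model.layers:
        if isinstance(layer, layers.Conv2D):
            # MACCs = output positions * kernel area * input channels * output channels
            out_h, out_w, out_c = layer.output.shape[1:4]
            k_h, k_w = layer.kernel_size
            in_c = layer.input.shape[-1]
            total += out_h * out_w * out_c * k_h * k_w * in_c
        elif isinstance(layer, layers.Dense):
            # MACCs = inputs * units (one multiply-accumulate per weight)
            total += layer.input.shape[-1] * layer.units
    return total

# Example with a small stand-in CNN
inputs = keras.Input(shape=(60, 31, 1))
x = layers.Conv2D(24, (5, 5), strides=(4, 2), padding='same', activation='relu')(inputs)
x = layers.Flatten()(x)
outputs = layers.Dense(10, activation='softmax')(x)
print('Approximate MACCs:', estimate_maccs(keras.Model(inputs, outputs)))
```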
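The window-function item links to emlearn's `examples/window-function.py`. That script may differ in its details; the sketch below only shows the general idea of generating a window (here a periodic Hann) and emitting it as a C array that the firmware's log-mel preprocessing could include. The function and array names are made up for the example.

```python
# Generate a periodic Hann window and print it as a C array for firmware use.
import numpy as np

def window_c_array(length=1024, name='hann_window'):
    n = np.arange(length)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / length)  # periodic Hann
    rows = [', '.join('%.8ff' % v for v in window[i:i + 8])
            for i in range(0, length, 8)]
    body = ',\n    '.join(rows)
    return 'static const float %s[%d] = {\n    %s\n};\n' % (name, length, body)

if __name__ == '__main__':
    print(window_c_array(64, 'hann_window_64'))
```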