
Commit 7473435

Tweaky tweak
1 parent cd8368e commit 7473435

File tree

13 files changed: +125 −169 lines


TODO.md

Lines changed: 11 additions & 93 deletions
@@ -1,17 +1,8 @@
 
-Feedback needed
-
-- Results/Discussion/Conclusion
-
 ## TODO
 
 ### Final 1
 
-Reprod
-
-- Tag a branch for submitted thesis
-- Results are from git commit `b49efa5dde48f9fd72a32eff4c751d9d0c0de712`
-
 Materials
 
 - Materials: Add images of compared models
@@ -20,15 +11,19 @@ Background
 
 - Make missing images
 
-Report
-
-- Add short captions to figures, so List of Figures looks nice
+Checking
 
+- Do a spell checking pass
+- Do a grammar checking pass. LanguageTool + Grammarly
+- Do a figure/table captions pass.
+Do they explain the figure setup/contents OK?
+- Make sure final page is EVEN number
 
-### Draft 5
+### Final 2
 
 Results
 
+- Include grouped evaluation
 - Include error analysis
 - Plot performance of models relative to fold
 
@@ -39,102 +34,25 @@ Add Acknowledgements?
 - Marianna
 - John
 
-Final
-
-- Do a figure/table captions pass.
-Do they explain the figure setup/contents OK?
-Short captions for the table of contents.
-Remove dots at the end. 2.2, 2.6, 2.16 etc
-- Check references all valid
-- Do a spell checking pass
-- Do a grammar checking pass. LanguageTool + Grammarly
-- Fix Github link, to be to final branch
-- Make sure final page is EVEN number
-
 ### After report
 
-
 Dissemination
 
 - Image of overall project/system
 - Project image, title page
 - Record a demo video
-- Write a blogpost
 - Publish on Arxiv? cs.LG cs.SD eess.AS stat.ML
+- Write a blogpost
 
 Related
 
 - STM32AI: Test different FFT/mel sizes
 - STM32AI: Report/fix melspec preprocessing bug
 https://community.st.com/s/topic/0TO0X0000003iUqWAI/stm32-machine-learning-ai
-- Test USB audio input for classifying on device
+- Test USB audio input for systematic on-device testing of classification
 
 Experiment
 
-- MAYBE: Fix train and validation generators to be single-pass?
-
-Code quality
-
-- Add end2end tests
-- Check windowing functions, esp last frame and padding
-
-
-## Done
-
-- Investigated why MobileNets etc use much more RAM than SB-CNN.
-For SB-CNN (Conv2d->MaxPooling2d), X-CUBE-AI fuses in the MaxPooling op, and reduces RAM usage by the pooling factor (4-9x).
-For MobileNet this optimization breaks down, because pooling is not used.
-Layer 2 is then typically too large.
-Instead one can pre-scale down using strided convolutions.
-When done from layer 1, this brings RAM usage under control
-- Fixed CUDA issue with SB-CNN. Can run 5x train at same time with minibatch 100,
-however am still CPU bound and GPU utilization only 30%. Also small batches seem to perform worse.
-With 400 batches and 3 processes, GPU utilization only 20%
-- Tested SystemPerformance tool on STM32.
-Standalone tool works nicely, gives performance for entire network.
-Interactive profiler "Validation tool" did not work, STMCubeMX fails to communicate with firmware.
-Firmware seems to work fine, says "ready to receive host command".
-Validation tool seems to be only tool that can give per-layer inference times.
-- Test GPU training on GTX2060.
-20 seconds instead of 170 seconds per epoch on mobilenets. 8.5x speedup
-1 model only utilizing 33% of GPU power. Can theoretically run multiple models in parallell, for over 20x speedup
-Under 30 minutes per experiment on all 10 folds.
-However SB-CNN fails with cudaNN error.
-https://github.com/tensorflow/tensorflow/issues/24828
-https://github.com/keras-team/keras/issues/1538
-- STM32AI. Made tool for updating window functions.
-https://github.com/jonnor/emlearn/blob/master/examples/window-function.py
-- Test 16k30 SB-CNN model.
-No compression. 3168k MACC CNN. 200kB flash, 27kB RAM. 396-367 ms
-4 bit compression. 144kB flash, 398ms. Approx 8M MACCS/second
-- Ran FastGRNN example USPS dataset
-- Tested DenseNet for Urbansound8k
-- Sent email for info from dilated conv authors
-- Sent email for info from LD-CNN authors
-- Tested multiple-instance learning for Urbansound8k
-- Test Dilated CNN for Urbansound8k
-- Test a SB-CNN model for Urbansound8k
-- Test a trivial audio custom model with SMT32CubeAI.
-First crack detection.
-9000 MACC, 2 ms classifier. 8 frames, under 15 ms log-mel preprocessing.
-Approx 4M MACCS/second.
-- Test CNN performance on STM32AI. 80Mhz.
-float32 1024bin FFT,30mels,32frames log-mel preprocessing under 68ms.
-float32 517k MACC CNN classification 78ms. Approx 6M MACCS/second.
-- Trigger LED change to reflect model predictions
-- Check how the neural networks are implemented in STM32CubeAI
-- Tool for extracting MACC from Keras/TensorFlow model. `./experiments/speechcommands/featurecomplexity.py`
-* Tensorflow speechcommand, test to change from MFCC to mel-spec
-* Run Tensorflow speechcommand examples, check perf against published
-- Test standard models examples on STM32 devkits.
-AudioLoop has USB Audio out, useful for recording test data.
-ST BlueSensor Android app useful for testing.
-Built-in example also had BT audio out (but locked at 8kHz?)
-- Move project to dedicated git repo
-- Setup skeleton of report Latex/Markdown
-- Setup Travis CI
-- Installed STM32Cube AI toolchain, and build STM32 AI examples (HAR)
-- Make a shortlist of datasets to consider
-- Order STM32 devkits
+- Use multi-instance learning to get bigger batches and improve GPU utilization
 
 

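The RAM note in the removed Done list (X-CUBE-AI fusing Conv2d->MaxPooling2d, strided convolutions as an alternative) comes down to feature-map buffer sizes. A self-contained sanity check; the shapes and filter counts here are hypothetical illustrations, not the thesis models:

```python
# Rough arithmetic behind the Done-list note: with Conv2d->MaxPooling2d fused,
# only the pooled output needs to be buffered in RAM; a 3x3 convolution with
# stride 2 gives a similar reduction without a separate pooling op.

def feature_map_bytes(height, width, channels, dtype_bytes=4):
    # RAM to hold one layer's output feature map (float32 by default)
    return height * width * channels * dtype_bytes

full = feature_map_bytes(60, 41, 24)     # stride-1 conv output buffered as-is
pooled = feature_map_bytes(30, 20, 24)   # same conv with a fused 2x2 max-pool
strided = feature_map_bytes(30, 20, 24)  # or: 3x3 conv with stride 2

print(full // 1024, "kB unfused")        # 230 kB
print(pooled // 1024, "kB fused/strided")  # 56 kB, roughly the 4x pooling factor
```

The 4-9x figure from the note corresponds to 2x2 and 3x3 pooling factors; MobileNet-style blocks without pooling keep the full-resolution buffer, which is why their layer-2 activations dominate RAM.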
experiments/ldcnn20k60.yaml

Lines changed: 1 addition & 0 deletions
@@ -14,3 +14,4 @@ val_samples: 5000
 learning_rate: 0.005
 voting: 'mean'
 voting_overlap: 0.0
+nesterov_momentum: 0.9

microesc/settings.py

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@
     val_samples=3000,
     augment=0,
     learning_rate=0.01,
+    nesterov_momentum=0.9,
 )
 
 default_model_settings = dict(

microesc/stm32convert.py

Lines changed: 2 additions & 13 deletions
@@ -2,7 +2,7 @@
 """
 Convert a Keras/Lasagne/Caffe model to C for STM32 microcontrollers using ST X-CUBE-AI
 
-Wrapped around the 'generatecode' tool used in STM32CubeMX from the X-CUBE-AI addon
+Wrapper around the 'generatecode' tool used in STM32CubeMX from the X-CUBE-AI addon
 """
 
 import pathlib
@@ -91,13 +91,11 @@ def test_ram_use():
 
     ]
 
-
     for input, expected in examples:
         out = extract_ram_use(input)
-
         assert out == expected, out
 
-# TODO: also extract AI_NETWORK_DATA_ACTIVATIONS_SIZE and AI_NETWORK_DATA_WEIGHTS_SIZE
+
 def extract_ram_use(str):
     regex = r"AI_ARRAY_OBJ_DECLARE\(([^)]*)\)"
     matches = re.finditer(regex, str, re.MULTILINE)
@@ -113,7 +111,6 @@ def extract_ram_use(str):
 
 
 def generatecode(model_path, out_path, name, model_type, compression):
-
     # Path to CLI tool
     home = str(pathlib.Path.home())
     version = os.environ.get('XCUBEAI_VERSION', '3.4.0')
@@ -135,7 +132,6 @@ def generatecode(model_path, out_path, name, model_type, compression):
     with open(config_path, 'w') as f:
        f.write(config)
 
-
     # Run generatecode
     args = [
        cmd_path,
@@ -144,8 +140,6 @@ def generatecode(model_path, out_path, name, model_type, compression):
     ]
     stdout = subprocess.check_output(args, stderr=subprocess.STDOUT)
 
-    # TODO: detect NOT IMPLEMENTED
-
     # Parse MACCs / params from stdout
     stats = extract_stats(stdout)
     assert len(stats.keys()), 'No model output. Stdout: {}'.format(stdout)
@@ -185,7 +179,6 @@ def main():
 
     test_ram_use()
 
-
     stats = generatecode(args.model, args.out,
            name=args.name,
            model_type=args.type,
@@ -196,7 +189,3 @@ def main():
 if __name__ == '__main__':
     main()
 
-
-
-
-
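The `extract_ram_use` function kept in this diff scans the generated C header for `AI_ARRAY_OBJ_DECLARE(...)` declarations with a regex. A minimal standalone sketch of that approach; the sample header text and the field-splitting are assumptions for illustration, not the exact parser in `stm32convert.py`:

```python
import re

def declared_arrays(header_text):
    # Same regex as in stm32convert.py: capture everything between the parens
    regex = r"AI_ARRAY_OBJ_DECLARE\(([^)]*)\)"
    return [m.group(1) for m in re.finditer(regex, header_text, re.MULTILINE)]

# Hypothetical excerpt of an X-CUBE-AI generated header
sample = """
AI_ARRAY_OBJ_DECLARE(input_output_array, AI_ARRAY_FORMAT_FLOAT, NULL, NULL, 1920)
AI_ARRAY_OBJ_DECLARE(conv1_output_array, AI_ARRAY_FORMAT_FLOAT, NULL, NULL, 57600)
"""

for decl in declared_arrays(sample):
    fields = [f.strip() for f in decl.split(',')]
    print(fields[0], fields[-1])  # array name, element count
```

Summing the element counts (times element size) is one way to estimate activation RAM, which is what the removed TODO about `AI_NETWORK_DATA_ACTIVATIONS_SIZE` hints at.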

microesc/train.py

Lines changed: 2 additions & 1 deletion
@@ -98,8 +98,9 @@ def train_model(out_dir, train, val, model,
     def top3(y_true, y_pred):
         return keras.metrics.top_k_categorical_accuracy(y_true, y_pred, k=3)
 
+    optimizer = keras.optimizers.SGD(lr=learning_rate, momentum=settings['nesterov_momentum'], nesterov=True)
     model.compile(loss='categorical_crossentropy',
-                  optimizer=keras.optimizers.SGD(lr=learning_rate, momentum=0.9, nesterov=True),
+                  optimizer=optimizer,
                   metrics=['accuracy'])
 
     model_path = os.path.join(out_dir, 'e{epoch:02d}-v{val_loss:.2f}.t{loss:.2f}.model.hdf5')
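The change above replaces the hardcoded `momentum=0.9` with a value read from the experiment settings, so each YAML experiment can override it. A sketch of that wiring with plain dicts standing in for the Keras objects (so it runs without TensorFlow); the helper name is illustrative:

```python
# Defaults mirror microesc/settings.py; any experiment YAML can override them
default_settings = dict(
    learning_rate=0.01,
    nesterov_momentum=0.9,
)

def sgd_kwargs(settings):
    # The keyword arguments that would be passed to keras.optimizers.SGD
    return dict(
        lr=settings['learning_rate'],
        momentum=settings['nesterov_momentum'],
        nesterov=True,
    )

# e.g. experiments/ldcnn20k60.yaml overrides the learning rate
experiment = dict(default_settings, learning_rate=0.005)
print(sgd_kwargs(experiment))
```

Keeping the optimizer hyperparameters in settings also lets them flow into the generated experiment-settings table in the report, which is why `nesterov_momentum` appears in the pyincludes changes below.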

report/Makefile

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ includes: pyincludes/urbansound8k-classes.tex \
 
 
 report.pdf: report.md includes
-	pandoc --include-before-body=cover.latex --include-before-body=abstract.latex --include-after-body=end.latex --bibliography=references.bib -V papersize:a4 -V geometry:margin=1.0in -V fontsize=12pt -H preamble.tex --csl ieee.csl --toc -Vlof -Vlot --pdf-engine-opt=-shell-escape --number-sections -s report.md -o report.pdf
+	pandoc --include-before-body=cover.latex --include-before-body=abstract.latex --include-after-body=end.latex --bibliography=references.bib -V papersize:a4 -V geometry:margin=1.0in -V fontsize=12pt -H preamble.tex --csl ieee.csl --toc -Vlof -Vlot --pdf-engine-opt=-shell-escape --number-sections --lua-filter=short-captions.lua -s report.md -o report.pdf
 
 status.pdf: status.md
 	pandoc -t beamer -s status.md -o status.pdf --slide-level=2 --mathml

report/no-figure-floats.tex

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+\usepackage{float}
+\let\origfigure\figure
+\let\endorigfigure\endfigure
+\renewenvironment{figure}[1][2] {
+    \expandafter\origfigure\expandafter[H]
+} {
+    \endorigfigure
+}

report/pyincludes/experiment-settings.py

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@
     'train_samples': 'Training samples/epoch',
     'val_samples': 'Validation samples/epoch',
     'learning_rate': 'Learning rate',
+    'nesterov_momentum': 'Nesterov momentum',
 }
 
 table = settings.loc[list(names.keys())]

report/pyincludes/experiment-settings.tex

Lines changed: 1 addition & 0 deletions
@@ -13,5 +13,6 @@
 Training samples/epoch & 30000 \\
 Validation samples/epoch & 5000 \\
 Learning rate & 0.005 \\
+Nesterov momentum & NaN \\
 \bottomrule
 \end{tabular}

report/pyincludes/urbansound8k-classes.py

Lines changed: 3 additions & 3 deletions
@@ -10,10 +10,10 @@
 
 table = pandas.DataFrame({
     'Samples': by_class.count()['classID'],
-    'Average duration': by_class.apply(lambda r: '%.2fs' % (r.end-r.start).mean()),
-    'In foreground': [ "{}%".format(int(100*r)) for r in foreground_ratio ]
+    'Duration (avg)': by_class.apply(lambda r: '%.2f s' % (r.end-r.start).mean()),
+    'In foreground': [ "{} %".format(int(100*r)) for r in foreground_ratio ]
 })
-out = table.to_latex(header=True, index=True)
+out = table.to_latex(header=True, index=True, column_format="lrrr")
 print(out)
 
 outpath = sys.argv[1]
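The formatting tweaks above insert a space before each unit (`'%.2fs'` becomes `'%.2f s'`, `'{}%'` becomes `'{} %'`) and fix the column alignment via `column_format="lrrr"` (left-aligned index, three right-aligned columns). A standalone illustration of the two format strings, with made-up values:

```python
# Hypothetical per-class mean duration and foreground ratio, for illustration only
mean_duration = 3.9714   # seconds
foreground_ratio = 0.75

old_style = '%.2fs' % mean_duration
new_style = '%.2f s' % mean_duration
print(old_style, '->', new_style)  # 3.97s -> 3.97 s

print("{} %".format(int(100 * foreground_ratio)))  # 75 %
```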
