Commit 5aa63d1

Misc fixes
1 parent dea9dee commit 5aa63d1

5 files changed: +125 -37 lines changed

TODO.md

Lines changed: 7 additions & 2 deletions
@@ -12,13 +12,18 @@ Feedback needed
 
 Results
 
-- Include latest results
-- Measure runtime on device for latest models
 - Use Strided-DS-24 as chosen model (confusion matrix etc), instead of auto "best"
 - Finish basic Discussion and Conclusion
 - Make plots a bit prettier
 - Add picture of demo setup
 
+Reprod
+
+- Tag a branch for submitted thesis
+- Upload models to GH
+- Results are from git commit `b49efa5dde48f9fd72a32eff4c751d9d0c0de712`
+- Include perftools Python script in appendix?
+
 Abstract
 
 - Write it!

braindump.md

Lines changed: 16 additions & 0 deletions
@@ -56,6 +56,22 @@ For DS-5x5 12, going from 0.5 dropout to 0.25 increases perf from 65% to 72%
 
 python train.py --model strided --conv_block depthwise_separable --epochs 100 --downsample_size=2x2 --filters 12 --dropout 0.25
 
+### Aggregation
+Low-pass filter over consecutive frames?
+Exponential Moving Average?
+
+## Testing
+
+Jackhammer
+https://annotator.freesound.org/fsd/explore/%252Fm%252F03p19w/
+https://freesound.org/people/Mark_Ian/sounds/131918/
+
+Dog bark
+https://annotator.freesound.org/fsd/explore/%252Fm%252F0bt9lr/
+http://freesound.org/s/365053
+
+
+
 
 ## Kubernetes
 
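The `Aggregation` note in this hunk asks about an Exponential Moving Average over consecutive prediction frames. A minimal sketch of how that smoothing would behave; the `alpha` value and the three-class probability vectors are made up for illustration:

```python
import numpy

def ema(new, prev, alpha=0.2):
    # Exponential Moving Average: higher alpha weights the newest frame more
    return alpha * new + (1 - alpha) * prev

# hypothetical stream of per-frame class probabilities (3 classes)
frames = [
    numpy.array([0.9, 0.05, 0.05]),
    numpy.array([0.1, 0.8, 0.1]),
    numpy.array([0.2, 0.7, 0.1]),
]

smoothed = frames[0]
for f in frames[1:]:
    smoothed = ema(f, smoothed)
# a single divergent frame only nudges the smoothed estimate
print(smoothed)
```

With a small `alpha`, one outlier frame cannot flip the predicted class on its own, which is the point of aggregating before thresholding.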

microesc/livedemo.py

Lines changed: 80 additions & 26 deletions
@@ -8,6 +8,7 @@
 
 import numpy
 import serial
+import scipy.signal
 
 import matplotlib
 from matplotlib import pyplot as plt
@@ -75,62 +76,115 @@ def create_interactive():
     win = Gtk.Window()
     win.connect("delete-event", Gtk.main_quit)
     win.set_default_size(400, 300)
-    win.set_title("Embedding in GTK")
+    win.set_title("On-sensor Audio Classification")
 
-    f = matplotlib.figure.Figure(figsize=(5, 4), dpi=100)
-    ax = f.add_subplot(111)
-    t = numpy.arange(0.0, 3.0, 0.01)
-    s = numpy.sin(2*numpy.pi*t)
-
-    #ax.plot(t, s)
+    fig, (ax, text_ax) = plt.subplots(1, 2)
 
     sw = Gtk.ScrolledWindow()
     win.add(sw)
     # A scrolled window border goes outside the scrollbars and viewport
     sw.set_border_width(10)
 
-    canvas = FigureCanvas(f)  # a Gtk.DrawingArea
-    canvas.set_size_request(800, 600)
+    canvas = FigureCanvas(fig)  # a Gtk.DrawingArea
+    canvas.set_size_request(200, 400)
     sw.add_with_viewport(canvas)
 
-    predictions = numpy.random.random(10)
-    rects = ax.bar(numpy.arange(len(predictions)), predictions, align='center', alpha=0.5)
+    prediction_threshold = 0.35
 
-    return win, f, ax, rects
+    # Plots
+    predictions = numpy.zeros(11)
+    tt = numpy.arange(len(predictions))
+    rects = ax.barh(tt, predictions, align='center', alpha=0.5)
+    ax.set_yticks(tt)
+    ax.set_yticklabels(classnames)
+    ax.set_xlim(0, 1)
 
-def update_plot(ser, ax, fig, rects):
-    raw = ser.readline()
-    line = raw.decode('utf-8')
-    predictions = parse_input(line)
+    ax.axvline(prediction_threshold)
+    ax.yaxis.set_ticks_position('right')
+
+    # Text
+    text_ax.axes.get_xaxis().set_visible(False)
+    text_ax.axes.get_yaxis().set_visible(False)
+
+    text = text_ax.text(0.5, 0.2, "Unknown",
+        horizontalalignment='center',
+        verticalalignment='center',
+        fontsize=32,
+        )
+
+    def emwa(new, prev, alpha):
+        return alpha * new + (1 - alpha) * prev
+
+    prev = predictions
+    alpha = 0.2  # smoothing coefficient
+
+    window = numpy.zeros(shape=(4, 11))
+
+    from scipy.ndimage.interpolation import shift
+
+    def update_plot(predictions):
+
+        if len(predictions) < 10:
+            return
+
+        # add unknown class
+        predictions = numpy.concatenate([predictions, [0.0]])
+
+        window[:, :] = numpy.roll(window, 1, axis=0)
+        window[0, :] = predictions
+
+        predictions = numpy.mean(window, axis=0)
 
-    if predictions:
         best_p = numpy.max(predictions)
         best_c = numpy.argmax(predictions)
-        name = classnames[best_c]
-        if best_p >= 0.35:
-            print('p', name, best_p)
+        if best_p <= prediction_threshold:
+            best_c = 10
+            best_p = 0.0
 
         for rect, h in zip(rects, predictions):
-            rect.set_height(h)
+            rect.set_width(h)
+
+        name = classnames[best_c]
+        text.set_text(name)
+
+        fig.tight_layout()
+        fig.canvas.draw()
+
+    return win, update_plot
+
+def fetch_predictions(ser):
+    raw = ser.readline()
+    line = raw.decode('utf-8')
+    predictions = parse_input(line)
+    return predictions
 
-    fig.canvas.draw()
 
-    return True
 
 def main():
     test_parse_preds()
 
     device = '/dev/ttyACM1'
     baudrate = 115200
 
-    window, fig, ax, rects = create_interactive()
+    window, plot = create_interactive()
     window.show_all()
 
+    def update(ser):
+        try:
+            preds = fetch_predictions(ser)
+        except Exception as e:
+            print('error', e)
+            return True
+
+        if preds is not None:
+            plot(preds)
+        return True
+
     with serial.Serial(device, baudrate, timeout=0.1) as ser:
         # avoid reading stale data
         thrash = ser.read(10000)
-
-        GLib.timeout_add(200.0, update_plot, ser, ax, fig, rects)
+
+        GLib.timeout_add(200.0, update, ser)
 
     Gtk.main()  # WARN: blocking
 
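The smoothing added in `update_plot` keeps the last four prediction vectors in a rolling window (`numpy.roll`) and averages them. Extracted as a standalone sketch (11 classes as in the diff; the input vector is illustrative):

```python
import numpy

window = numpy.zeros(shape=(4, 11))  # last 4 prediction vectors, 11 classes

def smooth(window, predictions):
    # newest predictions go into row 0, the oldest row falls off the end
    window[:, :] = numpy.roll(window, 1, axis=0)
    window[0, :] = predictions
    return numpy.mean(window, axis=0)

# one confident frame for class 3, preceded by silence
p = numpy.zeros(11)
p[3] = 1.0
out = smooth(window, p)
print(out[3])  # 0.25: one confident frame out of four
```

With a 0.35 `prediction_threshold`, a single confident frame (0.25 after averaging) is not enough to leave the "Unknown" state; roughly two consecutive confident frames are needed, which damps flicker in the demo display.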

report/abstract.latex

Lines changed: 5 additions & 1 deletion
@@ -4,7 +4,11 @@
 \mbox{}
 
 \begin{abstract}
-This is my summary/abstract
+
+Purpose/Motivation
+Methods
+Results
+Conclusions
 
 FIXME: write it
 \end{abstract}

report/report.md

Lines changed: 17 additions & 8 deletions
@@ -856,7 +856,7 @@ the last window is zero padded.
 Sometimes there is a mismatch between the desired length of analysis window,
 and the labeled clips available in the training data.
 For example a dataset may consist of labeled audio clips with a length of 10 seconds,
-while the desired output is every 1 seconds.
+while the desired output is every second.
 When a dataset is labeled only with the presence of a sound at a coarse timescale,
 without information about where exactly the relevant sound(s) appears
 it is referred to as *weakly annotated* or *weakly labeled* data[@ComputationalAnalysisSound, ch 14.2.4.1].
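The analysis-window mismatch this hunk describes (10-second labeled clips, a prediction wanted every second, last window zero padded) amounts to the following split. A minimal sketch, assuming a 16 kHz sample rate, which is not stated in this hunk:

```python
import numpy

def split_windows(samples, window_length):
    # zero pad so the last (possibly partial) window reaches full length
    n_windows = int(numpy.ceil(len(samples) / window_length))
    padded = numpy.zeros(n_windows * window_length)
    padded[:len(samples)] = samples
    return padded.reshape(n_windows, window_length)

sr = 16000  # assumed sample rate
clip = numpy.ones(int(10.5 * sr))  # a 10.5 second clip
windows = split_windows(clip, window_length=1 * sr)
print(windows.shape)  # (11, 16000)
```

Each row then gets the clip-level (weak) label, even though the sound may not be present in every window.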
@@ -1611,6 +1611,13 @@ What is the battery lifetime. BOM
 
 # Conclusions
 
+<!--
+
+Recap what you did.
+Highlight the big accomplishments.
+Conclude. Wraps up your paper. Tie your research to the “real world.”
+-->
+
 Able to demonstrate Environmental Sound Classification
 running on a low-power microcontroller suitable for use in a sensor node.
 
@@ -1641,31 +1648,33 @@ However there are also promising results showing that CNNs can be
 effectively implemented with as little as 2 bits[@andri2016yodann][@miyashita2016convolutional][@IncrementalNetworkQuantization],
 and without using any multiplications[@leng2018extremely][@cintra2018low].
 
+<!--
 Low-power hardware accelerators for Convolutional Neural Networks will hopefully
 become available over the next few years.
 This may enable larger models at the same power budget,
 or to reduce power consumption at a given predictive performance level.
 End-to-end CNN models using raw audio as input becomes extra interesting with such a co-processor,
 since it allows also the filterbank processing to be offloaded from the general purpose CPU.
+-->
 
-In a practical deployment of on-edge classification, it is still desirable to
-be able to collect some data for evaluation of performance and further training.
+In a practical deployment of on-sensor classification, it is still desirable to
+be able to collect *some* data for evaluation of performance and further training.
 This could be sampled at random. But could it be more effective to use some sort of
-adaptive sampling, and possibly Active Learning?
+adaptive sampling, possibly Active Learning?
 
+<!--
 Normally such training and evaluation data is transferred as raw PCM audio,
 which inefficient in terms of bandwidth.
 Could low-power audio coding be applied to compress the data,
 while still enable reliable human labeling and use as evaluation/training data?
-
-It is also desirable to reduce how often classification is needed.
+-->
+
+It is also very desirable to reduce how often classification is needed.
 Could this benefit from an adaptive sampling strategy?
 For example to primarily do classification for time-periods which exceed
 a sound level threshold, or to sample less often when the sound source changes slowly.
 
 
-
-
 <!---
 DROP: clean up the scripts, make fit on one/two page
 MAYBE: table with software versions? From requirements.txt
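The adaptive-sampling idea in the last hunk (only classify time-periods that exceed a sound level threshold) could be sketched as a simple RMS gate. The threshold and the signal values here are made-up illustrations, not from the report:

```python
import numpy

def rms_db(samples):
    # RMS level of an audio window, in dB relative to full scale
    rms = numpy.sqrt(numpy.mean(samples ** 2))
    return 20 * numpy.log10(rms + 1e-12)  # small offset avoids log(0)

def should_classify(samples, threshold_db=-40.0):
    # run the (expensive) classifier only on sufficiently loud windows
    return rms_db(samples) > threshold_db

quiet = numpy.full(16000, 1e-4)  # near-silent window
loud = numpy.full(16000, 0.5)    # loud window
print(should_classify(quiet), should_classify(loud))
```

Such a gate is far cheaper than the CNN itself, so skipping classification on quiet windows directly reduces average power draw.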
