@@ -1518,9 +1518,11 @@ the model performance is evaluated on two simplified variations:

` TODO: table of group membership `

+ <!--
Also explored is the consequence of introducing an "unknown" class for low-confidence predictions:
Predictions where the highest probability is below a certain threshold
are assigned to the unknown class instead of the original 10 classes.
+ -->

The SystemPerformance application skeleton from X-CUBE-AI is used to record the
average inference time per sample on the STM32L476 microcontroller.
@@ -1666,11 +1668,11 @@ practical challenges with applying on-edge classification of noise in sensor net
Utilizing larger amounts of training data might
improve the performance of the models shown.
Possible techniques for this are transfer learning [@PretrainingSpeechCommandRecognition],
- or applying stronger data augmentation techniques (such as Mixup).
+ or applying stronger data augmentation techniques (such as Mixup [@Mixup] or SpecAugment [@SpecAugment]).
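As an illustration, below is a minimal sketch of both augmentations for mel-spectrogram training data, assuming numpy arrays and one-hot labels; the function names, `alpha` and the mask widths are illustrative choices, not settings evaluated in this work:

```python
import numpy as np

def mixup_batch(X, y, alpha=0.2, rng=np.random):
    """Mixup: blend random pairs of samples and their one-hot labels.

    X: batch of features, e.g. mel-spectrograms, shape (batch, ...)
    y: one-hot labels, shape (batch, classes)
    """
    lam = rng.beta(alpha, alpha)       # mixing coefficient in (0, 1)
    idx = rng.permutation(len(X))      # random partner for each sample
    X_mix = lam * X + (1.0 - lam) * X[idx]
    y_mix = lam * y + (1.0 - lam) * y[idx]
    return X_mix, y_mix

def spec_augment(S, max_bands=8, max_frames=16, rng=np.random):
    """SpecAugment-style masking: zero one random frequency and one time stripe.

    S: a single mel-spectrogram, shape (bands, frames)
    """
    S = S.copy()
    n_bands, n_frames = S.shape
    f0 = rng.randint(0, n_bands - max_bands)    # start of frequency mask
    t0 = rng.randint(0, n_frames - max_frames)  # start of time mask
    S[f0:f0 + rng.randint(1, max_bands + 1), :] = 0.0
    S[:, t0:t0 + rng.randint(1, max_frames + 1)] = 0.0
    return S
```

With a small `alpha` such as 0.2, most mixed samples stay close to one of the two originals, which keeps the augmentation mild.
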
- Applying quantization should make the computations of the models more efficient.
+ Applying quantization should speed up the computations of the models.
A first step would be to make use of the optimized CMSIS-NN library [@CMSIS-NN],
- which utilizes 8-bit integer operations.
+ which utilizes 8-bit integer operations and the SIMD unit in the ARM Cortex-M4F.
However, there are also promising results showing that CNNs can be
effectively implemented with as little as 2 bits [@andri2016yodann] [@miyashita2016convolutional] [@IncrementalNetworkQuantization],
and without using any multiplications [@leng2018extremely] [@cintra2018low].
@@ -1685,9 +1687,10 @@ since it allows also the filterbank processing to be offloaded from the general
-->

In a practical deployment of on-sensor classification, it is still desirable to
- be able to collect *some* data for evaluation of performance and further training.
- This could be sampled at random. But could it be more effective to use some sort of
- adaptive sampling, possibly Active Learning?
+ collect *some* data for evaluation of performance and further training.
+ This could be sampled at random.
+ But can an on-sensor implementation of Active Learning [@ActiveLearningSonyc] [@SemiSupervisedActiveLearning]
+ make this process more efficient?
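To sketch the idea: uncertainty sampling is one of the simplest Active Learning strategies, selecting the windows where the deployed model was least confident. The function name and `budget` parameter below are illustrative:

```python
import numpy as np

def select_for_labeling(probs, budget=10):
    """Uncertainty sampling: pick the windows the model is least sure about.

    probs: (n_windows, n_classes) softmax outputs from the on-sensor model
    budget: how many windows the sensor can afford to store and transmit
    """
    confidence = probs.max(axis=1)           # probability of the predicted class
    return np.argsort(confidence)[:budget]   # lowest-confidence windows first
```

Only the selected windows would then be stored and transmitted for human labeling, while confidently classified windows are discarded on the sensor.
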
<!--
Normally such training and evaluation data is transferred as raw PCM audio,
@@ -1696,8 +1699,8 @@ Could low-power audio coding be applied to compress the data,
while still enabling reliable human labeling and use as evaluation/training data?
-->

- It is also very desirable to reduce how often classification is needed.
- Could this benefit from an adaptive sampling strategy?
+ It is critical for power consumption to reduce how often on-sensor classification is performed.
+ This should also benefit from an adaptive sampling strategy.
For example, to primarily perform classification for time periods which exceed
a sound level threshold, or to sample less often when the sound source changes slowly.
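A minimal sketch of the sound level strategy, assuming 16-bit PCM frames and a generic `classify` function (both names are placeholders); the -40 dBFS threshold is illustrative:

```python
import numpy as np

def rms_level_db(frame, full_scale=2**15):
    """RMS level of one PCM frame, in dB relative to full scale (dBFS)."""
    rms = np.sqrt(np.mean(frame.astype(np.float64)**2))
    return 20.0 * np.log10(max(rms, 1e-10) / full_scale)

def classify_if_loud(frame, classify, threshold_db=-40.0):
    """Run the expensive CNN inference only when the sound level is high enough."""
    if rms_level_db(frame) >= threshold_db:
        return classify(frame)
    return None  # quiet period: skip inference to save power
```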