I've been implementing "live" curations in si-gui (SpikeInterface/spikeinterface-gui#224), and to make that usable it would be great to speed up soft merges/splits. So this is an issue to try to direct some optimisation work! Very quick and dirty: I benchmarked a single two-unit merge on a typical 1-hour NP2 recording. Below are the times to re-compute each extension, with the quality metrics split out individually. It's not all extensions, just the ones I use in the GUI.
Times to re-compute each extension, in seconds:

| extension | re-compute time (s) |
| --- | --- |
| template_similarity | 3.516 |
| firing_rate | 1.815 |
| synchrony | 1.348 |
| rp_violation | 0.5465 |
| waveforms | 0.1620 |
| amplitude_cutoff | 0.1084 |
| templates | 0.1060 |
| correlograms | 0.06157 |
| amplitude_median | 0.04907 |
| spike_locations | 0.02715 |
| snr | 0.01667 |
| num_spikes | 0.01609 |
| amplitude_cv | 0.01282 |
| sliding_rp_violation | 0.01156 |
| unit_locations | 0.01027 |
| spike_amplitudes | 0.004084 |
| firing_range | 0.001206 |
| repolarization_slope | 0.0007687 |
| presence_ratio | 0.0004060 |
| recovery_slope | 0.0001667 |
| isi_violation | 0.0001235 |
| random_spikes | 8.017e-05 |
| sd_ratio | 4.779e-05 |
| half_width | 1.933e-05 |
| noise_levels | 1.200e-05 |
| peak_to_valley | 1.125e-05 |
| peak_trough_ratio | 5.792e-06 |
(Total ≈ 7 seconds. Note: this doesn't scale linearly; two merges take much less time than twice one merge.)
Good news: the slow ones look reasonable to optimize! A soft-merged template is a linear combination of the old templates, so its template similarity can be computed from already-computed info. firing_rate and rp_violation are slow because we re-compute them for all units (which we don't need to do). synchrony should be numba-fiable.
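To make the linear-combination idea concrete, here's a rough NumPy sketch (function and argument names are hypothetical, not the SpikeInterface API): the merged template is a spike-count-weighted average of the old ones, so its similarity row against all untouched units can be filled in straight from the existing template array, without revisiting waveforms. Cosine similarity is assumed here; other similarity methods would need their own update rule.

```python
import numpy as np

def merged_similarity_row(templates, spike_counts, merge_ids):
    """Hypothetical helper: similarity of a soft-merged unit to all units.

    templates: (n_units, n_samples, n_channels) dense template array.
    """
    # weighted linear combination of the merged units' templates
    w = spike_counts[merge_ids] / spike_counts[merge_ids].sum()
    merged = np.tensordot(w, templates[merge_ids], axes=1)  # (n_samples, n_channels)
    # cosine similarity of the merged template vs. every existing template
    flat = templates.reshape(len(templates), -1)
    m = merged.ravel()
    sims = flat @ m / (np.linalg.norm(flat, axis=1) * np.linalg.norm(m) + 1e-12)
    return merged, sims
```

The point is that nothing here touches spikes or waveforms, only the (small) template array, so it should be microseconds rather than seconds.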
Now a single split!
| extension | re-compute time (s) |
| --- | --- |
| template_similarity | 4.082 |
| correlograms | 2.585 |
| firing_rate | 1.891 |
| synchrony | 1.316 |
| templates | 0.4449 |
| rp_violation | 0.4397 |
| waveforms | 0.1241 |
| amplitude_cutoff | 0.1207 |
| amplitude_median | 0.04578 |
| amplitude_cv | 0.02225 |
| snr | 0.01700 |
| num_spikes | 0.01172 |
| sliding_rp_violation | 0.01125 |
| unit_locations | 0.005210 |
| firing_range | 0.0007485 |
| presence_ratio | 0.0006416 |
| repolarization_slope | 0.0004488 |
| recovery_slope | 0.0002276 |
| isi_violation | 0.0001861 |
| random_spikes | 0.0001093 |
| sd_ratio | 4.313e-05 |
| half_width | 1.812e-05 |
| peak_to_valley | 1.363e-05 |
| noise_levels | 1.117e-05 |
| spike_amplitudes | 8.666e-06 |
| peak_trough_ratio | 7.125e-06 |
Splits are harder to optimize: we do need to re-compute correlograms for the split units (pain), and we do need to re-compute the similarity between the new templates (pain). Maybe the whole template_similarity computation can be optimized, e.g. numba-fied or sparsified in some way?
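If template_similarity needs a full pass anyway, one baseline worth beating is a dense vectorized version: flatten the templates, normalize once, and get every pairwise cosine similarity from a single matrix product. This is just a sketch under the assumption of dense templates and cosine similarity; the real extension supports sparsity and other methods.

```python
import numpy as np

def cosine_template_similarity(templates):
    """All pairwise cosine similarities in one matmul (sketch, not library code).

    templates: (n_units, n_samples, n_channels) dense template array.
    """
    flat = templates.reshape(len(templates), -1)
    # normalize each flattened template once, guarding against zero norms
    flat = flat / np.maximum(np.linalg.norm(flat, axis=1, keepdims=True), 1e-12)
    return flat @ flat.T  # (n_units, n_units), symmetric, unit diagonal
```

With sparse templates, restricting each pair to its shared channels (or masking zeroed channels before the matmul) should cut the work further.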
@yger @samuelgarcia please share any ideas! I'm happy to try and compete to make things faster. Seems like a fun project for new nerds on the project (@tayheau ;) ).
Let's post here if we decide to work on something. I'll work on easy optimizations in the metrics first.
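As one example of the "easy" end, synchrony-style coincidence counting can be vectorized in plain NumPy before reaching for numba. A toy sketch (hypothetical helper, not the SpikeInterface implementation): group spikes by sample frame with `np.unique` and count, per unit, the spikes that share a frame with any other spike.

```python
import numpy as np

def synchrony_counts(spike_frames, spike_units, n_units):
    """Per unit, count spikes that land in a frame shared with another spike.

    Hypothetical helper illustrating the vectorization, not the real metric.
    """
    # how many spikes fall in each distinct frame
    frames, inverse, counts = np.unique(
        spike_frames, return_inverse=True, return_counts=True
    )
    sync_mask = counts[inverse] > 1  # spikes that are part of a coincident event
    out = np.zeros(n_units, dtype=np.int64)
    np.add.at(out, spike_units[sync_mask], 1)  # scatter-add per unit
    return out

# toy example: units 0 and 1 fire together at frame 10
frames = np.array([10, 10, 20, 30])
units = np.array([0, 1, 0, 1])
print(synchrony_counts(frames, units, 2))  # [1 1]
```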