Fig 1.
Action potential detection with high temporal accuracy using a convolutional neural network (CNN).
A) Architecture of the CNN used for spike detection in 20kHz MEA recordings. The input layer consists of 200 nodes, corresponding to a 10ms window. The output layer consists of 120 nodes corresponding to 2-8ms in the input window. Each output node provides a score between 0 and 100 indicating the likelihood that the corresponding input frame contains a waveform trough. The 4 convolutional layers have a kernel size of 21 frames and a stride of 1 frame so that each output node makes a prediction based on the signal from 2ms before until 2ms after the corresponding frame in the input. B) Example of an averaged waveform footprint detected by Kilosort2. The purple traces were selected for training/validating the CNN. Note the variety in the selected waveform shapes from this single unit. This ensures that the detection model can detect the propagating action potential along different parts of the neuron. C) Example of training/validating sample creation and model prediction. i: A 10ms sample of recording-specific noise is taken. ii: A waveform shape is selected from the training or validating pool. iii: The waveform is pasted into the recording-device-specific noise with ground truth certainty about the trough location (marked with yellow dotted line). iv: CNN detection model predictions for the frames in 2-8ms of the input window show a narrow detection peak at the waveform trough. All figures share the same x-axis. D) Same as C but with multiple overlapping waveforms in the same sample. E) Training (purple) and validating (cyan) loss as a function of training epoch. Error bars indicate the STD over the different cross-validation folds. The small differences between the training loss and validating loss indicate that the model is neither overfitting nor underfitting the data. F) Recall when validating the detection model on samples generated from the held-out recording and when applying a 5RMS threshold to the same samples. 
The markers indicate the results for each of the 6 held-out datasets and the bar reflects the mean over all held-out datasets. Mean±STD for CNN = 89.1%±5.59% and for 5RMS = 63.6%±7.36%. The detection model has a significantly higher recall (P = 1.25 × 10⁻⁷, two-sided paired t-test, n = 6). G) Precision when validating the detection model on samples generated from the held-out recording and when applying a 5RMS threshold to the same samples. The markers indicate the results for each of the 6 held-out datasets and the bar reflects the mean over all held-out datasets. Mean±STD for CNN = 91.5%±3.23% and for 5RMS = 95.0%±3.36%. The difference in precision is not significant (P = 0.13, two-sided paired t-test, n = 6). H) Amplitude distribution of detected (purple) and missed (cyan) spikes by the detection model shows a good detection performance for spikes below 5RMS. Amplitudes are expressed as RMS relative to the surrounding 50ms of signal. I) F1 score, precision and recall of the detection model as a function of the detection threshold. The loose and stringent detection thresholds are marked with L and S respectively. J) The absolute deviation between the model detections and the ground truth trough times shows a high temporal accuracy with on average a deviation of 13μs (0.26 frames). Inset contains the same distribution on a log scale.
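The window arithmetic in panel A can be checked with a short sketch. It assumes the four convolutions are unpadded ("valid"), which the caption does not state but which reproduces the quoted 200-node input and 120-node output; all names below are illustrative and not taken from the RT-Sort code.

```python
# Sketch of the window arithmetic in Fig 1A.
# Assumption: the convolutions use no padding ("valid"); the caption does not say so.
SAMPLING_RATE_HZ = 20_000
INPUT_FRAMES = 200      # 10 ms input window
N_CONV_LAYERS = 4
KERNEL_SIZE = 21
STRIDE = 1


def conv_output_length(n_in, kernel, stride=1):
    """Number of output frames after one unpadded 1-D convolution."""
    return (n_in - kernel) // stride + 1


def frames_to_ms(frames, rate_hz=SAMPLING_RATE_HZ):
    """Convert a (possibly fractional) frame count to milliseconds."""
    return frames * 1000.0 / rate_hz


# Each layer trims kernel - 1 = 20 frames: 200 -> 180 -> 160 -> 140 -> 120.
n_frames = INPUT_FRAMES
for _ in range(N_CONV_LAYERS):
    n_frames = conv_output_length(n_frames, KERNEL_SIZE, STRIDE)
output_frames = n_frames

# With stride 1, the receptive field grows by kernel - 1 frames per layer.
receptive_field = 1 + N_CONV_LAYERS * (KERNEL_SIZE - 1)
half_context = (receptive_field - 1) // 2

print(output_frames)                # output nodes
print(frames_to_ms(half_context))   # context on each side of a frame, in ms
print(frames_to_ms(0.26) * 1000.0)  # 0.26 frames expressed in microseconds
```

Under this assumption each output node sees 81 input frames, 40 on either side of its own frame, so the first and last 2ms of the 10ms window have no corresponding output node. The same conversion ties the 0.26-frame deviation in panel J to 13μs.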
Fig 2.
RT-Sort spike sorting performance on ground truth datasets.
A) Averaged waveform footprint from a neuron recorded using a high-density multi-electrode array (MEA). Each trace represents the signal from a single electrode averaged over all action potential detections of the recorded neuron. 5 times the signal to noise ratio of the signal measured on each electrode is indicated with dotted red lines. The root electrode detected by RT-Sort and used for the spike triggered averaging is marked with the red star. The trace on each electrode ranges from 2ms before until 2ms after the waveform trough on the root electrode. All loose electrodes detected by RT-Sort are marked in bold. For each electrode, the color of the trace represents the average detection interval relative to the root electrode. B) The average spike detection model scores over all detected action potentials for the same neuron and electrodes as A. The loose and stringent detection thresholds on each electrode are indicated with dotted red lines. All loose electrodes detected by RT-Sort are marked in bold. To better reflect the high temporal precision of the detection model, the time range for each trace is magnified 4 times relative to the traces in A, resulting in the detection score trace ranging from 0.5ms before until 0.5ms after the waveform trough on the root electrode (marked with red star). C) From top to bottom: Raw unfiltered MEA trace of the root electrode marked in A and B for part of the recording (black). This signal is used as input for the detection model. Detection model scores for the same period (yellow). RT-Sort spike sorted detections (yellow dots). Spike detections in the simultaneous patch recording of the same neuron (blue dots). Patch trace recorded in cell-attached mode (blue line) with spike detection threshold marked in red. Note that the detection model sometimes detects spikes from an adjacent neuron (see panel F), although RT-Sort correctly assigns these detections to a different neuron. 
Right: zoomed in signals for a single spike marked with D. D) Same as A and B but for a single spike from the same neuron, detected by RT-Sort in online mode. The spike is marked with D in panel C. The colors for each trace correspond to the same latencies as A. E) Another example of a single spike footprint from the same neuron, marked with E in panel C. Note the difference in the waveform shape compared to D on the electrode above the root electrode. F) Single spike footprint from the same neuron that overlaps with a spike from a different neuron (root electrode marked with red arrow). Despite the waveform overlap, the spike is still correctly detected as marked in panel C. G) Precision and recall for spikes detected by RT-Sort from 4 different neurons on 2 different MEAs with simultaneous patch-clamp ground truth recording. Precision = 97.5%±4.4%, Recall = 90.9%±5.57% (mean±STD). The example neuron in A is marked with the circle. H) Precision and recall for spikes detected by RT-Sort from 2 different neurons on an MEA with simultaneous patch-clamp ground truth recording. The detections were made using sequences detected in a pre-recording, similar to how RT-Sort would be applied in real time. Precision = 98.2%±1.78%, Recall = 97.8%±1.60% (mean±STD). I) Top: for each unit detected in the simulated ground truth recording using sequence metrics generated based on the ground truth spike locations, the precision over all detected spikes compared to the most similar ground truth neuron. Mean±STD = 98.8%±12.4%. Bottom: for each unit shown in the histogram at the top of the panel, the corresponding average waveform amplitude expressed in SNR. Top and bottom share the same x-axis. J) Top: for each unit detected in the simulated ground truth recording, the precision over all detected spikes compared to the most similar ground truth neuron. Mean±STD = 97.3%±13.4%. 
Bottom: for each unit shown in the histogram at the top of the panel, the corresponding average waveform amplitude expressed in SNR. Top and bottom share the same x-axis. K) Top: for each unit detected in the simulated ground truth recording, the overlap score for the most similar ground truth neuron. Mean±STD = 0.651±0.242. Bottom: for each unit shown in the histogram at the top of the panel, the corresponding average waveform amplitude expressed in SNR. Top and bottom share the same x-axis.
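For reference, precision and recall against a ground truth spike train, as reported in panels G to J, can be computed along the following lines. This is a minimal sketch, not the authors' evaluation code: the one-to-one greedy matching and the 0.4ms tolerance are assumptions made for illustration, and the function names are hypothetical.

```python
# Sketch: precision and recall of detected spikes against ground truth spike times.
# Assumptions (not from the captions): one-to-one greedy matching in time order
# and a matching tolerance of 0.4 ms.

def count_matches(detected_ms, truth_ms, tolerance_ms=0.4):
    """Count detections that pair one-to-one with a ground truth spike."""
    detected = sorted(detected_ms)
    truth = sorted(truth_ms)
    matched = 0
    i = j = 0
    while i < len(detected) and j < len(truth):
        diff = detected[i] - truth[j]
        if abs(diff) <= tolerance_ms:
            matched += 1          # pair found, consume both spikes
            i += 1
            j += 1
        elif diff < 0:
            i += 1                # detection is too early: false positive
        else:
            j += 1                # ground truth spike was missed: false negative
    return matched


def precision_recall(detected_ms, truth_ms, tolerance_ms=0.4):
    """Return (precision, recall) as fractions between 0 and 1."""
    tp = count_matches(detected_ms, truth_ms, tolerance_ms)
    precision = tp / len(detected_ms) if detected_ms else 0.0
    recall = tp / len(truth_ms) if truth_ms else 0.0
    return precision, recall


# Toy spike times in ms, made up purely to exercise the functions.
detected = [10.1, 20.0, 35.0, 50.2]
truth = [10.0, 20.3, 50.0, 70.0, 90.0]
precision, recall = precision_recall(detected, truth)
print(precision, recall)
```

In the toy example three of the four detections pair with a ground truth spike, and three of the five ground truth spikes are found.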
Fig 3.
Real time spike sorting with latencies in the range of synaptic transmission.
A) Schematic representation of the durations of the different steps to go from action potential trough to sorted spike. The duration is split up into 3 categories: Biological constraints refer to biologically intrinsic waiting times since the action potential trough occurs at the soma in order to measure all the data required for the detection. This consists of measuring the action potential until the end of the waveform (bottom left) plus an additional 0.5ms for the action potential to propagate. Detection speed refers to the duration for the spike detection model to detect spikes on all the electrodes used in the recording. Sorting speed refers to the duration to take the spike detection probability outputs and use those to assign spikes to the correct units. B) Computation time to perform a forward pass through the spike detection model as a function of the number of electrodes in the recording. A first-order linear regression model fitted to the forward pass duration yielded: duration = 0.155 + 0.001 * #elec (R² = 0.997, P = 4.14 × 10⁻¹⁵) for the MEA model in purple and duration = 0.155 + 0.002 * #elec (R² = 0.999, P = 4.09 × 10⁻¹⁹) for the Neuropixels model in cyan. C) Distribution of spike sorting duration on the detection model outputs for the two patch-MEA recordings with real time sorting from Fig 2H. D) Distribution of spike sorting duration on the detection model outputs for the real time detections on the simulated Neuropixels recording from Fig 2I–2K. E) Distribution of spike detection and sorting durations for the real time sorting on the two patch-MEA recordings from Fig 2H. Time reflects duration from waveform trough until sorted detection (mean±STD over all detected spikes: 7.24ms±1.49ms). F) Distribution of spike detection and sorting durations for the real time sorting on the simulated ground truth dataset from Fig 2I–2K. Time reflects duration from waveform trough until sorted detection (mean±STD over all detected spikes: 7.57ms±1.57ms). 
G) Durations for detecting propagation sequences and sorting all spikes in the different offline patch-MEA recordings from Fig 2G (purple, markers same as Fig 2G) and in the pre-recording for the simulated Neuropixels recording made before running RT-Sort in online mode for Fig 2I–2K (cyan). Time is expressed as number of minutes per minute of recording time.
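The linear fits in panel B can be evaluated for a given electrode count as sketched below. The caption does not give the unit of the fitted duration, so milliseconds are assumed here. The 1000-electrode MEA selection is a hypothetical example, and 384 is the channel count of a Neuropixels 1.0 probe rather than a number from the caption.

```python
# Sketch: evaluating the linear fits from Fig 3B.
# Assumption: the fitted duration is in milliseconds (the caption gives no unit).
INTERCEPT = 0.155          # shared intercept of both fits
SLOPE_MEA = 0.001          # per-electrode cost, MEA model
SLOPE_NEUROPIXELS = 0.002  # per-electrode cost, Neuropixels model


def forward_pass_duration(n_electrodes, slope, intercept=INTERCEPT):
    """Predicted forward pass duration: intercept + slope * number of electrodes."""
    return intercept + slope * n_electrodes


# Hypothetical MEA selection of 1000 electrodes.
mea_duration = forward_pass_duration(1000, SLOPE_MEA)
# 384 channels, as on a Neuropixels 1.0 probe (not stated in the caption).
neuropixels_duration = forward_pass_duration(384, SLOPE_NEUROPIXELS)

print(round(mea_duration, 3), round(neuropixels_duration, 3))
```

Under these assumptions the forward pass accounts for roughly 1ms of the trough-to-detection latencies of about 7ms reported in panels E and F.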
Fig 4.
RT-Sort detects spikes with consistent propagations and amplitudes.
A) Waveform footprint of a unit detected by both RT-Sort and SpyKing Circus. Left: averaged footprint over all spikes detected by both RT-Sort and SpyKing Circus. Middle: averaged footprint over all spikes only detected by RT-Sort. Right: averaged footprint over all spikes only detected by SpyKing Circus. 5 times the signal to noise ratio of the signal measured on each electrode is indicated with dotted red lines. The root electrode detected by RT-Sort and used for the spike triggered averaging is marked with the red star. All loose electrodes detected by RT-Sort are marked in bold. For each electrode, the color of the trace represents the average detection interval relative to the root electrode. B) The distribution of interelectrode intervals per spike relative to the root electrode for the comparison electrodes marked in A. Each spike group in A (both, RT-Sort only and SpyKing Circus only) is plotted as a separate distribution. The wider base of the SpyKing Circus distribution indicates spike contamination from different units for the SpyKing Circus only spikes, which is not detected by RT-Sort. C) The distribution of interelectrode interval differences compared to the mean of all spikes detected by both RT-Sort and SpyKing Circus for the comparison electrodes in B. Differences are clipped at 7 frames. D) Interelectrode interval difference scores for all matching RT-Sort and SpyKing Circus units. SpyKing Circus only spikes are significantly more different from the matched spikes in their interelectrode intervals compared to the RT-Sort only spikes (P = 3.50 × 10⁻⁷, one-sided paired t-test). E) -Log10(P) values for interelectrode interval differences comparing RT-Sort only spikes to other sorter only spikes relative to matched spikes. All other sorters are significantly more different (P<0.05, one-sided paired t-test). Abbreviations: SC = SpyKing Circus, KS = Kilosort2, HS = Herdingspikes2, TDC = Tridesclous, HDS = HD-Sort, IC = IronClust. 
F) Amplitude difference scores for all overlapping RT-Sort and SpyKing Circus units. SpyKing Circus only spikes are significantly more different from the matched spikes in their amplitudes compared to the RT-Sort only spikes (P = 1.15 × 10⁻⁸, one-sided paired t-test). G) -Log10(P) values for amplitude differences comparing RT-Sort only spikes to other sorter only spikes relative to matched spikes. All other sorters are significantly more different (P<0.05, one-sided paired t-test) except HD-Sort (P = 0.0789, one-sided paired t-test). Abbreviations same as E. H) Distribution of spike detection and sorting durations for the real time sorting on the Neuropixels dataset from Fig 4. Time reflects duration from waveform trough until sorted detection (mean±STD over all detected spikes: 7.62ms±1.58ms).
Table 1.
RT-Sort parameters.