Fig 1.
Overview of the pipeline designed to sort spikes in microneurographic recordings.
Data are collected from laboratories in Aachen and Bristol and pre-analyzed using a ‘waterfall’ representation to align spikes based on C-fiber conduction speed. Two example tracks (purple and orange) are shown. The raw signal and extracted spike waveforms are harmonized, and multiple feature sets are computed (see Table) to assess their suitability for sorting. A Support Vector Machine (SVM) classifier with a radial basis function (RBF) kernel is applied to classify spikes, using 5-fold cross-validation and standard evaluation metrics (accuracy, precision, recall, and F1-score).
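The classification step described above can be sketched as follows. This is a minimal illustration with scikit-learn on synthetic data, not the study's code: the feature scaling, the stratified and shuffled folds, the macro averaging of precision, recall, and F1, and all variable names are assumptions that the caption does not specify.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for one feature set: rows are spikes, columns are
# features, labels are track identities (hypothetical data).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, n_classes=2, random_state=0
)

# Assumption: features are standardized before the RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# 5-fold cross-validation; stratification and shuffling are assumptions.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]
scores = cross_validate(clf, X, y, cv=cv, scoring=scoring)

# Average each metric over the five folds.
mean_scores = {m: float(np.mean(scores[f"test_{m}"])) for m in scoring}
print(mean_scores)
```

Repeating this procedure for each feature set would give one accuracy per dataset and feature set, which is the kind of value summarized in the accuracy figures.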
Table 1.
Results of the Wilcoxon signed-rank test comparing accuracy differences across all feature sets. Each row represents a pairwise comparison between one feature set and the others, displaying the statistical significance based on the Bonferroni-corrected alpha level. Red: not significant; green: significant (p below the corrected alpha level).
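A pairwise comparison of this kind might be computed as in the sketch below, which uses `scipy.stats.wilcoxon` on hypothetical per-dataset accuracies. The family-wise alpha of 0.05, the number of feature sets, and the accuracy values are assumptions for illustration; the corrected alpha level used in the table is not reproduced here.

```python
from itertools import combinations

import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)

# Hypothetical accuracies: 12 datasets for each of 3 feature sets.
base = rng.uniform(0.5, 0.9, 12)
acc = {
    "set_a": base,
    "set_b": base + rng.normal(0.0, 0.02, 12),         # similar to set_a
    "set_c": base - 0.15 + rng.normal(0.0, 0.01, 12),  # clearly worse
}

pairs = list(combinations(acc, 2))
alpha = 0.05                          # assumed family-wise level
alpha_corrected = alpha / len(pairs)  # Bonferroni correction

results = {}
for a, b in pairs:
    # Paired test: each dataset contributes one accuracy per feature set.
    stat, p = wilcoxon(acc[a], acc[b])
    results[(a, b)] = (float(p), bool(p < alpha_corrected))

for pair, (p, significant) in results.items():
    print(pair, f"p={p:.4f}", "significant" if significant else "not significant")
```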
Fig 2.
Accuracy results for all datasets grouped by feature set.
(A) Individual results for each dataset. Darker shades represent lower scores, while lighter shades indicate higher scores. Wraw has the highest mean accuracy (0.73). (B) Distribution of accuracies for each feature set. Boxes are drawn from the first quartile (median of the lower half of the score distribution) to the third quartile (median of the upper half of the score distribution). The line in the box marks the median of the scores. The lower and upper whiskers extend at most 1.5 times the interquartile range (IQR), the distance between the first and third quartiles, beyond the box. Scores outside this bound are plotted as outliers. The dots mark the scores for individual datasets. Each color represents a different feature set group; lighter, more transparent shades indicate subsets of the SS-SPDF feature vector or the two- and three-component PCA of the raw signal, whose full feature set is shown in the brighter color. Green stands for the simple feature set, blue for the feature sets of the SPDF method, and red for the raw waveform feature sets.
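The box plot conventions in panel (B) can be made concrete with a short NumPy sketch. The function name `box_stats` and the example scores are hypothetical. Note that `np.percentile` interpolates linearly, so its quartiles can differ slightly from the "median of each half" definition given above.

```python
import numpy as np

def box_stats(scores):
    """Quartiles, whisker ends, and outliers for a 1.5 * IQR box plot."""
    scores = np.asarray(scores, dtype=float)
    q1, median, q3 = np.percentile(scores, [25, 50, 75])
    iqr = q3 - q1
    low_bound = q1 - 1.5 * iqr
    high_bound = q3 + 1.5 * iqr
    inside = scores[(scores >= low_bound) & (scores <= high_bound)]
    return {
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        # Whiskers end at the most extreme scores still inside the bounds.
        "whisker_low": float(inside.min()),
        "whisker_high": float(inside.max()),
        "outliers": scores[(scores < low_bound) | (scores > high_bound)],
    }

# Hypothetical accuracies for one feature set across eight datasets.
stats = box_stats([0.50, 0.60, 0.62, 0.65, 0.70, 0.72, 0.75, 0.20])
print(stats)
```

In this example the score 0.20 lies more than 1.5 IQR below the first quartile and would be drawn as an outlier, while the lower whisker stops at 0.50.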
Fig 3.
Spike waveforms and corresponding templates for datasets A1 and A6. Templates were created by averaging all detected spikes of a track.
(A) Waveforms for both tracks in recording A1 show the blue track with a higher amplitude than the green track. (B) Waveforms for both tracks in recording A6, where waveforms visually overlap except for a few outliers. (C) Templates for both tracks in recording A1 are visually distinct. (D) Templates for both tracks in recording A6 show near-complete overlap and similar amplitude, indicating high template similarity.
Fig 4.
Indicators of sorting success.
(A) Relationship between template distance and best-achieved sorting accuracy. The distance metrics used are the mean squared error (MSE), the mean absolute error (MAE), and the root mean squared error (RMSE). A smaller distance reflects a higher similarity between the two spike templates and is generally associated with lower classification accuracy. To illustrate this relationship clearly, datasets with only two fibers are shown as primary data points. Additionally, for completeness and transparency, datasets with more than two fibers are included using a workaround: we selected the pair with the highest template similarity and marked these datasets as grey crosses. While this allows all datasets to be visualized, it does not always reflect a fair comparison and may yield contradictory results, as seen in dataset B3, which exhibits very low error scores (e.g., MSE 0.004) but a high maximum classification accuracy (0.92). (B) Mean accuracy of each feature set (x-axis) across recordings, with accuracy values shown on the y-axis. The color of each marker represents the number of fibers tracked in the recording. Markers positioned above the horizontal line indicate that the classifier performs better than random chance for the corresponding number of fibers.
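The three template distances and the chance level can be written out in a few lines of NumPy. This is a sketch under stated assumptions: the function names and example templates are hypothetical, both templates are assumed to have the same length, and the chance level of 1/n assumes roughly balanced classes.

```python
import numpy as np

def template_distances(template_a, template_b):
    """MSE, MAE, and RMSE between two equally long spike templates."""
    diff = np.asarray(template_a, dtype=float) - np.asarray(template_b, dtype=float)
    mse = float(np.mean(diff ** 2))
    return {
        "MSE": mse,
        "MAE": float(np.mean(np.abs(diff))),
        "RMSE": float(np.sqrt(mse)),
    }

def chance_level(n_fibers):
    """Accuracy of random guessing, assuming equally frequent classes."""
    return 1.0 / n_fibers

# Hypothetical four-sample templates for two tracks.
d = template_distances([0.0, 1.0, -2.0, 0.5], [0.0, 0.5, -1.0, 0.5])
print(d)                # MSE 0.3125, MAE 0.375, RMSE about 0.559
print(chance_level(2))  # 0.5
```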
Fig 5.
Comparison of unsupervised clustering challenges for datasets A1 and A3.
Each panel visualizes spikes in a different feature set space; colors are assigned based on the ground truth labels obtained via the marking method. Panels (A–C) show results from dataset A1. (A) PCA projection in two components. (B) PCA projection in three components. (C) Three-dimensional feature set SPDFFV3. Panels (D–F) correspond to dataset A3. (D) PCA projection in two components. (E) PCA projection in three components. (F) Three-dimensional feature set SPDFFV3. These visualizations highlight the difficulty of distinguishing clusters using low-dimensional PCA representations, as well as in the SPDFFV3 feature space for dataset A3. Clear separation between clusters is observed only in the SPDFFV3 representation for dataset A1 (panel C), while the other views show substantial overlap.
Table 2.
Clustering performance across increasing numbers of PCA components for A1. To assess whether additional components improved separability, PCA was extended up to eight components beyond the commonly used first two or three. Clustering performance was evaluated using Adjusted Rand Index (ARI), Normalized Mutual Information (NMI), and V-measure.
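An evaluation of this kind could be set up as sketched below with scikit-learn. The caption does not name the clustering algorithm, so k-means is used here purely as an assumption, and the data are synthetic blobs standing in for spike waveforms with ground truth track labels.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import (
    adjusted_rand_score,
    normalized_mutual_info_score,
    v_measure_score,
)

# Synthetic stand-in: 300 "waveforms" with 30 samples each, two tracks.
X, y_true = make_blobs(
    n_samples=300, n_features=30, centers=2, cluster_std=4.0, random_state=0
)

results = {}
for n_components in range(2, 9):  # two to eight PCA components
    X_pca = PCA(n_components=n_components).fit_transform(X)
    # Assumption: k-means with the known number of tracks as cluster count.
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_pca)
    results[n_components] = {
        "ARI": adjusted_rand_score(y_true, labels),
        "NMI": normalized_mutual_info_score(y_true, labels),
        "V": v_measure_score(y_true, labels),
    }

for k, metrics in results.items():
    print(k, {name: round(value, 3) for name, value in metrics.items()})
```

All three scores compare predicted cluster labels with the ground truth and do not depend on how the clusters are numbered, which makes them suitable for unsupervised output.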
Table 3.
Performance of sorting algorithms within the SpikeInterface framework for datasets A1 and A5. SpyKING Circus 2 and MountainSort5 were applied to two microneurography recordings (datasets A1 and A5). While both algorithms detected varying numbers of units and identified only a subset of the ground truth spikes, the primary issue was the failure to distinguish between fibers: spikes from both fibers were consistently merged into a single unit.
Fig 6.
Details on microneurography experiments.
(A) Setup of microneurography experiments. One microelectrode is inserted into a fascicle of the nerve and another is placed as a reference electrode in the skin nearby. In the receptive field in the skin, C-fibers are activated by electrical stimulation. (B) Schematic waterfall plot of two C-fiber tracks. The waterfall provides a visual representation of the marking method with two active fibers (green and red spikes) represented as tracks. The onset of the electrical stimulation is indicated by the blue rectangle. Each line, referred to as a trace, begins with the low-frequency stimulus. When extra stimulation is applied (line 6, red and green spikes) or spontaneous activity occurs (line 11, orange spikes), activity-dependent slowing (ADS) is observed in both fibers [56]. (C) An example spike with a low signal-to-noise ratio. Screenshot of a recording in Dapsys, in which the red line indicates the exemplary spike of interest, whose amplitude is similar to the background noise level. We could only identify the spike by the marking method.
Table 4.
Overview of the data collection, including the number of active tracks, the class distributions, and the number of total spikes. Additionally, we included the spike numbers for each feature set. The classes are mostly equally distributed. Datasets labeled with A are from the lab in Aachen and datasets labeled with B are from Bristol.
Fig 7.
The recording (either a raw Dapsys file or an HDF5 file) is first read in and converted into pandas data frames containing spike times, stimulus times, and raw signals. The data from Dapsys and HDF5 are saved as NIX files via the creation of a Neo block. Spikes are then extracted from the raw signal using a window function around each spike timestamp. The first and second derivatives of the signal are computed to enable alignment based on the most negative peak of the first derivative. Finally, spike templates are computed for each track by averaging all aligned spike waveforms.
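The extraction, alignment, and template steps can be illustrated with the NumPy sketch below. It is not the study's implementation: the function names, the window and search sizes, and the synthetic signal are assumptions, file reading (Dapsys, HDF5, NIX, Neo) is left out, and only the first derivative is used.

```python
import numpy as np

def extract_aligned_spikes(signal, spike_idx, half_window=15, search=5):
    """Cut a window around each spike, centered on the most negative
    value of the first derivative near the given timestamp."""
    signal = np.asarray(signal, dtype=float)
    d1 = np.gradient(signal)  # first derivative (central differences)
    waveforms = []
    for idx in spike_idx:
        lo, hi = idx - search, idx + search + 1
        if lo < 0 or hi > len(signal):
            continue  # search window would leave the recording
        anchor = lo + int(np.argmin(d1[lo:hi]))
        start, stop = anchor - half_window, anchor + half_window + 1
        if start < 0 or stop > len(signal):
            continue  # spike too close to the edge of the recording
        waveforms.append(signal[start:stop])
    return np.array(waveforms)

def compute_template(waveforms):
    """Template of one track: the mean of all aligned waveforms."""
    return waveforms.mean(axis=0)

# Synthetic signal: four negative Gaussian-shaped spikes, no noise.
t = np.arange(500)
signal = np.zeros(500)
for peak in (100, 200, 300, 400):
    signal -= np.exp(-((t - peak) ** 2) / (2 * 3.0 ** 2))

# Timestamps with a small jitter relative to the true spike peaks.
timestamps = [102, 199, 300, 398]
waveforms = extract_aligned_spikes(signal, timestamps)
template = compute_template(waveforms)
print(waveforms.shape, template.shape)
```

Aligning on the steepest downward slope rather than on the timestamp removes the jitter, so the averaged template is not smeared by misaligned waveforms.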
Table 5.
Different feature sets are extracted through domain-agnostic dimension reduction as well as through spike-specific approaches, using the raw waveform and features computed on the waveforms. The feature sets include methods such as principal component analysis (PCA) and the features from the spike sorting approach based on shape, phase, and distribution features (SS-SPDF) by Caro-Martín et al. [11].