The authors have declared that no competing interests exist.

Temporally ordered multi-neuron patterns likely encode information in the brain. We introduce an unsupervised method, SPOTDisClust (Spike Pattern Optimal Transport Dissimilarity Clustering), for their detection from high-dimensional neural ensembles. SPOTDisClust measures similarity between two ensemble spike patterns by determining the minimum transport cost of transforming their corresponding normalized cross-correlation matrices into each other (SPOTDis). Then, it performs density-based clustering based on the resulting inter-pattern dissimilarity matrix. SPOTDisClust does not require binning and can detect complex patterns (beyond sequential activation) even when high levels of out-of-pattern “noise” spiking are present. Our method efficiently handles the additional information from increasingly large neuronal ensembles and can detect a number of patterns that far exceeds the number of recorded neurons. In an application to neural ensemble data from macaque monkey V1 cortex, SPOTDisClust identifies different moving stimulus directions on the sole basis of temporal spiking patterns.

The brain encodes information in ensembles of neurons, and recent technological developments allow researchers to record simultaneously from thousands of neurons. Neurons exhibit spontaneous activity patterns, which are constrained by experience and development, limiting the portion of state space that is effectively visited. Patterns of spontaneous activity may shape the synaptic connectivity matrix and contribute to memory consolidation, and synaptic plasticity depends crucially on the temporal spiking order among neurons. Hence, the unsupervised detection of spike sequences is a sine qua non for understanding how spontaneous activity contributes to memory formation. Yet, sequence detection presents major methodological challenges, such as the sparsity, stochasticity, and high dimensionality of neuronal output. We propose a dissimilarity measure between neuronal patterns based on optimal transport theory, determining their similarity from the pairwise cross-correlation matrix, which can be taken as a proxy of the “trace” that is left on the synaptic matrix. We then perform unsupervised clustering and visualization of patterns using density clustering on the dissimilarity matrix and low-dimensional embedding techniques. This method does not require binning of spike times, is robust to noise, jitter and rate fluctuations, and can detect more patterns than the number of neurons.


Precisely timed spike patterns spanning multiple neurons are a ubiquitous feature of both spontaneous and stimulus-evoked brain network activity. Remarkably, not all patterns are generated with equal probability. Synaptic connectivity, shaped by development and experience, favors certain spike sequences over others, limiting the portion of the network’s “state space” that is effectively visited [

Timing information between spikes of different neurons is critical for memory function, as it regulates spike timing dependent plasticity (STDP) of synapses, with firing of a post-synaptic neuron following the firing of a pre-synaptic neuron typically inducing synaptic potentiation, and firing in the reverse order typically inducing depotentiation [

Detecting these temporal patterns represents a major methodological challenge. With recent advances in neuro-technology, it is now possible to record from thousands of neurons simultaneously [

While approaches like frequent itemset mining and related methods [

In this paper we introduce a novel spike pattern detection method called SPOTDisClust (

A) Structure of five “ground-truth” patterns, affecting 50 neurons. λ_{in} = 0.2 spks/sample, λ_{out} = 0.02 spks/sample, T_{epoch} = 300 samples, T_{pulse} = 30 samples. For each pattern and each neuron, a random position was chosen for the activation pulse. B) Neuronal output is generated according to an inhomogeneous Poisson process, with rates dictated by the patterns in (A). A total of 300 epochs were simulated, out of which 150 epochs were noise patterns, and each of the 5 patterns contributed 30 epochs. C) Cross-correlation histograms, normalized to unit mass, are shown for a subset of neuron pairs, for three different epochs. Two epochs constitute realizations of the same pattern, and one epoch belongs to a different pattern. Shown is the EMD for each neuron pair, between the different epochs. These EMDs are then averaged across all neuron pairs to compute the SPOTDis. D) Illustration of the EMD for three different pairs of spike distributions. E) Left: the t-SNE projection with the ground-truth cluster labels shown. Middle: sorted dissimilarity (SPOTDis) matrix, with epochs sorted by the pattern they belong to (first 150 epochs are noise patterns). Right: reconstructed cluster average cross-correlations between neuron 1 and neurons 2-49 for pattern 2. Note the similarity between the structure of the reconstructed cross-correlation matrix and the structure of pattern 2.

Left: Two realizations of two different patterns, for 50 neurons. Simulation parameters were λ_{in} = 0.35 spks/sample, λ_{out} = 0.05 spks/sample, T_{epoch} = 300 samples, T_{pulse} = 30 samples. For each pattern and each neuron, a random position was chosen for the activation pulse. Right: For 500 patterns, 30 realizations per pattern were generated, and 15000 noise epochs were added. t-SNE projection with HDBSCAN labels shows that our clustering method can retrieve all patterns from the data.

(A) Multiple bimodal activation patterns and examples of realizations for each pattern (λ_{in} = 0.35 spks/sample, λ_{out} = 0.05 spks/sample, T_{epoch} = 300 and T_{pulse} = 20 samples). Bottom figures show the sorted dissimilarity matrix and t-SNE for the simulation with patterned noise (left) and homogeneous noise (right). (B) Multiple bimodal activation patterns and examples of realizations for each pattern (λ_{out} = 0.02 spks/sample (i.e. the deactivation period), λ_{in} = 0.3 spks/sample, T_{epoch} = 300 and T_{deactivation} = 150 samples). Bottom figures show the sorted dissimilarity matrix and t-SNE for the simulation with patterned noise (left) and homogeneous noise (right).

(A) Realizations of four different patterns, in which two pattern pairs (1-2 and 3-4) have the same coarse structure, but a finer structure is embedded inside each coarse pattern. Simulation parameters were λ_{in,coarse} = 0.2 spks/sample, λ_{in,fine} = 0.8 spks/sample, λ_{out} = 0.05 spks/sample, T_{epoch} = 300, T_{pulse,coarse} = 90 samples, T_{pulse,fine} = 30 samples. Panels on bottom show the sorted dissimilarity matrix and t-SNE for simulations with patterned noise (left) and homogeneous noise (right). (B) Realizations of multiple patterns, in which different random subsets of neurons become simultaneously active, leading to synchronous firing without temporal order. Simulation parameters were λ_{in} = 0.4 spks/sample, λ_{out} = 0.05 spks/sample, T_{epoch} = 300, T_{pulse} = 50 samples.

(A) Example of 5 patterns with a low SNR, with λ_{in} = 0.3 spks/sample, λ_{out} = 0.1 spks/sample, T_{epoch} = 300 samples, T_{pulse} = 30 samples. Shown in bottom panels are the sorted dissimilarity matrix (left), the t-SNE embedding with ground-truth cluster labels, and the t-SNE embedding with cluster labels assigned by HDBSCAN. (B) Performance of SPOTDis depends on the SNR. Left: The firing rate inside the pulse period was varied, while the firing rate outside the pulse was held constant. We simulated 5 patterns with 30 repetitions each, with λ_{out} = 0.05 spks/sample, λ_{in} attaining values of 0.15, 0.2, 0.25, 0.35, 0.45 or 0.5 spks/sample, T_{pulse} = 30 and T_{epoch} = 1000 samples. The number of neurons was 25, 50 or 100. In addition, 150 epochs of homogeneous noise were included. We show the mean and the standard deviation across 10 repetitions of the same simulation. Performance relative to ground truth was measured with the ARI (see Materials and Methods). Right: λ_{out} = 0.05 spks/sample, λ_{in} = 0.5, 0.4, 0.3, 0.2 or 0.1 spks/sample, and T_{pulse} of 100, 200, 300, 400 or 500 samples, with T_{epoch} = 1000 samples; note that the product λ_{in}T_{pulse} remained constant.

Example of 5 patterns with sparse firing. Simulation parameters were λ_{in} = 0.015 spks/sample, λ_{out} = 0.0001 spks/sample, T_{epoch} = 300 samples, T_{pulse} = 30 samples. For each pattern, two spike realizations are shown. Bottom panels show the sorted dissimilarity matrix and t-SNE with ground-truth cluster labels (left) and HDBSCAN cluster labels (right).

A) Shown are two different temporal patterns. Each temporal pattern can occur in a low (λ_{in} = 0.2 and λ_{out} = 0.02 spks/sample), medium (λ_{in} = 0.4 and λ_{out} = 0.04 spks/sample) or high rate (λ_{in} = 0.7 and λ_{out} = 0.07 spks/sample) state, with a constant ratio of λ_{in}/λ_{out}. In addition, the noise pattern can also occur in one of three rate states. The pulse duration was 30 samples. Shown at the bottom are the sorted dissimilarity matrix with SPOTDis values, the t-SNE embedding with the ground-truth cluster labels, and the t-SNE embedding with the HDBSCAN cluster labels. B) Shown are two temporal patterns. Each temporal pattern could occur in one of two rate states: in the first rate state, the first 25 neurons fire at a low rate (λ_{in} = 0.3 and λ_{out} = 0.03 spks/sample), and the other 25 fire at a high rate (λ_{in} = 0.7 and λ_{out} = 0.07 spks/sample). In the second rate state, the rate scaling is reversed. The pulse duration was 30 samples. Shown at the bottom are the sorted dissimilarity matrix with SPOTDis values, the t-SNE embedding with the ground-truth cluster labels, and the t-SNE embedding with the HDBSCAN cluster labels.

Top: shown are the average peri-stimulus histograms for four moving bar stimuli spanning the four cardinal directions. Middle: multi-unit spike train realizations for each of the four conditions. Bottom: sorted dissimilarity matrix, t-SNE with ground-truth labels, and t-SNE with HDBSCAN cluster labels.

Suppose we perform spiking measurements from an ensemble of N neurons, divided into epochs. For each epoch, we characterize the temporal pattern by the matrix of cross-correlations c_{ij}(τ) between the spike trains s_{i}(t) and s_{j}(t) of all neuron pairs (i, j).

The SPOTDisClust method contains two steps (

The SPOTDis measure is constructed as follows:

We compute, for each of the N(N − 1)/2 neuron pairs, the cross-correlation histogram between the two neurons’ spike trains in each epoch, and normalize it to unit mass.

For each pair of epochs (k, m) and each neuron pair (i, j), we compute the EMD between the normalized cross-correlations, EMD_{ij,km} (see Materials and Methods), with the cost of transporting mass between two time points t_{1} and t_{2} defined as |t_{1} − t_{2}|/(2T), where T is the epoch duration.

The advantage of using the EMD is multi-fold. First, it is a symmetric and metric measure of similarity between two probability distributions (as opposed to e.g. the Kullback-Leibler divergence). Second, as it is a “cross-bins” distance, it can handle jitter in spike timing. In other words, it quantifies not only whether two distributions are overlapping, like the Kullback-Leibler divergence, but also how far they are shifted away, as minimum transport cost, from each other in a metric space (in this case: time). Third, it does not rely on the computation of a measure of central tendency like the center of mass or peak of the probability distribution, but can also compute transport cost between multimodal probability distributions (

After computing the EMDs between each pair of epochs for each neuron pair separately, we compute the SPOTDis as the average EMD across neuron pairs, restricted to those pairs for which both neurons fired at least one spike in both epochs. The rationale behind ignoring the other neuron pairs for computing the SPOTDis is that it avoids assigning an arbitrary value to the EMD in the case where we have no information about the temporal relationship between the neurons (i.e. where we do not have any spikes for one neuron in one epoch). We assume for now that all neuron pairs are active in all epochs; the case of silent neurons is treated further below.
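This averaging step can be sketched in a few lines of Python (a minimal sketch; function and variable names are ours, and NaN marks neuron pairs for which the EMD is undefined because one neuron fired no spikes in one of the two epochs):

```python
import numpy as np

def spotdis(emd_pairs):
    """Average EMD across neuron pairs, ignoring pairs for which the
    EMD is undefined (NaN) because one neuron fired no spikes in one
    of the two epochs being compared."""
    emd_pairs = np.asarray(emd_pairs, dtype=float)
    valid = ~np.isnan(emd_pairs)
    if not valid.any():
        return np.nan  # no pair carries temporal information
    return emd_pairs[valid].mean()

# Example: three neuron pairs, one with an undefined EMD
print(spotdis([0.10, 0.30, np.nan]))  # 0.2
```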

To test the SPOTDisClust method for cases in which the ground truth is known, we generated patterns of T_{epoch} = 300 samples, defined by the instantaneous rate of inhomogeneous Poisson processes, and then generated spiking outputs according to these. The pulse activation period (of T_{pulse} samples) is the period in the epoch in which the neuron is more active than during the baseline, and the positions of the pulses across neurons define the pattern. For each neuron and pattern, the position of the pulse activation period was randomly chosen. We generated multiple such patterns, each repeated across many epochs, together with noise epochs.
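The ground-truth generation described above can be sketched as follows (a simplified Python version; parameter and function names are ours, with rates expressed in spikes per sample as in the figure captions):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_pattern(n_neurons=50, t_epoch=300, t_pulse=30,
                     lam_in=0.2, lam_out=0.02):
    """One ground-truth pattern: each neuron gets a randomly placed
    activation pulse with elevated rate lam_in; lam_out elsewhere."""
    starts = rng.integers(0, t_epoch - t_pulse, size=n_neurons)
    rate = np.full((n_neurons, t_epoch), lam_out)
    for i, s in enumerate(starts):
        rate[i, s:s + t_pulse] = lam_in
    return starts, rate

def realize(rate):
    """Draw Poisson spike counts per sample given the rate matrix,
    i.e. a realization of the inhomogeneous Poisson process."""
    return rng.poisson(rate)

starts, rate = simulate_pattern()
spikes = realize(rate)  # one epoch realizing this pattern
```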

A key challenge for any pattern detection algorithm is to find a larger number of patterns than the number of measurement variables, assuming that each pattern is observed several times. This is impossible to achieve with traditional linear methods like PCA (Principal Component Analysis), which do not yield more components than the number of neurons (or channels), although decomposition techniques using overcomplete basis sets might in principle be able to do so. Other approaches like frequent itemset mining and related methods [

Because SPOTDisClust clusters patterns based on small SPOTDis dissimilarities, it does not require exact matches of the same pattern to occur; it only requires that different instantiations of the same pattern are similar enough to one another (i.e. have sufficiently small SPOTDis values) to separate them from other clusters and from the noise.

When many patterns are detectable, the geometry of the low dimensional t-SNE embedding needs to be interpreted carefully: in this case, all 500 patterns are roughly equidistant from each other; however, there does not exist a 2-D projection in which all 500 clusters are equidistant from each other (in two dimensions this is possible only for three points, which form a triangle).

Temporal patterns in neuronal data may consist not only of ordered sequences of activation, but can also have a more complex character. As explained above, a key advantage of the SPOTDis measure is that it computes averages over the EMD, which can distinguish complex patterns beyond patterns that differ only by a measure of central tendency. Indeed, we will demonstrate that SPOTDisClust can detect a wide variety of patterns, for which traditional methods that are based on the relative activation order (sequence) of neurons may not be well equipped.

We first consider a case where the patterns consist of bimodal activations within the epoch (

Next, we consider a case where there are two coarse patterns and two fine patterns embedded within each coarse pattern, resulting in a total number of four patterns. This example might be relevant for sequences that result from cross-frequency theta-gamma coupling, or from the sequential activation of place fields that is accompanied by theta phase sequences on a faster time scale [

Finally, we consider a set of patterns consisting of a synchronous (i.e. without delays) firing of a subset of cells, with a cross-correlation function that is symmetric around the delay τ = 0.

Previous methods to identify the co-activation (without consideration of time delays) of different neuronal assemblies relied on PCA [

A major challenge for the clustering of temporal spiking patterns is the stochasticity of neuronal firing. That is, in neural data, it is extremely unlikely to encounter, in a high dimensional space, a copy of the same pattern exactly twice, or even two instantiations that differ by only a few insertions or deletions of spikes. Furthermore, patterns might be distinct when they span a high-dimensional neural space, even when bivariate correlations among neurons are weak and when the firing of neurons in the activation period is only slightly higher than the baseline firing around it (see further below). The robustness of a sequence detection algorithm to noise is therefore critical.

We can dissociate different aspects of “noise” in temporal spiking patterns. A first source of noise is the stochastic fluctuation in the number of spikes during the pulse activation period and baseline firing period. In the ground-truth simulations presented here, this fluctuation is driven by the generation of spikes according to inhomogeneous Poisson processes. This type of noise causes differences in SPOTDis values between epochs, because of differences in the amount of mass in the pulse activation and baseline period, in combination with the normalization of the cross-correlation histogram. In the extreme case, some neurons may not fire in a given epoch, such that all information about the temporal structure of the pattern is lost. Such neural “silence” might be prevalent when we search for spiking patterns on a short time scale. We note that fluctuations in the spike count are primarily detrimental to clustering performance because there is baseline firing around the pulse activation period, in other words because “noisy” spikes are inserted at random points in time around the pulse activations. To see this, suppose that the probability that a neuron fires at least one spike during the pulse activation period is close to one for all neurons; in that case, every epoch reliably expresses the temporal structure of the pattern, and in the absence of baseline spikes the remaining spike count fluctuations would leave the normalized cross-correlations largely unchanged.

A second source of noise is the jitter in spike timing. Jitter in spike timing also gives rise to fluctuations in the SPOTDis; in the ground-truth simulations presented here, spike timing jitter is a consequence of the generation of spikes according to Poisson processes. As explained above, because the SPOTDisClust method does not require exact matches of the observed patterns, but is a “cross-bins” dissimilarity measure, it can handle jitter in spike timing well. Again, we can distinguish jitter in spike timing during the baseline firing, and jitter in spike timing during the pulse activation period. The amount of perturbation caused by spike timing jitter during the pulse activation period is a function of the pulse period duration. We will explore the consequences of these different noise sources, namely the amount of baseline firing, the sparsity of firing, and spike timing jitter, in Figs

We define the SNR (Signal-to-Noise-Ratio) as the ratio of the firing rate inside the activation pulse period over the firing rate outside the activation period. This measure of SNR reflects both the amount of firing in the pulse activation period as compared to the baseline period (first source of noise), and the pulse duration as compared to the epoch duration (second source of noise).

We first consider an example of 100 neurons that have a relatively low SNR (

To systematically analyze the dependence of clustering performance on the SNR, we varied the SNR by changing the firing rate inside the activation pulse period, while leaving the firing rate outside the activation period as well as the duration of the activation (pulse) period constant. Thus, we varied the first aspect of noise, which is driven by spike count fluctuations. A measure of performance was then constructed by comparing the unsupervised cluster labels rendered by HDBSCAN with the ground-truth cluster labels, using the Adjusted Rand Index (ARI) measure (see Materials and Methods).

We also varied the SNR by changing the pulse duration while leaving the ratio of the expected number of spikes in the activation period relative to the baseline constant. The latter was achieved by adjusting the firing rate inside the activation period, such that the product of pulse duration with firing rate in the activation period, T_{pulse}λ_{in}, remained constant. Patterns comprising brief, high-rate activation pulses yielded clusters that are better separated than patterns comprising longer activation pulses.

We performed further simulations to study in a more simplified, one-dimensional setting how the SPOTDis depends quantitatively on the insertion of noise spikes outside of the activation pulse periods, which further demonstrates the robustness of the SPOTDis measure to noise (

In addition, we performed simulations to determine the influence of spike sorting errors on the clustering performance. In general, spike sorting errors lead to a reduction in HDBSCAN clustering performance, the extent of which depends on the type of spike sorting error (contamination or collision) [

As explained above, an extreme case of noise driven by spike count fluctuations is the absence of firing during an epoch. If many neurons remain “silent” in a given epoch, then we can only compute the EMD for a small subset of neuron pairs (

A key aim of the SPOTDisClust methodology is to identify temporal patterns that are based on consistent temporal relationships among neurons. However, in addition to temporal patterns, neuronal populations can also exhibit fluctuations in the firing rate that can be driven by e.g. external input or behavioral state and are superimposed on temporal patterns. A global scaling of the firing rate, or a scaling of the firing rate for a specific assembly, should not constitute a different temporal pattern if the temporal structure of the pattern remains unaltered, i.e. when the normalized cross-correlation function has the same expected value, and should not interfere with the clustering of temporal patterns. This is an important point for practical applications, because it might occur for instance that in specific behavioral states rates are globally scaled [

In

Another example of a rate scaling is one that consists of a scaling of the firing rate for one half of the neurons (

We apply the SPOTDisClust method to data collected from monkey V1. Simultaneous recordings were performed from 64 V1 channels using a chronically implanted Utah array (Blackrock) (see Materials and Methods).

For the simulations and applications above, we have assumed that the temporal window of interest for the application was known, i.e. that we had some

We have presented a novel dissimilarity measure for multi-neuron temporal spike patterns, SPOTDis, with unique properties that make it suitable for the unsupervised exploration of the space of admissible firing patterns. SPOTDis is rooted in optimal transport theory, a burgeoning field in mathematics that offers promising solutions for fields as diverse as economics, engineering, physics and chemistry [

Distance measures based on “morphing” one spike train into another by moving spikes have been previously proposed. The Victor-Purpura distance, which is an adaptation of the Levenshtein distance to point processes, is a paradigmatic example [

First, the Victor-Purpura distance allows for the insertion and deletion of spikes, to enable computation of distances between spike trains with different numbers of spikes, adding in each case a penalty term (the penalty terms are arbitrary parameters to be optimized). While this may be a principled way to deal with this issue, it introduces additional complexity in the computation of the distance, as many different combinations of spike shifting and insertion/deletion must be considered in order to find the optimal solution. This may render optimization difficult and the computation prohibitive as one attempts to compare a large number of multi-neuron patterns. We take the simpler approach of rescaling the time series to be compared, in order to equalize mass. While this may be an oversimplification in some cases, it enables us to implement the computation in a very efficient way. Yet, we preserve many desirable features of spike train metrics such as the Victor-Purpura distance. For example, SPOTDis is not based on measures of central tendency, but can also compute dissimilarities between multimodal probability distributions (

A second important difference with spike train metric methods such as Victor-Purpura distance is that we calculate the pairwise epoch-to-epoch dissimilarity not directly on spike trains but on cross-correlograms between pairs of cell spike trains. This has the considerable advantage of enabling detection of similarity between spiking patterns that are misaligned, and eliminates the need for precise time reference points (e.g. the time of stimulus delivery), providing a way to freely search for repeated patterns in spontaneous or evoked activity. Comparing cross-correlation patterns between epochs has been used in seminal work on memory replay, where cross-correlation “bias” was compared across entire sleep or behavioral epochs, to assess the presence of significant replay [

We combined SPOTDis with a density-based clustering algorithm, HDBSCAN, which forms a good match for several reasons: First, it can deal with non-metric dissimilarities. While SPOTDis on a single cell pair cross-correlation is metric (and the sum of metrics is a metric), absence of firing in some neurons and in some cell pairs may cause violation of metricity, which is handled gracefully by HDBSCAN. Second, it can identify clusters at different characteristic densities in different regions of the state space, adapting to patterns that may arise at different time scales and different precision due to disparate underlying mechanisms. Yet, other clustering strategies than HDBSCAN may work successfully as well. We show that in many cases, a non-linear embedding technique such as t-SNE acting on SPOTDis yields a quite intuitive representation of the underlying structure of the data.

It is important to emphasize that our method is an explicit clustering method, that can find unique patterns of network activity that are well separated from one another. Several methods using decomposition techniques like PCA or matrix factorization have been utilized with the goal of extracting patterns or sequences from neuronal ensemble data [

We provide an initial application of the SPOTDis measure to real neuronal data, by analyzing multi-electrode recordings in visual cortex. In this analysis, we fed the algorithm the neural data without any knowledge of the task structure, or of the times of stimulus delivery. Strikingly, the identified clusters faithfully reflected the structure of the PSTH calculated with traditional methods, with availability of the stimulus delivery times and labels. Thus, we can recover stimulus information even after normalizing away firing rate information, which is conventionally used to decode different stimuli, demonstrating that the temporal structure of population activity encodes different moving stimulus directions. We also developed a clustering method analogous to SPOTDisClust by constructing a dissimilarity matrix based on L1 distances among population firing rate vectors. Using this clustering technique, the different stimulus directions could not be separated from one another using t-SNE embedding or HDBSCAN.

While we argued that our approach using bivariate cross-correlations yields many advantages, it also has a limitation in the sense that it does not capture higher-order correlations among neurons. Future extensions of this technique may explicitly construct a dissimilarity measure based on high-order correlations among neurons. Indeed, incorporating this may be an interesting avenue for future work. Nonetheless, it should be noted that higher order correlations in a population may be captured by models fitting the first and second moments alone [

In the present work, we only considered spike trains as if they were recorded using electrophysiological methods. However, this method may also be applied to two-photon calcium imaging data using sensors like GCaMP6f. Analysis of this type of data always involves some additional preprocessing steps like denoising, deconvolution, region of interest identification, and normalization. An excellent strategy would be to apply the method on calcium imaging data after deconvolution and source extraction, which yields sparse time series with “spikes”, although not measured with the same temporal resolution as in the case of electrophysiological recordings. One would first subtract F_{background}, where F_{background} is the background fluorescence. After this, one would normalize to obtain ΔF/F.
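As an illustration of such a preprocessing step, a minimal ΔF/F normalization could look as follows (a hypothetical sketch; the percentile-based baseline estimate is our assumption, not a step prescribed here, and function names are ours):

```python
import numpy as np

def dff(f, f_background, baseline_percentile=20):
    """Hypothetical preprocessing sketch: subtract the background
    fluorescence, estimate a baseline F0 as a low percentile of the
    corrected trace (our assumption), and return (F - F0) / F0."""
    f = np.asarray(f, dtype=float) - f_background
    f0 = np.percentile(f, baseline_percentile)
    return (f - f0) / f0

# Toy fluorescence trace with one transient
trace = dff(np.array([110., 112., 150., 111.]), f_background=10.)
```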

In conclusion, we have proposed a new tool for the efficient unsupervised analysis of multi-neuron data, which opens up more flexible ways to analyze spontaneous and evoked activity than has so far been possible.

The SPOTDisClust method contains two steps. The first step is to construct the pairwise epoch-to-epoch SPOTDis measure on the matrix of cross-correlations among all neuron pairs.

We compute, for each of the N(N − 1)/2 neuron pairs (i, j), the cross-correlation between the spike trains s_{i,k}(t) and s_{j,k}(t) of neurons i and j in epoch k, and normalize it to unit mass.
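For a single neuron pair and epoch, this normalized cross-correlation histogram can be computed as follows (a sketch; function and variable names are ours, with spike times given in samples):

```python
import numpy as np

def norm_xcorr(spk_i, spk_j, t_epoch):
    """All pairwise spike-time delays t_j - t_i between two spike
    trains in one epoch, histogrammed over [-t_epoch, t_epoch] and
    normalized to unit mass. Returns None when either train is empty,
    in which case the EMD for this pair is undefined."""
    if len(spk_i) == 0 or len(spk_j) == 0:
        return None
    delays = np.subtract.outer(spk_j, spk_i).ravel()
    edges = np.arange(-t_epoch, t_epoch + 2) - 0.5  # one bin per sample delay
    hist, _ = np.histogram(delays, bins=edges)
    return hist / hist.sum()

h = norm_xcorr(np.array([10, 20]), np.array([15]), t_epoch=300)
```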

We then compute the Earth Mover’s Distance (EMD) between the normalized cross-correlations c_{ij,k}(τ) and c_{ij,m}(τ) of epochs k and m. Both are normalized to unit mass, i.e. Σ_{τ} c_{ij,k}(τ) = 1 and Σ_{τ} c_{ij,m}(τ) = 1. Let t_{q} be the time points for which c_{ij,k}(t_{q}) > 0, and t_{r} the time points for which c_{ij,m}(t_{r}) > 0, both sorted in increasing order, i.e. t_{q} ≥ t_{p} for all q ≥ p, and t_{r} ≥ t_{z} for all r ≥ z.

Because the EMD is computed on the precise delay times, we can let the bin width Δτ become arbitrarily small; that is, the method does not require any binning of spike times.

The EMD also requires the definition of a moving cost function, which in this case is defined over time points. Let c(t_{1}, t_{2}) ≡ |t_{1} − t_{2}|/(2T) be the cost of transporting one unit of mass from time point t_{1} to time point t_{2}, where T is the epoch duration (such that the maximum possible cost equals 1, as delays range over [−T, T]). The EMD is then obtained by finding the flow matrix [f_{q,r}], with f_{q,r} the flow (i.e. the amount of mass moved) from t_{q} to t_{r}, such that the overall cost Σ_{q,r} f_{q,r} c(t_{q}, t_{r}) is minimized.

We solve the transport problem algorithmically as follows. Let u_{q} denote the remaining mass at time point t_{q} of the first distribution (with n_{k} support points), and v_{r} the remaining mass at time point t_{r} of the second (with n_{m} support points).

% START ALGORITHM

SET emd = 0, q = 1, r = 1

WHILE q ≤ n_{k} AND r ≤ n_{m}

% Move all the remaining mass in u_{q} to v_{r}, but do not move more than v_{r}.

SET flow = min(u_{q}, v_{r})

% The cost equals the flow times the cost of moving mass from the time point t_{q} to the time point t_{r}.

SET cost = flow × c(t_{q}, t_{r})

% Add this cost to the total cost.

SET emd = emd + cost

% Compute the remaining mass in u_{q} to be moved, which equals the previous mass u_{q} minus the mass we just moved.

SET u_{q} = u_{q} − flow

% Compute the remaining mass in v_{r}, to which we still need to transport.

SET v_{r} = v_{r} − flow

% If there is no mass left then increment the index; note that both distributions have unit mass, so both indices reach the end together.

IF v_{r} = 0 THEN

SET r = r + 1

ENDIF

IF u_{q} = 0 THEN

SET q = q + 1

ENDIF

ENDWHILE

% END ALGORITHM
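This single greedy pass over the sorted support points can be written compactly in Python (a sketch; function and variable names are ours):

```python
def emd_1d(t_k, w_k, t_m, w_m, T):
    """Greedy 1-D EMD between two unit-mass distributions given as
    sorted support points t_k, t_m with weights w_k, w_m; the cost of
    moving unit mass from t1 to t2 is |t1 - t2| / (2*T)."""
    w_k, w_m = list(w_k), list(w_m)  # remaining masses (mutated below)
    emd, q, r = 0.0, 0, 0
    while q < len(t_k) and r < len(t_m):
        flow = min(w_k[q], w_m[r])            # move as much as possible
        emd += flow * abs(t_k[q] - t_m[r]) / (2 * T)
        w_k[q] -= flow
        w_m[r] -= flow
        if w_m[r] == 0:                       # destination bin filled
            r += 1
        if q < len(t_k) and w_k[q] == 0:      # source bin exhausted
            q += 1
    return emd

# Two point masses a distance 10 apart, in an epoch of duration 100:
print(emd_1d([0], [1.0], [10], [1.0], T=100))  # 0.05
```

Because both histograms are sorted in time, this greedy left-to-right transport is optimal for the 1-D case, which is what makes the SPOTDis computation efficient.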

After computing the EMDs between each pair of epochs for each neuron pair separately, with computational complexity of order O(N^{2}E^{2}s^{2}), where N is the number of neurons, E the number of epochs, and s the average number of spikes per neuron per epoch, we compute the SPOTDis as the average EMD across neuron pairs.

Computing the SPOTDis with the sparse simulation (

HDBSCAN is an automated density clustering algorithm that clusters on the basis of pairwise dissimilarity matrices. An extensive overview of HDBSCAN can be found in [

After pairwise distances have been computed between all data points, HDBSCAN defines a “mutual reachability distance” between each pair of data points (in our case epochs). The mutual reachability distance is an adjustment of the distance measure that effectively acts as a smoother. For each epoch, a core distance is defined as the distance to its m_{pts}th nearest neighbour; the mutual reachability distance between two epochs is then the maximum of their two core distances and their direct distance.
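This definition can be illustrated on a toy dissimilarity matrix (a sketch for exposition only; real HDBSCAN implementations compute this internally):

```python
import numpy as np

def mutual_reachability(D, m_pts):
    """Mutual reachability: the core distance of a point is the distance
    to its m_pts-th nearest neighbour (after sorting, column 0 holds the
    self-distance 0); the mutual reachability distance between two points
    is the maximum of their two core distances and their direct distance."""
    D = np.asarray(D, dtype=float)
    core = np.sort(D, axis=1)[:, m_pts]
    return np.maximum(np.maximum(core[:, None], core[None, :]), D)

D = np.array([[0., 1., 4.],
              [1., 0., 2.],
              [4., 2., 0.]])
M = mutual_reachability(D, m_pts=1)
```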

HDBSCAN then defines a minimum spanning tree, in which there is a path between all points (vertices), without any loops (i.e. an acyclic graph), such that the total weight of the edge connections is minimized. Here the edges are the mutual reachability distances.

HDBSCAN then constructs a hierarchical cluster dendrogram from the minimum spanning tree as follows: initially, all points are assigned to the “root”, a single cluster containing all points. HDBSCAN then progressively lowers a distance threshold ε; as edges of the spanning tree exceed the threshold, clusters split, and components with fewer than m_{clSize} members, the minimum cluster size, are treated as noise. We set m_{pts} = m_{clSize}, such that m_{pts} is the only hyperparameter, which we set to 10 here, unless specified otherwise, and we use a publicly available implementation of HDBSCAN. In the dendrogram, a cluster is born at some threshold ε_{max}, and at some point, the cluster dies, at ε_{min}. For each member of the cluster we can define the value ε_{k} at which it falls out of the cluster, and the stability of a cluster is then defined as the sum of 1/ε_{k} − 1/ε_{max} over all members. HDBSCAN then selects in the dendrogram the optimal levels at which to cut the tree in order to maximize the stability of the selected clusters, forming a set of clusters. An advantage of this selection procedure is that it allows for clusters of varying density.

T-SNE (t-distributed stochastic neighbor embedding) is a dimensionality reduction technique for high-dimensional datasets [

For each two data points (k, m), a conditional probability p_{m|k} is defined as a Gaussian function of their distance d_{km}. In SNE, the distance d_{km} is the L1 norm for a high-dimensional dataset x_{1}, …, x_{M}, defined as d_{km} = ||x_{k} − x_{m}||, but we take it here as the SPOTDis between epochs k and m. For each k, the conditional probabilities p_{m|k} sum to one. The variance of the Gaussian, σ_{k}^{2}, is adapted for each data point k to satisfy the equation Perp(P_{k}) = 2^{H(P_{k})}, where P_{k} is the probability distribution of all the data points given k, P_{k} = (p_{1|k}, …, p_{M|k}), H(P_{k}) is its Shannon entropy, and the perplexity Perp is a user-defined parameter.

SNE then attempts to find a low dimensional set of data points, {y_{1}, …, y_{M}}, that has a distribution of conditional probabilities (similarities) matching that of their high-dimensional counterparts. In this case the variance of the Gaussian is constant for all data points, and the low-dimensional conditional probabilities q_{m|k} are defined analogously.

SNE then minimizes a cost function, which in this case is the Kullback-Leibler divergence between p_{m|k} and q_{m|k}, summed over all data points. To do that, it starts from a random sample of points (Gaussian distributed) and then performs gradient descent, in which each point y_{k} is moved around depending on the attraction or repulsion from other data points (see [

T-SNE makes two main adjustments relative to SNE (the rationale behind these two adjustments is extensively discussed in [...]): it symmetrizes the conditional probabilities into a joint distribution p_km, and it replaces the Gaussian in the low-dimensional space by a heavy-tailed Student's t-distribution with one degree of freedom.

The ARI (Adjusted Rand Index, [...]) quantifies the agreement between two partitions of the same data, correcting for chance: it equals 1 for identical partitions and has an expected value of 0 for random labelings.
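The ARI can be computed directly from the contingency table of the two labelings. The following is a minimal self-contained sketch (the paper presumably used a standard implementation), comparing, for example, ground-truth pattern identities against HDBSCAN cluster assignments:

```python
from collections import Counter
from math import comb

# Minimal sketch of the Adjusted Rand Index between two labelings of the
# same epochs: ARI = (Index - Expected) / (MaxIndex - Expected), computed
# from pair counts in the contingency table. Undefined if both partitions
# are trivial (single cluster each).

def adjusted_rand_index(labels_a, labels_b):
    n = len(labels_a)
    pair_counts = Counter(zip(labels_a, labels_b))  # contingency table cells
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)
    index = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    return (index - expected) / (max_index - expected)
```

Note that the ARI is invariant to relabeling: swapping cluster names in one partition leaves the score unchanged.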

The Silhouette score [...] is computed as follows. For each epoch k, we compute the average dissimilarity to all epochs in the nearest cluster of which the epoch is not a member, d_{k,nearest}. We then also compute the average dissimilarity to all the epochs in the same cluster as the epoch, d_{k,same}. We then compute the Silhouette as

s_k = (d_{k,nearest} − d_{k,same}) / max(d_{k,nearest}, d_{k,same}).
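Because the Silhouette only needs pairwise dissimilarities, it can be computed directly from a precomputed dissimilarity matrix such as the SPOTDis. A minimal sketch (function name is ours; singleton clusters are not handled):

```python
import numpy as np

# Per-epoch Silhouette from a precomputed dissimilarity matrix D, following
# the definition above: s_k = (d_nearest - d_same) / max(d_nearest, d_same).

def silhouette_scores(D, labels):
    labels = np.asarray(labels)
    scores = np.zeros(len(labels))
    for k in range(len(labels)):
        same = (labels == labels[k])
        same[k] = False                       # exclude the epoch itself
        d_same = D[k, same].mean()
        d_nearest = min(D[k, labels == c].mean()
                        for c in set(labels) if c != labels[k])
        scores[k] = (d_nearest - d_same) / max(d_nearest, d_same)
    return scores

# Two well-separated toy clusters: within-cluster dissimilarity 1,
# between-cluster dissimilarity 10.
D = np.array([[0., 1., 10., 10.],
              [1., 0., 10., 10.],
              [10., 10., 0., 1.],
              [10., 10., 1., 0.]])
s = silhouette_scores(D, [0, 0, 1, 1])        # each score is (10-1)/10 = 0.9
```

Scores near 1 indicate tight, well-separated clusters; scores near 0 or below indicate overlapping clusters.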

One male macaque monkey performed a passive fixation task while moving bar stimuli (white bars on gray background, 0.25 degrees in visual angle width) were presented. All procedures complied with the German law for the protection of animals and were approved by the regional authority (Regierungspräsidium Darmstadt). Recordings were performed from 64 V1 channels simultaneously, obtained from a chronic Utah array implant (Blackrock). Receptive fields had eccentricities around 3-5 degrees visual angle. We performed band-pass filtering of each channel in the frequency range of action potentials (300-6000 Hz) and then thresholded the band-pass filtered signal to extract multi-unit spiking activity. We used m_pts = 3 with leaf cluster selection for the HDBSCAN parameters.

For each epoch (1), the cross-correlation is computed for each pair of neurons (2). These cross-correlations are then normalized to unit mass (3). For each pair of epochs and pair of neurons (4) we then compute the EMD. The EMDs are then averaged over all eligible neuron pairs (i.e. pairs of neurons active in both epochs), yielding the SPOTDis between the two epochs (5).

(PDF)
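The pipeline in the caption above can be sketched end to end. This is a hedged illustration, not the authors' code: function names, the binning of the cross-correlogram, and the toy data are ours; the 1-D EMD is computed via the cumulative-distribution identity.

```python
import numpy as np

# Sketch of SPOTDis: (2) cross-correlate each neuron pair within an epoch,
# (3) normalize each cross-correlogram to unit mass, (4) compute the 1-D EMD
# between matching neuron pairs of two epochs, (5) average over all eligible
# pairs (pairs active in both epochs).

def norm_xcorr(spikes_i, spikes_j, n_bins=21, max_lag=10.0):
    """Binned cross-correlogram of two spike-time arrays, unit mass."""
    lags = (spikes_j[None, :] - spikes_i[:, None]).ravel()
    hist, _ = np.histogram(lags, bins=n_bins, range=(-max_lag, max_lag))
    total = hist.sum()
    return hist / total if total > 0 else None   # None: pair not eligible

def emd_1d(p, q):
    """EMD between two unit-mass histograms on the same grid (bin units)."""
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum()

def spotdis(epoch_a, epoch_b):
    """epoch_a, epoch_b: lists of spike-time arrays, one per neuron."""
    emds = []
    for i in range(len(epoch_a)):
        for j in range(i + 1, len(epoch_a)):
            pa = norm_xcorr(epoch_a[i], epoch_a[j])
            pb = norm_xcorr(epoch_b[i], epoch_b[j])
            if pa is not None and pb is not None:  # pair active in both epochs
                emds.append(emd_1d(pa, pb))
    return np.mean(emds) if emds else np.nan

epoch_a = [np.array([1., 2.]), np.array([3.]), np.array([5., 6.])]
d_self = spotdis(epoch_a, epoch_a)               # 0.0 -- identical epochs
```

Because the measure is built from cross-correlations, a global time shift of all spikes in an epoch leaves the SPOTDis unchanged.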

Shown are the first 5 iterations of the EMD algorithm on simple example cross-correlation histograms. Each normalized cross-correlation (top and bottom) belongs to one epoch. In each step, mass is transported from the top histogram to the highlighted parts of the bottom histogram, and the histogram entries between which mass is transported are connected.

(PDF)
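The iterative transport shown in the figure above can be illustrated with a simple greedy scheme: repeatedly move as much mass as possible from the leftmost remaining source bin to the leftmost remaining sink bin. In one dimension this greedy matching is optimal and its accumulated cost equals the EMD. Variable names are illustrative, not the authors' code.

```python
import numpy as np

# Greedy 1-D transport between two unit-mass histograms: at each step, move
# min(p[i], q[j]) mass from the current source bin i to the current sink
# bin j, paying mass * |i - j|. In 1-D the total cost equals the EMD.

def emd_greedy(p, q, verbose=False):
    p, q = p.astype(float).copy(), q.astype(float).copy()
    i = j = 0
    cost, step = 0.0, 0
    while i < len(p) and j < len(q):
        if p[i] == 0.0:
            i += 1
            continue
        if q[j] == 0.0:
            j += 1
            continue
        moved = min(p[i], q[j])                  # transportable mass
        cost += moved * abs(i - j)               # mass x bin distance
        p[i] -= moved
        q[j] -= moved
        step += 1
        if verbose:
            print(f"step {step}: move {moved:.2f} from bin {i} to bin {j}")
    return cost

p = np.array([0.5, 0.5, 0.0, 0.0])
q = np.array([0.0, 0.0, 0.5, 0.5])
cost = emd_greedy(p, q)                          # 0.5*2 + 0.5*1... = 2.0 total
```

The same value follows from the cumulative-distribution identity EMD = Σ |CDF(p) − CDF(q)|, which is the closed form typically used in practice.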

Simulation parameters were λ_in = 0.35 spks/sample, λ_out = 0.05 spks/sample, an epoch length of 300 samples, and a pulse duration of 30 samples. Homogeneous noise was generated according to a homogeneous Poisson process, while each patterned noise epoch was an instantiation of a unique pattern that was randomly generated with the same statistics as the four recurring patterns (i.e. for each neuron it had the same values of the pulse duration, λ_in and λ_out). The t-SNE embedding shows that homogeneous noise forms a separate cluster, while patterned noise does not.

(PDF)
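The pattern-generation scheme in the caption above can be sketched as a binned inhomogeneous Poisson process: each neuron fires at rate λ_in inside its activation pulse and λ_out elsewhere. Parameter values mirror the caption; the unit-sample binning and function name are our simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sketch: spikes from an inhomogeneous (binned) Poisson process with rate
# lambda_in inside each neuron's activation pulse and lambda_out outside,
# matching the caption's statistics (0.35 / 0.05 spks/sample, epoch of
# 300 samples, pulse of 30 samples).

def generate_epoch(pulse_starts, lam_in=0.35, lam_out=0.05,
                   t_epoch=300, t_pulse=30):
    """pulse_starts: activation-pulse onset (in samples) for each neuron."""
    spikes = []
    for start in pulse_starts:
        rate = np.full(t_epoch, lam_out)
        rate[start:start + t_pulse] = lam_in     # elevated rate in the pulse
        counts = rng.poisson(rate)               # spike count per sample
        spikes.append(np.repeat(np.arange(t_epoch), counts).astype(float))
    return spikes

# One epoch of a five-neuron sequential pattern (staggered pulse onsets).
epoch = generate_epoch(pulse_starts=[0, 60, 120, 180, 240])
```

A "patterned noise" epoch, as described above, is simply one draw of this generator with freshly randomized pulse onsets that never recur.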

Shown are the excess of mass (EOM) and the leaf cluster selection methods of the HDBSCAN algorithm (see Materials and methods).

(PDF)

Same settings as in

(PDF)

We consider four rate modulations (top left), each consisting of a single activation pulse and a baseline. We then generated spikes from the four rate modulations according to an inhomogeneous Poisson process (top right). The rate modulations may represent cross-correlations for neuron pairs or activation patterns for single neurons. We then varied the spike rate inside and outside the activation period, thereby changing the ratio of spikes inside the pulse period to spikes outside the pulse period (changing the SNR). We then computed the EMD between each spike train realization and the normalized mass of rate modulation k; here the pattern mass can be interpreted as an average of many spike train realizations of that rate modulation. The figure on the bottom left shows, across many spike train realizations of rate modulation 1, ΔEMD(k) = EMD_k − EMD_1, the difference between a realization's EMD to rate modulation k and its EMD to its own rate modulation 1. The dashed lines indicate the EMD between the different patterns. Thus, a positive value of ΔEMD(k) occurs when a realization lies closer to its own rate modulation than to rate modulation k.

(PDF)

(A) Five patterns were generated for a total of 330 neurons, with λ_in = 0.15 spks/sample and λ_out = 0.01 spks/sample. Each group of 11 subsequent neurons was assigned to one “virtual electrode”. From this virtual electrode, we then only “recorded” one neuron. The activity of each recorded neuron was then contaminated by randomly inserting spikes from the other “hidden” neurons. Both the number of contaminating “hidden” neurons and the fraction of contaminating spikes, relative to the total number of spikes of each “hidden” neuron, were varied. For example, a value of 5 contaminating neurons and 50% contaminating spikes indicates that 50% of the spikes from each of the 5 contaminating “hidden” neurons were added to the recorded neuron. On the left and in the middle, an original realization of a spike pattern and the observed spike trains after contamination are shown (10 contaminating neurons, with each “hidden” neuron contaminating with 100% of its spikes). The table on the right shows the clustering performance compared with the ground truth (ARI) as a function of the spike contamination parameters. (B) Patterns were defined using the same firing statistics as in

(PDF)

The simulations are based on the same firing statistics reported for

(PDF)

Five patterns were defined by selecting for each pattern a random time point at which the neuron would fire a spike. In addition, we inserted a varying number of noise spikes per neuron according to a homogeneous Poisson process, and also varied the amount of temporal jitter in the pattern spikes. The left panel shows an example for one such pattern, where the pattern spikes are displayed in black and the noise spikes are colored gray. The right panel shows the HDBSCAN cluster quality as compared to ground truth (ARI) for a varying number of noise spikes and an increasing amount of temporal jitter applied to the pattern spikes. The maximum jitter denoted on the x-axis is defined as a percentage of the total interval length, from which a perturbation value is chosen with uniform probability for each pattern spike individually. This means that a maximum jitter of 100% corresponds to a jitter chosen uniformly from [−T, T], where T is the total interval length.

(PDF)

For each multi-unit, we computed the spike count in the same temporal window as used for the SPOTDisClust clustering, denoted n_{ik} (epoch k, multi-unit i).

(PDF)

The parameter settings for the patterns underlying this figure are the same as before, except that each sequence was shifted in time by a window offset Δ_w. For each epoch realization, the value of Δ_w was randomly chosen with uniform probability from an interval determined by the maximum window offset (a max offset of 100 meant that Δ_w ∈ [−100, 100]). A value of Δ_w = −50 then meant that the sequence started at −200 samples and ended at 100 samples. For the clustering, we then assumed that the sequence duration is unknown and selected a window ranging from −T_w/2 to +T_w/2 samples, of length T_w. In the case of T_w = 300 and no offset (Δ_w = 0), the generated data match the simulation of Fig 1. Clustering performance was measured relative to ground truth (ARI) and with an unsupervised performance measure, the Silhouette score. Performance peaked at T_w = 300 samples for both ARI and Silhouette as a function of T_w, indicating that we can determine the “optimal” window length in an unsupervised manner when the sequence duration is unknown.

(PDF)

We repeated the clustering with different window length selections. Windows were centered on 180 ms, the center of the 360 ms stimulation period. Clustering performance, measured both with a ground-truth performance measure (ARI) and an unsupervised cluster quality score (Silhouette), peaked at a window length of 360 ms. Large window lengths led to the inclusion of baseline firing, decreasing clustering performance, whereas smaller window lengths also deteriorated performance. ARI and Silhouette scores showed a tight correspondence, demonstrating the feasibility of using the Silhouette score to “optimize” the window length for sequence detection.

(PDF)

The four columns correspond to spike times (column 1), for different MU channels (column 2), in different epochs (column 3) belonging to different conditions (either one of four stimulus conditions, or noise epochs).

(CSV)

Dr. Michael Schmid, Dr. Katharine Shapcott, Dr. Joscha Schmiedt, Dr. Richard Saunders, Dr. Pascal Fries, Cem Uran and Alina Peters were responsible for the collection and preprocessing of the experimental monkey V1 data, with financial support from DFG Emmy Noether grant 2806 (Dr. Michael Schmid). We thank Dr. Felix Effenberger for inspiring discussions on this topic and helpful comments.