Cyclic transitions between higher order motifs underlie sustained activity in asynchronous sparse recurrent networks

Many studies have demonstrated the prominence of higher-order patterns in excitatory synaptic connectivity as well as activity in neocortex. Surveyed as a whole, these results suggest that there may be an essential role for higher-order patterns in neocortical function. In order to stably propagate signal within and between regions of neocortex, the most basic - yet nontrivial - function which neocortical circuitry must satisfy is the ability to maintain stable spiking activity over time. Here we algorithmically construct spiking neural network models comprised of 5000 neurons using topological statistics from neocortex and a set of objective functions that identify networks which produce naturalistic low-rate, asynchronous, and critical activity. We find that the same network topology can exhibit either sustained activity under one set of initial membrane voltages or truncated activity under a different set. Yet these two outcomes are not readily differentiated by rate or criticality. By summarizing the statistical dependencies in the pairwise activity of neurons as directed weighted functional networks, we examined the transient manifestations of higher-order motifs in the functional networks across time. We find that stereotyped low variance cyclic transitions between three isomorphic triangle motifs, quantified as a Markov process, are required for sustained activity. If the network fails to engage the dynamical regime characterized by a recurring stable pattern of motif dominance, spiking activity ceased. Motif cycling generalized across manipulations of synaptic weights and across topologies, demonstrating the robustness of this dynamical regime for sustained spiking in critical asynchronous network activity. Our results point to the necessity of higher-order patterns amongst excitatory connections for sustaining activity in sparse recurrent networks. They also provide a possible explanation as to why such excitatory synaptic connectivity and activity patterns have been prominently reported in neocortex. Author summary Here we address two questions. First, it remains unclear how activity propagates stably through a network since neurons are leaky and connectivity is sparse and weak. Second, higher order patterns abound in neocortex, hinting at potential functional relevance for their presence. Several lines of evidence suggest that higher-order network interactions may be instrumental for spike propagation. For example, excitatory synaptic connectivity shows a prevalence of local neuronal cliques and patterns, and propagating activity in vivo displays elevated clustering dominated by specific triplet motifs. In this study we demonstrate a mechanistic link between activity propagation and higher-order motifs at the level of individual neurons and across networks. We algorithmically build spiking neural network (SNN) models to mirror the topological and dynamical statistics of neocortex. Using a combination of graph theory, information theory, and probabilistic tools, we show that higher order coordination of synapses is necessary for sustaining activity. Coordination takes the form of cyclic transitions between specific triangle motifs. The results of our model are consistent with numerous experimental observations in neuroscience, and their generalizability to other weakly and sparsely connected networks is predicted.

Network connectivity shapes dynamics in many systems and on many scales, ranging 2 from gene transcription networks to epidemic spreading [1]. In the brain, neocortical 3 architecture supports myriad complex functions. Fundamentally, before any of these 4 functions can occur, signals must travel within and between local circuits and cortical 5 regions. Thus spiking activity propagation is the most basic function that arises from 6 the structure of synaptic connectivity in the brain. 7 Given the fact that the vast majority of excitatory synapses are weak and 8 connections are sparse and recurrent, achieving stable activity propagation is highly 9 non-trivial [2][3][4][5]. Theoretical and experimental studies have characterized several 10 architectural features that have the capacity to promote and shape propagating activity, 11 such as a long-tailed weight distribution, excitatory clustering and the balance between 12 incoming and outgoing connections [3,[6][7][8][9]. Additionally, dynamical properties of 13 ongoing activity, such as a balance between excitation and inhibition [8] and correlated 14 spiking [10], are shaped by connectivity and in turn impact activity propagation. same SNN topology can either spontaneously stop (truncate) or show sustained spike 39 propagation (complete) on different runs, corresponding to different sets of initial 40 membrane voltages, despite the fact that the runs exhibit epochs during which spike 41 rates and critically are substantially similar. The dichotomy between sustained and 42 truncated trials on the same networks provided us the opportunity to study the 43 necessary conditions of sustained activity in spiking neuronal networks when they are 44 not readily explained by spike rates, criticality or synchrony [28]. 45 Using graph theoretic and probabilistic methods, we find that sustained activity 46 requires cyclic higher-order coordination amongst excitatory neurons. In particular, the 47 network cycles though epochs dominated in turn by three types of triangle motifs with 48 low variance: fan-in triangle motif, followed by middleman and finally fan-out triangles 49 [19]. When a network simulation fails to engage this low variance dynamical regime, 50 spiking activity is not sustained and the simulation truncates. We find that 51 strengthening weights and randomizing topologies in our networks lead to decreased 52 clustering of units into triangle motifs. However, the relative motif ratios and transitions 53 through time are maintained on sustained runs. Together, our results provide a 54 mechanistic account and a possible explanation for the widespread findings of both 55 clustered activity and synaptic connectivity in local neuronal circuits. The predictions of 56 our model are consistent with numerous experimental measures in neuroscience and may 57 be generalizable to other weakly interconnected networks that are not biological brains. 58

Fig 1. Network Construction and Search
A: Our networks were constructed with 4000 clustered excitatory and 1000 unclustered inhibitory units. Probabilities of connection were drawn from the literature and determined via grid search. Simulation runs began with 30ms of 20Hz Poisson input onto a subset of 500 units. B: Synaptic weights followed a log-normal (long-tailed) distribution. Synapses were conductance-based, so weights are in units of nanosiemens. Connections originating from inhibitory units were 10x stronger than those from excitatory units. C: For each network, we defined 50 clusters in total and randomly assigned each excitatory unit to two of these clusters. This resulted in heterogeneously-sized clusters. Here we show the cluster size distribution (in counts) for 500 networks. D: Visualization of a subset of 300 clustered excitatory units in our network. Activity was then initiated by 30 ms of 20 Hz Poisson input onto a set of 500 randomly 75 chosen excitatory units (Fig 1A). 76 Algorithmically identifying networks for analysis 77 In order to evaluate large numbers of networks while minimizing sampling bias, models 78 were constructed, simulated, and scored algorithmically. We restricted the search for 79 viable topologies to a range of connection likelihoods bounded by experimental 80 observations [19]. This should not be interpreted to suggest that these connection 81 likelihoods are the only viable solution for realistic spiking activity -we did not 82 comprehensively survey the range of possibilities here. 83 We identified viable topologies iteratively; in the first iteration, we performed a low 84 resolution grid search (Fig 2A). Specifically, we rewired topologies within a limited 85 range of probabilities of connection from excitatory to inhibitory units,p e→i , and the 86 probabilities of connection from inhibitory to excitatory units, p i→e , such that we 87 identified sets of network connection likelihoods that resulted in topologies that 88 successfully propagated spiking activity with low average firing rates and near-critical 89 and low synchrony dynamics, as observed in neocortex [25][26][27][29][30][31][32]. Criticality was 90 measured using a branching parameter that is the ratio of active descendant units to 91 active ancestor units across time [33]. A value of 1 -where the number of active 92 descendants is equal to the number of active ancestors -indicates critical dynamics 93 ( Fig 2B). We used a fast, on-line synchrony heuristic (variance of the mean voltage 94 divided by the mean of voltage variances, see Methods) for the sake of grid search speed. 95 A run was considered to be asynchronous if this heuristic value was below 0.5. Runs 96 below this threshold are shown to correspond to a high mean Van Rossum distance, a 97 common measure of spike synchrony [34,35](see Methods).

Fig 2. Comparison Between Scores on Complete and Truncated Runs A:
We performed two rounds of grid search for the topological parameters that yielded consistent low-rate, critical, and asynchronous dynamics. The first search was at a lower resolution to narrow down our region of interest, and the second was at a finer resolution. B: One scoring metric we used was branching. The branching parameter [33] is a proxy for criticality. It measures the ratio of active descendant units to active ancestor units. A branching value of 1 indicates a balanced (or critical) network, which is the value we optimized for. C: A raster plot of a single complete 1000ms simulation on one of our networks. Excitatory units are numbered 1-4000 on the y axis, and inhibitory units are 4001-5000. D: A raster plot of a single truncated simulation (700ms) on the same network with the same input.
The first iteration of grid search isolated a region of interest, and we next used a 99 higher resolution grid to find specific topologies each with the same probabilities of 100 connection but differing in the specifics of connections (Fig 2A). To find these 101 topologies we used the best results obtained from the second round of grid search, which 102 were p e→i = 0.22 and p i→e = 0.31. The values for p e→e and p i→i were taken from 103 experimentally measured wiring probabilities in neocortex, and were 0.20 and 0.30 104 respectively [3,19].

105
These connectivity parameters were used to generate 2,761 synaptic topologies, 106 where each unique topology is referred to as a network. For each network we created 107 100 sets of input units, with 500 excitatory units per set. We ran 50 simulations on each 108 set of input units, where each simulation starts with different membrane voltages for all 109 units. Each simulation lasted for as long as spiking activity was sustained, up to a 110 maximum of 1 second. The spiking activity of each run on each network was then 111 scored according to the average firing rate, the level of synchrony, how balanced -or 112 critical -the network was, and the duration of time over which spiking activity was 113 maintained (see Methods). If a network's average excitatory rate for all of its complete 114 runs was less than 8 spikes/second, we added this network to the set of low-rate 115 networks for subsequent analyses. High-rate networks were eliminated. This yielded a 116 final count of 87 low-rate networks. For each of these networks, we determined the set 117 of input units which led to the most consistently sustained simulations, with the 118 trade-off of rate increasing slightly. We will refer to these as a network's optimal input 119 units. Optimal input units were used in generating 100 additional runs on each synaptic 120 network, in which only the initial network state (i.e. membrane voltages of all units) 121 varied. This generated a total of 8,700 unique runs, which we then analyzed.

122
Scores on sustained and truncated simulations 123 We found that the same topology was capable of producing both sustained and 124 truncated activity when only initial membrane voltages were varied. A run was 125 sustained (or complete) if it displayed stable activity for the duration of a 1-second trial 126 (Fig 2C). We found that all network simulations which reached 1 second were also able 127 to sustain activity up to 10 seconds. We therefore chose one second as an indication of a 128 network's ability to sustain activity indefinitely, and as the definition of a successful run. 129 If a network ceased all spiking before reaching the 1-second mark, that simulation was 130 considered truncated or unsuccessful ( Fig 2D). Scoring analysis of the network spiking 131 dynamics of rate and branching between the two run types revealed significant overlaps. 132 We grouped truncated runs by their duration. Since network activity tended towards 133 fewer spikes as a run approached truncation, we did not include the final 50ms of any 134 run in the calculation of scores. We also did not consider the stimulus period (initial 135 30ms), as we wished to analyze self-sustained network dynamics rather than 136 stimulus-driven spikes. By focusing our analyses on the middle portion of each run, we 137 find that the rate and branching values of both sustained and truncated populations 138 overlapped substantially. Runs that truncated at 100ms, at 500ms, and runs that were 139 sustained for greater than 990ms shared similar mean excitatory firing rates (9.95, 9.77, 140 and 10.14 spikes/s, respectively). Runs that truncated between 140 and 400 ms tended 141 to have a higher mean rate (15.65 spikes/s), demonstrating that higher firing rates can 142 contribute to instability of a network [28]. The overlap index between truncated run 143 rate and completed run rate was measured to be .27. The overlap index for criticality 144 for the two run types was measured to be .31. In contrast, the synchrony measure was 145 much more discerning between the run types, yielding an overlap index of .0021. We 146 found that Van Rossum spike distance increased (synchrony decreased) as run duration 147 increased. This inverse relationship between synchrony and run duration goes against 148 the intuition that successful spike propagation is at least in part due to synchronous 149 activity [36][37][38][39][40][41][42]. Regardless, it was clear that first order spiking statistics did not 150 provide simple explanations of how and why activity was sustained in some cases and 151 truncated in others.

152
Graph theory analysis of simulated networks 153 We considered functional interactions between neurons to provide an explanation for 154 sustained activity. To do so we analytically evaluated the networks using graph theory. 155 In previous work we have defined a taxonomy of active networks to focus our analysis 156 [19]. We refer to the structural connectivity matrix of our models as the synaptic graph 157 (Fig 3A). From each simulation on a synaptic graph, we generated a single functional 158 graph as well as a time series of recruitment graphs at 10ms resolution. 159 We constructed functional graphs using mutual information to quantify pairwise The synaptic graph is the ground-truth topology of our networks. Based on spiking activity during each simulation, we construct a series of active synaptic subgraphs -one for each time bin. These are graphs made of units which spiked in that bin, connected via the same edges as in the synaptic graph. We infer a single functional graph from whole-trial spiking activity using confluent mutual information -these graphs represent the functional connectivity of the network for each simulation trial. The intersection of the functional graph with the active subgraph for a given time bin yields the recruitment graph for that time bin. B: The three triangle motifs we examine -fan-in, fan-out, and middleman -are isomorphic by rotation. When calculating motif clustering, the choice of reference node is key. C: Calculation of the clustering coefficients of the different triangle motifs on weighted directed graphs, as defined in Fagiolo 2007. The clustering coefficient is defined as the ratio of the actual to the possible motif counts.  The term 'motif' refers to a pattern formed by a group of units in a network. Previously 170 we found that triplet motifs were informative of synaptic integration [19] and also 171 increased the power of encoding models [18,[21][22][23]. Here we focused our analysis on 172 similar patterns of connectivity in the recruitment network, involving groups of three 173 units [43].

174
From the perspective of a single reference neuron, neighboring neurons can be 175 arranged into four types of triplet motifs: fan-in, fan-out, middleman, and cycle. In 176 isolating one triplet, the fan-in, fan-out, and middleman motifs are isomorphic by 177 rotation, meaning that they only differ due to the choice of reference node ( Fig 3B). The 178 relative importance of a motif for a given neuron is measured by its contribution to that 179 neuron's clustering coefficient ( Fig 3C). The clustering coefficient is the weighted ratio 180 of the actual over the possible counts of a particular triplet motif type in which that 181 neuron participates. Individual reference nodes in a given triplet may yield different 182 clustering coefficients due to their specific weights and connections (see Methods).

183
It was possible that each of the algorithmically generated networks had different 184 connection densities and weight distributions, which would impact weighted motif 185 clustering coefficient measures. In this case a measure that incorporated weight and 186 controlled for density would be especially relevant since the recruitment graph density 187 changes in time. We therefore used the measure of clustering propensity [44].

188
Propensity is the ratio of the clustering coefficients of the recruitment graphs compared 189 to the average clustering coefficients of graphs with the same connection structure but 190 randomly assigned connection weights. The propensity measure allowed us to compare 191 different networks despite different connection densities and also allowed us to assess 192 the impact of specific edge weights on triplet motif clustering coefficients [19,44]. A

Density and recurrence statistics 197
As reported above, synaptic networks were 21.1% connected, and 22.4% of connections 198 were reciprocal (Fig 4A, B). The functional networks of complete runs, which were 199 calculated using mutual information and were unique to each run, were more dense and 200 also more recurrent. The functional networks averaged 32.6% (std: 0.6%) connectivity, 201 of which 59.0% (std: 0.4%) were recurrent. Recruitment graphs across time in complete 202 runs were sparser than the synaptic graphs, although only slightly less recurrent (9.5% 203 connected, std: 0.5%, and 16.7% recurrent, std: 0.5%). Functional and recruitment 204 density and recurrence did not differ significantly between complete and truncated runs. 205 Triplet motifs in the different graph types 206 We found that the three isomorphic motifs showed equal clustering in the synaptic 207 graphs. This is expected of graphs with random, albeit clustered, synaptic connectivity. 208 Clustering propensity centered at 1.00 (std = 7.8 * 10 −5 , 7.9 * 10 −5 , 8.0 * 10 −5 for 209 middleman, fan-in, and fan-out) for all three isomorphic motifs ( Fig 4C). A value of 1 210 indicates that specific edge weights in synaptic graphs play a negligible role in 211 clustering, since random edge weights would yield the same clustering coefficients. 212 We found that in contrast to the static synaptic graph the dynamic functional and 213 recruitment graphs were not random. The isomorphic motifs' dominance in the relative to what would be expected in edge-matched networks. We found that the cyclic 230 trajectory within this region of motif space was consistent for all low-rate, excitatory 231 clustered networks we examined. We also found the same orderly temporal progression 232 from one isomorphic motif to another when we considered clustering coefficient values. 233 In contrast, truncated simulations were never restricted to this behavior. 234 Motif cycling and sustained activity 235 The motif cycling trajectory was not present at the moment of first spikes in a 236 simulation. Rather, the path started at a point in motif space as determined by the 237 initial membrane voltages of all neurons in the network. Injection of Poisson input 238 drove network activity towards its eventual trajectory (Fig 4F). We identified two 239 distinct types of truncation -in the first and far more common (97.7%) of the two, the 240 simulation trajectory never approached or entered the region in propensity motif space 241 where sustained runs lay. In the second, rarer case, the simulation successfully entered 242 the sustained regime, yet after several hundred ms the trajectory destabilized, resulting 243 in truncated activity. Truncated trajectories did not follow a canonical path. Instead, 244 motif dynamics during truncated runs transited in all directions away from the central 245 region, demonstrating the multitude of ways in which activity structure can lose 246 stability resulting in a failure of spike propagation ( Fig 4D). 247 We examined whether the initial distance and input trajectory, which were 248 determined by the initial conditions of the network and the Poisson stimulus, were 249 determinants of successful activity propagation. We found that even if the distance from 250 the cycling region at the end of the stimulus period was minimal, some simulations still 251 failed to enter into and stay within that regime. Others which were still distant from 252 the region after the initial stimulus period continued on a successful trajectory and 253 entered a stable cycling regime. These behaviors point to complex interactions between 254 the network's internal state and how input onto precise units within that network can 255 influence propagation.

257
In order to quantify the cycling between isomorphic motifs, we constructed a Markov 258 model for state transitions between dominant isomorphic motifs. We described the 259 network using a probabilistic voting scheme, as opposed to using analog propensity 260 values. A unitary vote is cast by each unit for the motif type for which it has the 261 highest propensity value at some time step. The proportion of total votes for each motif 262 type is used to describe the relative dominance of that motif at that time step.

263
From this time series we constructed a Markov model transitioning between states. 264 We found that the parameters characterizing the Markov process were canonical and 265 low variance, such that successful cycling followed a specific reliable sequence between 266 motifs. In contrast, the Markov parameters in simulations that truncated showed a 267 failure to recruit this low-variance canonical sequence. First and second-order state 268 probabilities and state transition probabilities significantly differed between completed 269 and truncated runs (p < 0.001) (Fig 5A, B). Second-order state conditional 270 probabilities also differed (p < 0.05) (Fig 5C). State probability is the probability of a 271 motif being dominant at a given time. Second order probability is the probability with 272 which a sequence of two motifs will be dominant at some given time. Conditional 273 probability is defined as the probability of a motif given history of previous two motifs. 274 Markov analysis also gave the time scale which characterized motif cycling via the  3.56 ms). Hitting times differed significantly between completed and the small subset of 284 truncated runs that entered this region of propensity (p < 0.001) (Fig 5D).

285
Effects of connectivity weights 286 We hypothesized that the cycling between clustering propensities was necessary for 287 sustained activity due to the weak strength of the majority of individual synapses. synapses in neocortex, convergence of spikes from multiple sources must occur in order 292 to evoke spikes in a receiving neuron [19]. Consequently we expected that as connection 293 weights increased, the cycling between population-wide isomorphic motifs would lessen. 294 To test this, we strengthened all synaptic weights in the networks that previously 295 scored well from 1.0x to 2.0x original values in increments of 0.1. Simulations were then 296 re-run on these strengthened networks using the same stimulus and initial conditions. 297 At 1.6 times the original weights, networks consistently displayed bursting activity.

298
Consequently we restricted our analysis to networks with weights increased 1.5 times.

299
All runs on these networks reached completion.

300
Increasing weights led to a decrease in all triangle motif propensities, and also led to 301 differences in the Markov characterization. Motif state probabilities differed 302 significantly between completed runs on the original graphs and those on graphs with 303 increased weights (Chi-square test, p < 0.001). Second-order state probabilities, state 304 conditional probabilities, hitting times, and state transition probabilities all differed 305 significantly as well (p < 0.01, p < 0.05, p < 0.005, p < 0.001 respectively), 306 demonstrating the interaction of synaptic reliability on the necessity of this regime 307 (Fig 6). However the trend remained and in all sustained runs a low variance transition 308 from motif to motif occurred.

309
The dynamical motif solution is arrived at regardless of 310 synaptic connectivity statistics.

311
The networks on which we performed all our analyses have excitatory clusters of units. 312 To test whether our results, including the motif cycling phenomenon, are dependent on 313 this structure, we next examined non-clustered Erdős-Renyi (ER) graphs with 314 p i→e = .25 and p e→i = .35. ER graphs had the same p e→e and p i→i values as the 315 clustered networks. We found that transitions between motif types were also present in 316 the activity of sustained runs on ER networks (Fig 7). The relative increase in 317 Fig 6. Networks with Increased Weights Networks have the same structure as those seen in figures 4 and 5, but all edge weights have been increased by 1.5 times their original values.A: Left, density (ratio of existing to possible connections) for synaptic, functional, and recruitment graphs. Right, recurrence (ratio of recurrent to all existing connections) for synaptic, functional, and recruitment graphs. B: Clustering propensity for isomorphic triangle motifs on increased-weight-graph simulations. The y-axis is scaled to match that of Figure 4C (clustering propensities on original graphs) and Figure  7B (clustering propensities on unclustered ER graphs). C: Probabilities of dominance of each triangle motif. The dominant motif at a time point is given by the maximum of mean middleman, mean fan-in, and mean fan-out across units. D: Second order motif state probabilities for progression of temporal recruitment graphs. E: Probabilities for each motif to follow a given second order motif. F: Hitting times for each state for the Markov process defined by motif transition probabilities. G: Trajectories of all complete runs on a sample network in 3-dimensional isomorphic motif space. In blue are the runs on the network with its original weights, in orange are the runs on the same network with weights increased. H: Markov Matrix for transition probabilities between motifs. clustering in ER graphs when comparing synaptic to recruitment graphs is substantially 318 greater than seen in our graphs with excitatory synaptic clusters. In the synaptic 319 networks, triplet clustering coefficients average 0.11. However, this value increased to 320 0.20, 0.09, and 0.15 for fan-in, fan-out, and middleman motifs in the recruitment graphs. 321 The propensity values for all isomorphic motifs were consistently lower than 1, with 322 means centered at 0.8 (Fig 7). We find that unclustered ER graphs and clustered ER 323 graphs differ significantly (p<.005) in second-order state probabilities, state conditional 324 probabilities, hitting times, and state transition probabilities. As in the case with the 325 increased weights however the qualitative cycling of motifs was present in sustained runs. 326

Fig 7. Unclustered (Erdős-Renyi) Networks
A: Left, density (ratio of existing to possible connections) for synaptic, functional, and recruitment ER graphs. Right, recurrence (ratio of recurrent to all existing connections) for synaptic, functional, and recruitment ER graphs. B: Clustering propensity for isomorphic triangle motifs on ER graph simulations. The y-axis is scaled to match that of Figure 4C (clustering propensities on original graphs) and Figure 6B (clustering propensities on graphs with 1.5 times increased weights). C: Probabilities of dominance of each triangle motif. The dominant motif at a time point is given by the maximum of mean middleman, mean fan-in, and mean fan-out across units. D: Second order motif state probabilities for progression of temporal recruitment graphs. E: Probabilities for each motif to follow a given second order motif. F: Hitting times for each state for the Markov process defined by motif transition probabilities. G: Trajectories of all runs on a sample ER network in 3-dimensional isomorphic motif space. All runs reached completion. H: Markov Matrix for transition probabilities between motifs.

327
This work demonstrates that higher-order structure is required for sustained activity in 328 low-rate recurrent networks such as neocortex. Spikes must traverse the network in a 329 coordinated way, cycling between the dominance of three triplet motifs. The transitions 330 between fan-in, middleman, and fan-out motifs reveal the necessity of balance between 331 distribution of signal and convergence of inputs necessary to integrate those inputs. The 332 presence of these motifs in the recruitment graphs reflects the functional routing of 333 activity through synaptic connections. When synapses become stronger and more 334 reliable, overall triplet clustering decreases, while the reliability of their transitions 335 remains. Higher order motifs in the recruitment network thus provide a direct link 336 between network activity and the stable activity in that network.

337
Simpler measures, such as rate or synchrony, did not easily explain network stability 338 in our models. Previous work has demonstrated that sustained, asynchronous network 339 activity co-occurs within a specific range of firing rates, supported by a balance between 340 excitatory and inhibitory conductance [28]. Counter-intuitively, networks were unable 341 to continue to spike both when firing rates are low and also when rates are too high. 342 We also found a counter intuitive result in that networks which display asynchronous 343 spiking dynamics tend to sustain activity for longer than networks with more 344 synchronous spikes. Previous work suggests that synchronous spikes are necessary and 345 particularly efficacious for signal propagation to occur [36][37][38][39][40][41][42]. For example, one 346 proposed mechanism for communication between two regions is via synchronous spike 347 volleys with coherent phases, which capitalize on the recurrent nature of connections in 348 neocortex [38]. Yet synchrony is intricately tied with rate -transmission speeds are 349 higher when spikes are more synchronous. High-rate and high-synchrony spikes may 350 overwhelm the integrative capacity of downstream units, contributing to instability. Our 351 results lend nuance to a previously-held view that although neocortex displays 352 asynchronous activity, it hinders rather than helps spike transmission. We suggest that 353 certain levels of asynchrony are in fact necessary for spike transmission at the network 354 level just as in the case with firing rate.

355
In addition to being low-rate and asynchronous, our networks were algorithmically 356 constructed to have critical dynamics. A system can be described as critical if it 357 operates near a phase transition or "critical" point. A classic example is a mound of 358 sand, which is said to be at the critical point if each new grain of sand added causes an 359 avalanche of sand following a power law distribution. In neocortex this entails activity 360 that, in the absence of external input, propagates activity without become epileptic or 361 dying off, and which follows a power law distribution in its active population size [33]. 362 However, these are only necessary conditions and not deterministic of all critical 363 systems. The idea that neocortex operates near a critical point has a long history in 364 neuroscience, going back to Alan Turing [45], and has been implicated in a number of 365 desirable properties for neural networks [46]. For example, networks tuned near the 366 critical point display maximum information transmission [33], information storage [47], 367 and computational power [48]. Understood as a branching process, critical activity 368 entails very little structure in activity, namely that the average ratio of descendants to 369 ancestors be 1.

370
Our results provide an explanation for the prominence of higher order motifs in real 371 data. Elevated motif counts have been observed in synaptic connectivity and in 372 recordings of clustered activity in vivo [2,3,8,[12][13][14][15][16][17][18]. Through mechanisms of learning 373 in neocortex such as STDP, functional patterns may be further strengthened to enhance 374 integration in cortex. We wish to draw attention to the fact that our study focused on 375 the whole-network scale. Individual units spiked only sparsely, making it difficult to  beyond-pairwise interactions in the brain. Yet the behavior of these models may reflect 382 necessary features of weakly-connected networks in which integration from multiple 383 sources is necessary for the system to succeed. In such systems it is likely that stability 384 relies on higher-order patterns. For example, the spread of rumours in a social network 385 relies on integrating interactions. Social networks are small-world networks 386 characterized by clusters, a feature which is present in our model as well as many other 387 systems [49,50]. The "illusion-of-truth" effect in rumour spreading on a social network 388 has the integrate-and-fire property, where an individual may need to hear a rumour from 389 multiple sources before they reach a confidence threshold to repeat it to others [51]. 390 The necessity of higher-order patterns for stable activity has strong implications for 391 neural coding. Previous work has already demonstrated that correlations enhance 392 coding, with triplet correlations having an advantage over pairwise 393 [5,16,18,[21][22][23][52][53][54]. The neural code must rest upon a foundation of stable 394 propagation of spikes, which we have shown in turn rests on higher-order motifs and 395 coordinated integration. Any two spikes must take place within some time interval for 396 them to interact. The asynchronous and critical regime observed in vivo and in our Our graphs are recurrent and sparsely connected networks of several thousand adaptive 407 exponential leaky integrate-and-fire (AdEx) units with an extra poisson input term [24]. 408 Synapses between all units are conductance-based. This enhances realism by taking 409 neuron-specific state features into account during synaptic integration [24]. Specifically 410 we define our neuron Voltage, V , as.
adaptation current, w, as inhibtory conductance, g i , as poisson input conductance, g p , as Spike was said to occur if V > V t , at a spike V was set to EL, w was incremented by 416 b and g e and g i were incremented by synapse weight if downstream of the spiking 417 neuron.

418
For information on parameters, see S1 Table. Each network is comprised of 1000 419 inhibitory and 4000 excitatory units. Precise wiring probabilities between excitatory 420 and inhibitory populations were determined through grid search within biological 421 constraints.

422
Network synaptic connectivity is heterogeneously clustered [8]. For each network we 423 defined 50 total clusters, with each excitatory unit randomly assigned to two clusters.  Edge weights follow a long-tailed distribution (Fig 1B). Edge weights that originate 430 from inhibitory units have conductances which are ten times greater than those which 431 originate from excitatory units, in accordance with experimental results [55]. simulation would continue for as long as spiking activity is sustained, up to a maximum 437 of 1 second. If during a simulation no spikes occur across the network for 100 ms, the 438 network is deemed inactive and the simulation trial is halted. We found that all network 439 simulations which reached 1 second were also able to sustain activity up to 10 seconds. 440 We therefore chose one second as the marker for a network's ability to sustain activity 441 indefinitely, and as the definition of a successful run. Upon completion each simulation 442 yields an output raster of spike times for every unit in the network. The poisson input 443 train, input units, network topology, and initial conditions of all units were recorded for 444 each simulation. This enabled subsequent analyses and also allowed for re-use of a 445 synaptic graph or re-instantiation of a simulation using some of the original settings Our models are constructed to parallel the features of biological neural networks, but 449 are also constrained by considerations of computing resources. In a study which 450 modeled cortex with high biophysical and anatomical detail, simplifying the neuron 451 model did not lead to drastic differences in the network's behavior from the detailed 452 model or from in vivo results. Most qualities remained unchanged, suggesting that in 453 many cases extremely granular models are not necessary to yield experimental insights 454 [24]. Instead, the most important feature for retaining qualitative correspondence are 455 the rules of synaptic connectivity. Therefore we required our models' connectivity 456 parameters to closely match those of biological neural networks.

457
The probabilities of wiring between excitatory (E) and inhibitory (I) populations in 458 our models were taken directly from or bounded by the results of biological experiments. 459 The wiring probabilities from E to other E units and from I to other I units in 460 neocortex are well-studied, but there is less data on connections from E to I and from I 461 to E. We therefore used an algorithmic approach to find the optimal values. Beginning 462 within a biological range, we used grid search to find values of p e→i and p i→e that led 463 to successful propagation of activity at the lowest possible rates. We used these optimal 464 wiring rules to construct all synaptic graphs in this study.

465
Two iterations of grid search were used to find the wiring parameters needed to 466 maintain naturalistic spiking for the duration of a simulation (Fig 2A). We searched for 467 the optimal probability of connection from excitatory to inhibitory units, p e→i , and the 468 optimal probability of connection from inhibitory to excitatory units, p i→e , such that 469 networks would propagate activity at the lowest possible rates. In the first iteration, we 470 used a low resolution grid (space size 0.001) to search for p e→i within the range 0.16 to 471 0.24 and p i→e within the range 0.29 to 0.37. These two ranges were taken from known 472 wiring probabilities in neocortex. Each grid space was visited ten times to achieve an 473 average measure of rate and completion. This isolated a region of interest where the rate 474 was lowest, between p e→i values of 0.216 and 0.220, and between p i→e values of 0.309 475 and 0.313. We used a higher resolution grid (space size 0.0001) to explore this region.

476
For all subsequent simulations we used the best results obtained from grid search.

477
The optimal probability of wiring for excitatory to inhibitory units, p e→i , was found to 478 be 0.22, and the optimal value for p i→e was 0.31. The values for p e→e and p i→i were 479 taken from known wiring probabilities in neocortex, and were 0.20 and 0.30 respectively 480 [19]. Based on these wiring rules, we constructed synaptic graphs of networks for  Networks were evaluated on rate, defined as average spike frequency over the course of 487 each trial. Networks were also evaluated on branching parameter as a measure of 488 network criticality [33]. A branching value of 1 indicates that for every 'ancestor' unit 489 that is active, there is an equal number of 'descendant' units active at the next time since biological networks display asynchronous activity. In order to evaluate synchrony 493 rapidly enough to make grid search feasible, a heuristic for synchrony was computed as 494 the variance of mean signal normalized by the mean variance of each neuron. A 495 threshold of 0.5 was considered appropriate for network asynchrony. The threshold was 496 evaluated empirically by examining rasters for Poisson spiking neurons with variable 497 coupling, where coupling was change to rate parameter by connected neurons spiking.

498
Branching, was mathematically defined as: where σ is the branching parameter, d is the number of descendants, n max is the 500 maximum number of active neurons, n a is the number of ancestor neurons, n d is the 501 number of descendant neurons, n Σa|d is the number of ancestor neurons in all avalanche 502 events that involved d descendants, and n Σa is the total number of neurons involved in 503 avalanches. The branching parameter describes the network as a whole; it cannot be 504 calculated for isolated units. For a given simulation, we calculated network branching at 505 discrete, sequential time steps throughout. We used the same temporal resolution (5 ms) 506 as used for determining the functional graph; all spikes at time t are ancestors, and all 507 spikes from t + 5 to t + 20 ms are descendants. We then averaged the network branching 508 parameter across all time steps to get the overall branching score for that simulation.

509
For the sake of computational efficiency during grid search, synchrony was defined as 510 the variance of the mean voltage divided by the mean of voltage variances. To evaluate 511 the accuracy of this rapid measure, we compared its results with pairwise Van Rossum 512 spike distance [34,35] (S1 Fig ). We used Van Rossum spike distance as our measure 513 for each simulation's synchrony score outside of the initial grid search. In this way we 514 are able to generate a multitude of network topologies that produce naturalistic spiking 515 activity.

517
The clustering coefficients for the four triplet motifs are calculated in the following 518 manner [43]: 519 Let t i denote the actual number of triplets of a motif type in the neighborhood of 520 unit i, and T i denote the maximum number of such triplets that unit i could form. We 521 will build intuition by beginning with the case of a binary directed graph, or an 522 unweighted connectivity matrix. Let A represent this graph, with a ij = 1 indicating the 523 presence of a directed connection from node i to node j. Raising the matrix A to the 524 nth power yields the number of paths of length n between nodes i and j.

525
Let us first consider the cycle motif; in order for unit i to participate in a cycle, it 526 must have an edge directed to a second unit, that second unit must have an edge 527 directed to a third unit, and that third unit must have an edge pointing back to unit i. 528 The path length is 3, and it both begins and ends at unit i. Thus we calculate A 3 and 529 extract the values along the diagonal, or A 3 ii . This gives the number of actual cycle 530 motifs unit i forms.

531
Counts of the three isomorphic motifs are calculated in a similar way, but they 532 require the additional involvement of A T . Taking the transpose of graph A reverses the 533 directionality, so that connections from i to j are now those from j to i. We would like 534 to trace a path of length 3 from i back to i to form an isomorphic triangle, but exactly 535 one of the steps must be against the true direction of that edge (Fig 3). Beginning with 536 a middleman reference node, the first step is 'with the flow', the second step is invariably 537 'against the flow', and the final step back to i is again 'with the flow'. Therefore

538
AA T A ii gives the number of actual middleman motifs unit i forms. Since fan-in and 539 fan-out motifs are isomorphic to middleman by rotation, we simply rotate which step is 540 'against the flow' to yield the count of fan-in and fan-out motifs. The number of actual 541 fan-in motifs unit i forms is A T A 2 ii , and the number of fan-out motifs is A 2 A T ii .

542
Now that we can calculate the actual counts, the possible counts of each motif T i are 543 easily intuited as a combinatorics problem. Let us begin again with the cycle motif. To 544 form a cycle, node i requires one edge directed towards it and one edge directed away 545 from it. The number of possible pairs of in and out edges from node i is calculated by 546 multiplying the out-degree of node i with the in-degree of node i. In-degree and 547 out-degree refer simply to the number of edges that are directed in or out of a given 548 node. Some edges may be bidirectional -these cannot be part of a true cycle motif. The 549 number of bidirectional edges is subtracted from the product of in-and out-degrees.

550
The final T i for the cycle motif is The Ti for middleman is in fact equal to that for cycle, since forming a middleman 552 has the same requirements -one edge directed inward paired with one edge directed 553 outward.

554
A fan-in motif requires two edges directed in towards the reference node. There are 555 d in i number of choices for the first inward edge. Once that choice has been made, there 556 are d in i − 1 choices remaining for the second inward edge. Thus we multiply the two to 557 yield T i for the fan-in motif.
Fan-out is similar -we simply substitute in-degrees with out-degrees since a fan-out 559 motif requires two edges directed out from the reference node. Ti for the fan-out motif 560 is thus Now that we have both the actual and possible counts for each motif type, the 562 triplet clustering coefficients of node i are simply their ratios. That is, If we were interested in binary graphs, we would end here. However, our graphs of 564 interest have weights associated with each directed edge. There are multiple ways to 565 account for edge weights when calculating clustering coefficients. One way is to consider 566 only the weights of the two edges that are incident to reference node i. Alternatively, the 567 weights of all three edges in a triplet can be taken into consideration. The latter is the 568 chosen method, since we desire a measure of central tendency. The total contribution of 569 a triplet to the clustering coefficient is thus the geometric mean of its weights.

570
Let W denote our weighted directed graph. For a triplet in this graph with edge 571 weights w ij , w ih , and w jh , the geometric mean is (w ij · w ih · w jh ) 1 3 . We can extend this 572 to the entire graph by, Instead of using a binary graph as matrix A in the calculation of 573 t i , using A = W 1 3 , which is the matrix that results from taking the cubic root of every 574 entry in W . We also note that this formulation is invariant to the choice of reference 575 node in a triplet. Incorporating weights only modifies the value of t i . It remains a 576 measure of the actual triplets present -instead of counts it is now a weighted measure. 577 The denominator T i still refers to maximum possible counts. It follows that the 578 clustering coefficient for node i can only be 1 (maximum) when its neighborhood truly 579 contains all triplets that could possibly be formed and every edge in each triplet is at 580 unit (maximum) weight. The complete clustering coefficient formulas of weighted 581 directed graphs are given in 3. subgraph for that time window. We binned spikes into a temporal resolution of 10 ms, 586 so that each complete 1-second simulation resulted in 99 time bins. For each time bin t 587 we defined an active subgraph. If a unit spiked within time bin t, that unit will be part 588 of the active subgraph for time bin t. All units which did not spike within that We calculated motifs in the underlying synaptic graphs and found that all four 595 clustering coefficients were equivalent when averaged across each graph, as expected.

596
To apply motif analysis to activity, we needed to infer functional graphs from spiking 597 activity to summarize network dynamics. Directed edge weights in a functional graph 598 represent the likelihood of a functional relationship in the activity between every pair of 599 units. 600 We used mutual information (MI) to infer functional graphs from all spikes across 601 the course of a trial, regardless of the trial's duration (complete or truncated). This 602 results in a single functional graph for each trial. We chose to perform functional 603 inference using the full spike set because this yields functional graphs with higher 604 fidelity and greater sparsity.

605
The MI method we used is the confluent mutual information between spikes. At a 606 conceptual level, an edge inferred from unit i to unit j using confluent MI means that 607 unit j tends to spike either in the same time bin or one time bin after unit i spikes.

608
Since spikes are binned at 10ms resolution, this method encompasses a delay of 0 to 20 609 ms. This delay is appropriate because we found that presynaptic spikes yielded a 610 maximal response from all postsynaptic neurons at a delay of 5 to 20 ms.

611
Mathematically, we defined an indicator function on the spike train of neuron j, s(j) 612 evaluating to 1 in the case where there is a spike at time t or t + 1, an indicator function 613 on the spike train of neuron i, t(j), evaluating to 1 in the case where there is a spike at 614 time t, and considered the mutual information between them. The resulting networks 615 were further processed by removing weights corresponding to neurons with negative 616 pairwise correlations. Networks were then re-expressed to minimize skewness, and 617 background signals were removed by accounting for background signal and considering 618 weights as the residual resulting from linear regression on background strength. Finally 619 we considered the z-normalized residual graph to account for heteroskedasticity [56]. 620 This yields weighted values, for which we establish 0 as a threshold. All positive normed 621 residual MI values are included in the full functional graph.

623
The recruitment graph represents both the activity and the underlying connections of a 624 network. A recruitment graph is defined separately for each 10 ms time bin of a given 625 trial, thus yielding a temporal sequence of graphs. Each graph is calculated as the 626 intersection of the functional graph, which is unique to every trial, and the active 627 subgraph, which is unique to every 10 ms time bin. All edges in the recruitment graph 628 come from underlying synaptic wiring, contained in the active subgraph, while edge  Triplet clustering coefficients were calculated for every unit on each 10 ms recruitment 636 graph, then averaged across the population to yield the whole-network clustering 637 coefficients for that 10 ms time window. These methods allow us to observe how motif 638 clustering changes in the recruitment graphs across time.
from completed and truncated runs shows that there is a significant difference between 686 the distributions defining these values across each type of run.

687
To further characterize the transitions between different dominant motifs we fit a 688 Markov model to the series of dominant motifs across recruitment graphs in both 689 truncated and completed networks. We again find a significant different (p < .001 for all 690 markov parameters). This all suggests that the failure to propagate found in some 691 networks is tied to the inability to recruit the cyclic structure that we find to be a 692 hallmark of successful propagation. synchrony is the Van Rossum spike distance [34,35]. Greater distances between spike 699 pairs indicate more asynchronous dynamics. However, for the sake of speed during grid 700 search, we used a rapid measure of synchrony (see Methods). We observe a strong 701 correlation between the two measures when we used both to examine the final set of 702 runs that we used for analysis. . 703 S1