Efficient sampling-based Bayesian Active Learning for synaptic characterization

Bayesian Active Learning (BAL) is an efficient framework for learning the parameters of a model, in which input stimuli are selected to maximize the mutual information between the observations and the unknown parameters. However, the applicability of BAL to experiments is limited as it requires performing high-dimensional integrations and optimizations in real time. Current methods are either too time consuming, or only applicable to specific models. Here, we propose an Efficient Sampling-Based Bayesian Active Learning (ESB-BAL) framework, which is efficient enough to be used in real-time biological experiments. We apply our method to the problem of estimating the parameters of a chemical synapse from the postsynaptic responses to evoked presynaptic action potentials. Using synthetic data and synaptic whole-cell patch-clamp recordings, we show that our method can improve the precision of model-based inferences, thereby paving the way towards more systematic and efficient experimental designs in physiology.


Introduction
In neuroscience, machine learning, and statistics, a central problem is that of inferring the parameters θ of a model M. For instance, in supervised learning, one may want to learn the parameters of a Deep Neural Network (DNN) so as to minimize the difference between its output and training labels; in this case, M represents the DNN to be trained, and θ represents its weights and biases. Similarly, in biology, the parameters of a system can be studied by fitting a biophysical model to recorded observations. In most cases, these parameters can be neither directly measured nor analytically computed, but can be inferred from the recorded outputs y of the system in response to input stimuli x. In biology, the physical quantities of a system (e.g. an organ, a cell, or a synapse) can be estimated by deriving a generative biophysical model M of the system, and by fitting its parameters θ to the observed responses to experimental inputs. By computing the likelihood p(y | x, θ) of the outputs given the inputs and the parameters, it is possible to obtain either a point-based estimate of the parameters, such as the maximum likelihood parameters θ_ML or the maximum a posteriori parameters θ_MAP (Barri et al., 2016), or to compute the full posterior distribution p(θ | x, y) ∝ p(y | x, θ) p(θ) using for instance the Metropolis-Hastings (MH) algorithm (Bird et al., 2016).
However, the accuracy of these estimates critically depends on the pair (x, y), and especially on how the successive input stimuli x_1:T are chosen. For instance, training a DNN on non-independent and identically distributed (i.i.d.) training examples (i.e. blocked training) will lead to catastrophic forgetting (Flesch et al., 2018). On the other hand, most experiments in biology still rely on pre-defined and non-adaptive inputs x_1:T, which may not yield sufficient information about the true parameters of the studied system. Consequently, experiments often require more observations or repetitions to reach a given result, which increases their cost, duration, and need for subjects.
An efficient framework to alleviate this issue is Bayesian Active Learning (BAL). Knowing the current estimate of the parameters, the experimental protocol (i.e. the next input x_{t+1}) can be optimized on the fly to maximize the mutual information between the recordings and the parameters (Figure 1 (b)). BAL is a branch of Optimal Experiment Design (OED) theory (Emery and Nenarokomov, 1998; Sebastiani and Wynn, 2000; Ryan et al., 2016). It has already been used in neuroscience to infer the parameters of a Generalized Linear Model (GLM) (Lewi et al., 2009), the nonlinearity in a linear-nonlinear-Poisson (LNP) encoding model (Park et al., 2011), the receptive field of a neuron (Park and Pillow, 2012), or the parameters of a Hidden Markov Model (HMM) (Jha et al., 2022).
However, implementing BAL in biological settings can be challenging, especially for real-time applications. Its applicability to real experiments is limited by two main drawbacks. Firstly, it requires updating the posterior distribution of the parameters after each time step, and using it to compute the expected information gain from future experiments. This involves solving an optimization problem over a possibly high-dimensional stimulus space: current methods are either too time consuming, or only applicable to specific models. Secondly, to reduce computational complexity, classical implementations of BAL usually only optimize for the immediate next stimulus input. This myopic approach disregards all future observations in the experiment, and is thus possibly sub-optimal (Ryan et al., 2016; Drovandi et al., 2018).
Our main contribution is to provide a general framework for online active learning, called Efficient Sampling-Based Bayesian Active Learning (ESB-BAL). We use particle filtering, which is a highly versatile filtering method (Crisan et al., 2018), for posterior computation; and propose a parallel computing implementation (Besard et al., 2018, 2019) for efficient posterior update and information computation. Whereas previous implementations of active learning either relied on time consuming Monte Carlo (MC) methods (Huan and Marzouk, 2013; Foster et al., 2019) or were only applicable to special cases, such as linear models or GLMs (Lewi et al., 2009), our proposed solution is fast enough to be used in real-time biological experiments and can be applied to any state-space model.
To illustrate our method, we apply it to the problem of inferring the parameters of a chemical synapse with Short-Term Depression (STD). Upon the arrival of a presynaptic action potential, vesicles from a pool of N independent release sites fuse with the presynaptic plasma membrane with a probability p, each of these release events giving rise to a quantal current q (Del Castillo and Katz, 1954; Katz, 1969). In addition, synaptic transmission is also dynamic: short-term depression occurs when the inter-stimulation interval (ISI) is shorter than the time needed for synaptic vesicle replenishment (Tsodyks et al., 1998). A synapse exhibiting STD can thus be described by its parameters N, p, q, and by its depression time constant τ_D. These parameters can be inferred using excitatory postsynaptic currents (EPSCs) recorded from the postsynaptic cell and elicited by stimulating the presynaptic axon. The accuracy of these estimates critically depends on the presynaptic stimulation times: if inter-stimulation intervals are longer than the depression time constant, STD will not be precisely quantified; but if the stimulation frequency is too high, the pool of presynaptic vesicles will be depleted, leading to poor parameter estimates (Gontier and Pfister, 2020; Wieland et al., 2021). Synaptic characterization is thus a relevant example application for ESB-BAL, as it requires careful tuning of the inputs x_1:T, but it is also a challenging one: computation needs to be faster than the typical ISI, which can be on the order of a few milliseconds. Using synthetic data, we show that our method significantly reduces the uncertainty of the estimates in comparison to classically used non-adaptive stimulation protocols. We also show that the rate of information gain (in bit/s) of the whole experiment can be optimized by adding a penalty term for longer ISIs. Lastly, we extend active learning to non-myopic designs. Using recordings from cerebellar mossy fiber to granule cell synapses from acute mouse brain slices, we show that our framework is sufficiently efficient for optimizing not only the immediate next stimulus, but rather the future stimuli in the experiment.

Model
A general setting for Bayesian Active Learning
When using active learning in sequential experiments, three key elements need to be defined (Figure 1 (b)):
1. The system to be studied: it is described by a generative model M, whose parameters θ can be inferred from its observed responses y_1:t to a set of input stimuli x_1:t. Given the stochastic nature of most systems studied in biology, the random variable Y_1:t corresponding to the observations can take various values y_1:t according to a distribution p(y_1:t | x_1:t, θ). In our application example of BAL, the system will be a model of binomial neurotransmitter release (see Section The system: a binomial model of neurotransmitter release).
2. A filter that computes the posterior distribution of the parameters given the previous inputs and observations p(θ | x_1:t, y_1:t): after each new input x_{t+1} and observation y_{t+1}, it is updated to obtain p(θ | x_1:t+1, y_1:t+1) (see Section The filter: online computation of the posterior distributions of parameters).
3. A controller that computes the next optimal input stimulus x*_{t+1} so as to maximize a certain utility function, which is often defined as the mutual information I(Θ; Y_{t+1} | h_t, x_{t+1}) between the parameter random variable Θ and the response random variable Y_{t+1} given the experimental input x_{t+1}, where h_t = (x_1:t, y_1:t) is the experiment history (see Section The controller: computation of the optimal next stimulation time).
In synaptic characterization, inputs correspond to a set of stimulation times x_1:t and observations correspond to recorded excitatory postsynaptic currents (EPSCs) y_1:t. In case of successive experiments (Park and Pillow, 2012), the mutual information between the parameters and the next observation Y_{t+1} conditioned on the experiment history h_t is

I(Θ; Y_{t+1} | h_t, x_{t+1}) = H(Θ | h_t) − E_{y_{t+1}}[H(Θ | h_t, Y_{t+1} = y_{t+1})],  (1)

where H(Θ | h_t) is the entropy of Θ given the experiment history up to time step t:

H(Θ | h_t) = − ∫ p(θ | h_t) log p(θ | h_t) dθ,  (2)

and the second term is the conditional entropy of Θ given the future observation random variable Y_{t+1}. Since the actual value of the future observation is unknown, we take the average over y_{t+1} of the conditional entropy H(Θ | h_t, Y_{t+1} = y_{t+1}) conditioned on a certain value y_{t+1}. As the predictive distribution depends on the unknown parameters, we also have to take an average over θ, using the current posterior distribution p(θ | h_t) at time t (Lewi et al., 2009):

E_{y_{t+1}}[H(Θ | h_t, Y_{t+1} = y_{t+1})] = ∫ p(θ | h_t) ∫ p(y_{t+1} | h_t, x_{t+1}, θ) H(Θ | h_t, Y_{t+1} = y_{t+1}) dy_{t+1} dθ.  (3)

Figure 1. (a) Binomial model of neurotransmitter release. Upon the arrival of a presynaptic spike, vesicles stochastically fuse with the plasma membrane and release their neurotransmitters into the synaptic cleft. After spike t, k_t vesicles (out of the n_t available ones in the readily-releasable pool) release their neurotransmitters with a probability p. Neurotransmitters bind to postsynaptic receptors: a single release event triggers a quantal response q. The total recorded postsynaptic current (i.e. the output of the system) is the sum of the effects of the release events. After releasing, vesicles are replenished with a certain time constant τ_D, which determines short-term depression. (b) Bayesian Active Learning applied to biology. At each time step, the response of the system (e.g. here a synapse) to artificial stimulation is recorded. This observation is used by the filter to compute the posterior distribution of parameters p(θ | x_1:t, y_1:t). Given this posterior, the controller then computes the next input x*_{t+1} to maximize the expected gain of information of the next observation. In classical experiment design, the inputs x_1:T are defined and fixed prior to the recordings.
The goal of Bayesian active learning is to select the next stimulation to maximize the mutual information between the parameters and all future observations:

x*_{t+1} = argmax_{x_{t+1} ∈ X_{t+1}} max_{x_{t+2:t+T} ∈ X_{t+2:t+T}} I(Θ; Y_{t+1:t+T} | h_t, x_{t+1:t+T}),  (5)

where X_{t+1} is the set of possible inputs at time step t+1 and X_{t+2:t+T} is the set of possible protocols for the stimulations from time step t+2 to t+T. This set of protocols includes all the stimulation constraints, e.g. the remaining time of the experiment or the minimal inter-stimulation time. Optimizing all future inputs is an intractable problem (especially for online applications), since the algorithmic complexity scales exponentially with the number of observations. For this reason, BAL only optimizes for the next stimulus (an approach referred to as a myopic design, see Figure 1):

x*_{t+1} = argmax_{x_{t+1} ∈ X_{t+1}} I(Θ; Y_{t+1} | h_t, x_{t+1}).  (6)

Different methods have been proposed to compute Eq. 6. Monte Carlo (MC) methods (Huan and Marzouk, 2013) or a variational approach (Foster et al., 2019) can be employed, but they usually require long computation times that can be impractical if the time between successive experiments is short. Closed-form solutions can be computed only for some special cases, such as linear models or GLMs (Lewi et al., 2009).
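As a toy illustration of Eqs. 1 to 3, the expected information gain of a single binary observation can be computed exactly on a discretized parameter grid; the controller then simply picks the candidate input whose gain is largest. The grid model below is our own minimal example (not the synapse model of this paper):

```python
import numpy as np

def entropy(p):
    """Shannon entropy (nats) of a discrete distribution."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_info_gain(prior, lik1):
    """I(Theta; Y | x) = H(Theta) - E_y[H(Theta | y)] for a binary outcome.

    prior: p(theta_j | h_t) on a discrete grid; lik1[j] = p(y = 1 | x, theta_j).
    """
    p_y1 = np.sum(prior * lik1)                    # predictive p(y = 1 | x, h_t)
    post1 = prior * lik1 / p_y1                    # posterior after observing y = 1
    post0 = prior * (1.0 - lik1) / (1.0 - p_y1)    # posterior after observing y = 0
    return entropy(prior) - (p_y1 * entropy(post1) + (1.0 - p_y1) * entropy(post0))
```

An input whose outcome distribution does not depend on θ has zero expected gain, so the controller discards it in favor of informative inputs.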

The system: a binomial model of neurotransmitter release
To illustrate our ESB-BAL framework, we apply it to the problem of estimating the parameters of a chemical synapse, represented as a state-space model with unobservable hidden states and input-dependent state transitions. A classically used model to describe the release of neurotransmitters at chemical synapses is the binomial model (Katz, 1969; Barri et al., 2016; Bird et al., 2016; Gontier and Pfister, 2020; Stricker and Redman, 2003; Scheuss and Neher, 2001; Tsodyks et al., 1998). According to this model, a synapse is described as an Input-Output Hidden Markov Model (IO-HMM) with the following parameters (units are given in square brackets, see also Figure 1): the number of release sites N [-], the release probability p [-], the quantal current q [A], the recording noise σ [A], and the depression time constant τ_D [s]. The variables n_t and k_t represent, respectively, the number of available vesicles in the readily-releasable state at the moment of spike t (with 0 ≤ n_t ≤ N), and the number of vesicles (among n_t) released after spike t (with 0 ≤ k_t ≤ n_t). For simplicity, we use the notations p_θ(⋅) = p(⋅ | θ) with θ = [N, p, q, σ, τ_D], and z_t := (n_t, k_t) to refer to the hidden variables at time step t.
The probability of recording a set of EPSCs p_θ(y_1:t) is computed as the marginal of the joint distribution of the observations y_1:t and the hidden variables z_1:t, i.e.

p_θ(y_1:t) = Σ_{z_1:t} p_θ(y_1:t, z_1:t),

where p_θ(y_t | k_t) is the emission probability, i.e. the probability to record output y_t knowing that k_t vesicles released neurotransmitter; p_θ(k_t | n_t) is the binomial distribution B(k_t | n_t, p) and represents the probability that, given n_t available vesicles, k_t of them will indeed release neurotransmitter. Finally, p_θ(n_t | n_{t−1}, k_{t−1}, Δt_t) represents the process of vesicle replenishment. During the time interval Δt_t, each empty release site can refill with a probability λ(Δt_t) = 1 − exp(−Δt_t / τ_D), such that the transition probability p_θ(n_t | n_{t−1}, k_{t−1}, Δt_t) is given by a binomial distribution over the number of refilled sites, where n_t − (n_{t−1} − k_{t−1}) is the number of refilled sites during the time interval Δt_t. Eqs. 7 to 10 define the observation model of the studied system (see Figure 1), i.e. the probability of a set of observations y_1:t given a vector of stimuli x_1:t and a vector of parameters θ.
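To make the generative model concrete, the sampling process defined by Eqs. 7 to 10 can be sketched in a few lines. This is an illustrative simulator under our reading of the model (Gaussian emission noise around k_t q), not the authors' code; the function name is ours:

```python
import numpy as np

def simulate_synapse(isis, theta, rng=None):
    """Simulate EPSC amplitudes from the binomial release model.

    theta = (N, p, q, sigma, tau_D): number of release sites, release
    probability, quantal current, recording noise, depression time constant.
    isis: inter-stimulation intervals (s) preceding each spike.
    """
    N, p, q, sigma, tau_D = theta
    rng = np.random.default_rng() if rng is None else rng
    n = N                                   # full readily-releasable pool initially
    epscs = []
    for dt in isis:
        # replenishment: each empty site refills with prob. 1 - exp(-dt / tau_D)
        n = n + rng.binomial(N - n, 1.0 - np.exp(-dt / tau_D))
        # release: each of the n available vesicles fuses with probability p
        k = rng.binomial(n, p)
        n = n - k
        # emission: EPSC amplitude is k quanta plus Gaussian recording noise
        epscs.append(k * q + sigma * rng.normal())
    return np.array(epscs)
```

Drawing trains with short ISIs from this simulator reproduces the short-term depression of successive EPSC amplitudes.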

The filter: online computation of the posterior distributions of parameters
To be applicable to online experiments, the filtering block, which computes the posterior distribution of parameters p(θ | h_t), needs to satisfy two requirements: 1. it must be sufficiently versatile to be applied to different systems and models; 2. it must be online (i.e. its algorithmic complexity should not increase with the number of observations) (Bykowska et al., 2019).
A promising solution is particle filtering (Kutschireiter et al., 2020), and especially the Nested Particle Filter (NPF) (Crisan et al., 2018). This algorithm is asymptotically exact and purely recursive, thus allowing the parameters of an HMM to be estimated directly as recordings are acquired.
The NPF relies on two nested layers of particles to approximate the posterior distributions of both the static parameters θ of the model and of its hidden states z_t. A first outer filter with M_out particles is used to compute the posterior distribution of parameters p(θ | h_t), and for each of these particles, an inner filter with M_in particles is used to estimate the corresponding hidden states (so that the total number of particles in the system is M_out × M_in). After each new observation, these particles are resampled based on their respective likelihoods, hence updating their posterior distributions (Figure 5).
The NPF was originally proposed for static HMMs, in which the state transition probability p(z_t | z_{t−1}) is supposed to be constant. Here, we extend it to the more general class of Input-Output Hidden Markov Models (IO-HMMs, also called GLM-HMMs in neuroscience, see Jha et al. (2022)), in which the state transition probability at time step t depends on an external input x_t. For instance, the state transition in our model of synapse is not stationary, but depends on the ISI (see Section Results).
The filter (Algorithm 1) relies on the following approximation to recursively compute the likelihood of each particle. Once the observation y_t has been recorded, the likelihood of particle θ_t^i, with i ∈ {1, ..., M_out}, depends on the predictive probability p(y_t | y_1:t−1, θ_t^i), with

p(y_t | y_1:t−1, θ_t^i) ≈ Σ_{z_t} p(y_t | z_t, θ_t^i) Σ_{z_{t−1}} p(z_t | z_{t−1}, x_t, θ_t^i) p(z_{t−1} | y_1:t−1, θ_{t−1}^i).  (12)

If the variance of the jittering kernel (which mutates the samples to avoid particle degeneracy and local solutions, see Methods and Materials) is sufficiently small, then θ_t^i ≈ θ_{t−1}^i, and hence p(z_{t−1} | y_1:t−1, θ_t^i) ≈ p(z_{t−1} | y_1:t−1, θ_{t−1}^i), which allows to recursively compute Eq. 11. In practice, the different terms in Eq. 12 are computed as follows: p(y_t | z_t, θ) corresponds to the Likelihood step of Algorithm 1; p(z_t | z_{t−1}, x_t, θ) corresponds to the Propagation step; and p(z_{t−1} | y_1:t−1, θ) corresponds to the distribution of hidden states at time t − 1.
Contrary to previous methods for fast posterior computation that were only applicable to specific models (Lewi et al., 2009), our filter can be applied to any state-space dynamical system, including non-stationary and input-dependent ones. Moreover, it does not require approximating the posterior as a Gaussian, nor a time consuming (and possibly unstable) numerical optimization step, while being highly parallelizable and efficient (Besard et al., 2018, 2019).

The controller: computation of the optimal next stimulation time
The objective of experiment design optimization is to minimize the uncertainty of the estimates (classically quantified using the entropy) while reducing the cost of experimentation (defined as the number of required trials, samples, or observations). The optimal next stimulus x*_{t+1} that will maximize the mutual information (i.e. minimize the uncertainty about θ as measured by the entropy) can be written from Eqs. 1, 3, and 6 as

x*_{t+1} = argmin_{x_{t+1}} ∫ p(θ | h_t) ∫ p(y_{t+1} | h_t, x_{t+1}, θ) H(Θ | h_t, x_{t+1}, Y_{t+1} = y_{t+1}) dy_{t+1} dθ.  (13)

Eq. 13 requires computing two (possibly high-dimensional) integrals over θ and y_{t+1}, for which closed-form expressions only exist for specific models. To avoid long MC simulations, we propose to use mean-field computations and to replace integrals by point-based approximations. Firstly, instead of computing the full expectation over p(θ | h_t), we set θ to the mean posterior value θ̂ = ∫ θ p(θ | h_t) dθ, which can be conveniently approximated as the average of the outer particles. Depending on the nature of the studied system and on the time constraints of the experiment, different estimators can also be used, such as e.g. θ̂ = argmax_θ p(θ | h_t). Secondly, instead of computing the full expectation over the future observation, we set y_{t+1} to its expected value; Eq. 13 thus becomes

x*_{t+1} ≈ argmin_{x_{t+1}} H(Θ | h_t, x_{t+1}, Y_{t+1} = E(y_{t+1} | h_t, x_{t+1}, θ̂)).

In the general case, E(y_{t+1} | h_t, x_{t+1}, θ̂) can be computed using Bayesian Quadrature (Acerbi, 2018). More specifically, for our model of a chemical synapse, an analytical formulation for the expected value E(y_{t+1} | x_1:t+1, θ̂) can be efficiently derived using mean-field approximations (see Section Mean-field approximation of vesicle dynamics). For each candidate x_{t+1} in a given finite set X_{t+1}, the entropy H(Θ | h_t, x_{t+1}, Y_{t+1} = E(y_{t+1} | h_t, x_{t+1}, θ̂)) can be computed using Algorithm 1.
Finally, by assuming that the posterior distribution of Θ is well approximated by a Gaussian distribution (which is the case when there are sufficient observations (Paninski, 2005)), its entropy can be estimated as ½ log |2πe Σ_t|, where Σ_t is the covariance matrix of the particles {θ_t^i}_{1≤i≤M_out} (Jha et al., 2022).
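Under this Gaussian approximation, both point-based summaries used by the controller (the posterior mean θ̂ and the entropy estimate) reduce to moments of the outer particle cloud. A minimal sketch, assuming equally weighted particles:

```python
import numpy as np

def gaussian_entropy(particles):
    """Entropy of a Gaussian fitted to the particles: 0.5 * log|2*pi*e*Sigma| (nats)."""
    cov = np.atleast_2d(np.cov(particles, rowvar=False))
    return 0.5 * np.linalg.slogdet(2.0 * np.pi * np.e * cov)[1]

def posterior_summary(particles):
    """Point estimate theta_hat (posterior mean) and entropy of the outer cloud."""
    return particles.mean(axis=0), gaussian_entropy(particles)
```

Because only the log-determinant of the particle covariance is needed, the cost of this entropy estimate is negligible compared to the filtering step itself.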

First setting: reducing the uncertainty of estimates for a given number of observations
From the experimentalist's point of view, a highly relevant question is how to optimize the stimulation protocol such that the measured EPSCs are most informative about synaptic parameters. Previous studies showed that some stimulation protocols are more informative than others, but ignored the temporal correlations of the number of readily-releasable vesicles (Costa et al., 2013) or did not compute which protocol would be most informative (Barri et al., 2016). In classical deterministic experiment protocols, the stimulation times x_1:T are defined and fixed prior to the recordings. By contrast, active learning optimizes the protocol on the fly as data are recorded.
Results for a simulated experiment with ground-truth parameters N* = 7, p* = 0.6, q* = 1 pA, σ* = 0.2 pA, and τ_D* = 0.25 s (i.e. the same set of parameters θ* used in Bird et al. (2016)) are displayed in Figure 2 (a). Here, we compare ESB-BAL to three deterministic protocols:
• in the Constant protocol, the synapse is probed at a constant frequency, i.e. Δt_t = Δt_cst;
• in the Uniform protocol, ISIs are uniformly drawn from a set of candidates consisting of equidistantly separated values ranging from Δt_min = 0.005 s (i.e. one order of magnitude shorter than the shortest ISI used in Barri et al. (2016)) to Δt_max, i.e. Δt_t ∼ Uniform([0.005, Δt_max]);
• finally, in the Exponential protocol, ISIs are drawn from an exponential distribution with mean Δt_exp. Such a protocol has been shown to provide better estimates of synaptic parameters compared to periodic spike trains with constant ISI (Barri et al., 2016; Costa et al., 2013).
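The three baseline protocols can be sketched as simple ISI generators. The floor applied to the exponential draws is our assumption, to keep intervals physically realizable:

```python
import numpy as np

def constant_protocol(T, dt_cst):
    """Constant protocol: fixed inter-stimulation interval."""
    return np.full(T, dt_cst)

def uniform_protocol(T, dt_max, rng, dt_min=0.005):
    """Uniform protocol: ISIs drawn uniformly from [dt_min, dt_max]."""
    return rng.uniform(dt_min, dt_max, size=T)

def exponential_protocol(T, mean_dt, rng, dt_min=0.005):
    """Exponential protocol: ISIs ~ Exp(mean_dt), floored at dt_min
    (the floor is our assumption, not part of the original definition)."""
    return np.maximum(rng.exponential(mean_dt, size=T), dt_min)
```

Each generator returns a full train of T ISIs that can be fed to the synapse simulator, which is how the baselines are evaluated against ESB-BAL.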
The efficiency of these deterministic protocols depends on their respective parametrizations. To conservatively assess ESB-BAL, we optimize the values of Δt_cst, Δt_max, and Δt_exp so that the Constant, Uniform, and Exponential protocols have the best possible performance for the used ground-truth parameters θ*. Figure 6 shows the average final entropy decrease (i.e. the information gain) after 200 observations using the Constant (top), Uniform (middle), or Exponential (bottom) protocol, for different values of their hyperparameters. These deterministic protocols (with their optimal respective parametrizations) are then compared to ESB-BAL.
For the different protocols, the average (over 100 independent repetitions) joint differential entropy of the posterior distribution of parameters is plotted as a function of the number of observations (Figure 2 (a)). ESB-BAL reduces the uncertainty (as measured by the entropy) of the parameter estimates for a given number of observations. It should be noted that it is compared to deterministic protocols whose respective hyperparameters have been optimized offline, knowing the value of θ*. In real physiology experiments, classical protocols are non-adaptive and are defined using (possibly sub-optimal) default parameters. In contrast, in active learning the protocol is optimized on the fly as data are recorded, and its performance will not depend on a prior parametrization. ESB-BAL thus outperforms the best possible Constant, Uniform, and Exponential protocols.
We also verify that ESB-BAL does not lead to biased estimates of θ, as its average RMSE outperforms that of the other protocols (Figure 2 (b)); and that it is sufficiently fast for online applications, as computation time exceeds the ISI in only a small proportion of cases (Figure 2 (c)). Similar results can be observed for different sets of ground-truth parameters θ* (Figure 7) or when only optimizing for the entropy of a specific parameter (Figure 8).
Finally, in Figure 2 (a), ESB-BAL (black dashed line) is compared to exact active learning (gray dashed line), in which Eq. 13 is computed exactly using MC samples. Samples to compute the expectation over θ are drawn from p(θ | h_t), whereas samples used to compute the expectation over y_{t+1} are drawn from the generative distribution p(y_{t+1} | h_t, x_{t+1}, θ) (Eqs. 8, 9, and 10). This shows that the approximations used in Algorithm 2 to make active learning online have only a small effect on performance.

Second setting: reducing the uncertainty of estimates for a given experiment time
Active learning allows, for a given number of observations, to improve the reliability of the estimated parameters. However, in its classical implementation, only the next stimulus input is optimized, disregarding all future observations in the experiment. This myopic approach is thus sub-optimal. Moreover, neurophysiology experiments are not only constrained by the number of observations, but also by the total time of the experiment. Since cell viability and recording stability may become limiting during an experiment, the total time of an experimental protocol Σ_{t=1}^T Δt_t also needs to be accounted for. Here, to account for the total time of the experiment, and to globally optimize the information gain per unit of time, we propose to modify the classical formulation of active learning (Eq. 13) by adding a penalty term for longer ISIs:

x*_{t+1}(λ) = argmin_{x_{t+1}} H(Θ | h_t, x_{t+1}, Y_{t+1} = E(y_{t+1} | h_t, x_{t+1}, θ̂)) + λ Δt_{t+1}.

The effect of the penalty weight λ on the entropy of the posterior distribution of θ is displayed in Figure 3 (a). As expected, adding a penalty term to Eq. 13 reduces the precision of the inferred parameters. The loss of information gain increases with the penalty weight λ. However, increasing λ also increases the speed of information gain, as seen in Figure 3 (b). Depending on the available time for the experiment, it is thus possible to tune λ so as to find a trade-off between long-term precision (Figure 3 (a)) and information rate (Figure 3 (b)).
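The penalized objective amounts to a one-line change in the controller: for each candidate ISI, the estimated posterior entropy is augmented by a term proportional to the ISI before taking the argmin. A sketch, assuming the expected entropies for each candidate have already been computed with Algorithm 1:

```python
import numpy as np

def select_next_isi(candidate_isis, expected_entropies, penalty):
    """Penalized myopic selection: argmin over dt of H_hat(dt) + penalty * dt."""
    candidate_isis = np.asarray(candidate_isis, dtype=float)
    cost = np.asarray(expected_entropies, dtype=float) + penalty * candidate_isis
    return float(candidate_isis[np.argmin(cost)])
```

With a zero penalty the controller favors whichever ISI is most informative, however long; with a large penalty it converges to the shortest admissible ISI.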

Third setting: batch optimization and application to neural recordings
To reduce computational complexity, classical implementations of sequential experiment design usually only optimize for the immediate next observation. However, it might be critical for some systems to optimize not only the next stimulus, but rather the next stimuli of the experiment altogether (see Eq. 5) (Ryan et al., 2016; Drovandi et al., 2018). Synaptic characterization is a telling example: indeed, STD can only be observed for specifically organized batches of stimulation times. When probing the presynaptic cell, neuroscientists usually use repetitions of a spike train (Figure 4 (c)) consisting of a tetanic stimulation phase (sustained high-frequency stimulation used to deplete the presynaptic vesicles) followed by recovery spikes at increasing ISIs to probe the STD time constant (Markram et al., 1998). These spike trains (especially the duration and frequency of the tetanic phase, and the ISIs between recovery spikes) are usually not optimized, and are held constant throughout an entire experiment.
Here, we show that ESB-BAL can be used to extend active learning to non-myopic designs, and to optimize the next batch of input stimuli. Algorithm 3, which is a generalization of Algorithm 2, is used to select the next batch of stimuli. We validate our method by applying it to EPSC recordings from acute mouse cerebellar slices (Figure 4 (a)): 5 mossy fiber to granule cell synaptic connections were studied. Each of them was stimulated using both ESB-BAL and different deterministic protocols: in deterministic protocols, the presynaptic cell is stimulated using repetitions of a train consisting of either 20 or 100 stimuli at either 100 Hz or 300 Hz, followed by 6 recovery pulses at increasing intervals.
For each stimulation protocol, the posterior distribution of the parameters was computed offline using the Metropolis-Hastings algorithm. Figure 4 (b1) shows the entropy at the end of the different stimulation protocols for the 5 studied synaptic connections. Each synapse was stimulated using several protocols (once with ESB-BAL, and at least once with a deterministic one), each having a possibly different number of observations T_1, T_2, T_3, etc. Hence, for each synapse, only the first T = min{T_1, T_2, T_3, ...} observations of each protocol were kept, so as to compare them for the same number of observations. In case a synapse was stimulated with several deterministic protocols, their respective entropies were averaged to obtain a single point in Figure 4 (b1). Panels (b2) to (b6) show the marginal posterior distributions for the parameters of an example synapse, obtained using either ESB-BAL or a deterministic protocol (consisting of repetitions of the same train of 20 stimuli at 100 Hz followed by 6 recovery spikes). Our results show that batch optimization via ESB-BAL significantly outperforms deterministic stimulation.
Candidate batches of stimulation times in X_{t+1:t+T} are parametrized with a low-dimensional parametrization to span different durations and frequencies for the tetanic phase, and different ISIs between the recovery spikes (see Figure 4 (c) and Methods and Materials). They are chosen to span 3 parameters: the number n_tet < T of spikes in the tetanic stimulation phase, the frequency f of spikes in the tetanic stimulation phase, and the duration Δt_last of the final recovery ISI. The remaining T − (n_tet + 1) spikes are then distributed geometrically between the end of the tetanic phase and the penultimate spike.
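One plausible construction of such a candidate batch (our own reading of this parametrization, with recovery ISIs growing geometrically from the tetanic interval 1/f up to a final interval of Δt_last) is:

```python
import numpy as np

def batch_protocol(T, n_tet, f, dt_last):
    """Spike times for one candidate batch: n_tet tetanic spikes at frequency f,
    then T - n_tet recovery spikes whose ISIs grow geometrically up to dt_last."""
    m = T - n_tet                           # number of recovery intervals
    r = (dt_last * f) ** (1.0 / m)          # geometric growth factor per interval
    tetanic = np.arange(n_tet) / f          # regular high-frequency phase
    recovery_isis = (1.0 / f) * r ** np.arange(1, m + 1)  # last ISI equals dt_last
    return np.concatenate([tetanic, tetanic[-1] + np.cumsum(recovery_isis)])
```

Sweeping (n_tet, f, Δt_last) over a small grid then yields the finite set of candidate batches among which Algorithm 3 selects the most informative one.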

Discussion
Summary: We developed a method called Efficient Sampling-Based Bayesian Active Learning (ESB-BAL). Using particle filtering, ESB-BAL selects the next experimental design to maximize the mutual information between the output of the experiment and the parameters of the studied system. To validate it, we apply ESB-BAL to the problem of estimating the parameters of a chemical synapse from its postsynaptic currents evoked by presynaptic stimulations. After each new observation, the optimal next stimulation time can be computed using ESB-BAL. Using synthetic data and synaptic whole-cell patch-clamp recordings in cerebellar brain slices, we show that our method is efficient and fast enough to be used in real-time biological experiments and can reduce the uncertainty of inferred parameters.
For illustrative purposes, we applied ESB-BAL to the specific problem of estimating the parameters characterizing a chemical synapse. However, we argue that our framework is sufficiently general and efficient to be applicable to a broad range of systems and domains of research. In particular, our extension of the Nested Particle Filter can be applied to any state-space system, even time-variant ones. Moreover, as the Nested Particle Filter is robust to time-varying parameters and model uncertainties (Crisan et al., 2018), we believe that our proposed solution will be especially relevant for neurophysiology experiments or for clinical applications, for instance for optimizing Deep Brain Stimulation (DBS) for the treatment of Parkinson's Disease (Tinkhauser and Moraud, 2021; Carè et al., 2022).
We expect active learning to be particularly beneficial to neurophysiology experiments involving live cells or subjects. By reducing the number of samples required to obtain a certain result, or by improving the efficiency of information gain, we can reduce the cost of the experiment and the need for animal subjects.
Limitations: Our approach leaves room for improvement. An evident drawback of using particle filtering is that it requires a very large number of particles to provide low-variance estimates, as the approximation error only decreases with the square root of the number of particles. Moreover, future experimental work should focus on implementing ESB-BAL for different and more complicated models of a chemical synapse, including for instance short-term facilitation (Pfister et al., 2010; Costa et al., 2013; Barri et al., 2016; Bird et al., 2016; Gontier and Pfister, 2020) or vesicle content variability (Bhumbra and Beato, 2013; Soares et al., 2019). Finally, future theoretical work should focus on obtaining results on the convergence of the estimators when using active learning. When observations are independent and identically distributed (i.i.d.), active learning gives an unbiased estimate of the parameters, whose variance decreases with the number of observations (Paninski, 2005). Such theoretical results are still lacking for systems with correlated outputs (such as the EPSCs in the studied synapse model), possibly leading to information saturation (Moreno-Bote et al., 2014) or biased estimates.
For some recordings in Figure 4 (a), the benefit of using ESB-BAL instead of a deterministic protocol might seem non-significant. For some synaptic connections (e.g. synapse 3), ESB-BAL even yields a higher entropy for the posterior distribution of θ. Different explanations can be put forward. Firstly, it is possible that the classically used deterministic protocols (20 stimuli at 100 Hz followed by 6 recovery spikes at increasing ISIs, see Figure 4 (b)) are already highly informative about the synaptic parameters. For these protocols, the tetanic stimulation phase and the long inter-sweep interval allow the values of the hidden states n_t and k_t to be estimated with high accuracy, which facilitates the estimation of the synaptic parameters (Barri et al., 2016). Moreover, the recovery spikes at varying ISIs are known to be more informative about the synapse's dynamics than a constant stimulation frequency (Costa et al., 2013). Secondly, the small benefit of using ESB-BAL for some synaptic connections might be due to a model mismatch issue, as the model defined by Eqs. 7 to 10 might not exactly represent the ground-truth mechanisms of the studied synapses. Although mossy fiber to granule cell synapses are believed to be good examples of depressing synapses, our simplified model disregards several aspects of synaptic transmission, such as facilitation, postsynaptic saturation, or presynaptic vesicle heterogeneity (Ritzau-Jost et al., 2014; Sargent et al., 2005; Saviane and Silver, 2006). These simplifications might explain why ESB-BAL performs more consistently in simulations than in real experiments.
Future work: An interesting area of future research would be to formulate Optimal Experiment Design as an optimal control problem, using the framework of the Bellman equation (Bellman, 1966; Sutton and Barto, 2018). This multi-stage optimization problem could be solved exactly by defining the associated Bellman equation, in which the mutual information I(Θ; y_{1:n}) is the objective function, the current observation y_t is the state, the input x_t is the control, and the optimal policy determines the next input x_{t+1}. This approach would make it possible to account for the remaining available experimental time.
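One way such a Bellman recursion could be written (a sketch in standard notation, not taken from the paper; the belief state $b_t$ and finite horizon $T$ are assumptions for illustration):

```latex
% Sketch of OED as a finite-horizon control problem, with the posterior
% b_t = p(theta | x_{1:t}, y_{1:t}) as state and information gain as reward:
V_T(b_T) = 0, \qquad
V_t(b_t) = \max_{x_{t+1}} \;
  \mathbb{E}_{y_{t+1} \sim p(y_{t+1} \mid x_{t+1}, b_t)}
  \big[ H(b_t) - H(b_{t+1}) + V_{t+1}(b_{t+1}) \big]
```

Here the per-step reward $H(b_t) - H(b_{t+1})$ is the expected information gain of the next observation, and the horizon $T$ encodes the remaining experimental time.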
Bayesian Active Learning is an efficient framework for solving the problem of optimal experiment design for parameter inference. Its goal is, for a given generative model M, to optimize the accuracy of the estimates of the parameters of M, i.e. to minimize the entropy of the posterior distribution p(θ | x_{1:n}, y_{1:n}, M). But it is also possible to extend optimal experiment design to model selection: in this setting, the goal is to maximize the discriminability between competing candidate models, i.e. to minimize the entropy of p(M | x_{1:n}, y_{1:n}). Different schemes for OED for model selection have been proposed (see Gontier (2022) for a discussion), but their computational complexity is a major impediment to their concrete applicability. An interesting future application of ESB-BAL would be to extend it to optimal model selection.
Overall, we expect our proposed solution to pave the way towards better estimates of stochastic models in neuroscience, more efficient training in machine learning, and more systematic and automated experimental designs.

Conclusion:
When designing an experiment in physiology, or when training a model on data in machine learning, it is common to choose a priori a fixed set of inputs to the studied system. The use of such non-adaptive, non-optimized protocols often leads to a large variance of the estimated parameters, even when using a large number of trials or data points. Bayesian active learning is an efficient method for optimizing these inputs, but exact solutions are often intractable and not applicable to online experiments. Here, we introduce ESB-BAL, a novel framework combining particle filtering, parallel computing, and mean-field theory. ESB-BAL is general and sufficiently efficient to be applied to a wide range of settings. We use it to infer the parameters of a model of a chemical synapse: for this specific example, computation time is a critical constraint, since the typical ISI is shorter than 1 s, and because several future inputs need to be optimized together. Using synthetic data and neural recordings, we show that our method has the potential to significantly improve the precision and speed of model-based inferences.

Bayesian Active Learning
For a fixed model M, the goal of BAL is to optimize the accuracy of the estimates of its parameters θ, i.e. to minimize the entropy of the posterior distribution p(θ | x, y) (see Lindley (1956); Huan and Marzouk (2013) for a detailed discussion). The utility U(x, y) of a given experimental protocol x and of a data set y can be defined either as the gain in Shannon information between the prior and the posterior distribution of the parameters θ, as suggested in Lindley (1956) (Eq. 18), or as the Kullback-Leibler divergence between the posterior and the prior (Eq. 19). The expected utility U(x) of a protocol x is finally the expected value of U(x, y) under p(y | x), which yields the same result under (18) and (19) (Eq. 20). It is worth noting that U(x) is actually the mutual information between y and Θ: indeed, Eq. (20) can be rewritten to yield U(x) = I(Θ; y). Different MCMC-based methods to compute U(x) are described in Huan and Marzouk (2013).
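In standard notation (a reconstruction consistent with the surrounding text, not copied from the original equations; x denotes the protocol, y the data, and θ the parameters), these definitions read:

```latex
% Gain in Shannon information (cf. Eq. 18):
U(x, y) = H\!\left[ p(\theta) \right] - H\!\left[ p(\theta \mid x, y) \right]

% Kullback-Leibler divergence form (cf. Eq. 19):
U(x, y) = D_{\mathrm{KL}}\!\left( p(\theta \mid x, y) \,\big\|\, p(\theta) \right)

% Expected utility (cf. Eq. 20), identical for both definitions:
U(x) = \mathbb{E}_{y \sim p(y \mid x)}\!\left[ U(x, y) \right] = I(\Theta; y)
```

The two pointwise utilities differ in general, but their expectations under p(y | x) coincide and equal the mutual information.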

Particle Filtering for synaptic characterization
Initialisation: Computing the posterior distribution of the parameters θ first requires specifying a prior p(θ) from which the initial particles {θ_0^(i)}, 1 ≤ i ≤ N_out, will be drawn. For simplicity, we consider here uniform priors (as in Bird et al. (2016); Gontier and Pfister (2020)), although the algorithm readily extends to different choices of prior.
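A minimal sketch of this initialization step (the grid bounds, parameter names, and particle count below are illustrative assumptions, not the paper's values):

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative discretized parameter grid for a binomial synapse model;
# each particle is a position on this grid, drawn from a uniform prior.
GRID = {
    "N":   np.arange(1, 21),             # number of release sites
    "p":   np.linspace(0.05, 0.95, 19),  # release probability
    "tau": np.linspace(0.05, 1.0, 20),   # replenishment time constant [s]
}
N_PARTICLES = 1000

# Draw initial particles: one grid value per parameter per particle.
particles = {
    name: rng.choice(values, size=N_PARTICLES)
    for name, values in GRID.items()
}
weights = np.full(N_PARTICLES, 1.0 / N_PARTICLES)  # uniform initial weights
```

A non-uniform prior would simply replace `rng.choice` with weighted sampling over the same grid.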

Jittering step:
The parameters θ that we wish to infer are assumed to be constant. It is thus impossible to define dynamics of the form p(θ_{t+1} | θ_t) for the particles (as opposed to filtering problems aiming at inferring a dynamical hidden state, as for instance in Kutschireiter et al. (2017)). To avoid particle degeneracy, it is thus necessary to mutate particles using a jittering kernel K(θ_t | θ_{t-1}). When particles take continuous values, a classical choice for the jittering kernel is to draw the next particle from a Gaussian distribution with mean θ_{t-1} and whose variance is called the jittering width (see Crisan et al. (2018) for a detailed discussion). In our implementation, the range of possible values for each parameter is discretized, so that each particle corresponds to a position on the grid of possible parameter values (same implementation as in Bird et al. (2016)). The free parameter of our jittering kernel thus corresponds to the probability of moving by one bin, the mutated particle being one (randomly chosen) bin away from θ_{t-1}.
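A sketch of this discrete jittering kernel (the move probability `jitter_prob`, grid size, and edge handling by clipping are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter(indices: np.ndarray, grid_size: int, jitter_prob: float) -> np.ndarray:
    """Mutate grid-valued particles: with probability `jitter_prob`,
    move one bin up or down (clipped at the grid edges)."""
    move = rng.random(indices.shape) < jitter_prob
    step = rng.choice([-1, 1], size=indices.shape)
    return np.clip(indices + move * step, 0, grid_size - 1)

# Example: 10,000 particles on a 20-bin grid, 10% move probability.
idx = rng.integers(0, 20, size=10_000)
jittered = jitter(idx, grid_size=20, jitter_prob=0.1)
```

Larger `jitter_prob` explores the grid faster at the cost of blurring the posterior, mirroring the bias-variance trade-off of the Gaussian jittering width in the continuous case.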
Algorithm 1: Particle filtering for computing a one-step update of the posterior distribution of the parameters. Algorithm 2 is slightly modified in Figure 2 (a) for the "ESB-BAL (exact)" simulations, in which Eq. 13 is computed using MC samples instead of the point-based simplifications explained in Eqs. 14 and 15. Samples used to compute the expectation over θ are drawn from the current posterior distribution, i.e. by random sampling from the pool of particles. For each of these samples, and for each candidate next input x_{t+1}, samples used to compute the expectation over y_{t+1} are drawn by randomly sampling the observed and hidden variables (using respectively Eqs. 10, 9, and 8) from the ground-truth values of the hidden states.

Mean-field approximation of vesicle dynamics
Our synapse model, as defined by Eqs. 7 to 10, is a Hidden Markov Model with observations e_t and hidden states h_t = (n_t, k_t). The predictive distribution p(e_{t+1} | h_t, x_{t+1}, θ) used in Eq. 1 can be computed using the Baum-Welch algorithm; however, the algorithmic complexity of this forward-backward procedure, which scales with N^4, makes it impractical for closed-loop applications. Here, we suggest that computation can be massively simplified by using a mean-field approximation of vesicle dynamics: the analytical mean and variance of the hidden and observed variables can be computed in closed form.
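As an illustration of how such a mean-field simplification might look, the following sketch propagates only the mean of the readily-releasable pool across a spike train (a toy version, not the paper's derivation; binomial release with probability p and exponential recovery with time constant tau_d follow the model above, and tracking the mean only, without the variance, is a simplifying assumption):

```python
import numpy as np

def mean_field_pool(N: float, p: float, tau_d: float, isis: list) -> list:
    """Propagate the mean number of readily-releasable vesicles E[n_t]
    across a spike train, instead of enumerating all (n, k) states.

    After each spike, a mean fraction p of the pool is released; during
    the following inter-spike interval `dt`, the depleted pool relaxes
    back towards N with time constant `tau_d`.
    """
    mean_n = float(N)   # start from a fully replenished pool
    means = [mean_n]
    for dt in isis:
        released = p * mean_n                 # E[k_t] = p * E[n_t]
        depleted = mean_n - released          # pool right after the spike
        recovery = 1.0 - np.exp(-dt / tau_d)  # fraction recovered during dt
        mean_n = depleted + (N - depleted) * recovery
        means.append(mean_n)
    return means

# Example: a 100 Hz train (10 ms ISI) depresses the pool towards a steady state.
traj = mean_field_pool(N=7, p=0.6, tau_d=0.25, isis=[0.01] * 20)
```

Each update is O(1) per spike, versus the N^4 scaling of the exact forward-backward pass, which is what makes closed-loop input optimization feasible.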

Electrophysiology recordings
Animals were treated following national and institutional guidelines. The Cantonal Veterinary Office of Zurich approved all experiments (authorization no. ZH009/2020). Experiments were performed in male and female 1-2-month-old C57BL/6J mice (Janvier Labs, France). Animals were housed in groups of 3-5 in standard cages on a 12 h light/12 h dark cycle with food and water ad libitum. Mice were sacrificed by rapid decapitation after isoflurane anesthesia. The cerebellar vermis was removed quickly and mounted in a chamber filled with cooled extracellular solution. 300-µm-thick parasagittal slices were cut using a Leica VT1200S vibratome (Leica Microsystems, Germany), transferred to an incubation chamber at 35 °C for 30 minutes, and then stored at room temperature until experiments.
Voltage-clamp recordings were done using a HEKA EPC10 amplifier controlled via Patchmaster software (HEKA Elektronik GmbH, Germany), essentially as described in Kita et al. (2021). Voltages were corrected for a liquid junction potential of +13 mV. Extracellular mossy fiber stimulation was performed using square voltage pulses (duration, 150 µs) generated by a stimulus isolation unit (ISO-STIM 01B, NPI) and applied through an ACSF-filled pipette. The pipette was moved over the slice surface close to the postsynaptic cell while applying voltage pulses until excitatory postsynaptic currents (EPSCs) could be evoked reliably. Care was taken to stimulate single mossy fiber inputs. EPSCs were recorded at a holding potential of -80 mV; data were low-pass filtered at 2.9 kHz and digitized at 20-50 kHz. Train stimulation protocols comprised bouts of 20 or 100 MF stimulations at 100 Hz, followed by single pulses to monitor recovery from short-term depression (intervals: 25 ms, 50 ms, 100 ms, 300 ms, 1 s, 3 s). The interval between subsequent train recordings was at least 30 s. For OED experiments, custom protocols were generated online as file templates for use with Patchmaster. EPSCs were quantified as peak amplitudes from a 300-µs baseline before onset.
To facilitate the definition of the range of possible values for the parameters q and σ (and especially to avoid running an experiment with too narrow ranges), recorded EPSC amplitudes were normalized by dividing them by their maximum value. Analyses are thus performed by assuming q ∈ [0, 1] and σ ∈ [0, 1]. Posterior plots in Figure 4 (a) were then multiplied by the maximum amplitudes observed in each dataset.
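This normalization step might be sketched as follows (an illustrative helper, not the analysis code; taking absolute values to handle inward, i.e. negative, EPSCs is an assumption):

```python
import numpy as np

def normalize_epscs(amplitudes: np.ndarray):
    """Normalize recorded EPSC amplitudes to [0, 1] by their maximum,
    returning the scale so posteriors can be mapped back to amperes."""
    scale = float(np.max(np.abs(amplitudes)))
    return np.abs(amplitudes) / scale, scale

# Example: three inward EPSCs in amperes.
norm, scale = normalize_epscs(np.array([-120e-12, -60e-12, -30e-12]))
```

Multiplying posterior samples of q and σ by `scale` recovers estimates in physical units, as done for the posterior plots in Figure 4 (a).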
Figure 1. (a) Model of binomial synapse with STD. The presynaptic axon is stimulated to evoke an action potential. The input refers to the time interval since the previous stimulation, i.e. to the inter-spike interval Δt. In chemical synapses, the presynaptic terminal is characterized by the presence of vesicles containing the neurotransmitter molecules, N of them being in the readily-releasable state (Kaeser and Regehr, 2017). Upon the arrival of a presynaptic spike, these vesicles will stochastically fuse with the plasma membrane and release their neurotransmitters into the synaptic cleft. After spike t, k_t vesicles (out of the n_t available ones in the readily-releasable pool) release their neurotransmitters with a probability p. Neurotransmitters will bind to postsynaptic receptors: a single release event triggers a quantal response q. The total recorded postsynaptic current (i.e. the output of the system) is the sum of the effects of the release events. After releasing, vesicles are replenished with a certain time constant τ_D, which determines short-term depression. (b) Bayesian Active Learning applied to biology. At each time step, the response of the system (e.g. here a synapse) to artificial stimulation is recorded. This observation is used by the filter to compute the posterior distribution of the parameters p(θ | x_{1:t}, y_{1:t}). Given this posterior, the controller then computes the next input x*_{t+1} to maximize the expected gain of information of the next observation. In classical experiment design, the inputs x_{1:T} are defined and fixed prior to the recordings.
• N (the number of presynaptic independent release sites [-]);
• p (their release probability upon the arrival of a presynaptic spike [-]);
• σ (the standard deviation of the recording noise [A]);
• q (the quantum of current elicited in the postsynaptic cell by one release event [A]);
• τ_D (the time constant of synaptic vesicle replenishment [s]).

Figure 2. First setting: reducing the uncertainty of estimates for a given number of observations. (a) Entropy of the posterior distribution of θ vs. number of observations for different stimulation protocols. Synthetic data were generated from a model of synapse with ground-truth parameters N* = 7, p* = 0.6, q* = 1 pA, σ* = 0.2 pA, and τ_D* = 0.25 s (Bird et al., 2016). Gray dashed line corresponds to "exact" active learning, in the sense that x*_{t+1} is computed from Eq. 13 using MC samples instead of using Eq. 15. Traces show the average over 100 independent repetitions. Shaded area: standard error of the mean. (b) RMSE for the same simulations. (c) Histogram of the differences between the ISI and the corresponding computation time for the ESB-BAL simulations.

Figure 3. Second setting: reducing the uncertainty of estimates for a given experiment time (effect of penalizing long ISIs on parameter estimate uncertainty and rate of information gain). (a) Entropy of the posterior distribution of θ vs. number of observations for different values of the penalty parameter in Eq. 16. Same settings as in Figure 7. (b) Same results, but displayed as a function of time. Inset: slope of the entropy vs. time curves (i.e. information rate) as a function of the penalty parameter, after 10 seconds.

Figure 4. Third setting: batch optimization and application to neural recordings. (a) Left: 5 mossy fiber to granule cell synaptic connections from acute cerebellar slices of mice were studied. Each of them was stimulated using both deterministic protocols and ESB-BAL. Right: examples of postsynaptic current traces recorded from a granule cell upon extracellular mossy fiber stimulation. (b) Panel (1): Entropy at the end of stimulation protocols for the 5 studied synapses. Marker sizes are proportional to the number of observations obtained for each synapse. Test: weighted regression analysis (p = 0.01945). Panels (2) to (6): Marginal posterior distributions for an example cell using either a deterministic protocol (blue) or ESB-BAL (black). (c) Schematic of how the elements of the batch of candidate future inputs in Algorithm 3 are defined. They are chosen to span 3 parameters: the number of spikes in the tetanic stimulation phase, the frequency of spikes in the tetanic stimulation phase, and the duration of the final recovery ISI. (d) Simulated experiment with ground-truth parameters N* = 47, p* = 0.27, q* = 2.65 pA, σ* = 1.32 pA, and τ_D* = 0.17 s (i.e. the MAP values from the recordings shown in (b)).