Inferring the Forces Controlling Metaphase Kinetochore Oscillations by Reverse Engineering System Dynamics

Kinetochores are multi-protein complexes that mediate the physical coupling of sister chromatids to spindle microtubule bundles (called kinetochore (K)-fibres) from respective poles. These kinetochore-attached K-fibres generate pushing and pulling forces, which combine with polar ejection forces (PEF) and elastic inter-sister chromatin to govern chromosome movements. Classic experiments in meiotic cells using calibrated micro-needles measured an approximate stall force for a chromosome, but methods that allow the systematic determination of forces acting on a kinetochore in living cells are lacking. Here we report the development of mathematical models that can be fitted (reverse engineered) to high-resolution kinetochore tracking data, thereby estimating the model parameters and allowing us to indirectly compute the (relative) force components (K-fibre, spring force and PEF) acting on individual sister kinetochores in vivo. We applied our methodology to thousands of human kinetochore pair trajectories and report distinct signatures in temporal force profiles during directional switches. We found the K-fibre force to be the dominant force throughout oscillations, and the centromeric spring the smallest although it has the strongest directional switching signature. There is also structure throughout the metaphase plate, with a steeper PEF potential well towards the periphery and a concomitant reduction in plate thickness and oscillation amplitude. This data driven reverse engineering approach is sufficiently flexible to allow fitting of more complex mechanistic models; mathematical models of kinetochore dynamics can therefore be thoroughly tested on experimental data for the first time. Future work will now be able to map out how individual proteins contribute to kinetochore-based force generation and sensing.


Parameter updates
We update each parameter separately by drawing samples from the full conditional probability distributions as follows:

Precision τ
The conditional distribution π(τ |·) is Gamma distributed, where π(·) indicates a probability function and the conditioning is on all parameters and data except τ , giving the update where f k i (σ k i ) = (−1) k v σ k i + (−1) k κ(d i − L cos(θ i )) − αx k i , and parameters are defined as in main text.

Velocities v ±
The conditional distributions on v ± are Gaussian distributed. For v + we have conditional and similarly for v − . Completing the square, we have for Here n ± are the number of time points in state ± respectively combined over both sister trajectories and we have n + + n − = 2(n − 1). We use a Metropolis-Hasting rejection step to impose the constraints v + > 0, v − < 0, using the above as a proposal.

Spring constant κ
The conditional distribution on κ is Gaussian. It is given by We impose the constraint κ ≥ 0 by using the above as a proposal with a Metropolis-Hasting rejection step.

Spring natural length L
The natural spring length L has conditional Completing the square gives the update with τ L = s −2 L + 2τ κ 2 n−1 i=1 cos 2 (θ i ). Again, we impose positivity with a rejection step.

Anti-poleward force parameter α
The conditional π(α|·) is a Gaussian, with α > 0. This gives a conditional posterior, k=1 (x k i ) 2 and truncated to α > 0 by rejection step. The conditionals on p c , p ic are Beta distributed as follows where N c , K c are the number of coherence time points and the number of switches to incoherence respectively for the (current) sister hidden states σ k t . Similarly, 1.1.7 Hidden states σ k t The hidden MC can be updated by a Gibbs move locally at each time point [1]. Define a rs as the transition matrix between the states (+, −), with p c , p ic the probabilities of no change, a depending on the current state. The conditional for state σ k i , sister k is then given by where φ is the Gaussian pdf of ∆X k i depending on σ k i , and we indicate the dependence of a on the other sisterk (determining whether the sisters are coherent/incoherent). The first and last time points are slightly different, as there is no preceding/following state. This conditional is easily normalised for a Gibbs update, running consecutively over sisters and through time. However, this algorithm has poor mixing; on the Gelman-Rubin convergence statistic [2] the noise τ is the slowest to converge because of poor convergence of the hidden MC. Performance is improved by using two moves. Firstly, doing a joint update of both sisters at each time point, i.e. we use the above conditional (product over sisters) to compute the probabilities of states (+, +), (+, −), (−, +), (−, −) at each time point and use a Gibbs update. Secondly, we use a block move of length w for each sister separately. This is computationally time consuming taking about 4 times longer than the joint sister update. We found a 20% mix of the block move, w randomly selected over 4-10 gave as good convergence as a full block move. The block move is constructed as follows. Define row vectors u 0 , u 1 associated with current states 0,1 at position i. Then iteratively move to the next time point reconstructing for sister k, thereby doubling the length of u r . This works through the window. Initialise the vectors by u r = a sr if σ k i1 = s, and then u r is the unnormalised probability vector of all 2 w states if σ k iw+1 = r. The state in the window is then updated using a Gibbs move. Windows from the start of the time series have a prior on the initial position as above. Computational costs to update over the full time series are similar for w = 2...11, thereafter rising.

Initial conditions, mixing and convergence
We ran 5 chains for each sister pair from over-dispersed initial conditions and used the Gelman-Rubin statistic [2] to test for convergence. Chains were initially run up to 250,000 samples and sub-sampled by a factor of 20. If they failed to converge (convergence diagnostic > 1.2 on any parameter), run length was doubled (subsampling frequency also doubled), repeated up to a maximum of 3 times to meet the convergence criterion. Any trajectory with a convergence statistic still above 1.2 was considered unconverged and discarded from further analysis. The hidden state was initialised from a simulation of the MC given the initial parameters p c , p ic . We used the last 5000 samples per chain for final analysis, discarding burn-in.

Identifiability
The model in (5) has an identifiability problem for constant twist θ t , i.e., data in a single trajectory is insufficient to determine all the parameters. This derives from the symmetry L → L + l/ cos θ, v ± → v ± + κl for arbitrary l; thus only the two combinations v + − v − , v + − κL cos θ (and recombinations of these) are independent of the symmetry; thus the model is a priori unidentifiable. The sign constraints v + > 0 and v − < 0 limit this identifiability symmetry to −v + κ < l < −v − κ in addition to L + l/ cos θ > 0, where L and v ± are the unknown true values. This symmetry is lost if θ t varies over the trajectory; however the low variation of θ t leaves an a posteriori identifiability problem where the posterior distribution for L shifts towards higher values as the prior variance is increased (Fig. S1B). To infer the model we therefore require an informed prior on L; using a prior for L determined from nocadozole treated cells (Fig. S1A) we found all other parameters were estimatable from the data. We tested the sensitivity of estimated parameters to changes in the L prior mean (Fig. S1C) and found that apart from v ± (which was expected due to the nature of symmetry) almost all other parameters were robust, κ being the next most sensitive showing significant change only when the L prior mean was decreased by 20%. This confirmed that any error in determining the natural length (because of other effects of nocadozole treatment, such as compaction of the chromosomes) did not significantly affect our conclusions as regards v ± , and did not affect conclusions pertaining to the other parameters at all.

Model selection details 2.1 Brownian motion model
Brownian motion in an over-damped environment (with negligible inertia) can be modelled by the following stochastic differential equation where D is the diffusion coefficient. Integrating over a small time interval ∆t gives the approximate displacement, to order O(∆t) where η t ∼ N (0, s 2 ) and s 2 = 2D ∆t. Hence, frame-to-frame displacements are approximately For two sisters (X 1 ,X 2 ), the likelihood of a set of displacements {∆x k i } is where τ = s −2 is called the precision.

Brownian motion models with spring and drift
We also computed the Bayes factor of M coh against two other Brownian models: one with sister pairs connected by a Hookean spring but otherwise each undergoing a 1D BM, M spring-BM , and further extension of M spring-BM to allow for a constant drift acting on both sisters, M spring-drift-BM . The Bayes factors for these models largely agreed with that for M BM (Fig. S3A,B) so we did not compare these models further.

Marginal likelihoods
To compare the support for different models we computed marginal likelihoods to estimate Bayes factors, that is, the posterior odds ratios conditioned on two models. We employed algorithms due to Chib [3] and Chen [4] to take advantage of previously generated MCMC samples. Chib's method also requires generating further chains with blocks of the parameters fixed. We sequentially fixed each parameter in each subsequent chain, setting the fixed parameter to its posterior mean estimate. The extra chains incur considerable additional computational cost, therefore after checking that the difference in marginal likelihood between the two methods was negligible over a wide range of trajectories, we favoured Chen's method, which is based on the following identity for K samples from a converged MCMC chain where X is the displacement data, Θ * is a fixed point for the model parameters (here, the posterior mean estimate), Θ (k) and Σ (k) are parameter and hidden state samples from the MCMC run, respectively and g(Θ k |Σ (k) ) is an arbitrary density. To calculate π(X|Θ * ) we integrated out the hidden state Σ by importance sampling weighted by drawing samples from the posterior distribution for Σ. The choice of g affects the variance of the estimate. We set g to the product of Γ distributions fitted independently to the posterior samples for parameters v ± , κ, α, L, and τ , and Beta distributions similarly fitted for p c , p ic . This choice ensures that the density of g will not deviate too far from the joint posterior density. We tested that the resulting marginal likelihood estimate was independent of this choice by comparing with estimates derived using g(Θ|Σ) = π(Θ) and estimating the variance of both methods using overlapping batch statistics [4]. The variance under the fitted g was markedly lower but the mean estimates were consistent across trajectories. Having computed the marginal likelihoods for two models M 1 and M 2 , their Bayes factor is B 1/2 = π(X|M 1 ) π(X|M 2 ) and the data supports M 1 over M 2 when B 1/2 > 1.

Explained variance
The explained variance (EV) is a measure of how well a model accounts for the variance of a trajectory. Given the model parameters, hidden state and data we can compute the sum of squares where f k i is as in (7). Here V is the difference between the predicted displacements of the model and the observed displacements. By averaging over the posterior samples from the MCMC, we obtain the average explained variance, where is an estimate of the displacement variance of the two sisters. This is equal to the displacement variance if we subtract the mean displacement but this is close to zero for long time series.

Directional correlation statistic
In a Brownian motion the probability of a change in direction of motion (i.e., sign(∆x i ) = sign(∆x i+1 )) is 0.5 and is independent at each time-step. Hence the number of directional switches K in n time-points is binomially distributed The number of switches k in a trajectory is computed as the number of times the sign of the displacement of the centre position between the two sisters changes (i.e., sign(∆(x 1 i + x 2 i ))). We also examined subsampled displacements over the interval n∆t. Using this we define a statistic D n∆t = 1 − 2 min(P (K ≤ k), P (K > k)) ∈ [0, 1] A trajectory is not likely to be Brownian motion if D n∆t ≈ 1 due to correlations between successive displacements.

Live cell imaging and kinetochore tracking
HeLa-Kyoto (K) cells stably expressing eGFP-CENP-A / eGFP-Centrin1 were cultivated and maintained as described in [5]. Cells were seeded in 35 mm glass bottom dishes (MatTek Corporation) and the media changed to Leibovitz L-15 supplemented with 10% foetal calf serum prior to imaging. Cells were imaged using a 100X 1.4 NA oil objective on a confocal spinning-disk microscope (VOX Ultraview; Perkin and Elmer) with a Hamamatsu ORCA-R2 camera, controlled by Volocity 6.0 (Perkin and Elmer). Mitotic cells were first identified using bright-field illumination to minimise phototoxicity. Image stacks (25 Z-sections, 0.5 µm spacing) were collected every 2 seconds for 5 minutes (150 time-points per video). Camera pixels were binned 2 × 2 giving an effective pixel size of 138 nm in the lateral direction with 16 bits per pixel imaging depth. Exposure conditions were 50 ms per Z-slice using a 488 nm laser set to 15% power. 90% of cells entered anaphase after measurement demonstrating that cells were unharmed by phototoxicity. Image time-series were deconvolved using Huygens 4.1 (SVI) using a point-spread function measured from micro-bead images. Kinetochores trajectories were tracked as previously described [6] except we implemented a 3D Gaussian mixture model algorithm for refining spot positions to sub-pixel accuracy [7], based on an earlier 2D version [8]. MATLAB kinetochore tracking software (KiT) is available from http://mechanochemistry.org/mcainsh/software.php and will be described in detail in a forthcoming paper.

Nocodazole treatment for spring rest length determination
To estimate the spring rest length, we treated eGFP-CENP-A / eGFP-Centrin1 HeLa-K cells with nocodazole. This causes depolymerisation of all microtubules, thus relieving kinetochore pairs of tension. Cells were imaged in prometaphase as described in section 3, except with 31 Z-sections and 0.2 µm spacing. Higher Z resolution was necessary because kinetochores pairs were randomly oriented in 3D. Kinetochore trajectories were tracked as described in section 3.
To accurately determine the spring rest length, we fitted trajectories to a 1D harmonic oscillator model where D t is the distance between sister kinetochores at time t, κ is the spring constant and L is the rest length. s 2 quantifies the noise in the trajectory. We used an analogous MCMC algorithm to section 1, excluding any need to compute hidden states, to infer L from individual trajectories.