Markov State Models Reveal a Two-Step Mechanism of miRNA Loading into the Human Argonaute Protein: Selective Binding followed by Structural Re-arrangement

Argonaute (Ago) proteins and microRNAs (miRNAs) are central components in RNA interference, which is a key cellular mechanism for sequence-specific gene silencing. Despite intensive studies, molecular mechanisms of how Ago recognizes miRNA remain largely elusive. In this study, we propose a two-step mechanism for this molecular recognition: selective binding followed by structural re-arrangement. Our model is based on the results of a combination of Markov State Models (MSMs), large-scale protein-RNA docking, and molecular dynamics (MD) simulations. Using MSMs, we identify an open state of apo human Ago-2 in fast equilibrium with partially open and closed states. Conformations in this open state are distinguished by their largely exposed binding grooves that can geometrically accommodate miRNA as indicated in our protein-RNA docking studies. miRNA may then selectively bind to these open conformations. Upon the initial binding, the complex may perform further structural re-arrangement as shown in our MD simulations and eventually reach the stable binary complex structure. Our results provide novel insights in Ago-miRNA recognition mechanisms and our methodology holds great potential to be widely applied in the studies of other important molecular recognition systems.


Introduction
RNA interference (RNAi) is a key cellular mechanism involved in the regulation of gene expression and the defense against viruses [1,2]. As the central component of RNAi, the RNA induced silencing complex (RISC) mediates the sequence-specific recognition and inhibition of target mRNA [3,4]. At the core of the RISC, the Argonaute protein (Ago) in complex with the non-coding microRNAs (miRNAs) can selectively silence certain genes through specific binding to their mRNAs [5][6][7][8][9][10][11][12]. In human there are four Agos (hAgo1-4) and they together facilitate miRNA-based regulation of more than 30% of human genes [9,13]. While the miRNA libraries associated with different hAgos largely overlap, recent studies have identified exceptional miRNAs that bind specifically to only one hAgo [14][15][16]. Elucidating the molecular mechanisms of how Ago recognizes specific miRNAs is thus of vital importance not only for the fundamental understanding of RNAi but also for further development of miRNA-based therapeutics [17][18][19].
Protein-RNA recognition mechanisms are often interpreted via two popular models: conformational selection and induced fit [20,21]. In conformational selection, the RNA selectively binds to the bound conformation of the protein [22][23][24]. On the other hand, in induced fit, the RNA first binds to an unliganded protein conformation and further induces conformational changes of the protein to its bound form [25]. For example, the conformational selection model has been suggested in the recognition between human U2 snRNP auxiliary factor and polypyrimidine tract RNA [26]. On the other hand, the induced fit model plays a major role in the recognition between the Xenopus ribosomal protein L5 and the 5S rRNA [27]. Since both protein and RNA are large and flexible biopolymers, their recognition mechanisms often lie between the above two models [28][29][30]. Therefore, to dissect the protein-RNA recognition mechanisms, one needs to consider the chemical details of individual binding partners.
Despite intensive studies, the molecular mechanism of Ago-miRNA recognition still remains largely elusive. In the recently solved crystal structures of miRNAs in complex with eukaryotic Agos, miRNAs are deeply buried in the Ago binding groves, and this partially open bound conformation of Ago apparently prevents a direct loading of miRNA [31][32][33][34]. Therefore, a conformational selection model alone appears insufficient to describe the mechanisms of miRNA loading into Ago. Using a series of stopped-flow experiments, Deerberg et al. proposed a multi-step model for miRNA to load into human Argonaute-2 (hAgo2) [35]. However, to fill in their model with molecular details, further studies that elucidate the conformational dynamics of hAgo2 and miRNA are necessary.
An effective approach to investigate the possible role of the conformational selection in Ago-miRNA recognition is to examine if apo Ago can transiently reach a sufficiently open conformation to load miRNA. Molecular dynamics (MD) simulations offer a valuable tool to investigate the conformational dynamics of large biomolecules at atomic resolution. Previous MD studies at sub-microsecond timescales have demonstrated the impact of miRNA and double strand RNA on the conformational stability of the Ago complex [36][37][38]. However, solely using MD to model the complete loading process is extremely challenging due to the gap between the experimental timescale (at millisecond or longer) and that of all-atom MD simulations (typically at sub-microsecond). A feasible alternative approach is to address the above issue in two steps: to obtain the structural ensemble of the apo Ago protein with sufficient sampling followed by examining if any conformation in this ensemble can geometrically accommodate miRNA.
Markov state models (MSMs) hold great promise to describe the apo Ago conformational dynamics because it can bridge the timescale gap between MD simulations and experimental observations [39][40][41][42][43][44][45][46][47][48][49][50]. In an MSM, we coarse grain the conformational space into a number of metastable states, and simultaneously coarse grain the time in Δt. In this way, fast protein motions can be integrated out. When Δt is longer than the intra-state relaxation time, the model becomes Markovian. In other words, the probability for the system to visit a certain state at the next time step (t+Δt) is solely determined by its location at the current time step t. If the model is Markovian, we can model the long timescale dynamics using the first order master equation (see Eq (1)). In recent years, MSMs have been widely applied to study conformational dynamics of biopolymers [41,[51][52][53][54][55][56][57][58][59][60][61].
Protein-RNA docking is a powerful tool to further examine if there exists apo Ago conformations that are open enough to accommodate miRNAs. The HADDOCK approach is particularly suitable for studying Ago-miRNA complexes because it is implemented with a flexible search algorithm that can benefit from the available information such as mutagenesis experiments and crystallographic contacts [62,63]. Performing HADDOCK simulations on apo Ago conformations identified by MSMs may provide an effective strategy to overcome the major limitation of protein-RNA docking, i.e. sufficiently considering the flexibility of both protein and RNA [64]. Furthermore, we can perform MD simulations initiated from docking poses to investigate subsequent conformational changes potentially induced by the binding.
In this work, we combine MSM with large-scale protein-RNA docking to elucidate the molecular mechanisms of miRNA loading into hAgo2. Our MSM identifies an open state of apo hAgo2 in fast equilibrium (at microsecond) with other metastable states. This open state has a characteristic structural feature: the miRNA binding groove is largely exposed to the solvent. Subsequent protein-RNA docking simulations confirm that a fraction of conformations in this state are sufficiently open to accommodate miRNAs. In further MD simulations initiated from the docking models, the complex undergoes structural re-arrangements to reach conformations closer to the crystal binary structure. Based on these observations, we propose a two-step mechanism for the recognition between miRNA and hAgo2: selective binding followed by structural re-arrangement.

Results/Discussion
Elucidating conformational dynamics of apo hAgo2 using MSMs A 480-microstate MSM was initially constructed with the APM algorithm [65] from over 394,000 conformations based on the Euclidean distances of the two regions, namely the PAZ domain and two loops of the PIWI domain (the minor loop P601-P609 and the major loop V818-D838, see S1C Fig). These two regions were chosen since they are the most flexible moieties of apo hAgo2 dynamics in our MD simulations (see S1A Fig). Moreover, the MID domain displays the smallest conformational flexibility among all domains (see S1A Fig). For the N domain, even though it is more flexible than the MID domain, its location is away from the miRNA binding groove and thus may not directly affect the miRNA loading. Our model has been further validated by residence probability tests [45,54], in which the probability for the system to remain in a certain state predicted by the production MSM is in reasonable agreement with the probability obtained directly from MD trajectories (see S2B Fig).
Due to the large number of states, it is challenging to use our validated 480-microstate MSM for human appreciation of hAgo2's conformational features. Therefore, we further lumped the microstates into seven metastable macrostates using the APM algorithm (see Methods for details). However, the residence probability tests show that the produced macrostate MSM predicts faster kinetics than the original MD simulations (see S2D Fig), even though the slowest implied timescales obtained from both MSMs are consistent (see S2C Fig). Therefore, all the quantitative properties reported in this study such as equilibrium populations and mean first passage times are obtained from the well-validated 480-microstate MSM.
According to the degree of opening, the metastable conformational states of hAgo2 can be divided into three groups: an open state (19.0%), a partially open state (55.8%), and various closed states (see Fig 1A). Interestingly, the opening motion of hAgo2 can be effectively described by two PIWI loops and their interactions with the PAZ domain. The conformation of the PIWI loops in combination with their center-of-mass (c.o.m.) distance to the PAZ domain can not only separate the open state from the partially open state (see Fig 1B), but also distinguish five closed states (see S5 Fig). In these closed states, the PIWI domain can form direct interactions with the PAZ domain through multiple residues on both minor (e.g. D605) and major PIWI loops (e.g. E821, D823, E826, see S1B Fig).
Strikingly, the open state conformations can reach a widely open form with the c.o.m. distance between PAZ and PIWI loops as large as 46Å (see Fig 1A). In these conformations, the miRNA binding groove is already fully exposed to the solvent (see the right panel of Fig 1C), which strongly suggests that they may geometrically allow a direct loading of miRNA. In the next section, we will further examine this possibility via protein-RNA docking. Moreover, our MSM shows that the MFPTs between the open and closed state are only at an order of tens of microseconds (see S1 Table). These results indicate that the open conformations may be readily available to bind to miRNA through diffusion-controlled collision in the cellular environment. We also notice that these open conformations are even substantially more open than the binary partially open crystal structure (see the green cross in Fig 1B). These observations suggest that if miRNA can directly load into hAgo2's open conformations, subsequent structural re-arrangements are necessary to reach the stable binary structure.

Modeling miRNA loading into open hAgo2 conformations by protein-RNA docking and subsequent structural re-arrangements by MD simulations
To examine if the open conformations are able to accommodate miRNA, we performed largescale protein-RNA docking using HADDOCK [62,63] on 150 selected open conformations of hAgo2. In particular, these conformations were selected from microstates with the c.o.m. distance between the PAZ domain and PIWI loops larger than 25Å (see Methods for details). For each docking simulation, we produced 50 hAgo2-miRNA structures. By computing their fraction of native contacts (f nat , see Eq (5)), we further show that the energy-based scoring function of HADDOCK can faithfully evaluate the quality of the docking structures. As shown in S6 Fig, the plot of the HADDOCK score against the fraction of native contact displays a nice funnel shape, indicating that the structure with the lowest HADDOCK scoring energy also preserves  the most native contacts. We thus collected the lowest scoring energy structure from each of the 150 docking simulations for further examination.
Some of the lowest-energy docked structures have both 5' and 3' miRNA termini successfully anchored in the binding groove of hAgo2 (see red points in Fig 2), while most other structures have only one end of miRNA forming proper contacts with the protein (see black points in Fig 2). This observation indicates that not all hAgo2 conformations with a large distance between the PAZ and PIWI domains could geometrically accommodate miRNA. Other structural features such as the orientation of PIWI loops may also be necessary to define a sufficiently open binding groove of hAgo2. Indeed, in most successful docking models, the major PIWI loop forms an angle of over 100°with the rigid part of the PIWI domain (see Fig 2). Finally, we found that in our successfully docked structures, even though miRNA forms correct contacts with hAgo2, the Argonaute protein itself adopts a structurally more open conformation than that in its binary crystal (see S7 Fig). This indicates that structural re-arrangement is necessary after the initial binding of miRNA to hAgo2.
Initiating MD simulations from successfully docked structures, we observed substantial structural re-arrangement that brings the docked conformations closer to the binary crystal structure. In particular, we performed five independent 20-ns MD simulations from each of the three representative successfully docked conformations. MD simulations from these three representative conformations displayed decreased distance between the PAZ domain and PIWI loops, indicating that hAgo2 tends to close upon initial miRNA binding (see representative MD trajectories in Fig 3A and S8 Fig). We further show that such structural re-arrangement brings the hAgo2 conformation closer to the crystal structure (see the binding interface RMSDs in the upper panel of Fig Fig 3B). In Fig 3C, we compare the binary crystal structure with a representative conformation after MD simulation, and these two structures display a high degree of structural similarity. Due to their limited lengths, our MD simulations could not fully reach the binary crystal structure, but they strongly suggest that protein-RNA interactions upon the initial miRNA loading may facilitate the further structural re-arrangement to reach the stable binary complex conformation.

A model for the loading of miRNA into hAgo2
Based on our results, we propose a two-step model of selective binding followed by structural re-arrangement model for the recognition mechanism between miRNA and hAgo2 (see Fig 4).  the collision rate between hAgo2 and miRNA under physiological conditions (1 × 10 −6 Ms −1 , see S1 Text for calculation details). These results indicate that the initial hAgo2-miRNA recognition could involve selective binding of open hAgo2 by miRNA. Subsequent MD simulations initiated from docked conformations suggest that protein-RNA interactions may further induce conformational changes, and eventually this will drive the system to reach the stable binary conformation.
Notably, the selective binding step in our two-step mechanism has clear features of the conformational selection because the open hAgo2 state pre-exists in equilibrium with other conformational states, and miRNA selects to bind to this state with high binding affinity (see Fig 4). While in the original conformational selection mechanism the ligand usually selects the protein conformation with the highest affinity, our study suggests that miRNA cannot selectively bind to the partially open conformation of the highest affinity due to steric hindrance to the binding groove (see Fig 2 and S12 Fig). Instead, it chooses the open hAgo2 with the medium affinity as illustrated by our model (see Fig 4). This scenario, where the ligand selects a relatively high-affinity conformation but not the highest, has been previously observed in a number of molecular recognition system including ubiquitin binding [66] and LAO proteinarginine interactions [67], and sometimes regarded as the "extended conformational selection" model [21]. As pointed out by Boehr et al., conformational selection mechanism allows more promiscuous binding than induced fit [28]. Proteins that adopt this recognition mechanism may potentially bind to a larger pool of targets. Therefore, our model may help understand the ability of hAgo2 to bind to miRNAs of different lengths and sequences.
Our simulations complement previous experiments and fill in molecular details for the understanding of hAgo2-miRNA recognition mechanisms. Deerberg et al. showed that three rate constants could nicely fit the kinetics of RNA binding to hAgo2 from their stopped-flow experiments [35]. Based on these observations, they proposed a 3-step model: formation of hAgo2-RNA collision complex, anchoring of the 5' terminus of the RNA and anchoring of the 3' terminus [35]. Interestingly, the initial hAgo2-RNA collision rate was orders of magnitude faster than the subsequent anchoring of RNA termini in their model. Based on our simulations, we suggest that the initial hAgo2-miRNA collision observed in stop-flow experiments could contain the preferential binding of miRNA to an open conformation of hAgo2. The structural re-arrangement of the complex after the initial loading could include anchoring of miRNA termini in a sequential order. Since a significantly larger number of contacts are formed between hAgo2 and 5' miRNA (43 contacts) compared to the 3' miRNA (only 5 contacts) in the crystal structure, we speculate that 5' miRNA may successfully anchor to the binding groove of hAgo2 first. To further investigate this issue, one needs to explicitly simulate the collision and binding between flexible miRNA and hAgo2 in solution. Considering the experimental timescale (at 100 seconds [35]), this will be extremely challenging for all-atom MD simulations. We also note that the structural re-arrangement step in our proposed model can also be regarded as an induced fit mechanism. Indeed, our results do not rule out the possibility that induced fit mechanism plays a role during the initial hAgo2-miRNA recognition. However, it may be difficult for the induced fit model alone to explain multiple distinct timescales observed in the stop-flowed experiments.
Our simulations also generate predictions that can be tested by experiments. For example, two mutants (Y529A and Y529E) were designed with modified binding pocket for the 5' terminus of miRNA. MD simulations of these two mutants display distinct features: Y529A mutant largely preserves the interactions between miRNA and hAgo2, while Y529E mutant significantly disrupts the protein-RNA interactions, resulting in substantial increase in the distance between miRNA and its binding pocket (see Fig 5). We note that even though one may need much longer simulations to thoroughly examine the stability of the hAgo2-miRNA complex, within the timescale we simulate (at 100ns), we already clearly see that the Y529E mutant forms a less stable complex with miRNA compared to the WT and the Y529A mutant. These predictions are consistent with previous experiments showing that miRNA can bind to the Y529A mutant, but not the Y529E mutant [68,69]. More interestingly, since our MSM shows that interactions between the PIWI loops and the PAZ domain can stabilize the apo hAgo2 in the closed state, we designed three modifications on PIWI loops (single mutant D823A, triple mutant E821A-D823A-E826A and deletion mutant ΔP602-D605ΔD819-Q833) that may cause hAgo2 to favor more the open state. Initiated from a closed conformation, the WT hAgo2 stays in the closed state throughout the 50-ns MD simulations (see Fig 6A). This is expected as our MSM predicts that it may take tens of microseconds for the WT hAgo2 to diffuse out of this state (see S10 Fig). However, the deletion mutant (Δ602-605Δ819-833) undergoes fast conformational changes and reaches the open conformation within 20ns (see Fig 6A). The other two mutants (D823A and E821A-D823A-E826A) also quickly escape from the closed conformational state and reach partially open conformations (see Fig 6A). As shown in Fig 6B, the mutant simulations initiated from a partially open conformation are mostly consistent with those started from the closed conformation. Since this initial hAgo2 conformation is extracted from a partially open crystal structure of hAgo2-miRNA complex, the removal of miRNA may leave space and further lead to the collapse of the hAgo2 to reach the closed state in the WT MD simulations. As controls, we have also performed MD simulations of the WT hAgo2 and the mutants initiated from an open conformation (see Fig 6C). As expected, all the systems remain in the open conformation throughout the MD simulations. Taken together, the above observations suggest that the PIWI loops mutations are likely to accelerate the transition from the closed to the open conformation by destabilizing the closed state. We hence predict that these mutations could facilitate the initial binding of miRNA by modulating hAgo2's conformational dynamics to make its open state more accessible, even though these mutants are located far from the protein-RNA binding interface. We note that the interaction between PAZ and the PIWI loops not only regulates the molecular recognition between hAgo2 and miRNA, but could also play a role during mRNA recognition by hAgo2-miRNA complex (see S2 Text for detailed discussions).
In conclusion, by combining MSMs, large-scale protein-RNA docking, and MD simulations, we propose a two-step mechanism for the molecular recognition between miRNA and hAgo2: selective binding followed by structural re-arrangement. Using MSMs, we identified an open state of apo hAgo2 in rapid equilibrium with partially open and closed states of this enzyme. Conformations in this open state can contain a widely open binding groove, which are shown to be able to accommodate miRNA in our protein-RNA docking simulations. We thus suggest that the initial binding of miRNA to hAgo2 adopts a selective binding mechanism. This initial binding complex may undergo further structural re-arrangement to finally reach the stable binary complex structure. Our model is consistent with previous experimental results, and fills in more details at molecular level. We have also made predictions that can be tested experimentally. Our studies provide novel insights in fundamental mechanisms of how the Argonaute protein recognizes miRNA, which plays an important role in the gene regulation by RNA interference. The methodology here also holds great promise to be widely applied to investigate molecular recognition mechanism of other biological event, such as protein-protein interactions and protein-ligand binding.

MD simulation setup
The initial apo hAgo2 conformation was taken from the crystal structure of hAgo2 in complex with miR-20a (PDB ID: 4F3T) [32], and the missing residues were added using MODELLER [70][71][72][73]. The apo hAgo2 protein was solvated in a dodecahedron box containing 42,847 SPC water molecules [74] with 129 Na + ions and 159 Clions to neutralize the charge. All the MD simulations were performed using the GROMACS 4.5.4 package [75] and the Amber99S-B-ILDN force field [76]. Long-range electrostatic interactions were treated with the Particle-Mesh Ewald method [77]. Both short-range electrostatic interactions and van der Waals interactions used a cutoff of 10Å. All bonds were constrained by the LINCS algorithm [78]. Velocity-rescaling thermostat [79] and the Parrinello-Rahman barostat [80] were used for temperature and pressure coupling respectively. The system was first energy minimized with the steepest descent algorithm and equilibrated for 1ns under NPT ensemble (T = 310K, P = 1atm) with the positions of all the heavy atoms restrained. Next, we performed a 5-ns NPT simulation (T = 310K, P = 1atm) to further equilibrate the system. Finally, all the production MD simulations were performed under NVT ensemble at 310K.
After the equilibration, we first performed six independent MD simulations ranging from 50 to 100ns which summed up to~400ns. We then divided the conformations obtained from these six MD simulations (saved every 20ps with a total of~20,000 conformations) into 20 clusters using the K-centers algorithm [81]. We randomly selected one conformation from each cluster and initiated a second round of 150-ns MD simulations with one simulation from each of these 20 conformations. Next, we extracted~170,000 conformations from the first two rounds of MD simulations and built an MSM containing 553 microstates that were further lumped into 9 macrostates. However, the 553-microstate MSM predicted kinetics inconsistent with the original MD simulations, indicating the necessity of additional sampling (see S13 Fig).
We thus seeded the third round of MD simulations from conformations taken from each of the 9 macrostate in the initial MSM with the number proportional to their predicted equilibrium populations. In this round, we performed 30 150-ns MD simulations with a saving interval of 20ps. Combining MD simulations from all three rounds, we obtained a dataset containing over 394,000 conformations extracted from~8μs MD simulations.

Construction of MSM
When constructing MSMs, one divides the conformational space into a group of metastable states, and coarse-grains time into a discrete interval Δt. If the model is Markovian, an MSM can predict the long-timescale dynamics via the first-order master equation: where p(nΔt) is the vector describing state population and T(Δt) is the transition probability matrix of the lag time Δt.
To construct MSMs from MD simulations, we applied our recently developed Automatic state Partitioning for Multi-body systems (APM) algorithm [65]. The main insight of the APM algorithm is to take into account the kinetic information when performing the geometric clustering, which is often based on the RMSD between pairs of MD conformations. This is achieved by splitting the conformations using a divide-and-conquer scheme until all the resulting microstates has a common maximum residence time t 0 . The residence time of a certain microstate i is measured by relaxation of the transition probability out of this state or escaping probability P(i, t): where m j (t) indicates which microstate the system is in at time t. T j is the length of the j-th MD trajectory with conformations stored at a time interval of s. We use the lifetime (t i ) of microstate i, estimated by P(i, t i ) = 1-1/e, as the indicator of its residence time. If t i < t 0 (t 0 is the predetermined maximum residence time), the system can relax out of a microstate within t 0 . The detailed procedure of the APM algorithm is as follows [65] (see S14 Fig): (1) Perform a geometric clustering using the K-centers algorithm [81] to divide MD conformations into two microstates. (2) Examine the residence time of each microstate. For microstate i, if t i ! t 0, we further split it until t i < t 0 . We can then obtain a set of microstates with an upper limit on the residence time. (3) Lump kinetically related microstates into metastable macrostates using spectral clustering [82]. (4) Perform multiple iterations of re-splitting and re-lumping to remove potential internal free energy barriers within microstates.
To apply the APM algorithm, we chose the maximum residence time t 0 = 20ns, and this resulted in 480 microstates that were subsequently lumped into seven macrostates by spectral clustering [82]. To compute the distance between each pair of conformations, we first aligned all MD conformations to the crystal structure against its PIWI Cα atoms, and then computed Euclidean distances based on Cα atoms of the PIWI loops and every 3rd Cα atoms of L1L2 linking regions in the PAZ domain. In this way, both PIWI loops and PAZ-L1L2 regions, the two critical components that determine the magnitude of opening of hAgo2, have relatively equal contributions to the pair-wise distance.

Validation of MSM
The implied timescale τ t is defined by the following equation: where λ k is the k-th largest eigenvalue of the transition probability matrix of τ.
As shown in S2A Fig, the implied timescale plots of the 480-microstate MSM level off at around 20ns, indicating that the model is Markovian at this or longer lag time. In our production MSM, we selected 20ns as the lag time. To further validate our model, we also performed residence probability tests [45,54]. During these tests, we compared the MSM-predicted probability of the hAgo2 staying in a certain microstate with the probability derived by counting the transitions in MD trajectories. As shown in S2B Fig, the two probabilities agree well with each other, suggesting that our MSM captured the kinetics of apo hAgo2 sufficiently well.
We further lumped the 480 microstates into seven macrostates because of the presence of a clear gap between 6 th and 7 th slowest timescales in the implied timescale plots of the microstate MSM. The implied timescale plot of the 7-macrostate MSM was congruent with that of microstate MSM (see S2C Fig). However, this model predicted faster dynamics than that shown by MD trajectories during the residence probability tests (see S2D Fig). Therefore, the 7-macrostate MSM model was used for visualization of the state decomposition and the 480-microstate MSM was used for calculating quantitative properties reported in this study. For example, we summed over the equilibrium populations of all the microstates that belong to a certain macrostate to obtain their populations.

Mean first passage time (MFPT)
MFPT was used to estimate the dynamics between the open and the closed states. It is defined as the average time it takes for the system to visit the final state f from the initial state i. We computed MFPT between the open and the closed states by solving the following equation: where T(τ) ij is the transition probability from state i to state j, τ is the lag time with X ff = 0.
To compute the MFPT between a pair of macrostates from the 480-microstate MSM, we set MFPTs to be zero for all the microstates that belong to the final macrostate. We then computed MFPTs starting from each of the microstates that belong to the initial macrostate, and performed a weighted average over these MFPTs according to the normalized equilibrium populations of these microstates predicted by the 480-microstate MSM. We also performed a cross validation of the MFPTs calculated from our MSM by evenly dividing the dataset into two non-overlapping sub-datasets. MFPTs predicted from MSMs built from these two subdatasets are in reasonable agreement with each other, and are also consistent with the MFPTs reported using the whole dataset (see S15 Fig).

Protein-RNA docking with HADDOCK
Protein-RNA docking simulations were performed with HADDOCK 2.1 [62,63]. Since the RNA binding groove is completely blocked or largely hindered by the PIWI loops in the closed or partially open hAgo2 conformations (see S4 Fig for representative conformations), these conformations are unlikely to allow the direct miRNA binding. Therefore, we selected hAgo2 conformations that are as open as possible (with PAZ-PIWI loops c.o.m. distance > 25Å) and obtained 15 microstates. From each of these 15 microstates, we randomly selected 10 conformations for docking. The input miRNA structure was modeled using ModeRNA [83] based on miRNA fragments in the crystal structure (PDB ID: 4F3T).
The inter-and intra-molecular interactions were calculated by CNS 1.3 [84,85] with PAR-ALLHDG5.4 [86,87] and the OPLS-AA force field [88] applied to protein and miRNA respectively. For each docking simulation, topologies and coordinates of hAgo2 and miRNA were generated from input structures using CNS. The two molecules were initially separated 25Å away before the randomization and rigid body energy minimization. 200 docking poses were generated after the rigid body docking and ranked by the HADDOCK score, a weighted sum of various energy terms (electrostatic, van der Waals, desolvation, ambiguous interaction restraints and buried surface area). The 50 best-scoring poses were subsequently used for semiflexible simulated annealing and the final solvated refinement.
Besides the HADDOCK score, we also employed an adapted definition of the fraction of native contacts (f nat ) [89] to examine the quality of docking structures. The fraction of native contacts is defined as: where q pose and q crystal denote the numbers of protein-miRNA contacts (distance < 4Å) observed in the docking pose and in the crystal structure, respectively. In total we collected 48 hAgo2-miRNA contacts from the crystal structure (see S3 Table).
Since HADDOCK cannot fully consider the conformational dynamics of hAgo2-miRNA complex upon the initial binding, we note that subsequent MD simulations are required to refine the docking poses.  Table. Unambiguous distance restraints for HADDOCK simulations. The restraints were selected from the closest contacts between hAgo2 and terminal nucleotides of miRNA in the crystal structure (PDB ID: 4F3T). (PDF) S3 Table. hAgo2-miRNA native contacts from the crystal structure. All the hAgo2-miRNA distances within 4Å are recorded. (PDF) S1 Text. Comparison between apo hAgo2 dynamics and kinetics of hAgo2-miRNA collision. (PDF) S2 Text. PAZ-PIWI loops interaction could enhance the fidelity of mRNA target recognition. (PDF)