Conformational Dynamics of Response Regulator RegX3 from Mycobacterium tuberculosis

Two-component signal transduction systems (TCS) are vital for adaptive responses to various environmental stresses in bacteria, fungi and even plants. A TCS typically comprises of a sensor histidine kinase (SK) with its cognate response regulator (RR), which often has two domains—N terminal receiver domain (RD) and C terminal effector domain (ED). The histidine kinase phosphorylates the RD to activate the ED by promoting dimerization. However, despite significant progress on structural studies, how RR transmits activation signal from RD to ED remains elusive. Here we analyzed active to inactive transition process of OmpR/PhoB family using an active conformation of RegX3 from Mycobacterium tuberculosis as a model system by computational approaches. An inactive state of RegX3 generated from 150 ns molecular dynamic simulation has rotameric conformations of Thr79 and Tyr98 that are generally conserved in inactive RRs. Arg81 in loop β4α4 acts synergistically with loop β1α1 to change its interaction partners during active to inactive transition, potentially leading to the N-terminal movement of RegX3 helix α1. Global conformational dynamics of RegX3 is mainly dependent on α4β5 region, in particular seven ‘hot-spot’ residues (Tyr98 to Ser104), adjacent to which several coevolved residues at dimeric interface, including Ile76-Asp96, Asp97-Arg111 and Glu24-Arg113 pairs, are critical for signal transduction. Taken together, our computational analyses suggest a molecular linkage between Asp phosphorylation, proximal loops and α4β5α5 dimeric interface during RR active to inactive state transition, which is not often evidently defined from static crystal structures.


Introduction
Microorganisms have been evolved with many sophisticated signal transduction systems to rapidly respond to various kinds of external and/or internal stimuli. Two-component systems (TCSs) have emerged as a major signal transduction pathway in microorganisms [1]. A typical TCS contains a membrane-integrated sensor histidine kinase (SK) and a response regulator (RR). A SK, which senses and interprets stimuli to activate its cognate cytoplasmic RR through The simulated RegX3s adopts a reasonably compact conformation, which is clearly different from its original active state (rmsd of 12.5 Å for all Cα atoms) (Fig 1A and S2 Fig). Globally, the ED swings 174°toward the RD with α4 rotation of 51.6°, resulting in a compact conformation resembling MtrA and PrrA, when compared with the active RegX3 (S2A Fig). Partial helical unwinding and rotation of α4 helix change its original perpendicular positioning to a position parallel to the RD of RegX3s as in PhoB, TorR, MtrA and YycF. The hydrophobic patches Val 88, 89 and Leu 93 are protected by β4 and β5. In addition, the helical unwinding of α4 leads two amino acids extension of loop β4α4. Meanwhile, α5 rotates 41°in the same direction as α4, completing the α4β5α5 face in an almost parallel conformation to both domains ( Fig 1A). During the simulation, α4β5α5 and the N-terminal region of α1 appear to have significant conformational changes ( Fig 1B and S3 Fig). In contrast, the ED keeps a rigid conformation as active RegX3 during simulation (rmsd of 1.2 Å for 96 Cα atoms) (S2C and S2D Fig).
More importantly, the signature residue Tyr 98 of the α4β5α5 face adopts an outward conformation as in inactive states of other RRs, while it flips without any rotameric changes to its coupling residue Thr 79 in original active RegX3. The side chain of the second signature switch residue Thr 79 adopts a new conformation that faces against the active pocket ( Fig 1C). Together, we concluded that RegX3s has adopted a putative inactive conformation of RegX3 generated from MD simulation.

Conformational dynamics of loops
Several loops surrounding the active site were proposed to be vital for RR phosphorylation and global conformational changes [21]. In order to examine their dynamics in RegX3, the trajectories for β1α1, β3α3 and β4α4 loops from both active and inactive states were assembled and analyzed in LigPlot + and Bio3d.
Loops β1α1 (amino acids 8-10) and β4α4 (amino acids 79-82) stabilize the active RegX3 conformation through interactions with trivalent ions [13]. Loop β1α1, composed of two glutamic acids and one aspartic acid, exhibits the least conformational fluctuation (rmsd of 0.5 Å for 3 Cα atoms) during active to inactive transition ( Fig 3A). In contrast, loop β4α4, as a connection between α4β5α5 face and the RD core region, shows a slightly larger conformational change (rmsd 0.8 Å for 4 Cα atoms) (Fig 3C). Both loops form an interaction network with La 3+ ion and water through Arg 81 , Glu 84 , Glu 10 and Asp 9 together with Asp 52 in active RegX3 ( Fig 3D). However, significant conformational shift of loop β4α4 in inactive RegX3s renders Arg 81 to make two hydrogen bonds directly with Asp 52 and projects Asp 9 and Glu 10 out of the phosphorylation site ( Fig 3E). During transition the observed dynamics of these loops are in good agreement with previously published data (20,36).
Taken together, these data from the MD simulation suggest that the loop β4α4 extended from rotation and partial helical unwinding of α4 causes Arg 81 to establish a completely different interaction network in inactive RegX3s. It is interesting to note that loop β3α3 (amino . The alignment was based on core folding of ED. α4β5α5 face elements are labeled. Rotation degree of α5 is indicated with arrow in the lower half of the structure. Perpendicular and parallel switch for β4α4 are marked in the upper half of the structure. (B) High mobility of RegX3 residues based root-mean-square fluctuation (rmsf). (C) Active pocket stabilized by key residues together with a lanthanum ion in the active conformation of RegX3 in comparison with simulated RegX3s. The alignment and color scheme are the same as panel A. Coupled residues Thr 79 and Tyr 98 together with other important residues around active site are labeled with sticks. The green sphere represents Lanthanum ion.  acids 53-59) exhibits the most conformational fluctuations (rmsd of 2.0 Å for 7 Cα atoms) among these three loops during the active-inactive transition (Fig 3B), but it does not appear to change the interaction network in phosphorylation site (Data not shown).

Conformation transition network
Low frequency normal mode analysis (NMA) of elastic network model (ENM) is a preferable approach to capture long-range conformational change, which is otherwise very expensive for MD [24][25][26]. The low frequency NMA modes depict the functional relevance. The estimated collective parameters of motions were calculated for the active to inactive transition of RegX3. The collective overlap score for the first 10 modes was 0.95, which was dominated by mode 2 (the core of 0.84), indicating that the transition of RegX3 structures is functionally relevant (Fig 4A). The NMA non-zero mode 2 revealed a closure rotation of RD towards ED using a hinge region α4β5, suggesting that α4β5 effectively controls the fluctuation of RegX3. To investigate dependency of the total overlap on α4β5α5 face during the active to inactive transition, we performed NMA again using α4β5α5 face as a pocket region. The initial modes were dominant and scored approximately 0.64, suggesting dependency of the α4β5α5 face in RegX3 conformational transitions ( Fig 4B). Consistently, the coherent nature of the pocket residue analysis further indicated an importance of the α4β5 region but not α5 in RegX3 function ( Fig 4C). It is interesting that this analysis also showed a relatively small signal for Asp 82 . To further refine α4β5 as a critical region in transition, we performed a hot-spot analysis. Residues Tyr 98 to Ser 104 in RegX3 exhibited high σω value in the overall elastic distortion, and thus considered as the "hot-spot" residues ( Fig 4D).
These hot-spot residues may likely be important to modulate global dynamics between active and inactive states of RRs in long-range conformational communications. Tyr 98 has been shown to play a key role in allosteric transition of OmpR/PhoB family whereas Lys 101 interacts with the side chain of Thr 79 only in active RR [3,13]. Taken together, our NMA analysis indicates that residues Tyr 98 to Ser 104 of α4β5 region are the most important in RegX3 conformational switch.

Coevolution analysis
We next performed a coevolution analysis to search for critical residues connecting the dimeric interface and global conformational changes. We collected 29,283 members of OmpR/PhoB family and analyzed them using a program GREMLIN, which constructs a global statistical model, assigns probability to each residue and calculates all evolving positions ( Fig 5). The top 23 pairs of coevolved residues involved in dimeric interface were selected within the cutoff distance of 5 Å (Table 1).
We then compared these coevolved residues in RegX3, PhoB (1ZES), TorR (1ZGZ) and YycF (1NXP) in OmpR/PhoB family. Residues of 19 pairs were involved only in stabilization of RegX3 dimer and not of PhoB, TorR and YycF, which were therefore excluded from our analysis ( Table 1, italicized). The remaining four pairs are conserved in dimer interfaces of active PhoB, TorR and YycF (Fig 6). Lys 87 -Glu 107 pair does not contribute to RegX3 dimerization although it surely is one of key residues in PhoB, TorR and YycF. In contrast, Asp 97 -Arg 111 pair is an important player in RegX3 since it stabilized conformation of β5 and α5 by a salt bridge [14]. Therefore, Glu 24 -Arg 113 , Asp 97 -Arg 111 and Ile 76 -Asp 96 were critical residues to connect the dimeric interface with global conformational changes during active to inactive transition of RegX3.

In silico mutagenesis
To further study these coevolved residues on RegX3 conformational changes, we subjected the active RegX3 with a series of alanine mutations for residues Glu 24 , Ile 76 , Asp 96 , Asp 97 , Arg 111 and Arg 113 to 5 ns MD simulation. The original active RegX3 was also set up to generate RegX3-5ns as a control. Trajectories from these MD simulations were assembled by Nosé-Hoover constant pressure and temperature (NPT) system, and analyzed by PCA for variances of the first 2 PCs, leading to estimation of mutational effects for these single or double mutations It is noteworthy that Ile 76 is important to mediate domain-swapped dimer only in RegX3 but not for other OmpR/PhoB family [13]. Together, all these mutations seem to release constrains on conformation of α4β5α5 region compared with wild-type RegX3 in our MD experiments, suggesting that these key residues identified from the coevolution analysis are indeed important for RegX3 structural dynamics.
We then analyzed these structures of RegX3 mutants for global conformational changes with OmpR/PhoB family members using Bio3d [27] and differences in these structures were presented in PCA planes (Fig 7A). DrrB and DrrD form one inactive group with an extended conformation. The simulated RegX3s is clustered next to PrrA and MtrA as another inactive group with a compact conformation, whereas RegX3 clusters with RegX3-5ns, retaining an active group. The mutants mRegX3(96), mRegX3(97), mRegX3(111), mRegX3(97,111) and mRegX3(76,96) are clustered near the active group. Interestingly, mRegX3(113) and mRegX3 (24,113) appear to attain a transitional conformation between the active and inactive groups.
Together, our in silico mutagenesis experiments indicates different effects of these coevolved residues on conformational changes of RegX3. The conservation of these coevolved residues depicts their importance for OmpR/PhoB family. Interestingly, the residues responsible for these structural distributions are in N-terminal RD rather than C-terminal ED (Fig 7B), further signifying the dominant role of RD in regulating the global conformation of a RR.  Table 1.

Discussions
In OmpR/PhoB family, phosphorylation actuates dimerization through two possible pathways: rearrangement of RD and ED domains that overcomes dimerization inhibition and subtle conformation changes of some critical residues at the dimer interface [28,29]. However, connections between these two pathways in conformational dynamics of a RR have not been thoroughly investigated. In this study we generated an inactive state of RegX3, a RR in OmpR/ PhoB family through 150 ns MD and compared it with its original active state by a variety of computational analyses. Further analysis by coevolution identified several critical interaction residues, which were fully supported by in silico mutagenesis. All these portray a coherent picture of molecular dynamics of OmpR/PhoB family during active to inactive transition.

Conformational dynamics of RegX3
Inactive state of RegX3 generated from 150 ns MD has a compact global conformation and typical rotameric conformation of Tyr 98 and Thr 79 as other inactive RRs (Figs 1 and 2). A hinge motion of α4β5 region bears a sufficient flexibility for RegX3 to mediate the active to inactive transition process (Fig 4). Indeed, phosphorylation of NtrC, FixJ, DctD, Spo0F and PhoB has been shown to induce conformational changes mainly in loop β4α4 and helix α4 [30][31][32][33][34]. α4 is also structurally unstable in other RRs such as CheY, DrrD and BaeR [10,11,35]. In RegX3, helix α4 adopts a perpendicular position that is closer to DrrD and BaeR, however in inactive state it rotates 51.6°occupying a parallel position to the core RD as in PrrA and MtrA [12,15]. The rotation of α4 is accompanied by β4α4 loop that subsequently exposes Thr 79 and partially Tyr 98 as in inactive DrrD (Fig 1C). Such inactive rotameric state of Tyr 98 has been suggested as the third state for inactive RRs [17]. Loop β4α4 is shown with more dramatic conformational changes than loop β1α1 and β3α3 during active to inactive transition, which breaks down the interaction network built by Thr 79 , Glu 8 , Glu 10 , Asp 52 and Glu 84 together with Arg 81 . Consequently, the conserved Arg 81 replaces its interaction partner with Glu 8 and Asp 52 to achieve the inactive state (Fig 3). On the other hand the re-positioning of Arg 81 , Asp 9 and Glu 10 in inactive state allows 5.7°rotation of Nterminal helix α1 (S3 Fig). A slow motion of helix α1 has been shown critical for interactions with KinA, Spo0B and RapB [36]. In addition, these loops stabilize the phosphorylatable Asp 52 in active state of RegX3, which is important in dephosphorylation for specific recognition by sensor histidine kinases [37,38]. The re-orientations of Thr 79 , Arg 81 and Glu 84 render α4 in a parallel position to the core RD, which is further supported by the 41°axial rotation of α5 that secures a compact conformation of RegX3s through new contacts of Arg 113 and Asp 160 , Arg 106 and Asp 180 and Arg 117 and Glu 163 on inter-domain interface (Fig 2B). At sub-family level of RRs (OmpR/PhoB), Arg 113 is an evolutionary conserved residue of dimeric interface ( Fig 6A) and interacts with Glu 92 in active state dimer.
The dynamic relevant hot-spots residues may probe specific responses to induce local changes that mediate signal transmission. A cluster of seven hot spots (Tyr 98 to Lys 101 ) with high δω values, containing C terminal half of β5 and its adjacent loop, are defined as most critical residues in active-inactive transition. Consistently, Tyr 98 and Lys 101 have been shown important to maintain a stable environment for phosphorylation site in OmpR/PhoB family [13,[39][40][41]. The α5 is less important but may possibly involve in interdomain interaction of RegX3 to form a compact conformation (Fig 2).

Critical residues on dimer interface
Last decade has seen considerable effect to study residue-residue contacts by covariation from aligned sequences using direct-coupling information and inverse-covariance, e.g. Direct Coupling Analysis (DCA) and Protein Sparse Inverse COVariance (PSICOV) [42,43]. GREMLIN program recently developed by Baker and his colleague obtains model parameters from conditional correlations with fewer sets of protein sequences using a frame work of pseudo- likelihood integrated with structural information [44]. The authenticity of GREMLIN was tested using a crystal structure of VicK [45] and its homologues (data not shown), which demonstrated its efficiency to capture co-evolutionary signals in both active and inactive states of sensor histidine kinases as DCA [42,46].
The α4β5α5 region is important for transition from inactive to active state [47,48]. The specificity of homo-and hetero-dimer interface in OmpR/PhoB has been demonstrated by Förster resonance energy transfer (FRET) analysis [49]. Our coevolution analysis suggested that the specificity of dimeric interface in OmpR/PhoB family is evolutionarily optimized (Fig  6). We identified 23 coevolved pairs involved in the dimer interface with top scores (Table 1). Three pairs of Glu 24 -Arg 113 , Asp 97 -Arg 111 and Ile 76 -Asp 96 were selected as critical residues for RegX3 after excluding all intra-monomeric contacts. Mutations of these residues altered the dynamics of global conformation of RegX3 (Fig 7 and S5 Fig). Consistently, Arg 113A/E mutation (Arg 111 in RegX3) abolished the dimer formation of active state PhoP in vivo. Single mutation Arg 115D dramatically decreased PhoB DNA binding and transcriptional activities, which were partially rescued by compensatory mutation Asp 101R (Asp 96 and Arg 111 in RegX3) [50]. Altogether, this data further substantiate that α4β5α5 face is critical for RR conformational change and dimerization during active to inactive transition.

Summary
RRs are sensitive to phosphorylation-induced conformational changes. Phosphorylated RRs are stabilized by divalent ions like Ca 2+ or Mg 2+ , which utilize binding sites from β1α1 and β4α4 loops. In active state these proximal loops are in stretched conformation and face the Cterminus of β3. The absence of divalent and phosphate ions from the active pocket remodeled β1α1 loop from a stretched to a relaxed conformation resulting the outward orientation of Glu 10 and Asp 9 that is supported by Arg 81 from β4α4 loop. Arg 81 breaks its contacts with the loop β1α1 to establish a new interaction network with the phosphorylatable aspartate, which possibly promotes an outward movement of N-terminal helix α1. The hot-spot cluster of α4β5 is most critical in maintaining the active/inactive conformation of a RR. All in all, these studies suggest a general working model for OmpR/PhoB family in which the α4β5 region with several coevolved residues together with Arg 81 are pivotal for global and local structural dynamics that govern the active to inactive conversion induced by RR phosphorylation state.

Coevolution analysis
A total of 29,283 sequences (The alignment file is available upon request) of all full length response regulators were gathered and an alignment was constructed using HHblits server [51]. The aligned sequences [X 1 , X 2 , X 3 ,..,X p ] with positions 1:p, were filtered by removing those sequences with >90% identity and more than 75% gaps. A program GREMLIN was used to extract all possible coevolution signals. The reduction of w ij matrices to single values reflects the total strength of the coupling between positions i and j. We first computed s ij , their vector 2-norm (the square root of the averages of the squares of the individual matrix elements) and corrected differences in s ij due to sequence variability using the row and column averages of these values: The brackets represent averages taken over the indices outside the brackets in a manner similar to Average Product Correction (APC) [52]. The normalized coupling strength or "scaled score", ncs ij , was computed by dividing the S ij corr value by the total average of the top 3L/2 S ij corr values (since there are roughly 3L/2 contacts for a protein of length L [44].
Gremlin model construction from paired alignments GREMLIN constructs a global statistical model, and assigns a probability to every amino-acid in an alignment: where v i are vectors encoding position-specific amino acid propensities and w ij are matrices encoding the coupling of amino acids between positions i and j. All these parameters are determined by maximizing the regularized pseudo-likelihood of the sequence alignment as described in [44,53].
Where each summation is a conditional distribution capturing the probability of an amino acid at particular position in context of protein sequences. Regularization term R(v,w) is used to prevent over-fitting.
Molecular dynamics NAMD 2.10 (Beckman Institute, University of Illinois) was used to generate ensemble trajectories of RegX3 [54]. All individual MD simulations were performed in explicit solvent system with constant parameters. Protein structure files (psf) were generated using CHARMM36ff [55]. The whole system was neutralized by NaCl and solvated in an 8 Å cubic box. The default configuration settings were used with 1/Å 3 PME grid density. Short-range, non-bonded interactions were calculated using a distance cut-off of 12 Å. The system was minimized for 1000 steps and simulated in the NPT ensemble (T = 310 K, P = 1 atm) with periodic boundary conditions having full particle-mesh Ewald electrostatics. Ligplot+ was used to analyze intra-protein contacts in active and inactive states of RegX3 [56]. Mutations were introduced into RegX3 structure using Modeller [57]. All structural illustrations were prepared in Pymol (DeLano Scientific LLC).

Principal Component Analysis
To compare different MD conformation, we performed principal component analysis (PCA) using a package Bio3d [27]. PCA is a multivariate technique, which minimizes the maximal variance of the data to two or three dimensions through examining inter-conformer relationships, based on the covariance matrix, C, with two elements i and j originates from the Cartesian coordinates of the all superposed structures.
Where X is the mass weighted Cartesian coordinate of an N-particle system and C ij represents an average of all sampled structures. The C matrix can be diagnolized with orthonormal transformation matrix R: T is transpose of R and λ 1 ! λ 2 !, . . .. ! λ 3N indicates eigenvalue. To obtain the principal components q i (t), i = 1,. . ..3N, the trajectories can be projected on the eigenvectors in the columns of R.
In the direction of principal mode, the eigenvalue λ 1 is the mean square fluctuation. In PCA the first few PCs normally show global motions and contain the largest root mean square fluctuation.

Elastic Network Model (ENM)
Elastic potentials of Cα coordinates of both active and inactive RegX3 as elastic bodies were constructed with a force constant of C indicating pair wise interaction of Cα atoms within R c cut-off distance. The resultant potential energy of the elastic network is expressed as: Where x is a 3N-dimmentional vector representing Cartesian coordinates of RegX3 Cα atoms and x o is a corresponding vector indicating Cα positions in RegX3 structure. d°i j and d ij represent the corresponding distance between two structures and two Cα atoms at positions i and j, respectively. The above equation of potential energy calculation for elastic network is expanded to its second order by computing its Hessian matrix H, which best describes large amplitude motions in a protein for low frequency normal modes [58]: MENM potential functions of RegX3 were calculated as Best et al [59]: Eðx ! Þ ¼ Àb À1 1n½e ÀbðE1ðx ! Àx ! 1 Þþ21Þ þ e ÀbðE2ðx ! Àx ! 2 Þþ22Þ Where Ɛ1 and Ɛ2 indicate energy off-sets and β is T m /k B , inverse of mixing temperature. E1 and E2 were calculated for two MENM low frequency modes of RegX3 active and inactive states with Hessian matrix. To detect hot spot residues, response of harmonic spring to perturbation in frequency of mode M was calculated as: doðM; nÞ ¼ VMT:dH:VM:dH Where VM represents the eigenvector of mode M at residue position n and δH is the observed change of Hessian matrix to the energy of elastic network. δω(M, n) indicates sensitivity of mode M to hot spot residues which illustrates evolutionary conservation and functional relevance [25].