A Polymorphism at Position 400 in the Connection Subdomain of HIV-1 Reverse Transcriptase Affects Sensitivity to NNRTIs and RNaseH Activity

Reverse transcriptase (RT) plays an essential role in HIV-1 replication, and inhibition of this enzyme is a key component of HIV-treatment. However, the use of RT inhibitors can lead to the emergence of drug-resistant variants. Until recently, most clinically relevant resistance mutations were found in the polymerase domain of RT. Lately, an increasing number of resistance mutations has been identified in the connection and RNaseH domain. To further explore the role of these domains we analyzed the complete RT sequence of HIV-1 subtype B patients failing therapy. Position A/T400 in the connection subdomain is polymorphic, but the proportion of T400 increases from 41% in naïve patients to 72% in patients failing therapy. Previous studies suggested a role for threonine in conferring resistance to nucleoside RT inhibitors. Here we report that T400 also mediates resistance to non-nucleoside RT inhibitors. The susceptibility to NVP and EFV was reduced 5-fold and 2-fold, respectively, in the wild-type subtype B NL4.3 background. We show that substitution A400T reduces the RNaseH activity. The changes in enzyme activity are remarkable given the distance to both the polymerase and RNaseH active sites. Molecular dynamics simulations were performed, which provide a novel atomistic mechanism for the reduction in RNaseH activity induced by T400. Substitution A400T was found to change the conformation of the RNaseH primer grip region. Formation of an additional hydrogen bond between residue T400 and E396 may play a role in this structural change. The slower degradation of the viral RNA genome may provide more time for dissociation of the bound NNRTI from the stalled RT-template/primer complex, after which reverse transcription can resume.

structures some residues are not resolved and a number of loop residues in the p51 subunit of the NNRTI bound structures used here are missing in the PDBs. The models were completed by copying in the coordinates for the missing loop residues from PDB: 1HQU (this structure was chosen due to its high resolution, 2.7 Å, and completeness in regions missing from other structures) after alignment of the surrounding residues using VMD [1].
These structures were then used as templates to homology model the NL4.3 wild-type and A400T sequences using the Deepview package (formerly known as Swiss-PdbViewer) [2].
This involved the fitting of the common atoms of the residues in the target sequence to those in the edited PDB template. the following mutations were made to transform the HXB2 background sequence of the crystal structures into NL4.3: Q102K, K122E, K162S, F214L, A272P, K358R, A376T, I343V, D460N, P468T, H483Y, K512Q and S519N. In each case the final model contains 556 residues in the first (p66) chain and 427 in the second (p51) chain for a total of 983 residues. Each system was solvated using a cubic box of TIP3P water molecules [3] providing a buffer of at least 14 Å distance around the protein.
The systems were neutralised by the addition of Clions for the unbound enzyme and NVP bound complexes and Na + ions for those containing nucleic acids. Simulations for each system produced 20 ns of production simulation with structures output for analysis every 10 ps. Fig S1: The positions in the HIV-1 RT sequence that differ between the NL4.3 and HXB2 are shown on the enzymes structure as magenta spheres, p51 residues 395 and 396 are shown as cyan spheres. (A) shows a view down into the template/primer binding cleft and (B) one from the side. The enzyme is shown in cartoon representation and coloured by subdomain; fingers are blue, palm red, thumb green, connection yellow and the RNaseH orange. The p66 subunit is shown in darker shades. Positions of significant residues in both subunits are labelled. Note that p66 residue 400 is located in a solvent exposed helix away from any functionally significant regions.
Inhibitor potential parameterization was performed by extracting the drug coordinates into separate files, using the PRODRG tool [4] to insert missing hydrogen atoms. The geometries were then optimized using Gaussian 98 [5] (with the 6-31G** basis functions).
The Restrained Electrostatic Potential (RESP) procedure, part of the AMBER package [6], was used to calculate the partial charges. The force field parameters for the inhibitors were described using the General AMBER Force Field (GAFF) [7]. The protein and nucleic acid elements of all systems were described by the standard AMBER force field (ff03) [8] which is parameterized for bio-organic molecules including DNA in particular. The default variants (such as protonation states) for amino acids in physiological conditions were used for all residues.
The molecular dynamics package NAMD2 [9] was used throughout the minimisation, equilibration and production stages of the simulations. Electrostatic interactions were treated using the particle mesh Ewald (PME) [10] method and SHAKE [11] constraints were applied to all bonds involving hydrogen atoms in order to employ a 2 fs integration time step. Minimisation was conducted using the conjugate gradient and line search algorithms for 2000 iterations of each system. During this process all heavy atoms were restrained using a force constant of 5 kcal mol -1 Å 2 . The next stage of the equilibration process was a mutational relaxation protocol in which each mutated residue and residues within 5 Å are released in turn from the restraints for 50 ps. This allowed the residues to re-orientate into more favourable conformations if necessary. After the 50 ps relaxation period the restraints are reapplied to each region. The equilibration phase anneals the system taking the temperature from 50 K to 300 K in 50 ps. Once achieved, the final temperature was maintained using a Langevin thermostat with a coupling coefficient of 5 ps -1 . This was followed by completely isothermal equilibration for 200 ps in the canonical (NVT) ensemble.
In both of these stages the restraints imposed during minimization were retained. The restraints were then gradually reduced in four steps of 1 kcal mol -1 Å 2 , each step running for 50 ps. The restraints applied are weaker than in the protease case as no regions are known to suffer solvation induced deformities unlike the flap region of the protease. After this, the restraints were removed completely and the systems allowed to evolve under isothermalisobaric (NPT) conditions using a Berendsen barostat [12] with a target pressure of 1 bar and a pressure coupling constant of 0.1 ps. Coordinate trajectories were recorded every 1 ps throughout all equilibration and production runs. The five systems were built using extensions to the Binding Affinity Calculator (BAC) scripts created to automate simulations and free energy calculations for the HIV-1 protease [13]. The Application Hosting Environment (AHE) [14] was used in order to automate the running of simulations and retrieval of data.
Physical properties can only be reliably calculated from systems which have been adjudged properly equilibrated. For all systems simulated here the potential energy minimisation applied is sufficient to remove all bad contacts as measured by the decrease in potential energy which, after the heating phase, remains stable with a standard deviation of less than 450 kcal mol -1 in all cases. A further test of whether the systems under study have equilibrated was made by investigating the structural variation seen over the simulation.
Using difference distance matrices, [15] determined a set of residues which vary in relative position by less than 2 Å in a wide range of HIV-1 RT crystal structures. These residues are assumed to represent the most structurally stable regions of the protein (listed in Tab S1).
The root mean squared fluctuations (RMSF) of each system was calculated relative to the average structure for these residues (HIV-1 RT is known to be a flexible protein with a number of loop regions for which conformational changes might be expected throughout even equilibrated simulations). The RMSF of the simulations reduced to, and remained below, 1.5 Å after 6 ns for all systems (See Fig S2). The root mean squared deviations for the protein backbone is also seen to plateau after around 5 ns (see Fig S3) further indicating that the systems have reached an equilibrated state following minimization and constraint removal. Consequently the trajectory between the start of the simulation and 6 ns in is defined as being the equilibration phase and all subsequent parts of the simulation comprise the production phase. All analyses presented here are derived from production phase simulations.
Tab S1: Residues determined to vary relative positions by less than 2 Å in a survey of HIV-1 RT crystal structures by Keller et al. [15].

Structure flexibility
The structural changes after equilibration are mainly confined to loop regions with only minor differences between the two sequences observable (see Figs S4-S7). The main differences are concentrated in the p66 fingers which are largely free to move in the absence of an incoming nucleotide or NRTI.

Definition of dihedral angles
The dihedral angle φ defines the rotation of the fourth atom in a chain about the middle bond of three linking the atoms (see Fig S8). Changes in this angle only alter the distance between the first and fourth atoms (the other distances are defined by the chemical bonds between them). In amino acid side chains the successive sidechain dihedral angles are labeled as χ 1 -χ 5 (depending on sidechain length). The χ 1 dihedral angle is defined by atoms N-C α -C β -C γ , the χ 2 dihedral by atoms C α -C β -C γ -C δ . Fig S8: The dihedral angle φ is defined for four atoms bound by three bonds.

Flexibility of p51 residue 395
Minimal changes in the p51 residue 395 sidechain  1 dihedral angle are observed between the wild-type and A400T sequences in any of the four simulated states (see Fig S9).  Fig S11: The angle between the RNaseH primer grip residues in p51 and p66. The location of the carbon alpha atoms of residues 395 (magenta) and 396 (purple) in p51and 359 (pink) and 360 (red) in p66. Vectors between these two pairs and the angle between them are shown in black.

Template flexibility
Fig S12 shows that, in line with experimental evidence [16], the RNA template in the wildtype HIV-RT is seen to show high flexibility around the p51 RNaseH primer grip residues, this flexibility is suppressed in the A400T RT. These results may be artificially exaggerated as the dsDNA template primer is comparatively short (18 nt compared to 29 nt in the hybrid). Fig S13 also shows that the dsDNA is held much more rigidly with low RMSF along both template and primer strands. duplex (see Fig S15). However, none of the changes occur at the polymerase end of the  [18,19]. In the RNA/DNA template/primer complex differences in this property are seen as the duplex enters the RNaseH following interactions with the RNaseH primer grip residues p51 395 and 396 which interact with base pairs -10 and -11 (see Fig S14). Narrowing of the groove is thought to be necessary for RNaseH activity [17]. The minor groove is wider in the A400T simulation as it enters the RNaseH. The minor groove width is more dramatically altered in the dsDNA chain which is the only point which is involved in catalytic interactions as there is no RNA in this system which could be degraded. Fig S14: The minor groove width along the RNA/DNA template/primer complex bound to the N4.3 wild-type (red) and A400T (blue) HIV-1 RT. Base pair number is measured from the base interacting with the polymerase active site. The RNaseH primer grip residues p51 395 and 396 interact with bases -10 and -11.