Structural Basis for Inhibitor-Induced Aggregation of HIV Integrase

The allosteric inhibitors of integrase (termed ALLINIs) interfere with HIV replication by binding to the viral-encoded integrase (IN) protein. Surprisingly, ALLINIs interfere not with DNA integration but with viral particle assembly late during HIV replication. To investigate the ALLINI inhibitory mechanism, we crystallized full-length HIV-1 IN bound to the ALLINI GSK1264 and determined the structure of the complex at 4.4 Å resolution. The structure shows GSK1264 buried between the IN C-terminal domain (CTD) and the catalytic core domain. In the crystal lattice, the interacting domains are contributed by two different dimers so that IN forms an open polymer mediated by inhibitor-bridged contacts; the N-terminal domains do not participate and are structurally disordered. Engineered amino acid substitutions at the inhibitor interface blocked ALLINI-induced multimerization. HIV escape mutants with reduced sensitivity to ALLINIs commonly altered amino acids at or near the inhibitor-bound interface, and these substitutions also diminished IN multimerization. We propose that ALLINIs inhibit particle assembly by stimulating inappropriate polymerization of IN via interactions between the catalytic core domain and the CTD and that understanding the interface involved offers new routes to inhibitor optimization.


Introduction
Despite the success of antiretroviral therapy for HIV infection, the emergence of drug-resistant viral variants and the recognition of long-term drug toxicities leave development of new drug classes a priority [1]. Integrase strand transfer inhibitors (INSTIs) targeting the active site of the HIV-encoded integrase (IN) protein have proven highly effective [2]. An additional class of IN inhibitors, the allosteric inhibitors of integrase (ALLINIs), act at a second site on HIV IN [3][4][5][6][7][8][9][10][11]. ALLINIs (also referred to as LEDGINs, noncatalytic site integrase inhibitors [NCINIs], or multimodal inhibitors) are highly active against HIV replication in cell culture but have not yet been fully developed for use in patients, motivating close study to inform ongoing inhibitor development.
During the early steps of HIV infection, IN catalyzes the initial covalent attachment of the viral cDNA to host cell nuclear DNA [12,13]. IN comprises three independently folded domains (Fig 1A). The N-terminal domain (NTD; residues 1-50) binds Zn²⁺ via a conserved His-His-Cys-Cys (HHCC) motif. The catalytic core domain (residues 50-212) adopts an RNase H superfamily fold and contains a D,D-35-E motif that binds Mg²⁺ or Mn²⁺ ions, which mediate DNA cleaving and joining. The C-terminal domain (CTD; residues 223-268) features an Src homology domain 3 (SH3)-like fold that contributes to DNA binding and is connected to the catalytic core domain by an α-helical linker (residues 213-222).
IN function is assisted by the cellular cofactor lens epithelium-derived growth factor (LEDGF/p75), which binds tightly to a site at the catalytic core domain dimer interface [12,13]. LEDGF/p75 binding targets HIV integration to active transcription units [14][15][16] and promotes the integration reaction [12,13,17]. These findings and the observation that small molecules can bind this site [4,10] motivated screens for small molecules that block LEDGF/p75 binding to the catalytic core domain. Multiple compounds were identified and shown to bind the expected site on the catalytic core domain by X-ray crystallography, and many showed potent antiviral activity in cell culture [3,4,7,8,9,11,18,19]. However, mapping the target of ALLINI inhibition within the HIV replication cycle revealed a surprise: ALLINIs interfered only modestly with early steps of HIV replication but potently disrupted late steps, including particle assembly and maturation [6,7,8,9,18]. As a result, defective particles were produced with abnormal organization of the viral RNA and nucleocapsid (NC) proteins and greatly reduced infectivity [6,8]. It has been unclear how binding of ALLINIs to the IN catalytic core domain could interfere with particle maturation.
Here we report the crystal structure of HIV IN bound with the ALLINI GSK1264 at 4.4 Å. Prior to this study, atomic insight into the interaction of GSK1264 with IN was limited to the catalytic core domain only [7], a result which did not completely explain the mode of action of this class of compound. Here we report that the complete GSK1264 binding interface is comprised of both the CTD and the catalytic core domain, which together almost entirely bury the compound within protein. The interface is rich in residues implicated in IN oligomerization and ALLINI sensitivity, indicating likely functional significance. The dimer-dimer interaction mediated by GSK1264 leads to formation of an open polymer in the crystal, a polymerization event that is readily reproduced in solution with purified components and readily attenuated by mutagenesis. To probe ALLINI function more broadly, we compared the properties of ALLINIs in biochemical, virological, and electron microscopic assays. Several IN variants, including escape mutations elicited by growth of HIV in the presence of ALLINIs, encode IN substitutions at or near the inhibitor binding site, and these substitutions also resulted in decreased IN oligomerization in vitro. The results support a mechanism in which ALLINIs disrupt viral particle maturation by promoting formation of IN polymers, as in the IN-GSK1264 crystal structure. Identification of the molecular interface responsible for polymer formation establishes a structural basis for improving the ALLINI class of HIV inhibitors.

GSK1264 and GSK002 Disrupt Formation of Mature HIV Particles
To investigate the structural basis for ALLINI function, we studied GSK1264 (described in [7]) and GSK002, which is newly reported here (Fig 1B). We first examined whether these compounds disrupt HIV particle organization, as has been documented for other ALLINIs.

Fig 1 (caption fragment). (C) Disruption of assembly by the ALLINIs GSK1264 and GSK002. Viral particles produced in the presence of 1,000 nM GSK1264 or GSK002 were visualized by transmission electron microscopy, and morphology was scored (see S1 Data). The p-value is the probability of obtaining the observed (or greater) differences in numbers of nonmature particles (immature, deformed, or ambiguous) between treated and nontreated samples, given the null hypothesis of no inhibitor-induced changes.
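The p-value described for the particle morphology counts can be illustrated with a one-sided Fisher exact test on a 2x2 table (treated versus nontreated, nonmature versus mature). The text does not name the test that was used, so the choice of Fisher's test is an assumption, and the function name and the counts below are hypothetical; the scored counts themselves are in S1 Data.

```python
from math import comb


def fisher_one_sided(treated_nonmature, treated_mature,
                     control_nonmature, control_mature):
    """One-sided Fisher exact p-value (assumed test, not confirmed by the text).

    Returns the probability, under the null hypothesis of no inhibitor effect
    and with all table margins fixed, of seeing at least `treated_nonmature`
    nonmature particles in the treated sample.
    """
    n_treated = treated_nonmature + treated_mature
    n_nonmature = treated_nonmature + control_nonmature
    total = n_treated + control_nonmature + control_mature
    denom = comb(total, n_treated)
    upper = min(n_treated, n_nonmature)
    p = 0.0
    for k in range(treated_nonmature, upper + 1):
        # Hypergeometric probability of k nonmature particles among treated.
        p += comb(n_nonmature, k) * comb(total - n_nonmature, n_treated - k) / denom
    return p


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(fisher_one_sided(40, 10, 8, 42))
```

A small p-value from such a test indicates that the excess of nonmature particles in the treated sample is unlikely under the null hypothesis.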

Structure of an HIV IN•GSK1264 Complex
To crystallize HIV IN bound to an ALLINI, we sought substitutions that improved IN solubility but preserved some degree of ALLINI-dependent IN aggregation, allowing control of IN self-association. We used size-exclusion chromatography in line with multiangle light scattering (SEC-MALS) and analytical ultracentrifugation (AUC) to establish the solution properties of several IN variants. Our best candidate contained two amino acid substitutions: Y15A and F185H. IN Y15A,F185H formed discrete dimers in solution, retained the ability to bind to the LEDGF integrase binding domain (IBD), and had a reduced level of aggregation in the presence of GSK1264 and GSK002 compared to wild-type IN (S1 Fig and S1 Table). Substitutions of F185 are known to improve solubility [20], and viruses containing the F185H substitution are replication competent [21]. The Y15A substitution selects for a single conformation of the isolated NTD [22] that is correlated with diminished oligomerization of full-length IN [23]. We favored NTD substitutions for this study because IN constructs containing only the catalytic core domain and the CTD were previously shown to be sufficient for ALLINI-induced aggregation [7,24], suggesting that NTD substitutions would not interfere with understanding ALLINI action. HIV-1 containing IN Y15A is replication defective, and the purified protein is not active for integration.
Crystallization trials of IN Y15A,F185H with GSK1264 led to a diffracting crystal form containing one IN dimer in the asymmetric unit; trials with GSK002 did not yield crystals. We determined the structure by molecular replacement using the catalytic core domain dimer structure (without ligand) from the GSK1264-bound structure previously determined (PDB 4OJR, [7]) together with the previously reported IN CTD structure [7,25]. The structure was refined using deformable elastic network (DEN) restraints [26][27][28]. The catalytic core domains, CTDs, and connecting linkers displayed strong electron density in simulated annealing 2Fo-Fc composite omit maps, and thus, their positions are well defined in the structure (Fig 2A). The IN NTDs could not be positioned by molecular replacement and did not display interpretable electron density. However, IN recovered from washed crystals was full length by SDS-PAGE (S1 Fig). Hence, the NTDs are presumed to be spatially disordered. Strong difference density in simulated annealing omit maps was observed at both previously observed GSK1264 binding sites on the catalytic core domain dimer [7], allowing for placement of the ligand in the latest stages of refinement (Fig 2B). At the modest resolution available for this crystal form, we were able to reliably place domains, secondary structure elements, and the bound inhibitor, but we could not visualize detailed side chain interactions. We inferred the approximate positions of side chains from the electron density and their known locations in the high-resolution crystal structures of smaller IN fragments [7,29]. Diffraction data and refinement results are provided in S2 Table. The GSK1264-bound HIV IN structure is shown in Fig 2C. The IN CTDs bind to the catalytic core domains and bound inhibitors in adjacent IN dimers, resulting in the formation of an open polymer of IN dimers.
The interaction site resembles a hand (the CTD) grasping a ball (the catalytic core domain), with the ALLINI site cupped by the fingers (Fig 2D).

The GSK1264-IN Binding Interface
In the IN•GSK1264 complex, GSK1264 is almost entirely protected from solvent, with 500 out of 580 Å² of its accessible surface buried (Fig 3A-3C). Excluding bound inhibitor, over 1,100 Å² of solvent-accessible surface is buried at each CTD-catalytic core domain interface. Secondary structure elements mediating GSK1264 binding include one face of the CTD, defined by strands β1, β2, and β5, juxtaposed against the catalytic core domain dimer near helix α3 of one subunit and α4 of the other. The CTD-catalytic core domain interface has a strong electrostatic component, with the relatively acidic α4 helix of the catalytic core domain packing against the basic β5 strand of the CTD (S2 Fig). Extensive hydrophobic interactions are also formed between α3 of the catalytic core domain and the β1 and β2 strands of the CTD.
The modest resolution of our structural model precludes direct visualization of interactions but does allow us to model the approximate positions of functional groups and potential contacts mediating GSK1264 binding (Fig 3A and 3C). The Tyr 226 and Trp 235 side chains in the CTD pack against the core isoquinoline moiety, where π-π interactions between inhibitor and aromatic side chains may contribute to binding stabilization. The tert-butoxy moiety is buried in a hydrophobic pocket formed by the catalytic core domain dimer, without contributions from the CTD. The benzopyran group extending from the isoquinoline core is surrounded by a cradle of hydrophobics at the catalytic core domain dimer interface and, additionally, Ile 268 from the CTD. A modeled hydrogen-bonding network involving the carboxylate and tert-butoxy oxygen completes the IN-inhibitor interface, where Lys 266 from the CTD is positioned

Fig 2 (caption fragment). doi:10.1371/journal.pbio.1002584.g002

Fig 3 (caption fragment). GSK1264 (red) is predominantly buried via van der Waals contacts with 13 residues from the CTD (grey) and catalytic core domain (tan). Thr 174, Lys 266, and His 171 are predicted to hydrogen bond with the tert-butoxy and carboxylic acid moieties. This panel was generated using LIGPLOT [30]. See also S2 Fig.
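As a quick check on the statement that the inhibitor is almost entirely protected from solvent, the buried fraction follows directly from the two surface areas quoted above (500 Å² buried out of 580 Å² accessible). The symbol f_buried is introduced here for illustration only.

```latex
\[
f_{\text{buried}} \;=\; \frac{A_{\text{buried}}}{A_{\text{accessible}}}
\;=\; \frac{500\ \text{\AA}^2}{580\ \text{\AA}^2} \;\approx\; 0.86
\]
```

That is, roughly 86% of the accessible surface of GSK1264 is buried in the complex.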

Functional Analysis of the ALLINI-Binding Interface
To probe the importance of the GSK1264 interface observed in this crystal form, we compared the IN aggregation properties and structural features of several compounds bound at ALLINI sites (Fig 4). Previously, we and others have reported [7,24] that truncated IN derivatives containing only the catalytic core domain and the CTD were sufficient to support aggregation by ALLINIs in vitro. In addition, crystal structures are available for the IN catalytic core domain bound to several ALLINIs and other small molecules (Fig 4A-4D). Structures of the GSK1264 [7], BI-D [8], and tetraphenylarsonium (TPA) [10] complexes were reported previously; the structure of GSK002 bound to the catalytic core domain at 1.75 Å is newly reported here (Fig 4C, S2 Table, and S4 Fig). We found that GSK1264, GSK002, and BI-D all directed potent aggregation of IN containing the catalytic core domain and the CTD, but not IN derivatives containing the NTD and the catalytic core domain or only the catalytic core domain (Fig 4). TPA, which is not an ALLINI, did not promote aggregation of any derivative even at millimolar concentrations (Fig 4H).
Analysis of the TPA-bound catalytic core domain structure superposed onto the GSK1264-bound full-length IN structure indicates that TPA would clash with the CTD and therefore block the CTD-catalytic core domain interaction seen for GSK1264 (arrows in Fig 4G). A small steric clash is also predicted for GSK002, in which the difluorobenzyl group collides with Trp 235. However, rotation of the Trp 235 side chain could create enough space for the inhibitor, so we suggest that GSK002 could be accommodated in a nearly identical interface. Thus, this analysis supports the concept that CTD-ALLINI-catalytic core domain interactions are relevant to aggregation in vitro and that the shape and positioning of the ALLINI molecule are critical for the aggregation reaction.

CTD Substitutions That Block ALLINI-Induced Aggregation
IN substitutions that diminish ALLINI function have been studied on the catalytic domain side of the interface, but interactions on the CTD side are less well characterized. Biochemical studies have implicated the CTD in IN oligomerization [7,20,31,32], and CTD residues K264 and K266, when both mutated to alanine, diminished ALLINI-induced aggregation and blocked integration activity [24]. In the structure presented here, these residues contribute to the GSK1264 interface (Fig 5A), supporting the biological relevance of the observed contacts. We further probed the importance of this interface by studying substitutions at CTD residues Y226 and W235 (S1 Table,

Mechanism of GSK1264 and GSK002 Resistance Mutations
To identify mutations that confer resistance to GSK1264 and GSK002, we passaged HIV-1 in the presence of increasing concentrations of each inhibitor and isolated resistant viral strains.
Three different HIV-1 strains were compared for each inhibitor: the lab-adapted HIV strain NL4-3 and Raltegravir-resistant strains 4376 and 8070. Strains were passaged 38 times in the presence of compound, resulting in an increase in the inhibitory IC50 of >100-fold. IN coding regions were sequenced, and departures from wild-type sequence tabulated (S3 Table).

Fig 4 (caption fragment). Models were generated by docking the indicated CCD/compound structures into the GSK1264/IN F185H structure studied here. Aggregation assays were carried out using light scattering at 405 nm at 25°C with 10 μM IN (see S1 Data). In panel G, arrows indicate steric clashes predicted by modeling of the CTD-catalytic core domain interface with TPA, based on the TPA binding mode in PDB 1HYV [10].
Each of the three viral strains studied showed mostly distinct escape mutations that were unique to each inhibitor (Fig 6). Multiple independent viral cultures were not compared for each ALLINI, so it is unknown whether the screens were saturated. Some of the substitutions match previously published ALLINI resistance mutations [3][4][5]9,11,18,19,[33][34][35]. The IN•GSK1264 structure allows us to consider the mechanism by which these escape mutations confer resistance.
In studies with GSK1264, six different substitutions were found (Fig 6A and S3 Table). Ala 129 and Thr 174 are positioned to contact GSK1264 directly, so the larger side chains seen in escape mutants (A129T and T174I) may disrupt inhibitor binding via steric clashes. Y99H and L172F do not contact bound GSK1264 but may disrupt inhibitor binding via changes in occupied volumes within the catalytic core domain, resulting in displacement of α1 (Y99H) or α5 (L172F). Ala 205 and Asn 222 lie at the beginning and the end of the α-helical linker that bridges the catalytic core domain and the CTD of IN, so substitutions may alter the CTD position relative to the catalytic domain, thereby potentially inhibiting formation of the ALLINI complex (Fig 6A and 6C). Eight mutations arose after passage in the presence of GSK002, only one of which was in common with the GSK1264 mutations (Fig 6 and S3 Table). Using the GSK1264 structure as a model, we can propose mechanisms of GSK002 resistance for several of these. IN residues 125, 128, 170, 171, and 173 are poised to contact the bound inhibitor and/or the CTD, so substitutions at these positions likely disrupt the inhibitor-mediated interface directly. A205T was seen for both GSK002 and GSK1264 resistance substitutions (described above). T124N was the only substitution to arise independently in two experiments (S3 Table), and substitutions at this position and at residue 128 have been reported as ALLINI resistance mutations previously [3][4][5]11,19,[33][34][35][36].
Ala 124 and Trp 131 are the first and last residues of the α3 helix; they contribute to the oligomerization interface between the catalytic core domain and the CTD, so substitution of these residues is likely to interfere with inhibitor-mediated oligomerization. The resistance mutation encoding W131C was identified in the earliest passages and is unique to this study (Fig 6B and  S3 Table). W131D-substituted IN has been reported previously to increase solubility of purified IN [29,37,38]. Trp 131 is largely buried in the CTD interface in the IN•GSK1264 structure reported here (S3 Fig). Trp 131 does not contribute to the ALLINI binding interface directly, but the IN W131D substitution shows reduced oligomerization and reduced sensitivity to ALLINI-induced aggregation (S3 Fig and S1 Table), supporting the idea that the Trp 131 -CTD interaction is important for the inhibitor response.
To assess cross-resistance between GSK1264 and GSK002, we examined the sensitivities of two long-term passaged variants. Thirty-eight passages in increasing concentrations of GSK002 yielded a variant with T124N, W131C, and A205T; 38 passages in GSK1264 yielded a variant with Y99H, L172F, and N222K. In both cases, these multiply mutant variants were highly resistant to both drugs (>550-fold increase in IC 50 ).
To investigate the mechanism of ALLINI resistance in vitro, we purified six IN variants with escape mutant substitutions and tested their oligomeric properties using sedimentation equilibrium. Each of the escape variants showed reduced oligomerization compared to IN F185H , suggesting that diminished multimerization is a common characteristic of ALLINI escape substitutions (S1 Table).
We also purified IN variants containing resistance substitutions along the α3 helix and tested their ability to be aggregated by GSK1264 or GSK002 (S1 Table). For IN F185H , 0.8 μM of GSK002 or 8 μM of GSK1264 was sufficient to promote aggregation. The T124N, A128T, and W131C substitutions each protected IN from aggregation by either compound at 67 μM. The T124A substitution protected IN from polymerization at 67 μM GSK1264, but not GSK002. These findings strengthen the link between HIV escape from ALLINI pressure in cell culture and diminished inhibitor-induced polymerization of IN in vitro.

Sequence Polymorphisms Affecting Sensitivity to ALLINIs
To characterize naturally occurring IN variants that affect ALLINI activity, we tested a panel of 38 HIV isolates for sensitivity to GSK1264 and GSK002. Viruses of subtypes A, B, C, D, F, and A/E were compared, thereby interrogating a wide range of primary IN amino acid sequences. For GSK1264, the IC50 values ranged from 2 to 100 μM (Fig 7A). In contrast, IC50 values for GSK002 against the same panel ranged from less than 0.5 μM to greater than 500 μM (Fig 7A). There was little correlation between IC50 values for the two inhibitors (R = 0.0492, R² = 0.0024, p = 0.811), emphasizing the differential effects IN sequence variation can have on the potency of ALLINIs.
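The correlation statistic quoted for the two inhibitors can be sketched as a Pearson coefficient computed over paired per-isolate IC50 values. This is a minimal illustration only: the text does not state whether raw or log-transformed values were used, the function name is hypothetical, and the example values are invented rather than taken from S1 Data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical paired IC50 values for a handful of isolates.
    ic50_inhibitor_a = [2.0, 15.0, 40.0, 8.0, 100.0]
    ic50_inhibitor_b = [300.0, 0.5, 12.0, 500.0, 45.0]
    r = pearson_r(ic50_inhibitor_a, ic50_inhibitor_b)
    print(f"R = {r:.4f}, R^2 = {r * r:.4f}")
```

The coefficient of determination is simply the square of R, which is why the reported R = 0.0492 corresponds to R² = 0.0024.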
The amino-acid polymorphisms best able to predict IC50 were extracted from the data for each inhibitor using lasso logistic regression [39]. A model predicting GSK002 activity using the top four most influential amino acid substitutions (K20R, A124N, T125X, where X ≠ T, and K188R) performed well under cross validation, where the cross-validated mean-squared error decreased from 8.49 to 2.26. The model for GSK1264 showed weak predictive power, with no mean-squared error greater than one standard error away from random. Substitutions at Lys 20, Ala 23, Ser 119, and Arg 269, selected earliest by lasso logistic regression, may have the potential to influence IC50 for GSK1264, though less robustly than for residues affecting GSK002.
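The idea of using an L1 penalty to pick out the few substitutions that best predict IC50 can be sketched as follows. This is not the authors' pipeline: it swaps in a plain linear lasso (squared-error loss) fitted by coordinate descent, omits cross-validation, and uses hypothetical names and toy data in which each column is a 0/1 indicator for the presence of a substitution.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Fit intercept b0 and coefficients beta by cyclic coordinate descent.

    Minimises (1 / (2n)) * sum_i (y_i - b0 - x_i . beta)^2 + lam * sum_j |beta_j|.
    X is a list of rows (0/1 substitution indicators); y is the response.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    b0 = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            zj = 0.0   # sum of squares of feature j
            for i in range(n):
                pred_without_j = b0 + sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - pred_without_j)
                zj += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (zj / n) if zj > 0 else 0.0
        # The unpenalised intercept is the mean residual given current beta.
        b0 = sum(
            y[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)
        ) / n
    return b0, beta


if __name__ == "__main__":
    # Toy data: the response depends only on the first (hypothetical) substitution.
    X = [[0, 0], [1, 0], [0, 1], [1, 1]]
    y = [1.0, 3.0, 1.0, 3.0]
    print(lasso_cd(X, y, lam=0.3))
```

With a moderate penalty, the coefficient of the uninformative indicator is driven exactly to zero, which is the property that lets the lasso rank substitutions by the order in which they enter the model as the penalty is relaxed.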
Positions of affected residues near the inhibitor interface are shown on the IN•GSK1264 structure in Fig 7B. Substitutions at residue 124 could affect ALLINI binding or the catalytic core domain-CTD interaction. Substitutions at 125 may also affect ALLINI binding. Arg 269 lies on the CTD at the interface with the IN catalytic core domain adjacent to the ALLINI binding site, where substitutions could disrupt this interaction. We consider the possible roles of additional substitutions, including those in the NTD (K20R, A23X), in the Discussion.

Probing the Mechanism of Resistance to GSK002
To investigate the mechanism of GSK002 resistance further, we determined three high-resolution crystal structures, one of GSK002 bound to catalytic core domain-only, one of GSK002 bound to the catalytic core domain containing the Ala 124 substitution, and one of GSK002 bound to the catalytic core domain with Thr 125 . These two substitutions were chosen for study because they were the most influential in reducing GSK002 IC 50

Discussion
Here we have identified the determinants of IN aggregation mediated by ALLINIs and obtained a structural model of the specific interfaces involved. In our model, IN polymers are formed by catalytic core domain-CTD interactions that are facilitated and strengthened by the bound ALLINI. We propose that ALLINIs act by causing IN to form extended polymers mediated by this interaction, which in turn leads to aggregation and inactivation. This mode of inhibition is reminiscent of microtubule-stabilizing agents such as Taxol that stimulate lateral interactions between oligomers that modulate assembly [40,41]. Our model is supported by the properties of IN mutants that are resistant to ALLINIs, including mutants elicited during HIV passage in the presence of ALLINIs, engineered INs, and naturally occurring IN polymorphisms in circulating HIV strains. Our model is also supported by a strong correlation between in vivo inhibition and susceptibility to polymerization of purified IN proteins in vitro.

Fig 7 (caption fragment). The y-axis shows IC50 values. Data are provided in tabular form in S1 Data. (B) Polymorphisms identified by lasso logistic regression linked to resistance to GSK1264 (blue spheres) or GSK002 (green spheres) mapped onto the IN-GSK1264 structure. GSK1264 is shown in red. See S3 Table. doi:10.1371/journal.pbio.1002584.g007
Why was the catalytic core domain-CTD interaction seen here not observed in previous experimental structures of HIV IN? We suggest that the answer lies with the use of (3-((3-cholamidopropyl) dimethylammonio)-1-propanesulfonate) (CHAPS), the zwitterionic detergent used for decades to solubilize IN for in vitro experiments [29,42]. CHAPS binds to the same CTD surface used to form ALLINI-mediated polymers, providing an explanation for how the detergent prevents aggregation and for why previous structural studies would not have revealed the intermolecular interfaces reported here (S2 Fig).
How does aggregation of IN disrupt viral assembly? ALLINIs induce aggregation of IN in vitro, and escape mutations reverse this, implicating aggregation in the mechanism of inhibition. During assembly, IN is first synthesized as part of the Gag-Pol polyprotein precursor. After encapsidation, Gag-Pol is cleaved into component proteins, including IN [12,13]. Immature virions containing Gag-Pol formed in the presence of ALLINIs appear normal in electron microscopy (EM) images, but mature particles are abnormal, suggesting that IN, and not Gag-Pol, may be the ALLINI target. This model is consistent with studies in which ALLINIs were effective when IN was delivered to viral particles independently of Gag-Pol [6,8]. HIV particles can bud properly when Gag only is present to yield immature particles, but Gag-PolΔIN forms defective particles during maturation [6]. The defective particles have electron-dense material outside the normal core, as is seen after ALLINI treatment, reportedly containing NC and RNA [6]. Thus, the IN polymer seen here could act by either (1) sequestering IN and preventing a normal function important for localizing NC and the RNA during maturation or (2) creating a novel aggregate that is itself disruptive.
IN substitutions that confer resistance to ALLINIs in virus fall into three categories. The first group likely interferes directly with inhibitor binding (e.g., A128T, E170K, and H171T). The second group perturbs the catalytic core domain-CTD interaction that mediates polymer formation (e.g., A124N). These substitutions can lie at the interface between domains (e.g., K266A and W131C/D), in a "second shell" that flanks the binding site (e.g., Y99H and L172F), or in the helical connector between domains that is required for positioning the CTD (e.g., A205T and N222K). While mutations at 226, 235, 264, and 266 all confer resistance to ALLINI-induced aggregation in vitro, no strong escape mutants at the CTD side of this interface occur naturally. However, most of the residues on the CTD that are in contact with GSK1264 are absolutely conserved among HIV variants, suggesting that substitutions that block ALLINI binding are likely to be replication defective. The third group of ALLINI-resistance substitutions resides within the NTD (e.g., D6R, E11K, and Y15A). These substitutions are located on a face of the NTD that has been observed bound to the catalytic core domain in several reported crystal structures [23,38,43,44]. At present, we favor a model in which binding of the NTD to the catalytic core domain competes with binding of the CTD. Enhanced binding of the NTD as a result of amino acid substitutions could inhibit ALLINI-promoted engagement of the CTD at the catalytic core domain. Further work will be required to understand how these NTD substitutions modulate ALLINI action.
The results presented here suggest several directions for future research. We were not able to crystallize the ALLINI GSK002 with IN, and the spectrum of resistance mutations elicited is mostly different between the two inhibitors. We speculate that the IN polymer formed in the presence of GSK002 is similar to that seen for GSK1264, with side chains remodeled to accommodate the interaction, but how these small structural differences influence inhibitor potency and escape mechanisms is just beginning to be studied. The role of the NTD in inhibition and escape is not yet understood but should also be experimentally accessible. Perhaps most importantly, our data suggest specific means of optimizing ALLINI binding in the observed interdomain pocket and specify regions of IN where additional functional groups may potentially be attached without disrupting binding. Thus, these results should be useful in optimizing the function of ALLINIs, a unique new class of HIV inhibitors that block viral replication by promoting inappropriate polymerization.

Ethics Statement
The human biological samples were sourced ethically from Gulf Coast Regional Blood Center (GCRBC). The protocol, titled "IRB Specimen Procurement Protocol, Specimens to Provide to Internal or External Customers," was reviewed and approved by the GCRBC IRB (IRB approval # 06-001). The research was conducted in accord with the terms of the informed consents.

Small Molecule Inhibitors
The compound GSK1264 was previously described [7] and is Compound 159 in Patent WO 2012/102985; GSK002 is Compound 87 in Patent WO2013/012649. Raltegravir-resistant strains HIV 4376 and HIV 8070 were obtained from the NIH AIDS repository.

EM
A chronic HIV-1 producer cell line, A1953 [45], was pretreated for 24 h with 100 or 1,000 nM GSK002, 100 or 1,000 nM GSK1264, or diluent control (DMSO). The cells were washed with 1× PBS and incubated with fresh medium containing inhibitor or DMSO. After 72 h, the cells were harvested by centrifugation, and the cell pellets were fixed (2.0% paraformaldehyde, 2.5% glutaraldehyde in 0.2 M Na-cacodylate, pH 7.3), embedded in Epon, and sectioned. Microscopy was performed with a JEM-1010 (JEOL, Tokyo, Japan) transmission electron microscope operated at 80 kV. Viral particles were categorized based on the viral morphology as mature, deformed (aberrant/eccentric/empty core), immature (no core), or ambiguous. For the 100 nM dose, findings were quantified by two blinded investigators.

Protein Expression and Purification
Full-length and truncated HIV-1 IN(NL4-3) and coexpressions with LEDGF(IBD) (346-471) constructs were expressed and purified as described previously, with some modifications [46][47][48]. IN-only constructs encoding the F185H solubility mutation were expressed from a pET-Duet-derived (Novagen) vector in which the IN construct was inserted in frame with a C-terminal Mxe intein (New England Biolabs) containing a chitin binding domain and hexahistidine tag. The QuikChange kit (Stratagene) was used to generate the point mutation. Proteins were purified using nickel-nitrilotriacetic acid (Qiagen) and chitin (New England Biolabs) resins. After fusion proteins were liberated by intein cleavage in 50 mM dithiothreitol (DTT) overnight at 4°C, IN preparations were further purified using a Superdex 75 HiLoad 16/60 column at room temperature, eluted isocratically in 20 mM HEPES-NaOH pH 7.0, 1 M NaCl, 7 mM CHAPS, 10 μM Zn(OAc)2, and 10 mM β-ME. Proteins were concentrated at 4°C in a YM-10 Centricon (Millipore), and aliquots were flash-frozen in liquid nitrogen for storage at −80°C. Genetically solubilized IN C56S,F139D,F185H,C280S (quadramutated, QM) with and without the additional W131D mutation (pentamutated, PM) were purified similarly into a final buffer of 20 mM HEPES-NaOH, pH 7.0, 450 mM NaCl, 7 mM CHAPS, 10 μM Zn(OAc)2, and 10 mM β-ME or 1-10 mM DTT. IN•LEDGF(IBD) coexpressions were produced from the same expression vector, except the IN-coding DNA was inserted into the first multiple cloning site (MCS) and LEDGF into the second MCS, in frame with the C-terminal Mxe intein-hexahistidine tag. These preparations were purified similarly into a final buffer of 20 mM HEPES-NaOH, pH 7.0, 1 M NaCl, 7 mM CHAPS, 10 μM Zn(OAc)2, and 10 mM β-ME or 1-10 mM DTT. IN F185K (CCD) used for crystallization was obtained by expression from the vector pET24 (Novagen, Madison, Wisconsin, United States) in BL21star (DE3) cells (Novagen) at 37°C.
The sequence was inserted into the vector in frame with an N-terminal TEV-cleavable hexahistidine affinity tag. Protein was initially purified using Ni-Sepharose FF (GE Healthcare Life Sciences, Pittsburgh, Pennsylvania, US), followed by cleavage of the affinity tag using hexahistidine-tagged TEV at a 1:100 mass ratio in 25 mM HEPES-NaOH pH 7.5, 750 mM NaCl, and 25 mM imidazole at 25°C with a 10K MWCO Jumbosep (Pall, Exton, Pennsylvania, US) centrifugal concentrator. To separate the liberated protein from uncleaved materials and protease, another Ni-Sepharose FF purification step was performed. The flow-through material was concentrated and injected onto a Superdex-75 column and eluted isocratically in 10 mM HEPES-NaOH pH 7, 500 mM NaCl, and 3 mM DTT. For biophysical analyses, samples were exchanged into 20 mM HEPES•NaOH pH 7.5, 1 M NaCl, 7 mM CHAPS, 10 mM DTT, and 10 μM Zn(OAc)2.
All data reductions were performed using the program XDS [49]. Diffraction outwards of 4 Å was observed in the best case, with resolution dropping off quickly as a function of X-ray dose. Analysis of the final dataset by the UCLA diffraction anisotropy server [50] indicated that diffraction was significantly anisotropic along the a*- and b*-axes. On the basis of an F/σ(F) cutoff of 3, reflections were subjected to an anisotropic truncation of 4.5, 4.5, and 4.3 Å along a*, b*, and c*, respectively, before use in refinement. Molecular replacement was performed using the program Phaser [51,52] as implemented in the Phenix software package [53]. Molecular replacement searches using a number of different combinations of monomer or dimer IN CCD (PDB 4OJR, [7]) and CTD 220-270 (PDB 1EX4, [29]) all yielded strong molecular replacement solutions with very high LLG scores.
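The anisotropic truncation described above (resolution limits of 4.5, 4.5, and 4.3 Å along a*, b*, and c*) can be pictured as keeping only reflections that fall inside an ellipsoid in reciprocal space. The sketch below is illustrative and is not the server's code: it assumes mutually orthogonal cell axes, uses a made-up unit cell, and the function name is hypothetical.

```python
def inside_resolution_ellipsoid(hkl, cell_abc, d_limits):
    """Return True if reflection hkl lies within the anisotropic resolution ellipsoid.

    hkl      : Miller indices (h, k, l)
    cell_abc : cell edge lengths (a, b, c) in Å; orthogonal axes are ASSUMED
    d_limits : resolution limits (d_a, d_b, d_c) in Å along a*, b*, c*
    """
    h, k, l = hkl
    a, b, c = cell_abc
    d_a, d_b, d_c = d_limits
    # Each reciprocal-space component (h/a, k/b, l/c) is scaled by the
    # ellipsoid semi-axis (1/d_a, 1/d_b, 1/d_c) along that direction.
    return (h * d_a / a) ** 2 + (k * d_b / b) ** 2 + (l * d_c / c) ** 2 <= 1.0


if __name__ == "__main__":
    cell = (100.0, 100.0, 100.0)   # hypothetical cell edges, not from the paper
    limits = (4.5, 4.5, 4.3)       # limits quoted in the text
    reflections = [(22, 0, 0), (23, 0, 0), (0, 0, 23), (0, 0, 24), (16, 16, 0)]
    kept = [r for r in reflections if inside_resolution_ellipsoid(r, cell, limits)]
    print(kept)
```

Reflections along c* are retained to slightly higher resolution than those along a* or b*, reflecting the weaker diffraction reported along the a*- and b*-axes.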
In the initial experimental maps, strong and contiguous electron density was observed corresponding to the large helical extensions of IN 210:220 seen in the 1EX4 crystal structure of IN(CCD-CTD) and was readily traced. After an initial iteration of simulated annealing, iterative rounds of rigid body refinement and minimization were performed in Phenix alongside manual building and refinement in COOT [54]. After an IN dimer was completely built, the structure was refined in CNS v1.3 [55,56] using rigid body refinement and the DEN method [26][27][28], using the high-resolution IN F185K (CCD)•GSK1264 structure (PDB 4OJR, [7]) and residues 211-270 from the available 2.4 Å IN (CCD-CTD) structure (PDB 1EX4, [29]) as reference models. Ligand was omitted from the refinement until the final stages. Strong ligand density corresponding to GSK1264 was observed in Fo-Fc and simulated annealing Fo-Fc omit maps, allowing for docking of the ligands at both binding sites at the IN(CCD) dimer interface. Composite omit maps were generated in Phenix, and figures were created using PyMOL [57]. Crystallographic structures of IN (CCD) F185K and related mutants were prepared and solved as previously described [7], using beamline 21-ID-G at the Advanced Photon Source (Argonne, Illinois, US). Crystallographic statistics for the five structures are summarized in S2 Table. Atomic coordinates and structure factors were deposited in the Protein Data Bank under the accession codes 5HOT, 5HRN, 5HRP, 5HRR, and 5HRS.

Virological Methods
Chronically infected HIV-1 producer A1953 cells were a gift from James Hoxie. The cells were cultured at 37˚C in 5% CO2 in DMEM, supplemented with 10% fetal bovine serum and penicillin-streptomycin. To test the viability of HIV IN Y15A, DNA encoding the substitution was built into the HIV NL4-3 backbone using Gibson assembly. The sequence of the resulting plasmid was confirmed by Illumina sequencing of Nextera-XT libraries. Viral stocks (IN Y15A and wild type) were generated by transfection into 293T cells, and culture supernatants containing the HIV particles were collected. Viral stocks were normalized by the amount of p24 capsid antigen per unit volume and then applied to TZM-bl indicator cells. High luciferase activity was detected on day 2 after infection for wild type (2,334 RLU), whereas luciferase expression in IN Y15A samples (21 RLU) remained close to background (7 RLU). A long-term culture experiment was carried out in which cells infected with the IN Y15A virus were monitored for 2 months by luciferase expression and p24 assay, but no viral replication could be detected.

Virus Resistance Passage
MT4 cells infected with the HIV lab strain NL4-3 or raltegravir-resistant viruses (NIH AIDS Research and Reference Reagent Program catalog #11842 and 11845) were cultured in the presence of suboptimal inhibitor concentrations, starting at approximately one-half the IC50. Viral replication kinetics were monitored by the production of RT activity in the supernatant. When the kinetics of the inhibitor-treated culture matched those of the no-inhibitor control culture for three consecutive passages, the inhibitor concentration was increased. At periodic intervals, approximately every ten passages, the virus was expanded and the IN sequence was determined. Drug-induced mutations were introduced into the proviral plasmid pNL4-3 to confirm their effect on inhibitor sensitivity.
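The escalation rule used during passaging can be sketched as a small bookkeeping routine. This is only an illustration of the logic stated above: the starting dose of one-half the IC50 and the three-passage criterion come from the text, whereas the twofold step size (`step_factor`) is a hypothetical placeholder, since the size of each increase is not specified here.

```python
def dose_schedule(ic50, kinetics_matched, step_factor=2.0, required_streak=3):
    """Return the inhibitor concentration applied at each passage.

    ic50             -- IC50 of the inhibitor (any concentration unit)
    kinetics_matched -- one boolean per passage: True if the treated culture
                        replicated with kinetics matching the untreated control
    step_factor      -- fold increase per escalation (HYPOTHETICAL value)
    required_streak  -- consecutive matching passages needed to escalate
    """
    concentration = 0.5 * ic50   # start at about one-half the IC50
    streak = 0
    schedule = []
    for matched in kinetics_matched:
        schedule.append(concentration)
        streak = streak + 1 if matched else 0
        if streak == required_streak:
            concentration *= step_factor   # raise the dose for the next passage
            streak = 0
    return schedule
```

A passage in which the treated culture lags behind the control resets the count, so the dose rises only after sustained control-like replication.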

HIV Replication Assays in PBMC Culture
Primary HIV-1 isolates were grown in human PBMCs. PBMCs were isolated by layering blood obtained from donors over a Ficoll (Histopaque 1077; Sigma) cushion in 50 ml tubes. The cells were washed, pooled from various donors, and cryopreserved. PBMCs were thawed and stimulated with 2 μg/ml PHA in RPMI 1640 media supplemented with 10% heat-inactivated fetal bovine serum, 2 mM L-glutamine, 100 units/ml penicillin G, and 100 μg/ml streptomycin for 3 d prior to infection. Viral yields were determined for primary isolates 7 d post infection by a radioactive RT endpoint assay. To determine isolate sensitivity to various inhibitors, the appropriate titer of each isolate was used to infect PHA-stimulated PBMC pools in the presence of an inhibitor dilution series. Viral replication was measured using RT activity as the endpoint after 7 d in culture, and the IC50 was calculated as the inhibitor concentration required to reduce viral replication by 50% relative to the no-inhibitor control.
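As an illustration of how an IC50 can be read from such a dilution series, the following minimal sketch interpolates on a log-concentration scale between the two doses that bracket 50% of the control signal. The fitting procedure actually used is not stated here, so this interpolation is an assumption, and the function name and example numbers are hypothetical.

```python
import math


def ic50_log_interpolation(concentrations, rt_signal, control_signal):
    """Estimate the IC50 from a dilution series.

    concentrations -- inhibitor concentrations (all > 0, any unit)
    rt_signal      -- RT endpoint readout at each concentration
    control_signal -- RT readout of the no-inhibitor control
    Returns None if replication never crosses 50% of control.
    """
    # Express each readout as a fraction of the untreated control.
    fractions = [signal / control_signal for signal in rt_signal]
    pairs = sorted(zip(concentrations, fractions))
    for (c_low, f_low), (c_high, f_high) in zip(pairs, pairs[1:]):
        # Find the first interval in which replication falls through 50%.
        if f_low >= 0.5 >= f_high and f_low != f_high:
            t = (f_low - 0.5) / (f_low - f_high)
            log_ic50 = math.log10(c_low) + t * (math.log10(c_high) - math.log10(c_low))
            return 10 ** log_ic50
    return None
```

With hypothetical readouts of 95%, 75%, 25%, and 5% of control at 0.01, 0.1, 1, and 10 μM, the 50% crossing lies midway (on a log scale) between 0.1 and 1 μM. A four-parameter logistic fit would be a more robust alternative when the curve is well sampled.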

Aggregation Assays
Assays were performed as previously described [7], with the following modification: turbidity assays were performed using a VICTOR3V 1420 multilabel counter (PerkinElmer, Waltham, Massachusetts, US) by measuring the absorbance of the reaction solution at 405 nm. Final reaction conditions were 20 mM HEPES, pH 7.3, 375-505 mM NaCl, 3.75-5.05 mM CHAPS, 10 mM DTT, and 10 μM Zn(OAc)2, with inhibitor concentrations ranging from 0.08 μM to 88 μM at 24-27˚C. For the graphical representation of aggregation, the baseline (IN stock buffer + drug buffer) has been subtracted from the results.
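The baseline subtraction can be sketched as follows. This minimal example assumes the baseline is a single A405 reading from a blank well containing IN stock buffer plus drug buffer with no protein; the function name and the readings in the usage note are hypothetical.

```python
def baseline_corrected(a405_readings, blank_a405):
    """Subtract the buffer-only blank from each turbidity reading.

    a405_readings -- absorbance at 405 nm for each inhibitor concentration
    blank_a405    -- absorbance of the blank (IN stock buffer + drug buffer);
                     treating this as one well is an ASSUMPTION
    """
    return [reading - blank_a405 for reading in a405_readings]
```

For instance, hypothetical readings of 0.05, 0.12, and 0.40 with a blank of 0.04 would be plotted as approximately 0.01, 0.08, and 0.36.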

SEC-MALS
Absolute molecular weights were determined by multiangle light scattering coupled with refractive interferometric detection (Wyatt Technology, Santa Barbara, California, US) and a Superdex 200 10/300 GL column (GE Healthcare) at room temperature as previously described [48].

Sedimentation Equilibrium Analysis
Sedimentation equilibrium AUC experiments were performed at 4˚C with an XL-A analytical ultracentrifuge (Beckman-Coulter, Brea, California, US) and an An-60 Ti rotor with two-channel charcoal-filled Epon centerpieces and quartz windows. Data were collected with detection at 280 nm for 5, 7.5, and 10 μM samples. Linear analyses were performed by plotting the natural log of absorbance versus the square of the radius, with the slope being proportional to Mw. Single-species plots with slopes calculated for idealized oligomers at a given speed were also generated for comparison.
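The linear analysis rests on the standard relation for an ideal single species at sedimentation equilibrium, d ln(A)/d(r²) = M(1 − v̄ρ)ω²/(2RT). The following minimal sketch fits that slope by least squares and converts it to a molecular weight; it also computes the expected slope for an idealized oligomer. The partial specific volume, solvent density, and rotor speed used in the usage note are hypothetical placeholders, not values from this study.

```python
import math

R_GAS = 8.314e7  # gas constant in erg/(mol*K), consistent with radii in cm


def fit_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def ideal_slope(mw, rpm, temp_k, vbar, rho):
    """Expected d ln(A) / d(r^2) in cm^-2 for an ideal species of mass mw (g/mol)."""
    omega = rpm * 2.0 * math.pi / 60.0           # angular velocity, rad/s
    return mw * (1.0 - vbar * rho) * omega ** 2 / (2.0 * R_GAS * temp_k)


def molecular_weight(radii_cm, absorbance, rpm, temp_k, vbar, rho):
    """Apparent molecular weight (g/mol) from ln(A) versus r^2."""
    omega = rpm * 2.0 * math.pi / 60.0
    slope = fit_slope([r ** 2 for r in radii_cm],
                      [math.log(a) for a in absorbance])
    return 2.0 * R_GAS * temp_k * slope / ((1.0 - vbar * rho) * omega ** 2)
```

Calling `ideal_slope` with n times the monomer mass gives the reference line for an idealized n-mer at a given speed, against which the fitted slope from `molecular_weight` can be compared. Temperature would be 277.15 K for data collected at 4˚C, and v̄ and ρ would need to be computed from the actual protein sequence and buffer composition.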

Supporting Information
S1 Data. Excel spreadsheet containing, in separate sheets, the underlying numerical data for figure panels Fig 1C, Fig 4B-4H, Fig 5B-5D, Fig 7A, S1C-S1F Fig, S3B-S3D Fig, and S5A [48]. While the W131D mutation in this background retains LEDGF binding, mostly LEDGF-bound dimers are observed, consistent with the role of the residue at the CTD-catalytic core domain interface and the model presented in