The Influence of N-Linked Glycans on the Molecular Dynamics of the HIV-1 gp120 V3 Loop

N-linked glycans attached to specific amino acids of the gp120 envelope trimer of a HIV virion can modulate the binding affinity of gp120 to CD4, influence coreceptor tropism, and play an important role in neutralising antibody responses. Because of the challenges associated with crystallising fully glycosylated proteins, most structural investigations have focused on describing the features of a non-glycosylated HIV-1 gp120 protein. Here, we use a computational approach to determine the influence of N-linked glycans on the dynamics of the HIV-1 gp120 protein and, in particular, the V3 loop. We compare the conformational dynamics of a non-glycosylated gp120 structure to that of two glycosylated gp120 structures, one with a single, and a second with five, covalently linked high-mannose glycans. Our findings provide a clear illustration of the significant effect that N-linked glycosylation has on the temporal and spatial properties of the underlying protein structure. We find that glycans surrounding the V3 loop modulate its dynamics, conferring to the loop a marked propensity towards a more narrow conformation relative to its non-glycosylated counterpart. The conformational effect on the V3 loop provides further support for the suggestion that N-linked glycosylation plays a role in determining HIV-1 coreceptor tropism.


Introduction
The HIV-1 envelope (Env) glycoprotein plays a critical role in the recognition, binding and entry of the virus to a new host cell [1]. Upon recognition of the host target cell, the surface subunit gp120 binds to the CD4 receptor and initiates a series of conformational changes in the gp120 trimer that enables subsequent binding to a chemokine coreceptor (CCR5 or CXCR4), fusion of the HIV and host cellular membranes, and entry into the cell. The HIV-1 envelope trimer, which consists of three gp120-gp41 heterodimers, is heavily glycosylated [2][3][4][5] and contains more than 75 potential N-linked glycosylation sites [6]. The glycans are covalently linked to specific asparagine residues of the envelope precursor protein gp160 by the host cell machinery and are important for the stabilisation and correct folding of gp120 [7,8]. Studies have shown that N-linked glycans bound to the gp120 trimer form a ''glycan shield'' protecting the virus from neutralisation by the host immune response [9][10][11][12][13][14][15][16]. More recently, a number of studies have isolated broadly cross neutralising (BCN) antibodies from HIV-infected individuals, the activity of which appears to be highly dependent on the presence of glycosylation at a number of positions on the gp120 trimer, particularly at position 332 (HXB2 numbering) [17][18][19][20].
Glycans have also been shown to increase the binding affinity of gp120 for CD4, and the extent of glycosylation has been linked to cellular coreceptor tropism [21][22][23][24]. Since viral coreceptor tropism is associated with cell selection and disease progression, substantial research has been undertaken to understand the differences between viruses that preferentially use one, or both, of the common chemokine coreceptors, CCR5 and CXCR4 [25][26][27][28][29][30]. Several regions of gp120 may influence coreceptor usage [31][32][33][34][35][36][37][38][39], but features of the gp120 V3 loop in particular are recognised as the key factors determining viral tropism [21,22,[35][36][37][38][39][40][41][42][43][44]. These factors include the net charge [21,45], mutations at particular positions along the 35 amino acids of the V3 loop region [40,42,44], and the presence or absence of an N-linked glycosylation site [22,[41][42][43]. The net charge and mutation profile at specific sites of the V3 loop are commonly used in coreceptor tropism prediction tools [44,[46][47][48], however the role of the glycan profile in determining tropism is less well understood. One of the reasons for this is that the glycans of glycoproteins are challenging structures to crystallise, and are therefore often excluded prior to structural investigations. The heterogeneity in carbohydrate composition and position, as well as their flexibility, inhibits crystallisation, yet many glycoproteins are dependent on these carbohydrates for folding into their native conformations [49,50]. Most of the currently available HIV-1 Env-related crystal structures are of a de-glycosylated gp120 monomer [51][52][53], with recent studies describing a cryo-EM structure of the unliganded trimeric Env precursor [54] as well as a partially glycosylated gp120 in complex with antibodies and CD4 [55].
Molecular dynamics (MD) simulations can be used to complement experiment-based structural studies by providing insight into spatial and temporal properties under physiological conditions [56,57]. MD simulations are particularly suited to systems that are experimentally underdetermined or poorly resolved, such as glycans and glycoproteins [58], and can provide valuable insight into the structure-function relationship of large biomolecular systems. Two recent studies used MD simulations to describe the conformational properties of gp120 and the gp120 V3 loop [59,60], but did not address N-linked glycosylation directly.
Yokoyama et al. [59] focused on the potential role of the net charge of the V3 loop, where one structure with a net charge of +7 and the other with a net charge of +3 was investigated. The MD simulations of these gp120 structures led the authors to conclude that the net charge of the V3 loop affects the structural dynamics of the entire gp120 outer domain by modulating its electrostatic potential [59]. In a separate study, MD simulations were carried out on only the V3 loop regions of crystallised structures originating from two viruses, one a CXCR4-using and the other a CCR5-using virus [60]. The conclusions drawn therefore relate to the coreceptor-specific dynamics of the V3 loop and the authors suggest that an open V3 loop conformation supports X4-tropic and a thinner conformation supports R5-tropic viruses.
Using MD simulations we can investigate the degree to which glycans affect the structural fluctuations of the gp120 protein. In this study we used molecular dynamics simulations to compare glycosylated and non-glycosylated HIV-1 gp120 structures to determine the influence glycans have on the mobility of the V3 loop of the glycoprotein.

Data Preparation
The crystal structure of the HIV-1 gp120 core including the V3 loop (PDB ID 2B4C [52]) was used as the starting structure for the simulations. The published protein structure was crystallised in a complex with CD4 and the X4 antibody; both of which were removed. In order to allow direct comparison of our work with that of Yokoyama et al. [59], who used the crystal structure 2B4C as template, Gly 25 of the V3-loop (protein residue 322 of HXB2) was mutated to arginine. The resulting G25R mutant had a net charge of +3. During the pre-processing steps, both the Nterminus (ACE; -COCH 3 ) and the C-terminus (NME; -NHCH-3 ) were capped to avoid introducing charges at these positions. This non-glycosylated structure was used as control. For the glycosylated structures, an online Glycoprotein Builder (http://www. glycam.com), which is part of the GLYCAM Web-tools suite [12], was used to build and covalently link a 3D structure of the highmannose glycan Man-9 (a-Manp- 9 GlcNAc 2 ) to specific N-glycosylation sites. Any steric clashes between glycans or between a glycan and the protein were resolved manually by rotating the phi (W), psi (Y), chi (X) torsions with the UCSF Chimera package [61]. A mono-glycosylated structure was prepared with a single Man-9 glycan linked at position 295 (glycosylated 295 ), which is located at the border of the V3 loop. This variant was created to determine the extent to which proximity of N-linked glycans to the V3 loop was enough to affect the structure or dynamic properties of the loop. A second glycosylated variant (glycosylated 5-glycans ) was also prepared, in which glycans were linked to five key asparagine residues (295, 332, 339, 392, and 448) surrounding the V3 loop [62][63][64][65][66][67][68]. These positions were identified in terms of their relevance to coreceptor, CD4, or neutralising antibody binding [62][63][64][65][66][67][68].

System Preparation
We used the tLEaP module of AMBER 10 [69] with the AMBER ff99SB [70] force field for the protein, and GLY-CAM06g [71] force field for the carbohydrates, to generate the coordinate and topology files for both the glycosylated and the non-glycosylated proteins. The topology files were converted for use with the glycam2gmxRB.pl Perl script, kindly provided by Dr. Marko Wehle, which enabled a mixed 1-4 non-bonded scaling scheme to be used for the protein and carbohydrate force fields, as recommended. [72][73][74]. GROMACS v4.07 [75,76] was used for setting up the system, as well as for all simulations. The nonglycosylated structure and glycosylated 5-glycans structure were immersed in a cubic water box of 16 nm 3 , whereas a cubic box of 14 nm -3 was defined for the glycosylated 295 structure (the smaller size was verified to be large enough for the protein to remain immersed throughout the simulation). This solvation gave rise to system sizes of 407880 atoms (non-glycosylated), 407934 atoms (glycosylated 5-glycans ), and 273457 atoms (glycosylated 295 ). The TIP3P explicit water model [77] was used for all simulations. Chloride ions were added to neutralise each system, and all electrostatic interactions were calculated within the Particle Mesh Ewald (PME) approach [78]. Constraints on hydrogen-containing bonds were imposed with the LINCS algorithm.

Simulations
Energy minimisation was performed on all systems for 100,000 steps using the steepest decent algorithm prior to the MD simulations. All simulations were performed at 300 K. A stepwise protocol was employed for equilibration, beginning with a simulation under constant volume (NVT) conditions for 500 ps followed by switching to constant pressure (NPT) conditions at 1 atm for a further 500 ps. During these steps the non-hydrogen atoms were restrained with a force constant of 1000 kJ/mol/nm 2 . Thereafter, the restraints were removed and the systems equilibrated for a further 500 ps. A post-equilibration trajectory of 30 ns with a time step of 1 fs was collected for each system. Three further simulations of 10 ns each for the non-glycosylated and glycosylated 5-glycans structures were initiated from the 30 ns trajectories. Starting structures were selected from snapshots taken at between 10 and 30 ns intervals, and these uncorrelated simulations were used to demonstrate data reproducibility.

Analysis
In order to gain a better understanding of the differences in the conformational propensity of the glycosylated and non-glycosylated gp120 proteins, the root mean square deviation (RMSD) between the C-alpha atoms for each of the glycosylated and nonglycosylated structures, relative to their corresponding initial minimised structures was calculated using the g_rms function in GROMACS [75,76]. Using g_rmsf we also calculated the root mean square fluctuation (RMSF) for each C-alpha atom, where the average structure generated over the 5 -30 ns range of the of the post-equilibration trajectory, was used as reference. The distance between the centers of mass for different C-alpha atom groups was calculated with the g_dist function. The distributions of the distances between C-alpha atoms at the base of the V3-loop (including the C-alpha atoms of amino acids 296-300 and 326-331, HXB2 numbering) and at the tip of the loop (308-319, HXB2 numbering), as well as between the two sides of the V3 loop (296-313 and 314-331, HXB2 numbering) for each trajectory were determined. Apart from the distance between the commonly defined regions of the base and tip [79], we also carried out an analysis on additional regions of the base (positions 296, 297 and 331) and tip (positions 312 -315) sections of the V3 loop, which included the two cysteine C-alpha atoms as well one neighbouring C-alpha atom at the base (three being the minimum that constitutes a group for this calculation) and the C-alpha atoms of the GPGR tip region. This allowed for the further investigation into the movement of the extremes of the V3 loop. Principal component analysis (PCA) was carried out using GROMACS [75,76] with the g_covar and g_anaeig functions. General statistical calculations were carried out in R [80]. Visual inspections were carried out using the VMD [81] and the USCF Chimera packages [61].

Results
The molecular dynamic simulations we performed in this study allow us to illustrate the effect of N-linked glycosylation on the dynamics of the HIV-1 gp120 protein with specific reference to the V3 loop. A non-glycosylated structure ( Figure 1A), a glycosylated structure with five glycans (glycosylated 5-glycans , Figure 1B) and a glycosylated structure with a single glycan (glycosylated 295 , Figure  1C) were used for the comparisons. Given that the only difference between the non-glycosylated and glycosylated gp120 structures were the presence or absence of N-linked glycans bound to the surface of the protein, we expect that any conformational differences observed between the various structures are as a result of the presence or absence of the glycans.
In order to gain a better understanding of the collective dynamics of each individual structure, we plotted the RMSD values for the C-alpha atoms of all structures across each 30 ns trajectory ( Figure S1). The RMSD plots illustrate the extent of movement of each structure away from the initial corresponding minimised starting structure at each time point on the trajectory.
The RMSD analyses provide an indication of the overall conformational dynamics of the system and Figure S1 indicates that for each system, the conformational equilibrium for the protein backbone is reached after approximately 5 ns. The subsequent analyses were carried out on the MD trajectories from 5 to 30 ns.
The RMSF profiles ( Figure 2) illustrate the C-alpha atom specific fluctuations for each MD simulation (averaging over 5-30 ns for each simulation). The plots show that the loop regions are generally more dynamic than other more structured regions of the protein (Figure 2). The C-alpha atom RMSF distribution for the non-glycosylated MD simulation was significantly greater than that of the glycosylated 5-glycans -simulations (p , 0.05; one-sided Wilcoxon rank-sum test), indicating that glycosylation dampens the dynamics of the protein backbone. That glycosylation leads to increased fold stability has been inferred experimentally [82,83] and predicted theoretically [83,84], and is a generally accepted phenomena. The C-alpha RMSF distribution of the nonglycosylated simulation was also significantly greater than that of the glycosylated 295 --system (p , 0.05; one-sided Wilcoxon ranksum test), however no significant difference was found for the comparison between the two glycosylated, glycosylated 295 and glycosylated 5-glycans -, systems. We also tested a further hypothesis that the effect of N-linked glycans on the protein conformational dynamics may be additive, and that the influence of a given glycan may affect only the regions surrounding the N-linked glycosylation site. In this hypothesis, the movement of the regions surrounding the glycans of the glycosylated 5-glycans structure would be influenced more than that of the equivalent regions of the glycosylated 295 -structure, but the region immediately adjacent to the N-linked glycan at position 295 will be affected in the same way for each structure. In order to test this, we compared the RMSF values for C-alpha atoms surrounding position 295 for the non-glycosylated, glycosylated 295 -and glycosylated 5-glycans structures. The distribution of the Calpha RMSF values 10 and 20 residues up-and downstream were compared using the Wilcoxon rank sum test. No significant differences were found (p . 0.05). We also compared the distribution of RMSF values for the V3 loop of all three structures and again, no significant differences were observed (p . 0.05).
In order to establish whether the dynamics of the V3 loop depends on the glycan profile, we calculated the distribution of the average distance between the centers of mass of the base and the tip of the V3 loop for each system (Figures 3A and 3B). In Figure 3A, the focus was on the commonly defined regions of the base (including the C-alpha atoms of amino acids 296-300 and 326-331, HXB2 numbering) and the tip (308-319, HXB2 numbering) of the V3 loop [79]. We found a significant difference (p , 0.001; Wilcoxon rank sum test) for all comparisons ( Figure 3A). The average distance between the base and tip for the glycosylated 295 V3 loop was significantly larger than for the non-glycosylated and glycosylated 5-glycans -V3 loops ( Figure 3A), the latter two displaying similar distances. In the second analysis ( Figure 3B) we determined the average distance between the centers of mass of the extreme groups of the V3 loop base (positions 296, 297 and 331) and tip (positions 312 -315). Similarly, in this case we found significant differences for all comparisons of the non-glycosylated, glycosylated 295 , and glycosylated 5-glycans structures (p , 0.001; Wilcoxon rank sum test). However, although the average distances for the glycosylated 295 structure were still the largest, in this case the distances for the glycosylated 5-glycans were visibly shorter than for either of the other two structures ( Figure 3B). The comparisons between the distance distributions for the extreme measurements ( Figure 3B) may therefore offer a more sensitive test for describing the movement of the V3 loop tip in relation to the base, but it does not reveal the direction of the movement. Since the first comparison ( Figure 3A) included amino acids adjacent to the stem of the V3 loop, it is possible that the movement of the stem in relation to the base and tip influenced these measurements. The stem residues on either side of the loop may move in the same direction causing a bend in the V3 loop, or they may move in the opposite directions forming a bulge in the loop. Both these scenarios could result in a shortening between the distance of the base and tip of the V3 loop, and this effect would be more amplified when focusing on the extremes of the base and tip.
To probe variations in loop topology further, we compared the distribution of the average distance between the centers of mass of the two sides of the V3 loop (residues 296-313 and 314-331, HXB2 numbering) for all systems ( Figure 3C). The difference between the distance distributions for the all the combinations of comparisons between the non-glycosylated, glycosylated 5-glycans --and glycosylated 295 was highly significant (p , 0.001; Wilcoxon rank sum test; Figure 3C). However, the average side-to-side distance for the non-glycosylated loop structure was larger than seen in either of the glycosylated forms ( Figure 3C). These results, together with the observations for the comparison of the distances between the base and tip, suggest that the glycosylated 295 structure takes on a narrow elongated shape, while the glycosylated 5-glycans takes on a narrow but bent shape (given the shorter average basetip distance). A visual representation of the average shape of the V3 loop for each trajectory also supported this hypothesis ( Figure  3D, using 5-30 ns). The V3 loop of the non-glycosylated structure had a much broader conformation with the two sides of the loop forming a bulge, whereas the glycosylated structures on the other hand showed a preference for a more narrow conformation ( Figure 3D). However, the average shape does not necessarily provide a meaningful representation of the underlying dynamics of the system. Furthermore, although the distributions of distances provide an indication of the movements across the trajectories, these distances include the high frequency movements that contribute to the background noise.
Principal component analysis (PCA) was carried out to describe the predominant movement of the V3 loop for each of the three MD simulations. PCA produces several principal components that describe a proportion of the system variance. For each of our structures, the first principal component (PC1) described 41%, 47% and 37% of the motion of the V3 loop for the non-  (Table 1). We found that the first three principal components incorporated the majority of the movement for each system, and collectively describe 63% (non-glycosylated), 73% (glycosylated 5-glycans ) and 70% (glycosylated 295 ) of the total motion of each system (Table 1). In order to visualise the variance described by each principal component, we plotted the extremes, and intermediate, positions of each motion ( Figure 4, and Figures  S2 and S3). These representation of each of the three principal components were generally consistent with the results from the distance calculation where the V3 loop of the non-glycosylated structure takes on an open, broader, shape ( Figure 4B) and the V3 loop of the glycosylated 5-glycansstructure has a more narrow and bent shape ( Figure 4C). The V3 loop in the glycosylated 295 structure also takes on a more narrow shape, but appears to fluctuate between elongated and bent conformations ( Figure 4D).
This may indicate that the additional glycans in the glycosylated 5glycans form may be stabilising the bent topology.
In order to confirm that the results obtained from the analysis of the 30 ns trajectories were not derived from the systems populating local minima, we repeated the V3 loop analyses on three additional 10 ns simulations (started from uncorrelated snapshots) for each of the non-glycosylated and glycosylated 5glycans systems. Apart from one calculation, the comparisons of the distributions of the average distance between the centers of mass of the base and the tip (both for the defined regions, Figure  S4A, as well as the extremes, Figure S4B) of the V3 loop were highly significant (p , 0.001; Wilcoxon rank sum test). The comparisons between the distributions of the average distance between the centers of mass of the two sides of the V3 loop (residues 296-313 and 314-331, HXB2 numbering) for each trajectory, was highly significant ( Figure S4C, p , 0.001; Wilcoxon rank sum test). These results, together with the representation of the average structures for each of the 10 ns simulations ( Figure S4D), are in agreement with the initial findings (30 ns trajectory results) where the glycosylated 5-glycans V3 loops is more narrow than the non-glycosylated form.
The PCA of the uncorrelated snapshots MD simulations showed that the first three principal components described between 65% and 81% of the variance for each system (Table  S1). The images representing the extremes, and intermediate, positions of each motion ( Figures S5, S6 and S7) followed the same trend as seen with the PCA based on the 30 ns trajectories of the non-glycosylated and glycosylated 5-glycans structures. Indeed, the V3 loop of the non-glycosylated system favours a much broader shape ( Figures S5B, S6B and S7B) compared to the glycosylated 5-glycans-  structures, which appear to prefer narrower and bent loops ( Figures  S5C, S6C and S7C).

Discussion
In this study we have investigated the impact of N-linked glycans on the dynamics of the HIV-1 gp120 protein with specific reference to the V3 loop. The dynamic properties of HIV-1 envelope proteins play an important role in the successful binding and transmission of a virion to a host cell [85,86]. Two recent studies have used molecular dynamic simulations to describe the movement of gp120 and the gp120 V3 loop [59,60], however the influence of glycans on the underlying movement was not explicitly addressed. Such structural studies are important to further our understanding of the dynamic interaction between HIV-1 gp120, the CD4 receptor and coreceptors. They also allow us to associate the observed dynamics with known sequence and chemical features of the V3 loop that play a role in determining coreceptor usage. These features are commonly used in coreceptor prediction tools and include specific amino acid mutations at particular positions of the V3 loop as well as the net charge [21,22,[41][42][43][44]47,48,87,88]. The presence or absence of an N-linked glycosylation site within the V3 loop has also been reported to influence coreceptor tropism [22,43], but little is known about the mode of this action. Here we used molecular dynamic simulations to describe the dynamics of the V3 loop with and without taking the potential influence of N-linked glycans into account.
Our findings suggest that the presence of N-linked glycans has a significant impact on the dynamics of the gp120 protein particularly on the movement and preferred conformation of the V3 loop. Differences in both the length of the loop, as well as its side-to-side separation, were observed as a function of glycosylation pattern. The average distance between the base and the tip of the V3 loop for all three structures provided a first look at the potential influence of a given glycan profile, and we observed a significant difference for all combinations of comparisons between the non-glycosylated, glycosylated 295 and glycosylated 5-glycans structures. However, the clear difference in distances between the two sides of the V3 loop illustrated the considerable extent to which glycans may be affecting the conformation of the loop. The V3 loop of the glycosylated structures took on a significantly more narrow formation compared to that of the non-glycosylated form. The side-to-side distances together with the base-tip distances suggested that the V3 loop of the non-glycosylated structure took on an open, broader, shape, the V3 loop of the glycosylated 5-glycans structure was more narrow and bent, and the V3 loop of the glycosylated 295 structure was narrow but more elongated ( Figure   3D). These observations were further substantiated by the results for the additional non-glycosylated and glycosylated 5-glycans examples (three uncorrelated simulations of 10ns each, Figure  S4D) where the same trend was seen.
The results from the principal component analysis further show a clear difference between the coordinate space most commonly occupied by the non-glycosylated, glycosylated 295 and glycosylated 5-glycans systems ( Figure 4, and Figures S2 and S3). The glycans appear to constrain the breadth of movement of the V3 loop, with the tip of the loop, in particular, appearing to move actively away from the glycans. Where a single glycan adjacent to the V3 loop was present (glycosylated 295 ) this observation was less distinct than for the structure with five glycans attached (glycosylated 5-glycans ). For the glycosylated 295 structure the V3 loop fluctuated between an elongated and bent shape, and therefore the tip appeared to move away from the glycan intermittently ( Figure 4D). In contrast, the V3 loop for the glycosylated 5-glycans structure remained in a more bent conformation with the tip moving away from the glycans consistently ( Figure 4C). The significant difference observed between the RMSF distribution of the C-alpha atoms for the entire gp120 protein for the non-glycosylated and glycosylated structures (Figure 2) may suggests that the glycan surface density plays a role in the fluctuation of nearby atoms and residues. In this case, steric hindrance imposed by the N-linked glycans, or the additional weight contribution from the N-linked glycans, could affect the surrounding protein structure. If this is the case then the type and size of each glycan could further influence the fluctuation of the protein regions neighbouring N-linked glycosylation sites. However, we did not observe a significant difference between the fluctuations of the C-alpha atoms bordering the N-linked glycosylation site at position 295 for the glycosylated 295 and the non-glycosylated forms, which may suggest that not all protein regions are as susceptible to a change in dynamics as loop regions may be. This may also reflect the relatively limited sampling of backbone conformations accessible to a 30 ns simulation. A previous study by Yokoyama et al. [59] also remarked that no significant change was observed in their MD simulations when a high mannose glycan was present in the V3 loop compared to when the glycan was absent, although no reference was given to the position of the added glycan. Future investigations will need to be carried out to determine whether any single glycan attached at an N-linked glycosylation site has a significant impact on the mobility of the protein.
The underlying amino acid sequences of the protein structures we used were identical and only the glycan profile differed between examples. [60]. The observation that the non-glycosylated example of the V3 loop takes on an open, broader form and the glycosylated forms thinner, elongated/bent shapes can therefore only be associated with the extent of glycosylation and not tropism. This is in contrast to a previous study where the authors used MD simulations to characterise the movement of the V3 loop regions of crystallised structures originating from two viruses, one a CXCR4using and the other a CCR5-using virus [60]. In their study, where the V3 loop sequence for each structure was unique, the authors suggested that an open V3 loop conformation supports X4-tropic and a thinner conformation supports R5-tropic viruses [60]. We suggest, therefore, that both the underlying amino acid sequence as well as the glycosylation profile impacts the molecular dynamics of the gp120 protein and that both factors must be considered in future structural analyses.
One of the major obstacles related to HIV-1 gp120 structural investigations relates to the availability of high-resolution crystal structures that include the variable loops. Here, we have used the 2B4C gp120 crystal structure, which includes the V3 loop, to enable comparison with previous reports that used the same structure for MD simulations [59,60]. Furthermore, previous studies have shown that the functions of the gp120 core are somewhat independent of the variable loops in that, regardless of the underlying amino acid sequence, it retains its fundamental structural characteristics and ligand binding capacity despite the absence of the variable loops [89,90]. Therefore we expect that the core structure derived from the crystal will have a minor influence on the overall outcome of the molecular dynamic simulations relative to that of the underlying amino acid sequence and glycan composition. The loop regions of a protein are also highly dynamic [91] and therefore difficult to crystallise [92], and it is likely that the configuration of the gp120 variable loops in the crystal structure represents only one of the many possible conformations. Molecular dynamic techniques generate multiple ensembles of the variable loops and the ongoing refinement of the force field parameters will continue to improve the accuracy of the methods to describe the conformational properties of proteins [93][94][95].
A recent study describes the differences in the crystal structures of the CCR5 and CXCR4 chemokine receptors and suggests that the steric hindrances due to amino acid substitutions may present one of the major factors involved in HIV-1 coreceptor selection [96]. Given the results from our study, we hypothesise that the steric hindrances imposed by N-linked glycans may present a further determinant of coreceptor usage. While the focus of this study was not specifically to identify the structural determinants of coreceptor usage, our results suggest that future studies should focus on exploring the interplay between the underlying amino acid sequence and the glycan composition in determining coreceptor tropism.
The importance of N-linked glycosylation of gp120 is not only evident in coreceptor studies, but they also play a role in HIV-1 antibody neutralisation and escape [9]. Because the intracellular glycosylation process, which adds the glycans to proteins, is that of the host, the immune system does not recognise the viral glycans as non-self, thereby allowing the virus to escape antibody driven responses [97]. It has also been shown that N-linked glycans bound to the surface of HIV-1 gp120 form part of several important neutralising antibody epitopes and the removal of even a single glycan can completely abolish a neutralising antibody response [17,19,20]. Our initial glycoprotein modelling results allude to not only a protein-glycan interaction, but also a glycanglycan interaction, in such that the movement and influence of a single glycan is affected by the presence and mobility of a second glycan at an adjacent N-linked glycosylation site. This further underlines the importance of broadening our understanding of the influence that N-linked glycans have on the various regions of HIV-1 gp120.
Our molecular dynamics results clearly illustrate the importance of taking the glycan profile of HIV-1 gp120 into account during structural investigations. Describing the temporal and spatial constraints affecting the dynamics of different regions of gp120 may provide key insight into the structural features that drive HIV evolution and allow the virus to switch between using the CXCR4 and CCR5 coreceptor for cell entry. Furthermore, these analyses will add to our understanding of how we can take advantage of these protective and adaptive characteristics to guide the design of carbohydrate-binding HIV therapies. Figure S1 RMSF values for the C-alpha atoms. These values represent the entire gp120 for the non-glycosylated, glycosylated 5-glycans , and glycosylated 295 trajectories. The shaded areas represent the part of the trajectory that was discarded as burn-in.  S1 Percentage motion included in each of the first three, and the total sum of the first three, principal components (PCs) for each of the uncorrelated repeats (10 ns). (PDF)

Author Contributions
Conceived and designed the experiments: SAT EF NTW. Performed the experiments: NTW EF OCG JCM. Analyzed the data: NTW. Contributed reagents/materials/analysis tools: RD EF OCG RJW. Wrote the paper: NTW SAT.