Fig 1.
Map of gp120 variable domains and diagram of the V1V2 peptide construct considered in this study.
(A) Cartoon diagram showing the variable regions of gp120 (V1–V5), the disulfide bonds, and the structural region corresponding to the peptide construct in the context of gp120. Blue: hyper-variable regions; red: disulfide bonds; green: V1V2 peptide region. This cartoon is modified from the original Leonard reference [49]. (B) The peptide construct (green) contains 6 amino acids from V1 and 33 amino acids from V2 connected by a disulfide bond (red). The sequence corresponds to the CAP45 strain. (C) The sequence variability of the regions encompassing the peptide construct among HIV-1 isolates, shown as a web logo (http://www.hiv.lanl.gov/content/sequence/ANALYZEALIGN/analyze_align.html). All residues are numbered according to the HXB2 reference sequence.
Fig 2.
A three-dimensional image of the glycosylated peptide construct from the CAP45 strain.
(A) A ribbon diagram shows the gp120 V1 fragment in green and the V2 fragment in red. A disulfide bond is shown in yellow, and the carbohydrates corresponding to sites N156 and N160 are shown as white sticks. (B) The chemical composition of the high-mannose glycan, (Man)5(GlcNAc)2, is shown.
Table 1.
Systems studied by MD simulations.
Fig 3.
The relationship between loop lengths and number of glycosylation sites in the gp120 variable regions V1-V5.
These plots are based on the Los Alamos Database (www.hiv.lanl.gov) alignment, which contains 4,633 curated HIV-1 Env sequences. This alignment contains intact, full-length Env sequences and includes only one sequence per sampled individual. Sequences of poor quality (frameshifts, ambiguity codes or inappropriate stop codons) were excluded from the alignment. The relative width of each box plot is proportional to the square root of the number of sequences in this set that have a given number of potential N-linked glycosylation sites. These sites have the sequence pattern NX[ST], where N is asparagine, X is any amino acid except proline, and the third position is either serine or threonine. Also shown is the epitope region, spanning HXB2 positions 152–184, from the V1V2 peptide construct. For the hyper-variable loops V1, V2, V4, and V5, a p-value < 2.2e-16 was estimated for the correlation between length and number of N-linked glycosylation sites using Kendall's tau statistic (estimated using the R statistical package; the p-value is not exact because of ties and the large sample size). The Variable Length Characteristics tool was used to evaluate these regions, and the full loop regions were included in the analysis (V1, HXB2 positions 131–157; V2, 158–196; V3, 296–331; V4, 385–418; and V5, 460–469).
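The NX[ST] pattern described in this caption can be illustrated with a short sketch. It is not the Los Alamos tool's implementation: the function names and the toy sequences in the usage note are hypothetical, and gap handling (stripping "-" before matching) is an assumption.

```python
import re

# N, then any residue except proline, then serine or threonine.
# The lookahead lets overlapping motifs (e.g. "NNST") each be counted.
SEQUON = re.compile(r"N(?=[^P][ST])")


def count_sequons(seq: str) -> int:
    """Count potential N-linked glycosylation sites in one sequence.

    Assumption: alignment gaps are written as '-' and are removed first,
    so the motif is matched on the ungapped residues.
    """
    ungapped = seq.replace("-", "").upper()
    return len(SEQUON.findall(ungapped))


def loop_length(seq: str) -> int:
    """Length of a loop region, ignoring alignment gaps."""
    return len(seq.replace("-", ""))
```

For example, `count_sequons("ANCTK")` counts one site, `count_sequons("NPSA")` counts none because proline occupies the X position, and `loop_length("AN-CTK")` is 5. Pairs of (length, site count) computed this way per sequence are the kind of input a rank correlation such as Kendall's tau would take.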
Fig 4.
Net charge distribution of the variable regions.
Charge variability in the V1-V5 regions and the V2 epitope region is shown, using the same input data as in Fig 3. The V3 loop and the V2 epitope region, despite being conserved in terms of length and number of glycosylation sites (Fig 3), both show a great deal of variation in net charge, comparable to the level of diversity found in the hyper-variable regions. Net charge is calculated as the sum of positive and negative charges, where amino acid residues E and D are assigned -1, and K, R, and H are assigned +1. The V2 epitope region, V3, and V2 tend to be positively charged; V1, V4, and V5 tend to be negatively charged.
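The net charge rule stated in this caption can be written as a minimal sketch. The function name and the example strings are hypothetical; only the per-residue charges come from the caption.

```python
# Charges as assigned in the caption: E and D are -1; K, R, and H are +1.
RESIDUE_CHARGE = {"E": -1, "D": -1, "K": 1, "R": 1, "H": 1}


def net_charge(seq: str) -> int:
    """Sum the assigned charges over a sequence.

    Residues not in the table, and gap characters, contribute 0.
    """
    return sum(RESIDUE_CHARGE.get(res, 0) for res in seq.upper())
```

With this rule, `net_charge("KRDE")` is 0 and `net_charge("KKH")` is 3.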
Fig 5.
Six out of the eight conserved cysteines located at the base of HIV-1 variable loops V1-V4 are very frequently proximal to an N-linked glycosylation site.
Across the 4,633 sequences in the filtered alignment from the 2014 database, there are on average 24 Cys per gp160: 8 that close the variable loops V1-V4 and 16 others. Of the occurrences of the 8 conserved Cys that form disulfide bonds at the base of V1-V4, 21,837 (59%) immediately neighbor an N-linked glycosylation site. These are concentrated in positions Cys131, Cys157, Cys196, Cys296, Cys331, and Cys385. Among the 73,354 Cys that are not located at the base of the variable loops, proximal glycans are very rare (1.8%). Of the 6 Cys with conserved proximal glycans in HIV, only the most conserved, Cys157 and Cys196, are also highly conserved in 14 SIVCPZ sequences.
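The 59% figure can be checked with simple arithmetic. This sketch assumes the denominator is 8 loop-base Cys in each of the 4,633 sequences (37,064 in total); the caption does not state the denominator explicitly, and the function name is hypothetical.

```python
def percent_with_proximal_glycan(
    n_with_glycan: int, n_sequences: int, cys_per_sequence: int
) -> float:
    """Percentage of Cys occurrences that neighbor a glycosylation site.

    Assumption: every sequence contributes exactly `cys_per_sequence`
    cysteines to the denominator.
    """
    total_cys = n_sequences * cys_per_sequence
    return 100.0 * n_with_glycan / total_cys


# Values taken from the caption: 21,837 of 8 x 4,633 loop-base Cys.
loop_base_percent = percent_with_proximal_glycan(21837, 4633, 8)
```

Under that assumption the value rounds to 59%, in agreement with the caption.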
Fig 6.
Fraction of residues that form helix, beta-strand or turn structure as a function of increasing temperature.
Both glycosylated (symbols) and unglycosylated (no symbols) cases are shown.
Fig 7.
Free energy landscape of the peptide without (A) and with (B) glycosylation.
The landscape is shown as a function of end-to-end distance and radius of gyration. The contour plots are in units of kBT, and the difference between neighboring lines is 1 kBT.
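A landscape in units of kBT is commonly obtained by Boltzmann inversion of a two-dimensional histogram, F = -ln(P / Pmax). The caption does not say how the landscape here was computed, so the following is only a sketch under that assumption; the function name, the bin width, and the input lists are hypothetical.

```python
import math
from collections import Counter


def free_energy_landscape(end_to_end, radius_of_gyration, bin_width=0.1):
    """Return {(i, j): F} in units of kBT for each populated 2D bin.

    Assumption: F = -ln(P / P_max), so the most populated bin has F = 0.
    Bin indices are floor(value / bin_width) along each coordinate.
    """
    counts = Counter(
        (math.floor(d / bin_width), math.floor(r / bin_width))
        for d, r in zip(end_to_end, radius_of_gyration)
    )
    max_count = max(counts.values())
    # ln(max_count / count) equals -ln(count / max_count)
    return {b: math.log(max_count / c) for b, c in counts.items()}
```

A bin visited a quarter as often as the most populated bin sits at ln 4, roughly 1.4 kBT, which would place it between the first and second contour lines in a plot with 1 kBT spacing.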
Fig 8.
The backbone Ramachandran plot of the residue N156 for the two peptide systems without (A) and with (B) glycosylation is shown.
The polyproline II (1), extended beta (2) and right-handed alpha-helical (3) regions are marked. The contour plots are in units of kBT, and the difference between neighboring lines is 1 kBT.
Fig 9.
Fraction of configurations as a function of number of hydrogen bonds between different components of the system.
(A) Total hydrogen bonding within the peptide, between peptide and solvent, and between glycan and peptide. (B) Hydrogen bonding between the glycans and charged residues. Pep stands for peptide, Sol for solvent water, Gly for glycan, and Charged for the charged residues of the peptide.
Fig 10.
Electrostatic contribution from the glycan to the stability of the peptide at 300 K.
Different contributions are depicted, averaged over the simulation time. (A) Intra- and inter-molecular interactions of the peptide, with or without the glycan. (B, C, D) Coulomb contributions after the addition of the glycan to the peptide.
Fig 11.
Fraction of configurations as a function of inter-glycan distance.
Two representative configurations of shorter and longer inter-glycan distances are shown in (A) and (B). The peptide is shown as a purple ribbon, the glycans are in cyan and red stick representation, and the disulfide bond is shown in yellow. (C) Cumulative fraction of the configurations as depicted in A and B. Clustering is based on glycan-glycan distance.
Fig 12.
Thermal unfolding simulations.
Panels (A) and (B) show the average number of residues in beta-strand as a function of time for the peptide alone and the glycosylated peptide; panels (C) and (D) show the average solvent accessible surface area (SASA) for the two systems as a function of time.
Fig 13.
Secondary structure propensity of the glycosylated peptide.
(A) Fraction of residues in the peptide that are in the beta-strand structure as a function of temperature for the glycosylated (V2g) and unglycosylated (V2) forms of CAP45 V1V2 peptide and the glycosylated (V2g_c) and unglycosylated (V2_cc) forms of ConC V2 peptide. (B) Fraction of residues in the peptide that are in the helix structure as a function of temperature for the glycosylated (V2g) and unglycosylated (V2) forms of CAP45 V1V2 peptide and the glycosylated (V2g_c) and unglycosylated (V2_cc) forms of ConC V2 peptide.
Fig 14.
Effects of glycosylation on V1V2 peptide region in the context of the BG505 Env trimeric spike.
(A) Root mean square fluctuation of the backbone atoms corresponding to residues 129–134 and 152–184 (HXB2 numbering), computed for either the glycosylated (black line) or the non-glycosylated (red line) protein. Error bars were estimated from calculations in each of the independent protomers. (B) Cumulative configurational entropy for the backbone atoms corresponding to the same residues as in panel A. Values were estimated by considering the total entropy from the three protomers. (C) Total interaction energy for the same representative sequence as in B. The energy corresponds to the total value calculated among the three protomers over the 1 μs simulation trajectory. (D) Secondary structure percentage as computed from 1 μs MD simulations of the full Env spike. Four stretches were considered for the analysis, each featuring disulfide bonds and glycosylation sites. Secondary structure percentages are shown for amino acid stretches that contain glycans adjacent to cysteines (HXB2 numbering): 131–157 (analogous to the V1V2 peptide), 385–418 and 296–331. This analysis further shows, in the context of the Env trimer, that glycosylation decreases the amount of alpha-helix, beta-strand, bridge and turn structure in these regions.
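The root mean square fluctuation in panel A follows the standard definition: for each atom, the square root of the time-averaged squared deviation from its mean position. The sketch below assumes the frames have already been superposed on a common reference. The function name and input layout are hypothetical, and it does not reproduce the analysis software used in the study.

```python
import math


def rmsf(frames):
    """Per-atom root mean square fluctuation.

    `frames` is a list of frames; each frame is a list of (x, y, z)
    tuples, one per atom, in the same atom order in every frame.
    Assumption: frames are already aligned to a common reference.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    values = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation from that mean position
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        values.append(math.sqrt(msd))
    return values
```

For two frames in which the first atom moves between x = 0 and x = 2 while the second atom stays fixed, the result is 1.0 for the first atom and 0.0 for the second.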
Fig 15.
Effect of removing the glycosylation sites adjacent to Cys residues in V2 on processing of Env.
(A) Sequence of wt SF162 Env protein. The two Cys-distal glycan motifs at positions 136 and 188 in the V1V2 hyper-variable region are shown in blue, while the two Cys-adjacent glycans at positions 156 and 197 in the semi-conserved V2 and C2 domains are indicated in red. Cys residues 131, 157, and 198 are indicated in green. (B) SDS-PAGE analysis of wt SF162 gp120 and gp120 containing mutated Cys-adjacent glycosylation sites at positions 156 (S158A) and 197 (S199A). The increase in mobility is consistent with the loss of a glycan at these positions, confirming that these sites are in fact utilized. (C) Effect of removing the two Cys-adjacent glycan sites on intracellular processing of SF162 Env. Plasmids that encode wt and mutant SF162 Env proteins were transfected into 293T cells. Forty-eight hours post-transfection, the cells were radiolabeled with 35S-cysteine for 5 hours, and cells were lysed and immunoprecipitated with polyclonal HIV+ antiserum (HIVIG), or mAbs b12 and 5145A directed against conformational epitopes in the CD4-binding domain (top panels), or mAbs 697D, 830A and 1393A directed against conformational epitopes in the V1/V2 domain (bottom panels). Lane 1: wt Env; lane 2: S158A mutant; lane 3: S199A mutant; lane 4: S158A/S199A double mutant. (D) Effect of removing the two Cys-distal glycans on intracellular SF162 Env processing. Cells transfected with wt SF162 Env (lane 1), T138A (V1) mutant (lane 2), S190A (V2) mutant (lane 3) and T138A/S190A double mutant were labeled for 5 hours, and then cells were lysed and labeled Env proteins were immunoprecipitated with mAbs recognizing conformational epitopes in the CD4-binding domain (5145A) or in the V1/V2 domain (697D). Precursor gp160 and processed gp120 bands are indicated in (C) and (D).