Figure 1.
Methods evaluate (1) differential deviation (vaccine versus placebo) from the immunogen sequences at specific loci or in peptide regions that are relevant to antibody binding; (2) differential codon selection, and differences in physico-chemical properties across vaccine and placebo; (3) differential vaccine efficacy versus HIV-1 sequences that do not match immunogen sequences at individual sites and in each of several pre-specified antibody-relevant protein regions; (4) greater or more rapid viral escape (vaccine versus placebo) at predicted class I and class II HLA-restricted T cell epitopes; and (5) differences in phylogenetic diversity of the breakthrough amino acid sequences (vaccine versus placebo) or differential evolutionary divergence from the vaccine immunogen sequences. T cell and tree images are from openclipart.org.
Figure 2.
Signature sites by HIV-1 protein.
Shown are all sites with evidence for a different amino acid distribution in vaccine versus placebo sequences relative to a reference residue (unadjusted p < 0.05) by any of the site-scanning methods DVE, GWJ, or MBS. Panel (A) depicts signature sites found in the full HIV-1 genome, both in vaccine-immunogen regions (relative to a vaccine immunogen sequence) and in non-vaccine-immunogen regions (relative to the consensus AE sequence). Panel (B) depicts a more detailed view of the vaccine immunogen regions in Gag, Pol, and Env. The blue horizontal lines above the protein regions in panel (B) indicate the regions included in the vaccine immunogen sequences, and the blue lines below the Env protein region indicate the positions of the variable loops. Signature sites are indicated by red vertical lines. A red point on each line indicates the magnitude of the site’s test statistic using the GWJ method, which is a t statistic comparing substitution weights across treatment groups. For sites with multiple reference AAs, the red point indicates the largest magnitude among the multiple test statistics. The black dashed horizontal lines in the middle of the gene and protein regions indicate the zero-point for the test statistic, so the farther a point lies from the center line, the stronger the evidence for a signature at that site by this method. Points above the dashed line indicate that a site was found to have a “vMatch” sieve effect, while points below the dashed line indicate “vMismatch” signature sites.
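As a rough illustration of the kind of statistic plotted in Fig. 2, the sketch below computes a two-sample t statistic from per-subject substitution weights and, for a site with several reference AAs, picks the statistic of largest magnitude. It assumes a Welch (unequal-variance) form and made-up weights. The actual weights, variance estimator, and sign convention of the GWJ method are not given in this caption, and `welch_t` and `largest_magnitude` are hypothetical helper names.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch two-sample t statistic: (mean_a - mean_b) / standard error."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def largest_magnitude(stats):
    """Return the statistic with the largest absolute value, keeping its sign."""
    return max(stats, key=abs)


# Hypothetical per-subject substitution weights at one site (illustrative only).
vaccine_weights = [0.0, 1.0, 1.0, 2.0]
placebo_weights = [0.0, 0.0, 1.0, 1.0]
t_stat = welch_t(vaccine_weights, placebo_weights)

# Hypothetical statistics for one site tested against three reference AAs.
plotted = largest_magnitude([0.5, -2.0, 1.5])
```

The sign of the retained statistic is kept because the caption uses the side of the dashed line to distinguish the two kinds of signature site.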
Figure 3.
Vaccine protein signature sites AA distributions.
For the vaccine protein signature sites shown in Fig. 2, Fig. 3 shows distributions of amino acids relative to the vaccine sequences for vaccine versus placebo recipient sequences. Each subject is represented by a bar, and all bars have equal height. The vaccine sequence AA residue, in black, is shown above the midline. Within a bar, colors depict the fraction of the subject’s sequences with that AA residue (or insertion or deletion, indicated by a “−”). The widths of the bars are scaled so that the total width of the vaccine-recipient part of the plot is the same as for the placebo-recipient part.
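The bar layout described here can be sketched as follows: each subject contributes one bar whose colored segments are the fractions of that subject's sequences carrying each residue, and the bar width is chosen per treatment group so that both groups span the same total width. This is a minimal sketch with invented data; `residue_fractions`, `bar_width`, and the `panel_width` parameter are hypothetical names, not taken from the authors' plotting code.

```python
from collections import Counter


def residue_fractions(residues):
    """Fraction of one subject's sequences carrying each residue at a site."""
    counts = Counter(residues)
    total = len(residues)
    return {aa: n / total for aa, n in counts.items()}


def bar_width(n_subjects, panel_width=1.0):
    """Per-subject bar width so that one treatment group spans panel_width."""
    return panel_width / n_subjects


# Hypothetical residues observed at one site, one list per subject
# ("-" marks an insertion or deletion).
vaccine = [["K", "K", "R"], ["K"], ["-", "K"], ["R", "R"]]
placebo = [["K", "K"], ["K", "R"]]

vaccine_bars = [residue_fractions(s) for s in vaccine]
placebo_bars = [residue_fractions(s) for s in placebo]
vaccine_width = bar_width(len(vaccine))  # narrower bars, more subjects
placebo_width = bar_width(len(placebo))  # wider bars, fewer subjects
```

Because widths are set per group, a group with fewer subjects gets wider bars, which keeps the two halves of the plot visually comparable.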
Figure 4.
Non-vaccine protein signature sites AA distributions: First half.
For the first half of the non-vaccine protein signature sites shown in Fig. 2, Fig. 4 shows distributions of amino acids relative to the vaccine sequences for vaccine versus placebo recipient sequences. Each subject is represented by a bar, and all bars have equal height. The vaccine sequence AA residue, in black, is shown above the midline. Within a bar, colors depict the fraction of the subject’s sequences with that AA residue (or insertion or deletion, indicated by a “−”). The widths of the bars are scaled so that the total width of the vaccine-recipient part of the plot is the same as for the placebo-recipient part.
Figure 5.
Non-vaccine protein signature sites AA distributions: Second half.
For the second half of the non-vaccine protein signature sites shown in Fig. 2, Fig. 5 shows distributions of amino acids relative to the vaccine sequences for vaccine versus placebo recipient sequences. Each subject is represented by a bar, and all bars have equal height. The vaccine sequence AA residue, in black, is shown above the midline. Within a bar, colors depict the fraction of the subject’s sequences with that AA residue (or insertion or deletion, indicated by a “−”). The widths of the bars are scaled so that the total width of the vaccine-recipient part of the plot is the same as for the placebo-recipient part.
Table 1.
Signature sites in vaccine proteins: 2-sided unadjusted p-values calculated by five methods.
Table 2.
Signature sites in non-vaccine proteins: 2-sided unadjusted p-values calculated by five methods.
Table 3.
Vaccine efficacy at the signature sites in the vaccine proteins.
Table 4.
Vaccine efficacy at the signature sites in the non-vaccine proteins.
Table 5.
Immunologically relevant subsets of sites.
Figure 6.
Estimated vaccine efficacy as a function of distance to contactsites of the CRF01_AE vaccine sequences.
SmoothMarks [20] estimates of vaccine efficacy (VE) against acquisition with an HIV-1 CRF01_AE virus with genetic distance v from the 92TH023 or CM244 vaccine sequences, with 95% confidence intervals, using Env mindist amino acid sequences and computed with the HIVb PAM substitution matrix [16] across the contactsites. For each panel, the first p-value is for testing whether there is any VE against any virus genotype, and the second p-value is for testing whether VE declines with the distance v. The PAM distances are directly proportional to Hamming distances, where a PAM distance of v approximately equals a Hamming distance of 0.85×v. Given that contactsites contains 176 residues, the span of contactsites distances from 0.08 to 0.25 corresponds to 13–39 amino acid mismatches.
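The distance conversion described in this caption can be written as a one-line calculation. The sketch below applies the stated approximation (Hamming distance of about 0.85×v, over 176 contactsites residues); `pam_to_mismatches` and its parameter names are hypothetical, and the function implements only the linear approximation, not the PAM computation itself.

```python
def pam_to_mismatches(v, n_sites=176, hamming_per_pam=0.85):
    """Approximate number of mismatched residues for a PAM distance v.

    Uses the caption's approximation: Hamming distance ~ 0.85 * v,
    expressed as a fraction of n_sites residues.
    """
    return hamming_per_pam * v * n_sites


low = pam_to_mismatches(0.08)
high = pam_to_mismatches(0.25)
```

Under this linear approximation the endpoints come out at roughly 12 and 37 mismatches, in the same range as the 13–39 quoted in the caption; the conversion is stated there as approximate, so exact agreement is not expected.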
Figure 7.
Mapping of signature sites and sites under selection on an Env trimer structure.
The trimer (PDB ID: 4NCO [27]) is shown in surface representation, with gp120 in grey and gp41 in dark grey. Panels (a-e) correspond to the five physico-chemical properties analyzed for evidence of property importance: (a) chemical composition, (b) polarity, (c) volume, (d) isoelectric point, and (e) hydropathy [25]. Signature sites identified in Env-gp120 are colored in green, and sites that were under selection are colored from pink to red (corresponding to p-values from 0.05 to < 0.0001). (f) Visualization of the major sites of vulnerability on the HIV-1 Env.
Table 6.
Tests for enrichment of signature sites in biologically defined subsets compared to all other sites.
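One common way to test enrichment of this kind is Fisher's exact test on a 2×2 table of sites (in subset or not, signature site or not). The caption does not state which test was used, so the sketch below is an assumption for illustration only, with invented counts and a hypothetical function name `fisher_exact_two_sided`.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lowest, highest = max(0, col1 - (n - row1)), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(
        p for p in map(prob, range(lowest, highest + 1)) if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts: rows = in subset / not in subset,
# columns = signature site / not a signature site.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

An exact test is a natural choice when the number of signature sites in a subset is small, since it does not rely on large-sample approximations.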