Figure 1.
The VP/AAP overlapping ORFs in the AAV2 cap gene.
(A) Organization of the cap gene and evolutionary conservation of the VP proteins. Evolutionary conservation scores were calculated by a ConSurf analysis [39] of 128 AAV species. (B) Sequence alignment of amino acids around the VP and AAP overlapping regions indicated with black lines in Panel A. The QVKEVTQ/KSKRSRR motifs are indicated with seven thick vertical lines. The numbers indicate amino acid positions. (C) VP/AAP-overlapping ORFs encoding the QVKEVTQ/KSKRSRR motifs. (D) Amino acid sequence logos (frequency plots) representing the VPs and AAPs derived from 128 AAV species.
Figure 2.
The approach used for the experimental directed evolution of the VP/AAP-overlapping ORFs.
(A) Map of VP3 and AAP mutant plasmid libraries. Red bars indicate the random heptapeptide region corresponding to the QVKEVTQ/KSKRSRR motifs. (B) Schematic representation of the AAV directed evolution procedure. The numbers 1 through 5 in the figure indicate steps in the procedure. Step 1, Construction of the original plasmid DNA library (pAAV2-Lib-0); Step 2, Production of the AAV virus library (AAV2-Lib-1); Step 3, Recovery of viral genomic DNA; Step 4, PCR amplification of the heptapeptide-coding region in the viral genome; Step 5, Construction of the next-round plasmid DNA library using the PCR amplicons from Step 4 (pAAV2-Lib-1). Step 2 then follows for the next round of positive selection. This procedure was repeated three times to obtain AAV2-Lib-2 and AAV2-Lib-3. (C) The timeline of Illumina sequencing of PCR amplicons of the heptapeptide region. Illumina sequencing was used to extensively characterize the heptapeptides evolved by the experimental procedure.
Figure 3.
The results of the experimental evolution of VP and AAP heptapeptides and their biochemical characterization.
(A and H) Evolutionary history of the viable VP (Panel A) and AAP (Panel H) heptapeptide mutants. The percentage of sequence reads of each mutant among all sequence reads is plotted as a function of the number of selection cycles. The red line with an arrow in Panel A represents GGGGGGG. (B and I) Chemical properties of top-ranked VP (Panel B) and AAP (Panel I) heptapeptides with no stop codons identified in Lib-3. The 550 top-ranked heptapeptides are divided into 11 bins (50 peptides per bin) according to their rank. Red lines indicate chemical properties representing all the 930372 VP heptapeptides and 554743 AAP heptapeptides in Lib-0. Asterisks indicate statistical significance with P<0.05 and power >0.8 (Mann-Whitney U test) when compared to the values of Lib-0. The power analysis was performed by comparing 50 heptapeptides in each bin with 50 randomly selected heptapeptides from Lib-0 100 times. Vertical bars represent standard deviations. (C and J) Amino acid frequency plots of the 143 viable VP heptapeptide mutants (Panel C) and the 492 viable AAP heptapeptide mutants (Panel J) identified by the experimental evolution of random nucleotide sequences. (D and K) Experimental validation of the capsid-forming ability of the viable VP (Panel D) and AAP (Panel K) heptapeptides identified by the experimental directed evolution procedure. In Panel D, relative titers of wild type and mutant AAV2 VP3-only viral particles recovered from one 6 cm dish are shown. The VP heptapeptide sequences selected from the 143 viable VP heptapeptide mutants found in Lib-3 (i.e., 1 through 12) and those selected from the random mutants found in Lib-0 (i.e., 13 through 18) are as follows: 1, SAYWVTQ; 2, TVWASSV; 3, CLHDVMS; 4, SVHDACV; 5, GVFWVGV; 6, VVHDVSD; 7, CVRDVTL; 8, CVWSVGL; 9, 10, QVWDERY; 11, AVMTVCS; 12, CVFFSSF; 13, PHNFVAL; 14, LDDFLEF; 15, EGPCGGL; 16, RGAEWNK; 17, GGRWGRG; 18, GVAWGVG.
In Panel K, relative titers of wild type AAV2 VP3-only viral particles produced with wild type or each mutant AAP in one 6 cm dish are shown. The AAP heptapeptide sequences in the AAP mutants selected from Lib-3 (i.e., 1 through 12) and Lib-0 (i.e., 13 through 18) are as follows: 1, GGGRRRR; 2, RGRRRRW; 3, VRRRRGG; 4, WRRPRRV; 5, PRLSRRR; 6, APGRGAR; 7, RGGRRRA; 8, GRVGPRG; 9, RRVGRLG; 10, PGRGRRG; 11, VGGGGRR; 12, GERKGRG; 13, RSSPALR; 14, PGGGSIS; 15, GAQVGVV; 16, GGARRGG; 17, RGHDGAS; 18, ACWRLF_ (a stop codon follows F). All the experiments were performed in triplicate. (E, F, G, L, M, and N) Histograms showing chemical properties (MW, IP and GRAVY score) of the viable VP (Panels E, F and G) and AAP mutants (Panels L, M and N). In each graph, upper bars represent histograms of the 143 viable VP or 492 viable AAP heptapeptide mutants found in Lib-3 and lower bars represent histograms of the 930372 VP and 554743 AAP heptapeptides found in Lib-0.
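The GRAVY score and the rank-based binning described in this legend can be illustrated with a minimal Python sketch. It assumes that GRAVY is the mean Kyte-Doolittle hydropathy of the residues, which is the standard definition; the legend does not state how the scores were computed. The function names `gravy` and `rank_bins` are hypothetical and are not taken from the original analysis.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide):
    """Mean Kyte-Doolittle hydropathy of a peptide (assumed GRAVY definition)."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def rank_bins(ranked_peptides, bin_size=50):
    """Split a rank-ordered list into consecutive bins of bin_size peptides."""
    return [
        ranked_peptides[i:i + bin_size]
        for i in range(0, len(ranked_peptides), bin_size)
    ]


if __name__ == "__main__":
    # Wild type motifs named in the legends, plus the all-glycine peptide.
    for name, seq in [("VP", "QVKEVTQ"), ("AAP", "KSKRSRR"), ("polyG", "GGGGGGG")]:
        print(name, seq, round(gravy(seq), 3))
```

Applying `rank_bins` to a list of 550 rank-ordered heptapeptides yields the 11 bins of 50 peptides used in Panels B and I; a per-bin mean of `gravy` values could then be compared with the Lib-0 reference.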
Figure 4.
Peculiar positive-negative and neutral-neutral amino acid combinations at P3 and P4 in the 143 viable VP heptapeptides.
(A) Amino acid compositions of all the 143 viable VP heptapeptide mutants. (B) Various amino acid combinations at P3 and P4 positions in the viable VP heptapeptide mutants. Left panels show P3/P4 combinations when histidine (H, upper left), arginine (R, middle left) or a non-charged amino acid (lower left) is at P3. Right panels show P3/P4 combinations when aspartic acid (D, upper right), glutamic acid (E, middle right) or a non-charged amino acid (lower right) is at P4. (C) Topological location of amino acid residues at P3 (K321, red) and P4 (E322, blue) in the wild type AAV2 capsid VP3, shown on a VP3 pentamer viewed down an icosahedral five-fold symmetry axis at the center. E322 is partially exposed on the outer surface near the five-fold pore while K321 is barely seen from the outside of the capsid. (D) A close-up view of K321 and E322. Ten amino acids (from V323 to I332) forming the outer ridge around the five-fold pore are removed to make K321 and E322 visible. Five pairs of K321 and E322 form a ring in the five-fold channel wall. Panels C and D were created using PyMOL.
Figure 5.
HEK293 cell transduction with wild type AAV2 and viable AAV2 VP heptapeptide mutants.
HEK293 cells were infected with dsAAV2-CMV-GFP encapsidated with wild type or mutant capsid at an MOI of 20,000. The mutant capsids carried the following VP heptapeptide mutants: 1, SAYWVTQ; 2, TVWASSV; 3, CLHDVMS; 4, SVHDACV; and 5, GVFWVGV. Forty-eight hours post-infection, the percentage of GFP-positive cells was determined by flow cytometry. Non-transduced cells were used as a negative control (NC). Vertical bars represent standard deviations.
Figure 6.
A flowchart of the evolutionary algorithm.
This flowchart outlines the steps involved in the computational directed evolution of the 22-nucleotide-long DNA encoding the VP/AAP-overlapping heptapeptides.
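To make the 22-nucleotide length concrete, the following Python sketch translates one 22-nucleotide sequence in two reading frames. It assumes that the VP heptapeptide is read from the first nucleotide and the AAP heptapeptide from the second (a +1 offset), which is why 21 + 1 = 22 nucleotides are needed. The example sequence is illustrative only: it was chosen because it encodes QVKEVTQ and KSKRSRR under the standard genetic code, and it is not quoted from the source. The function names are hypothetical, and the sketch does not reproduce the evolutionary algorithm itself.

```python
from itertools import product

# Standard genetic code in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_heptapeptide(dna, offset):
    """Translate the seven codons starting at the given offset ('*' marks a stop)."""
    codons = [dna[i:i + 3] for i in range(offset, offset + 21, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


def overlapping_heptapeptides(dna22):
    """Return (VP, AAP) heptapeptides, assuming AAP is in the +1 frame."""
    dna22 = dna22.upper()
    if len(dna22) != 22:
        raise ValueError("expected exactly 22 nucleotides")
    return translate_heptapeptide(dna22, 0), translate_heptapeptide(dna22, 1)


def substitute(dna, position, base):
    """Return a copy of dna with a single base replaced at a 0-based position."""
    return dna[:position] + base + dna[position + 1:]


if __name__ == "__main__":
    example = "CAAGTCAAAGAGGTCACGCAGA"  # illustrative sequence, see assumptions above
    print(overlapping_heptapeptides(example))
    # One substitution can be silent in one frame and non-silent in the other.
    print(overlapping_heptapeptides(substitute(example, 8, "G")))
```

Because every nucleotide except the first and the last belongs to a codon in both frames, a single substitution is scored against both peptides at once, which is the constraint that distinguishes co-evolution from single-frame evolution.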
Figure 7.
Amino acid frequency plots of computationally evolved VP and AAP heptapeptides.
(A) Computational evolution of random nucleotide sequences. The in silico evolution was performed in three ways: VP-only evolution without AAP-originating constraints, AAP-only evolution without VP-originating constraints, and VP and AAP co-evolution. The amino acid logos to the left are a representative set of 200 ancestral sequences. The logos in the middle and to the right represent amino acid frequencies in all the individuals in the 200th generation. (B) The same analysis was done using the wild type AAV2 genome sequence.
Figure 8.
HEK293 cell transduction with wild type AAV2 or AAV2 K321A mutant expressing GFP.
HEK293 cells were infected with double-stranded (ds) AAV2-CMV-GFP at an MOI of 20,000 and maintained at 32, 37 or 39.5°C. Forty-eight hours post-infection, cells were observed under a fluorescence microscope (Panel A) and subjected to flow cytometry (Panel B). The experiment was done in quadruplicate. Asterisks indicate statistical significance with P = 0.000054 to 0.00032 (Student’s t-test) compared to the values of the corresponding wild type controls. Non-transduced cells were used as a negative control (NC).
Figure 9.
A hypothetical model for a mode of co-evolution of VP/AAP-overlapping heptapeptides.
(1) The VP heptapeptide region is shown in its evolutionarily most primitive form, when no AAP exists. (2) Acquisition of lysine (K) at VP P3 increases infectivity and is fixed preferentially, but may decrease virion stability. (3) Due to the neutral-neutral or basic-acidic structural constraints imposed on VP P3 and P4, aspartic acid (D) or glutamic acid (E) is favorably placed at P4. (4) AAP overprinting takes place on either the xVKDVxx or the xVKEVxx sequence motif of VP. A computational co-evolution analysis of xVKDVxx- or xVKEVxx-coding DNA predicts evolutionary extinction of VP carrying aspartic acid (D) at P4 and emergence of AAP exhibiting an xSR(K/R)SxS consensus sequence. (5) The overlapping ORF further evolves into the present form by a mechanism that has yet to be elucidated.