Genetic Analysis of a Novel Human Adenovirus with a Serologically Unique Hexon and a Recombinant Fiber Gene

In February of 1996 a human adenovirus (formerly known as Ad-Cor-96-487) was isolated from the stool of an AIDS patient who presented with severe chronic diarrhea. To characterize this apparently novel pathogen of potential public health significance, the complete genome of this adenovirus was sequenced to elucidate its origin. Bioinformatic and phylogenetic analyses of this genome demonstrate that this virus, hereafter referred to as HAdV-D58, contains a novel hexon gene as well as a recombinant fiber gene. In addition, serological analysis demonstrated that HAdV-D58 has a neutralization profile different from that of all previously characterized HAdVs. Bootscan analysis of the HAdV-D58 fiber gene strongly suggests one recombination event.


Introduction
Human adenoviruses (HAdVs) were first isolated in 1953 from pediatric adenoid tissue and from a military basic trainee as respiratory pathogens [1,2]. Since then, 56 types have been isolated and characterized [3,4,5,6,7]. Currently, there are 56 HAdVs in the genus Mastadenovirus in the family Adenoviridae, organized into seven species (A-G) [3,4,7,8]. Individual HAdV types were originally differentiated by immunochemical or serological methods, but more recently, genomics and bioinformatic analyses have supplanted serology [8]. Members of each species are highly similar at the nucleotide level and do not appear to recombine readily with members of another species. Species groupings also reflect a tendency toward specific human diseases: for example, many HAdVs within species HAdV-D cause epidemic keratoconjunctivitis [9], whereas HAdVs in species HAdV-B are known to cause respiratory infections [10].
Currently there are three human adenoviruses (HAdV-F40, HAdV-F41 and HAdV-G52) that are associated with gastroenteritis [7,8]. Gastroenteritis is associated with an estimated 5,000 deaths per year in the United States [11]. It is likely that the etiological agents of gastroenteritis include yet-to-be-identified pathogenic agents.
In this report we examined an adenovirus isolated from the stool of an AIDS patient who presented with severe chronic diarrhea.
Based upon whole genomic and bioinformatics analysis, this virus appears to belong to species HAdV-D, with the proposed name HAdV-D58.

Microbiological Investigation
In February of 1996 an adenovirus was isolated from the stool of a 31-year-old AIDS patient who presented with severe chronic diarrhea and was subsequently hospitalized. Cryptosporidium parvum and Giardia lamblia were also found in the fecal matter of the patient; therefore, the clinical symptoms cannot be exclusively linked with the adenovirus infection.

Amplification and sequencing of the novel adenovirus
Partial sequencing of HAdV-D58 (previously published as the Ad-Cor-96-487 strain [12]), interpreted via imputed serum neutralization, demonstrated that portions of the hexon and fiber genes resembled those of HAdV-D33 and HAdV-D29, respectively [12]. This suggested that this novel HAdV isolated from an AIDS patient originated at least in part by recombination. To elucidate the genetic characteristics of HAdV-D58, the entire genome was sequenced and analyzed.

Physical features of the novel adenovirus genome
The genome length of HAdV-D58 is 35,218 base pairs (Fig. 1), with a base composition of 22.6% A, 20.3% T, 28.6% G, 28.4% C and a GC content of 57.0%. The GC content is consistent with members of species human adenovirus D (HAdV-D) (57.0% mean).
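The relationship between base composition and GC content reported above can be illustrated with a minimal Python sketch. The function name `base_composition` and the toy sequence are illustrative assumptions and are not part of the original analysis; the real input would be the assembled genome sequence.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the percentage of A, C, G and T, plus GC content, for a DNA string.

    Characters other than A/C/G/T (e.g. N or gaps) are ignored.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    pct = {base: 100.0 * counts[base] / total for base in "ACGT"}
    # GC content is simply the sum of the G and C percentages.
    pct["GC"] = pct["G"] + pct["C"]
    return pct


# Toy sequence (hypothetical, not viral data): 4 G, 4 C, 1 A, 1 T.
toy = base_composition("GGCCATGCGC")

# Consistency check on the figures reported in the text: 28.6% G + 28.4% C.
reported_gc = round(28.6 + 28.4, 1)
```

Applied to the reported percentages, the G and C values sum to the stated 57.0% GC content.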
The 36 annotated open reading frames (ORFs) were organized similarly to those of other mastadenoviruses (Fig. 1). The inverted terminal repeat (ITR) sequences for HAdV-D58 were determined to be 160 bp in length. Within species HAdV-D, HAdV-D58 has a genome percent identity ranging from a low of 90.72% (HAdV-D8; phylogenetic distance of 0.0711) to a high of 93.97% (HAdV-D49; phylogenetic distance of 0.0341).

Genomic recombination analysis
Comparison of HAdV-D58 with the full-length genomes of viruses in species HAdV-D using SimPlot analysis revealed significant sequence divergence in the hexon, E3, and fiber coding sequences (Fig. 2).
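To give a sense of the kind of profile such a comparison produces, the following Python sketch computes percent identity in sliding windows along two pre-aligned sequences. This is a simplified stand-in and not the SimPlot program itself, which offers distance models and other options not reproduced here; the function name `window_identity`, the window and step sizes, and the toy sequences are all assumptions made for illustration.

```python
def window_identity(query, ref, window=200, step=20):
    """Percent identity between two aligned sequences in sliding windows.

    Alignment columns where either sequence has a gap ('-') are skipped.
    Returns a list of (window_start, percent_identity) tuples; the identity
    is None when a window contains only gapped columns.
    """
    if len(query) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    last_start = max(len(query) - window, 0)
    for start in range(0, last_start + 1, step):
        pairs = [
            (q, r)
            for q, r in zip(query[start:start + window], ref[start:start + window])
            if q != "-" and r != "-"
        ]
        if pairs:
            identity = 100.0 * sum(q == r for q, r in pairs) / len(pairs)
        else:
            identity = None
        profile.append((start, identity))
    return profile


# Toy aligned sequences (hypothetical): the last two windows each differ at one site.
query = "ACGTACGTACGT"
ref = "ACGTACGAACTT"
profile = window_identity(query, ref, window=4, step=4)
```

Regions where the profile drops sharply relative to the rest of the alignment correspond to the kind of divergence described for the hexon, E3, and fiber coding sequences.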

Genetic analysis of the novel adenovirus hexon coding sequences
Analysis of the HAdV-D58 genome via pairwise comparison suggested that the hexon coding sequence was unlike any other known human adenovirus hexon sequence (Fig. 2). To determine if the hexon gene was novel, we performed SimPlot analysis using all hexon loop 1 (L1) and loop 2 (L2) coding sequences in species HAdV-D. L1 and L2 contain the epsilon (ε) determinant, which contains the epitopes for serum neutralization [13]. SimPlot analysis confirmed that the hexon gene of HAdV-D58 is unique compared with all other hexon genes in species HAdV-D (Fig. 3). In terms of nucleotide identity, the L1 and L2 of HAdV-D33 were most similar to those of HAdV-D58, with 84.4 and 89.8% nucleotide identity, respectively (Table S1). No substantial evidence of recombination in the hexon coding sequence was revealed.

Analysis of the E3 genes
In the E3 region, 19K, RIDα, RIDβ, and 14.7K are the only genes that have been investigated. The function of the E3/19K protein is to prevent human MHC class I molecules from being transported to the cell surface [14]. Specifically, amino acids W52, M87, and W96 were shown to be important for HLA-I modulation [14]. A second function of E3/19K is to inhibit NK cells from recognizing HAdV-infected cells by sequestering MHC-I chain-related proteins A and B (MICA/B) [15]. The 14.7K protein product inhibits the internalization of TNF receptor 1 [16]. The RIDα and RIDβ proteins down-modulate the apoptosis receptor Fas/Apo-1 [17].
Bootscan analysis strongly suggests that there was a recombination event in the middle of the open reading frames of 19K and CR1-γ (Fig. 4). These recombination events did not disrupt any of the E3 open reading frames in the HAdV-D58 genome. Analysis of the 19K open reading frame in HAdV-D58 demonstrated that amino acids W52, M87, and W96 were present (data not shown). The HAdV-D58 19K, RIDα, RIDβ, and 14.7K open reading frames were 96.6, 98.9, 98.4, and 97 percent identical to the homologous E3 open reading frames of HAdV-D49 (19K), HAdV-D36 (RIDα), HAdV-D15 (RIDβ), and HAdV-D15 (14.7K), respectively (Table S2).

Fiber recombination analysis
To determine whether there was recombination in the fiber gene of HAdV-D58, we performed Bootscan and SimPlot analyses using the fiber sequences in GenBank. Our results suggested that the fiber gene of HAdV-D58 contains two recombination sites (Fig. 5). The first was in the middle of the shaft coding sequence and the second was at the shaft/knob boundary. The possible recombination at the shaft/knob boundary is tenuous, since HAdV-D25 and HAdV-D29 cannot be differentiated at this junction, as evidenced by SimPlot analysis (Fig. 5B).

Serum neutralization vs. Phylogenetic analysis
A previous study showed that when the nucleotide identity of L2 in the hexon differs by ≥2.5%, a new HAdV type is suspected [13]. To provide a correlation between serum neutralization data, molecular typing (i.e., imputed serum neutralization), and phylogenomics data for the determination of a new HAdV type, the hexon L2 sequence of the proposed novel HAdV-D58 was compared against the L2 sequences of HAdV-D33, -D49, and -D38 (the closest phylogenetic relatives of the HAdV-D58 L2). The difference in percent nucleotide identity between the L2 of HAdV-D33, -D49, and -D38 and that of HAdV-D58 was 10.18, 20.73, and 25.09 percent, respectively. Thus, the L2 sequencing criterion established by Madisch et al. also indicates that HAdV-D58 is a new type.
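The comparison described above amounts to checking each L2 divergence against a 2.5% threshold. The Python sketch below shows one way to express that check; the function names, the choice of an inclusive comparison (at least 2.5%), and the rule that every closest relative must exceed the threshold are assumptions for illustration, while the three divergence values are those reported in the text.

```python
L2_THRESHOLD = 2.5  # percent nucleotide difference in hexon L2 (criterion cited in the text)


def percent_difference(seq_a, seq_b):
    """Percent of differing positions between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return 100.0 * sum(a != b for a, b in pairs) / len(pairs)


def suggests_new_type(differences, threshold=L2_THRESHOLD):
    """True if the isolate differs from every compared type by at least the threshold.

    Assumption: the comparison is inclusive and must hold for all relatives.
    """
    return all(diff >= threshold for diff in differences.values())


# L2 differences between HAdV-D58 and its closest relatives, as reported in the text.
reported = {"HAdV-D33": 10.18, "HAdV-D49": 20.73, "HAdV-D38": 25.09}
is_candidate = suggests_new_type(reported)
```

Because even the closest relative (HAdV-D33) differs by roughly four times the threshold, the outcome does not depend on whether the comparison is inclusive or strict.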

Discussion
In the past, human adenoviruses were characterized primarily based on their serological profile and hemagglutination properties [18]. Today the classification methods used for novel adenoviruses have been expanded to include whole genome sequencing and bioinformatic analysis [19]. We used whole genome sequencing, bioinformatic analysis, and serology to demonstrate that HAdV-D58 is a novel human adenovirus type.
We found that the fiber gene of HAdV-D58 contains at least one recombination event and possibly a second (Fig. 5). The second possible recombination site is located at the shaft/knob boundary. It is not currently possible to determine if recombination happened at the shaft/knob boundary (Fig. 5B). A prior study described a recombination hotspot in the fiber gene of species HAdV-D at the shaft/knob boundary [20]. However, our Bootscan analysis of the same fiber coding sequences listed in Darr et al. [20] did not reveal evidence of recombination (Fig. 9). This result was also corroborated independently (personal communication, Jason Seto). The analysis describing recombination in the fiber proteins of HAdV-D47 and HAdV-D30 utilized consensus sequences for two of four alignments [20]. The problem with this analysis is that consensus sequences do not exist in nature and could introduce artifacts into a recombination analysis. Furthermore, the only way to re-create the supposed recombination events (proposed by Darr et al. [20]) that created HAdV-D20 was by combining sequences HAdV-D20-FM210561 and HAdV-D23-FM210540, which are 100% identical (Table 2), with the 3′ sequences of HAdV-D20-AJ811444 and HAdV-D23-AJ811446 (see Materials and Methods), respectively (Table 2). We were also able to re-create the proposed recombination event that created HAdV-D25 when we combined the sequences of HAdV-D25-FM210542 and HAdV-D26-FM210543, which are also 100% identical (Table 2), with the 3′ sequences of HAdV-D25-AJ811448 and HAdV-D26-AJ811449 (see Materials and Methods), respectively (Table 2). When these data are considered together, we find no concrete evidence that the shaft/knob junction is a hotspot for recombination.
For HAdVs, the number of E3 ORFs ranges between 6 and 9 [7,21]. HAdVs in species HAdV-D and HAdV-G contain the 49K/CR1-β ORF [7,21]. Interestingly, Bootscan analysis suggests that the E3 region of HAdV-D58 was created by recombination with HAdV-D29 (Fig. 4). However, analysis of all sequenced E3 regions in species HAdV-D demonstrates that recombination hotspots do not exist in this part of the genome for species HAdV-D (data not shown). Thus, it is difficult to speculate what advantage there is for a seemingly random recombination in the E3 region.

Conclusions
In this study, we sequenced the genome of an apparently novel adenovirus. The novel hexon coding sequence, coupled with bioinformatic analysis, demonstrated that this genome is different from all previously characterized HAdVs, and is a novel human adenovirus.

Ethics Statement
The work reported herein was performed under United States Air Force Surgeon General-approved Clinical Investigation No. FDG20040024E, by the Institutional Review Board at the David Grant USAF Medical Center. Informed consent was not required, because we did not use clinical samples.

Viruses, cells and neutralization test
The isolation of HAdV-D58 (previously known as Ad-Cor-96-487) was previously described [12]. In brief, the stool sample was inoculated into Hep-2 cells and subcultured in Earle's MEM supplemented with 10% fetal bovine serum (FBS), penicillin (200 U/ml), L-glutamine (2 mM), Fungizone (1 μg/ml), and streptomycin (200 μg/ml). HAdV-D58 was investigated serologically by viral neutralization (VN) assay using horse polyclonal antisera directed against prototype strains of HAdV-D8, -D9, -D10, -D13, -D15, -D17, -D29, -D33, -D43, -D44, -D45, -D46 and -D47. VN tests were conducted on Hep-2 cells grown in 96-well microplates. The Hep-2 cells used in this study were used previously [22] and are a common cell line for adenovirus research. Type-specific antisera were inactivated at 56°C for 30 min and serially diluted twofold, 50 μl per well, with four replicate wells per dilution. A working dilution of virus (HAdV-D58) containing 100 TCID50 in 50 μl was added to each well, and the plates were incubated at 37°C in 5% CO2 for 1 h. During the incubation period, Hep-2 cells were trypsinized and resuspended at 5 × 10^4 cells per ml. After the incubation, 100 μl of cell suspension was added to each well. The contents of each well were mixed, and the plates were incubated at 37°C in 5% CO2 for 6 days. After 6 days, the medium was removed and cells were stained with crystal violet solution (1.46 g crystal violet, 50 ml ethanol, 300 ml formaldehyde, 650 ml distilled water). The neutralization titer was calculated as the maximum dilution of antiserum that completely inhibited viral growth, as evidenced by the lack of cytopathic effects.
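The titer readout described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: "completely inhibited" is taken to mean that none of the four replicate wells at a dilution shows cytopathic effect (CPE), the starting dilution of 1:10 is hypothetical, and the plate readout is invented example data rather than results from this study.

```python
def twofold_dilutions(start, n):
    """Reciprocal dilutions of a twofold series, e.g. start=10 gives 10, 20, 40, ..."""
    return [start * 2 ** i for i in range(n)]


def neutralization_titer(cpe_by_dilution):
    """Highest reciprocal dilution at which no replicate well shows CPE.

    cpe_by_dilution maps a reciprocal dilution to a list of booleans,
    one per replicate well (True = cytopathic effect observed).
    Returns None if no dilution protected all replicates.
    """
    protected = [
        dilution
        for dilution, wells in cpe_by_dilution.items()
        if wells and not any(wells)
    ]
    return max(protected) if protected else None


# Hypothetical plate readout: four replicate wells per dilution.
dilutions = twofold_dilutions(10, 5)  # [10, 20, 40, 80, 160]
readout = {
    10: [False, False, False, False],
    20: [False, False, False, False],
    40: [False, False, False, False],
    80: [False, True, False, False],   # partial breakthrough, not complete inhibition
    160: [True, True, True, True],
}
titer = neutralization_titer(readout)
```

In this invented example the titer is 40, the last dilution at which all four replicate wells remained free of CPE.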

Amplification of the HAdV-D58 genome
To amplify regions of HAdV-D58 flanking the sequences previously described by Ferreyra et al. [12], we designed primers based on conserved adenovirus sequences in species HAdV-D. All amplicons were then sequenced using primer walking. The genome was assembled using SeqMan, an assembly program in the Lasergene 8 software suite.

Nucleic Acid Isolation
HAdV-D58 particles were separated from Hep-2 cells by ultracentrifugation. Genomic DNA was acquired from viral particles using the AccuPrep Genomic DNA Extraction Kit (Bioneer Corporation). Finally, the viral DNA was resuspended in deionized water and stored at −20°C until use.

Bioinformatics
The available genomes from species HAdV-D were aligned using the ClustalW [23] alignment method, which is available through a web interface at http://www.ebi.ac.uk/Tools/clustalw2/index.html. The default parameters for gap open penalty and gap extension penalty were used.

Figure 1. Genome organization of HAdV-D58. The genome is represented by a central black horizontal line marked at 5-kbp intervals. Protein-encoding regions are shown as arrows indicating transcriptional orientation. Forward arrows (above the horizontal black line) denote coding regions in the 5′ to 3′ direction, and arrows pointing to the left (below the horizontal black line) denote coding regions in the 3′ to 5′ direction. Spliced genes are indicated by V-shaped lines. doi:10.1371/journal.pone.0024491.g001

Figure 4. Computational analysis of the E3 region. SimPlot analysis of the E3 region of HAdV-D58 compared to fully sequenced E3 regions from species HAdV-D. The arrows over the Bootscan demarcate the approximate positions of the E3 coding sequences. doi:10.1371/journal.pone.0024491.g004

Figure 5. Computational analysis of the fiber regions. (A) Bootscan and (B) SimPlot analysis of the fiber region of HAdV-D58 compared to fully sequenced E3 and fiber regions from species HAdV-D. doi:10.1371/journal.pone.0024491.g005

Figure 6. Phylogenetic analysis of whole genome, penton base, and E3 CR1-β in HAdV-D58. Phylogenetic analysis is based on the nucleic acid sequence of (A) whole genomes, (B) penton base, and (C) CR1-β. Phylogenetic trees were constructed from aligned sequences using MEGA, via the neighbor-joining method and a bootstrap test of phylogeny. Bootstrap values shown at the branching points indicate the percentage of 1000 replications that produced the clade. doi:10.1371/journal.pone.0024491.g006

Figure 7. Phylogenetic analysis of HAdV-D58 hexon loops 1 and 2. Analysis of HAdV-D58 hexon L1 and L2 is based on the nucleic acid sequence of (A) hexon and (B) hexon L2. Phylogenetic trees were constructed from aligned sequences using MEGA, via the neighbor-joining method and a bootstrap test of phylogeny. Bootstrap values shown at the branching points indicate the percentage of 1000 replications that produced the clade. doi:10.1371/journal.pone.0024491.g007

Figure 8. Phylogenetic analysis of the fiber coding sequence in HAdV-D58. Analysis of HAdV-D58 is based on the nucleic acid sequence of the fiber knob. Phylogenetic trees were constructed from aligned sequences using MEGA, via the neighbor-joining method and a bootstrap test of phylogeny. Bootstrap values shown at the branching points indicate the percentage of 1000 replications that produced the clade. doi:10.1371/journal.pone.0024491.g008