Most traditional genome sequencing projects involving viruses include the culture and purification of the virus particles. However, purification of virions may yield insufficient material for traditional sequencing. The electrophoretic method described here provides a strategy whereby the genomic DNA of the Korean isolate of Pieris rapae granulovirus (PiraGV-K) could be recovered in sufficient amounts for sequencing by purifying it directly from total host DNA by pulse-field gel electrophoresis (PFGE).
The total genomic DNA of infected P. rapae was embedded in agarose plugs and treated with restriction endonuclease and methylase; PFGE was then used to separate PiraGV-K DNA from the DNA of P. rapae, followed by mapping of fosmid clones of the purified viral DNA. The double-stranded circular genome of PiraGV-K was found to encode 120 open reading frames (ORFs), which covered 92% of the sequence. BLAST and ORF arrangement showed the presence of 78 homologs to other genes in the database. The mean overall amino acid identity of PiraGV-K ORFs was highest with the Chinese isolate of PiraGV (∼99%), followed by Choristoneura occidentalis granulovirus ORFs at 58%. PiraGV-K ORFs were grouped, according to function, into 10 genes involved in transcription, 11 involved in replication, 25 structural protein genes, and 15 auxiliary genes. Genes for chitinase (ORF 10) and cathepsin (ORF 11), involved in the liquefaction of the host, were found in the genome.
The recovery of PiraGV-K DNA genome by pulse-field electrophoretic separation from host genomic DNA had several advantages, compared with its isolation from particles harvested as virions or inclusions from the P. rapae host. We have sequenced and analyzed the 108,658 bp PiraGV-K genome purified by the electrophoretic method. The method appears to be generally applicable to the analysis of genomes of large viruses.
Citation: Jo YH, Patnaik BB, Kang SW, Chae S-H, Oh S, Kim DH, et al. (2013) Analysis of the Genome of a Korean Isolate of the Pieris rapae Granulovirus Enabled by Its Separation from Total Host Genomic DNA by Pulse-Field Electrophoresis. PLoS ONE 8(12): e84183. https://doi.org/10.1371/journal.pone.0084183
Editor: Qunfeng Dong, University of North Texas, United States of America
Received: April 8, 2013; Accepted: November 12, 2013; Published: December 31, 2013
Copyright: © 2013 Jo et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This study was supported by Mid-career researcher program through NRF grant funded by the MEST (No. 2010-0027694). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: Dr. Sung-Hwa Chae, who is currently working at GnC Bio Co., Ltd., 4F, Yekun-plaza, 621-6, Banseok-dong, Yuseong-gu, Daejeon 305–150, South Korea, is one of the co-authors for the manuscript. Dr. Sung-Hwa Chae does not hold any competing interests on the publication of the manuscript, consultancy, patents, products in development or marketed products. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.
Baculoviruses represent a diverse group of viruses with covalently closed, double-stranded, circular, supercoiled genomes, with sizes varying from 80 to 180 kb, encoding between 90 and 180 genes. The DNA genome is packaged in rod-shaped nucleocapsids that are 230–385 nm in length and 40–60 nm in diameter. The virions occur in two types: occlusion-derived virions (ODV) and budded virus particles (BV). Baculoviridae are divided into four genera, Alphabaculovirus [lepidopteran-specific nuclear polyhedrosis viruses (NPVs)], Betabaculovirus [lepidopteran-specific granulosis viruses (GVs)], Gammabaculovirus (hymenopteran-specific NPVs) and Deltabaculovirus (dipteran-specific NPVs). Viruses infecting insects of the order Hymenoptera contain the smallest genomes, at just over 80 kb, which has been explained as a result of their restricted life cycle, confined to replication in insect gut cells. Group I alphabaculoviruses cluster around ∼130 kb, whereas Group II shows a high degree of diversity, varying from ∼130 to 170 kb. The larger genomes of the Group II alphabaculoviruses can be attributed to a combination of repeated genes that are not found in the smaller genomes. This is in contrast to the betabaculovirus genomes, which vary from 101 kb in the case of Plutella xylostella granulovirus (PlxyGV) to 178 kb in Xestia c-nigrum granulovirus (XecnGV). Despite the large difference in gene content in betabaculovirus genomes, as reflected in this range of sizes, their genomes are surprisingly collinear, compared with alphabaculoviruses, which show a high degree of variation. The first dipteran-specific deltabaculovirus, the Culex nigripalpus nucleopolyhedrovirus (CunniNPV), was isolated and sequenced from the mosquito Culex nigripalpus. A phylogenetic analysis showed its distinctive form, making it a member of a new genus within the family Baculoviridae.
Compared with alphabaculoviruses, betabaculoviruses have been investigated to a lesser degree, because permissive cell lines are limited. Currently, 60 complete genomes are known in the Baculoviridae family: 45 genomes from NPVs (41 alphabaculoviruses, 3 gammabaculoviruses, and 1 deltabaculovirus), 14 genomes from GVs, and 1 unclassified Hemileuca sp. NPV (http://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid=10442).
The small cabbage white butterfly, Pieris rapae (P. rapae), is a serious pest of cultivated cabbages and other mustard family crops worldwide. A serious infestation can lead to the death of the plant due to reduced photosynthesis. P. rapae granulovirus (PiraGV) infects P. rapae in nature and functions as an important biological agent in controlling the population of P. rapae in the ecosystem. Although PiraGV is now a registered biocontrol agent for the control of P. rapae, research on the genetic and molecular information of the virus is still limited, apart from a recent study on occlusion-derived virus (ODV)-associated proteins of the betabaculovirus. Sequencing of the complete genome of the Chinese isolate of P. rapae granulovirus (PiraGV-C) showed a size of 108,592 bp and predicted 120 open reading frames (GenBank, GQ884143). Although sequencing efforts have been significant, more detailed information about a wide range of isolates inhabiting different geographical regions would provide a more comprehensive overview of baculoviruses and further establish their candidature as pest control agents.
This study is unique, as we have taken advantage of the large-sized genome and high titer of infection of P. rapae granulovirus (Korean isolate) to purify the viral genome away from host DNA by pulse-field gel electrophoresis. The viral DNA is recovered in amounts sufficient for its classical genome sequencing. The procedure requires less starting material than would be necessary if starting with the purification of virus particles from inclusion bodies. The genome sequence produced in this work was through a subcloning approach, without recourse to the use of automated high-throughput next-generation sequencing (NGS) technology.
Materials and Methods
Separation of Nuclei from P. rapae
Larvae of P. rapae were obtained from a mass rearing facility at Hampyeong Insect Institute (Hampyeong, Korea) and were reared in the laboratory on kale leaf at 25±3°C with 60±5% relative humidity, under a 12/12 hr natural light/dark cycle for a short duration. The final instar larvae were dissected to remove the gut and were subsequently ground and centrifuged (5,000 rpm, 10 min, 4°C) to separate the nuclei and remove the cell debris from the solution.
All chemicals used were of analytical grade, and were obtained from Sigma Chemical Co. (St. Louis, MO, USA) unless indicated otherwise.
Preparation of High Molecular Weight (HMW) DNA Plugs Embedded in Agarose
HMW DNA is considered vulnerable to mechanical shearing forces and suffers frequent double-stranded breaks, which make it unsuitable for large-insert cloning. To prevent HMW DNA from being damaged in the nucleus lysis process, the separated nuclei were embedded in agarose gel. The nuclei were warmed for 5 min at 45°C and were mixed with 1% InCert agarose. The mixture was subsequently poured into a plug mold (Bio-Rad, Hercules, CA), kept on ice and allowed to solidify for 1–2 hr. The agarose plugs were then put into 50 ml of proteinase K lysis buffer (0.5 M EDTA, 1% N-lauroylsarcosine, 1 mg of proteinase K/ml) and incubated for 24 hr at 50°C. After removal of the proteinase K lysis buffer from the agarose plugs, the lysis process was repeated for a further 24 hr. After 2–3 washes in deionized water, the plugs were placed in 50 ml of TE50 buffer (10 mM Tris-HCl, 50 mM EDTA, pH 8.0) and washed for 12 hr. Additional washing was performed for another 12 hr in fresh TE50 buffer. Subsequently, the plugs were incubated for 2 hr in 0.1 mM phenylmethylsulfonylfluoride (PMSF) buffer at 4°C to inactivate proteinase K, followed by another wash in TE50 buffer for 24 hr, and were stored in 0.5 M EDTA at 4°C.
Pre-electrophoresis of Agarose Plugs
Next, the agarose plugs were placed in 0.5× TBE buffer (45 mM Tris-base, 1 mM EDTA, 45 mM boric acid) and dialyzed for 3 hr. Subsequently, they were inserted into the preparative slot of a 1% pulsed-field certified agarose gel, and PFGE was conducted using 0.5× TBE buffer and the CHEF DR-II apparatus (Bio-Rad, Hercules, CA) with a pulse time of 5 s for 10 hr at 12°C and a voltage of 4 V/cm. After the electrophoresis, the plugs were removed from the slot, stored in 50 ml of 0.5 M EDTA buffer, and dialyzed overnight at 4°C.
Partial Digestion of Plugs
HMW DNA embedded plugs (n = 10) were placed in 500 µl of an enzyme mixture, consisting of 1 µl EcoRI at a concentration of 2 U/µl, 1 µl EcoRI methylase at a concentration of 40 U/µl (New England Biolabs, Ipswich, MA), 25 µl of 100× Bovine Serum Albumin (10 mg/ml), 5 µl of polyamine (100×), 50 µl of methylase buffer (10×) in 394 µl of DW and equilibrated for 2 hr at 4°C, followed by a 4 hr incubation at 37°C. After digestion, the plugs were treated with 150 µl of 0.5 M EDTA, 37.5 µl of 20% N-lauroylsarcosine and 15 µl of proteinase K (20 mg/ml), and incubated for 1 hr at 37°C to inactivate the endonuclease. Subsequently, PFGE was conducted with a CHEF DR-II apparatus (Bio-Rad) with a pulse time between 0.1 and 40 s for 16 hr at a voltage of 6 V/cm to check the partially digested plugs.
Separation of PiraGV-K DNA from P. rapae Genomic DNA
PiraGV-K DNA was separated by PFGE with an initial pulse time of 0.1 s, a final pulse time of 40 s, a temperature of 12°C and a voltage of 6 V/cm for 14 hr. A lambda ladder PFG marker (New England Biolabs, Ipswich, MA) was used as a size marker to enable the band of PiraGV-K at ∼125 kb to be eluted selectively.
After the PFGE treatment, the edge of the gel, including a size marker, was cut and put into ethidium bromide staining buffer to mark the location of the 125 kb band of PiraGV-K. Subsequently, the eluted portion was placed into a dialysis bag to recover the PiraGV-K DNA using PFGE with a pulse time between 0.1 and 40 s and a voltage of 6 V/cm for 14 hr.
Construction and Characterization of PiraGV-K Fosmid Library
Randomly sheared PiraGV-K DNA was cloned into the Eco72I blunt-end site of the CopyControl pCC1FOS fosmid vector (Epicentre Biotechnologies, Madison, WI). The fosmids were packaged using ultra-high efficiency MaxPlax lambda packaging extracts and plated on TransforMax EPI300 E. coli (Epicentre Biotechnologies, Madison, WI). The quality of the constructed fosmid library was assessed using standard techniques. Of a total of 6,000 clones, 96 were picked randomly and the fosmids were end sequenced from both directions using the primers (forward sequencing primer 5′– GGATGTGCTGCAAGGCGATTAAGTTGG –3′ and reverse sequencing primer 5′– CTCGTATGTTGTGGAATTGTGAGC –3′) to the vector. Stand-alone BLAST was performed for the nucleotide sequences against a locally curated viral sequence database (http://edunabi.com/~prgv/).
Whole Genome Shotgun Sequencing
Based on the mapping data in the locally curated viral sequence database (http://edunabi.com/~prgv/), a minimum tiling path was prepared and four fosmid library clones were selected to construct a shotgun library. The selected fosmid clones were named NB-FOS-1-1-F40_A05A02 (27 kb), NB-FOS-1-1-F40_A23B06 (33 kb), NB-FOS-1-1-F40_C07D02 (32 kb) and NB-FOS-1-1-F40_E13E04 (37 kb). Equivalent volumes of fosmid DNA clones were digested with NotI to obtain 3–7 kb DNA pieces that were then ligated into a purified pUC118 BamHI/BAP ready vector (Takara Bio Inc., Shiga, Japan). Ligated products were transformed into E. coli DH5α cells by electroporation and spread on LB (ampicillin, 100 µg/ml) plates. The quality of the library was checked for E. coli genomic DNA contamination and empty vector contamination by cross-match. Plasmid clones sufficient for eight-fold coverage of each selected fosmid clone were randomly picked for plasmid preparation and sequenced with M13 forward and reverse universal primers on an Applied Biosystems 3730 XL DNA analyzer (Applied Biosystems, Carlsbad, CA) using the cycle sequencing method with fluorescent dye terminators and AmpliTaq DNA polymerase (ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction, Perkin Elmer, Waltham, MA). Applied Biosystems sequencing software was used for lane tracking and trace extraction, and data were transferred to UNIX workstations for further processing.
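The selection of a minimum tiling path from mapped fosmid ends can be illustrated with a short sketch. It uses a standard greedy interval-cover strategy, which is an assumption made here for illustration and is not stated to be the procedure used in this study. The clone names and coordinates are hypothetical, and the circular genome is treated as linearized at position 0.

```python
from typing import List, Tuple

# (name, start, end) on a linearized genome; start inclusive, end exclusive.
Clone = Tuple[str, int, int]


def minimum_tiling_path(clones: List[Clone], genome_length: int) -> List[Clone]:
    """Greedy interval cover: at each step, among clones that start at or
    before the position covered so far, take the one reaching furthest."""
    ordered = sorted(clones, key=lambda c: c[1])
    path: List[Clone] = []
    covered = 0
    i = 0
    while covered < genome_length:
        best = None
        # Scan every clone that begins inside the already covered region.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"no clone extends coverage past position {covered}")
        path.append(best)
        covered = best[2]
    return path


# Hypothetical mapped clones (coordinates in bp), not data from this study.
GENOME_LENGTH = 108_658
mapped_clones = [
    ("clone_a", 0, 27_000),
    ("clone_x", 5_000, 30_000),
    ("clone_b", 20_000, 53_000),
    ("clone_c", 50_000, 82_000),
    ("clone_y", 60_000, 90_000),
    ("clone_d", 75_000, 108_658),
]

tiling = minimum_tiling_path(mapped_clones, GENOME_LENGTH)
print([name for name, _, _ in tiling])
```

With these made-up coordinates, four overlapping clones are enough to span the whole sequence; clones that add no new coverage are skipped.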
Genomic DNA Assembly
Contigs were prepared using the software Pregap4, including PHRED, PHRAP (www.phrap.org), and vector masking on the average read length, clustering and assembling a repeated sequence. The primer walking procedure was used to close remaining gaps. The map of the first clone selected from PiraGV-K was constructed and a clone capable of covering the region from 60 kb to 85 kb was also screened.
Putative coding regions of the PiraGV-K genome were predicted using the GeneMark, Glimmer and AMIgene open reading frame (ORF) finding software. ORFs of more than 150 bp were designated as putative genes; the overlap between any two ORFs was set to a maximum of 25 amino acids (aa); otherwise, the longer one was selected. Gene annotations and comparison of the sequences with those in public databases were carried out using BLAST at the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/BLAST/). Multiple sequence analysis was performed using Clustal X and GeneDoc (2.7.0). The PiraGV-K genomic DNA sequence was deposited in GenBank under the accession number JX968491.
Twelve betabaculovirus genomes were used to identify gene conservation in PiraGV-K. These genomes were from Adoxophyes orana GV (AdorGV; NC_005038), Agrotis segetum GV (AgseGV; NC_005839), Choristoneura occidentalis GV (ChocGV; NC_008168), Cryptophlebia leucotreta GV (CrleGV; NC_005068), Cydia pomonella GV (CypoGV; NC_002816), Helicoverpa armigera GV (HearGV; NC_010240), Phthorimaea operculella GV (PhopGV; NC_004062), P. rapae GV-Chinese isolate (PiraGV-C; NC_013797), Plutella xylostella GV (PlxyGV; NC_002593), Xestia c-nigrum GV (XecnGV; NC_002331), Pseudaletia unipuncta GV (PsunGV; NC_013772) and Spodoptera litura GV (SpliGV; NC_009503). Detailed descriptions of the putative PiraGV-K ORFs, including their positions in the genome, length, and their relationship with AdorGV, AgseGV, ChocGV, CrleGV, CypoGV, HearGV, PhopGV, PiraGV-C, PlxyGV, PsunGV, SpliGV, and XecnGV are presented in Table S1.
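The ORF selection rule described above (more than 150 bp, at most 25 aa of overlap, otherwise keep the longer ORF) can be sketched as follows. This is one possible reading of the rule, assuming a longest-first greedy pass; it ignores strand and the circularity of the genome, and the ORF names and coordinates are hypothetical.

```python
from typing import List, NamedTuple


class Orf(NamedTuple):
    name: str
    start: int  # 0-based, inclusive
    end: int    # exclusive

    @property
    def length(self) -> int:
        return self.end - self.start


MIN_ORF_BP = 150          # ORFs must be longer than this
MAX_OVERLAP_BP = 25 * 3   # 25 amino acids expressed in nucleotides


def overlap_bp(a: Orf, b: Orf) -> int:
    """Number of nucleotides shared by two ORFs (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def filter_orfs(orfs: List[Orf]) -> List[Orf]:
    """Keep ORFs above the length cutoff; when two overlap by more than the
    allowed amount, the longer one wins because it is considered first."""
    candidates = sorted(
        (o for o in orfs if o.length > MIN_ORF_BP),
        key=lambda o: o.length,
        reverse=True,
    )
    kept: List[Orf] = []
    for orf in candidates:
        if all(overlap_bp(orf, k) <= MAX_OVERLAP_BP for k in kept):
            kept.append(orf)
    return sorted(kept, key=lambda o: o.start)


# Hypothetical predictions, not ORFs from the PiraGV-K annotation.
predicted = [
    Orf("orf_a", 0, 600),      # long ORF, kept
    Orf("orf_b", 500, 800),    # overlaps orf_a by 100 bp, dropped
    Orf("orf_c", 560, 900),    # overlaps orf_a by 40 bp, kept
    Orf("orf_d", 1000, 1120),  # only 120 bp, dropped
]

selected = filter_orfs(predicted)
print([o.name for o in selected])
```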
The whole-genome data of PiraGV-K and relevant sequence information has been maintained in a database at ‘http://edunabi.com/~prgv/’ for ready reference. The PiraGV-K whole genome sequence is registered under GenBank accession number JX968491.
Results and Discussion
The Electrophoretic Separation Method for PiraGV-K Whole-genome Sequencing
Today, most genome sequencing projects rely on the whole-genome shotgun (WGS) method, which uses the Sanger technique to sequence genomic libraries over conventionally mapped clones using bacterial artificial chromosome (BAC), cosmid or fosmid libraries. Although the WGS strategy has provided rapid access to new gene models from diverse organisms with continued improvements in the assemblers, read lengths and mate pair technologies, the resulting assemblies still remain highly fragmented with an incomplete genomic representation. This has helped focus attention on BAC-based physical map construction and its integration with high-density genetic maps that have benefited from next-generation sequencing (NGS) platforms and high-throughput array platforms. In this context, fosmids, with a narrower insert range (average of 40 kb), stable maintenance, and easy production, have found wide applications in studies related to structural variation and the organization of genomes.
The selection of target substances from the environment is the most critical component for the implementation of suitable approaches for whole-genome sequencing. In the case of infectious viruses, the study of the genome is more cumbersome because these agents are difficult to culture and purify. Conventional methods for the purification of genomic DNA fragments present the drawback of obtaining a large number of populations from multiple locations to acquire sufficient high-quality DNA samples for sequence analysis.
The genome sequencing method (Fig. 1), detailed here for the first time, was used to construct fosmid library clones of the double-stranded PiraGV-K genome, generating a library size of 100–150 kb corresponding to the genome size of the virus. This approach was successful in the analysis of the PiraGV-K genome, without the need for purifying PiraGV-K from P. rapae, thus simplifying sampling and reducing labor time. This approach provides a significant advantage over traditional protocols for the sequencing of dsDNA genomes and could potentially be used for circular DNA genomes of viruses, although its wider application needs to be further validated. Recently, a report highlighted the importance of sequencing small genomes without the need for standard library preparation using the Pacific Biosciences RS sequencer (the “PacBio”) with as little as 1 ng of DNA. That our method can be performed without the specialized expertise required for virus culturing and purification from their hosts, coupled with its requirement for little time and reliable precision, makes it particularly useful for laboratories lacking sophisticated viral culturing facilities. A limitation of sequencing genomes purified by the electrophoretic method may lie in its application to RNA viruses, because RNA is less stable than DNA in nature and may require the maintenance of cultured viral isolates, unlike our approach. A new system for rapid determination of viral RNA sequence (RDV) uses small amounts of RNA to synthesize first- and second-strand cDNAs for library construction and direct sequencing using optimized primers. Although reverse transcription followed by polymerase chain reaction is commonly used for deciphering RNA viral genomes, low-copy number viral samples remain a challenge; sequence-independent methods provide attractive solutions.
Flow chart showing the electrophoretic method for purification of the virus from the host genome for the construction of fosmid library of PiraGV-K and its significance in comparison with the traditional methods.
In the method described here, HMW DNA embedded agarose plugs of P. rapae were digested with EcoRI, before confirmation of the potential PiraGV-K DNA at 125 kb by PFGE analysis (Fig. 2). The potential PiraGV-K DNA was found readily when EcoRI (8 U) and methylase (20 U) were used after a 2 hr pre-electrophoresis step. The partial digestion step is considered critical both for the construction of the host BAC library and for converting the viral genome into a family of circularly permuted linear molecules of genome length. The linear form of the viral genome thus obtained from the digestion step facilitates efficient separation of the genomic DNA in PFGE. Subsequently, PCR was conducted with different primers, designed to provide variable sizes from the nucleotide sequence of PiraGV-K, to check the validity of the potential PiraGV-K DNA. The PCR product size in all cases was found to be the same as expected for the PiraGV-K DNA sequence (Fig. 3). Subsequently, for effective separation of PiraGV-K DNA, pre-electrophoresis and partial digestion of agarose plugs was repeated with PFGE. Following the PFGE run, the DNA band of 125 kb corresponding to PiraGV-K DNA was eluted, eventually separating PiraGV-K DNA from P. rapae embedded agarose molds (Fig. 4A). The eluted DNA (20 ng) was subsequently electrophoresed in parallel with a 1 kb ladder to validate the separation process (Fig. 4B). The eluted and end-repaired PiraGV-K DNA was ligated into the pCC1FOS vector and the purified products were checked for quality by titering. In total, approximately 6,000 clones resulted, out of which 96 were selected and end-sequenced. To effectively map the fosmid-end sequences, we performed a stand-alone BLAST against a locally constructed viral sequence database. Based on the mapping data from the databases, a minimum tiling path (MTP) was prepared, leading to the selection of four fosmid library clones for the construction of a PiraGV-K shotgun library.
The sizes of the four selected fosmid clones, (NB-FOS-1-1-F40_C07D02, NB-FOS-1-1-F40_E13E04, NB-FOS-1-1-F40_A05A02, and NB-FOS-1-1-F40_A23B06), measured by NotI restriction digestion were approximately 32, 37, 27, and 33 kb, respectively (Fig. 5). The shotgun library resulted in a total of 20,000 clones, of which 96 were selected and sequenced (Fig. 6).
HMW DNA embedded agarose plugs of P. rapae confirmed by PFGE, wherein the plugs were partially digested by an enzyme mixture following pre-electrophoresis. ‘M’ represents PFG lambda marker (NEB) and lanes 1–5 depict EcoRI digested DNA molds. A potential PiraGV-K DNA band was seen approximately at 125 kb after PFGE of enzyme digested DNA. PFGE conditions included 1% pulsed field certified agarose gel, a pulse time between 0.1–40 sec for up to 16 hrs and a voltage of 6 V/cm to check for partially digested plugs.
PCR was conducted to check the identity of PiraGV-K with 5 primers designed from the nucleotide sequence of PiraGV-K. The size of the PCR product was the same as the expected size of the nucleotide sequence. Lane 1; primer 1: AY-519253-1 (expected size of 227 bp), lane 2; primer 2: AY-706575-1 (expected size of 223 bp), lane 3; primer 3: AY-428513-1 (expected size of 234 bp), lane 4; primer 4: AY-449794-2 (expected size of 212 bp), lane 5; primer 5: AY-519252-1 (expected size of 231 bp).
(A) Elution of DNA band (approximately 125 kb) of potential PiraGV-K. This indicates that the DNA of PiraGV-K is separated from P. rapae DNA embedded agarose molds. Lanes 1 and 4 show PFG lambda marker (NEB) and lanes 2 and 3 depict EcoRI digested DNA molds. (B) This indicates the concentration of DNA that has been collected by PFGE as determined using a spectrophotometer. Lanes 1 and 2 show eluted DNA (20 ng loading) and a 1 kb ladder, respectively.
Four fosmid clones were selected on the basis of minimum tiling path towards construction of shotgun library. Lane 1, fosmid clone NB-FOS-1-1-F40_C07D02 (approximately 32 kb); Lane 2, fosmid clone NB-FOS-1-1-F40_E13E04 (approximately 37 kb); Lane 3, fosmid clone NB-FOS-1-1-F40_A05A02 (approximately 27 kb); Lane 4, fosmid clone NB-FOS-1-1-F40_A23B06 (approximately 33 kb). Lane ‘M’ is represented by monocot lambda marker.
Genomic DNA or BAC DNA isolation and purification was followed by size fractionation and ligation into a pUC118 ready vector at 4°C, followed by transformation by electroporation into DH5α. The quality of the shotgun library thus constructed was checked by titering (40 µl of cell stock, white: blue = 400∶100). The number of clones was approximately 20,000 in total. 96 clones were selected and sequenced, including insert size check, E. coli and vector % check.
Characteristics of the PiraGV-K Genome Sequence
To date, whole-genome sequencing has been conducted successfully for 60 baculoviruses: 45 were NPVs (41 alphabaculoviruses, 3 gammabaculoviruses and 1 deltabaculovirus). Only 14 complete genomes have been sequenced for betabaculoviruses, including PiraGV-C. The growing number of fully sequenced baculovirus genomes now allows some understanding of the evolutionary history of baculoviruses by comprehensive analyses of nucleotide/protein sequences, gene order, and content. We have sequenced and analyzed the 108,658 bp PiraGV-K genome purified by the electrophoretic method. The approach allows for the determination of the viral sequence with multiple-fold redundancy per base position. A sequence with 8× coverage of the PiraGV-K genome was compiled from the sequence data generated here. The size of the final draft sequence was 108,658 nt (Fig. 7). The length of the sequence obtained was consistent with the predicted size of PiraGV-C (108,592 nt), differing by only 66 nt. It can thus be categorized as one of the smaller betabaculovirus sequences, with AdorGV (99,657 nt) being the smallest. XecnGV has a whole genome size of 178,733 nt, which is the largest genome among the completely sequenced betabaculoviruses, and is closely related to sequences studied from noctuid moths, including Autographa gamma GV, Hoplodrina ambigua GV, Euxoa ochrogaster GV, and Scotogramma trifolii GV. It is closely followed by HearGV, with a genome size of 169,794 bp. PiraGV-K coding sequences represent 92% of the genome, leaving very little noncoding DNA.
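The relationship between fold redundancy, read count and genome size can be made concrete with a small calculation. The genome lengths are taken from the text; the mean read length of 700 bp is an assumption chosen only for illustration and is not reported in this study.

```python
import math

PIRAGV_K_LENGTH = 108_658  # nt, from the text
PIRAGV_C_LENGTH = 108_592  # nt, from the text
ASSUMED_READ_LENGTH = 700  # bp, hypothetical Sanger read length


def fold_coverage(n_reads: int, mean_read_length: int, genome_length: int) -> float:
    """Average number of times each base is sequenced."""
    return n_reads * mean_read_length / genome_length


def reads_needed(target_fold: float, mean_read_length: int, genome_length: int) -> int:
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_fold * genome_length / mean_read_length)


n_reads = reads_needed(8, ASSUMED_READ_LENGTH, PIRAGV_K_LENGTH)
print(n_reads)
print(round(fold_coverage(n_reads, ASSUMED_READ_LENGTH, PIRAGV_K_LENGTH), 2))
print(PIRAGV_K_LENGTH - PIRAGV_C_LENGTH)  # size difference between the two isolates
```

Under the assumed read length, roughly 1,200 to 1,300 reads would give 8× average coverage; real projects need more because coverage is uneven and reads are trimmed.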
ORFs are represented by arrows, with the position and direction of the arrow indicating ORF position and orientation. Red arrows and blue arrows represent forward and reverse strand ORFs in the circular map. VOG code and colors assigned indicate the grouping of the genes according to function.
The PiraGV-K genome has an AT content of 66%, identical to that of PiraGV-C (66%) and close to that of CrleGV, which has the highest known AT content of 67.6%. This result is consistent with previous findings that the sequenced betabaculovirus genomes are AT-rich, with the lowest AT content of 54.8% observed in the case of CypoGV, and an overall average of 62.6%. The difference in AT content is due to the base composition at the third nucleotide position of the codon in the coding regions. It has been established previously that proteins encoded by more extreme AT- and GC-rich genomes generally have lower compositional complexity than those of more typical organisms. A consequence of this is that the overall amino acid composition of the peptides in such organisms is skewed. Peptides of AT-rich organisms have higher proportions of Phe, Leu, Ile, Met, Asn, Lys and Tyr, which are relatively rare in organisms with GC-rich genomes. A similar correlation has been noted with smaller data sets in earlier research. The end result of this is that organisms with an extreme genome composition encode peptides of a lower complexity, as measured by the global complexity value. It is known that the median global complexity value, G1, for AT-rich genes from a variety of cellular organisms is in the range of 0.72 to 0.78. Whereas most PiraGV-K ORFs had an AT composition (average 65%) close to the average AT composition of the PiraGV-K genome (66%), granulin had an AT composition that was significantly lower at 56% (results not shown). It is to be noted that in the case of extremely anchored proteins, such as granulin, it might be impossible for the virus to maintain its preferred nucleotide composition and codon usage and still encode a particular peptide. This observation has been confirmed in other annotated, AT-rich, viral genomes. Also, it is understood that, although various ORF prediction methods have been used (Fig. 8), no one method can define all possible ORFs in compositionally extreme (AT- or GC-rich) genomes, as is clearly illustrated in the PiraGV-K genome. PiraGV-K granulin had a subjective appearance of an “alien” gene, because the codon usage did not conform to the overall codon usage. However, we believe that granulin represents a specific class of highly expressed, complex peptide that the virus encodes by sacrificing the constraints it maintains on other genes.
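The AT composition values discussed above, including the contribution of the third codon position, can be computed with a few lines of code. This is a minimal sketch; the short coding sequence is a made-up example and not a PiraGV-K gene.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A and T among unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)


def third_position_at_fraction(cds: str) -> float:
    """AT fraction restricted to the third base of each complete codon."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return at_fraction(cds[2:usable:3])


# Hypothetical coding sequence: ATG AAA TTT GGC
toy_cds = "ATGAAATTTGGC"

print(round(at_fraction(toy_cds), 3))
print(round(third_position_at_fraction(toy_cds), 3))
```

Applying both functions to each annotated ORF and to the whole genome would allow an outlier such as granulin to be compared with the genome-wide average.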
The putative coding regions were predicted using the ORF-finding software GeneMark (Georgia Institute of Technology, Atlanta, Georgia), Glimmer, a microbial gene-finding system (University of Maryland), and AMIgene, an integrated computer environment for sequence annotation and analysis (Institut Pasteur, France).
The primary criteria used to identify potential ORFs on the PiraGV-K genome were a minimum of 50 aa in length, having minimal overlap with larger ORFs, and sharing significant sequence identity with previously characterized ORFs of betabaculoviruses (Table 1). Also, by convention, the first nucleotide of the methionine start codon of granulin was defined as the first nucleotide of the genome, and the sequence was numbered in the direction of transcription of the gene. As in the case of other baculovirus genomes, minimal overlaps were observed in the PiraGV-K genome sequence, with 65 ORFs in the granulin-sense orientation and 54 in the opposite orientation, clustering together according to expression or function. Homologous repeat regions (hrs), functioning as enhancers of transcription and origins of replication, were also found interspersed in the genome. These repeated sequences have been reported to be more variable in betabaculoviruses than in alphabaculoviruses, where they consist of repeated palindromes. The CypoGV genome includes 13 hrs, as do the XecnGV and HearGV genomes. The AdorGV genome includes nine repeated regions that are unlike typical hrs. Six repeat regions, including one unique hr, have also been identified in the EppoMNPV genome. In the completely sequenced genome of SpltNPV, 17 hrs were identified. In AcMNPV, hrs consist of repeated units of about 70 bp with an imperfect 30 bp palindrome near their center, binding to the transcriptional activator ie1 (Ac147). Also, cAMP and 12-O-tetradecanoylphorbol 13-acetate (TPA) response element (CRE and TRE)-like sequences, located between hr palindromes, have been found to be evolutionarily conserved in alphabaculoviruses, but were not found in betabaculoviruses.
Genomic sequence identity of PiraGV-K was studied against other known betabaculovirus genomes, with a maximum identity of 99% with PiraGV-C (Table 2). The 1% difference was thought to be related to the presence of extra nucleotides in the intronic sequences of the PiraGV-K genome that did not correspond to any known ORF. The identity with other genomes was in the order of 42–58%, with greater identity with ChocGV (58.5%), CrleGV (55.78%) and CypoGV (55.6%) genome sequences. Of a total of 120 ORFs, only ORFs 9, 32, and 117 were considered unique to the PiraGV genome sequences of the Korean and Chinese strains. This represents 1.7% of the whole genome sequence. Also, the 78 ORFs found in all betabaculovirus sequences studied have been called “core GV genes”. Based on gene function, PiraGV-K ORFs have been grouped into four functional categories (Table 3): transcription (10 genes), replication (11 genes), structural (25 genes), and auxiliary (15 genes), with 59 unrepresented in the annotation. The most conserved among the core set of genes was granulin, with 100% identity with PiraGV-C. We compared the identified PiraGV-C ODV-associated proteins with the structural proteins found in PiraGV-K and found that the ORFs complemented and matched each other. PiraGV-K-ORF 1 (granulin), ORF-14 (odv-e18), ORF-15 (p49), ORF-16 (odv-e56), ORF-17 (p10), ORF-39 (odv-e66), ORF-44 (odv-e66a), ORF-45 (ubiquitin), ORF-51 (p74), ORF-61 (pif-1), ORF-71 (p6.9), ORF-75 (helicase-1), ORF-81 (vp39), ORF-82 (odv-e27), ORF-85 (vp91 capsid), ORF-88 (gp-41), ORF-90 (vlf-1), ORF-93 (DNA pol), ORF-95 (lef-3), ORF-118 (fgf-3) and ORF-120 (ME-53) were also among the reported proteins in PiraGV-C. Other proteins common to both the PiraGV genomes were found to be hypothetical or unknown proteins.
PiraGV-K ORF 98 encoded an inhibitor of apoptosis (iap-5) that seems to be betabaculovirus specific. Also, PiraGV-K ORF 37 (homologous to Cypo46, Xecn40, and Plxy35) is likely a member of the stromelysin family within the matrix metalloproteinase (MMP) superfamily. It has been observed that this peptide is retained within infected cells until death, and is subsequently released into the body of the insect, causing proteolysis of tissues. The most conserved baculovirus gene is polyhedrin/granulin, the major component of occlusion bodies. Another conserved PiraGV-K structural gene was odv-e25 (PiraGV-K ORF 76), showing 80% amino acid identity to betabaculovirus homologs. In contrast, p24 capsid (PiraGV-K ORF 58), which encodes a protein associated with both ODV and BV, was found to be poorly conserved (60% average amino acid identity to other betabaculoviruses). The p80/p87-capsid gene was absent from the PiraGV-K genome, as in other betabaculovirus genomes. The putative p10 gene (PiraGV-K ORF 17) showed similarities to three XecnGV ORFs (Xecn ORF 5, Xecn ORF 19, and Xecn ORF 83). Homologs of these three ORFs were found in PlxyGV (Plxy ORF 2, Plxy ORF 21, and Plxy ORF 50) and they were thus suggested to be p10 homologs. p10 is implicated in occlusion body morphogenesis and disintegration of the nuclear body matrix, resulting in dissemination of OBs. In NPV-infected cells, p10 forms fibrillar structures in the nucleus and cytoplasm. PiraGV-K ORF 17 showed a low identity (14%) with AcMNPV p10, and was smaller than its counterpart (104 vs 336 amino acids). Among betabaculoviruses, a higher sequence identity of 48% was noted with ClanGV p10, which has 101 amino acid residues.
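The amino acid identity values quoted above depend on how identity is computed from an alignment. The sketch below shows one common convention and is an assumption, not the procedure used in this study: identical residues are counted over all aligned columns, columns with a gap in only one sequence count in the denominator, and columns gapped in both sequences are skipped. The function name `percent_identity` and the toy alignment are hypothetical.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identity between two rows of a pairwise alignment.

    Both strings must have the same length and use '-' for gaps.
    Denominator: all columns except those gapped in both rows (assumption).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == "-" and b == "-":
            continue                 # column carries no information
        compared += 1
        if a == b:                   # cannot both be gaps here
            matches += 1

    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy alignment for illustration only; not taken from any baculovirus protein.
    print(round(percent_identity("MKV-LA", "MKVTLG"), 1))
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped columns, so values computed in different ways are not directly comparable.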
The PiraGV-K genome did not encode the glycoprotein gp64, which constitutes a major envelope fusion protein in AcMNPV, BmNPV, OpMNPV, and EppoMNPV. This protein thus appears to be unique to group I NPVs. Also, 19 lef genes have been found in the AcMNPV genome, and have been implicated in DNA replication and transcription. Early baculovirus genes are transcribed by the host cell RNA polymerase II, but these are often transactivated by genes such as ie-0, ie-1, ie-2, and pe38. Of these early baculovirus genes, the PiraGV-K genome contained only ie-1, which was found to be poorly conserved in comparison with other betabaculovirus genomes, except PiraGV-C. These genes have previously been reported to be poorly conserved among baculoviruses. The CypoGV and PhopGV genomes have been reported to have a pe38, consistent with the PiraGV-K genome.
Six genes have been described as essential for baculovirus DNA replication: lef-1, lef-2, lef-3, dnapol, helicase and ie-1. Homologs of all these essential genes were found in the PiraGV-K genome, with moderately conserved sequences. A PiraGV-K genome-wide scan suggested the absence of a lef-7 homolog. Earlier reports suggested that lef-7 was a group I NPV-specific gene that stimulated transient DNA replication in AcMNPV and BmNPV. The PiraGV-K ORFs also encode a DNA ligase (PiraGV-K ORF 102) and a helicase-2 (PiraGV-K ORF 108), in common with LdMNPV and other betabaculovirus genomes. The LdMNPV DNA ligase displays catalytic properties of a type-III DNA ligase. Because the homologs of helicase-2 and DNA ligase are involved in DNA repair and recombination, the PiraGV-K genes likely have similar functions. The PiraGV-K genome lacks the genes for the large (rr1) and small (rr2) subunits of ribonucleotide reductase and for deoxyuridine triphosphatase (dUTPase); their absence may account for the loss of the enzymatic functions that facilitate virus replication in non-dividing cells, where dNTP pathways are inactive. The lack of these genes has also been noted in alphabaculoviruses, such as AcMNPV, BmNPV, HaSNPV, HzSNPV, and EppoMNPV, and in other betabaculoviruses, such as PlxyGV and XecnGV. Late transcription genes, including lef 4–6, 8–11, 39K, p47, and vlf-1, have been found among the PiraGV-K ORFs, with the exception of a lef-10 homolog. The most conserved PiraGV-K lef homolog was lef-8, while lef-6 was the most poorly conserved. The GV lef-6 genes are known to be smaller than the NPV lef-6 genes (86–102 amino acids vs 138–187 amino acids), as reported for the XecnGV genome.
Chitinase and cathepsin were present as auxiliary genes in the PiraGV-K genome. These genes have been identified in almost all the baculoviruses completely sequenced to date, except PlxyGV and AdorGV. The protein products encoded by these genes provide selective advantages in the breakdown of insect tissues at the end of infection and the release of OBs to the environment, which then spread horizontally. Their absence from the PlxyGV and AdorGV genomes may explain why infected larvae do not lyse at the end of infection; instead, viral infection may spread through the discharge of large amounts of virus from the posterior ends of the larvae. PiraGV-K ORF 50 corresponded to superoxide dismutase (sod), a well-conserved gene in baculoviruses. Among the betabaculoviruses, it was not reported in the SpliGV genome, although it is known in other betabaculoviruses. Although SOD functions as an endogenous antioxidant, its precise function in baculoviruses remains unknown. Gene deletion studies conducted in AcMNPV did not show any deleterious effect, although it may be predicted that SOD protects OBs from superoxide radicals generated by exposure to sunlight in the environment.
PiraGV-K ORF 45 corresponded to ubiquitin, which has been identified in all baculoviruses sequenced to date, although it was found fused to gp37 as a single ORF in SpltMNPV. Apart from polyhedrin and granulin, it is also one of the most highly conserved genes in the baculovirus genome, with 73% average amino acid identity to betabaculovirus homologs. Interestingly, the homolog of viral ubiquitin has not been reported in AcMNPV-ODV or HearNPV-ODV, but is known in AcMNPV-BV. Per os infectivity factor (pif) genes, another highly conserved group involved in the oral infectivity of baculovirus ODV, have been characterized from almost all baculovirus genomes sequenced so far. We identified ORF 61, corresponding to pif-1, and ORF 16, corresponding to ODV-E56, also known as pif-5, in the PiraGV-K genome. Although pif-1 and p74 (ORF 51 in the PiraGV-K genome) have been proposed to form structural components of the ODV envelope and may regulate infectivity of OBs, pif-5 is not an essential protein for binding and fusion of ODV or for virus replication. Additionally, the PiraGV-K genome was found to contain three putative fibroblast growth factors (fgf), represented by ORFs 62, 105, and 118. These fgfs contained the fgf superfamily domains, as determined by a conserved family domain search with the BLAST program. No enhancin homolog was found in the PiraGV-K genome, consistent with its absence from the AdorGV, CypoGV and PlxyGV genomes. In contrast to these betabaculovirus genomes, four enhancin homologs were reported in XecnGV, two in LdMNPV, and one in MacoNPV. Enhancin disrupts the insect peritrophic membrane and facilitates the initiation of infection. PiraGV-K ORF 13 corresponded to the gp37 homolog (spindlin, acting as an enhancing factor), which was shown to be absent from the AdorGV, AgseGV, ChocGV, CrleGV, PhopGV, PlxyGV, and SpliGV genomes, although the ORF was reported in the CypoGV, HearGV, PsunGV, XecnGV, and PiraGV-C genomes.
Furthermore, PiraGV-K was found to lack a conotoxin-like (ctl) homolog, as reported in the BmNPV, SeMNPV, HaSNPV, AdorGV, CypoGV, and PlxyGV genomes, although a ctl homolog has been identified in the genome of XecnGV. The ORF contains a six-cysteine motif similar to that in chitin-binding proteins. A gene encoding protein kinase 1 (pk-1; PiraGV-K ORF 3) was also identified in the whole-genome sequence of PiraGV-K; this may be involved in the regulation of the phosphorylation status of viral and host proteins during infection. Two members of the iap gene family, corresponding to iap (PiraGV-K ORF 79) and iap-5 (PiraGV-K ORF 98), were also identified in the PiraGV-K genome. Although p35, which has antiapoptotic activity, has been identified previously in the AcMNPV, BmNPV, and SpltMNPV genomes, it is absent from betabaculovirus genomes. The iap homologs generally contain two baculovirus IAP repeats (BIRs), which are associated with binding to apoptosis-inducing proteins, and a C-terminal zinc finger-like (RING) Cys/His motif. The iap-5 gene appears to be GV-specific, and all betabaculoviruses sequenced to date have iap-5. PiraGV-K ORF 94 is a homolog of Plxy ORF 94, named desmoplakin because it shows similarity to an internal region of human desmoplakin, an essential constituent of intracellular junctions. Baculovirus-repeated ORFs (bro) have not been seen in the PiraGV-K genome, although truncated versions have been observed in CpGV. These repeats are more conspicuously present in many baculoviruses (between 1 and 16 copies), although their function is unclear, with the possibility of binding to DNA.
Two uncharacterized ORFs were also identified in the whole genome sequence of PiraGV-K and PiraGV-C, indicated as PiraGV-K ORF 9 and PiraGV-K ORF 117.
There has been a significant increase in the number of whole-genome sequencing projects using the shotgun method, but traditional mapped-clone methods using BAC, cosmid, and fosmid libraries remain an important intermediate layer for hybrid sequencing strategies. With a view towards advancing whole-genome sequencing strategies for infectious viruses, we adopted a method for the construction of a fosmid library from viral DNA mixed with that of the infected host, followed by screening of only the viral genomic library. The method avoids the often-difficult need to culture and purify viruses, as required by traditional methods of genome analysis, and makes starting material easier to obtain than purification of virus particles from inclusion bodies. The viral DNA is recovered in amounts sufficient for classical genome sequencing, without recourse to automated high-throughput NGS technology. Thus, the analysis of the PiraGV-K genome by the novel method of electrophoretic separation provides a significant advance towards the analysis of other infectious viruses.
Analysis and homology search of PiraGV-K ORFs. The PiraGV-K ORFs were analyzed for homology using representative granulovirus genomes: Adoxophyes orana granulovirus (AdorGV), Agrotis segetum granulovirus (AgseGV), Choristoneura occidentalis granulovirus (ChocGV), Cryptophlebia leucotreta granulovirus (CrleGV), Cydia pomonella granulovirus (CypoGV), Helicoverpa armigera granulovirus (HearGV), Phthorimaea operculella granulovirus (PhopGV), Pieris rapae granulovirus-Chinese isolate (PiraGV-C), Plutella xylostella granulovirus (PlxyGV), Pseudaletia unipuncta granulovirus (PsunGV), Spodoptera litura granulovirus (SpliGV) and Xestia c-nigrum granulovirus (XecnGV). Pid and Psi refer to percent identity and percent similarity, respectively.
Conceived and designed the experiments: YSL YSH BBP. Performed the experiments: YHJ SWK SHC MYN GWS HJH. Analyzed the data: BBP SO HCJ JYN JEJ KK. Contributed reagents/materials/analysis tools: YSL YSH BBP DHK. Wrote the paper: BBP YSL YSH KK.
- 1. Garcia-Maruniak A, Maruniak JE, Zanatto PMA, Doumbouya AE, Liu JC, et al. (2004) Sequence analysis of the genome of the Neodiprion sertifera Nucleopolyhedrovirus. J Virol 78: 7036–7051.
- 2. Jehle JA, Lange M, Wang H, Zhihong H, Wang Y, et al. (2006) Molecular identification and phylogenetic analysis of baculoviruses from Lepidoptera. Virology 346: 180–193.
- 3. Lauzon HA, Garcia-Maruniak A, Zanatto PM, Clemente JC, Herniou EA, et al. (2006) Genomic comparison of Neodiprion sertifera and Neodiprion lecontei nucleopolyhedroviruses and identification of potential hymenopteran baculovirus-specific open reading frames. J Gen Virol 87: 1477–1489.
- 4. Hashimoto Y, Hayakawa T, Ueno Y, Fugita T, Sano Y, et al. (2000) Sequence analysis of the Plutella xylostella granulovirus genome. Virology 275: 358–372.
- 5. Hayakawa T, Ko R, Okano K, Seong S, Goto C, et al. (1999) Sequence analysis of the Xestia c-nigrum granulovirus genome. Virology 262: 277–297.
- 6. Hayakawa T, Rohrmann GF, Hashimoto Y (2000) Patterns of genome organization and content in lepidopteran baculoviruses. Virology 278: 1–12.
- 7. Lange M, Jehle JA (2003) The genome of the Cryptophlebia leucotreta granulovirus. Virology 317: 220–236.
- 8. Afonso CL, Tulman ER, Lu Z, Balinsky CA, Moser BA, et al. (2001) Genome sequence of a baculovirus pathogenic for Culex nigripalpus. J Virol 75: 11157–11165.
- 9. Moser BA, Becnel JJ, White SE, Alfonso C, Kutish G, et al. (2001) Morphological and molecular evidence that Culex nigripalpus baculovirus is an unusual member of the family Baculoviridae. J Gen Virol 82: 283–297.
- 10. Winstanley D, Crook NE (1993) Replication of Cydia pomonella granulosis virus in cell cultures. J Gen Virol 74: 1599–1609.
- 11. Wang XF, Zhang BQ, Xu HJ, Cui YJ, Xu YP, et al. (2011) ODV-associated proteins of the Pieris rapae granulovirus. J Prot Res 10: 2817–2827.
- 12. Zhang BQ, Cheng RL, Wang XF, Zhang CX (2012) The genome of Pieris rapae granulovirus (Genome announcement). J Virol 86: 9544.
- 13. Yanisch-Perron CJ, Vieira J, Messing J (1985) Improved M13 phage cloning vectors and host strains: nucleotide sequencing of the M13mp18 and pUC9 vectors. Gene 33: 103–119.
- 14. Ewing B, Green P (1998) Base-calling of automated sequencer traces using PHRED. II. Error probabilities. Genome Res 8: 186–194.
- 15. Ewing B, Hillier L, Wendl M, Green P (1998) Base-calling of automated sequencer traces using PHRED. I. Accuracy assessment. Genome Res 8: 175–185.
- 16. Borodovsky M, McIninch JD (1993) GeneMark: Parallel gene recognition for both DNA strands. Comp Chem 17: 123–133.
- 17. Delcher AL, Harmon D, Kasif S, White O, Salzberg SL (1999) Improved microbial gene identification with GLIMMER. Nucleic Acids Res 27: 4636–4641.
- 18. Bocs S, Cruveiller S, Vallenet D, Nuel G, Medique C (2003) AMIGene: Annotation of microbial genes. Nucleic Acids Res 31: 3723–3726.
- 19. Wormleaton S, Kuzio J, Winstanley D (2003) The complete sequence of the Adoxophyes orana granulovirus genome. Virology 311: 350–365.
- 20. Escasa SR, Lauzon HAM, Mathur AC, Krell PJ, Arif BM (2006) Sequence analysis of the Choristoneura occidentalis granulovirus genome. J Gen Virol 87: 1917–1933.
- 21. Luque T, Finch R, Crook N, O’Reilly, Winstanley D (2001) The complete sequence of the Cydia pomonella granulovirus genome. J Gen Virol 82: 2531–2547.
- 22. Chen XW, Ijkel WFJ, Tarchini R, Sun XL, Sandbrink H, et al. (2001) The sequence of the Helicoverpa armigera single nucleocapsid nucleopolyhedrovirus genome. J Gen Virol 82: 241–257.
- 23. Wang Y, Choi JY, Roh JY, Liu Q, Tao XY, et al. (2011) Genomic sequence analysis of granulovirus isolated from the tobacco cutworm, Spodoptera litura. PLoS ONE 6(11): e28163. doi:10.1371/journal.pone.0028163.
- 24. Park TH, Park BS, Kim JA, Hong JK, Jin M, et al. (2011) Construction of random sheared fosmid library from Chinese cabbage and its use for Brassica rapa genome sequencing project. J Genet Genomics 38: 47–53.
- 25. Huang SW, Lin YY, You EM, Liu TT, Shu HY, et al. (2011) Fosmid library end sequencing reveals a rarely known genome structure of marine shrimp Penaeus monodon. BMC Genomics 12: 242.
- 26. Ariyadasa R, Stein N (2012) Advances in BAC-based physical mapping and map integration strategies in plants. J Biomed Biotech doi: 10.1155/2012/184854.
- 27. Feuillet C, Leach JE, Rogers J, Schnable PS, Eversole K (2011) Crop genome sequencing: lessons and rationales. Trends Plant Sci 16: 77–88.
- 28. Alkan C, Sajjadian S, Eichler EE (2011) Limitations of next generation genome sequence assembly. Nature Methods 8: 61–65.
- 29. Schatz MC, Delcher AL, Salzberg SL (2010) Assembly of large genomes using second-generation sequencing. Genome Res doi: 10.1101/gr.101360.109.
- 30. Kim CG, Fujiyama A, Saitou N (2003) Construction of a gorilla fosmid library and its PCR screening system. Genomics 82: 571–574.
- 31. Meyer JDF, Deleu W, Garcia-Mas J, Havey MJ (2008) Construction of a fosmid library of cucumber (Cucumis sativus) and comparative analyses of the Eif4E and eIF(iso)4E regions from cucumber and melon (Cucumis melo). Mol Genet Genomics 279: 473–480.
- 32. Hao DC, Ge G, Xiao P, Zhang Y, Yang L (2011) The first insight into the tissue specific Taxus transcriptome via Illumina second generation sequencing. PLoS ONE 6. doi:10.1371/journal.pone.0021220.
- 33. Coupland P, Chandra T, Quail M, Reik W, Swerdlow H (2012) Direct sequencing of small genomes on the Pacific Biosciences RS without library preparation. BioTechniques 53: 365–372.
- 34. Mizutani T, Endoh D, Okamoto M, Shirato K, Shimizu H, et al. (2007) Rapid genome sequencing of RNA Viruses. Emerging Infectious Diseases 13: 322–324.
- 35. Malboeuf CM, Yang X, Charlebois P, Qu J, Berlin AM, et al. (2013) Complete viral RNA genome sequencing of ultra-low copy samples by sequence-independent amplification. Nucleic Acids Res 41(1): e13. doi:10.1093/nar/gks794.
- 36. Ninomiya M, Ueno Y, Funayama R, Nagashima T, Nishida Y, et al. (2012) Use of illumina deep sequencing technology to differentiate hepatitis C virus variants. J Clin Microbiol 50: 857–866.
- 37. Li L, Donly C, Li Q, Willis LG, Keddie BA, et al. (2002a) Identification and genomic analysis of a second species of nucleopolyhedrovirus isolated from Mamestra configurata. Virology 297: 226–244.
- 38. Li Q, Donly C, Li L, Willis LG, Theilmann DA, et al. (2002b) Sequence and organization of the Mamestra configurata nucleopolyhedrovirus genome. Virology 294: 106–121.
- 39. Lange M, Wang H, Zhihong H, Jehle JA (2004) Towards a molecular identification and classification system of lepidopteran-specific baculovirus. Virology 325: 36–47.
- 40. Harrison RL, Popham HJ (2008) Genomic sequence analysis of a granulovirus isolated from the Old World bollworm, Helicoverpa armigera. Virus Genes 36: 565–581.
- 41. Wan H, Wootton JC (2000) A global compositional complexity measure for biological sequences: AT-rich and GC-rich genomes encode less complex proteins. Comput Chem 24: 71–94.
- 42. Jukes TH, Bhushan V (1986) Silent nucleotide substitutions and G+C content of some mitochondrial and bacterial genes. J Mol Evol 24: 39–44.
- 43. D’Onofrio G, Mouchiroud D, Aissani B, Gautier C, Bernardi G (1991) Correlations between the compositional properties of human genes, codon usage, and amino acid composition of proteins. J Mol Evol 32: 504–510.
- 44. Mount DW (2001) Bioinformatics: Sequence and Genome analysis. Cold Spring Harbor, NY, p.308.
- 45. Hilton S, Winstanley D (2008) Genomic sequence and biological characterization of a nucleopolyhedrovirus isolated from the summer fruit tortrix, Adoxophyes orana. J Gen Virol 89: 2898–2908.
- 46. Karlin S (1998) Global dinucleotide signatures and analysis of genomic heterogeneity. Curr Opin Microbiol 1: 598–610.
- 47. Hyink O, Dellow RA, Olsen MJ, Caradoc-Davies KMB, Drake K, et al. (2002) Whole genome analysis of the Epiphyas postvittana nucleopolyhedrovirus. J Gen Virol 83: 957–971.
- 48. Pang Y, Yu J, Wang L, Hu X, Li W, et al. (2001) Sequence analysis of the Spodoptera litura multicapsid nucleopolyhedrovirus genome. Virology 287: 391–404.
- 49. Leisy DJ, Rasmussen C, Kim HT, Rohrmann GF (1995) The Autographa californica nuclear polyhedrosis virus homologous region 1a: identical sequences are essential for DNA replication activity and transcriptional enhancer function. Virology 208: 742–752.
- 50. Ko R, Okano K, Maeda S (2000) Structural and functional analysis of the Xestia c-nigrum granulovirus matrix metalloproteinase. J Virol 74: 11240–11246.
- 51. Wolgamot GM, Gross CH, Russell RLQ, Rohrmann GF (1993) Immunocytochemical characterization of p24, a baculovirus capsid-associated protein. J Gen Virol 74: 103–107.
- 52. van Oers MM, Vlak JM (1997) The baculovirus 10-kDa protein. J Invertebr Pathol 70: 1–17.
- 53. Monsma SA, Oomens AG, Blissard GW (1996) The GP64 envelope fusion protein is an essential baculovirus protein required for cell-to-cell transmission of infection. J Virol 70: 4607–4616.
- 54. Pearson MN, Groten C, Rohrmann GF (2000) Identification of the Lymantria dispar nucleopolyhedrovirus envelope fusion protein provides evidence for a phylogenetic division of the Baculoviridae. J Virol 74: 6126–6131.
- 55. Ijkel WFJ, Westenberg M, Goldbach RW, Blissard GW, Vlak JM, et al. (2000) A novel baculovirus envelope fusion protein with a proprotein convertase cleavage site. Virology 275: 30–41.
- 56. Rapp JC, Wilson JA, Miller LK (1998) Nineteen baculovirus open reading frames, including LEF-12, support late gene expression. J Virol 72: 10197–10206.
- 57. Friesen PD (1997) Regulation of baculovirus early gene expression. In: L.K. Miller (ed). The Baculoviruses, Plenum Press. 141–170.
- 58. Lu A, Miller LK (1997) Regulation of baculovirus late and very late gene expression. In: L.K. Miller (ed). The Baculoviruses, Plenum Press. 193–216.
- 59. Gomi S, Zhou CE, Yih W, Majima K, Maeda S (1997) Deletion analysis of four of eighteen late gene expression factor gene homologues of the baculovirus, BmNPV. Virology 230: 35–47.
- 60. Morris TD, Todd JW, Fisher B, Miller LK (1994) Identification of lef-7: a baculovirus gene affecting late gene expression. Virology 200: 360–369.
- 61. Pearson MN, Rohrmann GF (1998) Characterization of a baculovirus encoded ATP-dependent DNA ligase. J Virol 72: 9142–9149.
- 62. Kuzio J, Pearson MN, Harwood SH, Funk CJ, Evans JT, et al. (1999) Sequence and analysis of the genome of a baculovirus pathogenic for Lymantria dispar. Virology 253: 17–34.
- 63. Ayres MD, Howard SC, Kuzio J, Lopez-Ferber M, Possee RD (1994) The complete DNA sequence of Autographa californica nuclear polyhedrosis virus. Virology 202: 586–605.
- 64. Gomi S, Majima K, Maeda S (1999) Sequence analysis of the genome of Bombyx mori nucleopolyhedrovirus. J Gen Virol 80: 1323–1337.
- 65. Lu A, Miller LK (1997) Regulation of baculovirus late and very late gene expression. In: L.K. Miller (ed.) The Baculoviruses, Plenum Press. 193–216.
- 66. Oh S, Kim DH, Patnaik BB, Jo YH, Noh MY, et al. (2013) Molecular and immunohistochemical characterization of the chitinase gene from Pieris rapae granulovirus. Arch Virol. doi:10.1007/s00705-013-1649-z.
- 67. Hawtin RE, Zarkowska T, Arnold K, Thomas CK, Gooday GW, et al. (1997) Liquefaction of Autographa californica nucleopolyhedrovirus-infected insects is dependent on the integrity of virus-encoded chitinase and cathepsin genes. Virology 238: 243–253.
- 68. Tomalski MD, Eldridge R, Miller LK (1991) A baculovirus homolog of a Cu/Zn superoxide dismutase gene. Virology 184: 149–161.
- 69. Oh S, Kim DH, Patnaik BB, Jo YH, Noh MY, et al. (2013) Molecular and immunohistochemical characterization of granulin gene encoded in Pieris rapae granulovirus genome. J Invertebr Pathol 113: 7–17.
- 70. Wang RR, Deng F, Hou DH, Zhao Y, Guo L, et al. (2010) Proteomics of the Autographa californica nucleopolyhedrovirus budded virions. J Virol 84: 7233–7242.
- 71. Xiang XW, Chen L, Guo AQ, Yu SF, Yang R, et al. (2011) The Bombyx mori nucleopolyhedrovirus (BmNPV) ODV-E56 envelope protein is also a per os infectivity factor. Virus Res 155: 69–75.
- 72. Song JJ, Wang RR, Deng F, Wang HL, Hu ZH (2008) Functional studies of per os infectivity factors of Helicoverpa armigera single nucleocapsid nucleopolyhedrovirus. J Gen Virol 89: 2331–2338.
- 73. Wang P, Granados RR (1998) Observations on the presence of the peritrophic membrane in larval Trichoplusia ni and its role in limiting baculovirus infection. J Invertebr Pathol 72: 57–62.
- 74. Jiang H, Liu S, Zhao P, Pope C (2009) Recombinant expression and biochemical characterization of the catalytic domain of acetylcholinesterase-1 from the African malaria mosquito, Anopheles gambiae. Insect Biochem Mol Biol 39: 646–653.
- 75. Dall D, Luque T, O’Reilly D (2001) Insect-virus relationships: sifting by informatics. Bioessays 23: 184–193.
- 76. Birnbaum MJ, Clem RJ, Miller LK (1994) An apoptosis inhibiting gene from a nuclear polyhedrosis virus encoding a polypeptide with Cys/His sequence motifs. J Virol 68: 2521–2528.
- 77. Vucic D, Kaiser WJ, Harvey AJ, Miller LK (1997) Inhibition of Reaper-induced apoptosis by interaction with inhibitor of apoptosis proteins (IAPs). Proc Natl Acad Sci USA 94: 10183–10188.
- 78. Crook NE, Clem RJ, Miller LK (1993) An apoptosis-inhibiting baculovirus gene with a zinc-finger motif. J Virol 67: 2168–2174.