Characterization of joining sites of a viral histone H4 on host insect chromosomes

A viral histone H4 (CpBV-H4) is encoded in a polydnavirus, Cotesia plutellae bracovirus (CpBV). It plays a crucial role in parasitism of an endoparasitoid wasp, C. plutellae, against diamondback moth, Plutella xylostella, by altering host gene expression in an epigenetic mode by its N-terminal tail after joining host nucleosomes. Comparative transcriptomic analysis between parasitized and nonparasitized P. xylostella by RNA-Seq indicated that 1,858 genes were altered at more than two folds in expression levels at late parasitic stage, including 877 up-regulated genes and 981 down-regulated genes. Among parasitic factors altering host gene expression, CpBV-H4 alone explained 16.3% of these expressional changes. To characterize the joining sites of CpBV-H4 on host chromosomes, ChIP-Seq (chromatin immunoprecipitation followed by deep sequencing) was applied to chromatins extracted from parasitized larvae. It identified specific 538 ChIP targets. Joining sites were rich (60.2%) in AT sequence. Almost 40% of ChIP targets included short nucleotide repeat sequences presumably recognizable by transcriptional factors and chromatin remodeling factors. To further validate these CpBV-H4 targets, CpBV-H4 was transiently expressed in nonparasitized host at late larval stage and subjected to ChIP-Seq. Two kinds of ChIP-Seqs shared 51 core joining sites. Common targets were close (within 1 kb) to genes regulated at expression levels by CpBV-H4. However, other host genes not close to CpBV-H4 joining sites were also regulated by CpBV-H4. These results indicate that CpBV-H4 joins specific chromatin regions of P. xylostella and controls about one sixth of the total host genes that were regulated by C. plutellae parasitism in an epigenetic mode.


Introduction
Polydnaviruses (PDVs) are a group of insect viruses symbiotic to some endoparasitoid wasps belonging to families Braconidae and Ichneumonidae. Depending on host wasp families, PDVs are divided into two genera: Ichnovirus (IV) and Bracovirus (BV) [1]. PDV genome consists of genome encapsidated into viral particles during replication and nonencapsidated genome [2]. Its nonencapsidated genome appears to be essential to the replication of encapsidated genome. Some gene products of the nonencapsidated genome are likely viral coat a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 levels: nonparasitized larvae ('NP5'), parasitized larvae ('P7'), nonparasitized larvae transiently expressing CpBV-H4 ('vH4'), and truncated CpBV-H4 ('vH4T'). All four transcriptomes had reads of more than 60 Gb in size. More than 50% of these sequences were mapped to P. xylostella genome sequence with P7 having lower mapping rate (S1 Fig). Relatively low mapping ratio of P7 might be due to contaminants of immature C. plutellae wasp and teratocyte transcripts because all reads were mapped only to P. xylostella genome. Using these four transcriptomes, differentially expressed gene (DEG) analyses were performed between NP5 and P7 to obtain specific host genes regulated by parasitism or between vH4 and vH4T to search for specific host genes regulated only by CpBV-H4 expression (Fig 1). Expression pattern analysis showed that the number of major clades in the comparison between P7 and NP5 was higher than that between vH4 and vH4T ( Fig 1A). DEG analysis between P7 and NP5 showed that 1,858 host genes exhibited more than 2-fold of changes in expression level after parasitism, including 877 up-regulated genes and 981 down-regulated genes ( Fig 1B). It also showed that 1,190 host genes exhibited more than 2-fold of changes in expression level after CpBV-H4 expression by comparing the transcript levels between vH4 and vH4T treatments, including 431 up-regulated host genes and 759 down-regulated host genes.
To determine the influence of CpBV-H4 on the regulation of host genes during C. plutellae parasitism, parasitism-specific DEG ('P7/NP5', 1,858 genes) and CpBV-H4 DEG ('vH4/vH4T', 1,190 genes) were compared (Fig 2). This comparison resulted in 302 genes overlapped between the two DEGs, including 81 up-regulated genes (Fig 2A and Table 1) and 221 downregulated genes ( Fig 2B and Table 2). Almost half of these genes up-regulated by CpBV-H4 had no functional annotation whereas almost 70% genes down-regulated by CpBV-H4 were predicted to have functions in development and metabolism. These DEG analyses indicated that CpBV-H4 expression contributed 16.3% (302/1,858 x 100) to C. plutellae parasitism with respect to host gene regulation. In addition, CpBV-H4 appeared to control specific target genes because host genes regulated by CpBV-H4 were significantly (X 2 = 1,804.12; df = 10; P < 0.0001) different in functional categories compared to those regulated by C. plutellae parasitism ( Fig 2C).

Sequence specificity of CpBV-H4-joining sites on host chromosomes
To determine the specificity of CpBV-H4-joining sites on host chromosomes, ChIP-Seq analysis was performed using a polyclonal antibody raised against CpBV-H4. ChIP isolated 1,498 targets against chromatins originated from larvae at late parasitism ('P7'). After deleting nonspecific sites detected from ChIP-Seq analysis using chromatins originated from NP5 larvae, P7-specific ChIP targets had 538 sites (Fig 3A). Similar ChIP-Seq was performed against chromatins extracted from larvae transiently expressing CpBV-H4 and 394 targets were obtained after deleting nonspecific sites detected from ChIP-Seq analysis using larvae expressing truncated CpBV-H4. There were 51 common core target genes ('vH4-ChIP1~vH4-ChIP51') between P7-specific ChIP targets and CpBV-H4-specific ChIP targets (Table 3). Genes close (within 1 kb) to these 51 core target sites were predicted to have functions in development, metabolism, immunity, signaling, and gene expression (Fig 3B).
Sequences of P7-specific ChIP targets (538 sites) were further analyzed in order to find any consensus sequences (Fig 4). For this analysis, 480 target sequences were used while the other 58 target sequences did not have unique match with P. xylostella genome. These 480 targets were rich (60.2%) in AT content. Almost 40% targets contained two nucleotide repeat sequences such as GT-(14.9%), AC-(12.4%), CT-(6.3%), and AG-(4.6%) repeats ( Fig 4A and  S2 Fig). Three nucleotide repeat sequences (CAT and TGA), four nucleotide repeat sequences (TACA, TCAC, TGAG, TCTG, GTCT, and TAGA), and five nucleotide repeat sequences (TTCTG, CAATA, and ATTCT) were also detected (S2 Fig). GT and AC repeat-joining sites were close to genes that were up-or down-regulated in their expression during C. plutellae parasitism (Fig 4B). In contrast, CT-and AG-repeat joining sites were associated with up-regulated genes. These repeats were located in 45.8% sites at the upstream ('UP'), 20.8% sites at the downstream ('DOWN'), and 33.3% sites at the gene body ('GB') ( Table 4). Among these repeats, more than 50% CT-and AG-repeats were located at GB. Most repeat motifs were likely to be associated with DNA elements recognized by transcriptional factors or chromatin remodeling factors (S2 Table). Among the 51 core targets of CpBV-H4, more than 60% possessed GT-or AC-repeats.
Physical mapping of 51 core joining sites of CpBV-H4 on host chromosomes The 51 core joining sites of CpBV-H4 were mapped onto the chromosomes of P. xylostella ( Fig 5). P. xylostella chromosomes have been characterized with 30 linkage groups and uncharacterized scaffolds [18]. Fifteen CpBV-H4 joining sites were located on 11 chromosomes while the remaining 36 CpBV-H4 sites were found on uncharacterized chromosome(s). On this mapping, 302 host DEGs regulated by CpBV-H4 during C. plutellae parasitism were applied (see vertical lines in Fig 5). These target genes were not only distributed on chromosomes containing CpBV-H4 targets, but also distributed on other chromosomes without CpBV-H4 targets.

Influence of relative locality of CpBV-H4 targets on expression of nearby genes
We then tested whether there was any positional effect of CpBV-H4 targets on regulating gene expression of nearby genes. To address this question, 15 CpBV-H4 targets ('vH4-ChIP1ṽ H4-ChIP15') were further divided into three relative localities in each nearby gene on coding DNA sequence or gene body ('GB'), upstream intergenic region ('UP'), and downstream intergenic region ('DOWN'). Subsequently, when relative localities and expressional changes of nearby genes induced by CpBV-H4 expression were compared, individual ChIPs were not evenly distributed on all tested genes. They were concentrated around target gene sites (red color in (Fig 6) except vH4-ChIP2 and vH4-ChIP3. This might be due to the fact that target genes were determined by ChIP frequency in a specific spot, not by total ChIP frequency in each locality. Host genes around CpBV-H4 targets exhibited expressional changes (either upor down-regulated). Relative distribution of CpBV-H4 target sites was further analyzed to determine any positional effect on regulating the expression level of nearby genes (Fig 7). In this analysis, 75 neighboring genes including 15 CpBV-H4-joining sites were assessed. In these CpBV-H4 joining sites, ChIP targets were relatively evenly distributed on three positions ( Fig 7A). The less number of major ChIPs by about two folds of change at GB might be due to difference in domain size because the average GB domain length (% 1.5 kb) was shorter by almost two folds than UP (% 2.5 kb) and DOWN (% 3.0 kb) domain lengths. Each gene was classified by the presence of the main ChIP peak among the three relative localities (S3 Table  and Fig 7B). Although different genes had different locations of main ChIP peaks, the three ChIP positions did not have significant effect on the direction of gene regulation (X 2 = 0.3995; df = 2; P = 0.8189). transcripts in their FPKM levels between NP5 and P7 or between vH4 and vH4T. Red spots indicate transcripts exhibiting more than two-fold change ('FC') in expression level. 'UP' and 'DOWN' represent up-regulation and down-regulation of gene expression, respectively, in the two DEG groups.

Discussion
Parasitism by endoparasitoid wasps using PDV alters host gene expression to allow internal environment to be favorable for parasite development [19]. Two different endoparasitoid wasps, Diadegma semiclausum and C. plutellae possessing IV and BV, respectively, can parasitize P. xylostella and significantly alter host gene expression [20,21]. Several viral factors have been proposed to be able to regulate host gene expression in CpBV [10]. CpBV-IkB is a viral IkB possessing ankyrin repeat. However, it lacks regulatory domain and suppresses antimicrobial peptide gene expression in response to immune challenge [22]. Two homologous proteins, CpBV15α and CpBV15β, can interrupt eukaryotic translation initiation factors and act as translational inhibitory factors against mRNAs of P. xylostella [16,23]. Recently, a cys-motif protein of CpBV has been reported to be able to suppress the translation of host mRNAs [24]. CpBV-H4 is known to be able to regulate host gene expression in transcriptional level by an epigenetic mode [14,16]. Thus, parasitism of C. plutellae can regulate host gene expression at both transcription level and translational level.
This study determined how many host genes of P. xylostella were regulated at transcriptional level by C. plutellae parasitism using RNA-Seq. In addition, this study determined the contribution of CpBV-H4 to host genes regulated by C. plutellae parasitism. DEG analysis of transcriptomes between parasitized and nonparasitized larvae at late stage determined host genes regulated by parasitism, in which 877 genes were up-regulated and 981 genes were down-regulated. These regulated host genes were predicted to have various functional categories including metabolism, development, immunity, signaling, and gene expression regulator. Venom proteins, ovarian proteins, teratocyte, and CpBV are known to be parasitic factors of C. plutellae [25,26]. Venom and ovarian proteins produced from female wasp are delivered to parasitized host along with eggs to suppress host immunity or help PDV enter target cells [27,28]. It has been suggested that these two factors play crucial roles during early parasitic stage [29]. However, the current study analyzed host gene expression at late parasitic stage. Thus, the altered host gene expression might be induced mainly by teratocyte and CpBV. It has been reported that P. xylostella larvae injected with teratocytes exhibit significant difference in expression pattern compared to untreated control larvae [26]. P. xylostella larvae injected with CpBV also can suppress host gene expression [30]. Especially, an SSH analysis has shown that CpBV-H4 alone can inhibit the expression levels of at least 115 host genes at transcription level [14]. Thus, the 1,858 host genes regulated by C. plutellae parasitism found in this study might have been influenced by both teratocyte and CpBV. Furthermore, this current study was focused on the single influence of CpBV-H4 on host gene regulation and how much it contributed to parasitism with respect to host gene regulation. Using the in vivo transient expression technique used to determine physiological functions of PDV genes in a previous study [31], CpBV-H4 was transiently expressed in nonparasitized larvae in this study. A truncated CpBV-H4 prepared by removing 38 amino acid residues at the N-terminal tail used in a previous study [11] was also expressed in the same aged larvae as a control. DEG analysis of these two transient expression groups resulted in 1,190 host genes specifically regulated in their expressions by CpBV-H4. Finally, the comparison of 1,858 genes regulated by parasitism and 1,190 genes regulated by CpBV-H4 showed that 302 host genes were overlapped in the two groups. Thus, 16.8% of host genes regulated by C. plutellae parasitism at late parasitic stage nonparasitized status. (C) GO analysis of total host genes regulated by parasitism (left panel) and specific host genes regulated by CpBV-H4 (right panel). Host genes were identified from P. xylostella genome database (http:// iae.fafu.edu.cn/DBM/index.php) and annotated with GO program (https://en.wikipedia.org/wiki/Gene_ontology). were induced by CpBV-H4 expression while the other 83.2% genes might have been regulated by other CpBV factors and teratocyte. Our current analysis supported the results of previous SSH analysis [14] because most (95%) down-regulated genes (115 genes) determined by SSH analysis were included in the down-regulated genes found in the current RNA-Seq analysis. In contrast, the remaining 888 host genes influenced by CpBV-H4 expression but not included in genes regulated by C. plutellae parasitism might be induced by unnatural effect of CpBV-H4 on host regulation such as excess amount (25 ng CpBV-H4 per larva) of gene dose compared to natural parasitism (< 1 ng per larva [32] or the absence of interacting effect with other parasitic factors. CpBV-H4 was mainly localized in 51 joining sites based on ChIP-Seq analysis. These joining sites were determined by overlapping sites based on the nearest genes between joining sites determined from parasitized larvae and larvae transiently expressing CpBV-H4. In-depth analysis of these joining sites showed that individual ChIP sites were unevenly distributed around target genes. However, this biased distribution did not influence the direction (up or down) of    gene regulation. Thus, target genes can be up-or down-regulated irrespective to biased distribution of CpBV-H4 on upstream, gene body, or downstream locations. A previous study [17] pointed out that CpBV-H4s are localized on promoter regions of inducible target genes which might be highly involved in chromatin remodeling. However, the current ChIP-Seq analysis showed that CpBV-H4 could be integrated into gene body or downstream as well as upstream possessing promoter regions. Physical mapping of joining sites of CpBV-H4 showed that CpBV-H4s were localized mainly into a small number of chromosomes. In contrast, host genes regulated by CpBV-H4 were not close to CpBV-H4 joining sites. Some target genes controlled by CpBV-H4 were localized on chromosomes in which CpBV-H4 was not integrated. This suggests that genes regulated by CpBV-H4 might be able to regulate subsequent target genes by acting as transcriptional regulators. Indeed, target genes regulated by CpBV-H4 included transcription regulators involved in gene expression. For example, the expression levels of Px-KDM and Px-  SWI/SNF were suppressed by CpBV-H4 in both SSH and RNA-Seq analyses. KDM and SWI/ SNF are known to be able to regulate other gene transcription in an epigenetic mode [33,34]. Furthermore, these two genes are required for immunity and development of P. xylostella [14]. Thus, CpBV-H4 can influence host gene expression by directly binding to chromatin around target genes or by indirectly influencing host gene expression by regulating transcriptional regulators. This study found that 302 host genes were regulated by CpBV-H4 with joining sites near 51 target genes. Furthermore, this study explained a significant contribution (16.8%) of CpBV-H4 on host gene regulation induced by C. plutellae parasitism because teratocytes and other 156 CpBV genes might play crucial roles in altering host gene expression. Similar viral histone H4s have been found in other Cotesia-associated BVs [35]. Thus, viral histone H4 might be a main parasitic factor in Cotesia-associated parasitism.

Materials and methods
Insects and parasitization P. xylostella larvae were reared with cabbage leaves at 25 ± 1˚C with photoperiod of 16:8 h (L: D). Adults were fed 10% sucrose. Late first instar larvae were parasitized by C. plutellae at 1:2 (wasp: host) density for 24 h under the rearing condition. Parasitized larvae were fed cabbage leaves. Parasitized larvae lived for 8 days (P1-P8). They died after the emergence of fully matured wasp larvae which shortly formed cocoons for pupal development. In contrast, Nonparasitized larvae at the age corresponding to P larvae at parasitization lived 6 days (NP1-NP6) at 25˚C and pupated. Thus, P1 was an equivalent age to NP1 while P7 was an equivalent age to P7. Wasp cocoons were collected and kept at 25˚C for adult tissue development. After emergence, adult wasps were allowed to mate for 24 h. They were then used for parasitization.   Transient expression of CpBV-H4 in P. xylostella larvae A gene encoding CpBV-H4 containing N-terminal tail (vH4) and truncated CpBV-H4 (vH4T) after deleting N-terminal tail was cloned into a pIB vector [17]. The recombinant plasmid (250 ng/μL) was mixed in an equal volume of Metafectene PRO transfection reagent (Biontex, Planegg, Germany) and incubated for 20 min at room temperature to allow the formation of DNA-lipid complex. A total of 100 nL of this DNA-lipid complex was injected into the hemocoel of NP3 larvae at a rate 10 nL/sec using microsyringe pump controller ( Table. https://doi.org/10.1371/journal.pone.0177066.g006 Viral histone H4  Table. determined using forward primer 5 0 -GGATCCATGGGAAGAGGATTGGGCAA-3 0 and reverse primer 5 0 -GAATTCTCAACCTCCATAACCATA GATC-3 0 . Total RNA of larvae expressing vH4 or vH4T was extracted using Trizol reagent described below for subsequent RNA-Seq analysis.

RNA extraction for RNA-Seq analysis
Total RNA was isolated from four groups of P. xylostella larvae (P7, NP5, vH4-expressing larvae, and vH4T-expressing larvae) using Trizol reagent (Invitrogen, Carlsbad, CA, USA). Extracted RNA was resuspended in 40 μL of diethyl pyrocarbonate-treated water. RNA integrity for subsequent RNA-Seq was analyzed using Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA  Table). These mapped reads were then assembled with Cufflinks (version 2.2.1) (http://cole-trapnell-lab.github.io/cufflinks/). The assembled contigs were used to calculate FPKM (fragment per kilobase of transcript per million mapped reads). They were then annotated with DBM database.

Differentially expressed gene (DEG) transcript analysis
For DEG analysis of four treatment groups described in section 2.3, a total of 12,945 transcripts were used by deleting 5,128 transcripts from a total of 18,073 transcripts because at least one of these samples contained FPKM values of 0. DEG used a criterion of at least two-fold change in FPKM value.

Chromatin immunoprecipitation and deep sequencing (ChIP-Seq)
ChIP-Seq was performed using the above-mentioned four treatment groups of P. xylostella. A polyclonal antibody [17] for CpBV-H4 was used for ChIP. Briefly, 50 larvae for each group were homogenized in 1% sodium dodecyl sulphate followed by addition of formaldehyde (final concentration of 1%) to fix DNA with protein and then sonicated for 30 cycles for 1 min with 2 min interval. After sonication, DNA fragments in the range of 200-300 bp in 200 μL were used for ChIP-Seq according to QuickChIP kit manual (Cat. No. NBP2-29902, Novus Biologicals, Littleton, CO, USA). Briefly, 200 μL of sheared DNA was added to 800 μL of ChIP dilution buffer containing 75 μL of salmon sperm DNA (Sigma, St. Louis, MO, USA). CpBV-H4 antibody (1 μg) was added along with Protein A/G agarose (Thermo Scientific, Waltham, MA, USA). The complex of captured chromatin was then dissociated into protein and DNA using Tris-EDTA buffer (10 mM Tris, 1 mM EDTA, pH 8.0). The precipitated DNA was then resuspended in the same Tris-EDTA buffer and subjected to sequencing using Illumina HiSeq2000 platform with paired-end sequencing mode.

Screening of major incorporation sites of CpBV-H4 on host chromatin from ChIP-Seq
Four ChIP-Seq data were used to determine CpBV-H4 joining sites. All ChIP sites were validated by significance value to discriminate specific and nonspecific bindings. To compare P7 and NP5 ChIP-Seq sites, significance at E value of 10 −8 was used. Remaining validated ChIP sites were annotated by their nearest genes. Parasitism-specific ChIP sites were determined by manually deleting overlapping genes between two ChIP sites derived from P7 and NP5 larvae. To compare vH4 and vH4T ChIP-Seq sites, significance at E value of 10 −13 was used due to relatively higher mapping ratios compared to that in the comparison between P7 and NP5 ChIPs. Remaining validated ChIP sites were annotated by their nearest genes. CpBV-H4-specific ChIP sites were determined by manually deleting overlapping genes between vH4 and vH4T ChIP sites. Major incorporation sites of CpBV-H4 were then determined by overlapping sites between parasitism-specific ChIP sites and CpBV-H4-speicific ChIP sites.

In-depth localization analysis of major incorporation sites of CpBV-H4
Among 51 ChIP-Seq targets with respect to CpBV-H4, 15 targets were located on 11 characterized LGs. In-depth analyses of these 15 CpBV-H4 ChIP-Seq targets were done by choosing three ORF nearest to both sides of main ChIP-Seq targets. The total numbers of ChIPs were then counted on gene body ('GB', a region containing open reading frame), 5' region ('upstream') from GB, and 3' region ('downstream') from GB, in which downstream and upstream regions were intergenic regions. The fold changes in FPKM for all 105 ORFs were obtained from RNA-Seq DEG data between P7 vs NP5.
Screening of parasitism-specific and CpBV-H4-specific genes Four treatment group samples were used for the selection of parasitism-specific (P7 vs NP5) and CpBV-H4-specific (vH4 vs vH4T) genes. The total number of genes expressed in both groups was separated based on up-regulation or down-regulation. In each regulation group, genes exhibiting more than two-fold change were chosen. These genes were then compared to the overlapped genes between P7 and NP5 to select parasitism-specific genes or compared to overlapped genes between vH4 and vH4T to select CpBV-H4-specific genes.