Molecular Characterization and Differential Expression of an Olfactory Receptor Gene Family in the White-Backed Planthopper Sogatella furcifera Based on Transcriptome Analysis

The white-backed planthopper, Sogatella furcifera, a notorious rice pest in Asia, employs host plant volatiles as cues for host location. In insects, odor detection is mediated by two types of olfactory receptors: odorant receptors (ORs) and ionotropic receptors (IRs). In this study, we identified 63 SfurORs and 14 SfurIRs in S. furcifera based on sequences obtained from the head transcriptome and bioinformatics analysis. The motif-pattern of 130 hemiptera ORs indicated an apparent differentiation in this order. Phylogenetic trees of the ORs and IRs were constructed using neighbor-joining estimates. Most of the ORs had orthologous genes, but a specific OR clade was identified in S. furcifera, which suggests that these ORs may have specific olfactory functions in this species. Our results provide a basis for further investigations of how S. furcifera coordinates its olfactory receptor genes with its plant hosts, thereby providing a foundation for novel pest management approaches based on these genes.


Introduction
Insects can exploit chemical signals in the environment using their accurate olfactory systems, thereby mediating many important physiological behaviors, such as mate-finding, host location, and sending alarms to conspecifics. The antennae are the major olfactory organs of insects, and they possess various types of sensilla, where peripheral olfactory signal transduction events occur. At the molecular level, three main types of proteins are generally considered to be involved in odorant molecule transduction in the sensillum. First, odorants may diffuse into the sensillar lymph via pores, where odorant-binding proteins (OBPs) recognize and bind them. Second, OBPs act as transporters to transfer odorants across the sensillar lymph to reach than OBPs, and a single OR is sufficient to change insect behavior, whereas a specific OBP is not needed to invoke behavioral change [37]. Thus, in the present study, we sequenced and analyzed the head transcriptome of S. furcifera adults using next generation sequencing, where we identified 63 OR and 14 IR transcripts in this pest insect species. We also conducted transcriptome sequencing and gene ontology (GO) annotation, as well as scanned sequences for motif-patterns and examined phylogenetic relationships.

Results
Transcriptome sequencing and sequence assembly
The S. furcifera head transcriptome was sequenced using the Illumina HiSeq™ 2000 platform and assembled with Trinity (v2012-10-05) (Table 1 and Fig 1). In total, about 163 million reads were obtained. After filtering, 142 million clean reads were generated, which comprised 14.2 gigabases (Gb), with a longest length of 28,290 nt and a median length of 456 nt. These reads were assembled into 89,810 transcripts and 43,712 unigenes, with N50 lengths of 3,014 and 2,217 nt, respectively (Table 1). In addition, the unigenes with a sequence length >1000 nt accounted for 29.63% of the transcriptome assembly (Fig 1). The transcriptome raw reads have been deposited in the NCBI SRA database (accession number: SRR2068690).
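The N50 values reported above summarize assembly contiguity. As a minimal sketch of how such a statistic is commonly computed (this is not the authors' pipeline, and the toy lengths below are hypothetical rather than taken from the S. furcifera assembly), N50 can be defined as the length L such that sequences of length L or longer together cover at least half of the assembled bases:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy lengths (nt), used only to illustrate the calculation.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

Sorting from longest to shortest and accumulating until half of the total is reached keeps the calculation independent of the order in which the assembler reports sequences.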
GO annotations for all the unigenes were obtained using the Blast2GO pipeline according to the BLASTx search against NR. The GO annotations were used to classify the transcripts into functional groups according to specific GO categories. Among the 43,712 unigenes, 13,265 (30.3%) could be assigned to various GO terms. In the molecular function category, the genes expressed in the head were mostly enriched for binding (e.g., nucleotide, ion, and odorant binding) and catalytic activity (e.g., hydrolase and oxidoreductase). In the biological process category, the most common terms were cellular and metabolic processes. In the cellular component category, the most abundant terms were cell and organelle (Fig 2B).

Identification of OR/IR genes
The unigenes related to candidate olfactory receptors (ORs/IRs) were identified based on keyword searches of the BLASTx annotations. The predicted unigene protein sequences were also analyzed using PSI-BLASTp with known aphid olfactory receptors [32,38]. In total, we identified 77 unigenes that belonged to the olfactory receptor family in the head transcriptome of S. furcifera, including 63 ORs and 14 IRs, all of which shared similarity with other insect OR and IR genes. Among these, 27 OR and 3 IR genes encoded putative complete open reading frames (ORFs). Further information for the OR and IR genes, including the unigene references, lengths, and best BLASTx hits, is listed in Tables 2 and 3. To validate the reliability of the transcriptome assembly, we randomly chose 32 full-length ORs for RT-PCR validation. To cover as much of each sequence as possible, the primers were designed to span the ORF; the primer sequences are listed in S1 Table. All 32 ORs were successfully amplified by RT-PCR (S1 Fig), and the PCR results were confirmed by sequencing. All of the OR and IR sequences in this study are listed in S1 File.

Motif-pattern and phylogenetic tree analysis
Conserved motifs are important elements of functional domains. We used the MEME server to identify conserved motifs in 130 hemipteran ORs. The parameters used in this and all other motif predictions in this study were: minimum width = 6, maximum width = 10, and maximum number of motifs to find = 8. As a result, eight motifs were found for hemipteran ORs (Fig 3); their most common forms were: Motif-1, ALYSCNWTDM; Motif-2, LLTMQMNNAN; Motif-3, PTKIVNLEMF; Motif-4, QLFMYCYIFD; Motif-5, DLKSIIKDHQ; Motif-6, GHYQIIDPET; Motif-7, TYNAYYIFY; and Motif-8, CYTVVSVLLN. Most motif amino acid residues were located in the intramembrane domain, not in the transmembrane domain. Motifs 1, 4, and 5 were the top three motifs present in these ORs, with proportions of 44.6%, 32.3%, and 33.1%, respectively. We also carried out a motif-pattern analysis of hemipteran ORs. The patterns differed considerably between species, with the exception of the ORco sequences, which exhibited the same "4-1" motif-pattern. The "6-7-5-4-1-2-3-8" pattern was the most common motif-pattern in aphids, with 25 ORs in A. pisum and 10 ORs in A. gossypii exhibiting it. The most prevalent motif-pattern in S. furcifera was "5-1", which was found in 8 SfurORs.
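The text does not state how motif-patterns such as "5-1" were derived from the MEME output. One plausible reading, offered here only as an assumption, is that a pattern lists the motifs found in a receptor in order from the N-terminus to the C-terminus. Under that assumption, a minimal sketch of the tally could look as follows; the sequence names and motif positions are hypothetical.

```python
from collections import Counter


def motif_pattern(hits):
    """Build a pattern string such as "5-1" for one sequence.

    hits: list of (motif_id, start_position) tuples for that sequence.
    Motifs are ordered by start position (assumed N- to C-terminal order).
    """
    ordered = sorted(hits, key=lambda hit: hit[1])
    return "-".join(str(motif_id) for motif_id, _ in ordered)


def pattern_counts(hits_by_sequence):
    """Count how many sequences share each motif-pattern."""
    return Counter(motif_pattern(hits) for hits in hits_by_sequence.values())


# Hypothetical motif hits; names and positions are for illustration only.
toy_hits = {
    "OR_a": [(1, 310), (5, 120)],
    "OR_b": [(5, 98), (1, 287)],
    "OR_c": [(1, 400), (4, 350)],
}
counts = pattern_counts(toy_hits)
print(counts.most_common())
```

Counting patterns per species in this way would give the kind of summary reported above, for example the number of receptors sharing a given pattern.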

Discussion
In this study, we determined the repertoire of olfactory receptor superfamilies (ORs and IRs) in S. furcifera because of their potential significance as target genes for developing new pest control strategies, as well as for elucidating the molecular mechanisms that underlie insect-host plant interactions. In total, 14.2 Gb of S. furcifera head transcriptome data were sequenced, which is more than was processed in most other studies [31,32,39-41]. After extensive sequencing and assembly using Trinity RNA-Seq, we identified 63 ORs and 14 IRs in S. furcifera. The number of ORs lies between those of two hemipteran aphids with sequenced genomes, A. gossypii (45 ORs) [32] and A. pisum (73 ORs) [38,42], and it is similar to the 62 ORs found in D. melanogaster [43] and the 79 ORs in A. gambiae [11,12], but much lower than those in T. castaneum (259 ORs) [14] and A. mellifera (170 ORs) [13]. The number of IRs was similar to the 14 IRs found in A. gossypii [32], 18 in D. melanogaster [44], and 22 in A. gambiae [29], but slightly higher than those in T. castaneum (10 IRs) and A. mellifera (nine IRs) (data for both species were obtained from GenBank). These findings suggest that the adaptation of distinct species to their plant hosts has led to the diversification of ORs and IRs during their evolution. We conducted a MEME motif analysis using multiple hemipteran ORs to investigate differences among various species. Unlike insect OBPs [45], hemipteran ORs differ considerably among species, likely because ORs are more specific for odorant substrates than OBPs. In support of this, a single silkmoth pheromone receptor was activated by its ligand to trigger sexual behaviors without the need for a specific OBP [37]. Furthermore, the hosts of the various hemipteran suborders are quite different; for example, R. prolixus feeds on blood, whereas S. furcifera is an oligophagous pest that feeds on only a few plants, such as rice and maize.
Thus, we propose that they locate different hosts via volatiles based on their specific ORs. The exception, ORco, is more highly conserved than other ORs, which reflects its functional role in interacting with specific ORs to form the ligand-gated ion channel [17,18]. Among the SfurORs, the SfurORco gene had the highest mRNA abundance, which is similar to AgoORco in A. gossypii [32]. In insects, the ORco gene encodes a co-receptor that forms a functional heteromer with specific ORs [17,18]. In addition to SfurORco, the SfurOR1 gene had higher expression levels than the other SfurORs, thereby suggesting that it may bind key plant host volatiles in S. furcifera, although further functional research is required to confirm this suggestion. The phylogenetic analysis of hemipteran ORs suggested that the SfurORs have undergone functional differentiation, given their scattered distribution. One specific SfurOR sub-clade, which included SfurOR16, 23, 33, 35, 37, and 55, had no counterparts in other species in this analysis, thereby suggesting that these six ORs may be activated by the specific host plant volatiles of S. furcifera.
To further distinguish putative IRs from iGluRs, the SfurIRs were aligned with IR orthologs from other insect species and some DmeliGluRs for BLASTx and phylogenetic analysis. We demonstrated that there were obvious differences in the distributions of DmeliGluRs and insect IRs. Like the ORco gene, the IR8a and IR25a genes are thought to act as co-receptors because of their co-expression with other IRs [46]. Our expression profiles were consistent with this hypothesis because IR3 (IR8), IR9 (IR40a.1), and IR7 (IR25a) were the top three genes among the 14 SfurIRs. This result also agrees with the higher expression levels of AgoIR8a and AgoIR25a in A. gossypii [32].
In conclusion, based on analyses of head transcriptomic data, we identified 63 ORs and 14 IRs in the insect species S. furcifera. Our method was successful in identifying chemosensory receptor genes with low expression levels and our results provide a valuable resource for investigating and elucidating the mechanisms of olfaction in S. furcifera. As a crucial first step toward understanding their functions, we also conducted a comprehensive examination of the expression patterns of these olfactory receptor genes, which demonstrated that most of these OR and IR genes were expressed in chemosensory organs. Our findings provide the foundation for future research into the olfactory system of S. furcifera and for further investigations of classic behaviors, such as migration, as well as large numbers of potential target genes for controlling this pest.

Materials and Methods
Insect rearing and tissue collection
S. furcifera were collected from rice fields with the permission of the agricultural bureau in Libo county (25°21' N; 107°49' E), Guizhou province, China. The field studies did not involve endangered or protected species and no specific permissions were required for these insects. Collected insects were reared in the laboratory on rice seedlings at 26 ± 1°C, with a 16 h light: 8 h dark cycle. We collected 1000 heads of 1- to 3-day-old long-winged adults (male/female = 1/1) for transcriptome sequencing. We dissected various tissues (approximately 300 antennae, 150 mouthparts, 150 heads, 500 legs, and 50 bodies for each replicate) from long-winged adults under a microscope and we collected three replicates for each tissue type. The tissue samples were stored in RNAlater reagent (Qiagen, Valencia, CA, USA) at 4°C until further use.

cDNA library construction and Illumina sequencing
Total RNA was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's protocol. The cDNA library construction and Illumina sequencing of the samples were performed by Novogene Bioinformatics Technology Co. Ltd, Beijing, China. The mRNA was purified from 10 μg of total RNA from S. furcifera heads using NEBNext oligo(dT)25 [...] were used to ligate library DNA at 37°C for 15 min. After end repair and ligation of the adaptors, the products were amplified by PCR and purified using a QIAquick PCR Purification Kit to create a cDNA library, which was sequenced using the HiSeq™ 2000 platform.

De novo assembly of short reads and gene annotation
After removing the adaptor sequences, low-quality reads, and reads where N ≥ 0.1%, the remaining reads were treated as clean reads. De novo transcriptome assembly was performed using the short reads assembly program Trinity (v2012-10-05) [47]. The overlap settings used for the assembly were 30 bp and 80% similarity, and all of the other parameters were set to their default values.
Unigenes >150 bp were aligned by BLASTx with protein databases, including Nr, Swiss-Prot, KEGG, and COG (e-value < 10⁻⁵), to identify proteins with high sequence similarity and assign putative functional annotations. Next, we used the Blast2GO program [48] to obtain GO annotations of the unigenes and we obtained the GO functional classifications using WEGO software [49].
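The e-value cutoff described above can be illustrated with a short filter over BLAST results. The sketch below assumes the hits are available in the standard 12-column BLAST tabular layout (query, subject, identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score); the text does not state which output format was used, and the toy rows and identifiers are hypothetical.

```python
import csv
import io


def best_hits(tabular_text, max_evalue=1e-5):
    """Return {query: (subject, evalue)} for the lowest e-value hit per query.

    Only hits with an e-value strictly below max_evalue are kept.
    Assumes tab-separated BLAST output with the e-value in column 11.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        query, subject = row[0], row[1]
        evalue = float(row[10])
        if evalue >= max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical hits, for illustration only.
toy_rows = [
    ["unigene_1", "protA", "45.0", "200", "110", "0", "1", "600", "1", "200", "2e-30", "120"],
    ["unigene_1", "protB", "30.0", "150", "105", "0", "1", "450", "1", "150", "1e-8", "60"],
    ["unigene_2", "protC", "25.0", "50", "37", "0", "1", "150", "1", "50", "0.5", "20"],
]
toy_text = "\n".join("\t".join(row) for row in toy_rows)
print(best_hits(toy_text))
```

In this toy input, unigene_2 has no hit below the cutoff and would therefore remain unannotated.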

Expression level analysis for the unigenes
The expression levels (abundances) of the unigenes were calculated with the FPKM method [50] using the formula: FPKM(A) = (1,000,000 × C × 1,000)/(N × L), where FPKM(A) is the expression level of gene A, C is the number of reads uniquely aligned to gene A, N is the total number of reads uniquely aligned to all genes, and L is the number of bases in gene A. The FPKM method can eliminate the influence of different gene lengths and sequencing discrepancies when calculating the abundance of expression.
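As a worked illustration of the formula above, the following sketch evaluates FPKM for a single gene. The function and argument names are illustrative, and the read counts are hypothetical rather than values from this study.

```python
def fpkm(reads_on_gene, total_mapped_reads, gene_length_bases):
    """Compute FPKM = (1,000,000 x C x 1,000) / (N x L).

    reads_on_gene      -- C, reads uniquely aligned to the gene
    total_mapped_reads -- N, reads uniquely aligned to all genes
    gene_length_bases  -- L, number of bases in the gene
    """
    if total_mapped_reads <= 0 or gene_length_bases <= 0:
        raise ValueError("N and L must be positive")
    return (1_000_000 * reads_on_gene * 1_000) / (
        total_mapped_reads * gene_length_bases
    )


# Hypothetical example: 500 reads on a 2,000-base gene, 10 million mapped reads.
print(fpkm(500, 10_000_000, 2_000))  # 25.0
```

Doubling the gene length or the sequencing depth while holding C fixed halves the value, which is how the measure corrects for both factors.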

RNA extraction and cDNA synthesis
Approximately 300 S. furcifera heads were dissected and used for RNA extraction. The collected tissues were fast-frozen in liquid nitrogen and kept at -70°C until further use. Total RNA was extracted using a MiniBEST Universal RNA Extraction Kit (TaKaRa, Liaoning, Dalian, China) following the manufacturer's instructions. The cDNA template was synthesized with Oligo(dT)18 as the anchor primer, using the PrimeScript™ I 1st Strand cDNA Synthesis Kit (TaKaRa, Liaoning, Dalian, China) at 42°C for 1 h. The reaction was terminated by heating at 70°C for 15 min.

PCR validation
Gene-specific primers spanning the ORFs of the selected OR genes were designed using Primer Premier 5.0 for RT-PCR validation. The sequences of these primers are listed in S1 Table. PCR experiments were carried out using a C-1000 thermal cycler (Bio-Rad, Waltham, MA, USA), and touchdown PCR reactions were performed under the following conditions: 94°C for 3 min; 20 cycles at 94°C for 50 s, 60°C for 30 s, and 72°C for 2 min, with a decrease in the annealing temperature of 0.5°C per cycle. This was followed by 15 cycles at 94°C for 50 s, 50°C for 30 s, and 72°C for 2 min, and a final incubation for 10 min at 72°C. The reactions were performed in 25 μl with 100 ng of single-stranded cDNA from S. furcifera heads, 2.0 mM MgCl2, 0.5 mM dNTP, 0.4 μM of each primer, and 1.25 U Taq polymerase or EX-Taq polymerase (TaKaRa, Liaoning, Dalian, China). PCR products were analyzed by electrophoresis on a 1.5% w/v agarose gel in TAE buffer (40 mmol/L Tris-acetate, 2 mmol/L Na2EDTA·H2O) and the resulting bands were visualized with ethidium bromide. DNA purification was performed using the TaKaRa MiniBEST Agarose Gel DNA Extraction Kit Ver.4.0 (TaKaRa, Liaoning, Dalian, China). Purified products were sub-cloned into a T/A plasmid using the pEASY-T3 vector system (Trans-Gen Biotech, Beijing, China) following the manufacturer's instructions. The plasmid DNA was transformed into competent Trans1-T1 cells, and positive clones were checked by PCR and then sequenced by Sangon Biotech (Shanghai, China).
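The touchdown program above can be written out as a per-cycle list of annealing temperatures. The sketch below assumes that the first cycle anneals at 60°C and that the 0.5°C decrement is applied after each cycle, so the twentieth touchdown cycle anneals at 50.5°C before the 15 fixed cycles at 50°C. The text does not say exactly when the decrement takes effect, so this is one reading, not the instrument's actual program.

```python
def touchdown_annealing(start_c=60.0, step_c=0.5, touchdown_cycles=20,
                        fixed_c=50.0, fixed_cycles=15):
    """Return the annealing temperature (in deg C) for every PCR cycle.

    The touchdown phase starts at start_c and drops by step_c after each
    cycle (assumed timing); it is followed by fixed_cycles at fixed_c.
    """
    touchdown = [start_c - step_c * cycle for cycle in range(touchdown_cycles)]
    fixed = [fixed_c] * fixed_cycles
    return touchdown + fixed


schedule = touchdown_annealing()
for cycle_number, temperature in enumerate(schedule, start=1):
    print(f"cycle {cycle_number:2d}: anneal at {temperature:.1f} C")
```

Listing the schedule makes it easy to check that the touchdown phase ends just above the temperature used for the remaining 15 cycles.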

Motif-pattern analysis
A total of 130 hemipteran ORs were used for motif discovery and pattern analysis. We used the MEME (version 4.9.1) online server (http://meme.nbcr.net/meme/), which has been widely used for the discovery of DNA and protein motifs. The parameters used for motif discovery were as follows: minimum width = 6, maximum width = 10, and maximum number of motifs to find = 8.