Identification of Putative Chemosensory Receptor Genes from the Athetis dissimilis Antennal Transcriptome

Olfaction plays a crucial role in insect survival and reproduction. Identifying the genes associated with the olfactory system will undoubtedly advance the study of insect chemical communication. In this study, RNA-seq was used to sequence the antennal transcriptome of Athetis dissimilis, an emerging crop pest in China with limited genomic information, in order to identify the gene set involved in olfactory recognition. Analysis of the female and male antennal transcriptomes generated 13.74 Gb of clean reads in total, from which 98,001 unigenes were assembled and 25,930 unigenes were annotated. A total of 60 olfactory receptors (ORs), 18 gustatory receptors (GRs), and 12 ionotropic receptors (IRs) were identified by BLAST and sequence similarity analyses. One obligate olfactory receptor co-receptor (Orco) and four conserved sex pheromone receptors (PRs) were annotated among the 60 ORs. Among the putative GRs, five genes (AdisGR1, 6, 7, 8 and 94) clustered in the sugar receptor family, and two genes (AdisGR3 and 93) involved in CO2 detection were identified. Finally, the co-receptors AdisIR8a.1 and AdisIR8a.2 were identified among the candidate IRs. Furthermore, the expression levels of these chemosensory receptor genes in female and male antennae were analyzed by mapping the Illumina reads.


Introduction
Athetis dissimilis (Hampson, 1909) (Lepidoptera: Noctuidae) is found in many countries, including Japan, Korea, India, the Philippines and Indonesia [1][2][3][4]. In 2012, this species was first observed damaging summer maize seedlings in Shandong province, China, although it had not previously been documented as an agricultural pest [4]. Since then, the pest has been found in Henan, Shanxi and Anhui provinces. Because larvae of A. dissimilis live under plant residues, it is difficult to control the spread of the pest with chemical pesticides. Novel control strategies are therefore urgently needed to mitigate crop damage.
Olfaction plays several vital roles in insect biology, including food selection, mate choice, the location of suitable oviposition sites by females, warning, and defense [5]. Accurate detection of volatile compounds in the surrounding environment is essential for insect survival. Antennae are the main specialized olfactory organs and carry a large variety of sensilla. Environmental chemical compounds are transported from micropores on the sensilla through the antennal lymph to olfactory receptor neurons (ORNs), which generate an electrical impulse [6]. Several families of transmembrane proteins at the membrane surface of ORNs appear to detect and recognize odorant molecules [7]. These odorant-binding transmembrane protein families are classified as olfactory receptors (ORs), gustatory receptors (GRs), and ionotropic receptors (IRs) [8][9][10][11][12]. Insect OR proteins contain seven transmembrane domains, but they have an inverted topology compared to those of vertebrates [13,14]. To function, one conventional OR and one obligate olfactory co-receptor (Orco) must form a dimer complex that works as a ligand-gated ion channel [13,15,16,17]. ORs in moths comprise pheromone receptors (PRs), which detect sex pheromones, and non-PR ORs. GRs are mainly expressed in gustatory organs such as the mouthparts [18]; however, some GRs are also expressed in olfactory structures and presumably have an olfactory function [19]. GR sequences are much more conserved than OR sequences [20,21]. IRs are a variant subfamily of ionotropic glutamate receptors (iGluRs) [13]. In insects, the IR family includes the conserved "antennal IRs", which have an olfactory function, and the species-specific "divergent IRs", which have a gustatory function [22].
The identification of chemosensory receptor genes in pest insects is especially significant because of their potential as novel targets for pest control. With improvements in high-throughput sequencing methods, a growing number of chemosensory receptors have been discovered. Transcriptome sequencing (RNA-seq) is a common method for obtaining a large variety of functional genes and has been widely used to identify genes involved in chemosensation in insects [23,24].
To identify chemosensory receptor genes of A. dissimilis, an organism with no available genomic information, we sequenced and analyzed the antennal transcriptomes of adult females and males using Illumina HiSeq2500 sequencing. We report here that the antennal transcriptome of A. dissimilis includes 60 OR, 18 GR and 12 IR genes.

Insect rearing and antennae collection
Athetis dissimilis was originally collected in July 2012 from infested maize seedlings at the Experiment Station of Henan University of Science and Technology in Luoyang, Henan province, China. The insects were fed an artificial diet in the laboratory at 27 ± 1°C and 70 ± 5% relative humidity under a 16 h:8 h light/dark cycle. After pupation, pupae were sexed according to the position of the genital scar. Male and female pupae were kept in separate cages until emergence. Adults were fed a 10% sugar solution. About 200 pairs of antennae from 3-4-day-old male and female moths were excised and immediately stored in liquid nitrogen until use.

RNA purification and sequencing
Total RNA was extracted using the RNAiso Plus kit (TaKaRa) and treated with RNase-free DNase I (TaKaRa) to remove residual DNA, following the manufacturer's instructions. Purity, concentration and integrity were then measured using a NanoDrop 2000c spectrophotometer (NanoDrop Products, Thermo Scientific, USA), a Qubit 2.0 fluorometer (Life Technologies, USA) and an Agilent 2100 (Quantifluor-ST fluorometer, Promega, USA), respectively. Qualified RNA samples were then used for transcriptome sequencing.
Following the TruSeq RNA Sample Preparation Guide v2 (Illumina), mRNA was enriched using oligo(dT) magnetic beads and sheared into short fragments by adding Fragmentation Buffer. First-strand cDNA was synthesized using random hexamer primers and then converted into double-stranded cDNA using dNTPs, RNase H and DNA polymerase I. After purification of the double-stranded cDNA with AMPure XP beads, end repair, poly-A tailing and sequencing adapter ligation were carried out. Fragments were size-selected using AMPure XP beads, and the cDNA library was constructed by PCR amplification (Veriti 96-Well Thermal Cycler, Applied Biosystems, USA). The concentration and insert size of the cDNA library were measured using the Qubit 2.0 and Agilent 2100, and the library was quantified by q-PCR (CFX-96, Bio-Rad, USA). Finally, 125 bp paired-end reads were generated by sequencing the cDNA on an Illumina HiSeq2500 using the sequencing-by-synthesis method. Sequencing was performed by the Genomics Services Lab of Beijing Biomarker Technologies Co., Ltd. (Beijing, China). Raw data processing and base calling were performed by the Illumina instrument software.

Unigene generation and annotation
To obtain clean data, the raw reads were first processed to remove adapter sequences and low-quality bases. The Q30 value and GC content were then used to assess sequencing quality. Reads were assembled de novo with Trinity [25], with min_kmer_cov set to 2 and all other parameters left at their defaults. Unigene sequences were aligned against the NR, Swiss-Prot, KOG and KEGG databases using the online BLASTX program with an E-value cut-off of 10^-5. Unigenes were then annotated using BLAST with an E-value of 10^-5 and HMMER with an E-value of 10^-10. The NR BLASTX results were passed to Blast2GO for GO annotation, which describes genes in terms of molecular function, cellular component and biological process. TransDecoder was used to predict the coding sequences (CDS) and amino acid sequences of the unigenes.
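The E-value filtering step can be illustrated with a minimal Python sketch. It assumes the BLASTX results are available in the standard 12-column tabular layout (query, subject, ..., E-value in column 11, bit score in column 12); the study does not state which output format was used, and the function name `best_hits` and the rule of keeping the highest-scoring hit per unigene are illustrative assumptions rather than the authors' procedure.

```python
import csv
import io

EVALUE_CUTOFF = 1e-5  # cut-off stated in the text


def best_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Map each query to its best hit passing the E-value cut-off.

    Assumes 12-column BLAST tabular output. Returns a dict of
    query_id -> (subject_id, evalue, bitscore).
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > cutoff:
            continue  # hit is too weak to support an annotation
        # Keep only the highest bit score seen so far for this query.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

Unigenes with no hit at or below the cut-off are simply absent from the returned dictionary, which corresponds to leaving them unannotated.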

Identification of the target genes and phylogenetic analyses
Target sequences were identified from the BLAST results obtained by searching the database with an E-value of < 10^-5. The complete coding region was determined using the ORF finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html). The nucleotide sequences of annotated genes were translated into amino acid sequences using the ExPASy portal (http://web.expasy.org/translate/). The transmembrane domains (TMDs) of annotated genes were then predicted using TMHMM Server v. 2.0 (http://www.cbs.dtu.dk/services/TMHMM-2.0/). Genes of other insect species such as Bombyx mori, Cydia pomonella, and Heliothis virescens were used as references.
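What the ORF-finding and translation steps do can be sketched in a few lines of Python. This is a simplified illustration, not a reimplementation of the web tools cited above: it uses the standard genetic code, scans only the three forward reading frames, and reports only complete ORFs that begin with ATG and end at a stop codon. The function names `translate` and `longest_orf` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate a nucleotide string in frame 0; unknown codons become 'X'."""
    nt = nt.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3)
    )


def longest_orf(nt):
    """Return (start, end, protein) for the longest complete forward ORF.

    Coordinates are 0-based and end-exclusive; the stop codon is included
    in the coordinates but not in the protein string.
    """
    nt = nt.upper().replace("U", "T")
    best = (0, 0, "")
    for frame in range(3):
        start = None
        for i in range(frame, len(nt) - 2, 3):
            codon = nt[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and CODON_TABLE.get(codon) == "*":
                if i + 3 - start > best[1] - best[0]:
                    best = (start, i + 3, translate(nt[start:i]))
                start = None  # closed; look for the next start codon
    return best
```

A sequence with no complete forward ORF yields `(0, 0, "")`. Partial transcripts that lack a start or stop codon, which are common in antennal transcriptomes, would need a more permissive search than this sketch provides.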

OR, GR and IR transcription abundance analysis
Transcription levels of OR, IR and GR genes of A. dissimilis are reported in values of Fragments Per Kilobase of transcript per million mapped reads (FPKM). The FPKM measure considers the effect of sequencing depth and gene length for the read count at the same time, and is currently the most commonly used method for estimating gene expression levels [34]. Thus, the FPKM of each gene was calculated based on the length of the gene and read count mapped to this gene.
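As a minimal sketch of the normalization described above, the widely used definition of FPKM divides the number of fragments mapped to a gene by both the gene length in kilobases and the total number of mapped fragments in millions. The function name `fpkm` and its parameter names are illustrative, and the exact software settings used in the study are not given here.

```python
def fpkm(fragments_mapped, gene_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments_mapped * 1e9 / (gene_length_bp * total_mapped_fragments)
    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6)
    scaling.
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and total mapped fragments must be positive")
    return fragments_mapped * 1e9 / (gene_length_bp * total_mapped_fragments)
```

For example, a 1,000 bp gene with 100 mapped fragments in a library of 1,000,000 mapped fragments has an FPKM of 100. Because library size appears in the denominator, values from the female and male libraries can be compared despite their different read counts.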

Sequence analysis and assembly
We obtained 26,234,196 female and 28,315,769 male clean reads, totalling 13.74 Gb of nucleotides, from the antennal cDNA libraries. The GC content of the samples was consistently about 45%, and the average quality value was at least 30 for more than 87.83% of the cycles (Table 1). In total, 10,821,996 contigs were generated with a k-mer size of 25. From these, 177,477 transcripts and 98,001 unigenes, with N50 lengths of 1,666 and 1,172 bp, respectively, were assembled using Trinity (Table 2).
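The N50 statistic reported for the assembly can be computed as follows. This minimal Python sketch uses the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. The function name `n50` is illustrative.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths supplied")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length  # half of all bases lie in sequences this long or longer
```

For the lengths 10, 8, 6, 4 and 2 (30 bases in total), the two longest sequences already cover 18 bases, so the N50 is 8.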

Chemosensory receptors
A total of 73 different sequences encoding candidate OR genes were identified by bioinformatic analysis. Among them, 59 were deposited in GenBank under accession numbers KR935700 to KR935758, and the single Orco gene was deposited under accession number KR632987. The other 13 sequences are either shorter than 200 bp or share no common sites for computing distances; we do not exclude the possibility that they represent non-conserved portions of genes, but we analyzed only the 60 OR sequences used in our phylogenetic tree construction. Information on the 60 ORs can be found in Table 4, while the sequences of the 13 remaining OR genes are listed in S2 File. Phylogenetic analysis confirmed the four candidate AdisPR genes (AdisOR3, 6, 11, and 14), clustering them in the conserved clade of lepidopteran PRs. As expected, the AdisOrco sequence showed high homology to the conserved insect co-receptors and clustered in the Orco clade. Aside from AdisOR47, all putative AdisORs were assigned to lepidopteran OR ortholog clades (Fig 5).
In the current study, 18 candidate GRs were identified from the A. dissimilis antennal transcriptome. Only two GR genes had full-length ORFs, while the others were partial sequences. All of these genes were deposited in NCBI GenBank (KR674128-KR674145). Information on the GR genes is listed in Table 5. A phylogenetic tree was constructed using the 18 candidate AdisGRs, 18 H. assulta GRs, 33 B. mori GRs, and 56 D. melanogaster GRs (Fig 6). AdisGR1, 6, 7, 8 and 94 are members of the "sugar" receptor subfamily and formed a clade with the H. assulta "sugar" receptors (HassGR6, HassGR7 and HassGR8). In addition, two putative GRs (AdisGR3 and 93) were identified as "CO2" receptor genes that share high sequence identity with the H. assulta "CO2" receptors (HassGR2 and HassGR3).
We also identified 12 candidate IR genes according to their similarity to known insect IRs, of which 4 sequences had full-length ORFs and 8 sequences had incomplete 5' or 3' termini. These 12 sequences were deposited in GenBank under consecutive accession numbers from KR912012 to KR912023. Information on the IRs is listed in Table 6. A. dissimilis IRs were named for their homology to those of H. assulta and S. littoralis. AdisIR8a.1 and 8a.2 clustered phylogenetically with the highly conserved IR8a subfamily, but no A. dissimilis IR gene fell within the IR25a subfamily. Two IRs clustered in the SlitIR1/HassIR1.1 clade with reliable bootstrap support and were named AdisIR1.1 and 1.2. IR75 was a very large clade comprising four A. dissimilis IRs (AdisIR75d, 75q.2, 75p and 75p.1). Further, IR21a (containing Adis21a.2 and 21a.3) and IR41a (containing Adis41a) were also highly conserved clades. At least one insect IR ortholog could be assigned to the majority of the putative AdisIRs (Fig 7).
To analyze the transcription abundance of chemosensory receptor genes in the sequenced libraries of both sexes, we surveyed the differential expression of all chemosensory receptor ORFs identified in the present study. The results are listed in S3 File.

Discussion
Transcriptome sequencing is a feasible and economical way to obtain target genes of interest in a short time, and the technology has become popular for identifying chemosensory receptors [42]. The genus Athetis comprises 211 species [43]. Although the majority of these species are not considered pests of major economic importance, a few Athetis species such as A. lepigone, A. dissimilis and A. gluteosa have been identified as important crop pests in China. Here, we identified 60 candidate OR gene sequences, 18 GRs and 12 IRs from A. dissimilis. To our knowledge, this is the first report of chemosensory receptors identified by transcriptome sequencing in the genus Athetis, and the strategy proved effective in uncovering large sets of chemoreceptors from the three major gene families.
ORs with sex-biased expression in the antennae are generally considered to be PRs that mediate behaviors specific to that sex. Lepidopteran sex pheromones produced by females attract males for mating. Several moth sex pheromone receptors have now been functionally characterized, and most are expressed at higher levels in male antennae [44][45][46]. In the phylogenetic analysis of the A. dissimilis ORs, four of them clustered in a conserved clade of PRs found in lepidopteran insects (Fig 5). We therefore hypothesize that some or all of them are dedicated to sex pheromone detection. Consistent with this, the transcription abundance analysis (S3 File) showed that AdisOR3, 6 and 14 were expressed at very high levels in male antennae, whereas AdisOR11 was the only one expressed at almost equal levels in female and male antennae, which may relate to females detecting their own pheromones.
Insect ORs are frequently co-expressed with a nonconventional OR, recently renamed olfactory receptor co-receptor (Orco) and previously referred to as OR83b in D. melanogaster and OR2 in B. mori [47]. Unlike other insect ORs, which share little sequence homology, Orco is strikingly well conserved across insect species. We identified one AdisOrco sequence with a high degree of similarity to co-receptors from different insect orders, and it clustered in the Orco clade. AdisOrco, which showed male-biased expression, had the highest expression level of all OR genes in both female and male adult antennae (see S3 File). This is in accordance with the expression pattern reported for Orco genes in other insects.
The GR family of insect chemoreceptors includes receptors for sugars and bitter compounds, as well as cuticular hydrocarbons and odorants such as CO2. Gustatory receptors for sugars and CO2 perceive compounds whose chemical structures remain constant (in contrast to bitter-tasting secondary plant compounds). Thus, sugar and CO2 receptor genes are relatively highly conserved in most of the insect genomes that have been sequenced to date [10,24,29,48]. We annotated 18 GR genes from the A. dissimilis antennal transcriptome dataset. The GR family in A. dissimilis includes two putative CO2 receptors (AdisGR3 and 93) and five sugar receptors (AdisGR1, 6, 7, 8 and 94). AdisGR4, another putative gustatory receptor, shares a clade with BmorGR9, HassGR4 and HassGR9. Because BmorGR9 was recently characterized as a fructose receptor [49], we suggest that AdisGR4 may also be a sugar receptor (Fig 6). Sugars and sugar alcohols have been shown to affect host plant selection and egg-laying behavior in codling moth females [50].
The iGluRs mediate excitatory neurotransmission in both vertebrate and invertebrate nervous systems [51]. Ionotropic receptor genes were first discovered in D. melanogaster through genome analyses [13]; they arose from an iGluR with a change in expression localization from an interneuron to a sensillar neuron [22]. In D. melanogaster antennae, IRs have been reported to detect a variety of molecules [52]. In the A. dissimilis antennal transcriptome, we identified 11 candidate IRs and 1 candidate iGluR. Recent studies have indicated that the IR co-receptors IR8a and IR25a have an expression pattern similar to that of Orco and play an essential role in the sensory cilia targeting of tuning IRs and in IR-based sensory channels [52]. Although we identified two IR8a genes from A. dissimilis, namely AdisIR8a.1 and AdisIR8a.2, IR25a was not found. This may be due to the absence of biological replicates. We also found that the expression level of AdisIR8a.1 was rather high and comparable to that of Orco (see S3 File). Two IRs, named AdisIR1.1 and 1.2, clustered together with their orthologs SlitIR1/HassIR1.1 in a "divergent IR" clade, while four IRs (AdisIR75d, 75q.2, 75p, and 75p.1) fell within the large IR75 clade. So far, however, the function of IR75 is unclear. Moreover, IR21a (containing Adis21a.2 and 21a.3), IR76b and IR41a (containing Adis41a) were also highly conserved clades. All AdisIRs that we discovered have orthologs in Hass/Slit/Dpon.

Conclusions
We obtained abundant biological information on the antennal transcriptome of A. dissimilis using high-throughput sequencing technology, with the aim of identifying genes potentially involved in olfaction. From the transcriptome data, three important gene families encoding chemosensory receptors were identified, annotated, and further analyzed for their expression profiles. Our results provide a foundation for exploring and understanding the molecular mechanisms involved in olfactory recognition in the insect pest A. dissimilis, and they provide alternative novel targets for pest management with semiochemicals.