Antennal transcriptome analysis of chemosensory genes in the cowpea beetle, Callosobruchus maculatus (F.)

Olfaction, one of the most important sensory systems governing insect behavior, is a possible target for pest management. Therefore, in this study, we analyzed the antennal transcriptome of the cowpea beetle, Callosobruchus maculatus (F.) (Coleoptera: Chrysomelidae: Bruchinae), which is a major pest of stored pulses and legumes. The de novo antennal RNA-seq assembly results identified 17 odorant, 2 gustatory, and 10 ionotropic receptors, 1 sensory neuron membrane protein, and 12 odorant-binding and 7 chemosensory proteins. Moreover, differential gene expression analysis of virgin male and female antennal samples followed by qRT-PCR revealed 1 upregulated and 4 downregulated odorant receptors in males. We also performed homology searches using the coding sequences built from previously proposed amino acid sequences derived from genomic data and identified additional chemosensory-related genes.


Introduction
Insects use olfactory signals in many behavioral contexts, such as locating food, mating, identifying oviposition sites, and escaping predators [1]. To detect olfactory signals, insects have developed a sensory system consisting of olfactory receptor neurons (ORNs) housed in the hair sensilla on the antennae and maxillary palps [2]. There are three chemoreceptor gene families in insects: odorant (OR), gustatory (GR), and ionotropic receptor (IR) families. Insect ORs contain seven transmembrane domains and have a membrane topology with intracellular Ntermini and extracellular C-termini, opposite to that of G-protein coupled receptors mediating chemoperception in many vertebrates [3,4]. GRs have a similar membrane topology to that of ORs [5], and IRs comprise three transmembrane domains with an extracellular N-terminus and a cytoplasmic C-terminus [6,7]. Additionally, other gene families encode proteins that have crucial roles in olfaction, including odorant-binding proteins (OBPs), chemosensory proteins (CSPs), and sensory neuron membrane proteins (SNMPs) [8,9]. Insect OBPs and CSPs chemosensory genes is still limited, possibly because previous studies analyzed the transcriptomes of the abdomen, head, and thorax to understand the digestive and reproductive gene expression profiles of C. maculatus [44,45]. Therefore, in the present study, we focused on de novo RNA-seq to analyze the antennal transcriptome of the species and conducted homology searches to identify chemosensory-related genes for coding sequences (CDSs), which were derived from the genomic data of C. maculatus but still not annotated. Additionally, we carried out phylogenetic analyses using the annotated chemosensory genes from genomically analyzed coleopteran beetle data. Finally, we compared the expressions of differentially expressed genes (DEGs) in the antennae of virgin C. maculatus males and females.

Insect rearing
Laboratory colonies of C. maculatus were used for the study. The insects were reared on Vigna angularis (Willd.) Ohwi and Ohashi in a plastic container at 28˚C in a dark incubator. The adults that emerged from the beans were immediately separated by sex, anesthetized on ice, and had their antennae excised and stored at -80˚C until further use.

RNA extraction, cDNA library construction, and Illumina sequencing
Sixty antennae from virgin male and female C. maculatus were crushed using a BioMasher II (Nippi Inc., Tokyo, Japan). Thereafter, total RNA was extracted using the ReliaPrep RNA Cell Miniprep System (Promega Corporation, Madison, WI, USA) following the manufacturer's protocol. RNA quality was confirmed based on an RNA Integrity Number >8 using an Agilent RNA 6000 Nano Kit in an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Afterward, a cDNA library was prepared using a TruSeq RNA Library Preparation Kit v2 (Illumina, San Diego, CA, USA) using 100 ng of total RNA according to the manufacturer's protocol. The library was sequenced on the Illumina sequencing platform (Illumina HiSeq 2500), and 100 bp paired-end reads were generated. The read data were deposited in the DDBJ Sequence Read Archive (DRA011785

Phylogenetic analyses of the in silico predicted chemosensory system
To narrow down chemosensory-related genes from the unigenes, an insect antennal dataset was customized from the NCBI protein database using the following keywords: "odorant receptor," "gustatory receptor," "sensory neuron membrane protein," "ionotropic receptor," "odorant-binding protein," and "chemosensory protein." The unigenes were then translated into amino acid sequences and annotated using the blastp program against the custom dataset [51]. In addition, a dataset of amino acid sequences obtained from the C. maculatus genome (GCA_900659725.1_ASM90065972v1) from the NCBI Genome database (https://www.ncbi. nlm.nih.gov/genome) was also annotated using the same protocol. The candidate chemosensory proteins (ORs, GRs, IRs, and SNMPs) were predicted using the TMHMM server 2.0 (http://www.cbs.dtu.dk/services/TMHMM/). Among them, the helical domains of ORs and GRs comprising seven helices with topology from inside to outside, IRs comprising three helices, SNMPs comprising two transmembrane helices with intracellular N-and C-termini, and eight cysteine residues were localized on the extracellular domain. The candidate OBPs and CSPs with Sec signal peptides were predicted using the signalP 5.0 server (http://www.cbs.dtu.dk/services/SignalP/). Amino acid sequences for each protein, consisting of both the candidate protein with transmembrane domains or signal peptide and the target protein identified using the BLAST search, were aligned using MAFFT version 7.214 [52] with default parameters. Moreover, maximum-likelihood (ML) phylogenetic analysis for the aligned sequence was performed using IQ-TREE multicore version 1.6.12 with the best-fit model in ModelFinder and 1,000 ultra-fast bootstrap replicates [53]. The ML phylogenetic trees were drawn using CLC Genomics Workbench 20 (Qiagen, Germantown, MD, USA). Amino acid sequences of chemosensory-related proteins from other insects were also obtained to estimate the C. maculatus candidate proteins. Accordingly, OR sequences of T. castaneum, A. glabripennis, and L. decemlineata were obtained from Mitchell et al. [54]. Leptinotarsa decemlineata GR sequences, T. castaneum, and L. decemlineata IR sequences were obtained from Schoville et al. [26]. Furthermore, SNMP sequences for A. planipennis, A. glabripennis, and D. ponderosae were retrieved from Andersson et al. [55], and those for T. castaneum were obtained from Dippel et al. [56]. In addition, OBP and CSP sequences for T. castaneum were obtained from Dippel et al. [57], and OBP and CSP sequences for C. chinensis were retrieved from Zhang et al. [36], and OBP sequences for L. decemlineata were obtained from Schoville et al. [26]. OBP and CSP sequences for A. glabripennis were obtained from Wang et al. [58] and Andersson et al. [55], respectively.

Identification of DEGs
The trimmed and filtered read data were mapped onto the de novo assembled unigenes using CLC Genomics Workbench 20 with the following parameters: mismatch cost = 2, insertion cost = 3, deletion cost = 3, length fraction = 0.8, and similarity fraction = 0.8. After statistical analysis based on a generalized linear model, DEGs with a change more than |2|-fold and false discovery rate (FDR)-adjusted p-value < 0.05 were selected. Gene ontology (GO) annotation and enrichment analysis of the DEGs were conducted using Blast2GO Basic version 6.0.3 [59].

qRT-PCR
The total RNA was quantified using a Nanodrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA), and first-strand cDNA was synthesized using a ReverTra Ace qPCR RT kit (Toyobo, Osaka, Japan) according to the manufacturer's protocol. Gene-specific primers (listed in S1 Table) were designed using Primer3.
qRT-PCR reactions were carried out in a StepOnePlus Real-Time PCR System (Applied Biosystems Inc., Foster City, CA, USA) using THUNDERBIRD Next SYBR qPCR Mix (Toyobo) with an amplification step (95˚C for 30 s, followed by 45 cycles of 95˚C for 5 s and 60˚C for 30 s), followed by a dissociation step (a cycle of 95˚C for 15 s, 60˚C for 30 s, and 95˚C for 15 s) for melting curve analysis. The expression levels were calculated using the ΔΔCt method [60]. The mean Ct value for each gene was calculated using three replicates, and the glyceraldehyde-3-phosphate dehydrogenase gene was used to normalize gene expression. The relative expression levels of each C. maculatus OR (CmacOR) by sex were compared using a two-tailed Student's t-test. All statistical analyses were performed in R software.

Sequencing and de novo assembly
The number of raw paired-end reads ranged from 2 × 19,324,335 to 2 × 22,898,011 in each sample, with approximately 98.9-99.0% reads generated as clean sequence reads. A total of 23,840 contigs were identified as unigenes belonging to eukaryotes, arthropods, and insects with BUSCO scores of 94.7, 92.7, and 91.8%, respectively. In addition, 20,024 (84.0%) contigs were functionally annotated using BLAST.

Identification of chemosensory-related genes
Unigene data and datasets of amino acid sequences derived from C. maculatus genomic data were functionally characterized into six chemosensory-related gene sets using the customized insect antennal dataset (S2-S7 Tables).
ORs. Transmembrane topology prediction identified 26 genes encoding putative ORs (17 from the antennal transcriptomes and 9 from the genomic analysis; CmacOR), with lengths ranging from 335 to 479 amino acid residues.
In Coleoptera, phylogenetic analysis revealed that the OR genes were separated into nine major subfamilies [54]; CmacOR gene numbers were allocated based on these subfamilies. The ML phylogenetic tree with genomically identified ORs of three coleopteran species revealed that 20 of 26 CmacORs were clustered around group 2A and followed by group 5A (4) (Fig 1). Within group 2A, a species-specific cluster comprised CmacOR4-11. Only one gene (Cma-cOR2) was housed in group 1, and no other genes were detected in the other groups.
We identified a putative OR-coreceptor (Orco) gene, namely CmacOrco, which was clustered with the Orco family with a high level of conservation (84-89% amino acid sequence identity) across the three beetles (A. glabripennis, L. decemlineata, and T. castaneum). The Cma-cOrco gene also showed the highest expression level according to the TPM value (S2 Table).
GRs. GR-transmembrane topology prediction revealed 2 GRs predicted from the antennal transcriptome, and 7 GRs were annotated upon genomic analysis, namely CmacGR1-7, with intact open reading frames ranging from 330 to 400 amino acid residues. The ML tree constructed using genomically identified L. decemlineata GRs revealed one sugar receptor (CmacGR1) in the cladogram containing LdecGR4-9 [26] (Fig 2). Additionally, four CmacGRs (CmacGR2a-3b) were clustered with LdecGR10NIC, which was identified as a fructose receptor. The remaining four CmacGRs derived from the genomic data were separately clustered in the bitter group [61].
IRs. Transmembrane topology prediction identified 10 IRs from the antennal transcriptome and 21 IRs from the genomic analysis (CmacIRs), ranging from 88 to 927 amino acid residues in length, correspondingly named for their putative T. castaneum and L. decemlineata homologs. The ML tree of CmacIRs revealed 23 widely conserved antennal IRs and 8 divergent IRs (Fig 3). Four conserved-receptor-homologs, namely IR8a, IR25a, IR76b, and IR93a, required as co-receptors with the other IRs [62], were detected in the antennal IRs. We also identified the antennal IR41a and IR75 groups, which functioned as olfactory detectors of acids and amines [63][64][65], and IR40a, which detected humidity [66,67]. Eight homologs were identified among the divergent IRs (one from the antennal transcript and seven from the genomic annotations).
SNMPs. Based on transmembrane topology prediction, we identified one SNMP containing 522 amino acid residues, and the ML tree among coleopteran beetle SNMPs revealed that the gene contained an SNMP2 clade, namely CmacSNMP2 (Fig 4). Although coleopteran SNMPs were categorized into four groups [68], no other SNMPs were detected in this study.
OBPs and CSPs. Using signal peptide prediction and motif analysis, 12 OBP genes from the antennal transcriptome and 14 OBP genes from the genomic analysis were identified, with 4 OBPs in common, whose sizes ranged from 118 to 361 amino acid residues. The number of putative OBPs was almost comparable to that in the congeneric species, C. chinensis (21) [36], correspondingly named for their putative C. chinensis homologs. Moreover, no plus-C, dimer, and atypical OBPs were identified in this study, while 15 classical and 11 minus-C OBPs were deduced from the 26 predicted CmacOBPs.
The ML tree revealed 14 orthologous pairs between C. maculatus and C. chinensis; among them, almost all the pairs shared 93-100% identical residues. Although some CchiOBPs lack N-terminus amino acids, only CmacOBP11 and CchiOBP11 shared 70% identical residues (Fig 5). Six OBPs were specific to C. maculatus. Some CchiOBPs were grouped in different clusters from those previously reported; CchiOBP18 was in the classic OBP and CchiOBP17 was in the plus-C clade, possibly due to the different tree construction methods [36].
The antennal transcriptome and genomic analysis revealed 7 and 6 CSP genes (117-317 amino acids long), respectively, with 3 CSPs in common, and the names were designated CmacCSP1-6. The phylogenetic ML tree analysis showed three orthologous pairs between C. maculatus and C. chinensis, sharing 98-99% identical residues, and three CSPs were specific to C. maculatus (Fig 6).

Detection of DEGs and qRT-PCR analyses
Using gene expression analysis of the reads mapped to the de novo assembled unigene sequences, 231 DEGs were detected between virgin male and female antennae, of which 112 upregulated and 119 downregulated genes were observed based on male antennae ( Fig  7A). The GO enrichment analysis indicated that the DEGs were significantly enriched in four biological processes ("nervous system process," "multicellular organismal process," "signaling," and "lipid metabolic process"), five cellular components ("plasma membrane," "membrane," "cell periphery," "microbody," and "peroxisome"), and four molecular functions ("oxidoreductase activity," "transmembrane transporter activity," "transporter activity," and "catalytic activity") ( Fig 7B). On the other hand, although the DEGs were identified in "cutin, suberin, and wax biosynthesis," "tryptophan metabolism," "folate biosynthesis," "histidine metabolism," and "one carbon pool by folate" based on 2-fold enrichment in the Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis, these pathways were not significantly enriched. Combining the DEGs with topology prediction, seven CmacORs revealed different expression patterns between sexes; CmacOR13a was upregulated while six CmacORs (CmacOR3, CmacOR6a, CmacOR6b, CmacOR8b, CmacOR11, and CmacOR15) were downregulated in male antennae (Fig 7C, S2 Table). To validate the expression profiles of CmacORs between virgin male and female antennae, we performed qRT-PCR for the five CmacORs, except for CmacOR6b and CmacOR8b, which have been suggested to be isoforms, and CmacOrco, which formed heteromeric complexes with all ORs. The analysis revealed that CmacOR13a was significantly upregulated in male antennae, whereas CmacOR3, CmacOR6a, CmacOR11, and CmacOR15 were significantly upregulated in female antennae (Fig 8). However, CmacOrco expression did not differ between the antennae of males and females.

Discussion
Recent RNA-seq (transcriptome) analyses, especially of non-model organisms, have revealed continuous progress in high-throughput sequencing technology [69][70][71]. Therefore, in the present study, we used next-generation sequencing to analyze the antennal transcriptome of a non-model bean beetle, C. maculatus, and accessorily annotated predicted genes from its genomic data.

PLOS ONE
Olfactory-related chemosensory genes have been suggested as potential targets for pest management. Since there are limited coleopteran databases and limited annotation information on C. maculatus, we searched for unigenes and CDSs built from the genomic and amino acid sequences against the customized NCBI protein database. Based on the highest BLAST scores, we identified 99 candidate chemosensory genes, including 26 ORs, 9 GRs, 24 IRs, 1 SNMP, 22 OBPs, and 10 CSPs, which were verified using the transmembrane hidden Markov model, signal peptide predictions, and motif analyses. Of these genes, 17 ORs, 2 GRs, 10 IRs, 1 SNMP, 12 OBPs, and 7 CSPs were newly identified using de novo antennal RNA-seq.

Antennal transcriptome analysis of Callosobruchus maculatus
In the insect olfactory system, OR forms heteromeric complexes with Orco. Although lowlevel amino-acid identities are shared between ORs, Orco sequences are highly conserved across distinct insect lineages [72]. In the phylogenetic analysis, CmacOrco revealed high similarity with three coleopteran Orco genes. The bulk of CmacORs was grouped in group 2A, followed by group 5A; 13 CmacORs formed a unique cluster in group 2A, indicating speciesspecific OR expansion (Fig 1). In Coleoptera, some species lost many OR subfamilies during evolution, and the ORs revealed a systematic lineage-specific expansion pattern following gain and loss [54]. Cucujiformia beetle genomic analysis revealed the expansion of subgroups 2A and 5A [54,73], consistent with the present results.
The gustatory receptors of coleopteran beetles follow a common insect pattern; they include a CO 2 receptor; putative sugar receptors, including those specific to fructose detection; and the remaining GRs are classified as putative bitter receptors assumed to be homologous with those in Drosophila melanogaster [73][74][75][76]. In the present analysis, no CO 2 receptor was identified, and only sugar, fructose, and bitter receptors were suggested. Since TPM values of CmacGR1 and CmacGR3a derived from the antennal transcriptome were low (S3 Table), the GR expression was suggested to be low in the antennae.
Insect IRs are derived from ionotropic glutamate receptors and are divided loosely into antennal and divergent IRs; the antennal IRs show olfactory detection of acids and amines. Conversely, divergent IRs are associated with the gustatory function [6,77]. In antennal transcriptome analysis, only one divergent CmacIR101 with low TPM values was detected (S4  Table); however, nine antennal CmacIRs with adequate TPM values were detected, suggesting that divergent IR expression could be low in the antennae. The IR75 beetle clade includes DmelIR75a-d and DmelIR64a homologs, which tend to radiate, ranging from 4 to 11 genes [55]. Furthermore, three CmacIRs within the IR75 clade were identified.
We identified only one SNMP gene cluster within the SNMP2 clade. Members of the SNMP1 subgroup are usually expressed in pheromone-sensitive olfactory neurons and are required for pheromonal activity in Drosophila flies and moths [78][79][80][81][82]. However, coleopteran beetle SNMPs remain poorly understood regarding whether the SNMPs could mediate pheromone reception, which requires further investigation.
Binding proteins (OBPs and CSPs) are involved in the first step of chemoreception in the lymph. On comparing the OBPs and CSPs between the congeneric beetles C. maculatus and C. chinensis, it was observed that three CmacOBPs and six CchiOBPs did not generate pairs ( Fig  5). The sex attractant pheromone structures differed in both species; C. maculatus uses five short-chain fatty acids, whereas C. chinensis uses homosesquiterpene aldehydes [83], indicating saltational evolution of the structures [84]. Therefore, such unique OBPs may be involved in detecting the structurally different sex attractant pheromones.
Within Coleoptera, CSPs are highly conserved and expressed in many parts of beetles [55]. In the case of the Chinese white pine beetle, Dendroctonus armandi Tsai and Li, the DarmCSP2 involves not only carrying of pheromones but also various host plant volatiles [85]. Thus, CmacCSPs identified from the antenna-transcriptome might carry some semiochemicals.
Since adult C. maculatus do not feed, the most important chemosensory signals are for mating and host recognition for oviposition. qRT-PCR analyses confirmed that the expression pattern of CmacOrco between the antennae of males and females was not significant, and the same pattern was observed in other coleopteran species [30,33,35]. In honey bees, the malebiased expression OR is associated with detecting female-produced sex pheromones [86]. CmacOR13a is highly expressed in the male antennae, suggesting it might be related to sex pheromone reception. In contrast, CmacOR3, CmacOR6a, CmacOR11, and CmacOR15 were significantly more expressed in female antennae; these CmacORs might be related to identifying host legumes for oviposition.
In the present study, antenna-transcriptome-based chemosensory gene analyses suggested the presence of several chemosensory genes. Further functional analyses of chemosensory genes could facilitate the development of sustainable pest control strategies for C. maculatus.
Supporting information S1