Transcriptomic analysis of the honey bee (Apis mellifera) queen spermathecae reveals genes that may be involved in sperm storage after mating

Honey bee (Apis mellifera) queens have a remarkable organ, the spermatheca, which successfully stores sperm for years after a virgin queen mates. This study uniquely characterized and quantified the transcriptomes of the spermathecae from mated and virgin honey bee queens via RNA sequencing to identify differences in mRNA levels based on a queen’s mating status. The transcriptome of drone semen was analyzed for comparison. Samples from three individual bees were independently analyzed for mated queen spermathecae and virgin queen spermathecae, and three pools of semen from ten drones each were collected from three separate colonies. In total, the expression of 11,233 genes was identified in mated queen spermathecae, 10,521 in virgin queen spermathecae, and 10,407 in drone semen. Using a cutoff log2 fold-change value of 2.0, we identified 212 differentially expressed genes between mated and virgin spermathecal queen tissues: 129 (1.4% of total) were up-regulated and 83 (0.9% of total) were down-regulated in mated queen spermathecae. Three genes in mated queen spermathecae, three genes in virgin queen spermathecae and four genes in drone semen that were more highly expressed in those tissues from the RNA sequencing data were further validated by real time quantitative PCR. Among others, expression of Kielin/chordin-like and Trehalase mRNAs was highest in the spermathecae of mated queens compared to virgin queen spermathecae and drone semen. Expression of the mRNA encoding Alpha glucosidase 2 was higher in the spermathecae of virgin queens. Finally, expression of Facilitated trehalose transporter 1 mRNA was greatest in drone semen. This is the first characterization of gene expression in the spermathecae of honey bee queens revealing the alterations in mRNA levels within them after mating. Future studies will extend to other reproductive tissues with the purpose of relating levels of specific mRNAs to the functional competence of honey bee queens and the colonies they head.


Introduction
In insects, success during sexual reproduction is achieved by the implementation of speciesand sex-specific strategies that involve direct contact between the male's sperm and seminal fluid and the female's eggs and the epithelial cells along the female's reproductive tract. These interactions can affect mating behavior, gamete production, fertilization efficiency, sperm competition and sperm storage [1,2]. In particular, sperm storage, the maintenance of sperm inside a female's reproductive tract for a sustained period of time, is a key feature of reproductive success in many insects [3,4]. For example, the myriad of sperm storage organs that have evolved in fruit fly species in the genus Drosophila [5] have been optimized such that one male's seminal fluid and accessory gland proteins increase the survival of all other males' ejaculates collected by a female during mating [6,7]. In another dipteran species, the mosquito Aedes aegypti, females mate with only one male but store their mate's sperm for a long period of time [8,9]. A recent RNA sequencing study of the spermatheca of virgin and mated A. aegypti females found eight spermatheca-specific transcripts that are responsible for the nourishment, maintenance and protection of male sperm [4].
Long-term sperm storage by reproductive females is also common in eusocial insects including ants, wasps and some bees, whereby queens typically mate early in life and store the sperm received during mating for years [10][11][12]. The sperm is stored in the spermatheca, allowing mating and fertilization to occur asynchronously [3]. Several factors in the female's spermathecal fluids and the male's seminal secretions have proven to be crucial for long-term storage and mating success in eusocial insects. In the ant Atta colombica, secretions from the male accessory glands greatly increase sperm viability upon ejaculation inside the female [13]. In Acromyrmex echinatior ants, sperm have a "self-non-self" recognition system in which sperm motility is lower in the presence of a male's own seminal fluids, but it is higher when mixed with the seminal fluids of other males' sperm, suggesting that enhanced sperm motility is costly and thus is only implemented as needed during competition between males [14].
In honey bees (Apis mellifera), a queen's genitalia and spermatheca are located in the lower section of the reproductive tract, where sperm storage, maintenance, release and fertilization occur [1,15]. Virgin honey bee queens mate with an average of 12 drones (range = 1-28) in quick succession during one or multiple mating flights [16,17]. After copulation, less than 10% of the semen remains inside the queen [18] and only about 3% of the sperm actively migrates to the spermatheca for long-term storage [12]. Interestingly, live sperm can "drag" dead sperm cells to the spermatheca [19], increasing the overall number of sperm cells that reach the spermatheca upon mating. The spermatheca of a newly mated queen can contain between four and seven million sperm cells [20][21][22], which she uses to fertilize 1,000 or more eggs daily for the production of worker offspring over her two-to three-year lifespan [23].
Despite the important role played by sperm storage in honey bee reproduction, the molecular mechanisms by which sperm remain viable for years inside the queen's spermatheca are not completely understood. Proteomic studies have identified numerous proteins in the queen spermatheca [34] and drone seminal fluid [35] and have shown changes in proteome composition between recently ejaculated sperm and sperm stored in the queen's spermatheca [38]. However, the molecular basis of sperm longevity within the spermatheca cannot be clearly elucidated via protein expression because many of the protein profiles from such studies may be from highly expressed proteins produced by other tissues and would thus require simultaneous comparative experiments between the spermatheca and other organs. A more informative approach, such as that taken in our study, is to elucidate large-scale gene expression upon mating in honey bee queens through next generation, high-throughput RNA sequencing of the spermathecal organ before and after mating.
Some of the behavioral and physiological changes known to occur in honey bee queens after copulation [42][43][44] are likely associated with changes in gene expression based on the queen's mating status [30,41,45]. With this in mind, we searched for candidate genes that may contribute to sustained sperm longevity in queens by performing differential gene expression analyses between the spermathecae of virgin queens and newly mated queens, and compared them to that of drone semen, paying particular attention to gene expression patterns that were unique to each tissue type. We discuss the putative functions of some gene products that are induced in the spermatheca after mating and thus may be involved in sperm storage. Our approach represents, to our knowledge, the first transcriptomic study of the honey bee queen spermatheca that focuses on the genes that may be involved in long-term sperm storage after mating, providing a foundation for the development of genetic markers of reproductive quality in honey bee queens and drones. Results from future studies using the honey bee as a model organism can help assess the reproductive health of other important pollinator species.

Source of bees
We obtained three virgin queens and three recently mated queens that were shipped by a commercial queen producer located in California's Central Valley (Olivarez Honey Bees, Inc., Orlando, CA). The virgin queens were approximately two weeks old and the mated queens were approximately three to four weeks old. All queens were obtained from the same Carniolan stock source colony and thus were sisters to each other. The queens were individually labeled, caged in a box containing worker attendants and food, and shipped to our laboratory facility at Texas A&M University in College Station, TX. The queens were kept at room temperature until they were dissected for collection of the spermatheca. Sexually mature drones (n = 3 biological replicates, using a pool of ten drones per biological replicate), approximately 15 to 18 days post emergence, were also collected for dissection and semen collection from each of three colonies located at the Janice and John G. Thomas Honey Bee Facility of Texas A&M University's RELLIS Campus in Bryan, TX.

Sample and tissue collection
Each queen was immobilized by placing her in a -20˚C freezer for 3 to 5 min. Once immobilized, she was decapitated on a bed of dry ice to keep all tissues suitable for RNA extraction. We then dissected the queen's spermatheca from the abdomen, removing the tracheal net covering the spermatheca with forceps (Fig 1A and 1B). The spermatheca was pressed against the wall of a microcentrifuge tube to release its contents inside the tube after adding 50 μL Pure-ZOL™ RNA isolation reagent (Bio-Rad Laboratories, Inc., Hercules, CA). Each sample was frozen individually at -80˚C until it was used for RNA extraction (see below). The spermathecal samples from both virgin and mated queens were handled in the same manner by the same person to control for variation due to handling. Drones were decapitated in a similar manner as queens and pressure was then applied to their abdomen to evert the endophallus. A micropipette was used to collect the semen laying on top of a bed of mucus at the tip of the bulb ( Fig  1C and 1D), as done previously [41]. We pooled the semen from ten drones from each of the three colonies. Therefore, each biological replicate contained the semen from ten drones from the same colony. We added 50 μL PureZOL™ RNA isolation reagent (Bio-Rad Laboratories, Inc., Hercules, CA) and then froze the three pooled semen samples at -80˚C until they were used for RNA extraction (see below).

RNA extraction and cDNA library synthesis
We purified total RNA from three biological replicates per tissue type: the spermatheca from three virgin queens, the spermatheca from three mated queens, and three pooled drone semen samples (n = 3 drones per replicate), which was considered an acceptable number of replicates for these type of studies in 2017, the year when the analysis was conducted [46]. Each sample was individually placed in 1.5 mL microtubes with 50 μL PureZOL™ RNA isolation reagent (Bio-Rad Laboratories, Inc., Hercules, CA) and the tissues therein were ground with a pellet mixer (VWR 1 catalog number 47747-307, Radnor, PA). RNA from each sample was extracted using the standard protocol for PureZOL™. After adding 1 μL of 20 μg/μL RNAsefree glycogen, RNA was precipitated overnight at -20˚C. RNA pellets were created through 10 min of centrifugation at 12,000 x g in a microcentrifuge kept at 4˚C. The pellets were washed in 75% ethanol, air dried for 10 min and resuspended in 25 μL of nuclease-free water. RNA concentrations were measured in a Qubit12.0 fluorometer with a Qubit1 RNA HS Assay Kit (Life Technologies Corporation, Grand Island, NY).
Fifty ng aliquots of each RNA sample were sent to the Texas A&M University's Institute for Genome Sciences and Society Core (College Station, TX) for quality assessment (done on a BioAnalyzer 2100), cDNA library construction and sequencing. The cDNA libraries used for sequencing were prepared according to the NuGEN Technologies (Tecan Genomics, Inc., Redwood City, CA) protocol using the Ovation SoLo RNA-Seq system. After adapter ligation, ribosomal transcripts were depleted using a custom insert dependent adapter cleavage (InDA-C) probe pool from NuGEN Technologies (Tecan Genomics, Inc., Redwood City, CA) designed for ribosomal RNA sequences in the A. mellifera genome.

RNA sequencing and bioinformatics analyses for differential gene expression
Libraries, in equimolar concentrations, were sequenced on an Illumina NextSeq 500, with a high-output 150 base-pair sequencing run, using the manufacturer's supplied custom sequencing read 1 primer on four sequencing lanes. After sequencing, the first five bases were trimmed as suggested in the Ovation Solo RNA-seq protocol (https://www.nugen.com/sites/default/ files/M01406_v4_User_Guide%3A_Ovation_SoLo_RNA-Seq_System_1287.pdf). A total of 424 million reads were checked to trim any adapter sequences and low-quality bases using Trimmomatic [47]. Read mapping was performed using HISAT version 2.0.5 [48]. The genome Amel_4.5 from NCBI was used for contig assembly (https://www.ncbi.nlm.nih.gov/ assembly/GCF_000002195.4/). HTSeq [49] was used to generate raw read counts per gene using an intersection-nonempty parameter to account for ambiguous read mappings.

Functional clustering analysis of differentially expressed genes
Tables of functional clustering of differentially expressed genes were generated using the Database for Annotation, Visualization and Integrated Discovery (D.A.V.I.D.) platform [50,51] (https://david.ncifcrf.gov/home.jsp). Gene lists from differential expression analysis of the RNA sequencing data were filtered to isolate genes that were different by either � 4.0-fold (up-regulated) or � -4.0-fold (down-regulated). The gene lists were uploaded to D.A.V.I.D. and referenced against the in-house database for A. mellifera on BeeBase. The "Functional Annotation Clustering" under "lowest" classification stringency was used to study the gene list. We report all terms with FDR cutoff values � 0.05.

Confirmation of differential expression of selected gene products by RT-qPCR
From the RNA sequencing data, ten mRNA targets were chosen to be analyzed for tissue specificity using RT-qPCR (S1 Table). The mRNAs chosen for RNA-seq confirmation were selected because they were the most prevalent (based on the sequencing data) in each tissue type. The mRNAs chosen for their high level of expression in the spermathecae of mated queens were GB54516 (Uncharacterized protein), GB53925 (Kielin/chordin-like) and GB43575 (Trehalase). The mRNAs selected for their expression in the spermathecae of virgin queens were GB43248 (α Glucosidase 2), GB44112 (Melittin) and GB54549 (α Glucosidase 1). The drone semen-specific mRNAs chosen were GB48478 (Multiple inositol polyphosphate phosphatase 1), GB54806 (Facilitated trehalose transporter 1), GB45850 (Clavesin2) and GB40598 (Bumetanide-sensitive Na(K)Cl cotransporter). The mRNA chosen for normalizing the data was GB41358 (encoding Elongation factor 1-alpha F2, or EF1aF2) because of its high and invariant expression across samples in the RNA sequencing data, as well as having previously been validated as a reference gene for gene expression studies of honey bee tissues [52].
Primers were designed with the Primer-BLAST software to span exon junctions and to detect all isoforms for each gene product of interest [53]. All primer sequences are listed in S1 Table. The RNA samples were used to synthesize cDNA for qPCR in a 20 μL volume. For each sample, 50 ng of total RNA from the same tissues was used for RNA sequencing. Using the same source material for RNA sequencing, RT-qPCR was done to reduce the amount of environmental variability surrounding the rearing of drones and queens in different sample types. The total RNA from each sample was treated with DNase and reverse transcribed using the iScript TM gDNA Clear cDNA Synthesis kit (Bio-Rad Laboratories, Inc.). Synthesized cDNA was diluted by four-fold and 1 μL of that dilution was used in a 10 μL qPCR reaction. Diluted cDNA acted as template for all qPCRs. All amplifications used Power SYBR TM Green (Thermo Fisher Scientific, Waltham, MA) in triplicate reactions (10 μL) with primers at 80 nM final concentrations. Standard cycling conditions (50˚C for 2 min, 95˚C for 2 min, then 40 cycles of 95˚C for 15 s and 60˚C for 1 min) and melt curve analyses (65˚C to 95˚C in 0.5˚C increments every 5 s) were used on a CFX TM Real-Time system (Bio-Rad Laboratories, Inc.). All qPCR analyses were done using the CFX Manager 3.1 software.
Amplification efficiencies were calculated and used for correction in all normalized fold expression analyses, performed by the CFX TM Manager 3.1 software. M values for the normalizer were < 0.5 for all runs, as calculated by CFX Manager 3.1. Linear values (2^-ΔΔC t ) are reported as means and standard error of the mean (SEM) for each tissue [54]. The RT-qPCR data are reported after setting the lowest value for that gene/tissue at one and reporting up-regulation of that gene in the other two tissues as fold change values.

Statistical analysis
For the RNA sequencing data, the DESeq2 program used Negative Binomial Distribution to estimate means and variances for each gene per sample based on library size and then performs Wald test to extract P and FDR values [55]. If the latter was < 0.05, genes were considered differentially expressed. For the RT-qPCR data, we performed one-way ANOVA tests to determine if there were statistical differences in gene expression values between the spermathecae of virgin queens, the spermathecae of mated queens and drone semen. In cases where we found statistical differences in gene expression among queen types, or between queens and drones, we conducted pairwise Student's t tests to discern differences between tissue types. All tests were performed using the statistical software JMP 1 12.0 (SAS Inc., Cary, NC). We set the level of statistical significance at P < 0.05 for all tests and report all descriptive statistics as means ± SEM.

RNA yield and purity
RNA purified from individual mated queen spermatheca had average yields of 639 ± 264 ng. Similarly, RNA yields from each virgin queen spermatheca averaged 650 ± 315 ng. However, preparations from pooled semen from ten drones from the same colony (per biological replicate) yielded less RNA, with an average of only 92 ± 17 ng. RNA purity was assessed as described in Gonzalez et al. [41].

RNA sequencing of three honey bee tissues
RNA sequencing data were generated from three honey bee tissues: mated queen spermatheca, virgin queen spermatheca, and drone semen. A total of nine cDNA libraries were made, one for each of three biological replicates for each tissue. RNA sequencing of all libraries resulted in approximately 393 million high-quality reads (92.7% of the total 424 million reads). Of these, 39 million reads were further filtered out. The remaining 385 million reads mapped to the A. mellifera genome. The number of total reads, mapped reads and corresponding genes are listed in S2 Table for each of the nine libraries. In total, the expression of 11,233 genes was identified in mated queen spermathecae, 10,521 in virgin spermathecae, and 10,407 in drone semen (not shown in S2 Table).
Because these data are novel, it is important to take note of the genes that were highly expressed in each tissue. The top 20 most prevalent genes by normalized level of expression in each tissue type are listed in Table 1. The complete lists of genes expressed in each tissue are presented in the S3 Table. The entire datasets for each library are available at NCBI (Bioproject # PRJNA542364).
The expression levels of specific genes were compared between the three tissue types. Volcano plots (Fig 2 and S1 Fig) show individual genes plotted in comparisons of paired tissues, with the x-axis being log 2 (Fold Change) and the y-axis being log 10 (P value). The black dots represent individual genes with expression levels not significantly different between the two Table 1. The top 20 most prevalent RNAs discovered in each honey bee tissue (i.e., mated queen spermatheca, virgin queen spermatheca, and drone semen) are listed along with their BeeBase gene identifiers ("Gene_ID").

Gene_ID
Protein description Average value tissues (P > 0.01 and/or log 2 (Fold Change) � 2.0). The red-orange dots indicate genes with different expression levels between the two tissues (P < 0.01 and log 2 (Fold Change) � 2.0). In Fig 2A, red-orange dots on the left side of the volcano plot indicate genes that were down-regulated in mated queen spermatheca compared to virgin queen spermatheca, and red-orange dots on the right indicate genes that were up-regulated in mated queen vs. virgin queen spermathecae. Comparing that to the volcano plot with data from mated queen spermatheca and drone semen, the transcriptomes of the mated and virgin queen spermatheca are more similar than the transcriptomes of the mated queen spermatheca and drone semen ( Fig 2B). The volcano plot comparing the transcriptomes of virgin queen spermathecae and drone semen (S1 Fig) is similar to the one shown in Fig 2B. Comparisons of the RNA sequencing data from the three tissues revealed that the expression levels of 212 genes were significantly different between mated and virgin queen spermatheca (FDR < 0.05). A total of 129 genes (1.4% of total) were up-regulated and 83 genes (0.9% of total) were down-regulated in the mated queen spermathecae (S4 Table). Among these, the 20 genes with the greatest fold changes in expression between those tissues (ten highest and ten lowest) are shown in Table 2. In the genes that were up-regulated in the mated queen spermathecae, two gene products, "uncharacterized LOC102656393" (GB50654) and "uncharacterized protein PF11_0213-like" (GB41096), were also up-regulated in drone semen (S3 Table). This suggests that these uncharacterized protein mRNAs may be delivered to the spermatheca by sperm after mating. Other genes in Table 2 with high fold change values in mated vs. virgin queen spermatheca, such as "proline-rich extension-like protein EPR1" (GB40609) and "NF-kappa-B-inhibitor cactus-like" (GB53301), were also down-regulated in drone semen (S3 Table). The low expression levels of those genes in the virgin queen spermatheca and The "Average value" is the average of the normalized expression values of the three biological replicates for that tissue. Gene descriptions and gene identifiers in bold, italic font were further investigated using real time quantitative PCR (RT-qPCR). https://doi.org/10.1371/journal.pone.0244648.t001

PLOS ONE
drone semen indicate that the expression of the genes is probably induced in the spermatheca after mating.
In comparing the RNA sequencing data from mated queen spermatheca and drone semen, 766 genes were differentially expressed: expression levels of 416 genes were greater and 350 genes were lower in the mated queen spermathecae (S5 Table). Table 3 shows the 20 gene products with the highest and lowest log 2 fold changes. The 1,058 genes with differential expression levels in virgin queen spermatheca and drone semen are shown in S6 Table. Expression levels of 547 genes were greater in the virgin queen spermathecae and those of 511 genes were lower compared to drone semen.

Functional clustering analysis of differentially expressed genes
Gene contrasts that were up-or down-regulated by � 4-fold in the transcriptomes of mated vs. virgin queen spermathecae were compiled in two different lists for D.A.V.I.D. functional annotation clustering analysis. In the list of 129 genes that were up-regulated in mated queen spermathecae, 73 genes were found in the D.A.V.I.D. database. From these 73 genes, a single cluster was identified belonging to the "DUF1676" category (domain of unknown function 1676) with an enrichment score of 1.83 (Table 4; protein descriptions are in S7 Table). In addition, 69 out of 83 genes in the list of genes that were down-regulated in mated compared to virgin queen spermathecae were found in the D.A.V.I.D. database. From these 69 genes, four different clusters were identified with enrichment scores ranging from 5.00 to 3.27 (Table 4; protein descriptions are in S7 Table). Many of these genes were categorized under "signal" or "hydrolase," suggesting that signaling and metabolic pathways are affected in the spermatheca after mating.
Within the 416 genes that were up-regulated in mated queen spermathecae compared to levels in drone semen, 281 genes were found in the D.A.V.I.D. database. However, no functional clusters were identified with FDR < 0.05. Of the 350 genes that were down-regulated in  (Table 5; protein descriptions are in S8 Table). Common categories of clusters were involved in membrane components such as transporters. Genes involved in membrane transport mechanisms were up-regulated in sperm but were less so in the spermatheca after mating.

Real time quantitative PCR (RT-qPCR) confirms RNA sequencing data
Ten transcripts were selected for RT-qPCR analyses from RNA sequencing data (Fig 3), which revealed up-regulation of genes in one of the three tissues examined (spermatheca from mated queens, spermatheca from virgin queens and drone semen). The amplification efficiencies used for correction in all normalized fold-expression analyses ranged from 95.22% to 105.67%, which was within the 90-110% acceptable range [56]. The RT-qPCR analyses confirmed that two mRNAs (BeeBase gene identifiers GB53925 and GB54516, encoding kielin/chordin-like protein and an uncharacterized protein, respectively) were up-regulated in the spermathecae from mated queens when compared to the spermathecae from virgin queens or drone semen (Fig 4A). However, inter-animal variability precluded GB43575 mRNA (encoding Trehalase) from being differentially expressed between mated and virgin queen spermathecae samples. Furthermore, the GB43248 and GB44112 mRNAs (encoding α Glucosidase 2 and melittin, Table 2. The top ten and bottom ten genes that were differentially expressed between mated queen spermatheca and virgin honey bee queen spermatheca, ranked by the log 2 Fold-change (LFC). respectively) were up-regulated in the spermathecae of virgin queens compared to their expression in mated queens (Fig 4B). Levels of the GB54549 mRNA, encoding α Glucosidase 1, were similar between spermathecae from mated and virgin queens. All six gene products with RT-qPCR data reported in Fig 4A and 4B were up-regulated in queen spermathecae (regardless of mating status) compared to drone semen. In contrast, the four mRNAs (GB48478, GB54806, GB45850 and GB40598, encoding Multiple inositol polyphosphate phosphatase 1, Trehalose transporter 1, Clavesin 2, and Na + (K + )Clcotransporter, respectively) were up-regulated in drone semen compared to queen spermathecae (Fig 4C). The RT-qPCR data confirmed some of the differences in mRNA levels first identified in the RNA sequencing data.

Discussion
In this study, RNA sequencing was used to provide a global, high throughput picture of the transcriptome of the spermathecae and semen of honey bees, as well as the relative expression levels of the genes. The purpose was to start to address the molecular mechanism through which honey bee queens store viable sperm for several years after mating. This is the first report of the complete transcriptome of the spermathecae of mated and virgin honey bee queens, in which over 10,000 genes were identified. This data complements a proteomic study of the spermathecal fluid of honey bee queens [34] in which, using one-dimensional acrylamide gel electrophoresis and mass spectroscopy, the authors identified 122 proteins in Table 3. The top ten and bottom ten genes that were differentially expressed between mated queen spermatheca and drone semen, ranked by the Log 2 Fold Change (LFC). spermathecal fluid. Our study showed up-regulation of the vitellogenin (Vg) gene in mated queen spermatheca, which is involved in the production of the egg yolk protein necessary for egg production, as well as queen longevity [57]. This indicates that the presence of Vg as a major protein in that proteome may be translated in the spermathecae, as found the aforementioned study of the honey bee proteome [34]. Additionally, our current study provides the first full description of the transcriptome of semen from honey bee drones, with a similar number of expressed genes discovered as in the queen spermatheca. A previous study used the much less sensitive proteomic approach to identify 40 proteins in the sperm of honey bee drones [35]. The two proteomic studies [34,35] are consistent with our transcriptomic data, indicating that a distinct set of genes are expressed in honey bee spermathecae compared to those expressed in honey bee semen. A group studying queens of the social ant Cremaster osakensis, which also store sperm over several years, performed RNA sequencing of mated and virgin queen spermathecae [2]. They found 75 genes that were up-regulated in mated queen spermathecae compared to virgin queen spermathecae and 20 genes that were down-regulated in mated queen spermathecae. When compared to our discovery of 129 up-regulated and 83 down-regulated genes in the spermatheca of mated Table 4. Cluster analysis of differentially expressed genes in mated queen spermatheca vs. virgin queen spermatheca using the online tool DAVID [50,51].  Table. https://doi.org/10.1371/journal.pone.0244648.t004

PLOS ONE
Transcriptome of the honey bee queen spermatheca Table 5. Cluster analysis of differentially expressed genes in mated queen spermatheca vs. drone semen using the online tool DAVID [50,51].

PLOS ONE
Despite some of these limitations, the transcripts chosen for quantitation by RT-qPCR included several genes that have been identified in other studies of honey bees and ants. For instance, some of these transcripts are involved in carbohydrate metabolism. Trehalose is the primary circulating sugar in insects. It is synthesized mainly by the fat body and its levels in hemolymph are regulated by its production and release by the fat body and its uptake by other body tissues [60]. The importance of the trehalose transporter (Tret1) in A. mellifera is highlighted by the fact that, in larvae, the molar ratio of trehalose to glucose in the hemolymph is 1.5 to 1. Cloning and expression of A. mellifera Tret1 mRNA in oocytes of Xenopus frogs allowed the high affinity of the trehalose transporter to be measured [60]. In larvae of moths in the genus Bombyx, the Tret1 gene is most highly expressed in testis, as measured by RT-qPCR [60], consistent with high levels in semen from honey bee drones (this study). There are much lower levels of Tret1 in other moth organs, including brain, abdomen, midgut, and ovary. In C. osakensis ant queens, levels of Tret1 mRNA were four times greater in spermathecae than in the whole body and they increased 2.2-fold in spermathecae within one week of mating [2]. Furthermore, in situ hybridization localized the Tret1 mRNA to the spermathecal gland and the hilar columnar epithelium of the spermathecal reservoir. In Zootermopsis nevadensis termite pre-soldiers, Tret1 mRNA levels are high in the head before molting [61]. In the current study of honey bees, we also identified greater levels of trehalase mRNA in spermathecae than in drone semen. Trehalase is an enzyme that circulates in the insect hemolymph and cleaves trehalose into two glucose molecules. Trehalase can be up-regulated by hormones (e.g., ecdysone) and pesticide exposure [62]. The observed up-regulation of the trehalose transporters in the mated queen spermatheca likely helps fulfill the organ's energy requirements to maintain the sperm oxygenated and viable in the long term [60].  PCR (RT-qPCR). The genes are identified by their BeeBase gene identifier. The tissues used (n = 3 biological replicates per tissue type) were spermathecae from mated queens ("Mated"), spermathecae from virgin queens ("Virgin") and drone semen ("Drone"). The color index above indicates genes that were expressed at relatively low levels (blue) or at high levels (red) in each row. Heat maps were generated using Morpheus (https://software.broadinstitute.org/morpheus/). Two mRNAs encoding α glucosidases were also selected for RT-qPCR in this study because the RNA sequencing data indicated that they were up-regulated in the spermathecae from virgin queens compared to those of mated queens. Alpha-glucosidases are widely distributed in species ranging from plants to animals [63]. There are three α glucosidases (I, II, and III) in honey bees [64]. The α Glucosidase 1 protein is a predominant glucosidase in the ventriculus of honey bee workers [65], and cleaves both glucosides and maltooligosaccharides [66]. The α Glucosidase 2 protein is predominant in the ventriculus and hemolymph of honey bee workers [65], and it cleaves glucose residues in N-linked oligosaccharides. Moreover, real time semiquantitative PCR data has previously shown that α Glucosidase 1 mRNA is more highly expressed in adult Apis cerana foragers than α Glucosidase 2 mRNA, while the levels of α Glucosidase 2 mRNA are greater in eggs, larvae and pupae [67]. Our RT-qPCR results confirmed that α Glucosidase 2 mRNA levels were greater in spermathecae from virgin honey bee queens compared to those in mated queens. The up-regulation of both α glucosidases in virgin queen spermathecae likely helps the organ break down polysaccharides for energy usage, especially when priming the sperm-storing organ in preparation for the energetic demands of storing and maintaining sperm viability over time.
Several of the other mRNAs analyzed by RT-qPCR encoded signaling molecules. The Multiple inositol polyphosphate phosphatase (Minpp1) enzyme cleaves inositol hexaphosphate (InsP6, also known as phytic acid) and the inositol pyrophosphates InsP7 and InsP8 [68]. These are ubiquitously expressed in eukaryotic cells and are believed to be master regulators of energy metabolism [69]. The amino acids of the Minpp1 enzyme of honey bees is only identical to human amino acid residues at 20 to 30% of the positions [68]. The Bumetanide-sensitive Na(K)Cl cotransporter is a membrane-bound ion channel. The Clavesin 2 protein regulates endocytosis and secretion [70]. The Kielin/chordin-like protein was recently recognized to be important for TGF1 beta signaling in mammals, as it binds Bone morphogenetic protein 7, increases binding to TGF type 1 receptors and interacts with activin A [71].
Previously, we reported up-regulation of three genes encoding antioxidant proteins in the spermathecae of mated queens compared to virgin queens [41]. In that study, RT-qPCR identified two-fold upregulation of the Catalase and Thioredoxin 2 genes along with three-fold upregulation of the Thioredoxin reductase 1 (TXNRD1) gene (P < 0.05). A different group found an increase in TXNRD1 mRNA levels in mated honey bee queen spermatheca compared to virgin queen spermatheca [34]. In this study, our minimal cutoff for the differential expression of genes was four-fold changes (log 2 (Fold Change) � 2.0; FDR < 0.05) in the level of gene expression. Our RNA sequencing data (S4 Table) confirmed the three-fold greater expression of the TXNRD1 gene in the mated spermatheca compared to the virgin queen spermatheca. Similarly, the RNA sequencing data in the current study determined that the expression levels of four other genes encoding antioxidants (Superoxide dismutase 1, Glutathione S-transferase D1, Glyoxalase domain-containing 4-like and Vitellogenin) were not different in the spermatheca of mated and virgin honey bee queens, consistent with the RT-qPCR data in our previous report [41]. In addition, in queens of the social ant C. osakensis, levels of Superoxide dismutase mRNA were not different between the spermatheca of mated and virgin queens [2]. Real time quantitative PCR (RT-qPCR) confirmation revealed the relative expression levels of genes (identified by their BeeBase gene identifier) that were more highly expressed in mated queen spermatheca (blue bars), virgin queen spermatheca (orange bars), or drone semen (grey bars). Of the selected genes, a) two of three genes were most highly expressed in the mated queen spermatheca; b) two of three genes were most highly expressed in the virgin queen spermatheca; and c) four genes were most highly expressed in drone semen. Different letters above each bar represent the tissue for which gene expression was significantly different from that in the other tissues (P < 0.05). Linear values are presented after normalizing to the EF1aF2 mRNA. Protein names are listed above the bars. https://doi.org/10.1371/journal.pone.0244648.g004 In conclusion, this is the first known transcriptomic study that characterizes gene expression in the spermathecae of honey bee queens revealing the alterations in mRNA levels within them after mating. Further analysis of the honey bee queen transcriptomes obtained from this and other studies should help to identify genes that are involved in long-term sperm storage in the queen spermatheca after mating, and could potentially help to elucidate how environmental and biotic stressors can affect honey bee queen mating health.
Supporting information S1 Fig. Volcano plot displaying differentially expressed genes (red-orange dots) between the spermatheca of virgin honey bee queens and drone semen. Each dot represents one gene. The black dots represent genes that were not differentially expressed (P < 0.01 and |log 2 (Fold-change)| � 2). (PPTX) S1 Table. List of genes chosen for confirmation of differential expression by Real Time quantitiative PCR (RT-qPCR). An exception is the normalizer gene GB41358 at the bottom, on the last row. The BeeBase gene identifier, GenBank accession number, gene symbol, protein description, forward and reverse primer sequences, and amplicon size (bp) are provided. (XLSX) S2 Table. RNA sequencing statistics for nine cDNA libraries. Each "Drone" replicate indicates a pooled semen sample from ten drones; each "Mated queen" replicate represents one spermatheca sample from a mated queen; and each "Virgin queen" replicate represents a spermatheca sample from a virgin queen. The "total number of reads mapped" is the number of reads that were mapped to the honey bee genome. (XLSX) S3 Table. List of genes that were discovered during RNA sequencing. The BeeBase gene identifier, protein description, normalized gene values for each sample of the three tissue types analyzed, and the average normalized value for each tissue type, are provided. Each "Drone" replicate indicates a pooled semen sample from ten drones; each "Mated queen" replicate represents one spermatheca sample from a mated queen; and each "Virgin queen" replicate represents a spermatheca sample from a virgin queen. The genes were sorted (most prevalent to least prevalent) based on average values of the mated queen samples. (XLSX) S4 Table. List of all genes that were differentially expressed between mated queen spermatheca and virgin queen spermatheca. The genes are ranked by the log 2 Fold-change (LFC). Genes are identified by their BeeBase gene identifier. SEM is the standard error of the mean of the log 2 Fold-change. FDR is the False Discovery Rate. (XLSX) S5 Table. List of all genes that were differentially expressed between mated honey bee queen spermatheca and drone semen. The genes are ranked by the log 2 Fold-change (LFC). Genes are identified by their BeeBase gene identifier. SEM is the standard error of the mean of the log 2 Fold-change. (XLSX) S6 Table. List of all genes that were differentially expressed between virgin honey bee queen spermatheca and drone semen. The genes are ranked by the log 2 Fold-change (LFC). Genes are identified by their BeeBase gene identifier. SEM is the standard error of the mean of the log 2 Fold-change. (XLSX) S7 Table. Functional annotation clustering of differentially expressed genes in mated queen spermatheca vs. virgin queen spermatheca using the online tool DAVID [50,51]. Recognized gene clusters were identified using the lowest stringency of the software's Functional Annotation Clustering settings. The reported genes are those that were clustered and had an FDR value < 0.05. Genes are reported using their BeeBase gene identifiers and their protein descriptions. Genes in bold had their expression additionally examined by RT-qPCR. See S4 Table for a complete list of all the genes that were differentially expressed between mated queen spermatheca and virgin queen spermatheca. (XLSX) S8 Table. Functional annotation clustering of differentially expressed genes in mated queen spermatheca vs. drone semen using the online tool DAVID [50,51]. Recognized gene clusters were identified using the lowest stringency of the software's Functional Annotation Clustering settings. The reported genes are those that were clustered and had an FDR value < 0.05. Genes are reported using their BeeBase gene identifiers and their protein descriptions. Genes in bold had their expression additionally examined by RT-qPCR. See S5 Table for a complete list of all the genes that were differentially expressed between mated honey bee queen spermatheca and drone semen. (XLSX)