Transcriptional Profiles of Mating-Responsive Genes from Testes and Male Accessory Glands of the Mediterranean Fruit Fly, Ceratitis capitata

Background Insect seminal fluid is a complex mixture of proteins, carbohydrates and lipids, produced in the male reproductive tract. This seminal fluid is transferred together with the spermatozoa during mating and induces post-mating changes in the female. Molecular characterization of seminal fluid proteins in the Mediterranean fruit fly, Ceratitis capitata, is limited, although studies suggest that some of these proteins are biologically active. Methodology/Principal Findings We report on the functional annotation of 5914 high quality expressed sequence tags (ESTs) from the testes and male accessory glands, to identify transcripts encoding putative secreted peptides that might elicit post-mating responses in females. The ESTs were assembled into 3344 contigs, of which over 33% produced no hits against the nr database, and thus may represent novel or rapidly evolving sequences. Extraction of the coding sequences resulted in a total of 3371 putative peptides. The annotated dataset is available as a hyperlinked spreadsheet. Four hundred peptides were identified with putative secretory activity, including odorant binding proteins, protease inhibitor domain-containing peptides, antigen 5 proteins, mucins, and immunity-related sequences. Quantitative RT-PCR-based analyses of a subset of putative secretory protein-encoding transcripts from accessory glands indicated changes in their abundance after one or more copulations when compared to virgin males of the same age. These changes in abundance, particularly evident after the third mating, may be related to the requirement to replenish proteins to be transferred to the female. Conclusions/Significance We have developed the first large-scale dataset for novel studies on functions and processes associated with the reproductive biology of Ceratitis capitata. The identified genes may help study genome evolution, in light of the high adaptive potential of the medfly. 
In addition, studies of male recovery dynamics in terms of accessory gland gene expression profiles and correlated remating inhibition mechanisms may permit the improvement of pest management approaches.


Introduction
In insects, the seminal fluid that conveys the spermatozoa is a complex mixture of proteins, inorganic solutes, carbohydrates and lipids produced in the male reproductive tract and transferred to the female during mating [1][2][3][4][5][6][7]. In many species these components are responsible for a complex set of physiological and behavioural changes in the female, including a reduction of receptivity to remating, increased ovulation and egg-laying, and variations in feeding activities [3,7,8]. The molecular and physiological functions of these substances have been most extensively investigated in Drosophila melanogaster [9], and particular attention has been given to seminal fluid proteins (SFPs) produced in the male accessory glands (MAGs), due to their involvement in the modulation of female post-mating responses. These accessory gland proteins (Acps) consist mainly of putative proteolysis regulators (proteases and protease inhibitors), lipid modifiers (lipases), sperm-binding candidates (Cysteine-Rich Secretory Proteins, CRISPs), antioxidants, carbohydrate-binding proteins (lectins), and many other small peptides and prohormones [10,11].
In recent years, with the aid of next generation sequencing technologies and proteomic approaches, comprehensive studies aimed at the identification and analyses of seminal fluid proteins have been initiated in some insect species, such as the beetle Tribolium castaneum [12,13], Heliconius butterflies [14], the honeybee Apis mellifera [15], the ant Leptothorax gredleri [16], the sandfly Lutzomyia longipalpis [17], the mosquitoes Aedes aegypti [6,18] and Anopheles gambiae [19], and other Drosophila species [11,20]. This is not the case for the Mediterranean fruit fly (medfly), Ceratitis capitata. The medfly is a tephritid pest with a worldwide geographical distribution and a history of rapid and devastating outbreaks [21,22]. This species is the most thoroughly studied fruit fly pest at the genetic and molecular levels and has become a model for analysis of insect invasions. A recent gene discovery project provided the first major dataset of over 9,600 putative transcripts expressed in the embryo and the adult head [23]. Nevertheless, these sequences represent only a portion of the medfly transcriptome and, given the absence of a sequenced genome, this gap of knowledge is a barrier to rapid progress in every field of medfly biology.
Although several aspects of medfly reproductive biology have been widely investigated and much is known about its demography, population genetics, ecology and physiology [22,24,25], systematic studies of gene expression are scarce [23,26,27,28] and characterization of seminal fluid proteins at the molecular level is still patchy [29]. Several studies suggest that the medfly possesses biologically active Acps, since males deprived of testes are still able to reduce female receptivity to further matings [30][31][32]. Moreover, it has been shown that virgin females injected with male accessory gland extracts, like mated females, shift their attention to fruit and increase their oviposition rate [33]. Other studies suggest that the medfly may possess a homologue of the most significant Drosophila Acp, Acp70 [34], also known as 'sex peptide' due to its ability to stimulate both short and long-term post-mating responses [35,36]. In Drosophila, Acp70 is able to bind sperm, which functions as a sex peptide carrier [37].
The medfly male accessory glands comprise two long tubular mesodermally-derived glands, and a set of short ectodermal glands [38]. These two gland types have been shown to produce different secretions, namely lipids, polysaccharides and proteins in the long tubular, and mainly proteinaceous secretions in the short glands. Mating induces a progressive reduction in the amount of proteins in the glands, suggesting their transfer to the female during copulation [38].
Molecular and functional information on the proteins secreted in the medfly male reproductive apparatus is limited. Several genes expressed in the male accessory glands of the medfly have been identified [29], but only modest progress has been made towards an exhaustive screening of SFPs and analyses of their functions. Often, the genes encoding SFPs are difficult to identify due to their rapid evolution. Indeed, in D. melanogaster, the evolutionary rate of many of these genes is so fast that they lack detectable orthologues even in other Drosophila species [2,39,40,41,42]. These SFPs are involved in the establishment of barriers to fertilization that can lead to speciation since they contribute to sperm activation, gamete interactions and ovulation. Post-copulatory competition for egg fertilization may lead to fast co-evolution between seminal proteins and proteins encoded within the female reproductive tract [43]. Independent episodes of such rapid co-evolution (for example in allopatric populations) could result in reproductive divergence and eventually lead to speciation [44][45][46].
The identification of genes potentially involved in spermatogenesis and/or sperm-egg interactions will constitute an additional important advancement for understanding the reproductive biology of the medfly. Among insects, the testis transcriptome has been studied in detail only in D. melanogaster [47][48][49][50], Bombyx mori [51], and A. gambiae [52]. In the medfly, information about gene expression in the testes is almost completely lacking, the only exceptions being the studies on the testis-specific β2-tubulin gene [53] and partial data on the spermatogenesis and fertilization processes [54][55][56].
Here we describe the transcriptomes of the testes and accessory glands from adult male medfly to advance our understanding of the male reproductive system and the transcriptional profiles and putative roles of secreted candidates that may play a role in the regulation of female reproductive processes.

Medfly samples
A long-established strain, ISPRA, which has been maintained at the University of Pavia since 1979, was used to create the cDNA library. Standard larval and rearing methods were used [57]. To obtain male adults for the testes and accessory glands (TAG) library, a standard laboratory rearing cage was set up with about 600 male and female adults less than one day old. Males were removed from the cage at intervals of 24 h for 8 days and the testes and MAGs were dissected in sterile PBS-DEPC (Figure 1). The dissected material was immediately immersed in RNA-later (Ambion) solution on ice and stored at −80 °C until required. Thus the males used in the library construction covered a range of ages and consisted of immature (1-2 days post emergence), virgin and mated individuals. For the cDNA library, total RNA was extracted from the testes and MAGs from each collection using Trizol (Invitrogen), followed by treatment with DNase (DNAfree, Ambion). An equal quantity of total RNA was pooled from each of the daily extractions prior to poly(A)+ RNA purification. First-strand cDNA synthesis was primed with an oligo(dT) containing a NotI restriction site. The double-stranded cDNA was ligated to an EcoRI adaptor, digested with NotI, and cloned directionally into a NotI- and EcoRI-digested pT7T3-Pac phagemid vector [58]. The cDNA inserts were flanked by a library-specific 3′ linker tag sequence (5′-NotI-TTGGCGGCGG-3′) and a 5′ linker (5′-EcoRI-GGCACGAGG-3′). The library was normalized [58]. Randomly selected clones were sequenced from the 5′ end using the M13 reverse sequencing primer (5′-AGCGGATAACAATTTCACACAGGA-3′) with an Applied Biosystems 3730 DNA analyzer. Base-calling and low quality sequence trimming were performed using Phred [59], and vector sequences were trimmed using Cross-match (http://www.sanger.ac.uk/software/). Repeat sequences were masked using RepeatMasker (http://www.repeatmasker.org).
The EST sequences have been deposited in GenBank dbEST database (accession numbers: JK832450-JK838363).

Bioinformatics analyses
The sequences were assembled and annotated using the dCAS pipeline [60,61]. The reads were clustered using BLAST [62], assembled into contigs using CAP3 [63] and annotated by searches against other databases. The results of these analyses were then piped into a hyperlinked Excel report, as described in the dCAS software tool [60].
The coding sequences (CDS) that were equal to or longer than 40 amino acids (aa) were extracted according to two criteria: i) matches to proteins in the NCBI nr database; ii) the largest ORF coincided with the same reading frame as a predicted signal peptide. Functional annotations of the transcripts were performed using the program Classifier (Ribeiro, unpublished), which combines the output of several tools: BLASTX [64] to compare the nucleotide sequences to the NCBI nr protein database and Swissprot, and rpsblast [64] to search for conserved protein domains in Pfam [65], SMART [66], KOG [67], Conserved Domains Databases (CDD) [68] and GO databases [69]. Transcripts were also compared to mitochondrial and rRNA nucleotide sequences in NCBI. Transcripts that shared no significant sequence similarity using BLASTX were further analyzed using the more sensitive and rigorous FASTY program that uses the Smith-Waterman alignment algorithm [70] (http://fasta.bioch.virginia.edu/fasta_www2/fasta_list2.shtml). Segments of six-frame translations of transcripts with a methionine within the first 100 predicted amino acids were submitted to the SignalP server [71] to identify translation products that could be secreted. The SignalP analysis categorized the peptides as follows: SIG, predicted signal peptide by either the Neural Network or Hidden Markov Model programs; CYT, predicted cytoplasmic protein; BL, borderline result, where the probability of the predicted signal peptide was just below the acceptable threshold. In addition, TargetP predictions of nuclear and mitochondrial protein destinations were considered [72]. Since the signal peptide is often recognized as a membrane helix, TMHMM searches [73] were also performed, and translations with a predicted single helix within residue 30, which could correspond to the signal peptide membrane region, were also considered as target transcripts for further analyses. O-glycosylation sites on the proteins were predicted with the program NetOGlyc [74]. Gene ontology analysis was performed using Blast2GO (searches with an e-value < 10⁻⁶) [75]. The tissue specificity of Drosophila orthologues of the identified transcripts was determined using the Flyatlas [76] expression database.
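The length filter applied to candidate coding sequences can be illustrated with a minimal Python sketch. It is not the dCAS or Classifier implementation: it only translates a nucleotide sequence in six frames with the standard genetic code and keeps the longest methionine-initiated open reading frame of at least 40 residues. The function names, the use of the standard code, and the choice of the first methionine in each stop-free segment are assumptions made for illustration; the similarity and signal-peptide criteria described above are not modelled.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative codon table is needed for nuclear transcripts).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def six_frame_translations(seq):
    """Translate a DNA sequence in the three forward and three reverse frames."""
    seq = seq.upper()
    reverse = seq.translate(COMPLEMENT)[::-1]
    frames = []
    for strand in (seq, reverse):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            # Codons with ambiguous bases are translated as 'X'.
            frames.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return frames


def longest_orf(seq, min_len=40):
    """Return the longest Met-initiated ORF of at least min_len residues, or ''."""
    best = ""
    for protein in six_frame_translations(seq):
        for segment in protein.split("*"):  # '*' marks stop codons
            start = segment.find("M")
            if start == -1:
                continue
            candidate = segment[start:]
            if len(candidate) >= min_len and len(candidate) > len(best):
                best = candidate
    return best


if __name__ == "__main__":
    # Hypothetical toy transcript: ATG, 45 alanine codons, stop codon.
    toy = "ATG" + "GCT" * 45 + "TAA"
    print(len(longest_orf(toy)))
```

In practice the retained peptides would then be passed to external predictors such as SignalP; that step is outside the scope of this sketch.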

Reverse Transcriptase-PCR on adult male individuals
The expression patterns of transcripts putatively encoding secreted proteins were determined by reverse transcription PCRs (RT-PCRs) using specific primers (Table S1) on cDNA derived from different body compartments. Four-day-old virgin male and female medflies were separately dissected in PBS-DEPC onto glass slides and RNA was extracted from the following male tissues: 1) heads, 2) thoraces, 3) testes, 4) accessory glands (both long tubular and short claviform) and 5) abdomen without testes and MAGs, and the following female tissues: 6) heads, 7) thoraces and 8) abdomens. Extractions were performed using Trizol (Invitrogen), followed by treatment with DNase (Ambion), on pools of 50 individuals for each of the eight tissues. cDNA was synthesized from 200 ng RNA from each of the eight pools using the Cloned AMV First Strand Synthesis kit (Invitrogen) and one microlitre of the resulting first-strand cDNA was then used as template for RT-PCR experiments. RT-PCRs were performed with the following thermal conditions: 2 min at 95 °C, 25 cycles of 30 s at 95 °C, 45 s at 59 °C, 45 s at 72 °C, and 10 min at 72 °C. PCR products were analyzed on a 1% agarose gel.

Real-Time quantitative PCR on virgin and mated males
Expression profiles were derived for candidate transcripts i) putatively encoding secreted proteins and ii) with specific expression, or significant over-expression, in male accessory glands. Expression levels of each transcript were tested in the abdomen of mature males, comparing virgin and mated individuals of the same age to detect differences potentially induced by mating. The abdomen was used rather than the accessory glands and testes in these assays to avoid time-intensive dissections, thus permitting the assessment of transcript abundance at precise time intervals.
Virgin medfly individuals from the ISPRA strain were sexed after chilling at emergence and were used in mating assays on the fourth day after eclosion. All flies were maintained at 24 °C with a 12:12 h light/dark cycle on yeast-glucose medium before use in the experiments. For each of the three experimental replicates performed, 30 small cages (5 × 5 × 13 cm) were used, each containing one male and ten females, provided with food and water. Mating pair formation was observed from 08:00 to 20:00 h, thus covering the complete light period. Mating duration was recorded and only copulations longer than 100 min were considered to avoid false matings, i.e. those with little or no sperm transfer [77]. After a couple had separated, the male was either gently removed from the vial by aspiration or left inside and allowed to remate. Males were assigned to seven mating status groups according to the number of matings they had achieved and a recovery time after mating/remating of 0, 6 or 12 h (Figure S1). For example, the 2 M+6 h group consisted of males that had mated twice and had been allowed to recover for six hours after the termination of the last mating. For each mating status group, ten mated males were dissected and their abdomens immediately transferred to microcentrifuge tubes with 200 µl of Trizol and frozen at −80 °C. As a control, a cage was set up containing about 100 virgin males of the same age as those used for mating assays. Contemporary with each mated group, an equal number of unmated male abdomens from the control cage were collected in the same manner.
Extractions were performed using Trizol, followed by DNase treatment, on 10 pooled individuals for each of the seven mating status groups and for the corresponding seven unmated control groups. cDNA was synthesized from 200 ng RNA from each of the 14 pools, as described for the RT-PCR analyses. Primer pairs were designed to obtain amplification products of 100-250 bp, using Beacon Designer 7 (Premier Biosoft International). The expression level of each transcript was compared between mated and virgin males of the same age at each of the seven time points. The virgin male samples were taken as calibrators in order to assess the relative fold-change after mating. Two medfly housekeeping genes were used for relative quantification normalization: CcActin (GenBank acc. n. FG081771.1) and CcRpL13A (GenBank acc. n. FG085984.1) [78,79]. Real-Time qPCR was performed using SsoFast™ EvaGreen® Supermix (Biorad). Cycling parameters were: 3 min at 95 °C, 40 cycles of 10 s at 95 °C, 30 s at 57 °C and 30 s at 68 °C, and 10 min at 72 °C. A fluorescence reading was made at the end of each extension step. Three replicates were performed and the specificity of the amplification product was determined by melt-curve analysis. PCR efficiencies were above 95% for all primer pairs. Relative quantification was performed using MJ Opticon Monitor™ Analysis Software 3.1.32 (Biorad). Data were analyzed using the Student's t-test. Results were expressed as mean ± standard error and a P value of <0.05 was considered statistically significant. In addition, a multivariate linear regression was applied to evaluate the time-dependency of gene transcription and to assess the impact of individual genes. The dependent variable was the log2 of the ratio of transcript abundance in mated and virgin samples. The independent predictors were the different time-points and the genes. As the data consisted of the means of replicated experiments, we used a weighted least squares fit procedure [80,81], in which the weights were taken as the log2 of the ratio between the transcript abundance of the mated samples and the corresponding standard error.
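The relative quantification step can be illustrated with a short Python sketch of the widely used 2^(-ΔΔCt) calculation, normalising the target transcript against the mean Ct of two reference genes and using the age-matched virgin sample as calibrator. This is an assumption made for illustration: the analysis described here was run in the Opticon Monitor software, whose exact algorithm is not stated, and the sketch assumes 100% amplification efficiency. The function name and all Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(target_mated, refs_mated, target_virgin, refs_virgin):
    """Return (fold_change, log2_ratio) of mated versus virgin samples.

    Assumes the 2^(-ddCt) model with 100% PCR efficiency; averaging the Ct
    values of the reference genes is equivalent to taking the geometric
    mean of their quantities.
    """
    d_ct_mated = target_mated - mean(refs_mated)
    d_ct_virgin = target_virgin - mean(refs_virgin)
    dd_ct = d_ct_mated - d_ct_virgin
    log2_ratio = -dd_ct  # dependent variable of the regression described above
    return 2.0 ** log2_ratio, log2_ratio


if __name__ == "__main__":
    # Hypothetical Ct values: target transcript, then two reference genes.
    fold, log2_ratio = relative_expression(
        target_mated=24.0, refs_mated=[18.0, 20.0],
        target_virgin=25.0, refs_virgin=[18.0, 20.0],
    )
    print(fold, log2_ratio)
```

A positive log2 ratio indicates higher transcript abundance in mated males than in virgins of the same age, and a negative value indicates lower abundance.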

Generation, assembly and functional annotation of the medfly testes and accessory gland ESTs
A unidirectional, normalized cDNA library was constructed from the testes and accessory glands (TAG) of immature, virgin and mated males ranging from less than one to eight days of age.
A total of 8448 cDNA clones were sequenced directionally from the 5′ end. These reads, once trimmed of vector, contaminants and low quality sequences, resulted in a total of 5914 high-quality sequence reads of average length 621 bp, amounting to almost four megabases of sequence. The ESTs were assembled into 3344 contiguous sequences (contigs, each with the prefix TAG), the highest number of ESTs in a single contig being 75. Nine hundred and fifty-one contigs contained at least two ESTs, while 2393 contigs were represented by a single read (singleton). The distribution of the ESTs in the contigs is illustrated in Table 1.
Over 66% (2223) of the total number of assembled TAG sequences (3344) produced best hits against the nr database with an expectation (e) of < 10⁻⁶ using BLASTX within Blast2GO. The vast majority (97%) of the best hits were arthropod-derived sequences, with Drosophila being the most common (75% of all hits), and only 1% (38) of the best hits were against known C. capitata sequences. Ten of the 13 putative MAG-specific transcripts previously identified in the medfly [29] were detected in the library. The remaining 33.5% (1121) of the assembled sequences produced no hits against the nr database and may putatively represent novel or rapidly evolving TAG sequences. Of the 2223 assembled sequences that gave significant hits, 1588 (71.4%) were categorized into different gene ontology (GO) classes according to biological process and molecular function (Figure 2, Figure 3).
Within the Biological Processes ontology, metabolic, cellular, and developmental process, and cellular component organization were the most represented terms. In addition, reproduction, biological regulation and response to stimulus terms were also abundant. In the Molecular Function ontology, the most abundant terms were binding and catalytic activities. Within the catalytic activity category, the hydrolase and transferase functions were the most frequent (Figure S2, Figure S3).

Identification of the most abundant transcripts in the TAG transcriptome
As an initial description of the TAG transcriptome, we considered the 100 most abundant transcripts, i.e. those derived from the highest number of EST reads. This set included contigs derived from eight up to 75 reads. Ribosomal genes accounted for 35 of the 100 abundant sequences and the remaining 65 transcripts are reported in Table S2. Eight transcripts shared significant similarity to the medfly odorant-binding-protein-like, male specific serum polypeptides (MSSPs), and two (TAG1563 and TAG1565) to the Drosophila odorant-binding protein 56d. Moreover, the transcripts TAG1863 and TAG846 were similar to cytochrome c oxidase genes from the medfly and Drosophila respectively, while TAG640, TAG875 and TAG1003 shared similarity to the Drosophila cytoskeletal gene alpha-tubulin 84B, the fertility-related gene exuperantia, and a protease inhibitor gene of the grey fleshfly Sarcophaga bullata, respectively. By contrast, fifteen of the most abundant transcripts shared no sequence similarity to genes present in the GenBank database.

Identification of putative peptides with secretory activity in the TAG transcriptome
Extraction of the coding sequences of the TAG transcripts and their conceptual translation products resulted in a total of 3371 putative peptides with a mean length of 167 residues. The number of peptide sequences is greater than the number of assembled sequences (3344), as several contigs yielded two very similar peptides that differed only in length. This redundancy in the dataset was maintained, as it was difficult to identify the correct peptide sequence when multiple methionine residues were present as start codons (Spreadsheet S1). The dCAS pipeline permitted the functional classification of the predicted peptides as shown in Table 2. In particular, we identified 2160 peptides with putative housekeeping functions (Table S3) and 400 with putative secretory activity (12% of total number of predicted peptides). These 400 peptides correspond to 304 TAG transcripts and include known gene families, such as odorant binding proteins (obps), protease inhibitor domain-containing peptides, antigen 5 proteins, mucins, and immunity-related sequences. However, 67% of the peptides classified as being associated with secretory function displayed no significant similarity to known proteins (Table 3). After excluding transcripts that were clearly not directly involved in reproduction, such as ribosomal and mitochondrial sequences, 206 transcripts that encoded putative secreted proteins (Spreadsheet S2) remained, and these were considered for expression profile analyses related to their tissue-specificity and mating-responsiveness. Among these were seven of the putative MAG-specific transcripts previously identified in the medfly [29].

Tissue-specific expression of TAG transcripts encoding putative secreted peptides
The 206 candidate transcripts with putative secretory signals were assayed for tissue-specific expression patterns via RT-PCR using total RNA from different adult body compartments from males (heads, thoraces, testes, accessory glands and abdomen carcass, i.e. without testes and MAGs), and from females (heads, thoraces and abdomens). Eighty-three transcripts were detected in both sexes in all the tissues examined, 88 were detected in different tissue patterns in both sexes, whilst 35 were found to have a male-specific transcriptional profile (Figure 4, Table S4). Of these transcripts, 26 were specific to the testes and one to the MAGs (TAG40) (Table 4, Figure 4). The remaining eight male-specific transcripts displayed varying tissue distributions, with four expressed in the testes and MAGs (TAG1019, TAG1693, TAG2960, TAG3266), three in the testes, MAGs and abdomen carcass (TAG1692, TAG1695, TAG3261) and one in the testes, head and abdomen carcass (TAG2907). Interestingly, two additional transcripts, TAG1523 and TAG3324, despite being detected also in female tissues, displayed high transcriptional levels in the MAGs.
The proportion of male-specific transcripts that lacked significant similarity to known protein sequences (21 of 35) was significantly greater than the proportion without hits among the non-male-specific transcripts (66 of 171) (Fisher's exact test, two-tailed P = 0.0242).
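The comparison above corresponds to a 2 × 2 contingency table (male-specific: 21 without hits, 14 with hits; non-male-specific: 66 without hits, 105 with hits). A minimal Python sketch of a two-tailed Fisher's exact test on that table follows. It assumes the common convention of summing the probabilities of all tables that are no more likely than the observed one; the software and convention actually used for the reported value are not stated, so the result should be close to, but is not guaranteed to match exactly, the published P value. The function name is illustrative.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        # Probability of x counts in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Counts taken from the text: no-hit vs hit, male-specific vs other.
    print(fisher_exact_two_tailed(21, 14, 66, 105))
```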
Fourteen of the 35 male-specific transcripts encode putative proteins that share significant similarity to known protein sequences. These include ten testis-specific transcripts: TAG207 (similar to the Drosophila antigen 5-related 2 and with a sperm coating protein-like domain), TAG289 (similar to CG14840), TAG857 (similar to the sperm leucyl aminopeptidase 4 and a predicted M17 family aminopeptidase), TAG1111 (similar to Anopheles darlingi B6DE45), TAG1117 (similar to CG12377), TAG1555 (similar to GA28049), TAG1602 (similar to CG11286), TAG2860 (similar to CG13043), TAG3024 (similar to CG5217) and TAG3102 (similar to CG12680) (Table 4). The MAG-specific transcript, TAG40, classified as a pancreatic lipase-like protein, is similar to CG5162. The two MAG and testes-enriched transcripts TAG1019 and TAG2960 are similar to CG10853 and a trypsin-like serine protease, respectively. One transcript detected in the MAGs, testes and abdomen carcass, TAG3261, shared similarity to a haemolymph juvenile hormone binding protein. Four of the seven sequences that corresponded to the putative MAG-specific transcripts previously identified [29] were found to be male-specific, of which one (TAG40/DQ406806) was MAG-specific.

Mating-induced transcriptional changes associated with genes encoding putative secreted proteins
Transcripts from ten genes with MAG-specific or -enriched expression profiles were analysed by Real-Time quantitative PCR to assess changes in their abundance after one, two or three matings. These included eight male-specific transcripts and two transcripts (TAG1523, TAG3324) with high abundance in the MAGs, but also present in other male and female tissues. The expression data were normalized relative to control virgin males of the same age similarly analysed at the 0, 6 and 12 h time points. Transcript TAG1019 showed very low abundance in all the treatments tested (CT > 35) and was not further considered. The analysed genes showed very different and complex transcriptional profiles in response to one or more matings, suggesting potentially different functional roles (Figure 5). The two MAG-preferential transcripts, the putative pancreatic lipase TAG40 and the antigen 5-related TAG1523, showed a significant increase in abundance only after the third mating. Interestingly, the hypothetical secreted protein precursor TAG3324 displayed a significant decrease in abundance 6 h after the first mating, but its transcriptional activity subsequently increased. No significant differences relative to the virgins were detected for any of the three time points following the second mating. The third mating, on the contrary, induced strong transcriptional activity. A similar decrease in transcript abundance after the first mating was detected for the hypothetical secreted protein, TAG3266, but its expression increased after the second mating, immediately post copula. However, its transcript levels significantly decreased 6 h after the second mating, as well as after the third mating. The first and second matings did not impact the transcriptional activity of the possible mucin TAG1693, while the third copula resulted in a reduction of transcript abundance.
Concerning the putative trypsin-like serine protease TAG2960, transcript abundance was significantly higher in mated males relative to virgins immediately after each mating, with a decrease observed only 12 h after the second mating. The putative haemolymph juvenile hormone binding protein (TAG3261) showed significant reductions in transcript abundance 6 h after the first mating and 12 h after the second. Similarly, the possible mucin TAG1692 displayed decreases in abundance 12 h after both the first and the second copulation. Finally, the other possible mucin TAG1695 showed no significant changes in transcript abundance between virgin and mated males at all the time points analysed.
To evaluate the influence of the number of matings and recovery time on the transcriptional dynamics of the different genes, we applied a multivariate linear regression. This analysis revealed a statistically significant increase in transcriptional activity only immediately after the third mating (regression coefficient = 0.84, standard error = 0.35, P = 0.02). Furthermore, the gene TAG2960 had a statistically significant impact on the global gene transcript levels (regression coefficient = 0.92, SE = 0.41, P = 0.03).

Discussion
Here we describe the first large-scale transcriptome dataset of 3344 sequences from the testes and male accessory glands of the medfly. What is more, it also represents the first transcript resource from these reproductive tissues in any tephritid species. This resource will be of vital importance not only for unravelling the reproductive processes of the medfly, but also for comparative evolutionary analyses of the reproductive molecular machinery between tephritid species and other more distant insect taxa. Within this transcriptome dataset, we have identified a subset of transcripts that, on the basis of their tissue-specificity, may encode medfly seminal fluid proteins. We described the transcriptional profiles of nine of these genes that showed mating-induced changes in abundance, most probably related to replenishment of their protein products after multiple matings.

TAG transcriptome: conserved, novel and fast-evolving genes
The transcriptome dataset is derived from two organs, the testes and the accessory glands, that participate in the maintenance of complementary reproductive functions of the medfly male. On the basis of what is known from other insect taxa, we may suppose that, in the medfly testes, key regulatory genes of spermatogenesis tend to be conserved to guarantee the male-specific processes required for sperm production [82,83]. The accessory gland secretions may act as key factors in male reproductive success and, as such, the Acp-encoding genes are subject to rapid evolution as a result of sexual conflict and competition [84]. Studies on numerous organisms from very distant taxa have shown that among the most rapidly evolving genes are those expressed in the reproductive tissues, particularly in males [85][86][87][88][89][90][91]. In the medfly, out of the 3344 unique transcripts we identified in the TAG library over a third shared no significant similarities to known genes and may be novel and/or fast-evolving sequences. They may have a potential role in sexual selection and speciation, and represent ideal subjects for future evolutionary genetic studies for this species. However, as this is the first extensive resource from the male reproductive tract in any tephritid species, many may be tephritid-specific rather than medfly-specific genes.

Medfly testes and accessory glands express highly conserved genes that are critical for male reproduction across several insect species
Both testes and male accessory glands are sites of rapid cell proliferation and secretory activity, as supported by the categorization of transcripts in functional classes related to biological regulation, metabolic, developmental and cellular processes.
Several chemosensory-related transcripts are present in the TAG transcriptome. Ten putative peptides shared highly significant similarity to D. melanogaster Obp99c and also to the male specific serum polypeptides (MSSPs) previously characterized in the medfly [92,93]. The MSSPs belong to the Minus-C subfamily of odorant binding proteins and are thought to be involved in the transport of volatile substances or other hydrophobic molecules [92,93]. Two other transcripts (TAG1563 and TAG1565) showed high similarity to Drosophila Obp56d. Recent studies have reported the expression of obps in the male accessory glands and testes of Drosophila [76,88][94][95][96] and in the MAGs of the mosquito A. aegypti [6] and of T. castaneum [13]. The presence of obps in these tissues is not unexpected, as in addition to their function in olfaction, these proteins may also act as carriers for physiologically active ligands, such as hormones, that are transferred from the male to the female during copulation [96,97].
Other highly abundant transcripts fall into categories involved in protein synthesis. The high representation of ribosomal transcripts is further indicative of the intense metabolic activity related to spermatogenesis, sperm membrane activity, and energy production and utilization. These functional categories are all consistent with the features of the sperm cell, which is rich in mitochondria and has a microtubule-based axoneme. Examples are the highly abundant cytochrome c oxidase transcripts, which may be involved in the energy production necessary for sperm locomotion, and the myosin light chain 2 orthologue, TAG698, which may cooperate in muscular activity. TAG875 is very similar to the Drosophila gene exuperantia, which encodes a germ-line restricted RNA-binding protein essential for fertility [98] and has been identified as one of the targets of transformer-2 [99].
Another highly abundant transcript, TAG1003, shares similarity with a protease inhibitor and contains a Kunitz domain. The enzymatic reaction between a protease and its inhibitor is characterized by the formation of a pseudo-irreversible inhibitor-protease complex [100]. The protease inhibitor activity of some Drosophila Acps [101] and the homology of others to proteases [2,10] suggest that some Acps play a role in the processing and activation of other Acps [102][103][104]. Drosophila seminal fluid also contains regulators of proteolysis, one being the trypsin inhibitor Acp62F that has been shown to enter the female's sperm storage organs to play a key role in protecting sperm from degradation [105]. During storage in the female organs, proteolysis of sperm surface proteins could impair the sperm's egg-binding ability. Alternatively, regulated proteolysis of the surface of stored sperm could be essential to activate or capacitate them. Indeed, in mice, mutations in seminal fluid protease inhibitors impair fertility, consistent with the hypothesis that protease inhibitors protect sperm [105]. Serpins (SERine Protease Inhibitors) have been shown to have a role in male fertility in many mammalian species [106]. In addition, data from Caenorhabditis elegans demonstrated the essential role of predicted secreted serine protease inhibitors for the regulation of sperm activation [107,108]. This widespread presence of protease inhibitors across distant species supports the hypothesis of an important role of such molecules also in the medfly.
The best hit of TAG3302, glutathione S-transferase, is a predicted intracellular or membrane-bound protein [18]. Predicted intracellular proteins also have been reported in the seminal fluid of other organisms, such as D. melanogaster [109], bed bugs [110], honey bees [15], and humans [111]. In some species, including A. mellifera and A. aegypti, it has been suggested that these proteins may be secreted through non-standard secretion routes [15,18], such as apocrine or holocrine secretion [112,113]. For example, within the reproductive tract of mated A. aegypti females, the ejaculate contains vacuoles which grow after mating, and disappear within 24 h post copula [114]. Subunits of the membrane-bound proton ATPase thought to be part of these vacuoles have been identified as Acps [18]. Moreover, macroapocrine secretion has been reported in the medfly and the olive fly Bactrocera oleae [38,115].
Elongation factors have also been identified in insect seminal fluids [110,116]. For example, elongation factor 1alpha has been shown to be involved in protein synthesis, regulation of apoptosis and interaction with actin and ubiquitin-dependent proteolysis.

Figure 5. Differential transcript levels (Log2 transformed fold changes) of nine putative secreted peptides in mated medfly males, compared to virgin individuals of the same age. Transcript abundances were determined at seven different time-points (0, 6 and 12 h after the first and the second mating respectively, and 0 h after the third copula) compared to virgin males of the same age. Stars indicate significant differences in transcript abundances (*P<0.05, **P<0.01, ***P<0.001, two-tailed t-test on three replicates) in the pairwise comparison between mated and virgin males. doi:10.1371/journal.pone.0046812.g005

The putative secreted TAG peptides belong to a series of functional classes characteristic of known seminal fluid proteins
Our expression analyses in different male and female adult tissues revealed that a third of the putative secreted peptides with testis-specific and/or MAG-enriched expression profiles belong to highly conserved Acp classes [117]. The medfly TAG putative secreted peptides belong to the CAP superfamily (Cysteine-Rich Secretory Proteins/Antigen 5/Pathogenesis-Related 1 Proteins), or are predicted to be mucins, proteases or lipases.
The remainder of our putative secreted peptides lack identifiable orthologues and this may reflect the pattern of rapid evolution described for a significant portion of Acps [7].

CAP (CRISP/Antigen5/PR-1) superfamily
Proteins of the CAP family, also known as the sperm-coating glycoprotein (SCP) family, are secreted and several are known to be involved in male fertility and fertilization [118]. A number have been implicated in sperm chemo-attraction and sperm-egg fusion [119]. In Drosophila, the interaction of at least four SFPs, including the CRISP CG17575, is required for the binding of Acp70A to the sperm and to ensure their localisation in the female seminal receptacle [120]. Two medfly transcripts (TAG207 and TAG1523) are predicted members of the CAP superfamily, as they contain SCP domains and share sequence similarity to Ag5-related proteins [121]. TAG207 transcription was limited to the testes, whereas TAG1523 was abundant in the MAGs but also present in the heads of both sexes. Ag5 proteins were first identified within the venom of fire ants and wasps [122], and subsequently in the Drosophila midgut [123] and the saliva of blood-feeding insects [124][125][126].

Mucins
Four medfly putative TAG peptides shared similarities with proteins of the Mucin family, a group of large glycosylated macromolecules capable of forming enormous networks that act as selective barriers [127]. In Drosophila, mucins are expressed not only in the digestive tract, but also in the salivary glands and in the developing embryo, where they may contribute to the shaping of non-chitin-producing organs by providing a luminal scaffold during their development [127]. Additionally, mucins have been shown to participate, together with other proteins and lipids, in the formation of mating plugs, often produced within the female reproductive tract during or shortly after mating [7]. Mating plugs, which are present in a wide variety of organisms from insects to mammals, have been proposed to serve three main functions: to prevent remating, either as a physical barrier or by releasing chemical cues that prevent female remating; to favour sperm storage or to prevent sperm loss from the female reproductive tract; or to act as a visible signal of female mating status [128][129][130][131]. The medfly does not produce a plug, but mucins may have a sperm protection function, or may have a role in the differentiation and renewal of the epithelium and modulation of cell adhesion, immune response, and cell signalling [132,133].

Proteases
Proteases are present in the seminal fluid of many insect species and have been implicated in various aspects of male reproduction [7]. Seminal proteases could cleave inactive molecules into their active form, as in D. melanogaster Acp26Aa [104], or they could participate in the digestion or breakdown of mating plugs [129,134]. Their presence in the transcriptomes of the reproductive tracts of both sexes is thus expected. Transcript TAG2960, which corresponds to the previously identified medfly MAG sequence DQ406805 [29], appears to encode a trypsin-like serine protease. This suggests that it is involved in proteolysis and, once secreted, it could be active within the male or female reproductive tract. In addition, TAG857 is a predicted aminopeptidase of the M17 family. In Drosophila, this family of leucyl aminopeptidases comprises sperm leucyl aminopeptidases (Sperm-LAPs, S-LAP) [135] that are specifically expressed in the testis and encode proteins incorporated in mature sperm [136][137][138].

Lipases
High levels of lipase activity have previously been detected in male accessory glands of Drosophila [2,10], Phlebotomus papatasi [139], and the medfly [29]. TAG40, which corresponds to the previously identified medfly MAG-specific sequence DQ406806 [29], shows similarity to lipases, which are postulated to provide energy to sperm by hydrolysis of triglycerides [109]. These lipases, once transferred to the female, may contribute nutritional resources, modify the chemistry of her reproductive tract to favour sperm and/or Acp function, or alter the sperm membrane to facilitate fertilization [2]. Carboxylesterase genes identified in the MAGs of A. gambiae and in L. longipalpis are homologues of Drosophila EST-6 [17,19]. EST-6 in Drosophila is transferred to the female in the seminal fluid and influences oviposition behaviour and female receptivity to remating [140].

Mating induces transcriptional changes in genes putatively encoding seminal fluid proteins
Five transcripts (TAG3266, TAG40, TAG1523, TAG2960 and TAG3324) showed significant changes in abundance after mating. Only TAG2960 appears to be a direct mating-responsive gene: it displays three distinct waves of transcriptional activity, immediately after each of the three matings. Due to its over-expression in the MAGs and its high similarity to a serine protease, it is an interesting medfly accessory gland protein candidate. In most of the other transcripts, the significant increase in abundance occurred after the third mating (examples are the MAG-specific TAG40 and TAG1523), suggesting that the mRNA (and perhaps the peptide) were not totally depleted after the first two copulations.
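The kind of comparison behind these significance calls (a Log2 fold change of mated versus virgin males, tested with a two-tailed t-test on three replicates) can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function names, the pooled-variance form of the t-test, the use of tabulated critical values for 4 degrees of freedom, and the replicate values are all assumptions made for illustration.

```python
import math
import statistics

# Two-tailed critical values of Student's t for 4 degrees of freedom
# (two groups of three replicates), paired with the star labels used in the text.
T_CRIT_DF4 = [(8.610, "***"), (4.604, "**"), (2.776, "*")]


def log2_fold_change(mated, virgin):
    """Log2 ratio of mean relative abundance in mated versus virgin males."""
    return math.log2(statistics.mean(mated) / statistics.mean(virgin))


def student_t(a, b):
    """Pooled-variance two-sample t statistic (an assumed test variant)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def significance_stars(t):
    """Map |t| to a star label; valid only for 3 vs 3 replicates (df = 4)."""
    for critical, label in T_CRIT_DF4:
        if abs(t) >= critical:
            return label
    return ""


# Hypothetical relative abundances for one transcript at one time point.
mated = [4.1, 3.9, 4.0]
virgin = [1.0, 1.1, 0.9]
t_stat = student_t(mated, virgin)
print(log2_fold_change(mated, virgin), t_stat, significance_stars(t_stat))
```

A positive Log2 value indicates higher transcript abundance in mated males and a negative value indicates lower abundance; with other replicate numbers the critical values would need to be replaced.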
TAG1523, an antigen 5-related sequence, has previously been detected in the medfly MAGs [29]. It belongs to the large CAP family, involved in diverse functions, such as immune response [141] and testis and sperm development [142]. In Drosophila, several members of this family are preferentially expressed in males and some within primary spermatocytes [143]. It has been proposed that they may act by mediating interactions either between germ-line and somatic cells within the male or between the sperm and egg [118]. The mating-responsiveness of this antigen 5-related orthologue may be indicative of its involvement in gamete interactions also in the medfly.
TAG3324, together with TAG3266 (which corresponds to DQ406813 [29]), deserves further investigation due to the lack of detectable orthologues in other species. These two transcripts show significant changes in abundance at many time points. The oscillations in their abundance suggest switch points possibly related to the need to replenish transcripts encoding putative secreted proteins with an active role in the seminal fluid.
In several insects, juvenile hormone (JH) has been reported to accelerate the maturation of the male accessory glands [144][145][146][147][148][149] and is required for the renewal of secretory products depleted during mating [12,144,145,147,149,150]. In Drosophila, JH not only contributes to the regulation of the initial accumulation and re-synthesis of accessory gland products [151][152][153], but it has also been proposed to promote male courtship [153]. Juvenile hormone is generally produced and released from the corpora allata [154], but in some species, such as A. aegypti, JH is synthesized de novo in the MAGs [155]. In spite of these roles of JH in insects, very little is known about its titre, function, and regulation in the medfly [156]. In another tephritid species, the Caribbean fruit fly, Anastrepha suspensa, mated males accumulate significantly higher levels of JHIII and JHB3 in the haemolymph than virgin males [157].
Juvenile hormone binding proteins (JHBPs) act as carriers of JH from the corpora allata to its target cells, and serve as a pool of JH in the haemolymph [158]. In addition, JHBPs protect JH from degradation by non-specific hydrolases [159][160][161]. The presence of JHBPs in the MAGs was previously reported in the medfly [29] and in Drosophila [162]. Acp synthesis can be stimulated by the ectopic application of JH [151], and putative JH binding sites have been identified upstream of the transcriptional start of Acp genes, suggesting a transcriptional regulation of some Acps by JH [151,163,164]. In the medfly, the transcript abundance of haemolymph JHBP (TAG3261, which corresponds to DQ406809 [29]) decreased significantly 6 h after the first mating and 12 h after the second mating, while at all other time points the abundance remained unchanged. In light of the complex regulation of JH in insect reproductive biology, we hypothesize that these two significant drops in transcript abundance may be related to the replenishment of ejaculate components after mating. Studies on mating-induced changes in JH levels in male insects, including the medfly, are scarce, but in Drosophila it has been shown that JH levels increase after mating to stimulate Acp synthesis and to replenish the ejaculate components [165]. An increase in JH levels up-regulates the expression of juvenile hormone esterase (JHE), which, together with juvenile hormone epoxide hydrolase, hydrolyzes JH in order to regulate its levels [166,167]. Once ejaculate replenishment is complete, JHE expression would reduce JH levels. On these bases, we hypothesize that the highly significant decrease of haemolymph JHBP (TAG3261) abundance several hours after mating may be related to this regulation of JH levels.

The transcriptional activity of accessory gland genes is related to male mating behaviour
Some considerations emerge from the post-mating transcription profiles of genes that encode peptides in the male accessory glands in relation to the male mating behaviour. Our data indicate that for the majority of the considered genes there is no general increase in the transcriptional activity after each mating. However, after repeated copulations, and particularly after the third, their transcriptional profiles suggest that the depletion of their products presumably triggers transcription to replenish the proteins to be transferred. Thus, the availability of a reservoir of seminal fluid proteins sufficient for more than a single copulation is mirrored by the capacity of the male medfly to mate several times during the day [30]. The ability of the male to partition this reserve between successive females may be an efficient adaptive strategy to optimise his reproductive success. Moreover, the observation that the accessory gland proteins begin to be transferred within ten minutes after the start of copulation, long before the sperm, suggests that they may also be required to create the optimal physiological conditions in the female storage organs for the sperm [168]. As duration of copulation is not related to the quantity of sperm transferred to the female [77], the extended duration of this behaviour, which can last over five hours [77], is reminiscent of female guarding behaviour [169,170]. This male strategy would prevent the female from remating with another male before the Acps had switched her behaviour towards oviposition.
In conclusion, the very complex transcriptional profiles of several of these genes suggest that they need to be further characterised. Clarification of seminal fluid components and their regulation will have the potential to reveal novel functions and processes associated with the reproductive biology of this pest species. In addition, studies of male recovery dynamics in terms of expression profiles of Acp genes, and the correlated mechanisms of female remating inhibition, may help improve pest management approaches.

Figure S1 Schematic representation of the experimental design for Real-Time qPCR assays. Once mating was completed, individual males (once mated) were removed and analysed at designated recovery times for gene expression (0, 6 and 12 h). The remaining males that had mated were immediately allowed to remate (twice mated) and treated as above. Remaining twice mated males were immediately allowed to mate again (thrice mated) and all were sacrificed after the completion of the copula.

Spreadsheet S1 Hyperlinked Excel spreadsheet containing annotated assembled ESTs. Putative peptides are named using the prefix 'Cc-male-' rather than the prefix 'TAG' used in the text. (ZIP)

Spreadsheet S2 Excel spreadsheet containing 206 putative secreted proteins considered for expression profile analyses related to their tissue-specificity and mating-responsiveness.