A Proteomic Investigation of Soluble Olfactory Proteins in Anopheles gambiae

Odorant-binding proteins (OBPs) and chemosensory proteins (CSPs) are small soluble polypeptides that bind semiochemicals in the lymph of insect chemosensilla. In the genome of Anopheles gambiae, 66 genes encode OBPs and 8 encode CSPs. Here we monitored their expression through classical proteomics (2D gel-MS analysis) and a shotgun approach. The latter method proved much more sensitive and therefore more suitable for tiny biological samples such as mosquito antennae and eggs. Females express a larger number and higher quantities of OBPs in their antennae than males (24 vs 19). OBP9 is the most abundant in the antennae of both sexes, as well as in larvae, pupae and eggs. Of the 8 CSPs, 4 were detected in antennae, while SAP3 was the only one expressed in larvae. Our proteomic results are in fairly good agreement with RNA expression data reported in the literature, except for OBP4 and OBP5, which we could neither identify in our analysis nor detect in Western blot experiments. The relatively limited number of soluble olfactory proteins expressed at relatively high levels in mosquitoes makes further studies on the coding of chemical messages at the OBP level more accessible, by providing a few specific targets. Identification of such proteins in Anopheles gambiae might facilitate future studies on host-finding behavior in this important disease vector.


Introduction
Mosquitoes are vectors of several diseases affecting about one hundred million people worldwide and killing more than a million, mostly in tropical areas [1]. In the absence of protective vaccines, as is the case for malaria and dengue fever, transmission of the pathogens to humans is at present avoided using bed nets, and mosquito populations are controlled mainly with insecticide-based strategies. Although these latter approaches may be very efficient, they are also unsafe for human health and for the environment. Moreover, insects can rapidly develop resistance to insecticides, thus continuously requiring the design and use of new generations of chemicals. Therefore, alternative approaches to fight mosquitoes are strongly needed. A promising strategy is to target the chemical communication system of mosquitoes with the aim of developing efficient repellents that might interfere with the olfactory system and disrupt the perception of chemical messages, such as those that allow host localization and choice. In this respect, an interesting approach is suggested by the observation that high levels of carbon dioxide can disorient mosquitoes [2]. However, the commercially available synthetic products have recently raised some concern for human health [3], prompting wide research on alternative mosquito repellents. Such investigation requires a detailed knowledge of the mosquito's chemoreception system at the molecular level in order to understand which chemical messages are important for the insect's biology and the behavioural responses they induce.
Two classes of proteins are directly involved in the perception and recognition of chemical stimuli: membrane-bound olfactory (OR) and gustatory (GR) receptors, and soluble Odorant-Binding Proteins (OBPs) [4].
In particular, recent research has provided several pieces of evidence on the specific involvement of OBPs in the detection and discrimination of chemical messages in insects [5][6][7][8][9][10]. Therefore, a study of the structure and properties of the different OBPs could represent a strong basis for understanding the olfactory code in a given species and help in designing new compounds that may be effective in population control.
Anopheles gambiae is the main malaria vector in sub-Saharan Africa. The genome of the species [11] has provided valuable information for the study of chemoreception proteins. It contains 79 genes encoding olfactory receptors and 76 encoding gustatory receptors [12][13][14]. These genes have been expressed in different systems and their specificities in recognising chemical stimuli have been analysed [15,16].
While there is little doubt that all (or at least most of) the membrane-bound receptors classified as olfactory and gustatory are involved in the perception of external chemical stimuli, with OBPs the picture is much more complex. In fact, this large family of proteins comprises members that may perform different functions, indirectly related or even completely unrelated to olfaction and taste, such as transport of semiochemicals in reproductive organs [17] or binding of biogenic amines [18].
The genome of An. gambiae contains 66 genes encoding proteins that have been classified as OBPs solely on the basis of sequence similarity [14,19]. This number is very close to that of olfactory receptors and initially suggested that a one-to-one relationship could exist between members of the two families of proteins. However, this view proved to be too simplistic and the actual situation is much more complex. Only 33 of these genes encode so-called "classic OBPs", whose signature is a conserved pattern of six cysteines, linked to each other by disulfide bonds in a specific fashion (1-3, 2-5, 4-6) [20,21]. The relative positions and the pairing of the six cysteines are conserved across all Orders of insects, from locusts and aphids to Coleoptera and Diptera. In addition, there are 19 longer OBPs in An. gambiae, containing a larger number of cysteines and therefore called C-plus OBPs. Their sequences still present a "classic" core with additional polypeptide segments [22]. A third group of 14 proteins includes outliers and is classified under the name of "atypical OBPs". Among these, some are referred to as "tandem OBPs", containing two "classic OBP" sequences connected by a few amino acids. These proteins, which occur in the saliva of mosquitoes, are probably not involved in chemoreception, on the basis of a recent report showing that one member binds biogenic amines and mediates anti-inflammatory processes [18].
The picture is still more complex with the other family of soluble proteins of the chemoreception system, the Chemosensory Proteins (CSPs) [4,23,24]. In fact, several members of this group are expressed in non-sensory organs and some are involved in different functions, such as development and differentiation [25][26][27][28][29][30][31]. These polypeptides are shorter than OBPs (100-120 residues) and present only 4 cysteines paired in non-interlocked fashion [32]. In An. gambiae only 8 genes encoding such proteins have been identified, and reported alternatively as CSPs or SAPs (Sensory Appendage Proteins) [14,33,34].
Given this complex picture, it is important to identify which OBPs and CSPs are expressed in antennae and other sensory organs, such as mouth parts and tarsi, since these proteins are more likely to be involved in the perception of semiochemicals.
Using microarrays, Biessmann and coworkers [14] found that the most abundantly expressed OBPs in female antennae are, in order: 5, 48, 1, 17, 9, 47, 3, 7, 4 and 20. All of them are classic OBPs with the exception of the C-plus OBP47 and OBP48. Most of these proteins are expressed at higher levels in female antennae than in male antennae, while OBPs 5 and 9 are more abundant in males. In the same study, several genes are reported to be down-regulated in the female antennae after a blood meal, with the exception of OBP9, whose mRNA level greatly increased after a blood meal. Among the CSPs, only the RNAs encoding the three SAPs were detected in the antennae of An. gambiae [14]. A more recent report, aimed at characterising the transcriptome profile in chemosensory tissues, partially confirmed the data discussed above [35]. In larvae and pupae, several OBPs were detected using microarray and PCR analysis, the most abundant being #9, 1, 17, 48, 3, 4, 5 [14].
Here, we adopted a proteomic approach to identify OBPs and CSPs that are expressed in the antennae of An. gambiae males and females, as well as in pre-adult stages. The results show that only about one third of the genes encoding OBPs and half of those encoding CSPs are expressed at the protein level in antennae, with a strong sexual dimorphism, while in pre-adult stages OBP9 is by far the most abundant.

Ethics statement
This study was approved by the Ethical Committee of the University of Pisa, N. 12498. The rabbits were bled under anaesthetic from the heart.

Reagents
All enzymes, unless otherwise stated, were from New England Biolabs. Oligonucleotides were custom synthesized at Eurofins MWG GmbH, Ebersberg, Germany. All other chemicals, unless otherwise stated, were purchased from Sigma-Aldrich and were of reagent grade.

Preparation of extracts
Anopheles gambiae were reared at the Department of Public Health of the University "La Sapienza", Roma, Italy, from a colony named GA-CAM-ST, originated from the progeny of females collected in Cameroon and belonging to the molecular form M (standard with regard to the chromosomal inversions, [44]). All adult specimens were 2 days old and were fed only with 0.5% sugar solution. Females and males were segregated in different cages soon after emergence to keep them virgin. Specimens were killed by freezing at −20 °C and then transferred to −80 °C.
For 2D gel separation, the antennae of 1,110 male individuals were used. For shotgun proteomic experiments we used in total the antennae from 600 individuals of each sex to perform three sets of analyses, each in triplicate. Antennae were crushed in a mortar under liquid nitrogen and extracted with 0.1% trifluoroacetic acid. The extracts were centrifuged at 19,000 × g for 40 min at 4 °C and the supernatants were concentrated to 50 µL by centrifugal evaporation.
100 fourth instar larvae, or 100 pupae, or 100 eggs of An. gambiae were homogenised in 500 µL of 0.1% aqueous TFA by grinding in a mortar followed by sonication, and centrifuged at 19,000 × g for 40 min at 4 °C.
Gels were stained using Brilliant Blue G-Colloidal Concentrate (Sigma). The excised spots were subjected to tryptic digestion and nano HPLC-ESI Orbitrap analyses. The acquired MS and MS/MS data were searched with Proteome Discoverer 1.2 (Thermo Fisher) using SEQUEST as the search algorithm against a database created by merging the sequences of the peptides predicted from the An. gambiae genome [11] (Anopheles_gambiae.AgamP3.48.pep.all.fa.gz and Anopheles_gambiae.AgamP3.48.pep.abinitio.fa.gz, downloaded at http://www.ensembl.org/info/data/download.html) with the entries related to Anopheles in UniProtKB. Searches were performed allowing up to three missed cleavage sites, a tolerance of 10 ppm for the monoisotopic precursor ion and 0.5 mass units for monoisotopic fragment ions, with carbamidomethylation of cysteine and oxidation of methionine as variable modifications. The false discovery rate was set at 1%.
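The search settings above can be illustrated with a minimal Python sketch. It is not the SEQUEST implementation: it assumes the commonly used simplified trypsin rule (cleavage after K or R unless the next residue is P), and the function names and the toy sequence in the usage note are hypothetical. It enumerates peptides with up to three missed cleavages and converts a 10 ppm precursor tolerance into a mass window.

```python
def tryptic_peptides(sequence, max_missed=3):
    """Enumerate peptides under a simplified trypsin rule (assumption):
    cleave after K or R unless the following residue is P.
    Returns a list of (peptide, number_of_missed_cleavages) tuples."""
    # Cut positions, expressed as the index of the residue that follows each cut.
    cuts = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(sequence))

    peptides = []
    for start in range(len(cuts) - 1):
        # Joining (missed + 1) consecutive fragments yields a peptide
        # that spans `missed` uncleaved sites.
        for missed in range(max_missed + 1):
            end = start + missed + 1
            if end >= len(cuts):
                break
            peptides.append((sequence[cuts[start]:cuts[end]], missed))
    return peptides


def ppm_window(mass, tolerance_ppm=10.0):
    """Return the (low, high) mass window for a tolerance given in ppm."""
    delta = mass * tolerance_ppm / 1e6
    return mass - delta, mass + delta
```

For the toy sequence "AKRPGKCDR", the rule gives the fragments AK, RPGK and CDR (the R before P is not cleaved), and a precursor of mass 1000 matches within roughly plus or minus 0.01 mass units at 10 ppm.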

Shotgun experiments
Antennae. Samples for shotgun experiments were resuspended in 200 µL of urea-containing buffer (8 M urea, 100 mM Tris-HCl, pH 8.5). Based on the Bradford colorimetric assay, the samples of female and male antennal extracts contained 80 and 200 µg of total protein, respectively. Reduction of disulfide bridges and alkylation were performed by treating samples with 2 mM DTT (30 minutes at 25 °C), followed by 11 mM iodoacetamide (20 minutes at room temperature in the dark). LysC digestion was then performed by incubating the samples with LysC (Wako) at a ratio of 1:40 (w/w) under gentle shaking at 30 °C. The digestion products were diluted 3 times with 50 mM ammonium bicarbonate and incubated with 10 µL of immobilized trypsin (Applied Biosystems) for 4 hours under rotation at 30 °C.
Aliquots of 15 µg of each resulting peptide mixture were then desalted on Stage Tips [46], and the eluates were dried and reconstituted to 50 µL in 0.5% acetic acid. Fractions containing 7 µg of protein were injected.
The extract was analysed in three sets of analyses, each performed in triplicate on an LC-MS/MS system (Eksigent nano Liquid Chromatograph coupled to a Linear Trap Quadrupole-Orbitrap Velos (Thermo)), on a C18 (75 µm i.d. × 15 cm, 1.8 µm, 100 Å) column at 250 nL/min using a 155 or 255 minute gradient ranging from 5% to 60% of solvent B (solvent A = 5% acetonitrile, 0.1% formic acid; solvent B = 80% acetonitrile, 0.1% formic acid). The nanospray source was operated with a spray voltage of 2.1 kV and an ion transfer tube temperature of 275 °C. Data were acquired in data-dependent mode, with one survey MS scan in the Orbitrap mass analyzer (resolution 60,000 at m/z 400) followed by up to 20 MS/MS scans in the ion trap on the most intense ions (intensity threshold = 750 counts). Once selected for fragmentation, ions were excluded from further selection for 30 seconds, in order to increase new sequencing events. Raw data were analyzed using the MaxQuant proteomics pipeline (v1.2.2.5) and the ANDROMEDA search engine [47] against the database described above. Carbamidomethylation of cysteines was chosen as a fixed modification; oxidation of methionine and acetylation of the N-terminus were chosen as variable modifications. The search engine peptide assignments were filtered at a False Discovery Rate < 1% and the feature "match between runs" was not enabled; other parameters were left as default.
For each set of analyses, the relative abundance of proteins was estimated using the "Intensity" values produced by the MaxQuant software [47], normalised to the total intensity signal. The results of the three sets were averaged.
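The normalisation and averaging described above can be sketched as follows. This is a minimal illustration in Python, not the authors' script: the function names are hypothetical, the input is assumed to be one dictionary of protein intensities per set of analyses, and a protein missing from a set is treated as having zero intensity (an assumption, since the text does not say how missing values were handled).

```python
from math import sqrt
from statistics import mean, stdev


def normalise(intensities):
    """Divide each protein intensity by the total intensity of the set."""
    total = sum(intensities.values())
    return {protein: value / total for protein, value in intensities.items()}


def mean_and_sem(values):
    """Return the mean and the standard error of the mean of a list of values."""
    n = len(values)
    sem = stdev(values) / sqrt(n) if n > 1 else 0.0
    return mean(values), sem


def summarise(sets):
    """Average the normalised intensities over all sets of analyses.
    Proteins absent from a set count as 0.0 (assumption)."""
    normalised = [normalise(s) for s in sets]
    proteins = set().union(*normalised)
    return {
        protein: mean_and_sem([s.get(protein, 0.0) for s in normalised])
        for protein in sorted(proteins)
    }
```

Calling `summarise` with three dictionaries, one per set, returns for each protein the mean fraction of the total signal and its standard error, which are the quantities needed for a plot of relative abundance with error bars.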

PFAM enrichment analysis
Each identified protein was assigned to its Protein family (Pfam) [48], and Pfam families were analysed for differential expression between male and female antennae. Proteins were considered to be expressed in only one sex if identification was based on more than 2 peptides and no peptides were identified in the other sex. Proteins were considered overexpressed if the ratio of intensity values between the two sexes was greater than three. The Pfam enrichment analysis was performed using custom R scripts (available on demand). For each individual Pfam id associated with proteins overexpressed or found in only one sex, a Fisher exact test was performed over the total set of proteins. Results were filtered with alpha = 0.05.
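The classification rule and the enrichment test above can be sketched in Python as follows. The original analysis used custom R scripts that are not shown in the text, so this is only an illustration under stated assumptions: the function names are hypothetical, and the Fisher exact test is computed as a one-sided (enrichment) test from the hypergeometric distribution, whereas the text does not specify whether a one-sided or two-sided test was used.

```python
from math import comb


def classify(intensity_f, intensity_m, peptides_f, peptides_m,
             ratio=3.0, min_peptides=2):
    """Apply the thresholds stated in the text: sex-specific if more than
    `min_peptides` peptides in one sex and none in the other; biased if the
    intensity ratio between sexes exceeds `ratio`."""
    if peptides_f > min_peptides and peptides_m == 0:
        return "female-only"
    if peptides_m > min_peptides and peptides_f == 0:
        return "male-only"
    if intensity_m > 0 and intensity_f / intensity_m > ratio:
        return "female-biased"
    if intensity_f > 0 and intensity_m / intensity_f > ratio:
        return "male-biased"
    return "unbiased"


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher exact p-value P(X >= k) (assumed variant), where
    N = all identified proteins, K = proteins carrying the Pfam id,
    n = sex-biased proteins, k = sex-biased proteins carrying the Pfam id."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)
```

A Pfam id would then be retained when its p-value falls below the chosen alpha of 0.05.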
Eggs. The egg extract was freeze-dried, redissolved in 40 µL of 10 mM DTT in 100 mM ammonium bicarbonate and incubated at 56 °C for 45 min. Then, 40 µL of 55 mM iodoacetamide were added and the mixture was incubated at room temperature for 30 min in the dark. Digestion was performed by addition of 2 µL of 0.1 mg/mL trypsin and incubation overnight at 37 °C. Digestion was blocked by adding 10% TFA to pH 2.5. Aliquots of 25 µL of the resulting peptide mixture were then desalted on three Stage Tips (Rappsilber et al., 2007); eluates were pooled, dried and then reconstituted to 15 µL in 0.5% acetic acid. The peptide solution was analysed in triplicate (1 µL) on an Ultimate 3000 HPLC (Dionex, San Donato Milanese, Milano, Italy) coupled with a Linear Trap Quadrupole Orbitrap mass spectrometer (Thermo Fisher, Bremen, Germany) using a C18 (75 µm i.d. × 15 cm, 1.8 µm, 100 Å) column at a 250 nL/min flow, using a 144 min gradient ranging from 5% to 90% of solvent B (solvent A = 5% acetonitrile, 0.1% formic acid; solvent B = 80% acetonitrile, 0.1% formic acid). The nanospray source was operated with a spray voltage of 2.0 kV and an ion transfer tube temperature of 275 °C. Data were acquired in data-dependent mode, with one survey MS scan in the Orbitrap mass analyzer (resolution 15,000 at m/z 400) followed by up to 3 MS/MS scans in the ion trap on the most intense ions. The acquired MS and MS/MS data were searched with Proteome Discoverer 1.2 (Thermo Fisher) using SEQUEST as the search algorithm, as described above.

RNA extraction and cDNA synthesis
Total RNA was extracted with the TRI® Reagent (Sigma), following the manufacturer's protocol. cDNA was prepared from total RNA by reverse transcription, using 200 units of SuperScript™ III Reverse Transcriptase (Invitrogen) and 0.5 µg of an oligo-dT primer in a 50 µL total volume. The mixture also contained 0.5 mM of each dNTP (GE-Healthcare), 75 mM KCl, 3 mM MgCl2, 10 mM DTT and 0.1 mg/ml bovine serum albumin in 50 mM Tris-HCl, pH 8.3. The reaction mixture was incubated at 50 °C for 60 min and the product was directly used for PCR amplification or stored at −20 °C.

Polymerase chain reaction
Aliquots of 1 µL of crude cDNA were amplified in a Bio-Rad Gene Cycler™ thermocycler, using 2.5 units of Thermus aquaticus DNA polymerase (GE-Healthcare), 1 mM of each dNTP (GE-Healthcare), 1 µM of each PCR primer, 50 mM KCl, 2.5 mM MgCl2 and 0.1 mg/ml bovine serum albumin in 10 mM Tris-HCl, pH 8.3, containing 0.1% v/v Triton X-100. At the 5′ end, we used a specific primer corresponding to the sequence encoding the first six amino acids of the mature protein. The primer also contained an Nde I restriction site for ligation into the expression vector, which at the same time provided the ATG codon for an additional methionine in position 1. At the 3′ end a specific primer was used, encoding the last six amino acids, followed by a stop codon and an Eco RI restriction site for ligation into the expression vector. We therefore used the following primers for OBP5 (restriction sites: CATATG for Nde I, GAATTC for Eco RI):
fwAgamOBP5 Nde: 5′-AACATATGGCGATGACGCGAAAACAA-3′
rvAgamOBP5 Eco: 5′-GTGAATTCTTATTAGGGAAAGAGAAACAC-3′
After a first denaturation step at 95 °C for 5 min, we performed 35 amplification cycles (1 min at 95 °C, 30 sec at 50 °C, 1 min at 72 °C) followed by a final step of 7 min at 72 °C. An amplification product of about 400 bp, in agreement with the expected size, was obtained.
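The primer design described above can be checked with a short Python sketch. It is an illustration only: the helper names are hypothetical, and the two primer sequences are copied from the text as continuous strings (any hyphen inside a printed sequence is assumed to be a line-break artifact). The sketch locates the Nde I (CATATG) and Eco RI (GAATTC) recognition sites and translates the coding part of each primer with the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Primer sequences as given in the text (hyphens assumed to be artifacts).
FORWARD = "AACATATGGCGATGACGCGAAAACAA"
REVERSE = "GTGAATTCTTATTAGGGAAAGAGAAACAC"

NDE_I = "CATATG"
ECO_RI = "GAATTC"


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(dna))


def translate(dna):
    """Translate in frame from the first base, stopping at the first stop
    codon; a trailing incomplete codon is ignored."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Coding part of the forward primer: starts at the ATG inside the Nde I site.
forward_peptide = translate(FORWARD[FORWARD.find("ATG"):])
# The reverse primer anneals to the coding strand, so its reverse
# complement is read in the sense direction.
reverse_peptide = translate(reverse_complement(REVERSE))
```

Reading the reverse primer as its reverse complement is the design choice that matters here: it puts the C-terminal codons, the stop codon and the Eco RI site in the same orientation as the coding sequence.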

Cloning and sequencing
The crude PCR product was ligated into a pGEM (Promega) vector without further purification, using a 1:5 (plasmid:insert) molar ratio and incubating the mixture overnight at room temperature. After transformation of E. coli XL-1 Blue competent cells with the ligation product, positive colonies were selected by PCR using the plasmid's primers SP6 and T7 and grown in LB/ampicillin medium. DNA was extracted using the Plasmid MiniPrep Kit (Euroclone) and custom sequenced at Eurofins MWG (Ebersberg, Germany).

Cloning in expression vectors
pGEM plasmid containing the sequence of OBP5 (Acc. No. Q8T6R6) was digested with Nde I and Eco RI restriction enzymes for two hours at 37uC and the digestion product was separated on agarose gel. The obtained fragment was purified from gel using QIAEX II Extraction kit (Qiagen) and ligated into the expression vector pET-5b (Novagen, Darmstadt, Germany), previously linearized with the same enzymes. The resulting plasmid was sequenced and shown to encode the mature protein. Purification of the protein was accomplished by combinations of chromatographic steps on anion-exchange resins DE-52 (Whatman) and QFF (GE-Healthcare), along with standard protocols previously adopted for other Odorant-Binding Proteins [49,50].

Preparation of antisera
Antisera against OBP4 (Acc. no. Q6T6R7) and OBP5 were obtained by injecting rabbits subcutaneously and intramuscularly with 300 µg of recombinant protein, followed by two additional injections of 150 µg after 15 and 30 days. The protein was emulsified with an equal volume of Freund's complete adjuvant for the first injection and incomplete adjuvant for further injections.
The animals were bled 10 days after the last injection and the sera were used without further purification. The rabbits were individually housed in large cages, at constant temperature, and all operations were performed according to ethical guidelines to minimize pain and discomfort to animals.

Proteomic analysis of antennae
Our first attempt to identify OBPs and CSPs in the antennae of mosquitoes followed a classical approach. The 2D-gel, prepared with the antennae of 1,100 males (Figure 1), produced 79 protein spots in the region of MW lower than 40 kDa, which were excised and analysed. This choice also included proteins longer than classic OBPs, such as C-plus OBPs, salivary OBPs and so-called "tandem OBPs". However, the only OBP identified in this experiment was OBP9. In addition, two proteins of the CSP family, named SAP1 and SAP3, were detected. These results reasonably exclude the presence of other OBPs and CSPs, at least above the Coomassie staining detection limit. However, we felt that this method might not be sensitive enough to detect all the proteins present in our sample and was probably not applicable to the smaller antennae of female mosquitoes. Therefore, we decided to apply a shotgun approach, which does not require a 2D-gel, but analyses a tryptic digest of a crude protein extract by nano-HPLC and MS/MS. This technique recently proved to be fast and efficient, while requiring very small biological samples, as in the case of the antennae of Drosophila [9,52]. Applying this method to antennal samples from 600 males and 600 virgin females of An. gambiae, we identified 2958 proteins (2605 in females and 2634 in males). A complete list of such proteins is reported in file S1, grouped according to their Pfam descriptors. Pfam PF01395, described as "PBP/GOBP", includes all OBPs (classic, C-plus and atypical), while Pfam PF03392, described as "Insect Pheromone-binding", includes CSPs and SAPs. Most of the identified proteins and corresponding Pfam families were common to the two sexes and not differently expressed. Table 1 reports Pfam families overexpressed or identified in only one sex. Within the "PBP/GOBP" Pfam, 3 proteins are female specific and 12 are more abundantly expressed in females than in males. On the other hand, the expression of three CSPs, belonging to the "Insect pheromone-binding" Pfam, was male biased. Table 2 reports the data relative to the individual 24 OBPs and 4 CSPs identified in the antennae of both sexes, together with their entry codes and names in the UniProt database. The table also includes three proteins previously reported by Justice and coworkers [53] in the antennae of the same species: the putative antennal carrier protein ANP-1, and two polypeptides named TOL-1 and TOL-2 (TakeOut-Like proteins), considered to be potential carriers for hydrophobic ligands and possibly involved in feeding behaviour. None of these three proteins shows significant sequence similarity with OBPs or CSPs.
Table note: In several cases, entries with the same or similar names refer to very similar sequences, likely originating from different strains of mosquitoes. In the leader protein column we report the name of the sequence with the highest coverage, as reported in the SwissProt database. OBP9 was also identified as the only olfactory protein in eggs, on the basis of two peptides with a coverage of 25.9%. Unique peptides are those characteristic of each sequence.
In a few cases, because of high sequence identity, more than one OBP was identified on the basis of the same set of peptides. As an example, OBP1 and OBP17 share the same amino acid sequence, but the latter presents a longer C-terminus (155 vs 144 aa). On the other hand, we could distinguish two proteins, Q8T6R5 and Q8I8S7, sharing 97% of their amino acid sequence, by the presence of one peptide unique to each of them (Figures S1 and S2).
The identified OBPs can be assigned to three different groups: 13 classical OBPs, 5 C-plus OBPs and 6 salivary OBPs. These latter OBPs, previously reported as belonging to the D7 proteins, are in fact abundant in mosquito saliva [54]. Since they are longer than classic OBPs (about 300 amino acids against 120-130) and are characterised by two extra cysteines in addition to the six of the conserved motif, they can also be assigned to the sub-class of C-plus OBPs. SNAP_ANOPHELES00000005748 deserves a special note, as it is much longer than other D7 proteins. In fact, it contains two typical D7 domains (each with 8 cysteines) connected by a short amino acid bridge. The sequence of the first domain is nearly identical to that of the D7 protein Q7Q484. A function of these salivary proteins in chemical communication has not been investigated, although their presence in the antennae might suggest a role in odorant detection. Of the 4 identified CSPs, 3 have been previously reported in the literature as SAPs [14].
Perhaps the major drawback of the shotgun method is the difficulty in evaluating the abundance of each protein using label-free approaches [55]. For the evaluation of our results, we used the protein "Intensity" values produced by the MaxQuant software, based on the areas of the peptide peaks in the LC/MS analysis. Three samples of male and three of female antennae were analysed (each one in triplicate) and the areas of peptides were averaged for each protein over the three replicates. These values were then normalised by dividing each of them by the total intensity value of all the proteins identified. Finally, we plotted the averages of the three analyses for males and females, together with their standard errors (Figure 2). According to the data of Figure 2, the most represented OBPs (including classical and C-plus sequences) in the antennae of females are, in order, #9, 1/17 and 12, followed by OBPs #48, 47, 7, 3 and 20, which are expressed at lower levels, and a few others detectable only in traces. D7r2 is the best represented among the salivary proteins. Among the CSPs, we could only detect relatively low levels of the three SAPs and CSP3. In males, according to the same criterion, the picture is quite different, with OBP9 as the only protein of this family present at high levels, together with SAP1 and SAP3, also strongly represented. This result is in good agreement with the 2D-gel data on male antennae, where we could only detect the three above-mentioned proteins.
Overall, we can observe that female antennae generally express a larger number and higher quantities of OBPs than males, while the situation is reversed for CSPs.
Our results are in fairly good, but not complete, agreement with a microarray-based RNA analysis [14], which ranked female antennal OBPs in the following decreasing order of abundance: #5, 48, 1, 17, 9, 47, 3, 7, 4 and 20. All these genes, with the exception of OBP4 and OBP5, encode proteins that in our analysis were classified as ''abundant'' or ''well represented'', although not in the same order.
These data are partially confirmed by a more recent transcriptome analysis [35], which, however, failed to detect OBP9, a protein found in the present work to be the most abundant OBP in all tissues and developmental stages.
The absence of OBP4 and OBP5 in our analysis posed a major problem, also because OBP4 transcript had been reported in our previous work [43], using mosquitoes of the same age and physiological state as those of the present research. In order to clarify this point, we decided to perform Western blot experiments.

Western blot experiments
Therefore, we expressed OBP5 in bacteria, adopting the classic procedure utilised for the expression of other OBPs. Like most of these proteins, OBP5 was present as inclusion bodies and was solubilised and purified using our standard protocol, successfully adopted for many proteins of this class [49,50] (Figure 3).
Polyclonal antibodies were raised against the newly produced OBP5 and the previously described OBP4 [43] and used in Western blot experiments on crude extracts of female and male antennae. Figure 4 reports the results of the immunodetection. As controls for the antisera, we included samples of the purified proteins, while an internal control for the extract was provided by OBP9, which had been detected as the most intense spot in the 2D-gel of male antennae (Figure 1) and previously reported in the antennae of both sexes [56]. The expression of OBP9 and the production of a polyclonal antiserum are part of currently ongoing research (Qiao et al., unpublished). While we could clearly stain OBP9 in the extract, we were not able to obtain evidence for the presence of OBP4 or OBP5 (Figure 4). We then repeated the Western blot experiments using polyclonal antisera against OBP47 and SAP3, two proteins expressed at lower levels than OBP9, which could provide alternative positive controls. As we failed to stain either of these proteins, both detected in our proteomic study, we concluded that our Western blot method is not sensitive enough for proteins expressed at lower levels, and consequently we cannot exclude the presence of OBP4 and OBP5 in the antennal extracts.
On the other hand, there could be alternative reasons for the absence of OBP4 and OBP5 in our shot-gun experiments, including the possibility that the synthesis of these proteins could be triggered by some physiological events, such as mating or ingesting a blood meal.

Proteomic analysis on pre-adult stages and eggs
We also decided to investigate the presence of OBPs and CSPs in pre-adult stages and in eggs. Given the relatively large samples available for larvae and pupae, we chose to adopt for this study 2D-gel electrophoresis coupled to mass spectrometry analysis.
Figure 5. Two-dimensional gel electrophoretic separation of extracts from 100 fourth instar larvae and 100 pupae of An. gambiae. The gel was stained with colloidal Coomassie Brilliant Blue and all the spots migrating with apparent molecular weight lower than 24 kDa were excised and analysed by mass spectrometry. Both in larvae and pupae OBP9 was by far the most abundant protein (amino acid sequence coverage up to 61.87%), found in several spots (red circles). In larvae we could also detect OBP21 (entry code in UniProt Q8I8S3; amino acid sequence coverage 9.16%) and SAP3 (amino acid sequence coverage up to 18.25%), present in spots where OBP9 was also identified. Molecular weight markers are, from the top: Phosphorylase b, from rabbit muscle (97 kDa), Bovine serum albumin (66 kDa), Ovalbumin (45 kDa), Carbonic anhydrase (29 kDa), Trypsin inhibitor (20 kDa), α-Lactalbumin (14 kDa). doi:10.1371/journal.pone.0075162.g005
Crude extracts from 100 larvae at 4th instar or 100 pupae of An. gambiae were separated on 2D-gels (Figure 5) and the spots analysed as described in the Materials and Methods section. The mass spectrometry analysis performed on the digests of all the spots migrating with apparent molecular masses lower than 24 kDa revealed the presence of OBP9 as the sole protein of this class, which, however, appears in several abundant spots (red circles). This phenomenon, which needs to be further investigated, might indicate the occurrence of different forms of OBP9, possibly the products of post-translational modifications. The widespread expression of OBP9 in An. gambiae also includes a report of this protein in the hemolymph of adults [57]. In the gel of larvae we could also detect OBP21, a protein absent in the antennae of adults, and SAP3, in spots where OBP9 was also identified.
A sample of 100 eggs was utilised for a shot-gun analysis, as reported in the Materials and Methods section. The only olfactory protein identified was OBP9, whose presence was based on two peptides found in all three replicates, with a coverage of 25.9%.
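Sequence coverage figures such as the 25.9% quoted above are the percentage of residues of a protein that fall within at least one identified peptide. A minimal Python sketch of that calculation follows; the function name is hypothetical and the sequences in the usage note are toy strings, not the OBP9 sequence or its identified peptides.

```python
def sequence_coverage(protein, peptides):
    """Percentage of residues in `protein` covered by at least one peptide.
    Every occurrence of each peptide in the protein is counted."""
    covered = [False] * len(protein)
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)
```

For the toy protein "MKTAYIAKQR" with the overlapping peptides "MKTAY" and "AYIAK", residues 1 to 8 are covered, so the function returns 80.0; overlapping residues are counted once.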

Conclusions
The main results of our work can be summarised as follows:
1. There is a strong sexual dimorphism in the number of OBPs expressed in the antennae. While only a few OBPs can be found in males, with only OBP9 expressed at a high level, females are endowed with at least 8 abundantly expressed members and 14 more that are still clearly detectable. Different expression of OBPs between sexes had been previously reported in Drosophila melanogaster [58].
2. Two of the most expressed OBPs (#47 and #48) belong to the C-plus OBPs. In particular, these two proteins contain 4 and 7 cysteines, respectively, in addition to the six of the conserved motif, and have a more complex structure, recently elucidated for OBP47 [22]. It is not yet clear whether these unusual proteins might be involved in chemodetection like classic OBPs, or else be endowed with alternative functions and modes of action.
3. In pre-adult stages and in eggs, the exceptional abundance of OBP9 and the absence of other proteins of the same family suggest that this protein might be involved in functions other than chemoreception. This is particularly true for eggs, which are not endowed with chemoreception.
4. The repertoire of OBPs present at detectable levels (13 classical OBPs, 5 C-plus OBPs, 6 salivary OBPs) is much smaller than the number of genes encoding such proteins in An. gambiae, thus providing a reduced number of molecular targets for further biochemical research and actions aimed at mosquito population control.
File S1 Complete list of proteins identified in Anopheles gambiae antennae through the shotgun approach using ANDROMEDA [47] as search engine. Column A, protein identity as reported in the An. gambiae genome and in UniProtKB; Column B, identity of leader protein within the protein group; Columns C and F, protein description in the genome and in UniProtKB; Column D, protein family; Column E, protein family description; Column G, molecular weight of leader protein; Column H, protein posterior error probability.