Dual proteomics of Drosophila melanogaster hemolymph infected with the heritable endosymbiont Spiroplasma poulsonii

Insects are frequently infected with heritable bacterial endosymbionts. Endosymbionts have a dramatic impact on their host physiology and evolution. Their tissue distribution is variable with some species being housed intracellularly, some extracellularly and some having a mixed lifestyle. The impact of extracellular endosymbionts on the biofluids they colonize (e.g. insect hemolymph) is however difficult to appreciate because biofluid composition can depend on the contribution of numerous tissues. Here we investigate Drosophila hemolymph proteome changes in response to the infection with the endosymbiont Spiroplasma poulsonii. S. poulsonii inhabits the fly hemolymph and gets vertically transmitted over generations by hijacking the oogenesis in females. Using dual proteomics on infected hemolymph, we uncovered a weak, chronic activation of the Toll immune pathway by S. poulsonii that was previously undetected by transcriptomics-based approaches. Using Drosophila genetics, we also identified candidate proteins putatively involved in controlling S. poulsonii growth. Last, we also provide a deep proteome of S. poulsonii, which, in combination with previously published transcriptomics data, improves our understanding of the post-transcriptional regulations operating in this bacterium.


Introduction
Insects frequently harbor vertically transmitted bacterial symbionts living within their tissues, called endosymbionts [1]. Endosymbionts have a major impact on the host physiology and evolution as they provide ecological benefits such as the ability for the host to thrive on unbalanced diets [2], protection against viruses or parasites [3][4][5][6] or tolerance to heat [7]. A peculiar group of insect endosymbionts also directly affects their host reproduction. This group is taxonomically diverse and includes bacteria from the Wolbachia, Spiroplasma, Arsenophonus, Cardinium and Rickettsia genera [8,9]. Four reproduction-manipulative mechanisms have been unraveled so far, namely cytoplasmic incompatibility, male-killing, parthenogenesis and male feminization [10], all of them leading to an evolutionary drive that favors infected individuals over non-infected ones and promotes the endosymbiont spread into populations. Their ease of spread coupled with the virus-protecting ability of some species [3,4] make them promising tools to control insect pest populations or to render them refractory to human arboviruses [11]. Reproductive manipulators are however fastidious bacteria that are difficult to manipulate, hence slowing down research on the functional aspects of their interaction with their hosts [12]. Their real impact on host physiology thus remains largely elusive. Spiroplasma are long helical bacteria belonging to the Mollicutes class, which are devoid of cell wall [13]. They infect a wide range of arthropods and plants where they act as pathogens, commensals or vertically transmitted endosymbionts depending on the species [13]. S. poulsonii (hereafter Spiroplasma) is a vertically transmitted endosymbiont infecting the fruit fly Drosophila [14,15]. It lives free in the host hemolymph where it feeds on circulating lipids [16] and achieves vertical transmission by infecting oocytes [17]. 
Most strains cause male-killing, that is the death of the male progeny during early embryogenesis through the action of a toxin named Spaid [18]. Spiroplasma infection also confers protection to the fly or its larvae against major natural enemies such as parasitoid wasps [19,20] and nematodes [21,22].
Spiroplasma has long been considered intractable, but recent technical advances such as the development of in vitro culture [23] and transformation [24] make it an emergent model for studying the functional aspects of insect-endosymbiont interactions. Major steps have been made in understanding the male-killing [25][26][27][28] and protection phenotypes [20,22,29], as well as the way the bacterial titer is kept under control in the adult hemolymph [16,30]. Such studies, however, relied on single-gene approaches and did not capture the full picture of the impact of Spiroplasma on its host physiology. We tackled this question using dual proteomics on fly hemolymph infected by Spiroplasma. This non-targeted approach allowed us to get an extensive overview of the end-effect of Spiroplasma infection on the fly hemolymph protein composition and to identify previously unsuspected groups of proteins that were over- or under-represented in infected hemolymph. Using Drosophila genetics to knock down the corresponding genes, we identified new putative regulators of the bacterial titer. This work also provides the most comprehensive Spiroplasma proteome to date. By comparing this proteome to the existing transcriptomics data, we draw conclusions about Spiroplasma post-transcriptional regulations.

Drosophila hemolymph protein profiling
We investigated the effect of Spiroplasma infection on Drosophila hemolymph protein content using Liquid Chromatography-tandem Mass Spectrometry (LC-MS/MS) (S1 Fig). To this end, we extracted total hemolymph from uninfected and infected 10-day-old females. At this age, Spiroplasma is already present at high titers in the hemolymph but does not cause major deleterious phenotypes to the fly [31,32]. Extraction was achieved by puncturing the thorax and drawing out the hemolymph with a microinjector. This procedure induces some tissue damage but recovers very few hemocytes (circulating immune cells).
Peptides were mapped to both the Drosophila and Spiroplasma predicted proteomes, providing i) an overview of the Drosophila proteins differentially represented in infected versus uninfected hemolymph and ii) an in-depth Spiroplasma proteome in the infected hemolymph samples. These two datasets are analyzed in separate sections.
Mapping to the Drosophila proteome allowed the identification of 909 quantified protein groups (a protein group contains proteins that cannot be distinguished based on the identified peptides). The complete list of quantified Drosophila proteins and their fold-change upon Spiroplasma infection is available in S1 Table. The hemolymph extraction process involves tissue damage (e.g. to the subcuticular epithelium, muscles and fat body) that leads to the release of intracellular proteins into the sample. Hence we first sorted protein groups depending on whether the main protein is predicted to be extracellular or not. Proteins were defined as extracellular if they bore a predicted signal peptide or were annotated with a Gene Ontology (GO) term "extracellular", and as intracellular if neither a signal peptide nor any "extracellular" GO term was predicted. Based on these localization criteria, the Drosophila dataset could be split into 555 intracellular protein groups representing 61% of the total proteome and 354 extracellular protein groups representing the remaining 39% (Fig 1A).
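The localization rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ProteinGroup` record, its field names and the toy entries are hypothetical, and the substring match on GO terms is a simplification of whatever annotation pipeline was actually used.

```python
from dataclasses import dataclass, field


@dataclass
class ProteinGroup:
    # Hypothetical record; field names are illustrative, not from the study.
    name: str
    has_signal_peptide: bool
    go_terms: list = field(default_factory=list)


def is_extracellular(group):
    """Extracellular if a signal peptide is predicted OR any GO term
    mentions 'extracellular' (simplified substring match)."""
    has_go = any("extracellular" in term.lower() for term in group.go_terms)
    return group.has_signal_peptide or has_go


def split_by_localization(groups):
    """Return (extracellular, intracellular) lists of protein groups."""
    extracellular = [g for g in groups if is_extracellular(g)]
    intracellular = [g for g in groups if not is_extracellular(g)]
    return extracellular, intracellular


# Toy input (hypothetical entries, not real data).
groups = [
    ProteinGroup("toy_secreted", True, []),
    ProteinGroup("toy_matrix", False, ["extracellular region"]),
    ProteinGroup("toy_cytosolic", False, ["cytoplasm"]),
]
extracellular, intracellular = split_by_localization(groups)

# Proportions for the counts reported in the text (555 + 354 = 909).
pct_intracellular = round(100 * 555 / 909)
pct_extracellular = round(100 * 354 / 909)
```

A group is classified as intracellular only when both criteria fail, which mirrors the "neither/nor" wording of the definition.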
We then compared the relative amounts of protein groups between infected and uninfected hemolymph. Of the intracellular protein groups, 35 were differentially represented in the Spiroplasma-infected samples, including 8 overrepresented and 27 depleted groups compared to the uninfected control (Fig 1B). A functional GO analysis on the differentially represented protein groups revealed that a significant part of these protein groups were related to the proteasome (subunits α2, 3, 4 and 7 and subunits β3 and 6) and the Toll immune pathway (Fig 1C). The Toll pathway GO enrichment comprises only proteasome subunits, probably because of their involvement in the degradation of Toll pathway intermediates (e.g. Cactus [33]). While differentially abundant intracellular protein groups were mostly underrepresented in Spiroplasma-infected hemolymph, analysis of the extracellular protein subset revealed an opposite trend. Among the 71 extracellular protein groups having a significantly different abundance, 4 were depleted and 67 were overrepresented in the infected samples compared to the uninfected ones (Fig 1D). Surprisingly, the functional GO annotation highlighted an overrepresentation of the serine-endopeptidase molecular function. Serine proteases are involved in the regulation of the melanization and Toll pathways, and indeed the associated GO terms are enriched in the infected samples (Fig 1E).

Immune-related protein enrichment is a consequence of a mild transcriptional activation
Since Spiroplasma is devoid of cell wall, it lacks microbe-associated molecular patterns such as peptidoglycan and is not expected to interact with pattern-recognition receptors regulating the fly immune system. This idea was supported by previous work showing that Spiroplasma does not trigger a strong activation of the Toll and Imd pathways, the two main immune pathways in Drosophila [31,34]. Previous work also showed that flies defective for the inducible humoral immune response have normal Spiroplasma titer, suggesting that immune pathways are not required to control Spiroplasma growth [31]. Our observation that several immune-related proteins are more abundant in the hemolymph of infected flies (Fig 2A and S1 Table) led us to investigate this point further. We first tested if the changes in immune-related protein abundance were a consequence of altered gene expression. We measured the mRNA levels corresponding to several differentially represented proteins (Fig 2B). Immune-related genes were slightly upregulated in infected flies, although their induction levels did not compare with those observed after systemic infection by a pathogenic bacterium [35,36]. These results confirm that Spiroplasma does not trigger a strong immune response [31,34], but rather provokes a mild and chronic production of immunity proteins.

A majority of enriched hemolymphatic proteins are not involved in Spiroplasma growth control
A systemic infection or a genetic induction of the humoral immune response promotes Spiroplasma growth in flies [31]. We therefore wondered if the presence of small amounts of antimicrobial peptides (AMPs) could be beneficial for Spiroplasma growth. To test this hypothesis, we measured Spiroplasma titer in flies over-expressing different AMP genes [37]. We found that AMP gene overexpression did not increase Spiroplasma titer (Fig 3A). Similarly, flies lacking ten AMP coding genes [38] had a Spiroplasma titer comparable to that of control flies (Fig 3B). These results suggest that AMPs released during the humoral immune response do not affect Spiroplasma growth.

PLOS ONE
We then tested if other immune-related proteins found more abundant in infected hemolymph could alter Spiroplasma growth. We first compared the Spiroplasma titers of control Oregon R flies with those of hayan, attD and tep2 mutants. We observed no difference in bacterial titer, indicating that these genes are neither required for nor detrimental to Spiroplasma growth (Fig 3B). We then used an in vivo RNAi approach to silence immune genes coding for the most enriched proteins in Spiroplasma-infected flies using the ubiquitously expressed Act5C-GAL4 driver. Silencing these genes did not provoke an increase in Spiroplasma titer, further reinforcing the idea that immune proteins do not prevent Spiroplasma growth (Fig 3C).
Among the most differentially represented proteins in Spiroplasma-infected hemolymph, we also detected several proteins of unknown function. To test their role in Spiroplasma growth control, we used the same RNAi-mediated knockdown approach and quantified Spiroplasma titer in these flies. Most of the RNAi lines tested showed no change in Spiroplasma titer as compared to control lines (S2 Fig). Two RNAi lines, targeting CG18067 and CG14762 respectively, showed a slight but non-significant reduction in Spiroplasma titer. These results suggest that none of the genes tested by RNAi facilitates Spiroplasma growth or controls its titer. Instead, these proteins are likely to be induced as a consequence of Spiroplasma infection and may participate in the host adaptation to the bacteria. A mutant line was available for only one of the most enriched uncharacterized proteins (CG15293). We found that this mutation leads to a significant increase in bacterial titer (S2 Fig). This gene codes for a 37 kDa secreted protein with no known function. Our results raise the hypothesis that CG15293 participates in the control of Spiroplasma titer in vivo.

Transcriptome-proteome correlation in Spiroplasma poulsonii
The proteomics analysis of infected hemolymph samples also allowed the detection and quantification of Spiroplasma proteins. With a total of 503 proteins, this is the most comprehensive Spiroplasma proteome to date. The complete list of Spiroplasma quantified proteins is available in S2 Table. We took advantage of a previously published transcriptomics dataset of Spiroplasma [23] to compare the expression level of Spiroplasma genes to their corresponding protein abundance by building a linear model of the proteomics signal as a function of the transcript level (Fig 4). The two measures are poorly correlated (Kendall's τ = 0.40). Analyzing the normalized residuals of the model also allowed us to identify proteins that are particularly discrepant with the model, that is with absolute standardized residuals greater than 2. This approach uncovered 27 proteins, of which 11 have a significantly higher abundance and 16 a lower abundance in the proteome than what was predicted from the transcriptomics data (S3 Table). Over-represented proteins with a reliable annotation include the membrane lectin Spiralin B, the cytoskeletal protein Fibrilin, the glucose transporter Crr, the potassium channel KtrA, a ferritin-like protein Ftn, the chromosome partitioning protein ParA and the DNA polymerase subunit DnaN. Under-represented proteins include the transporters SteT and YdcV, the serine/threonine phosphatase Stp, the helicase PcrA, the ATP synthase subunit AtpH, the GMP reductase GuaC and the tRNA-(guanine-N(7)-)-methyltransferase TrmB. Other proteins have no predicted function.
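The transcript-protein comparison described in this section (rank correlation, linear model, standardized residuals above 2 in absolute value) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' analysis code: the toy values are invented, the Kendall coefficient is the simple tau-a variant that assumes no ties, and residuals are standardized by the residual standard error without any leverage correction.

```python
import math


def kendall_tau_a(x, y):
    """Kendall rank correlation, tau-a variant (assumes no tied values)."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def standardized_residuals(x, y):
    """Fit y = slope * x + intercept by least squares and return each
    residual divided by the residual standard error (n - 2 d.o.f.)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    std_error = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    return [r / std_error for r in residuals]


# Toy data (hypothetical, e.g. log-scaled levels): the fifth gene has far
# more protein than its transcript level would predict.
transcript = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
protein = [1.0, 2.0, 3.0, 4.0, 15.0, 6.0, 7.0, 8.0, 9.0, 10.0]

tau = kendall_tau_a(transcript, protein)
z_scores = standardized_residuals(transcript, protein)
discrepant = [i for i, z in enumerate(z_scores) if abs(z) > 2]
```

A positive standardized residual corresponds to a protein that is more abundant than predicted from its transcript level, a negative one to a protein that is less abundant.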

Discussion
This study provides an extensive characterization of the proteome of Spiroplasma-infected Drosophila hemolymph. This dual-proteomics approach provides a deep proteome of Spiroplasma in its natural environment and pinpoints host proteins whose abundance is altered by the presence of the bacteria. The power of Drosophila genetics allowed us to test the involvement of these candidate genes in regulating endosymbiosis. A targeted genetic screen revealed that most of the differentially abundant proteins upon Spiroplasma infection do not participate in the control of symbiont growth but may rather be involved in host adaptation to Spiroplasma.
Spiroplasma is devoid of cell wall, hence devoid of peptidoglycan, which is the main immune elicitor for the insect immune system [39]. This led to the assumption that Spiroplasma was undetectable by the canonical innate immune pathways. Furthermore, the elicitation of the fly immune system (by an infection or using genetics) has no negative impact on Spiroplasma titer, suggesting that the bacteria are not only invisible but also resistant to the fly immune effectors [31]. Flies that over-express AMPs and the ΔAMP10 mutant line have normal bacterial titer, which confirms that the host immune system does not affect Spiroplasma growth. However, numerous immune-related proteins (mostly associated with the Toll pathway) were enriched in Spiroplasma-infected hemolymph, including receptors or putative receptors (GNBP1, GNBP-like3), signal transduction intermediates (Spätzle-Processing Enzyme and Hayan) and effectors or putative effectors (Attacin, Bomanin Bc3 and CG33470). The Spiroplasma titer was not altered in several fly lines carrying loss-of-function alleles of these genes, indicating that the immune elicitation is a consequence of the infection but not a control mechanism. It is worth noting that the expression levels of the genes were extremely low as compared to a fully-fledged immune response against pathogenic bacteria [35]. Such low induction of the immune response has been reported in flies experiencing stress, such as cold or heat stress [40,41]. Therefore, the mild induction of immune proteins in infected hemolymph may be an unspecific stress response associated with the presence of bacteria. Another attractive hypothesis would be that proteases released by Spiroplasma could trigger the soluble sensor Persephone and activate the Toll pathway in a peptidoglycan-independent fashion [42,43].
Another finding of this study is the depletion of proteasome components in the hemolymph upon Spiroplasma infection. As the ubiquitin-proteasome system is a major degradation pathway for intracellular proteins [44], the components that we detected in the hemolymph are presumably released from cells that broke upon hemolymph collection (epithelial, fat or muscular cells, but also possibly hemocytes). However, functionally active 20S proteasome units have been discovered circulating in extracellular fluids in mammals, including serum [45], hence we cannot exclude the existence of circulating proteasome units in Drosophila hemolymph. Host ubiquitin-proteasome systems have long been suspected to be a key element in symbiotic homeostasis. The system is upregulated in cells harboring intracellular symbionts in weevils, presumably to increase protein turnover and provide free amino acids to the bacteria [46]. In vitro work on cell cultures infected by the facultative endosymbiont Wolbachia also revealed that it induces the host proteasome, presumably also to support its own growth [47][48][49]. Remarkably, one proteasome subunit was detected as enriched upon Spiroplasma infection in the aphid Aphis citricidus when the insect was feeding on a suboptimum but not on an optimum host plant, suggesting an interaction between symbiosis, nutrition and the proteasome-ubiquitin degradation system [50]. In the case of the Drosophila-Spiroplasma interaction, the depletion of proteasome subunits upon infection could thus be a titer regulation mechanism: by decreasing its proteolysis, the host would limit the release of amino acids in the extracellular space that would be usable by Spiroplasma.
Our approach also produced an in-depth Spiroplasma proteome on total proteins regardless of their cell localization. This gives a quantitatively unbiased overview of each protein's abundance, which was not achieved by previous data based on detergent extractions [30]. Such a quantitative approach allowed us to compare transcript and protein levels for about one fourth of the total number of predicted coding sequences [23]. Although the correlation between transcriptomics and proteomics data greatly varies among models, tissues and experimental set-ups [51], our data indicate a rather low correlation in the case of Spiroplasma. The correlation between transcript and protein levels depends on the interaction between transcript stability and protein stability [52]. It also depends on the intrinsic chemical properties of each transcript that affect ribosome binding or translation speed, for example the Shine-Dalgarno sequence in prokaryotes [53] or, more importantly, the codon adaptation index of the coding sequence [54]. An explanation for the overall lack of correlation between transcripts and proteins regardless of the model could be that selection operates at the protein level, hence changes in mRNA levels would be offset by post-transcriptional mechanisms to maintain stable protein levels over evolutionary times [55]. This hypothesis entails that genes having a low mRNA level compared to their protein levels are likely to be under strong selective pressure to maintain high protein abundance through post-transcriptional regulations. Intriguingly, this is the case for S. poulsonii Spiralin B, a protein homologous to the S. citri Spiralin A, a membrane lectin suspected to be essential for insect to plant transmission [56,57]. Spiralin B has been identified as a putative virulence factor in S. poulsonii as it is upregulated when the bacteria are in the fly hemolymph compared to in vitro culture [23] and could be involved in oocyte infection for vertical transmission. Similarly, the Crr glucose transporter has an unexpectedly high protein/mRNA ratio, possibly in connection with its role in bacterial survival in the hemolymph. Spiroplasma has a pseudogenized transporter for trehalose, the most abundant sugar in the hemolymph [30]. This could have been selected over host-symbiont coevolution to prevent Spiroplasma overgrowth, hence increasing the stability of the interaction by limiting the metabolic cost of harboring the bacteria. Maintaining high Crr levels could thus be an offset response to maintain bacterial proliferation despite trehalose inaccessibility. Last, the ferritin-like protein Ftn also has a bias towards high protein abundance, suggesting that iron could be a key metabolite in Spiroplasma-Drosophila symbiotic homeostasis.
All tissues bathed by hemolymph contribute to its composition. As a consequence, studying the impact of symbionts on this biofluid is unapproachable by transcriptomic methods only. Dual proteomics proved to be an efficient approach to overcome this hurdle and gain novel insights into the Drosophila-Spiroplasma symbiosis. This method is also promising for the study of other symbioses, particularly those where symbionts inhabit biofluids rather than cells.

Spiroplasma and Drosophila stocks
Flies were kept at 25°C on cornmeal medium (35.28 g of cornmeal, 35.28 g of inactivated yeast, 3.72 g of agar, 36 ml of fruit juice, 2.9 ml of propionic acid and 15.9 ml of Moldex for 600 ml of medium). The complete list of fly stocks is available in S4 Table. Spiroplasma poulsonii strain Uganda-1 [58] was used for all infections. Fly stocks were infected as previously described [31]. Briefly, 9 nL of Spiroplasma-infected hemolymph was injected into the thorax of the flies. The progeny of these flies was collected after 5-7 days using male killing as a proxy to assess the infection (100% female progeny). Flies from at least the 3rd generation post-injection were used experimentally.

Hemolymph extraction procedure
Embryos were collected from conventionally reared flies and dechorionated using a previously published bleach-based method [59] to ensure that they were devoid of horizontally transmitted pathogens. One generation was then left to develop in conventional rearing conditions to allow for gut microbiota recontamination. Hemolymph was extracted from 10-day-old flies from the second generation after bleach treatment using a Nanoject II (Drummond) as previously described [16]. 1 μL of hemolymph was collected for each sample and frozen at -80°C until further use. Hemolymph was then diluted 10 times with PBS containing Protease Inhibitor Cocktail 1X (Roche). 1 μL was used for protein quantification with the Pierce BCA Protein Assay Kit (Thermo Fisher). The remaining 9 μL were mixed with SDS (0.2% final), DTT (2.5 mM final) and PMSF (10 μM final). Aliquots of 15 μg were used for proteomics analysis.

Proteomics sample preparation and LC-MS/MS
Sample preparation and data analysis were carried out at the Proteomics Core Facility of EPFL. Each sample was digested by Filter Aided Sample Preparation (FASP) [60] with minor modifications. Dithiothreitol (DTT) was replaced by Tris(2-carboxyethyl)phosphine (TCEP) as reducing agent and iodoacetamide by chloroacetamide as alkylating agent. A proteolytic digestion was performed using Endoproteinase Lys-C and Trypsin. Peptides were desalted on C18 StageTips [61] and dried down by vacuum centrifugation. For LC-MS/MS analysis, peptides were resuspended and separated by reversed-phase chromatography on a Dionex Ultimate 3000 RSLC nanoUPLC system connected in-line with an Orbitrap Q Exactive HF Mass Spectrometer (Thermo Fisher Scientific). Database search was performed using MaxQuant 1.5.1.2 [62] against a concatenated database consisting of the Ensembl Drosophila melanogaster protein database (BDGP5.25) and a custom Spiroplasma poulsonii proteome predicted from the reference genome (Genbank accession number JTLV00000000.2). Carbamidomethylation was set as fixed modification, whereas oxidation (M) and acetylation (Protein N-term) were considered as variable modifications. Label-free quantification was performed by MaxQuant using the standard settings. Hits were considered significant when the fold-change between infected and uninfected hemolymph was >2 or <-2 and the FDR was <0.05.
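The significance criterion stated above can be expressed as a small filter. The sketch below is hedged on one point in particular: the text does not define how negative fold-changes were computed, so the signed convention used here (a ratio below 1 is reported as the negative of its inverse) is an assumption, as are the function names and the toy intensities.

```python
def signed_fold_change(infected, uninfected):
    """Signed fold-change (assumed convention): the ratio itself when the
    protein is enriched, minus the inverse ratio when it is depleted."""
    ratio = infected / uninfected
    return ratio if ratio >= 1 else -1 / ratio


def is_significant(fold_change, fdr, fc_cutoff=2.0, fdr_cutoff=0.05):
    """Hit if fold-change is > 2 or < -2 AND the FDR is below 0.05."""
    return abs(fold_change) > fc_cutoff and fdr < fdr_cutoff


# Toy values (hypothetical): name, infected signal, uninfected signal, FDR.
toy_hits = [
    ("toy_up", 800.0, 200.0, 0.01),     # 4-fold enriched, low FDR
    ("toy_down", 100.0, 400.0, 0.02),   # 4-fold depleted, low FDR
    ("toy_small", 300.0, 200.0, 0.001), # only 1.5-fold, fails the cutoff
    ("toy_noisy", 900.0, 100.0, 0.20),  # 9-fold but FDR too high
]

significant = [
    name
    for name, infected, uninfected, fdr in toy_hits
    if is_significant(signed_fold_change(infected, uninfected), fdr)
]
```

Both conditions must hold at once, so a large fold-change with a poor FDR is rejected just like a well-supported but small change.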
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD024340.

Spiroplasma quantification
Spiroplasma quantification was performed by qPCR as previously described [32]. Briefly, the DNA of pools of 5 whole flies was extracted and the copy number of a single-copy bacterial gene (dnaA or dnaK) was quantified and normalized by that of the host gene rsp17. Primer sequences are available in S4 Table. Each experiment was repeated independently 2 or 3 times with at least 3 biological replicates each. Data were analyzed by one-way ANOVA.

RT-qPCR
Gene expression measurements were performed by RT-qPCR as previously described [63,64]. Briefly, 10 whole flies were crushed and their RNA extracted with the Trizol method. Reverse transcription was carried out using a PrimeScript RT kit (Takara) and a mix of random hexamers and oligo-dTs. qPCR was performed on a QuantStudio 3 (Applied Biosystems) with PowerUp SYBR Green Master Mix using primer sequences available in S4 Table. The expression of the target gene was normalized by that of the housekeeping gene rp49 (rpL32) using the 2^-ΔΔCT method [65].
Each experiment was repeated independently 2 or 3 times with at least 3 biological replicates each. Data were analyzed by Mann-Whitney tests.
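As a worked illustration of the 2^-ΔΔCT normalization mentioned above, the following Python sketch computes a relative expression value from four Ct measurements. The Ct values are invented for the example, the function name is hypothetical, and the calculation relies on the usual assumption of this method that both amplicons amplify with near-100% efficiency.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference gene), computed per condition
    ddCt = dCt(sample) - dCt(control)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_sample - delta_control)


# Toy Ct values (hypothetical): target gene vs. the rp49 reference gene,
# in infected flies (sample) and uninfected flies (control).
fold = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,    # dCt = 6
    ct_target_control=26.0, ct_ref_control=18.0,  # dCt = 8
)
# ddCt = 6 - 8 = -2, so the target is 2^2 = 4-fold higher in the sample.
```

Because a lower Ct means more template, a negative ΔΔCT translates into a relative expression above 1.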