Transcriptional Response of Musca domestica Larvae to Bacterial Infection

The house fly Musca domestica, a cosmopolitan dipteran insect, is a significant vector for human and animal bacterial pathogens, but little is known about its immune response to these pathogens. To address this issue, we inoculated the larvae with a mixture of Escherichia coli and Staphylococcus aureus and profiled the transcriptome 6, 24, and 48 h thereafter. Many genes known to controlling innate immunity in insects were induced following infection, including genes encoding pattern recognition proteins (PGRPs), various components of the Toll and IMD signaling pathways and of the proPO-activating and redox systems, and multiple antimicrobial peptides. Interestingly, we also uncovered a large set of novel immune response genes including two broad-spectrum antimicrobial peptides (muscin and domesticin), which might have evolved to adapt to house-fly's unique ecological environments. Finally, genes mediating oxidative phosphorylation were repressed at 48 h post-infection, suggesting disruption of energy homeostasis and mitochondrial function at the late stages of infection. Collectively, our data reveal dynamic changes in gene expression following bacterial infection in the house fly, paving the way for future in-depth analysis of M. domestica's immune system.


Introduction
Although lacking acquired immune systems, insects have efficient and potent innate immune systems to discriminate and combat foreign invaders successfully [1,2].It is generally acknowledged that the insect immune system involves cellular and humoral immune reactions against microbial infections that maintain close networks with each other and occur first in the epidermis, gut and tracheal respiratory organs and then in the hemocoel [2].One characteristic of insect immunity is rapid activation of immune genes upon microbial infection, which produces effectors such as antimicrobial peptides.Insects have evolved sensitive mechanisms for recognition of pathogens including bacteria, fungi, parasites and viruses, which subsequently trigger cellular immune [3,4] and humoral immune reactions [5,6] via signal transduction pathways.Four signal transduction pathways, Toll, IMD, JNK and JAK/STAT, are regarded as the main pathways regulating the immune response of insects [7].Genes encoding effectors are activated through signaling cascades and a set of these molecules are produced in specific tissues and secreted into the hemolymph [1].
House flies Musca domestica are endemic, and are the carriers of more than 100 harmful pathogens that can have severe consequences for human and animal health.Unfortunately, controlling the human diseases transmitted by house flies has not been successful due to the lack of knowledge of the basic molecular mechanism of this species [8].Adaptation to distinct ecological environments might result in the evolution of specific immunity of house flies.Therefore, comparing the innate immune systems of Musca with those of the species that face different ecological pressures and pathogens such as Drosophila and Anopheles can be very informative, and thus offer clues on how house flies can flourish in close contact with many pathogens [8].
Recently, next generation sequencing technologies, such as the 454 Life Sciences (Roche) pyrosequencing platform, the Illumina Genome Analyzer, and the Applied Biosystems Solid platform provide rapid and high-throughput methods of identifying differentially expressed genes and their expression profiles [9,10].Identification and characterization of the host genetic factors released in response to pathogens is essential for understanding of innate immunity of M. domestica.However, information on the host genes involved in antibacterial defense is still limited.In this study, we performed transcriptome analysis and digital gene expression profile analysis of M. domestica challenged with Escherichia coli and Staphylococcus aureus, using high-throughput sequencing methods (Illumina Solexa Sequencing).The aims of this study were to uncover some information about the house fly immune response and discover new genes involved in bacterial infection in order to better understand the bacteria-host interaction.At the same time, the high-throughput sequencing in this study will identify a large number of transcripts that are comparable to the available transcripts in other species, and provide strong support for the genomic analysis of M. domestica.

Fly maintenance and bacterial challenge experiments
The laboratory colony of M. domestica used in this study was a gift from Miss Fengqin He, Institute of Zoology, Chinese Academy of Sciences.Musca larvae were raised on artificial diet consisting of bran and water until pupariation.After eclosion, adult flies were fed with water, sugar, and milk powder.Specimens at all life stages were kept in a temperature-controlled room at 2561uC, 7065% relative humidity, and a photoperiod cycle of LD12:12.Septic injury was produced by pricking the abdomen of the 2 nd instar larvae with a needle previously dipped into a concentrated mixed bacteria suspension of E. coli and S. aureus [11].The bacterial challenged larvae were maintained at 25uC on fresh medium for 6 h, 24 h, and 48 h before RNA extraction.

RNA isolation
Total RNA was isolated from the following developmental stages: eggs, 1 st instar larvae, 2 nd instar larvae, 3 rd instar larvae, pupae and newly-emerged adults (within 3 days of eclosion) in a 1:1 female:male ratio.RNA was extracted using the RNAiso Plus Kit (TaKaRa) following the manufacturer's instructions.

Construction of the cDNA library and Illumina sequencing for transcriptome analysis
Briefly, 20 mg total RNA (a mixture of RNA from eggs, 1 st instar larvae, 2 nd instar larvae, 3 rd instar larvae, pupae, adults, and bacterial challenged 2 nd instar larvae at equal ratios) was used to construct a cDNA library.Poly (A) mRNA was purified from total RNA using oligo (dT) magnetic beads.Fragmentation buffer was added for resizing mRNA to short fragments.Taking these short fragments as templates, random hexamer-primer was used to synthesize the first-strand cDNA.The second-strand cDNA was synthesized using buffer, dNTPs, RNaseH and DNA polymerase I, respectively.Short fragments were purified with the QiaQuick PCR extraction kit and resolved with EB buffer for end reparation and adding poly (A).After that, the short fragments were connected with sequencing adapters.And, after the agarose gel electrophoresis, the suitable fragments were selected for the PCR amplification as templates.Finally, the library could be sequenced using Illumina HiSeq TM 2000.

Bioinformatics analysis
Transcriptome de novo assembly was carried out with short reads assembling program Trinity [12].Trinity first combines reads with a certain length of overlap to form longer fragments, which are called contigs.The reads are then mapped back to contigs.The paired-end reads enable the software to detect contigs from the same transcript as well as the distances between these contigs.Next, Trinity connects the contigs, and gets the sequences that cannot be further extended on either end.Such sequences are defined as unigenes.The unigenes were lengthened by the increased sequence depth using normalized library with the 454 platform [13].Finally, BLASTx alignment (E-value,10 25 ) between unigenes and protein databases like nr, Swiss-Prot, KEGG and COG (Clusters of Orthologous Groups of proteins) were performed, and the best aligning results were used to decide sequence direction of unigenes.When different databases gave conflicting results, we prioritized them in the following order: nr, then Swiss-Prot, KEGG, and COG.When unigene did not align with any of the entries in these databases, ESTScan [14] was used to predict the unigene's coding regions and to determine its sequence direction.
Candidates of immunity-related genes from the fruit fly were used to query the M. domestica transcriptome.The nucleotide CDS and protein amino acid sequence for each of the identified Drosophila melanogaster immune genes were downloaded from the flybase (http://flybase.org/).The M. domestica transcriptome was searched for homologues to these sequences using CLC Main Workbench 5.5 with the tBLASTn program with a cutoff E-value of 10 25 .Tentative matches were checked by searching the nr NCBI database using BLASTn for gene prediction errors.
We collected the unigenes that are homologous to the important known innate-immunity-relevant genes, and aligned them to the reference gene sequences of D. melanogaster for counting the gene variant numbers.At first, we included the reference gene variants of a certain D. melanogaster gene together with the homologous unigenes from our data to do the alignment, and then we assigned the unigenes to their most homologous reference gene variants respectively.As for those none-full-length unigenes, sometimes we couldn't tell which reference gene variant they belong to because of their lacking of the specific sequence region, so that we assigned them to the first reference gene variant showed in the alignment result.Finally we counted the least variant number of the unigenes under each reference gene variant, and added up the numbers as the least variant gene number of the gene in the house fly.When we counted the least variant number of the unignenes, in order to avoid counting the none-overlapping fragments from the same variant, we only took into account the unigenes that overlap with each other and cover the highest phylogenetic diversity site in the alignment.

Gene expression library preparation and sequencing
Total RNA was extracted separately from immune-challenged larvae at 6, 24 and 48 h post-infection following the manufacturer's instruction, as described above.RNA extracted from nonchallenged larvae at each matching time point was taken as a control.Ten larvae were collected for RNA extraction from each group.Next, a gene expression library was prepared using an Illumina gene expression sample prep kit.Briefly, mRNA was enriched using the oligo (dT) magnetic beads from the total RNA, and cDNAs were synthesized as described above for each sample.The library products were then ready for sequencing analysis via Illumina HiSeq TM 2000 using paired-end technology in a single run.Six libraries from control and challenged groups were constructed.

Analysis and annotation of gene expression tags
The 50 bp length reads were gained through base calling and were filtered to get the clean reads by removing the dirty reads with adaptors, reads in which unknown bases are more than 10%, and low quality reads (the percentage of the low quality bases of quality value #5 is more than 50% in a read).Then, clean reads were mapped to the above transcriptome reference database using SOAPaligner/soap2 [15], allowing no more than 2 mismatches.Gene expression levels were evaluated with RPKM values [16], and the value was substituted by 0.001 if the gene has no expression.NOIseq method that can work in absence of replication was used for gene expression analysis [17].If there was more than one transcript for a given gene, the longest transcript was used to calculate its expression level and coverage.To identify differentially expressed genes between two samples, the false discovery rate (FDR) method was used to determine the threshold of P-value in multiple tests [18].We use ''FDR#0.001and the absolute value of log 2 Ratio $1'' as the threshold to judge the significance of gene expression difference.In the gene expression profiling analysis, Gene Ontology (GO) enrichment analysis of functional significance was applied using the hypergeometric test to map all differentially expressed genes to terms in the GO database, looking for significantly enriched GO terms in differentially expressed genes, and comparing them to the transcriptome database.For the pathway enrichment analysis, we mapped all differentially expressed genes to terms in the KEGG database and looked for significantly enriched metabolic pathways or signal transduction pathways in differentially expressed genes comparing with the whole transcriptome database.Then, all genes expression was subjected to KEGG enrichment analysis compared to the transcriptome background using a hypergeometric test.Annotated KEGG pathways with a Q value#0.05 were considered as significantly enriched terms in differentially expressed genes.

qPCR validation
The accuracy of the genes expression approach has been validated by qPCR analysis.Total RNA was extracted from bacterial challenged and non-challenged larvae as described for gene expression library preparation, and then was reversetranscribed according to the protocol provided with M-MLV reverse transcriptase (Promega, USA).Ten differentially expressed genes were randomly chosen to verify gene expression sequencing results using four replicates.The b-actin gene was used as a constitutive expression control for normalization.The primers are shown in Table S1.qPCR was performed following the abovementioned methods with modified annealing temperature for each pair of primers [19].The relative quantification (comparative method) was calculated using the DDCt method [20].All samples were normalized to the DCt value of a reference gene to obtain a DDCt value (DCt target 2 DCt reference).The final relative expression was calculated using the following formula: F = 2 2DDCt .The data obtained from qPCR were analyzed for statistical significance using Graph-Pad Prism [21].The significance at P, 0.05 was analyzed using one-way ANOVA.The qPCR results were then compared with gene expression data to detect the expression correlation of each gene.

Peptide synthesis and antibacterial assays
The peptides muscin and domesticin, were chemically synthesized using the solid-phase method on the Applied Biosystems model 430A peptide synthesizer by Shanghai mocell biotech Co., Ltd.(Shanghai, China).Synthesized peptides were purified by high performance liquid chromatograph (HPLC), and verified by mass spectrometry and amino acid composition analysis.Each synthesized peptide was diluted with sterile deionized water to different concentrations, and applied directly to antibacterial assays.Twelve bacterial strains used in the tests were a gift from Shunyi Zhu, Institute of Zoology, Chinese Academy of Sciences (Beijing, China).Minimum inhibitory concentrations (MICs) were determined in duplicate by the liquid growth inhibition assay [22].Briefly, 10 ml of each dilution (sterile deionized water as a control) were incubated in sterile microtiter plates (96 wells) with 100 ml of a suspension of midlogarithmic phase culture of bacteria at a starting optical density of OD 600 = 0.001, or with 80 ml of fungal spores (final concentration 10 4 spores/ml) and 10 ml of water.Poor-broth nutrient medium (1% bactotryptone, 0.5% NaCl w/v, pH 7.5) was used for standard bacterial cultures.Anti-fungal assays were performed in potatoes dextrose broth (Difco) at halfstrength supplemented with 10 mg/ml tetracycline.Bacteria were grown during 24 h under vigorous shaking at 30uC.Fungi were grown at 25uC in the dark without shaking for 48 h in a moist chamber.Microbial growth was controlled by measurement of the optical density at D 600 after a-24 h incubation.Inhibition of filamentous fungi growth was observed at microscopic level after 24 h and measured at 600 nm after 48 h.The MIC values are expressed as the range between the highest concentration of the peptide where bacterial growth was observed and the lowest concentration that caused 100% of inhibition bacterial growth [23].

Sequencing and sequence assembly
To obtain the global gene expression profile of the house fly, RNA samples from M. domestica of six developmental stages (eggs, 1 st instar larvae, 2 nd instar larvae, 3 rd instar larvae, pupae, adults) were prepared, equally-mixed and then sequenced by the Illumina platform in a single run which generated 70,335,268 raw reads, 66,049,270 clean reads and 5,944,434,300 total clean nucleotides (Table 1).The average read size, Q20 percentage (sequencing 1% error rate,) and GC percentage were 90 bp, 96.25%, and 53.69%, respectively.These short reads were assembled into 116,687 contigs with a mean length of 319 bp.Then, after gap filling of contigs using paired-end reads from the transcriptome sequencing data of 454, we obtained 47,086 unigenes.The mean size of these unigenes was 757 bp and the N50 was 1,186 bp.Here N50 was the length of the smallest contig in the set that contained the fewest or largest contigs whose combined length represents at least 50% of the assembly.Of these unigenes, 10,933 (23.2%) were larger than 1,000 bp (Figure S1).Due to the short length of the sequencing read, unigenes obtained from Illumina platform were short in general.In present study, the unigenes were lengthened by the increased sequence depth using our previous normalized library with 454 platform [13].

Unigene functional annotation
Unigene sequences were annotated by searching the Nr of NCBI, Swiss-Prot, KEGG and COG protein database using BLASTx with a cut-off E-value of 10 25 .A total of 27,021 distinct sequences (57.39% of unigenes) matched known genes (Table S2).The remaining 20,065 unigenes (42.61%) could not be done and needed more genetic data to annotate.It is noteworthy that a large part of the unigenes in M. domestica transcriptome database is with unknown functions.
The GO categories recovered from Blast2GO analyses were summarized by the proportion of unique sequences annotated in each GO level 2 category (Figure 2).We categorized 7,063 unigenes (26.1% of total) into 48 function groups.Cell, cellular process, cell part, binding and metabolic process were the five largest groups, containing 3,812, 3,751, 3,389, 2,919, and 2,914 unigenes, respectively.
To identify the biological pathways that are active in M. domestica, functional classification and metabolic pathway assignment were performed by the KEGG annotation system.A total of 16,440 unigenes (60.8%) were classified into 239 KEGG pathways.Metabolic pathways contained 2,274 unigenes (13.8%) was significantly larger than other pathways, such as pathways in cancer (619, 3.77%), focal adhesion (565, 3.44%), and regulation of actin cytoskeleton (545, 3.32%).In addition, some immune related pathways, including phagosome, lysosome, ECM-receptor interaction, Fc gamma R-mediated phagocytosis, leuko-cyte transendothelial migration, complement and coagulation cascades, and many signaling transduction pathways such as MAPK, VEGF, JAK-STAT, PPAR, and Toll-like receptor signaling pathway were predicted in the KEGG database (Table S3).
These annotations provide a valuable resource for investigating specific processes, functions and pathways and allow for the identification of novel genes involved in the pathways of immunity.

Annotation of immune-relevant genes
Based on genome-wide analysis, many immune-related genes have been identified from D. melanogaster (265), Anopheles gambiae (304), Apis mellifera (138), and Bombyx mori (220) [2].To gain deep insight into the molecular biology of immune systems in M. domestica, the immune-relevant genes were discovered by referencing to those identified in other insect species.Approximately 279 genes were found to be homologous to known immune-relevant genes (Table 2), including the most important elements of innate immunity, such as pattern recognition receptors, and other immune-related genes (such as PPO, NOS, caspase, dicer2, argonaute2).It is noteworthy that many unigenes obtained by next generation sequencing should be different fragments or even allelic or splice variants of the same  gene [24].In this study, the number of gene sequences predicted to encode immune genes would be therefore overestimated over the actual number of genes belonging to each of the currently characterized gene families.The putative genes uncovered provide the basis for further understanding of the physiological functions of candidate genes in M. domestica immune responses.

Gene expression library sequencing
Having established the house fly transcriptome database, we next determined how bacterial infection altered the transcriptome.We sequenced RNA from 2 nd instar larvae at 6 h, 24 h, 48 h following inoculation with a mixture of E. coli and S. aureus; RNA from uninfected larvae served as the control (designated as M6, M24, M48 and CK6, CK24, CK48, respectively).After filtering the dirty tags, each sample generated 3,510,021,3,672,652 reads.Among these clean reads, 78.51,86.51%were mapped to unigenes in the transcriptome database (Table 3).The percentage of clean reads ranged from 96.78% to 99.29%, reflecting the high quality of the sequencing.About 4,5% of genes were covered between 90,100% in each library.

Effects of bacterial infection on gene expression
The ''FDR#0.001'' and the absolute value of ''log 2 Ratio$1 or #21'' were used as the threshold to identify and compare differentially expressed genes between larvae non-challenged and challenged at different stages.Numbers of differentially expressed genes for each comparison were shown in Figure 3. 572, 3194, and 3544 genes were significantly affected by bacterial infection at 6, 24 and 48 h post-infection, respectively (Table S4), They include genes involved in insect immunity (ECM-receptor interaction, phagosome, complement and coagulation cascades, PPAR signaling pathway) and protein, carbohydrate and lipid metabolism.Besides, genes reflecting mitochondrial oxidative stress and endoplasmic reticulum (ER) stress were up-regulated.Thus, infection caused extensive changes in the M. domestica physiological status.
Importantly, some of these changes are temporal-specific.There were 12, 33, 43 pathways significantly enriched in the 3 comparisons respectively (Q value#0.05).The results are shown in Table S5.For example, at 6 h after bacterial infection, metabolic pathways were preferentially altered, with the genes affected including those controlling fatty acid, galactose and amino acid metabolism, ribosome biogenesis, phagocytosis, neuroactive ligand-receptor interaction, collecting duct acid secretion, terpenoid-quinone biosynthesis and adipocytokine signaling.In contrast, at later stages, particularly at 48 h post-bacterial challenge, genes controlling oxidative phosphorylation were strongly downregulated, suggesting profound mitochondrial dysfunction at these stages.

Differentially expressed genes involved in immune response
Among the genes affected by infection, 509 were presumably involved in innate immune responses (Table S6), including pattern recognition, Toll and IMD signaling, direct antimicrobial defense, proPO-activating cascade and redox, as described below.
PGRP, first discovered in haemolymph of B. mori, binds bacterial peptidoglycan and triggers the prophenoloxidase cascade, culminating in the activation proPO and spa ¨tzle [26,27].PGRPs are considered to be the largest and most versatile family of pattern recognition molecules for microbial products in insects [28].Indeed, we found that there are multiple putative PGRP unigenes in the house fly, each induced at all three time points post-infection, except that Unigene29467 was repressed at 6 h post-challenge.Of note, recent studies in Drosophila indicates that  amidase PGRPs negatively regulates the IMD pathway by degrading PGN [29,30], suggesting some late-expressed PGRPs in house fly may act to dampen immune response.
In contrast, bGRP was hardly responsive to infection, with only one gene (Unigene30021) mildly (1.54x) up-regulated at only one time point (6 h post-challenge).bGRPs were first identified in B. mori and crayfish Pacifastacus leniusculus, for their ability to bind  Total reads b-glucans, and role in b-glucan-induced activation of phenoloxidase [31].These proteins are found only in invertebrates and contain a protein domain that is similar to several bacterial glucanases.Some of the proteins have broader, some narrower defense specificities, and all are associated with the regulation of immune signaling pathways [32].C-type lectins (CTLs) are a large family of PRRs found in almost all metazoans and mainly exert their functions depending on a common structural motif, the carbohydrate recognition domain [33].The members of CTL family are abundant in shrimp and have various functions in innate immunity, including phagocytosis, melanization, respiratory burst, agglutination, antibacterial and anti-viral responses [34].A total of 21 CTL unigenes were identified in the M. domestica transcriptome data, and most of them were massively up-regulated in larvae during various stages of infection.Thus, CLT may be important for antibacterial defense.
Scavenger receptors (SRs) recognize different PAMPs, including LPS, lipoteichoic acid (LTA) and yeast zymosan/b-glucan, and act as phagocytic receptors mediating non-opsonic phagocytosis of pathogens [35,36].We identified 73 unigenes 7 of them upregulated at 48 h (the last stage post-infection) but none at earlier stages, raising the possibility that SRs might function to clean damaged biomolecules and even cell debris at late stage of infection.
Thioester-containing proteins (TEPs) are structurally related to the complement C3/alpha (2)-macroglobulin family [37,38].One of the best characterized TEPs in invertebrate is the mosquito A. gambiae TEP1, where upon bacterial infection, TEP1 was cleaved to release the C-terminal fragment that promotes phagocytosis of bacteria [39].We found 40 putative unigenes, 6 of them upregulated at 6 h and 48 h post-challenge, consistent with their roles in innate immunity.
Antimicrobial peptides.A hallmark of the insect defense is the induction and secretion of antimicrobial peptides that accumulate in the hemolymph where they oppose invading microorganisms [40].Twenty unigenes encoding 4 known antimicrobial peptides were found to be up-regulated drastically in the infected M. domestica larvae at all three time points postbacterial infection, reinforcing their crucial roles in innate immunity (Table S6), including cecropin, attacin, defensin and diptericin.We also found two novel dramatically up-regulated unigenes (Unigene29893 and Unigene13085), which we named muscin (GenBank: KF514663) and domesticin (GenBank: KF514664) (Figure S2 and Figure S3).The deduced peptides, containing putative signal peptides and rich in positively charged and hydrophobic amino acids, were synthesized and their antimicrobial activities tested using liquid growth inhibition assay.The two peptides were both active against all the tested bacteria (Table 4).Muscin and domesticin were most effective against Serratia marcescens and Micrococcus luteus, respectively (MIC 1.25-2.5 and 0.1-0.2 mM), and weaker activities were observed against Salmonella typhimurium (MIC 50-100 and 25-50 mM for muscin and domesticin respectively).However, the peptides lacked any detectable activity against fungi Neurospora crassa even at 100 mM.No homolog of muscin was found through searching the GenBank database,while domesticin shows some similarity to a few peptides of Drosophila, including IM18 peptide, an immuneinduced molecule with unknown function (Figure S4) [41].On the other hand, we failed to identify the homologus genes of gloverin, moricin, lebocin, antimicrobial peptides in B. mori.We predicted that the rapid evolution of antimicrobial peptide genes in insects is likely to be the result of host-pathogen co-evolution, indicating a specific role of antimicrobial peptides against a restricted subset of pathogens.Comparative analysis of gene repertoires of the antimicrobial peptides from different insects suggests that evolution of antimicrobial peptides followed independent scenarios as a result of specific adaptation to distinct ecological environments.
Antimicrobial peptide genes were dominant in up-regulated transcripts at all three time points post-bacterial infection.These results indicated that various antimicrobial peptides could respond rapidly to invaded pathogens and keep high-level expression during all these detected stages.The up-regulation of antimicrobial peptides expression after infection is a common defense strategy by hosts to rapidly destroy invaders.Differential expression patterns among different antimicrobial peptides may represent a combined strategy to form a defense network against diverse microbial pathogens.
Lysozymes, present in many organisms, cleave the b-1, 4glycosidic linkage between N-acetylmuramic acid and N-acetylglucosamine found in certain bacterial cell walls [42].Their functions have been widely described in species that ingest or harbour bacteria throughout their life cycles, such as M. domestica and D. melanogaster, and also characterized as digestive enzymes [42][43][44][45].Other reports have suggested a major role of lysozymes in immune responses to pathogens [42,46].We found that two of the four lysozyme unigenes up-regulated and two other downregulated post-infection, suggesting Musca lysozymes may function as defense molecules as well as digestive enzymes.proPO-activating system.An immediate immune response in insects is the cell-mediated melanization reaction observed at the site of cuticular injury or on the surface of parasites invading the hemocoel.Melanization requires the activation of proPO (prophenoloxidase), an enzyme that catalyzes the oxidation of mono-and diphenols to orthoquinones, which polymerize nonenzy-matically to melanin [40].The proPO-activating system mainly includes many genes such as serine proteinases and their inhibitors (serpins), proPO-activating enzyme (PPA), proPO and its active form, phenoloxidase (PO) [47,48].After stimulated by injury or PAMPs, a serine proteinase cascade is first activated [49], which leads to the cleavage of the pro-form of the prophenoloxidase-activating enzyme (pro-PPA) into active PPA.Then PPA further activates the proPO into the active enzyme, PO, through proteolysis of its propeptide [47].In the present study, many differentially expressed genes were annotated to be tentative members of the proPO-activating system (Table S6).These genes were mainly a kind of serine proteinases, and their inhibitors.The present gene expression data revealed that most members in the serine proteinase cascade and proPO system were responsive to bacterial infection in M. domestica.The expression level of proPO and PPA transcripts were increased significantly at 24 h or 48 h post-bacterial challenge.The expressions of most serine proteases were increased in bacterial-challenged larvae, indicating a positive response of the serine proteinase cascade in the immune defense.However, the profiles of serpins were also up-regulated unexpectedly during this process, which seemed incompatible with their roles in regulation of the proPO-activating system.A similar consequence was shown in shrimp Fenneropenaeus chinensis [50].It might be taken as a negative feedback mechanism to avoid damage of host tissues and cells by excess reactive components generated by PO.
Redox system.Reactive oxygen species (ROS) is referred to as a class of radical or non-radical oxygen-containing molecules that have high oxidative reactivity with lipids, proteins, and nucleic acids [51].Generation of ROS in infected organisms not only helps to kill pathogens but also acts on the host cells themselves, including altering the intracellular redox balance and functioning as signaling molecules involved with the regulation of immunomodulatory genes.ROS production is regarded as an immediate acute-phase oxidative defense in response to pathogen assault or cellular stress such as phagocytosis and melanotic encapsulation [52].Our analysis showed that the transcript levels of some cytochrome P450 and xanthine oxidase genes were strongly induced in the process of bacterial infection, especially at 48 h.Moreover, some unigenes encoding oxidoreductases, such as dehydrogenase/reductase SDR family member 11-like, 15-hydroxyprostaglandin dehydrogenaseor-related dehydrogenase, FAD-NAD binding oxidoreductase, peroxidasin-like, NADH dehydrogenase subunit 5 etc., were up-regulated significantly in this process.However, few unigenes encoding antioxidant enzymes were also seen up-regulations in this study.Previous study has revealed that M. domestica SOD genes could be induced mainly at 72 h post E. coli or S. aureus challenge [19].We suppose that comprehensive up-regulated expressions of antioxidant enzyme genes would appear in the later stage.Furthermore, increasing sequencing depth in the future might increase both the number of immune related genes, and provide more data on their differential expression pre-and post-infection.There must be a fine redox balance maintained by innate immune system, which is critical for insect to survive from the war between pathogen and host.The underlying molecular mechanisms, however, still remain elusive.Our further research will focus on mitochondrion, an organelle which is considered as the central platform for innate immunity [53].
In addition to those known immune-related genes, we are surprised to find that hexamerins are up-regulated strongly at 24 and 48 h post-challenge (Table S4).In general, hexamerin is thought to act as storage protein which is used as a source of amino acids and energy for protein synthesis during metamorphosis.But some reports indicated that insect hexamerin may be involved in innate immunity [54][55][56].As humoral procoagulant, hexamerin can bind to the surfaces of bacteria invaded, and its subunits are confirmed as the major constituent of the aggregates in Drosophila hemolymph [54].So we speculate that hexamerins might be involved in the innate immune response of the house fly.

qPCR
To validate the gene expression result, qPCR analysis was performed using gene-specific primers for ten changed unigenes selected at random.Results are shown in Table 5 and are compared with the gene expression data.Although the qPCR data supports the trends of our gene expression results, the levels of change tend to be different between methods for each gene.We attribute differences in the level of change to the sensitivity of biases occurred between qPCR and gene expression.In general, the results of gene expression profiling are reliable.

Conclusions
In this study, bacteria-challenged M. domestica transcriptome profiles were investigated and the substantial amount of transcripts was recovered, which provided a strong support for the future genomic research on M. domestica, especially on in-depth genome annotation in insects.Universally identified immune-candidate genes, infection markers, and putative signaling pathways were found in M. domestica and especially a considerable amount of immune-relevant genes and pathways in the house fly showed significant similarity to Drosophila, Anopheles, Apis, and Bombyx, suggesting that mechanisms underlying the innate immunity in insects might be conserved in invertebrates.After the bacterial challenge, pattern recognition proteins, especially the PGRPs, significantly increased.Antimicrobial peptides, including ceropin, attacin, diptericin, defensin, were also found increased.Moreover, component genes in Toll and IMD signaling pathways, as well as the genes involved in proPO-activating system and redox system, were strongly induced in the process of bacterial infection.
In addition, a large set of novel immune response genes that have never been linked previously to immune responses in other insects indicated that the immune system of insect might be much more complex than previously reported, and house fly-specific immune events might have happened during evolution as a result of specific adaptation to distinct ecological environments.Particularly, we realized that regulations of whole-body energy and metabolic homeostasis were disrupted, and there was a strong suppression of oxidative phosphorylation during antibacterial defense reaction.We firmly believe that unclear repair and rebalance mechanisms at the later stage of infection should be crucial for insect to survive from the pathogen-host battle.
Two novel antimicrobial peptides, muscin and domesticin, were found in challenged house flies, and they both showed broad spectrum of bactericidal activities.Domesticin is homologous to some peptides of Drosophila, including an immune-induced molecule IM18 peptide, while we failed to identify any known homologous peptide of muscin.This kind of newly found antimicrobial peptides could be a part of the explanation to how the house fly was able to flourish in the septic environments.
Although RNA-seq technology reduces the need of technical replication within our experiments and the fact that we used the NOISeq method can empirically model the noise in count data, the absence of biological replication within our experiments constrains the possibility of more detailed analyses.We feel safe to draw the draft conclusion above out of the multiple points data, including pre-challenge and post-challenge 6 h, 24 h, and 48 h.However, studies with more intensive time intervals and biological replications are needed to provide deep insight into the immunogenetics of the house fly, which also may contribute to a better understanding of the evolutionary history of innate immunity from insects to vertebrates.Table S1 Primers used for qPCR analysis.

Supporting Information
(DOC) Table S2 Top hits obtained by BLASTx for the unigenes. (XLS) Table S3 KO annotation of unigenes. (XLS) Table S4 The data of all the differentially expressed genes. (XLS) Table S5 KEGG enrichment analysis of all the differentially expressed genes.

Figure 3 .
Figure 3. Differentially expressed genes of each library compared with CK.Unigenes changed at transcriptional level following bacterial challenge were identified by filtering of the onefold up-and down-regulated ones with FDR#0.001.doi:10.1371/journal.pone.0104867.g003

Figure 5 *
Figure S1 Length distribution of unigenes.(DOC) Table 5. Comparisons of relative gene expression fold between gene expression data and qPCR results.Name

Table 1 .
Summary of the M. domestica transcriptome.

Table 2 .
Gene counts belonging to immunity-related gene families.

Table 3 .
Statistics of gene expression sequencing.

Table 4 .
Antibacterial activities of muscin and domesticin.MICs are expressed as the interval a-b, where a is the highest concentration tested at which microorganisms are growing and b the lowest concentration that causes 100% growth inhibition. doi:10.1371/journal.pone.0104867.t004