The Evolution and Appearance of C3 Duplications in Fish Originate an Exclusive Teleost c3 Gene Form with Anti-Inflammatory Activity

The complement system acts as a first line of defense and promotes organism homeostasis by modulating the fates of diverse physiological processes. Multiple copies of component genes have been previously identified in fish, suggesting a key role for this system in aquatic organisms. Herein, we confirm the presence of three different previously reported complement c3 genes (c3.1, c3.2, c3.3) and identify five additional c3 genes (c3.4, c3.5, c3.6, c3.7, c3.8) in the zebrafish genome. Additionally, we evaluate the mRNA expression levels of the different c3 genes during ontogeny and in different tissues under steady-state and inflammatory conditions. Furthermore, while reconciling the phylogenetic tree with the fish species tree, we uncovered an event of c3 duplication common to all teleost fishes that gave rise to an exclusive c3 paralog (c3.7 and c3.8). These paralogs showed a distinct ability to regulate neutrophil migration in response to injury compared with the other c3 genes and may play a role in maintaining the balance between inflammatory and homeostatic processes in zebrafish.


Introduction
The zebrafish (Danio rerio) has been increasingly recognized in biomedical research as a valuable model with which to study vertebrate development, hematopoiesis and immunity [1]. As free-living organisms from early embryonic life stages, fish are highly dependent on their innate immune system for survival [2]. Among the diverse group of cells and proteins that comprise innate immunity, the complement system is considered an essential first-line defense mechanism not only in fish but also in other vertebrates and invertebrates [3].
The complement system is recognized as an intricate set of plasma and cell-surface proteins that interact with each other in an organized cascade, leading to system activation and the release of biologically active proteins. The complement system modulates the fates of diverse physiological processes, from inflammation and pathogen opsonization and clearance to hematopoiesis, tissue regeneration and lipid metabolism [4]. In vertebrates, complement can be activated by three distinct pathways, the classical, alternative and lectin pathways, all of which converge at the C3 level with consequent cleavage of the C3 and C5 proteins and the generation of anaphylatoxins and other biologically active fragments.
C3 is an approximately 185-kDa protein that comprises 13 different domains organized into two chains (alpha and beta) that are connected by a disulfide bond [5]. C3 is one of the most abundant proteins in the plasma (approximately 1.6 mg/ml in humans) and, through its diverse domains, can interact with a wide variety of plasma and cellular proteins [6].
A body of evidence from evolutionary genetics studies has indicated the presence of the C3 gene in organisms that existed before the divergence between Cnidaria and Bilateria [7]. Since then, the C3 gene has maintained an evolutionary equilibrium and is highly conserved among species, likely due to its importance in immunity and homeostasis mechanisms [4]. Positive evolutionary pressure on C3 seems to be particularly pronounced in fish, in which C3 gene duplications have been characterized in a variety of species, such as trout [8], common carp [9], medaka fish [10] and zebrafish [11]. Interestingly, previous studies have indicated that the multiple C3 genes present in a single species are associated with the recognition of different fish pathogens, thus enlarging the spectrum of pathogen-associated molecular patterns (PAMPs) that the complement system can recognize and respond to [8].
Herein, we confirm the presence of three different c3 genes and identify five additional c3 genes in the zebrafish genome. Furthermore, we propose that an early duplication event occurred at the base of the teleost fish clade. Most importantly, we show the differential abilities of C3 paralogs to regulate cytokine production and neutrophil migration upon injury, indicating a dual role for complement in the inflammation/regeneration processes in zebrafish.

Phylogenetic analysis
An exhaustive BLAST search [12] was performed against the Danio rerio full genome (version Zv9) with the available human C3 and zebrafish c3 sequences that were retrieved from the public NCBI nucleotide database (http://www.ncbi.nlm.nih.gov/nucleotide). Similarities and identities between the corresponding protein sequences were calculated with MatGAT 2.02 [13]. Structural characterizations were investigated with the NCBI online Conserved Domain Database (CDD) [14].
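To make the identity and similarity percentages concrete, the following minimal Python sketch computes both for two sequences that are already aligned. It is not MatGAT's algorithm: MatGAT performs its own global alignments and scores similarity with a substitution matrix. The toy sequences and the small table of "similar" residue groups are assumptions made purely for illustration.

```python
# Hypothetical groups of biochemically similar residues (illustrative only;
# a real tool would derive similarity from a substitution matrix).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def is_similar(a, b):
    """True if two residues are identical or share a toy similarity group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def identity_and_similarity(aln_a, aln_b):
    """Percent identity and similarity over the full alignment length."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    ident = simil = 0
    for a, b in zip(aln_a, aln_b):
        if a == "-" or b == "-":
            continue  # gap columns add to the length but never to the matches
        if a == b:
            ident += 1
        if is_similar(a, b):
            simil += 1
    n = len(aln_a)
    return 100.0 * ident / n, 100.0 * simil / n


# Toy aligned pair (made up): 3 identical and 6 similar columns out of 7.
ident, simil = identity_and_similarity("MKLV-DE", "MRLIADD")
print(f"identity = {ident:.1f}%, similarity = {simil:.1f}%")
```

Dividing by the alignment length including gap columns is one of several conventions; other tools divide by the shorter sequence length, which gives slightly different percentages.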
Additional fish C3 and C5 protein sequences were retrieved from published genomes with the Ensembl Genome Browser, version 69 [15] (Table 1). The sequence alignment was performed with the MAFFT online server according to the E-INS-i strategy [16]. Ambiguously aligned columns were pruned with Gblocks 0.91b [17]. The best-fit model of amino acid replacement was selected according to the Akaike Information Criterion (AIC) [18] with ProtTest 3.2 [19]. The c3-c5 gene family tree was estimated with jPrime 0.2.0 [20], in which 4 independent MCMC runs, each consisting of 1,000,000 iterations, were sampled once every 200 iterations. After discarding the first 500 samples for each run as burn-in, the final gene tree was obtained as a weighted consensus majority-rule tree from the 4 runs with MrBayes 3.2.1 [21,22].
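The MCMC settings above fix how many tree samples enter the consensus. The short Python sketch below works through that arithmetic; it assumes that only the samples taken every 200 iterations are stored and that the initial state is not counted as a sample, neither of which is stated in the text.

```python
def retained_samples(iterations, thinning, burn_in_samples, runs):
    """Return (samples per run, kept per run, kept over all runs)."""
    per_run = iterations // thinning
    kept = per_run - burn_in_samples
    return per_run, kept, kept * runs


# Values taken from the description of the jPrime runs.
per_run, kept, total = retained_samples(1_000_000, 200, 500, 4)
print(per_run, kept, total)  # 5000 4500 18000
```

Under these assumptions the burn-in corresponds to the first 10% of each run, and the consensus tree is built from 18,000 sampled trees.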
To identify gene duplication and loss events during the evolution of the c3-c5 gene family, a reconciliation [23,24] of the c3-c5 tree with fish phylogeny was performed. Although the evolutionary relationships among fish species are not well resolved [25], in this study, a species tree coherent with the accepted taxonomic relationships among teleost species was selected (Figure S1). The divergence times among fish species were retrieved from the TimeTree database [26]. The most parsimonious reconciliation of the estimated gene tree and the species tree was performed with Notung 2.6 [27,28] and represented with FigTree v1.3.1 (http://tree.bio.ed.ac.uk/software/figtree/) and PrimeTV [29]. Synteny was investigated with Genomicus v69.01 [30].
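The idea behind a parsimony reconciliation can be sketched with the classic lowest-common-ancestor (LCA) mapping: each gene-tree node is mapped to the LCA of the species found below it, and a node is called a duplication when it maps to the same species-tree node as one of its children. The Python sketch below is an illustration only, not Notung's implementation; the three-species tree, the node names and the toy gene tree are hypothetical, and gene losses are not counted.

```python
# Toy species tree as child -> parent (hypothetical; the root maps to None).
SPECIES_PARENT = {
    "zebrafish": "teleosts",
    "medaka": "teleosts",
    "teleosts": "root",
    "lamprey": "root",
    "root": None,
}


def ancestors(node):
    """Path from a species-tree node up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = SPECIES_PARENT[node]
    return path


def lca(a, b):
    """Lowest common ancestor of two species-tree nodes."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def reconcile(gene_tree, duplications):
    """Map a gene-tree node onto the species tree and record duplications.

    A leaf is a species name; an internal node is a (left, right) tuple.
    """
    if isinstance(gene_tree, str):
        return gene_tree
    left, right = gene_tree
    m_left = reconcile(left, duplications)
    m_right = reconcile(right, duplications)
    mapped = lca(m_left, m_right)
    if mapped in (m_left, m_right):
        duplications.append(mapped)  # both copies exist within the same lineage
    return mapped


# Two gene copies, each present in both teleosts, plus a lamprey outgroup.
gene_tree = ((("zebrafish", "medaka"), ("zebrafish", "medaka")), "lamprey")
dups = []
root_mapping = reconcile(gene_tree, dups)
print(dups)  # ['teleosts']
```

In this toy case the single inferred duplication sits at the base of the teleost clade, which is the kind of event the reconciliation is meant to detect.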

Animals
Fish care and challenge experiments were conducted according to the CSIC National Committee on Bioethics guidelines under approval number ID 01_09032012.
The wild-type and Tg(mpx:GFP) [31] zebrafish used in this study were obtained from our experimental facility, where they were cultured according to established protocols [32,33]. Tg(mpx:GFP) fish were kindly provided by S. Renshaw (University of Sheffield).

Expression of c3 genes
To evaluate the mRNA expression levels of the c3 paralogs, total RNA was extracted from several organs collected from naïve wild type adult zebrafish, including the spleen, kidney, liver, intestines, gills, heart, muscle, tail and brain. Samples from 12 fish were pooled to yield 3 biological replicates of 4 individuals per pool.
To determine the expression levels of the different genes during zebrafish ontogeny, wild-type zebrafish larvae were sampled at the following different times post-fertilization (pf): 3 hpf, 6 hpf, 1 dpf, 2 dpf, and 3 dpf and at 3-day intervals from 5 to 29 dpf. Due to differences in animal size, 10-15 animals were necessary to yield biological replicates from 3 hpf to 14 dpf, whereas only 6-8 individuals from 17 to 29 dpf were used for biological replicates.
Total RNA was isolated from 3 biological replicates per sampling point.
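The ontogeny sampling schedule can be enumerated explicitly. The Python sketch below lists the time points implied by the description and counts the resulting RNA samples; it assumes that each biological replicate yields exactly one pooled RNA preparation, which the text does not state.

```python
# Early time points in hours post-fertilization, later ones in days.
early_hpf = [3, 6]
early_dpf = [1, 2, 3]
late_dpf = list(range(5, 30, 3))  # 3-day intervals from 5 to 29 dpf

replicates_per_point = 3
n_points = len(early_hpf) + len(early_dpf) + len(late_dpf)
n_rna_samples = n_points * replicates_per_point

print(late_dpf)                 # [5, 8, 11, 14, 17, 20, 23, 26, 29]
print(n_points, n_rna_samples)  # 14 42
```

The enumerated days include both 14 and 17 dpf, matching the boundary between the two pool sizes described above.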
Furthermore, the post-stimulation expression patterns were analyzed. Adult zebrafish (n = 36) were injected intraperitoneally with 10 µL of 1 mg/mL lipopolysaccharide (LPS) to mimic a bacterial infection. Additional fish (n = 36) were injected with PBS and used as controls. At 3, 6 and 24 h post-stimulation, selected organs (spleen, kidney and liver) were sampled and pooled from 12 fish to yield 3 biological replicates of 4 fish per sampling point per organ.

Quantitative PCR gene expression analysis
Total RNA isolation was performed for both adults and larvae with the Maxwell 16 LEV simplyRNA Purification kit (Promega, UK). Next, 500 ng of total RNA were used to obtain cDNA with the SuperScript II first-strand synthesis kit and random primers (300 ng/µL; Life Technologies).
Primer efficiency was calculated from the slope of the cycle threshold (Ct) regression line versus the relative cDNA concentrations in serial 5-fold dilutions. A melting curve analysis was also performed to verify that no primer dimers were amplified.
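The text does not give the formula used, but the conventional relation between the standard-curve slope and the amplification factor is E = 10^(-1/slope), where the slope comes from regressing Ct on log10 of the relative template concentration. The Python sketch below applies that relation to a 5-fold dilution series; the Ct values are hypothetical and chosen to represent an ideal reaction.

```python
import math


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def primer_efficiency(ct_values, dilution_factor=5.0):
    """Amplification factor E = 10^(-1/slope) from a serial dilution series.

    ct_values[i] is the Ct measured after i successive dilutions.
    """
    log_conc = [-i * math.log10(dilution_factor) for i in range(len(ct_values))]
    slope = fit_slope(log_conc, ct_values)
    return 10 ** (-1.0 / slope)


# Hypothetical Ct values for a perfectly doubling reaction: each 5-fold
# dilution delays the threshold by log2(5) cycles.
ideal_ct = [20.0 + i * math.log2(5) for i in range(5)]
E = primer_efficiency(ideal_ct)
print(round(E, 3))  # 2.0, i.e. 100% efficiency
```

An amplification factor of 2 corresponds to a slope of about -3.32; real primer pairs typically give values somewhat below 2.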
Each reaction was performed in 25 µL of reaction mix that comprised 1 µL of 2-fold diluted cDNA template, 0.5 µL of each primer (both at a final concentration of 10 µM; sequences shown in Table 2), 12.5 µL of Brilliant II SyBR Green qPCR Master Mix (Agilent Technologies) and 10.5 µL of pure water. Technical triplicates were performed for each reaction. The cycling conditions were as follows: 95 °C for 10 min, 40 cycles of 95 °C for 15 s and 60 °C for 1 min, and a final dissociation stage of 95 °C for 20 s, 60 °C for 20 s and 95 °C for 20 s. For normalization purposes, the elongation factor 1-alpha (ef1a) gene expression levels were analyzed in each sample as a housekeeping gene control according to the Pfaffl method [35]. Additionally, il1b mRNA expression was analyzed in the 3 h samples from LPS-stimulated fish and PBS-injected controls to confirm the inflammatory state after stimulation.
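For reference, the Pfaffl method expresses the relative expression of a target gene as ratio = E_target^(dCt_target) / E_ref^(dCt_ref), where each dCt is the control Ct minus the treated Ct and E is the primer amplification factor. The Python sketch below implements this formula; the function name and all numeric values are hypothetical and are not data from this study.

```python
def pfaffl_ratio(e_target, ct_target_control, ct_target_treated,
                 e_ref, ct_ref_control, ct_ref_treated):
    """Efficiency-corrected relative expression (treated versus control)."""
    d_target = ct_target_control - ct_target_treated
    d_ref = ct_ref_control - ct_ref_treated
    return (e_target ** d_target) / (e_ref ** d_ref)


# Hypothetical example: the target Ct drops by 2 cycles after treatment while
# the reference gene (e.g. ef1a) is unchanged, with both efficiencies at 2.0.
ratio = pfaffl_ratio(2.0, 25.0, 23.0, 2.0, 18.0, 18.0)
print(ratio)  # 4.0
```

When both amplification factors equal 2, the expression reduces to the familiar 2^(-ddCt) calculation; the efficiency terms matter when the primer pairs amplify at different rates.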

Gene knockdown studies
For gene inhibition studies, four different morpholinos (Gene Tools) were designed: two translation-blocking morpholinos (MO-ATG-c3u and MO-ATG-c3.7/8) and two splice-site blocking morpholinos (MO-c3.1s and MO-c3.7/8s). ATG morpholinos were designed according to the first 25 bases of the sequence. Because of their high sequence identity, a common ATG morpholino was designed to block the genes c3.1, c3.2/3 and c3.6 (MO-ATG-c3u: CAGAGAGAAACAGCAGCTTCACCAT), while a second ATG morpholino (MO-ATG-c3.7/8: CCCATAACAGCAGCTGAAGAAGCAT) was designed to inhibit c3.7 and c3.8. Splice-site blocking morpholinos were designed according to the Ensembl gene exon data (MO-c3.1s: CCAGCTTCTCACCCAGTGTTGCCGT; MO-c3.7/8s: TTCCGACTTACCGAGCTGATCTCT). A standard control oligo (Gene Tools) was used as a non-specific control. Intra-yolk microinjections of 1 nL of morpholino solution were administered to fresh single-cell stage WT embryos. The efficacy of the splice-site blocking morpholinos was confirmed by gel electrophoresis of PCR products amplified with the primers shown in Table 2. Both c3.1-inhibiting morpholinos produced a dose-dependent increase in developmental defects that led to higher lethality when injected at concentrations greater than 0.5 mM; thus, 0.4 mM was used in the study. A 1 mM solution of MO-ATG-c3.7/8 and a 0.5 mM solution of MO-c3.7/8s were used. For the c3.1 phenotype rescue, the c3.1 ORF was produced by PCR and capped mRNA was synthesized using the mMessage mMachine Kit (Ambion) following the manufacturer's protocol.
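Because morpholinos are antisense oligomers, reverse-complementing a translation-blocking morpholino should recover the sense-strand target, which is expected to begin at the start codon. The Python sketch below performs this check on the two ATG morpholino sequences given above, written without internal hyphens; the helper name is an illustrative choice.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Translation-blocking morpholino sequences as listed in the text.
MORPHOLINOS = {
    "MO-ATG-c3u": "CAGAGAGAAACAGCAGCTTCACCAT",
    "MO-ATG-c3.7/8": "CCCATAACAGCAGCTGAAGAAGCAT",
}

targets = {name: reverse_complement(seq) for name, seq in MORPHOLINOS.items()}
for name, target in targets.items():
    print(name, target, len(target), target.startswith("ATG"))
```

Both recovered targets are 25 bases long and start with ATG, consistent with a design based on the first 25 bases of the coding sequence.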

Neutrophil migration studies
For neutrophil migration studies, morpholino microinjections were performed in Tg(mpx:GFP) zebrafish embryos. At 3 dpf, after larval hatching, the tails were cut with a razor. At 24 h after amputation, neutrophil migration to the regenerating tissues was observed under an AZ100 microscope (Nikon) and photographed with a DS-Fi1 digital camera (Nikon). The relative fluorescence intensities in the tails, resulting from the GFP expressed in neutrophils under the myeloid-specific peroxidase promoter, were measured with ImageJ software [36].
Confocal images of live larvae at 3 h post fin transection were captured using a TCS SPE confocal microscope (Leica). The images were processed using the LAS-AF (Leica) and ImageJ software.

Aeromonas hydrophila infection
To evaluate the effect of c3.7/8 inhibition on inflammation, 3 dpf larvae microinjected with the MO-c3.7/8s morpholino were infected by bath with a concentration of 3 × 10^7 A. hydrophila. At 3 h post infection, total RNA from 3 biological replicates of 10 individuals was extracted and analyzed by qPCR.

In situ hybridization
Sense and antisense RNA probes were designed according to the previously used qPCR primers. The probes were produced with the PCR amplification method under standard PCR conditions (35 cycles, 60 °C annealing temperature). The sense probe incorporated the necessary promoter sequence for labeling purposes (SP6 promoter, ACGATTTAGGTGACACTATAGAA), while the antisense probe incorporated the T7 promoter (AGTTAATACGACTCACTATAGGGATT). RNA probes were prepared with the DIG-RNA Labeling Kit (SP6/T7) (Roche) according to the manufacturer's instructions. Whole-mount in situ hybridization (ISH) was performed on 3 dpf zebrafish embryos essentially as reported by Thisse and Thisse [37]. Stained embryos were cleared in 100% glycerol, observed under an AZ100 microscope (Nikon) and photographed with a DS-Fi1 digital camera (Nikon).

Results
Sequence, genomic organization and phylogeny of zebrafish c3
Previous screening of the zebrafish genomic PAC library identified three loci for complement c3, namely, c3.1, c3.2 and c3.3 [11]. We sought to extend this analysis by performing a genome-wide BLAST search. Our analysis not only confirmed the presence of genes coding for the c3.1, c3.2 and c3.3 variants in zebrafish chromosome 1 but further identified three putative c3 genes in chromosome 1 and two in chromosome 22, which were named c3.4, c3.5, c3.6, c3.7 and c3.8 (Table 1).
The predicted proteins (C3.1 to C3.6) coded by the c3 genes located in chromosome 1 showed an overall identity above 55% and a similarity above 70%. In contrast, they only shared 35% identity and 57% similarity with the putative C3 proteins encoded on chromosome 22 (Table 3). Furthermore, the similarity between human C3 and zebrafish C3.1 to C3.6 was approximately 43%, whereas a slightly lower similarity was observed between the human C3 and zebrafish C3.7 and C3.8 sequences (Table 3). Due to their high similarity, we analyzed c3.2 and c3.3 as a group (c3.2/3). The same strategy was adopted for c3.7 and c3.8 (c3.7/8). Additionally, similar conserved domains, such as MG1, A2M, C345C and anaphylatoxin (ANATO), were observed in both human C3 and zebrafish C3, according to the CDD. The MG1 and ANATO protein domains were not initially predicted in the C3.8 protein sequence, but partial cDNA amplification and subsequent sequencing confirmed the presence of the latter domain (data not shown).
The unreconciled c3 gene tree was completely resolved, although not all clades were recovered with high confidence (Figure S2). Note, however, that the separation of c3.7/8 from the other c3 sequences was highly supported. Interestingly, the reconciled gene tree revealed a highly dynamic gene family with as many as 21 duplications and 10 inferred losses (Figure 1A). In particular, we identified a putative early gene duplication event at the base of the Teleostei clade that separated the zebrafish c3.7 and c3.8 and their orthologs from the rest of the fish c3 genes. Indeed, the c3.7/8 genes seem to be specific to the teleost lineage because a search for orthologs of these genes in non-teleost fish (such as Latimeria chalumnae or Petromyzon marinus) and other vertebrate genomes returned no results.

Differential expression of the zebrafish c3 genes
Next, we evaluated the mRNA expression levels of each c3 gene in the different zebrafish tissues. All genes were constitutively expressed in the different tested tissues, except for c3.7/8, which was not detected in the heart and muscle (Figure 2). Although the expression profile varied in an organ-dependent manner, overall, c3.1 was the most expressed c3 gene, followed by c3.6. While c3.1, c3.2/3, c3.6 and c3.7/8 followed a similar pattern with predominant expression in the spleen and liver, c3.4 and c3.5 were mainly expressed in the kidney and intestine. The c3.7/8 expression was the lowest, accounting for approximately 1% of the total c3 expression (Figure 2). During ontogeny, all of the different genes were expressed through larval development with the exception of c3.2/3, which was first detected at 48 h of development (Figure 3A). As in the adult tissues, c3.1 and c3.6 were the most highly expressed sequences in the larval stages (Figure 3A). Notably, c3.4 and c3.5 expression seemed to be predominant in the adult stage, independent of the tissue. In contrast, c3.1, c3.2/3, c3.6 and c3.7/8 expression was more prominent during the larval stages in the kidney, intestine and muscle tissues (Figure 3B), possibly indicating a role for these genes in the kidney, intestine and muscle during development. Furthermore, a zebrafish whole-mount in situ hybridization confirmed the primary hepatic expression of c3.1 during development (Figure 3C).
To investigate whether the induction of expression differed among the c3 genes, zebrafish were stimulated with LPS, and the c3 expression levels in the spleen, kidney and liver were subsequently evaluated over a period of 24 h post-stimulation. While an overall increase in the splenic c3 expression levels was observed upon LPS stimulation (Figure 4), no significant alterations in c3 expression were found in the kidney or liver (data not shown). The c3.1, c3.2/3 and c3.6 expression levels peaked at 6 h and demonstrated a significant 3- to 4-fold increase relative to the PBS control. In contrast, splenic c3.4, c3.5 and c3.7/8 responded early to the treatment with an expression peak at 3 h post-stimulation (Figure 4). In addition to the increased c3 expression, il1b mRNA levels were also increased (22-fold in the spleen after 3 h of LPS stimulation), thus confirming the induction of a pro-inflammatory state by LPS (data not shown).
Zebrafish c3 genes possess differential inflammatory roles
Because a pro-inflammatory phenotype in zebrafish was associated with the increased expression of all genes (Figure 4), we investigated how the different c3 genes affected the migratory abilities of neutrophils in response to tail amputation. To this end, tails of GFP-transgenic [Tg(mpx:GFP)] zebrafish were amputated in individuals in which the c3.1-2/3-6 or c3.7/8 genes had been inhibited with the MO-ATG-c3u and MO-ATG-c3.7/8 morpholinos, respectively. At 24 h after the tail injuries, the numbers of migrating neutrophils were estimated by fluorescence microscopy (Figure 5A, B). Interestingly, while inhibition of c3.1-2/3-6 resulted in a 2-fold decrease in neutrophil migration to the injured zone, the inhibition of c3.7/8 had the opposite effect, with a 2-fold increase in the number of migrating neutrophils in the damaged tissue, suggesting that the different c3 genes have opposite roles during the inflammation process in zebrafish.
These results for c3.1 and c3.7/8 inhibition were confirmed using splice-site blocking morpholinos (Figure 6). Similar to MO-ATG-c3u, MO-c3.1s affected development in a concentration-dependent manner. In this case, however, stronger effects were observed, since the minimal concentration that successfully blocked c3.1 expression affected the larval phenotype. Consequently, it was not possible to determine its effects on neutrophil migration. The co-injection of the morpholino with c3.1 capped mRNA successfully rescued the aberrant phenotypes (Figure 6A, B).
The increased inflammatory state observed upon c3.7/8 inhibition with the ATG morpholino was also confirmed with the splice-site blocking morpholino MO-c3.7/8s. Neutrophil migration after tail clipping was higher in the MO-c3.7/8s-injected larvae (Figure 6C). Unfortunately, in this case the synthesis of capped RNA was not successful, probably due to the length of the gene. Moreover, il1b expression was found to be significantly higher in MO-c3.7/8s microinjected larvae than in WT larvae 3 h after stimulation by A. hydrophila bath infection (Figure 6D).

Discussion
Herein, we characterized eight genes that code for the C3 protein in zebrafish (c3.1 to c3.8). The first three genes, c3.1, c3.2 and c3.3, also known as the c3a, c3b and c3c genes, have been previously identified [11]. The c3.4 to c3.8 sequences, however, were present in zebrafish online databases such as NCBI and Ensembl but lacked proper characterization. Notably, multiple forms of complement components, such as C3, C5, C7, factor B and properdin, have been identified in lower vertebrates [8,38-40], raising the hypothesis that this remarkable diversity has allowed these animals to expand their innate capacities for immune recognition and response [8].
The phylogenetic analysis of the different c3 and c5 genes suggests a high genomic dynamism in Teleostei, as multiple copies of the c3 gene were observed in all analyzed fish genomes (Figure 1A). In contrast, the evolution of the c5 gene family was much more static. Here, the absence of the c5 gene in lampreys correlates well with the hypothesis that this gene appeared in the jawed fish lineage [41]. Thus, P. marinus c3, instead of the c3/c5 duplication event, was used to root the phylogenetic tree.
Our analysis also indicates that the c3.1 to c3.6 genes in zebrafish and the majority of the c3 genes in medaka, stickleback and cod resulted from intraspecific duplications of a unique ancestral c3 gene. Additionally, a particular c3 gene duplication that was conserved across the analyzed genomes appeared in the teleost lineage, resulting in the c3.7 and c3.8 genes and their orthologs. This particular c3 paralog seems to be specific to teleosts, as indicated by the fact that we could not locate orthologs of this gene in non-teleost fish (coelacanth and lamprey) or other vertebrate genomes.
Figure 1 (caption fragment): ... across the teleostei species tree. The position and direction of the genes ccdx130, gtf3a, pspn, wdr83 and wdr83os were highly conserved. However, this gene region was inverted in the current platyfish scaffold assembly. doi:10.1371/journal.pone.0099673.g001
The c3.1 to c3.6 genes are found in tandem in zebrafish chromosome 1. This particular gene order could indicate that these genes are the products of specific, segmental duplications and not the consequence of whole genome duplications and subsequent rearrangements, a hypothesis that agrees with the inferred gene phylogeny. The conserved synteny between the zebrafish chromosome 1 c3 genes and human C3 indicates that those genes are likely orthologs and are therefore expected to retain equivalent functions [42]. In contrast, c3.7 and c3.8 are found in chromosome 22 and, although we did not specifically study their origins, could possibly have emerged during the teleost-specific genome duplication. Regardless, we can safely state that the c3.7/8 orthologs demonstrate a different evolutionary pattern from the other c3 duplications.
It is necessary to indicate that we worked with draft genomes, for which the assemblies and annotations are incomplete. This is important to remember when drawing conclusions from the analyses. For example, the disappearance of the c5 gene in Atlantic cod is likely an artifact of an unfinished genome assembly rather than a real gene loss. Additionally, c3.5 appears as two distinct gene products in the current zebrafish genome and c3.8 was not correctly predicted, thus positioning the anaphylatoxin domain inside a non-existent intron.
Gene expression analysis revealed two different expression patterns in the studied tissues, which included the spleen, liver, kidney, intestine, gills, heart, brain, tail and muscle. c3.1, c3.2/3, c3.6 and c3.7/8 were primarily expressed in the spleen and liver. In contrast, c3.4 and c3.5 showed higher expression in the kidney and the intestine. This expression pattern contrasts with that observed in mammals, in which complement factors are mainly produced in the liver. However, high extrahepatic c3.2 and c3.3 expression was also observed in fish in response to Poly I:C [43] and viruses [44], suggesting the local production of innate immune proteins in response to infection. As expected, the c3 genes were expressed early and at high levels during ontogeny, when adaptive immunity is not yet developed. A similar pattern has been reported in other fish species, such as the Indian major carp, Atlantic cod, spotted wolffish and Atlantic salmon [45-48]. This early expression was primarily located in the liver as determined by in situ hybridization of c3.1 on 5 dpf zebrafish larvae, in agreement with the data deposited in the ZFIN database [49]. c3.2/3 did not follow this early expression pattern and was not detected until 2 dpf, supporting previous findings of low zebrafish c3.2/3 expression before hatching [50].
In addition to constitutive expression, LPS-induced expression also revealed the differential regulation of c3 in the different organs. While the splenic expression levels of c3.1, c3.2/3 and c3.6 reached the maximum increase of 3- to 4-fold at 6 h post-stimulation, ... [51]. The paralog c3.7/8 only showed an early increase in expression after LPS treatment. In agreement with the differential expression, we also detected differential abilities of the c3 paralogs to modulate the developmental and inflammatory processes.
On the one hand, a strong inhibition of c3.1 using morpholinos resulted in an increased percentage of aberrant phenotypes, such as an increased presence of edemas and difficulties in yolk sac resorption, which resulted in non-hatched individuals. Co-injection of c3.1 mRNA successfully rescued the phenotype in a dose-dependent manner. The c3.1 anaphylatoxin fragment (C3.1a) has already been shown to control mutual cell attraction during chemotaxis [52] and has been correlated with the tissue regeneration process in zebrafish [53] as well as other species [54,55]. Furthermore, a mild, partial inhibition of c3.1, c3.2/3 and c3.6 resulted in diminished neutrophil migration to the injury site.
On the other hand, inhibition of c3.7/8 did not affect zebrafish development, but significantly altered the magnitude of the response after inflammatory stimuli. Fish microinjected with c3.7/8 morpholinos showed a marked increase in the expression of the pro-inflammatory cytokine il1b 3 h after A. hydrophila infection as well as massive neutrophil migration to the regenerating tissue after tail clipping at both 3 and 24 h. This suggests that c3.7/8 plays an important role in complement regulation and inflammation modulation. In summary, our results show that c3.7/8 is a paralog c3 gene found exclusively in teleost fish; this paralog has the same structure as classical C3 but might regulate inflammatory responses to maintain an optimal equilibrium between reactions against external stimuli and protection against cell damage.
Figure S1. The species tree used in this study was coherent with the current taxonomic information.