Characterization of a Gene Coding for the Complement System Component FB from Loxosceles laeta Spider Venom Glands

The human complement system is composed of more than 30 proteins, and many of these have conserved domains that allow their phylogenetic evolution to be traced. The complement system seems to have originated with the appearance of C3 and factor B (FB), the only components found in some protostomes and cnidarians, suggesting that the alternative pathway is the most ancient. Here, we present the characterization of an arachnid homologue of the human complement component FB from the spider Loxosceles laeta. This homologue, named Lox-FB, was identified from a total RNA L. laeta spider venom gland library and was amplified using RACE-PCR techniques and specific primers. Analysis of the deduced amino acid sequence and the domain structure showed significant similarity to the vertebrate and invertebrate FB/C2 family proteins. Lox-FB has a classical domain organization composed of a complement control protein domain (CCP), a von Willebrand Factor domain (vWFA), and a serine protease domain (SP). The amino acids involved in the Mg2+ metal ion-dependent adhesion site (MIDAS) found in the vWFA domain of the vertebrate C2/FB proteins are well conserved; however, the classic catalytic triad present in the serine protease domain is not conserved in Lox-FB. Similarity and phylogenetic analyses indicated that Lox-FB shares its highest identity (43%) and has a close evolutionary relationship with the third isoform of FB-like protein (FB-3) from the jumping spider Hasarius adansoni, which belongs to the family Salticidae.


Introduction
During evolution, two systems of immunity have arisen: innate and adaptive. The innate immune system is the oldest and is found in all multicellular organisms, while the adaptive immune system, which emerged about 450 million years ago, is present only in vertebrates, except for the Agnatha [1,2]. The complement system, in mammals, plays an important role in both the innate and adaptive immune systems and is composed of more than 30 serum and cell-surface components that participate in the recognition and clearance of invading pathogens. The activation of the complement system can occur by three pathways: classical, lectin and orthologues of C2 and FB genes, and probably have arisen before divergence of Cnidaria/Bilateria [2].
When we elucidated the transcriptome of the Loxosceles laeta spider venom gland, in addition to finding the expression profile of the Sphingomyelinases D, the major proteins responsible for the envenomation, other EST sequences with similarity to C3 and FB-like genes, from invertebrate organisms, were identified [13]. These findings suggest that the central components of the complement system could be expressed in the venom gland of the Loxosceles spiders. Thus, the present work aimed to clone and characterize the FB complement component from Loxosceles laeta venom gland and phylogenetically analyze its deduced amino acid sequence.

Material and Methods
Loxosceles spiders and isolation of RNA
Loxosceles laeta spiders were collected in Campo Alegre, Santa Catarina, Brazil and kept at the Immunochemistry Laboratory of Butantan Institute, São Paulo, Brazil. Eighty L. laeta female spiders were subjected to food restriction to stimulate the production of mRNA in the venom glands. After 5 days, the venom glands were collected and frozen at -80°C until use. For total RNA extraction, the Trizol reagent was used following the manufacturer's instructions (Gibco-BRL Life Technologies, MD, USA). The authorization to access L. laeta (permission no. 01/2009) was provided by the Brazilian Institute of Environment and Renewable Natural Resources (IBAMA), an enforcement agency of the Brazilian Ministry of the Environment.

RT-PCR and Rapid amplification of cDNA ends (RACE)
Based on the EST sequence LLAE0889S, which we previously identified in the transcriptome of L. laeta as similar to complement factor B [13], specific sense and antisense primers were designed to amplify the complete gene sequence (FB sense: 5' CGAAGCAGCTCAAGGACCAC 3'; FB antisense: 5' CCTTCCATCCATGCGACCAC 3'). The SMARTer RACE cDNA Amplification kit (Clontech, CA, USA) was used to amplify the Loxosceles FB (Lox-FB) RNA. The PCR reactions were performed using the following conditions: 2 cycles of 94°C for 30 sec and 72°C for 3 min; 5 cycles of 94°C for 30 sec, 70°C for 30 sec and 72°C for 3 min; and 27 cycles of 94°C for 30 sec, 68°C for 30 sec and 72°C for 3 min.
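As a quick sanity check on the primers and cycling conditions listed above, the following minimal Python sketch computes primer length, GC content and a rule-of-thumb melting temperature, and totals the programmed cycles. It assumes that the space printed inside the sense primer is a typesetting artifact, and it uses the simple Wallace rule (2°C per A/T, 4°C per G/C), which is only a rough estimate for 20-mers and is not the method used in this work. The function names are illustrative.

```python
# Primer sequences as given in the Methods (internal space in the sense primer removed;
# assumed to be a typesetting artifact).
PRIMERS = {
    "FB sense": "CGAAGCAGCTCAAGGACCAC",
    "FB antisense": "CCTTCCATCCATGCGACCAC",
}

# Cycling program from the text: (number of cycles, [(temperature in C, seconds), ...]).
PROGRAM = [
    (2, [(94, 30), (72, 180)]),
    (5, [(94, 30), (70, 30), (72, 180)]),
    (27, [(94, 30), (68, 30), (72, 180)]),
]


def gc_percent(seq):
    """Percentage of G and C bases in a primer."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb Tm in C: 2*(A+T) + 4*(G+C). A rough estimate only."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def total_cycles(program):
    """Total number of PCR cycles across all stages."""
    return sum(n for n, _ in program)


def total_hold_seconds(program):
    """Sum of programmed hold times (ramping and any extra steps are not included)."""
    return sum(n * sum(sec for _, sec in steps) for n, steps in program)


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt", gc_percent(seq), "% GC", wallace_tm(seq), "C (Wallace)")
print(total_cycles(PROGRAM), "cycles,", total_hold_seconds(PROGRAM) / 60, "min of hold time")
```

The stepwise decrease of the annealing temperature (72°C, then 70°C, then 68°C) is the touchdown-style profile expressed by the three stages of `PROGRAM`.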

Cloning and sequencing of Lox-FB
Resulting products from the RACE reactions were separated in a 1% agarose gel, and the positive PCR products were purified using the PureLink™ PCR Purification kit (Invitrogen, CA, USA), directly cloned into the pGEM-T Easy Vector (Promega, WI, USA) at 16°C overnight, and transformed into competent E. coli XL1-Blue cells. Positively transformed cells were grown overnight at 37°C in LB (Luria Bertani) broth supplemented with 100 μg/mL ampicillin. Plasmids were isolated by the boiling plasmid miniprep method [14], digested with the restriction endonuclease EcoRI (New England Biolabs, MA, USA) and purified using phenol and chloroform [15].

Sequence analysis and molecular modeling
All sequences were analyzed at both the nucleotide and amino acid levels using the Basic Local Alignment Search Tool (BLAST) from the National Center for Biotechnology Information (NCBI: http://www.ncbi.nlm.nih.gov/blast/BLAST.cgi). Translation and protein analyses were performed using ExPASy tools (www.expasy.org). The deduced amino acid sequence of Lox-FB was aligned with the corresponding sequences of various animals using MEGA 6 software. Using the human FB structure (PDB ID code: 2ok5.1) as a template, the Swiss-model website (http://swissmodel.expasy.org/) was used to create a comparative homology model of Lox-FB [16].

Phylogenetic analysis
All FB-like sequences used for phylogenetic analysis were downloaded from the GenBank database. Multiple sequence alignments were performed with full length open reading frame sequences using MUSCLE (Multiple Sequence Comparison by Log-Expectation) and the phylogenetic tree was constructed based on this alignment using the Maximum Likelihood (ML) algorithm available in MEGA 6 software [17]. Statistical confidence of the evolutionary analysis was assessed by bootstraps of 1000 replicates.
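The bootstrap procedure mentioned above rests on resampling alignment columns with replacement. The sketch below illustrates only that resampling step, using a hypothetical toy alignment; the sequences, names and seed are invented for illustration, and tree inference on each pseudo-replicate (done in MEGA 6 in this work) is not reproduced here.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudo-replicate: columns resampled with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns, with replacement.
    cols = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][c] for c in cols) for n in names}


# Hypothetical toy alignment (NOT real FB sequences).
toy = {"seqA": "MKT-LLV", "seqB": "MKSALLV", "seqC": "MRT-LIV"}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), "pseudo-replicates of length", len(replicates[0]["seqA"]))
```

A tree would then be inferred from each pseudo-replicate, and the support for a branch is the fraction of replicate trees in which that branch appears.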

Results
Characterization of L. laeta FB
The 5' and 3' RACE fragments yielded a complete open reading frame (ORF) of Loxosceles laeta factor B-like composed of 1953 base pairs (bp) that encodes a protein of 651 amino acids (Fig 1). The NCBI's conserved domain database (CDD) program identified that Lox-FB has a classical domain organization, composed of two CCPs, a vWFA domain and an SP domain (Fig 2). The signal peptide is composed of 25 amino acids, producing a mature protein of 626 amino acids. Eighteen cysteines were found: eight in the CCP domains, one in the vWFA domain and nine in the SP domain (Fig 1). The deduced molecular weight of Lox-FB was predicted as 72.38 kDa and the isoelectric point as 5.73, without considering the eight putative N-glycosylation sites.
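The figures reported above are mutually consistent, as the following short arithmetic sketch shows. It assumes that the 1953 bp count excludes the stop codon (this is not stated in the text), and the function name `orf_summary` is illustrative.

```python
def orf_summary(orf_bp, signal_peptide_aa, cys_by_domain):
    """Derive codon count, mature protein length and cysteine total from reported values."""
    codons, remainder = divmod(orf_bp, 3)
    return {
        "codons": codons,                         # amino acids if the stop codon is excluded
        "in_frame": remainder == 0,               # ORF length must be a multiple of 3
        "mature_aa": codons - signal_peptide_aa,  # length after signal peptide removal
        "total_cys": sum(cys_by_domain.values()),
    }


# Values taken from the Results; the stop-codon assumption is ours.
summary = orf_summary(1953, 25, {"CCP": 8, "vWFA": 1, "SP": 9})
print(summary)
```

Under that assumption the sketch reproduces the 651 residues, the 626-residue mature protein and the 18 cysteines given in the text.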

Multiple Alignment of Lox-FB with other FB/C2-like proteins
Although the C2/FB proteins described in the literature so far share the same domain architecture, they differ in some aspects, such as the number of CCP domains. For example, the FB-like proteins found in some species belonging to the phylum Echinodermata [11], in the horseshoe crab [10] and in one of the two isoforms present in a sea anemone [8] contain five CCP domains instead of the three found in vertebrates, in which these proteins were first characterized (Table 1). Lox-FB has only two CCP domains, like the FB found in bivalves [18], centipede FB-2, sea spider FB-2 and jumping spider FB-3 [12]. Because of these differences, alignments at this position tended to be out of register in C2/FB sequences that contain more than two CCP domains. Each CCP module from Lox-FB is approximately 60 amino acids long and contains some highly conserved residues, such as proline (P), glycine (G), tryptophan (W) and four cysteines (forming two disulfide bridges; I-III and II-IV) (Fig 3).
Analysis of the other two domains detected many conserved sites, such as the five amino acids in the vWFA domain involved in Mg2+-dependent binding to C3b; in the SP domain, seven cysteines and the regions close to the active site also appeared in the same positions (Fig 4). However, none of the three amino acid residues important for serine protease activity (the catalytic triad) was conserved (Fig 4). Apparently, the classic catalytic triad of serine peptidases belonging to the trypsin-like family, composed of histidine (H), aspartic acid (D) and serine (S) residues [19,20], was replaced by other amino acids: serine (S), asparagine (N) and proline (P). Interestingly, Lox-FB, FB-like isoform 3 from the spider Hasarius adansoni and the FB-like molecule of the bivalve Ruditapes decussatus are similar, since they have only two CCP domains and two extra cysteines (highlighted in grey) and did not preserve the catalytic triad. Considering only the two spider species, they share the same amino acids at the first and second positions of the protease active site, but not at the last one. Because of these differences, these proteins may have lost their proteolytic activity or may have a different mechanism of activation. Despite the differences in the amino acids considered important for the enzymatic function of human FB, Lox-FB has conserved amino acid residues surrounding the triad, particularly Thr54, Ala56 and Ser214 (chymotrypsin numbering), which play an important role in stabilizing the catalytic triad [20].
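The comparison of catalytic triad positions described above amounts to mapping reference residue numbers onto alignment columns and reading the residue that the query carries in each column. A minimal sketch follows; the two aligned strings are hypothetical toy sequences built to mimic the H/D/S to S/N/P pattern and are not the real human FB or Lox-FB sequences, and the helper names are illustrative.

```python
def column_of(ref_aligned, ref_pos):
    """Alignment column (0-based) holding the ref_pos-th (1-based) ungapped reference residue."""
    seen = 0
    for col, ch in enumerate(ref_aligned):
        if ch != "-":
            seen += 1
            if seen == ref_pos:
                return col
    raise ValueError("reference position lies beyond the end of the sequence")


def residues_at(ref_aligned, query_aligned, ref_positions):
    """Map each reference position to the (reference, query) residue pair in its column."""
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    pairs = {}
    for pos in ref_positions:
        col = column_of(ref_aligned, pos)
        pairs[pos] = (ref_aligned[col], query_aligned[col])
    return pairs


# Hypothetical toy alignment: the reference has H, D, S at ungapped positions 2, 4 and 7.
ref = "AH-GDLLS"
query = "ASTGN-LP"

triad = residues_at(ref, query, [2, 4, 7])
conserved = all(r == q for r, q in triad.values())
print(triad, "triad conserved:", conserved)
```

With real data, the reference positions would be those of the catalytic histidine, aspartate and serine in the chosen reference sequence, which depend on the numbering scheme used.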

3D Structure of Lox-FB
The predicted structure of Lox-FB, obtained by computational modeling using the available crystal structure of human factor B (2ok5.1), showed 23.02% identity with the template and, despite differences in the number of CCP domains, a remarkable structural similarity between the CCP2-CCP3 domains of human FB and CCP1-CCP2 of Lox-FB (Fig 5). Furthermore, the vWFA domain fits closely with that of human FB, since the six major α-helices surrounding a central twisted β-sheet are present and conserved at the same positions, with minor differences in the conformation of the loops. As in the linear alignment, the amino acids that form the metal ion-dependent adhesion site (MIDAS) are conserved and occupy the same positions in the 3D structure (Fig 6A). With respect to the serine protease domain (SP), the overlap was not as good as that observed for the vWFA domain; nevertheless, the regions that constitute the secondary structures of the β-sheet overlap (Fig 6B). The three amino acids that comprise the catalytic triad of human factor B (H, D, S) align with the same three amino acids of Lox-FB (S, N, P) identified in the alignment of the primary sequences (Fig 6B).
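Identity percentages such as the one quoted above are computed from a pairwise alignment as the share of aligned positions carrying the same residue. The sketch below uses one common convention, counting only columns without gaps in the denominator; the tools used in this work may define the denominator differently, and the example strings are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    aligned = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not aligned:
        return 0.0
    matches = sum(1 for x, y in aligned if x == y)
    return 100.0 * matches / len(aligned)


# Hypothetical toy alignments (NOT real FB sequences).
print(percent_identity("ACDE", "ACDF"))
print(percent_identity("MKT-LLV", "MKSALLV"))
```

Because the result depends on how gaps and alignment length are treated, identity values from different programs are not always directly comparable.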

Phylogenetic analysis
Lox-FB showed similarity with complement proteins, sharing 25% and 26% identity with human FB and C2, respectively (Table 2). Identity with invertebrate FB/C2 proteins ranged from 22% to 43%. To investigate the evolutionary history of FB and C2 complement proteins and to determine how Loxosceles spiders fit into this picture, 56 FB and C2 sequences were subjected to phylogenetic analysis. An unrooted phylogenetic tree was constructed and resolved two distinct groups, vertebrate and invertebrate proteins (Fig 7). Among the vertebrate proteins, there is a clear separation between C2 and FB proteins, except for some fish FB/C2 sequences. The lamprey FB (jawless vertebrate) is positioned outside the jawed vertebrate FB and C2 components, suggesting that it represents the ancestral group of the FB and C2 proteins of higher vertebrates. In the branch represented by invertebrates, there are three main groups. The first is represented by ascidian FB-like sequences (group A), which are possibly the closest ancestors of vertebrate FB/C2 sequences. The second, called group B, comprises Lox-FB and FB-like sequences from cnidarians, amphioxus and bivalve; Lox-FB is located on the same branch of FB proteins (isoform 2) as the centipede and sea spider, and on the same sub-branch of the

Discussion
Studies of the evolution of the complement system have progressed in recent years, and technological advances, including genomic and transcriptomic methodologies, have increased the possibility of discovering genes related to the complement system in invertebrates. At present, there is more information about the existence of C3 genes than of FB genes; however, it is known that the FB gene is missing from the genome sequences of the cnidarian hydra, the nematode C. elegans and several species of insects. These findings suggest that the FB genes probably originated before the divergence between Cnidaria and Bilateria, more than one billion years ago, and that the absence of these complement genes in cnidarian and in some protostome lineages can be explained by secondary loss [2,21]. The present work supports this hypothesis, since an FB-like gene was also found expressed in the Loxosceles spider venom gland. This finding represents the first identification of a factor B homologue from a Loxosceles spider (Arachnida, Sicariidae). Similar to the vertebrate FB/C2 proteins, Lox-FB is a mosaic protein composed of CCP, vWFA and serine protease domains. Many studies have indicated the existence of a less complex complement system, named the "archeo-complement system", involving C3-like proteins associated with factor B-like proteins, as described in lamprey [22], sea urchin [11], horseshoe crab [10], ascidians [23], amphioxus [24], sea anemones [8], bivalve [18] and sea cucumber [25]. Factor B is a component of the alternative pathway, and since this gene was found in organisms belonging to primitive lineages (cnidarians and protostomes), it is possible that these organisms are endowed with alternative pathway activity.
All invertebrate C2/FB proteins characterized so far preserve the same classic domain architecture found in the related vertebrate proteins. Despite some particularities in the N-terminal region, such as the extra CCP domains observed in sea urchin and horseshoe crab, or additional domains such as the low-density lipoprotein receptor-like domain (LDL_A) found in ascidians or the epidermal growth factor-like domain (EGF_CA) in amphioxus, most of these C2/FB proteins have conserved the regions that play an important role in the activation of this protein, and for this reason they are considered orthologues of the mammalian FB and C2.
Highly conserved regions were identified in Lox-FB, such as the CCP consensus sequence, the Mg2+ binding sites in the vWFA domain and conserved positions near the catalytic center within the SP domain. However, considering the domain architecture, Lox-FB has only two CCP domains, one fewer than observed in vertebrates, but is similar to FB-3 of the jumping spider Hasarius adansoni, FB-2 of the centipede Scolopendra subspinipes japonica, FB-2 of the sea spider Ammothea sp. and FB of the bivalve Ruditapes decussatus [12,18]. The CCP module is a domain commonly present in many mammalian complement proteins and is responsible for the interaction of complement proteins with each other and with their respective regulators and receptors, but also for the binding of, e.g., factor H to human cells. Pathogens often mimic or capture CCP-like molecules to avoid detection and destruction by the complement system and can also use CCP-containing molecules to gain entry into the cell (e.g., EBV binding to CR2) [26]. Little is known about the roles of CCPs in pathogen evasion/protection in invertebrates, but recent studies demonstrated a role for CCPs in preventing lethal flaviviral infection of mosquitoes that are responsible for transmission of, e.g., dengue fever and yellow fever. The mosquitoes themselves are not affected by these viruses, while they can transmit the disease to humans. The mosquito Aedes aegypti was found to contain a neural factor, AaHig, that contains five CCP domains and functions as a viral recognition factor, interacting with surface proteins of dengue virus (DENV) or Japanese encephalitis virus (JEV) and thereby preventing flaviviral entry into the mosquito's neural cells [27]. Another study showed that a scavenger receptor binds to DENV via CCP modules and indirectly helps to control flavivirus infection by inducing antimicrobial peptides [28].
Thus, even if the main role of CCP domains in invertebrate FB is to interact with the C3b fragment and cause pathogen opsonization, the possibility cannot be excluded that they may also function as pathogen recognition factors in the spider. However, a major difference between mosquitoes and spiders is that mosquitoes are vectors in the transmission of viruses and other pathogens, and it is therefore essential that they themselves are protected against the pathogens. Such a vector role has not yet been described for spiders, and thus there may have been no evolutionary pressure for the CCP-containing spider molecules to evolve a pathogen recognition function.
Factor B belongs to family S1 of clan PA of peptidases, since it has a serine protease domain that bears the chymotrypsin fold, and almost all representatives of this class use the canonical catalytic triad of Asp102, His57 and Ser195 (chymotrypsin numbering) [20]. Despite many similarities in secondary structure between human FB and Lox-FB, the classical catalytic triad was not found in Lox-FB, although the adjoining amino acids were conserved; whether Lox-FB has proteolytic activity remains to be investigated. It is possible that the amino acids present at the putative catalytic site form an active site, or that this protein corresponds to an inactive Lox-FB isoform. We found only one FB-like molecule in the Loxosceles venom gland, but we cannot exclude that Loxosceles spiders have other FB isoforms, possessing a conserved catalytic triad, in their hemolymph.
Thus far, in invertebrates, there is no clear information about the specific proteolytic activity of a C3bBb-like alternative pathway (AP) convertase or how it is assembled. A serine protease activity in horseshoe crab plasma is triggered by PAMP molecules, such as LTA and LPS, in a Mg2+- and Ca2+-dependent manner; however, it is not clear whether CrC2/Bf participates directly in CrC3 activation or whether it has to be activated by another serine protease similar to FD of the vertebrate complement system [10]. Le Saux and collaborators demonstrated a new role for CrC2/Bf, which is able to bind to three PRRs, galactose-binding protein (GBP), carcinolectin-5 (CL5) and C-reactive protein (CRP), promoting their assembly on pathogens and, consequently, activating the complement system [29]. These findings also suggest that CrC2/Bf could function as a MASP counterpart, participating in a putative lectin pathway.
Although the proteolytic activities of the horseshoe crab complement system are being studied in great depth, most data on invertebrate FB/C2-like proteins consist of characteristics inferred from their putative structures. In other words, there is no experimental evidence that the isoforms that did not retain the classic catalytic triad have actually lost their proteolytic activity.
Furthermore, it is possible that the Loxosceles complement system has another mechanism of activation, independent of factor B, as observed in the horseshoe crab Tachypleus tridentatus. It was demonstrated that these organisms are endowed with a serine protease named Factor C, originally characterized as an LPS-sensitive initiator of hemolymph coagulation stored within the hemocytes, and that this factor could act as a C3 convertase on the surface of invading Gram-negative bacteria in the initial phase of complement activation [30]. Perhaps there is also a component similar to Factor C in the hemolymph or the venom gland of Loxosceles; however, the activity of Lox-FB should be evaluated to understand whether it has a physiological role in the activation of the Loxosceles complement system. Recently, Tagawa et al. (2012) [31] characterized two isoforms of factor B from Tachypleus tridentatus (TtC2/Bf-1 and TtC2/Bf-2), both of which were indispensable for TtC3b deposition on Gram-positive bacteria and fungi. However, they did not characterize a factor D-like serine protease in the horseshoe crab, and for this reason they suggested that other components, such as plasma lectins, could be important for recruitment of the C3bBb-like complex to the surfaces of Gram-positive bacteria and fungi. It therefore seems that the mechanism of activation of the horseshoe crab complement system differs from that of mammals, and the Loxosceles spider may have the same pattern of activation.
According to the phylogenetic analysis, there is a divergence between the proteins present in vertebrates and in invertebrates. In the branch represented by factor B and C2 from vertebrates, the lamprey FB appears as a sister group of the vertebrate proteins, indicating that the gene duplication events happened before the origin of jawed vertebrates. This configuration of the phylogenetic tree is in agreement with the absence of the classical pathway in jawless vertebrates, since they do not have immunoglobulin genes [9]. Almost all invertebrates represented have more than one isoform, and some isoforms are grouped in the same clade (group C), as observed for the two isoforms (1 and 2) of the horseshoe crab Tachypleus tridentatus (horseshoe crab 2 C2/FB), the sea cucumber Apostichopus japonicus and the jumping spider Hasarius adansoni. However, in some species that express different FB-like isoforms, the isoforms did not fall in the same group; for instance, some isoforms of FB-like sequences from the centipede, sea spider and jumping spider were grouped together in group B, in which Lox-FB is also found, while other isoforms of the same species were grouped in group C.
The phylogenetic history of a protein does not always follow the evolution of the species because of different selective pressures. Along with the type of pathogen that infects the organism, other factors, such as the adoption of different habitats, life histories and complexity, will influence immune system design, mode of action and evolution [32]. Considering the whole organism, many types of selective pressure influence survival, for instance climate change, predation, food availability and infections. Factor B is a protein related to the immune response, and the selective pressures acting on it are mainly represented by the recurrent infections to which the organisms were exposed. Therefore, it is possible that the clam Ruditapes decussatus, the amphioxus B. belcheri and the spider Loxosceles laeta have been infected by similar pathogens that express the same molecular patterns, which would explain the placement of these species in the same group of the phylogenetic tree. Further studies will be necessary to understand the nature of the Lox-FB protein and to investigate how it interacts with other complement proteins possibly present in Loxosceles hemolymph. Knowledge of these aspects could contribute to a better understanding of the defense mechanisms of Loxosceles spiders in the context of immunologic responses.