Structural Basis for the Immunomodulatory Function of Cysteine Protease Inhibitor from Human Roundworm Ascaris lumbricoides

Immunosuppression associated with infections of nematode parasites has been documented. Cysteine protease inhibitor (CPI) released by nematode parasites has been identified as one of the major modulators of the host immune response. In this report, we demonstrate that the recombinant CPI protein of Ascaris lumbricoides (Al-CPI) strongly inhibited the activities of cathepsins L, C and S, and had a weaker effect on cathepsin B. The crystal structure of Al-CPI was determined to 2.1 Å resolution. Two segments of Al-CPI, loop 1 and loop 2, are proposed as the key structural motifs responsible for Al-CPI binding to proteases and for its inhibitory activity. Mutations in loop 1 and loop 2 abrogated the protease inhibition activity to various extents. These results provide molecular insight into the interaction between the nematode parasite and its host and will facilitate the development of anthelmintic agents or the design of anti-autoimmune disease drugs.


Introduction
Nematode parasite infections, which are highly prevalent in many parts of the world, cause significant health problems [1]. Infections with these parasites are often characterized by a chronic and asymptomatic course due to the immunosuppression induced by the parasites in their hosts [2]. The nematode-induced immunosuppression is also manifested as protection from autoimmune and allergic diseases [3,4]. The human gastrointestinal nematode Ascaris lumbricoides infects as many as 1.5 billion people globally and causes malnutrition, retarded growth, reduced physical fitness and reduced work capacity in infected individuals. Similar to other nematode parasites, A. lumbricoides infection has been shown to modulate hosts' immune responses to infections with unrelated pathogens, reduce the incidence of asthma, and lessen skin-test reactivity to allergens and house dust mites [3][4][5][6]. A recent study showed that A. lumbricoides pseudocoelomic fluid modulates dendritic cell phenotype and function [7]. Therefore, parasitic nematode extracts have been proposed as potential therapeutic agents for the treatment of autoimmune disorders and allergic diseases [8,9].
Cysteine protease inhibitors (CPI; cystatins) from nematodes have been found to play major roles in modulating host immunity [10][11][12]. Cystatins are a group of cysteine protease inhibitors that bind reversibly to cysteine proteases and regulate their proteolytic activities. The predominant target proteases for cystatins are the C1 (papain-like cysteine peptidase) and C13 (legumain) families. Parasite CPIs can modulate the protease-dependent functions of host immune cells. A number of cathepsins (the mammalian C1 family cysteine proteases) have been identified as important proteases in mediating immune responses, such as proteolytic degradation of the invariant chain that regulates MHC-II molecule intracellular trafficking, antigen processing and cleavage of the intracellular domain of Toll-like receptor (TLR)-9 [13][14][15]. Inhibition of these cysteine proteases may suppress the activation of dendritic cells and interfere with the formation of the MHC-II-antigen peptide complex, resulting in impaired ability of the antigen-presenting cell to activate CD4+ T cells and immune responses.
The members of the cystatin superfamily are categorized into three types based on their amino acid sequences and the position of the disulfide bond(s). Type 1 cystatin (stefin) is an 11 kD single-domain protein without a signal peptide or disulfide bond. Type 2 cystatin, with a molecular weight of ~13 kD, is a single-domain protein secreted into the extracellular region and has two conserved intramolecular disulfide bonds. Type 3 cystatin (kininogen) comprises three type 2-like domains [16][17][18][19]. Since the first elucidation of the structure of type 2 chicken egg white (CEW) cystatin by Wolfram Bode [20], many other cystatin structures have been determined by X-ray crystallography or NMR, including human cystatins A-D and F [21][22][23][24][25][26], and cystatins from a protozoan parasite and the soft tick [27,28]. Structurally, a typical cystatin fold contains a five-stranded anti-parallel β-sheet wrapped around an α-helix.
Although CPIs from many nematode species have been studied extensively for their roles in the induction of immunosuppression, the structural features of this group of immunomodulatory proteins remain largely unknown. In this study, we found that the recombinant Al-CPI strongly inhibited the proteolytic activity of cathepsins. We then analyzed the structure of A. lumbricoides CPI (Al-CPI). We identified the critical segments of Al-CPI molecule that could be involved in the interaction between Al-CPI and its target proteases. Mutagenesis study further confirmed that these segments were responsible for the inhibition of cysteine protease activities.

Ethics statement
Adult A. lumbricoides were collected from patients in a village in Yunnan province, China, after written informed consent. The study protocols were approved by the Institutional Human Study Ethics Committee of Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences.

Molecular cloning, expression and purification of Al-CPI
To clone the Al-CPI cDNA, total RNA was isolated from adult worms and double-stranded cDNA was obtained by RT-PCR using random primers and a reverse transcription system (Promega, Madison, WI). A fragment of the gene encoding Al-CPI was amplified by PCR from the cDNA with the primers 5′-CCGGAATTCGAAAACCTGTATTTTCAGGGCCAAGTAGGAGTTCCTGGTGGTTTC-3′ and 5′-ACGCGTCGACTTATGCAGATTTGCATTCTTTGATG-3′. The sense and antisense primers were designed based on sequences conserved in cystatins previously described for Nippostrongylus brasiliensis, Onchocerca volvulus, Brugia malayi, Haemonchus contortus and Caenorhabditis elegans in GenBank and Heligmosomoides polygyrus [28]. Primers for 3′ and 5′ RACE were synthesized and full-length Al-CPI cDNA was obtained by standard RACE protocols. The full-length Al-CPI gene (GenBank accession no. HQ404231) was cloned into the pET32a vector and then transformed into Origami (Novagen, Madison, WI). Cells were grown in Luria-Bertani (LB) medium containing 100 µg/ml kanamycin and ampicillin to A600 = 0.6-0.8 at 37 °C for ~3 h. The cells were then divided into two parts and cultured at 20 °C for 20 h after induction with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG). Cells were harvested by centrifugation at 6000 rpm for 15 min and frozen at −80 °C. Al-CPI protein was purified on a nickel affinity column with washing buffer A (200 mM NaCl, 20 mM Tris pH 7.5) and then elution buffer B (200 mM NaCl, 250 mM imidazole, 20 mM Tris pH 7.5), followed by TEV protease digestion in dialysis buffer (100 mM NaCl, 20 mM Tris, pH 7.5). Al-CPI was further purified by size-exclusion chromatography using a HiLoad™ 16/60 Superdex™ 75 column (GE Healthcare, Uppsala, Sweden), with a peak elution volume of 83 ml corresponding to the monomeric form of Al-CPI. The expression, purification and crystallization of the recombinant Al-CPI protein were described in detail in our previous report [29]. A PCR-based mutagenesis strategy was used to generate Al-CPI mutants in which the specific residues predicted to be critical for binding to proteases were replaced. Al-CPI mutants were constructed in the pET32a vector and transformed into Origami. The mutant proteins were expressed and purified in the same way as the wild-type Al-CPI.

Table 1 footnotes: The values in parentheses refer to statistics in the highest bin. R-factor = $\sum_h |F_o(h) - F_c(h)| / \sum_h F_o(h)$, where $F_o$ and $F_c$ are the observed and calculated structure-factor amplitudes, respectively. c) Rfree was calculated with 5% of the data excluded from the refinement. d) Root-mean-square deviation from ideal values.

Measurement of cathepsin inhibition activity of Al-CPI
The inhibitory activity of the recombinant Al-CPI and the mutant proteins was determined by protease-activity assays using specific fluorogenic substrates [30]. Cathepsin B and C were purchased from Sigma-Aldrich (St Louis, MO) and cathepsin L and S were obtained from Calbiochem (La Jolla, CA) and Enzo (New York, NY), respectively. The fluorogenic substrates for cathepsin B (Z-Arg-Arg-AMC) and cathepsin C (Gly-Phe β-naphthylamide) were obtained from Sigma-Aldrich and Santa Cruz Biotechnology (Santa Cruz, CA), respectively. Substrates for cathepsin L (Z-Phe-Arg-AMC) and cathepsin S (Z-Val-Val-Arg-AMC) were from Calbiochem and Enzo Life Sciences (Plymouth, PA), respectively. To measure the inhibition activity of Al-CPI, the cathepsins were incubated with the substrates in the absence or presence of serially diluted Al-CPI in the appropriate buffer for 15 min. The reaction was stopped with stopping buffer. The amount of product was measured fluorometrically with excitation at 360 nm and emission at 460 nm. The residual protease activity in the presence of Al-CPI was expressed as a percentage of the total activity detected in reactions without Al-CPI. The half maximal inhibitory concentrations (IC50) of Al-CPI and its mutants, based on initial reaction velocities, were determined by nonlinear regression analysis.
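The IC50 determination described above can be illustrated with a minimal sketch. It assumes a single-site inhibition model with the Hill slope fixed at 1 and fits only the IC50 by least squares on a log scale; the text does not state which model or fitting routine was used, so the model, the function names and the synthetic concentrations (in arbitrary units) are all illustrative assumptions.

```python
import math

def percent_activity(signal, uninhibited_signal):
    """Residual activity as a percentage of the reaction without inhibitor."""
    return 100.0 * signal / uninhibited_signal

def residual_activity(conc, ic50):
    """Assumed model: single-site inhibition, Hill slope fixed at 1."""
    return 100.0 / (1.0 + conc / ic50)

def fit_ic50(concs, activities, lo=1e-3, hi=1e6, iters=200):
    """Least-squares fit of IC50 by golden-section search over log10(IC50)."""
    def sse(log_ic50):
        ic50 = 10.0 ** log_ic50
        return sum((a - residual_activity(c, ic50)) ** 2
                   for c, a in zip(concs, activities))

    a, b = math.log10(lo), math.log10(hi)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(c) < sse(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)

# Synthetic, noise-free dilution series generated from a made-up IC50 of 50.
example_concs = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0, 1000.0]
example_activities = [residual_activity(c, 50.0) for c in example_concs]
print(fit_ic50(example_concs, example_activities))  # close to 50
```

With real data, a variable Hill slope and top/bottom plateaus would usually be fitted as well; they are omitted here to keep the sketch short.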

Crystallization and X-ray data collection
The monomeric form of Al-CPI eluted from the size-exclusion column was used for crystallization screening. Protein concentration was determined using the Bio-Rad Protein assay kit (Bio-Rad, Hercules, CA). Crystals were obtained from 0.2 M sodium acetate trihydrate, 0.1 M sodium cacodylate trihydrate pH 6.5, and 30% w/v polyethylene glycol 8,000 (from Crystal Screen HT). Crystals were grown by the sitting-drop vapor diffusion method with microseeding. For data collection, crystals were soaked in the reservoir solution supplemented with 20% glycerol as the cryo-protectant. X-ray diffraction data were collected to a resolution of 2.1 Å with an in-house Oxford Diffraction Gemini R Ultra system (Oxford, England) and at beamline 17U of the Shanghai Synchrotron Radiation Facility. The diffraction images were indexed and integrated with Mosflm [31] and scaled using SCALA from CCP4 [32]. The crystals belonged to the space group P1. Unit cell parameters are shown in Table 1.

Structure determination and refinement
Al-CPI structure was determined by molecular replacement using the program MOLREP [33] with chicken egg white cystatin (PDB code: 1CEW) as the search model [20]. The structure was refined by Refmac5 from CCP4 package and rebuilt with Coot [32,34,35]. R/Rfree values of the final models are 0.202/0.256. The detailed refinement statistics are shown in Table 1. All structural figures were prepared by PyMOL [36]. Atomic coordinates and structure factors of Al-CPI have been deposited in the Protein Data Bank (PDB) with the accession code 4IT7.
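As an illustration of how the reported R/Rfree values are defined, the sketch below computes R = Σ|Fo − Fc| / ΣFo separately for a working set and a free set that holds 5% of the reflections. The function names and the rule of flagging every 20th reflection are illustrative assumptions; refinement programs normally assign the free set at random, and this is not the procedure implemented in Refmac5.

```python
def r_factor(f_obs, f_calc):
    """R = sum_h |Fo(h) - Fc(h)| / sum_h Fo(h) over the given reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    return sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)

def free_flags(n_reflections, every=20):
    """Flag every 20th reflection (5%) as belonging to the free set."""
    return [i % every == 0 for i in range(n_reflections)]

def r_work_and_free(f_obs, f_calc, flags):
    """Return (R_work, R_free) using the boolean free-set flags."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, flags) if not free]
    free = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, flags) if free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work])
    r_free = r_factor([fo for fo, _ in free], [fc for _, fc in free])
    return r_work, r_free
```

Because the free reflections are excluded from refinement, Rfree is typically somewhat higher than R, as in the 0.202/0.256 pair reported here.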

Molecular docking analysis
Models for the mutants of Al-CPI were generated using the Design Protein tool in Discovery Studio 3.11 (Accelrys Inc., San Diego, CA). Five models of Al-CPI mutants were built and optimized. Docking of wild-type and mutated Al-CPI to various cathepsins was done using ZDOCK from Discovery Studio 3.11. Al-CPIs (wild-type and mutant forms) and cathepsins were treated as ligands and receptors, respectively. PDB codes for the cathepsins are: 2IPP (cathepsin B), 3PDF (cathepsin C), 3HWN (cathepsin L), 2FQ9 (cathepsin S). In each of the docking calculations, two thousand poses were generated and the poses with a ZDOCK score higher than 12 were refined in RDOCK. The best refined models were chosen for further analysis.
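The pose-selection logic described above can be sketched as a simple filter-then-rank step. The sketch does not call ZDOCK or RDOCK; the `Pose` record, the function names and the example scores are hypothetical, and it assumes that a lower RDOCK energy indicates a better refined pose, which the text does not state.

```python
from dataclasses import dataclass
from typing import List, Optional

ZDOCK_CUTOFF = 12.0  # score threshold stated in the text

@dataclass
class Pose:
    pose_id: int
    zdock_score: float                    # higher is better
    rdock_energy: Optional[float] = None  # set only for refined poses

def select_for_refinement(poses: List[Pose]) -> List[Pose]:
    """Keep poses whose ZDOCK score exceeds the cutoff."""
    return [p for p in poses if p.zdock_score > ZDOCK_CUTOFF]

def best_refined(poses: List[Pose]) -> Pose:
    """Pick the refined pose with the lowest RDOCK energy (assumed best)."""
    refined = [p for p in poses if p.rdock_energy is not None]
    return min(refined, key=lambda p: p.rdock_energy)

# Made-up scores for three poses; real runs would produce two thousand.
example = [Pose(1, 13.2), Pose(2, 11.9), Pose(3, 14.0)]
shortlist = select_for_refinement(example)
shortlist[0].rdock_energy = -18.5  # hypothetical refinement results
shortlist[1].rdock_energy = -22.1
print(best_refined(shortlist).pose_id)
```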

Statistical analyses
Statistical analyses were performed with GraphPad Prism 5 software (GraphPad Software Inc., La Jolla, CA). The significance of the differences between groups was analyzed using Student's t test. Individual data and the mean ± S.D. of each group are presented. A p value < 0.05 was considered significant.

Table 2 note: The Cα atoms of conserved amino acid residues were chosen for all the distance measurements. doi:10.1371/journal.pone.0096069.t002

Molecular cloning and biological activity of recombinant Al-CPI
The cDNA library of A. lumbricoides was screened by RT-PCR using primers for consensus sequences of cystatins reported in other nematode parasites, and a fragment of the CPI gene was obtained. The full-length CPI gene from A. lumbricoides was obtained by the RACE technique. The complete cDNA of Al-CPI contains an open reading frame of 399 bp coding for 132 amino acid residues. The biological activity of the recombinant Al-CPI protein was determined by testing its ability to inhibit the proteolytic activity of cathepsins B, C, L and S. The recombinant Al-CPI exhibited various levels of inhibitory activity against the four cathepsins in a dose-dependent manner (Fig. 1). Al-CPI showed strong inhibition of cathepsin L, intermediate inhibition of cathepsins C and S, and weak inhibition of cathepsin B (Fig. 1).
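The ORF length quoted above and the two PCR primers listed in the cloning methods can be checked with a short sketch. The annotations are inferred here from the sequences and are not stated in the text: GAATTC and GTCGAC are the EcoRI and SalI recognition sites, and the stretch that follows the EcoRI site translates to ENLYFQG, a TEV protease recognition sequence, which is consistent with the TEV digestion step used during purification. The reading frames chosen below are assumptions.

```python
# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

def translate(dna):
    """Translate complete codons from the start of dna; '*' marks a stop."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def reverse_complement(dna):
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

FORWARD = "CCGGAATTCGAAAACCTGTATTTTCAGGGCCAAGTAGGAGTTCCTGGTGGTTTC"
REVERSE = "ACGCGTCGACTTATGCAGATTTGCATTCTTTGATG"
ECORI, SALI = "GAATTC", "GTCGAC"

# Forward primer: read in frame immediately after the EcoRI site (assumed).
fwd_insert = FORWARD[FORWARD.index(ECORI) + len(ECORI):]
fwd_peptide = translate(fwd_insert)

# Reverse primer: take the part after the SalI site, flip it to the sense
# strand, and read in the frame that ends exactly at the 3' end (assumed).
rev_sense = reverse_complement(REVERSE[REVERSE.index(SALI) + len(SALI):])
rev_peptide = translate(rev_sense[len(rev_sense) % 3:])

# A 399 bp ORF holds 133 codons: 132 residues plus one stop codon.
orf_residues = 399 // 3 - 1

print(fwd_peptide, rev_peptide, orf_residues)
```

If residue numbering starts at the glutamine that follows the TEV site, the translated stretch QVGVPGGF would place glycines at positions 6 and 7, which agrees with the G6-G7 N-terminal fragment discussed in the structural analysis; this numbering is an inference, not something the text states.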

Structural feature of Al-CPI
To further understand the molecular mechanism of the interaction between Al-CPI and its target proteases, the crystal structure of the Al-CPI protein was determined. The monomeric form of Al-CPI crystallized in the space group P1 (Table 1). There are four copies of the Al-CPI monomer in the asymmetric unit. The Al-CPI monomer structure shows a conventional type-2 cystatin fold. It has a five-stranded anti-parallel β-sheet that wraps around the central α-helix. From the N-terminus to the C-terminus, Al-CPI contains: N-terminal fragment (N), short β-strand 1 (β1, residue 9-11), α-helix (17-32), β2 (31-50), loop1 (L1, 51-54), β3 (55-64), appending structure (AS, 65-86), β4 (87-96), loop2 (L2, 97-101), and β5 (102-112). Al-CPI also has two conserved intra-molecular disulfide bridges between C68 and C78 and between C89 and C109 (Fig. 2A). In the final model, the N-terminal five residues were not visible in the electron density map and were not modelled. Similar to crystal structures of cystatin-cathepsin complexes reported previously by others [37], the N-terminal fragment (G6-G7), loop 1 (V51-T54) and loop 2 (P97-F101) of Al-CPI form a wedge segment that is likely to insert into the active-site pocket of papain-like cysteine proteases in such a way that Al-CPI can inhibit the protease activity.
Only five unique type-2 cystatin structures have been found in the PDB to date. They are from different species: CEW cystatin from chicken (Gallus gallus), cystatin C, D and F from human (Homo sapiens), and salivary cystatin from soft tick (Ornithodoros moubata). Among these five structures, CEW cystatin has the highest sequence identity (34%) with Al-CPI and cystatin C shows the highest structural similarity with Al-CPI, with a Z-score of 16.0 from the Dali server [38]. Most of these cystatin structures, including a V57N mutated form of cystatin C, are monomeric. One exception is cystatin F, which was glycosylated and formed a dimer in the structure. To compare the structures of these similar cystatins, multiple sequence alignment was performed with Multalin. The distances between the α-helix and other parts of the cystatins were then measured using the Cα atoms of conserved amino acid residues (marked by red arrows in Fig. 2D). The distance between the α-helix and the β-sheet was much shorter in Al-CPI and tick salivary cystatin than in the other cystatins. The distance between the α-helix and the active site segment (N, L1 and L2) was longer in Al-CPI and salivary cystatin than in the other cystatins (Table 2). As tick salivary cystatin is very similar to Al-CPI in this local region, for clarity we only superimposed the structure of Al-CPI with those of CEW cystatin and cystatin C. As shown in Fig. 2B and C, the α-helix core packs much tighter against the β-sheet in Al-CPI than in CEW cystatin and cystatin C; the active site segment

A detailed analysis of the residues involved in the intramolecular packing interface reveals that Al-CPI has some unique sequence features not observed in other cystatins. Al-CPI has an isoleucine (I29) in the middle of the α-helix, while the amino acid at that position in other cystatins is a tyrosine (Fig. 2B). Directly across from that position, there is a valine (V91) in Al-CPI, while in other cystatins there is instead a phenylalanine. The bulky aromatic residues will push the α-helix away from the β-sheet. A third position is I106 in Al-CPI, which is a serine in other cystatins. A hydrophobic residue (isoleucine) will help the α-helix pack closer to the β-sheet (to form a hydrophobic core) than a hydrophilic residue such as serine can. These sequence differences also exist in other cystatins (data not shown). Interestingly, in the interface between the active site segment (N, L1 and L2) and the α-helix, Al-CPI contains mostly polar residues while other cystatins contain mostly hydrophobic residues (Fig. 2C). Therefore, compared to the

Interaction between Al-CPI and cathepsins
The docking analysis revealed that the interaction between Al-CPI and cathepsin L mainly involved a hydrophobic groove in cathepsin L (Fig. 3A and B). H163 and C25, the key residues responsible for cathepsin activity, could form hydrogen bonds with the main chains of G6, V51 and V52 in Al-CPI (Fig. 3C and D). Additionally, P97 and W98 from Al-CPI would pack against L114, F145, W189 and W193 in cathepsin L (Fig. 3E). These interactions revealed by the docking analysis suggest that three regions of Al-CPI (G6 of the N-terminal fragment, V51 and V52 of loop 1, and P97 and W98 of loop 2) may be important for Al-CPI binding to the cathepsins to exert its inhibitory effect.
The enzymatic assays showed that the inhibitory potency of Al-CPI was strongest against cathepsin L and weakest against cathepsin B (Fig. 1). To further understand the molecular basis of this difference, we performed docking calculations for the four cathepsins. Al-CPI exhibits the highest calculated binding affinity for cathepsin L and the lowest for cathepsin B among the four cathepsins tested (Fig. 4A and B; Table 3). The docking calculations support, and provide a possible explanation for, the experimental inhibition data. Sequence alignment shows that the active site is well conserved across the different cathepsins (Fig. 4C, shown in red box). However, cathepsin B has a unique insert sequence (105-124) (Fig. 4C, shown in blue box). The docking results indicate that the insert segment of cathepsin B would clash with L2 of Al-CPI, resulting in reduced binding affinity and inhibition activity (Fig. 4B).

Protease inhibition activity of Al-CPI variants
The structural analysis results presented above suggest that G6 of the N-terminal fragment, loop 1 and loop 2 of Al-CPI are critical regions for Al-CPI binding to the proteases. To verify the importance of these regions in Al-CPI function, Al-CPI mutants were generated and tested for their protease inhibition activities. Dramatic changes in protease inhibition were observed for the Q50E+V52G (mutant 3) and P97G+W98G (mutant 4) double mutants and for the combined mutant (mutant 5). These mutants exhibited significantly reduced inhibition activities against cathepsins (increased IC50 values) (Fig. 5). Mutations at the critical binding sites also resulted in significant changes in the calculated binding affinity between Al-CPI variants and cathepsin L (Table 3). Compared with the wild-type Al-CPI, the V52G mutant showed slightly reduced binding affinity for cathepsin L as well as slightly reduced enzymatic inhibition activity against cathepsins C and S. However, the Q50E+V52G mutations (loop 1) resulted in a greater reduction in affinity and protease inhibition activity than the V52G single mutation. In the docking calculations, the G6E mutant showed a greater reduction in binding affinity for cathepsin L than the V52G mutant (Table 3). Yet, in the enzymatic assay, the inhibitory activity was reduced to a smaller extent in the G6E mutant than in the V52G mutant (Fig. 5). This discrepancy is likely due to additional effects induced by the changes around the V52 region that were not accounted for in the docking calculations.

Discussion
Nematode parasites are known to modulate or suppress the immune responses of the host and, consequently, protect against the development of autoimmune and allergic diseases [2,39,40]. Although CPIs of nematode parasites have been studied extensively for their immunomodulatory function, knowledge of their detailed structural features is still lacking. Elucidation of the structure-function relationship of parasite CPIs would greatly facilitate the discovery of a new group of drugs for the treatment of allergic diseases. To this end, we investigated the structural basis of the immunomodulatory function of CPI from A. lumbricoides. We identified the N-terminal segment, loop 1 and loop 2 as the key regions for Al-CPI binding to cathepsins. These observations are consistent with those previously reported for other cystatins [20,21,37]. However, our results further showed that only the mutations at loop 1 and loop 2 significantly reduced the inhibition activity of Al-CPI. Docking analysis between Al-CPI and the four cathepsins gave various binding affinities that were consistent with the inhibition activities detected in the enzymatic analysis. Al-CPI has less inhibitory activity against cathepsin B than against cathepsins C, L and S. Structural and docking analysis suggested that this specificity is mainly due to the insertion of a short segment in cathepsin B, which causes steric hindrance to Al-CPI binding. These results revealed the details of the potential molecular interaction between Al-CPI and the proteases, identified the regions critical for the inhibitory function of Al-CPI, and provided an explanation for the differential inhibition activities of Al-CPI against the four cathepsins studied.
Compared with other type-2 cystatins, we found that the parasite cystatins (both Al-CPI and tick salivary cystatin) had two unique structural features: a tighter hydrophobic core and a more open active site segment. These features could be related to their functions and may have arisen during evolution. In a parasite's life cycle, its cystatins have to act not only on its own proteases but also on proteases in its host. The relatively tighter core may prevent the cystatins from being degraded easily in the host. The more open active site segment could render the cystatins more flexible and accessible for binding to the target cysteine protease. This hypothesis should be verified by further experiments.
In conclusion, our results demonstrated that the cysteine protease inhibitor from the human gastrointestinal nematode A. lumbricoides has distinct effects on different cathepsins. Structural analysis of the recombinant Al-CPI protein identified the key segments involved in the inhibitory function of this parasite-derived molecule. These observations may provide important insight into the molecular mechanism of immunosuppression associated with helminth infections and might be useful for the development of anti-allergic immunomodulatory drugs.