A Comprehensive Analysis of Plasmodium Circumsporozoite Protein Binding to Hepatocytes

Circumsporozoite protein (CSP) is the dominant protein on the surface of Plasmodium sporozoites and plays a critical role in the invasion of hepatocytes by sporozoites. Contacts between CSP and heparan sulfate proteoglycans (HSPGs) lead to the attachment of sporozoites to hepatocytes and trigger signaling events in the parasite that promote invasion of hepatocytes. The precise sequence elements in CSP that bind HSPGs have not been identified. We performed a systematic in vitro analysis to dissect the association between Plasmodium falciparum CSP (PfCSP) and hepatocytes. We demonstrate that interactions between PfCSP and heparin or a cultured hepatoma cell line, HepG2, are mediated primarily by a lysine-rich site in the amino terminus of PfCSP. Importantly, the carboxyl terminus of PfCSP facilitates heparin binding by the amino terminus but does not interact directly with heparin. These findings provide insights into how CSP recognizes hepatocytes and useful information for further functional studies of CSP.


Introduction
Species of the genus Plasmodium are the causative agents of malaria. Malaria is transmitted through the bite of an infected Anopheles mosquito, which injects parasite forms, known as sporozoites, into human hosts. Sporozoites develop in the mosquito midgut inside a structure called an oocyst. After release from the oocyst, sporozoites invade the mosquito salivary glands [1][2][3]. When an infected mosquito takes a blood meal, sporozoites are deposited into the host's dermis. The transmitted sporozoites enter the host's bloodstream, which carries them to the liver. Sporozoites invade hepatocytes and undergo replication within a vacuole to form liver stages. After their release into the bloodstream, the parasites initiate the symptomatic erythrocytic cycle of malaria. Blocking sporozoite infection of the liver decreases both the incidence and severity of the disease, as shown by the use of the recently approved anti-malaria vaccine [4]. This vaccine targets the most abundant protein on the sporozoite surface, the circumsporozoite protein (CSP).
CSP is present in all Plasmodium species. Though variation exists in the amino acid sequence across species, the overall domain structure is well conserved. CSP is embedded in the plasma membrane via a GPI anchor at its C-terminus [5,6], exposing the protein to the extracellular space. The protein consists of an N-terminal domain (NTD), a region of tandem repeats, and a C-terminal domain that is homologous to the thrombospondin type-1 repeat (TSR) superfamily (Fig 1A). The NTD of CSP lacks predicted secondary structures [7], likely adopting a flexible configuration. Three well-conserved motifs have been identified in CS proteins from different Plasmodium species: region I, consisting of the peptide KLKQP immediately upstream of the central repeat domain; region III, immediately downstream of the repeats; and region II plus, in the TSR domain. Recent structural studies have revealed that region III and the TSR fold into a single "αTSR" domain (also referred to as the C-terminal domain, CTD) [8].
CSP plays multiple roles in the parasite life cycle, as it is required for the formation of sporozoites in the mosquito midgut [9,10], the release of sporozoites from the oocyst [11], invasion of salivary glands [1], attachment of sporozoites to hepatocytes in the liver [12], and sporozoite invasion of hepatocytes [13,14]. The prevailing model of CSP structure-function suggests that the NTD binds to and masks the C-terminus while sporozoites are in the mosquito salivary glands. In the liver, primary contacts between the N-terminus of CSP and heparan sulfate proteoglycans (HSPGs) on the hepatocyte arrest sporozoites on the hepatocyte surface. Subsequent cleavage of CSP in region I unmasks the TSR domain. The TSR domain has cell adhesive properties [15] and is thought to mediate the secondary attachment between sporozoites and hepatocytes, thus leading to the invasion of hepatocytes by sporozoites [14]. The sequence elements of CSP that mediate primary and secondary interactions with HSPGs are still unclear.
Full-length CSP is difficult to purify [16,17] and has not been tested systematically for association with heparin or hepatocytes. Binding assays using peptides or fragments of recombinant CSP suggested that positively charged regions, such as region I and region II plus, are potential binding sites for negatively charged HSPGs [18][19][20]. However, the physiological relevance of these data is questionable due to probable improper folding or lack of necessary intramolecular context in CSP peptides or fragments. Attempts have been made to study the CSP structure-function relationship in vivo using Plasmodium berghei parasites [11,14]. However, most of these studies did not examine the attachment of sporozoites to hepatocytes and focused on either sporozoite exit from the dermis or sporozoite invasion of hepatocytes. Mutant sporozoites expressing CS protein lacking the entire NTD stick promiscuously to various tissues in the mosquito and dermis [14], suggesting that the TSR domain alone does not facilitate specific interactions between CSP and HSPGs on hepatocytes. These results are consistent with in vitro data demonstrating that the purified TSR domain from Plasmodium falciparum CSP (PfCSP) does not bind heparin [8]. Furthermore, sporozoites lacking region I alone are not affected in hepatocyte attachment [14]. Taken together, these studies suggest that, during the primary attachment between sporozoites and hepatocytes, the NTD of CSP (excluding region I) contains sequence elements that allow specific interactions between CSP and HSPGs. CSP sequences enriched in basic residues have previously been scanned for heparin binding activity. A region I-containing peptide was found to interact strongly with heparin [20], implicating region I in mediating the contacts between CSP and HSPGs. The sequence upstream of region I contains several basic residues (Fig 1B). The presence of these basic residues raises the possibility that amino acids outside of region I contribute to the binding between CSP and HSPGs.
Here, we conduct a systematic analysis of the interaction between CSP and heparin or hepatocytes using GFP-tagged recombinant CS proteins. We found that the NTD of CSP, particularly a lysine cluster upstream of region I, is critical for direct interactions between CSP and HSPGs. Other regions of CSP, including region I and the αTSR domain, may coordinate to present this heparin-binding site indirectly. Our results imply that the lysine cluster is crucial for the specific interaction between CSP and hepatocytes.

Fig 1 (legend, partial). Mutations that affect heparin binding are colored in red and the mutation that does not is colored in cyan. Basic residues in other CSPs that are near region I are colored in orange. Region I is highlighted by a yellow box. Peptides that have been tested for heparin binding are underlined. C, D, Heparin binding of GFP-PfCSP (C) and GFP alone (D). ~150 μg of purified protein was applied to the heparin column.

Interactions between CSP and heparin
To test and visualize the binding of CSP to hepatocytes, we engineered and purified a series of constructs in which the N-terminal signal peptide of PfCSP was replaced by a GFP tag to assist in protein folding (Fig 1A). To further optimize bacterial expression, recombinant PfCSP (residues 19-127 and 256-374) lacked the C-terminal GPI anchor signal and 32 out of 43 repeats. Because CSP is known to recognize heparin, we initially determined the retention of the protein on a heparin column while washing with a salt gradient. As expected, most of the GFP-PfCSP bound to the column (comparing input to flow-through, Fig 1C) and could only be eluted at high salt concentrations (fractions 17-19, Fig 1C). Such binding is unlikely to be due to the GFP tag, as GFP alone mainly flowed through the heparin column under the same conditions and the retained portion exhibited very weak interactions with the column (Fig 1D). These results suggest that purified recombinant GFP-PfCSP is capable of binding heparin in vitro.
To determine the regions of CSP that could be involved in heparin binding, we examined the NTD for features that are conserved between different Plasmodium species in addition to region I. Notably, there is a conserved lysine pattern in CS proteins from different Plasmodium species infectious to mammals (Fig 1B), suggesting that these lysines play a conserved functional role. A peptide-based binding assay previously suggested that K85 of PfCSP plays a role in heparin binding [21]. To investigate the role of these lysine residues in heparin binding, we tested the binding of GFP-PfCSP carrying a substitution at K85. GFP-PfCSP K85A eluted from the heparin column at lower salt concentrations compared to GFP-PfCSP (Fig 1E). Substitutions of additional lysines, specifically K88, K90, and K92 individually or in combination, also decreased CSP binding to heparin (Fig 1F, 1G and 1H). The effects of these lysines are specific, as substitution of K67 (K67A) did not significantly decrease heparin binding (Fig 1I). Taken together, the results indicate that the lysine cluster in proximity to region I is most critical for heparin binding.
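To make the notion of a "cluster" concrete, the lysine positions named in this section can be grouped by their spacing along the sequence. The Python sketch below uses only the positions mentioned in the text (K67, K85, K88, K90, K92, and the region I lysines K93 and K95). The function name and the 3-residue gap threshold are hypothetical choices made for illustration; they are not part of the methods used in this study, and the list does not cover every lysine in PfCSP.

```python
def cluster_positions(positions, max_gap):
    """Group residue positions into clusters.

    Two consecutive positions fall in the same cluster when they are
    at most `max_gap` residues apart (an arbitrary, illustrative cutoff).
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)   # close enough: extend current cluster
        else:
            clusters.append([pos])     # too far: start a new cluster
    return clusters


# Lysine positions named in the text (PfCSP numbering).
# K93 and K95 belong to the region I motif (residues 93-97).
LYSINES_IN_TEXT = [67, 85, 88, 90, 92, 93, 95]

clusters = cluster_positions(LYSINES_IN_TEXT, max_gap=3)
for cluster in clusters:
    print("K" + ", K".join(str(p) for p in cluster))
```

With this cutoff, K67 stands alone while K85 through K95 form one contiguous basic stretch, which matches the description of a lysine cluster lying immediately upstream of, and running into, region I.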
To investigate whether additional sequence elements of CSP were involved in the interaction with heparin, we truncated either the N-terminus (GFP-PfCSPΔN: deletion of residues 19-92) or C-terminus of PfCSP (GFP-PfCSPΔC: deletion of residues 310-374). GFP-PfCSPΔN was not trapped in the heparin column (Fig 1J), whereas GFP-PfCSPΔC bound to the column but was washed off by relatively low concentrations of salt (fractions 13-16, Fig 1K). These results imply that both termini contribute to interactions with heparin and confirm that the N-terminus is relatively more important.
Several basic regions of CSP have been proposed to engage the negatively charged heparin. These regions include region I, which consists of a conserved "KLKQP" motif (Fig 1B), and region II plus, which consists of a "RKRK"-like motif in the TSR region. Deleting the region I "KLKQP" motif (residues 93-97) from GFP-PfCSP slightly reduced the interaction with heparin, as shown by more protein being eluted in earlier fractions from the column (Fig 1L). The same reduction was observed when the two lysines in region I were substituted with alanine (Fig 1M, K93A/K95A). Thus, mutations in region I have a minor effect on heparin binding, consistent with the normal attachment of sporozoites lacking region I to hepatocytes [14]. Confirming previous results obtained with His-tagged-αTSR [8], GFP-αTSR (residues 310-374, contains region II plus) exhibited marginal binding to the heparin column (Fig 1N). These results, in combination with the reduced affinity of GFP-PfCSPΔC for heparin, suggest that the αTSR domain indirectly influences CSP-HSPG interactions.

Fig 1 (legend, continued). Samples were analyzed by SDS-PAGE, followed by Coomassie staining. Domain structure of the protein used is shown on the left. I, input; FT, flow-through. E-N, as in C, but with GFP-tagged CSP mutants. Arrowhead indicates a contaminant. All data were confirmed by at least three independent experiments using three independently purified batches of proteins. Data shown are from a representative experiment. doi:10.1371/journal.pone.0161607.g001
To address the possibility that the lack of binding by GFP-PfCSPΔN is due to poor folding of the CSP produced in the bacterial expression system, we purified GFP, GFP-PfCSP and PfCSP mutant proteins from Pichia pastoris, a eukaryotic expression system. We found that GFP-PfCSP purified from P. pastoris (PpGFP-PfCSP) bound heparin strongly, whereas GFP-PfCSPΔN and GFP-αTSR exhibited decreased binding and PpGFP showed no detectable binding (Fig 2A), similar to the results obtained with proteins produced in E. coli (Fig 1C, 1D, 1J and 1N). Though the yield of PpGFP-PfCSP was poor, we recovered sufficient protein for subsequent studies. Gel filtration analysis showed that GFP-PfCSPΔN and GFP-αTSR from yeast (PpGFP-PfCSPΔN and PpGFP-αTSR, respectively) and from E. coli (EcGFP-PfCSPΔN and EcGFP-αTSR) behaved similarly (Fig 2B and 2C). PpGFP-PfCSPΔN and EcGFP-PfCSPΔN proteins had a tendency to oligomerize and eluted over the same wide range (Fig 2B). Both PpGFP-αTSR and EcGFP-αTSR behaved normally and were eluted in similar fractions (Fig 2C). Collectively, these results suggest that recombinant CS proteins produced in bacteria and yeast are equivalent in terms of heparin binding.

Testing interactions between the CSP termini
The NTD and CTD of CSP have been proposed to interact with each other prior to proteolysis at region I [14]. To test this interaction, we performed co-immunoprecipitation assays using purified proteins. When GFP-PfCSPΔC was immunoprecipitated using anti-GFP antibodies, there was no detectable co-precipitation of HA-αTSR (Fig 3A). Conversely, when HA-αTSR was precipitated using anti-HA resin, there was only a trace amount of GFP-PfCSPΔC in the precipitate (Fig 3B). However, more GFP alone co-precipitated with HA-αTSR, suggesting that a nonspecific interaction between GFP and HA-αTSR accounts for the trace amount of GFP-PfCSPΔC in the precipitate.
To address the possibilities that incorrect folding of the NTD in PfCSPΔC prevents interaction with HA-αTSR, and that the NTD and CTD interact inter-molecularly, HA-αTSR was incubated with GFP-PfCSP prior to immunoprecipitation with anti-GFP or anti-HA. There was still no detectable co-precipitation (Fig 3C and 3D). These results suggest that the CTD, i.e., the αTSR domain, interacts very little with the NTD of CSP, at least in vitro.

Interactions between CSP and cultured hepatocytes
To determine exactly which domain(s) of CSP interact with liver cells, we incubated different recombinant CS proteins with HepG2 cells and measured their association by flow cytometry using the GFP tag of the recombinant CS proteins for quantification. Incubation of HepG2 cells with GFP-PfCSP, but not GFP alone, significantly increased the GFP signal on the HepG2 cells (Fig 4A). None of the other mutants exhibited detectable differences in HepG2 cell attachment compared to GFP alone (Fig 4A). These results confirmed that the binding of CSP to heparin reflects its binding to liver cells.
To further analyze the activities of the NTD of CSP in a cellular context, we incubated various GFP-tagged CSPs with HepG2 cells and monitored their interactions using live cell imaging. As expected, GFP-PfCSP efficiently decorated the surface of HepG2 cells (Fig 4B). Consistent with the flow cytometry analysis, GFP alone or GFP-PfCSPΔN did not bind HepG2 cells (Fig 4B). When the newly identified lysine-rich region of CSP was mutated, interactions with HepG2 cells were disrupted (Fig 4B). These results confirmed that the NTD of PfCSP, in particular a lysine cluster upstream of region I, is critical for CSP interaction with hepatocytes.

Discussion
Our data provide important insights into how CSP contributes to the attachment of Plasmodium sporozoites to liver cells. We propose that specific lysine residues (K85/K88/K90/K92) upstream of the region I motif in CSP bind to HSPGs on the surface of hepatocytes, likely through electrostatic attraction. One of the lysines, K85 in PfCSP, had been suggested to recognize heparin [21], whereas the other three lysines were not previously identified. The C-terminal αTSR domain helps stabilize this interaction, though it does not interact directly with either heparin [8] or the NTD of CSP. Region I in the N-terminus is even less important for heparin binding.
The importance of the identified lysine cluster in mediating CSP interactions with the mammalian liver is supported by domain swapping experiments in which CSP in the rodent-infective parasite P. berghei was replaced by CSP from the avian-infective species Plasmodium gallinaceum (PgCSP) [22]. PgCSP lacks the lysine cluster that we hypothesize is important for targeting sporozoites to the liver. Indeed, P. gallinaceum sporozoites infect macrophages instead of hepatocytes [23]. Interestingly, the chimeric P. berghei sporozoites expressing PgCSP were no longer infectious to rodents and failed to invade mosquito salivary glands. Their infectivity to rodents was restored with the re-introduction of the NTD of P. berghei CSP, implying that CSP sequence elements responsible for mediating liver infection reside in the protein's N-terminus. These results are consistent with a model in which chimeric sporozoites are not efficiently arrested in the mouse liver due to the lack of lysine residues in PgCSP.
Region I was suggested to contact HSPGs based on experiments utilizing peptides or protein fragments containing region I and the lysine residues identified in the current study [20]. Our data suggest that binding of these peptides to heparin can be explained by the interactions between the lysine residues (K85/K88/K90/K92) and heparin. Consistent with our observation, mutant parasites lacking region I but containing the lysine cluster exhibited normal attachment to Hepa1-6 cells [14]. These results support the notion that region I plays a minor role in the initial binding of CSP on the sporozoite surface to hepatocytes.
Region II plus is also thought to be involved in hepatocyte attachment because of the enrichment of basic residues [24]. Structural studies of the αTSR domain have revealed that region II plus is buried inside the molecule [8], making it less likely that it makes direct contact with other molecules. Indeed, the purified αTSR domain, with intact region II plus, does not bind heparin. Nevertheless, parasites with mutated region II plus exhibit decreased infectivity [11]. One possibility is that the mutation of region II plus distorts the folding of αTSR, which in turn destabilizes the NTD and indirectly affects hepatocyte binding, similar to the GFP-PfCSPΔC mutant tested here. Alternatively, the αTSR domain may bind to molecules other than HSPGs on the surface of hepatocytes to promote secondary attachment to hepatocytes and invasion. Indeed, mutant sporozoites carrying an exposed αTSR domain because of deletion of the preceding NTD of CSP invade even non-liver tissues, such as the dermis [14].
Even though the αTSR domain does not interact with heparin directly, deletion of αTSR affects binding between heparin and the NTD. These results suggest that the CTD can stabilize the NTD. The αTSR domain has also been shown to be accessible to antibodies only when the NTD is cleaved off at region I, suggesting that the CTD is masked by the NTD prior to cleavage [14,25]. Though a hydrophobic groove was identified in the αTSR domain, suggesting a potential binding pocket [8], the rod-like structure of CSP probed by atomic force microscopy [16] indicated that an association between the termini of CSP barely exists or is unstable. Our co-immunoprecipitation assay did not detect an association between purified NTD and CTD. One possibility is that the repeat region, which is largely missing in the recombinant proteins, plays a coordinative role. Alternatively, the interaction is only possible when CSP is properly oriented and assembled on the sporozoite membrane. How the NTD shields the CTD before invasion and how the CTD stabilizes the NTD in heparin binding remain to be determined.
We propose that the primary attachment of CSP to HSPGs on the surface of hepatocytes uses the lysine-rich site in the NTD. Following the attachment of sporozoites to hepatocytes, proteolysis at a site in region I exposes the CTD. Finally, the CTD engages the hepatocytes more tightly and promotes invasion by sporozoites [14]. We did not find evidence of an interaction between the CTD and HSPGs, making it likely that the CTD attaches to hepatocytes using molecules other than HSPGs.
Our work also revealed that the attachment of a GFP tag and a reduction in the number of repeats in the central region of CSP greatly facilitate the solubility and purification of CS protein from E. coli. Previously, CSP expressed recombinantly in E. coli needed to be refolded to yield soluble proteins [7]. The biophysical properties and heparin-binding abilities of the recombinant modified CSP produced in E. coli and P. pastoris are equivalent. Therefore, the binding analyses in this study are likely to be physiologically relevant. Successful production of large amounts of well-folded CS protein will enable structural studies of CSP and the identification of hepatocyte proteins that interact with the αTSR domain of CSP.

Plasmid construction
The codon-optimized wild-type CS gene encoding P. falciparum 3D7 CS (residues 19-127 and 256-374) was fused with a GFP tag at the N-terminus and a His tag at the C-terminus. All point mutations and truncations were generated by overlap PCR using wild-type CS as the template. For E. coli expression, the wild-type CS and all mutants were inserted into the EcoRI/NotI sites of pET-28a. For Pichia pastoris expression, GFP-CS, GFP-CSΔN, and GFP-αTSR were cloned into pPIC9K using EcoRI/NotI sites. All constructs were confirmed by DNA sequencing.
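As a quick consistency check on the construct boundaries, the number of CSP residues retained in each construct can be worked out from the residue ranges stated in the text. The Python sketch below is illustrative only: the function and dictionary names are hypothetical, ranges are treated as inclusive, and the counts cover CSP-derived residues only, excluding the GFP tag, the His tag, and any linker residues, whose lengths are not given here.

```python
# CSP segments retained in the recombinant wild-type construct
# (PfCSP numbering, inclusive ranges, as stated in the text).
WILD_TYPE_SEGMENTS = [(19, 127), (256, 374)]


def csp_residue_count(segments, deleted=None):
    """Count CSP residues kept in a construct.

    `segments` lists inclusive (start, end) ranges that are present;
    `deleted` is an optional inclusive (start, end) range removed from them.
    """
    kept = set()
    for start, end in segments:
        kept.update(range(start, end + 1))
    if deleted is not None:
        kept -= set(range(deleted[0], deleted[1] + 1))
    return len(kept)


csp_lengths = {
    "GFP-PfCSP": csp_residue_count(WILD_TYPE_SEGMENTS),
    "GFP-PfCSP-deltaN": csp_residue_count(WILD_TYPE_SEGMENTS, deleted=(19, 92)),
    "GFP-PfCSP-deltaC": csp_residue_count(WILD_TYPE_SEGMENTS, deleted=(310, 374)),
    "GFP-PfCSP-deltaRegionI": csp_residue_count(WILD_TYPE_SEGMENTS, deleted=(93, 97)),
    "GFP-alphaTSR": csp_residue_count([(310, 374)]),
}

for name, n in csp_lengths.items():
    print(f"{name}: {n} CSP residues")
```

Under these assumptions the wild-type construct carries 228 CSP residues (109 from residues 19-127 plus 119 from residues 256-374), and the ΔN, ΔC, and region I deletions remove 74, 65, and 5 residues, respectively.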

Protein expression and purification
For E. coli expression, all constructs were transformed into bacterial strain BL21 (DE3) and cultures grown in Luria-Bertani media at 37°C to an OD600 of 0.8. Protein expression was induced by the addition of 0.35 mM IPTG for 24 h at 16°C. Cells were harvested, resuspended in lysis buffer (10 mM Tris pH 8.0, 300 mM NaCl), and lysed by ultrasonication. The lysate was centrifuged at 30,000 rpm for 1 h and the supernatant collected. The protein was isolated on Ni-NTA agarose (Qiagen) and further purified by gel filtration chromatography (Superdex-200; GE Healthcare).
For P. pastoris expression, SalI linearized plasmids were transformed into P. pastoris strain GS115 by electroporation. The positive recombinants were then cultured in buffered complex glycerol media followed by protein expression for 48-72 h in buffered complex methanol media (Pichia Expression kit, manual 25-0043; Invitrogen). Culture supernatant supplemented with 0.5 M NaCl and 10 mM imidazole was incubated with Ni-NTA agarose for 1 h at 4°C. After washing with 0.3 M NaCl, 10 mM Tris pH 8.0, and 20 mM imidazole, protein was eluted with 0.15 M NaCl, 10 mM Tris pH 8.0, and 0.5 M imidazole and further purified by gel filtration.

Heparin binding assay
Purified proteins were diluted with binding buffer (10 mM Tris pH 7.4) and loaded onto a heparin affinity column (Heparin HP 1 ml; GE Healthcare) equilibrated in the same buffer. After washing the column with 5 bed volumes of binding buffer at 0.5 ml/min, the column was developed with a 0-1 M NaCl linear concentration gradient (20 bed volumes) at 0.5 ml/min. Fractions containing target proteins were detected by SDS-PAGE or Western blot.
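The gradient parameters above fix the nominal NaCl concentration at any point during elution. The Python sketch below shows that arithmetic for a 1 ml column, a 0-1 M linear gradient over 20 bed volumes, and a flow rate of 0.5 ml/min. It is a rough illustration rather than the analysis used in this study: the function names are hypothetical, system dead volume is ignored, and because the fraction size is not stated in the text, no attempt is made to map fraction numbers to salt concentrations.

```python
def nacl_molar(gradient_volume_ml, bed_volume_ml=1.0, gradient_bed_volumes=20,
               start_molar=0.0, end_molar=1.0):
    """Nominal NaCl concentration (M) after `gradient_volume_ml` of a
    linear gradient has been delivered. Values outside the gradient
    are clamped to the start or end concentration."""
    total_ml = bed_volume_ml * gradient_bed_volumes
    fraction_done = min(max(gradient_volume_ml / total_ml, 0.0), 1.0)
    return start_molar + (end_molar - start_molar) * fraction_done


def step_duration_min(bed_volumes, bed_volume_ml=1.0, flow_ml_per_min=0.5):
    """Time (min) needed to pass `bed_volumes` through the column."""
    return bed_volumes * bed_volume_ml / flow_ml_per_min


wash_min = step_duration_min(5)        # 5 bed volumes of binding buffer
gradient_min = step_duration_min(20)   # 20 bed volumes of gradient
midpoint_molar = nacl_molar(10.0)      # halfway through the gradient

print(f"wash: {wash_min} min, gradient: {gradient_min} min")
print(f"NaCl at 10 ml into the gradient: {midpoint_molar} M")
```

With these settings the wash takes 10 min, the gradient takes 40 min, and the nominal NaCl concentration rises by 0.05 M per ml of gradient delivered.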

Hepatocyte binding assay
The HepG2 hepatoma cell line (ATCC, HB-8065) was maintained at 37°C with 5% CO2 in DMEM containing 10% fetal bovine serum. Cells were seeded in a 6-well plate and allowed to adhere for at least 24 h. For FACS analysis, the cells were washed two times with PBS and digested with 0.25% trypsin-EDTA. When the cells detached from the plate, medium was added to terminate the reaction. The cells were harvested, washed once with PBS, and resuspended in 1 ml PBS, followed by the addition of 20 μg purified protein and incubation at 37°C for 1 h. Unbound protein was removed by washing three times. Finally, cells were resuspended in 500 μl PBS and GFP fluorescence was measured by flow cytometry. For microscopy analysis, the cells were incubated with 160-200 μg purified protein at 37°C for 1 h, washed with pre-warmed media and stained with Hoechst at 10 μg/ml. Washed cells were then analyzed live in PBS using a Zeiss Observer Z1 microscope with a 40× objective lens.

GFP pull-down and HA pull-down
For GFP pull-down, 2 μg GFP or GFP fusion protein was incubated with 20 μl Anti-GFP mAb-Agarose (MBL) in 200 μl binding buffer (0.3 M NaCl, 10 mM Tris pH 8.0, 0.5% Triton X-100) for 2 h at 4°C, followed by washing the agarose three times to remove unbound protein. The agarose was resuspended in 500 μl binding buffer, and 20 μg purified HA-αTSR was added and incubated for 2 h at 4°C. After three washes, the agarose was mixed with an equal volume of SDS sample buffer. Finally, the samples were subjected to SDS-PAGE and Western blot. The HA pull-down followed the same procedure with the roles of the proteins reversed. A total of 2 μg HA-αTSR was incubated with 20 μl Anti-HA-Agarose (Sigma) for 2 h at 4°C. After washing the agarose, GFP or GFP fusion protein was incubated with the washed agarose for 2 h at 4°C in 500 μl binding buffer. After washing the agarose three times and adding SDS sample buffer, the samples were evaluated by Western blot.