Lack of Evidence from Studies of Soluble Protein Fragments that Knops Blood Group Polymorphisms in Complement Receptor-Type 1 Are Driven by Malaria

Complement receptor-type 1 (CR1, CD35) is the immune-adherence receptor, a complement regulator, and an erythroid receptor for Plasmodium falciparum during merozoite invasion and subsequent rosette formation involving parasitized and non-infected erythrocytes. The non-uniform geographical distribution of Knops blood group CR1 alleles Sl1/2 and McCa/b may result from selective pressures exerted by differential exposure to infectious hazards. Here, four variant short recombinant versions of CR1 were produced and analyzed, focusing on complement control protein modules (CCPs) 15–25 of its ectodomain. These eleven modules encompass a region (CCPs 15–17) key to rosetting, opsonin recognition and complement regulation, as well as the Knops blood group polymorphisms in CCPs 24–25. All four CR1 15–25 variants were monomeric and had similar axial ratios. Modules 21 and 22, despite their double-length inter-modular linker, did not lie side-by-side so as to stabilize a bent-back architecture that would facilitate cooperation between key functional modules and Knops blood group antigens. Indeed, the four CR1 15–25 variants had virtually indistinguishable affinities for immobilized complement fragments C3b (K D = 0.8–1.1 µM) and C4b (K D = 5.0–5.3 µM). They were all equally good co-factors for factor I-catalysed cleavage of C3b and C4b, and they bound equally within a narrow affinity range, to immobilized C1q. No differences between the variants were observed in assays for inhibition of erythrocyte invasion by P. falciparum or for rosette disruption. Neither differences in complement-regulatory functionality, nor interactions with P. falciparum proteins tested here, appear to have driven the non-uniform geographic distribution of these alleles.


Introduction
The Swain-Langley (Sl1/2) and McCoy (McCa/b) Knops blood group antithetical antigen pairs lie within the erythrocyte-borne membrane glycoprotein, complement receptor-type 1 (CR1, CD35) [1,2]. Both encoded amino acid sequence variations occur in the vicinity of the 25th complement control protein (CCP) module (or short consensus repeat) out of the 30 CCP modules that constitute the ~1800-residue N-terminal ectodomain of the most common size-variant of CR1 [3] (Fig. 1). The strikingly non-uniform geographical distribution of these alleles has been much discussed [4,5]. Notably, both McCb (coding for E1590; McCa codes for K1590) and Sl2 (coding for G1601; Sl1 codes for R1601) are extremely rare in Caucasoids but common (prevalence of 45% and 80%, respectively) in individuals of African descent [6]. Such population differences probably arose due to selective pressure, and it was hypothesized that the Sl2 (and possibly McCb) alleles afford a survival advantage in the context of a geographically localized environmental hazard such as that posed by the malaria parasite Plasmodium falciparum [4,5,7]. Malaria is probably responsible for ~1,100,000 deaths annually in Africa [8] but is largely absent from Europe and North America.
Erythroid CR1 is a ligand used by P. falciparum merozoites for sialic acid-independent invasion of red blood cells [9,10,11]. Moreover, CR1 on non-parasitized erythrocytes is a ligand for P. falciparum erythrocyte membrane protein 1 (PfEMP1) borne on infected erythrocytes; this interaction mediates formation of ''rosettes'', which consist of clusters of infected and non-infected cells [12]. Rosette formation is associated with life-threatening forms of cerebral malaria [13]. Correlation between CR1 polymorphisms (Sl2/Sl2 vs. Sl1/Sl1) and resistance to severe malaria was reported in a western Kenyan population [14], but not in two studies of populations from The Gambia [15,16].
Selective-pressure hypotheses require that McC and Sl-encoded variants diverge with respect to copy number expressed or to structure-function relationships, but this has not been investigated comprehensively. Complement receptor-type 1 [17] regulates classical and alternative pathways of the complement system. It is also the immune-adherence receptor, mediating transport by erythrocytes of particles opsonised with complement-activation specific fragments, C3b and C4b, to the primary lymphoid system for clearance [18]. Modules 1-3 (site 1), 8-10 (site 2) and 15-17 (a near-identical copy of site 2) are the first three modules of, respectively, the first, second and third (out of four) highly similar long-homologous repeats (A-D) accounting for the N-terminal 28 CCPs of the most common CR1 size variant (Fig. 1) [19]. These sites interact with opsonic C3b and C4b [20] and, to a lesser degree, the C3b breakdown product, iC3b, and are collectively crucial to the immune-clearance role of CR1. They also accelerate decay of C3 and C5 convertases, crucial multiple-subunit enzymes in the complement cascade. Moreover, site 2 is a cofactor for proteolytic cleavage of C3b and C4b [21,22,23,24] by factor I that interrupts the complement cascade and generates ligands for other complement receptors linking complement to cellular and antibody-mediated immunity [25]. Importantly, site 1 also mediates interactions with Rh4 [26] vital for sialic acid-independent invasion by P. falciparum [11], while site 2 is a candidate for interaction with PfEMP1 and rosette formation [27].
It is intriguing that the Knops blood group antigens in CCPs 24-25 do not occupy known functional sites of the CR1 ectodomain. Residues encoded by Sl1/2 and McCa/b lie within long-homologous repeat D, seven modules away (in the primary structure) from the nearest proven functional site (i.e. CCPs 15-17, the copy of site 2 within long-homologous repeat C). Nonetheless, sites of sequence variation could be spatially apposed to one or more functional sites depending upon the overall architecture of CR1's long chain of CCPs [28,29,30,31]. In particular, the uniquely long linking sequence of eight amino acid residues (compared to four residues on average) connecting CCPs 21 and 22 (i.e. between long-homologous repeats C and D, see Fig. 1) could represent a molecular hinge, enabling the modules of long-homologous repeat D to fold back onto long-homologous repeat C and hence influence the activity of the second copy of site 2.
Thus the possibility requires investigation that sequence variations in or near to CCP 25 influence C3b/C4b binding, complement-regulatory activity or binding to Rh4 or PfEMP1, despite being remote in the sequence from the primary sites of interaction. It is also possible that long-homologous repeat D possesses its own less well characterized or yet-to-be discovered binding sites for host or parasite proteins. For example, classical complement pathway protein C1q (part of the C1 complex that also includes the serine protease C1r and C1s) reportedly binds long-homologous repeat D of CR1 [32].
We prepared four eleven-CCP constructs, representing the McC and Sl-encoded CR1 variants, along with a set of shorter recombinant constructs encompassing modules at the border between long-homologous repeats C and D (CCP 21-CCP 22). We were thus able to show that Knops blood group variants in long-homologous repeat D neither modulate structure nor influence known functional activities of CR1 that might underlie selective pressure.

Results
A panel of truncations was designed for this study and produced in P. pastoris
We made truncations of CR1 (Fig. 2) in the same strain of Pichia pastoris previously employed to produce biologically active full-length factor H (FH) (a CR1 homologue containing 20 CCPs) [33]. Recombinant CR1 15-25, i.e. CCPs 15-25, was designed as the minimal CR1 fragment spanning the second copy of site 2 (CCPs 15-17) and variant residues 1601 (Sl1/2) in CCP 25 and 1590 (McCa/b) in the linker between CCPs 24 and 25.
Producing the four variants of this eleven-module fragment, namely Glu1590/Gly1601 (EG, from McCb/Sl2), Lys1590/Arg1601 (KR, from McCa/Sl1, >98% prevalence in Caucasoids), Lys1590/Gly1601 (KG, from McCa/Sl2) and Glu1590/Arg1601 (ER, from McCb/Sl1), in preference to an entire (e.g. 30-module) ectodomain of CR1, allowed us to focus on a single functional site. Moreover, unlike in the case of variants of CCPs 1-30, large amounts of material for quantitative measurements and biophysical characterisation could be produced. We also produced smaller fragments, CR1 21, CR1 21-22 and CR1 20-23 (Fig. 2), to investigate the potential of the double-length linker between CCPs 21 and 22 to promote a bent-back module chain, in which CCPs 24/25 lie close to CCPs 15-17. All proteins were secreted in good yields and purified (Fig. 2) using sequential ion-exchange and gel-filtration chromatographic steps.

CCPs 21 and 22 of CR1 do not form a side-by-side interaction
We compared ¹H,¹⁵N-HSQC NMR spectra of CR1 21, CR1 21-22, and CR1 20-23 (Fig. 3). In these spectra each amide group produces a cross peak, the position of which is determined by its ¹⁵N (y-axis) and ¹H (x-axis) chemical shifts. The spectra were of high quality, featuring predominantly uniform line-widths and excellent chemical shift dispersion, showing that these proteins are properly folded. Nearly all cross peaks in the CR1 21 spectrum overlap with cross peaks in the CR1 21-22 spectrum (Fig. 3A), showing that attachment of CCP 22 to CCP 21 causes only minor perturbations of chemical shifts for amide groups in CCP 21. This implies that these modules share only a small inter-modular interface consistent with an end-to-end (rather than side-by-side) interaction. Likewise, many cross peaks in the partially assigned spectrum of CR1 21-22 match cross peaks in the spectrum of CR1 20-23 (Figs. 3B,C); of those that do not, none corresponded to residues in the linker between CCPs 21 and 22; in fact most lie in module 22 (data not shown). These observations are consistent with CCPs 21 and 22 retaining their end-to-end disposition within the context of CR1 20-23.
To further investigate spatial arrangements of modules in CR1 20-23, samples were subjected to small-angle X-ray scattering (SAXS), generating high-quality data with no concentration dependency (Figs. 4A,B). The resulting DAMMIF ab initio model, with an excluded volume of 59 nm³, consistent with that expected for a monomeric, properly folded CR1 20-23 (29 kDa), fitted well to the SAXS profile (discrepancy, χ = 1.5) (Fig. 4A). Both the model and the distance distribution, with a Dmax of 15 nm (Fig. 4C), are fully consistent with a largely elongated arrangement of the four modules (see insert in Fig. 4A) and cannot be reconciled with a stable bent-back conformation as was formed, for example, by the six CCPs at the centre of FH that had a Dmax of 10.4 nm [34].
Thus, in CR1 20-23, the long linker between CCPs 21 and 22 (i.e. between long-homologous repeats C and D) does not induce a U-bend. Together with their immediate neighbours, these modules form a highly elongated structure. Extrapolating to a larger construct, CCPs 19 and 24 would be ~15 nm apart, disfavouring apposition of CCPs 15-17 and CCPs 24-25. These results suggest that sequence variations in CCPs 24-25 (of long-homologous repeat D) have only local structural effects. To test this, analytical ultracentrifugation (AUC) was used to compare CR1 15-25 allotypic variants.

Biophysical analysis reveals no differences in gross structures of variants
We subjected solutions of the CR1 15-25 allotypic variants to AUC and compared their sedimentation coefficients and axial ratios (Table 1). All had Svedberg values in the narrow range of 3.14-3.25, and calculated hydrodynamic radii in the range 5.55-5.80 nm, with no indication of any loss of folded structure or self-association over the concentration range (0.1-1 mg.ml⁻¹) measured. These results are in good agreement with estimates of particle size distribution from dynamic light scattering (Fig. S1). Their calculated axial ratios were also similar, within experimental error, falling in the range of 5.6-6.6.
Thus, consistent with NMR and SAXS-derived structural models of CCPs 20-23, the Knops blood group allotypic variations in CCPs 24-25 do not induce structural differences in CR1 15-25 detectable by AUC (or dynamic light scattering). These data, taken together with the SAXS and NMR findings for CR1 20-23, are best explained by an open architecture for CR1 15-25, in which contacts between non-adjacent modules are improbable. On the other hand, the sequences of modules (CCPs 15-19 and CCPs 24-25) flanking the elongated section (CCPs 20-23) cannot extend linearly and rigidly in opposite directions (since that would have resulted in a structure with a much higher axial ratio). It therefore remains conceivable that these flanking sequences could cooperate in binding to a large ligand such as C3b or C4b, even though in previous work no binding between long-homologous repeat D (alone) and C3b/C4b was detected [20]. This possibility, whereby long-homologous repeat D contributes to binding C3b/C4b only after the higher affinity site in long-homologous repeat C had been occupied, was explored through binding assays.
All four CR1 15-25 variants bind equally well to opsonins C3b and C4b
We investigated whether variant residues in CCPs 24-25 modulate the C3b/C4b-binding ability of CR1 functional site 2. Experiments based on SPR (see Fig. 5, and Figs. S2, S3) provided no evidence for such a proposition. All four CR1 15-25 variants bind similarly well to immobilized C3b, with KD values in the range of 0.8-1.1 µM, and to C4b, with KD values of 5.0-5.3 µM (Table 2). These data may be compared to KD values for CR1 15-17 of 1.1 µM (C3b) and 3.2 µM (C4b), measured on the same C3b- and C4b-loaded flow cells of the CM5 sensor chip (Figs. S2, S3). All the truncations bind less tightly to C3b and C4b than sCR1 (measured on the same CM5 chip; KD = 0.53 µM for C3b and KD = 0.91 µM for C4b; Figs. S2, S3); presumably this reflects the avidity provided by the multiple sites present in the full-length ectodomain. To provide independent validation of these results, which were based on studies of binding to covalently immobilized (amine-coupled) C3b and C4b, an ELISA was conducted in which C3b or C4b were adsorbed to polystyrene microtitre plates (Fig. S4). These experiments also failed to distinguish between the binding properties of CR1 15-25 variants. We concluded that the variant residues of CCP 25 have neither direct nor indirect roles in engagement of C3b and C4b by site 2.

The CR1 15-25 variants have identical abilities to regulate complement
Functional site 2 of CR1 is the predominant locus of cofactor activity for factor I-catalyzed proteolysis of C3b and C4b (see Figs. 6A,B). Cofactor activity very likely involves contacts between CR1 and both C3b (or C4b) and factor I [35]. We showed (above) that the long-homologous repeat C copy of site 2 engages with C3b and C4b independently of long-homologous repeat D. Nonetheless, it remained feasible that Knops blood group variations in CCPs 24/25 influence recruitment, or activation [35], of factor I to CR1:C3b or CR1:C4b complexes.
We conducted fluid-phase cofactor assays in which cleavage products of CR1/factor I, acting on C3b or C4b, were detected in a qualitative manner using SDS-PAGE, ensuring reactions had not proceeded to completion. In all CR1 15-25 lanes and the sCR1 lane, the pattern and strengths of bands corresponding to cleavage products (following a 60-minute incubation) appeared to be identical in the case of C3b (Fig. 6C) and, likewise, in the case of C4b (Fig. 6D). Thus all the recombinant variants had co-factor activity for the cleavages made by factor I needed to generate C3dg or C4d. No differences could be detected between the variants in the amounts or nature of product generated.

The variants have equivalent affinities for C1q
Interaction of CR1 with C1q was reported previously, along with suggestions that long-homologous repeat D (CCPs 22-28) contributes to this binding event [32,36,37]. We measured affinities of the four CR1 15-25 variants for C1q that had been immobilized, via amine coupling, on the CM5 sensor chip surface (under the same conditions and in the same series of experiments used to obtain KD values for C3b and C4b) (Fig. 7A, Table 2). Measured values lay in a narrow range of 5-6 µM, comparable with a value of 8 µM for sCR1 (Fig. S5). CR1 1-3 (site 1) bound weakly (KD >30 µM; not shown), but in this assay CR1 15-17 (site 2) bound better than sCR1 (KD ~1 µM), while the nine-CCP construct CR1 17-25 (i.e. missing CCPs 15 and 16) did not bind at all (Fig. S5); these data, which suggest that C1q binds to site 2, require further verification. In any case, the results of SPR strongly indicate that the McC and Sl-encoded variations had small or negligible effects on binding of C1q. Very similar results were obtained using an ELISA (see Fig. 7B) in which soluble C1q binding to CR1 truncations adsorbed on a micro-titer plate was measured.
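As an illustration of how a steady-state affinity such as the KD values quoted above can be extracted from a concentration series, the following minimal Python sketch assumes a simple 1:1 binding isotherm, R = Rmax·C/(KD + C), and fits it through a Scatchard-style linearization. The 1:1 model, the function names, the Rmax value and the synthetic responses are assumptions made for illustration; the fitting procedure actually used in the study is not described in this section. Only the concentration series (0.25-10 µM) is taken from the Figure 5 legend.

```python
def steady_state_response(conc_um, kd_um, rmax):
    """Equilibrium response for an assumed 1:1 interaction: R = Rmax*C/(KD + C)."""
    return [rmax * c / (kd_um + c) for c in conc_um]


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_kd_scatchard(conc_um, response):
    """Estimate KD and Rmax from R/C = Rmax/KD - R/KD (Scatchard form)."""
    ratios = [r / c for r, c in zip(response, conc_um)]
    slope, intercept = linear_fit(response, ratios)
    kd = -1.0 / slope        # slope of the Scatchard line is -1/KD
    rmax = intercept * kd    # intercept of the Scatchard line is Rmax/KD
    return kd, rmax


# Analyte concentrations (µM) as listed in the Figure 5 legend.
CONC_UM = [0.25, 0.5, 1.0, 2.5, 5.0, 10.0]

# Hypothetical, noise-free responses for KD = 1.0 µM and Rmax = 100 response units.
synthetic = steady_state_response(CONC_UM, kd_um=1.0, rmax=100.0)
kd_fit, rmax_fit = fit_kd_scatchard(CONC_UM, synthetic)
print(f"KD = {kd_fit:.3f} uM, Rmax = {rmax_fit:.1f} RU")
```

With real, noisy sensorgram plateaus a direct non-linear least-squares fit of the isotherm is generally preferable, because the linearization distorts the error structure.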

PfRh4 invasion pathway
Erythroid CR1 is the receptor required for the major sialic acid-independent route for invasion by multiple strains of P. falciparum [9,10,11,26]. The key P. falciparum protein in this invasion pathway, PfRh4, was previously shown to bind to CR1 CCPs 1-3 (site 1), but not to the highly similar CCPs 15-17 (site 2) nor to the Caucasoid variant (Lys1590/Arg1601) of CR1 15-25 that includes CCPs 22-24; these modules are similar in sequence to both CCPs 1-3 and CCPs 15-17 [26]. We examined the previously unexplored possibility that PfRh4 engages with other, non-Caucasoid, variants of CR1 15-25. According to SPR, none of the CR1 15-25 variants (like CR1 15-17, CR1 24-25(KR) and other negative controls) exhibited more than residual affinity for an immobilized recombinant version of PfRh4 (PfRh4.9) corresponding to its erythrocyte-binding N-terminal domain [38], unlike the positive controls, CR1 1-3 and sCR1 (Fig. 8A). To investigate whether the non-Caucasoid CR1 15-25 variants nevertheless interfered with the PfRh4-CR1 invasion pathway, their effects on parasite growth were monitored. As was observed previously, invasion of both untreated and neuraminidase-treated erythrocytes by the 3D7 strain was inhibited by addition of sCR1, whereas invasion of untreated erythrocytes by the parasite strain W2mefΔRh4, which lacks a functional CR1-PfRh4 pathway, was unaffected by the presence of sCR1 (Fig. 8B) [26]. These results served as a control for experiments with the CR1 15-25 variants [9,11] (see Fig. 8B). In agreement with the SPR study, none of the CR1 15-25 variants (unlike the positive control, sCR1) inhibited invasion of neuraminidase-treated erythrocytes by P. falciparum. Thus the Knops blood group antigens on CCPs 24/25 had no influence on the susceptibility of erythrocytes to invasion by P. falciparum, consistent with a location within the CR1 molecule that is remote both in the sequence and in space from the key PfRh4-binding site in CCPs 1-3 (site 1) [26].
All variants are equally implicated in P. falciparum rosetting
Previous work implicated site 2 (CCPs 8-10 or CCPs 15-17) in the rosette-mediating interaction between erythroid CR1 and PfEMP1 expressed on the surface of infected erythrocytes [27]. We were unable (data not shown) to detect binding between any of the CR1 truncations (including sCR1) and immobilized recombinant versions of the N-terminal Duffy binding-like domain from PfEMP1 [39] (in the same set of SPR experiments used to measure the affinity of sCR1 for Rh4.9). Nonetheless, we were able to elaborate on previous findings by confirming that not only was rosetting inhibited by sCR1 and by CR1 15-25, but it was also ameliorated by CR1 15-17 (as shown previously) and even by CR1 17 alone (Fig. 9A). This appeared to be a module-specific effect since rosetting was not inhibited by CR1 21, CR1 21-22 or CR1 20-23 (Fig. 9B), nor by full-length factor H or several factor H fragments (Fig. 9C). Importantly, we could not detect any difference between the four variants of CR1 15-25 in terms of their ability to disrupt rosetting (Fig. 9B).

Discussion
Complement receptor-type 1, a rare example of a protein whose ectodomain is composed entirely of modules of the same type, is a product of gene duplication and exon shuffling [40]. This evolutionary heritage is particularly apparent amongst CR1 size-variants (190-280 kDa), which possess different numbers of homologous blocks of seven CCPs arranged in long-homologous repeats [19]. In the 30-CCP ectodomain of the most common (gene frequency of ~0.85) size-variant, four tandem N-terminal long-homologous repeats (A-D) are followed by two membrane-proximal CCPs. Each CCP contains about 60 amino acid residues organized in a β-strand-rich compact unit stabilized by pairs of disulfides between invariant cysteines (Cys I-Cys III, Cys II-Cys IV) [41,42]. We investigated structural and functional consequences of Knops blood group antigenic variations of CR1 likely to be under geographical region-specific selective pressure resulting in non-uniform distribution of allotypes among global populations. These involve loss or gain of charge through substitution of residues predicted to be surface-exposed (see Fig. 1) and thus the E1590/G1601 (McCb/Sl2, common in sub-Saharan Africa) and K1590/R1601 (McCa/Sl1, predominant in Caucasoids) [2] variants will differ significantly in the electrostatic surfaces presented by their CCPs 24-25 regions (in long-homologous repeat D, Fig. 1) [43].
Linkers of just four amino acid residues join long-homologous repeats A and B, and long-homologous repeats B and C, but an eight-residue linker connects the last residue of long-homologous repeat C (i.e. consensus Cys IV of CCP 21) to the first of long-homologous repeat D (i.e. consensus Cys I of CCP 22). We looked for, but found no evidence of, structural or functional interactions between long-homologous repeats D and C despite the potential for this double-length linker between long-homologous repeats to allow or promote a close-to-180-degree bend in the molecule. Indeed, our results suggest the modules flanking this linker adopt an extended arrangement consistent with a stalk-like or spacer role for long-homologous repeat D, helping to project the functional sites-bearing CCPs 1-21 away from the membrane and clear of the glycocalyx of erythrocytes, where over 85% of human CR1 resides [2]. Thus, we infer that single or double-residue substitutions in long-homologous repeat D of surface-exposed residues are unlikely to have structural or electrostatic consequences for long-homologous repeats A-C.
Our SPR and ELISA-derived binding data are fully consistent with a large body of previous work suggesting N-terminal modules of long-homologous repeats A-C cooperate in interacting with the activated products of C3 and C4 (C3b and C4b) and their complexes, while long-homologous repeat D (the location of variant residues 1590 and 1601) does not engage directly with these ligands [17,20,21,23,24]. Long-homologous repeats A, B and C each carry an N-terminal set of three modules that presumably bind to surface-tethered C3b/C4b in a manner similar to one another and to factor H modules 1-3 (i.e. the long axes of modules aligned with the long axis of C3b and with the N-terminal of the three modules distal to the surface) [44]. Thus it is easy to envisage long-homologous repeats A-C, borne on the long-homologous repeat D stalk, adopting an architecture that has a quasi-three-fold axis of symmetry. Our results indicated that CCPs 15-25 encompass binding sites for C1q, which is a multi-subunit activator of the classical pathway. Our studies suggested CCPs 15-17 make a major contribution to C1q binding whereas previous studies ascribed more importance to long-homologous repeat D [32,36,37]; in our studies CCPs 26-28 of long-homologous repeat D were absent, so we did not make a direct comparison between these two possibilities.
Consistent with our structural findings, we demonstrated that Sl and McC-encoded sequence variations within long-homologous repeat D appear to have no effect on a range of CR1 properties resident in long-homologous repeats A-C including: C3b-binding and C4b-binding, vital for the long-recognized key biological role of human CR1 as the immune adherence receptor [45]; cofactor activity for factor I, needed for protection of erythrocyte surfaces from C3b proliferation and its potentially cytolytic consequences [46]; interactions with P. falciparum-derived erythrocyte-borne PfEMP1, required for rosetting [12,27]; and engagement by the P. falciparum adhesin PfRh4 for P. falciparum invasion of erythrocytes [11]. Our results additionally demonstrated that binding to CR1 of C1q occurs entirely independently of the Sl and McC-encoded variations. For rosette-disruption assays we chose parasite clone R29, on the grounds that it is the best-characterized CR1-sensitive P. falciparum rosetting strain [12,27]. In previous work, no differences were observed between strains in response to various agents such as the CR1 mAb J3B11 and recombinant CR1 15-17 [27]; on this basis it was decided not to initiate extensive testing of other strains for rosette disruption. Although our experiments were not conducted on variant versions of the full-length ectodomain, this is unlikely to have eroded their relevance in respect of the assays performed in this study.

Figure 5. CR1 15-25 variants bind equally well to C3b and C4b. Representative SPR-derived binding curves for the CR1 15-25 variants (as indicated in parentheses) flowed over C3b (upper four panels) or C4b (lower four panels) that were amine-coupled to CM5 chips (see also Figs. S1 and S2). In each case, a set of sensorgrams recorded for a range of CR1 15-25 concentrations (0.25 µM, 0.5 µM, 1 µM, 2.5 µM, 5 µM, 10 µM) is shown above the plot of response versus [CR1 15-25].
Interactions between long-homologous repeat D and long-homologous repeats A or B, but not C, would require an unfeasibly contorted architecture for the CR1 molecule. It also seems unlikely that CCPs 26-30 (absent in our truncation) are a requirement for a putative interaction between long-homologous repeats D and C. On the other hand, our focus on soluble truncations as opposed to membrane-bound variants does leave open a number of untested possibilities. In particular, although we observed no differences in self-associative properties of CR1 15-25, it is possible that a stalk-like long-homologous repeat D is important for mediating the clustering of CR1 molecules [47]. The relevance of CR1 clustering to merozoite invasion or rosetting has not been investigated but it has been reported to be important for cell deformability, motility of cells in microvasculature and hence the immune clearance role of CR1. It is also possible that Sl and McC-encoded sequence variations affect a putative interaction, of unknown physiological significance, with mannan binding lectin [37]. Moreover, CR1 is displayed on other cell-types than erythrocytes and may have other functions than those tested in this study. Finally, we used C4b from pooled plasma and therefore we did not take account of polymorphic variation in C4 that also exhibits a non-uniform distribution across geographical regions of origin.
African-derived populations are characterized by a slightly higher (0.12 versus 0.18) incidence of a larger size-variant of CR1 [1], and high copy numbers of CR1 on erythrocytes [48] as well as increased frequency of the Sl2 and McCb alleles [3]. The current study makes a correlation between the last of these sources of polymorphic variation and malaria seem less likely. Nor do the data support hypotheses based on differential abilities amongst these Knops blood group antigenic variants to protect erythrocytes against cytolytic complement attack. Our C3b- and C4b-binding data suggest that all the variants will be equally good at clearing immune complexes, although our studies of soluble fragments did not take into account the possible roles in this respect of CR1 clustering. Overall, a likely scenario is that the source of selective pressure on these variants in long-homologous repeat D arises from less well understood interactions between CR1 and pathogenic organisms. For example, elevated frequencies of McCb and Sl2 alleles have been correlated with resistance to Mycobacterium tuberculosis infections [49].

Protein production
Recombinant versions of truncated CR1 (the Caucasian, i.e. Lys1590, Arg1601, variant) were produced in a Pichia pastoris expression host using methods described previously. In brief, DNA coding for CCPs 21 (E1317-R1392), CCPs 21-22 (E1317-S1456), CCPs 20-23 (S1257-I1516) or CCPs 15-25 (T940-S1647) (numbering refers to unprocessed initial gene product) was PCR-amplified from CR1 cDNA then ligated into the TOPO plasmid vector (Invitrogen) that was used to transform Top10 E. coli cells (Invitrogen). Amplified and subsequently extracted (using the QIAprep Miniprep Kit, Qiagen) DNA was digested with PstI and XbaI and the insert was ligated into the pPICZαB vector (Invitrogen) downstream of the DNA for the Saccharomyces cerevisiae α-mating factor secretion signal. Plasmids were amplified in E. coli cells, linearised and used to transform P. pastoris KM71H cells for production of protein that is directed into the secretory pathway. Purification from media after cell removal was achieved by cation-exchange chromatography followed by a second ion-exchange chromatography step and gel-filtration chromatography. Proteins exhibiting N-linked glycosylation were treated after the first purification step with the endoglycosidase Endo Hf (New England Biolabs). Purification was monitored by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) performed under both reducing and non-reducing conditions. All of the recombinant truncations of CR1 carried the expression artifact EAEAAG at their N-termini.

Figure 6 (partial caption). In the initial C4b sample, degradation products at 25 kDa and 67 kDa are present as contaminants; the reaction was deliberately stopped prior to completion to allow a comparison to be made between the variants. See Figure S3 for ELISA results. doi:10.1371/journal.pone.0034820.g006
For isotopic enrichment, cells were grown in minimal media supplemented with ¹⁵N-ammonium sulfate and (for double labeling) ¹³C-glucose (during growth phase) and 13 ... to introduce the nucleotide substitutions needed to encode the Knops blood group antigenic variations (i.e. single substitutions K1590E and R1601G and double substitution K1590E/R1601G).

NMR spectroscopy
All NMR data were collected in 5-mm NMR tubes on an Avance800 NMR spectrometer (Bruker) fitted with a cryoprobe, processed using TOPSPIN (Bruker), then analyzed using the ANALYSIS module of the Collaborative Computing Project for NMR (CCPNMR) software package [50]. Two-dimensional ¹⁵N-¹H HSQC spectra and three-dimensional triple-resonance spectra (CACBNH and CACB(CO)NH) [51] were acquired at 37 °C on a 300 µM sample of the 13

Analytical Ultracentrifugation
Sedimentation velocity analysis (at 18 ± 0.5 °C) was performed on 0.4-ml protein samples (0.2-0.4 mg.ml⁻¹ in PBS, pH 7.5), on an XL-A analytical ultracentrifuge (Beckman). A series of radial scans of optical density (279 nm) across the centrifuge cell was collected (Proteome Lab software, Beckman-Coulter). An initial scan was performed immediately upon attaining the set rotor speed (45,000 rpm); 80 subsequent scans were recorded at two-minute intervals. Each resulting data set was analyzed (SEDFIT [52]), yielding values of the s-distribution parameter c(s) as a function of sedimentation coefficient (s). For non-linear fitting of optical density values versus radial distance in SEDFIT, the resolution was set to 150 and the confidence factor to 0.68. An average value for the frictional ratio (F) was determined over a series of fits and used as a default for all fittings; baseline, meniscus, and cell base radial positions were floated. The final profile, stored as a continuous distribution, was analyzed within the general software set for data analysis, "pro Fit" (QuantumSoft, Zurich). The maximal value of the c(s) function was used to define the sedimentation coefficient of the target molecule. A value of the partial specific volume (0.721 ± 0.001 ml.g⁻¹ for all four allotypic variants) was computed (using SEDNTERP [53]), which was also employed to compute density and viscosity properties of the buffer solution, enabling correction of raw s values to standard conditions of solvent viscosity and density (s20,w). The diffusion coefficient (an inverse function of F) of each species was estimated in the conventional manner from s, M and partial specific volume within a locally written utility for hydrodynamic calculations, Biomols 2.
From each of the F values determined it was straightforward to compute (using SEDNTERP), on the basis of an assumed 'typical' (for protein) value of solvation (1.4, vol/vol), an estimate for the overall 'shape' of the solute particle, modeled as a prolate ellipsoid of revolution.
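The hydrodynamic relations referred to in this section can be sketched as follows. This is a minimal illustration and not the SEDNTERP or Biomols 2 implementation: it assumes the standard Svedberg equation and the usual solvent correction to water at 20 °C, uses CGS units throughout, and the function names and the numerical inputs in the usage note are hypothetical rather than values from this study.

```python
import math

R_GAS = 8.314462618e7   # gas constant, erg mol^-1 K^-1 (CGS units)
N_A = 6.02214076e23     # Avogadro constant, mol^-1
ETA_W20 = 0.01002       # viscosity of water at 20 degC, poise
RHO_W20 = 0.99823       # density of water at 20 degC, g/ml


def s20w(s_obs, eta_buf, rho_buf, vbar):
    """Correct an observed s value to water at 20 degC (s20,w).

    eta_buf and rho_buf are the buffer viscosity (poise) and density (g/ml)
    at the experimental temperature; vbar is the partial specific volume (ml/g).
    """
    buoyancy = (1.0 - vbar * RHO_W20) / (1.0 - vbar * rho_buf)
    return s_obs * (eta_buf / ETA_W20) * buoyancy


def diffusion_coefficient(s, molar_mass, vbar, rho, temp_k):
    """Svedberg equation: D = s*R*T / (M*(1 - vbar*rho)).

    s is in seconds (1 S = 1e-13 s), molar_mass in g/mol; returns cm^2/s.
    """
    return s * R_GAS * temp_k / (molar_mass * (1.0 - vbar * rho))


def frictional_ratio(d_coeff, molar_mass, vbar, eta, temp_k):
    """f/f0 relative to an anhydrous sphere of the same mass and vbar."""
    f = R_GAS * temp_k / (N_A * d_coeff)                   # from D = RT/(N_A f)
    r0 = (3.0 * molar_mass * vbar / (4.0 * math.pi * N_A)) ** (1.0 / 3.0)
    f0 = 6.0 * math.pi * eta * r0                          # Stokes' law
    return f / f0
```

For example, a hypothetical 3.5 S species of 75 kDa with a partial specific volume of 0.721 ml.g⁻¹ would be passed as `diffusion_coefficient(3.5e-13, 75000.0, 0.721, RHO_W20, 293.15)`. Converting the resulting frictional ratio into an axial ratio additionally requires the assumed hydration and a prolate-ellipsoid shape function, which are not reproduced here.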

Small-angle X-ray scattering (SAXS)
Synchrotron radiation X-ray scattering data were collected on the X33 beam line of the EMBL (DESY, Hamburg) [54,55], using a MAR345 image plate detector (MarResearch, Norderstedt, Germany) and 120-s exposure times. Solutions of CR1 20-23 (1.9 and 3.3 mg.ml⁻¹) were measured (10 °C) in 50 mM potassium phosphate (KP) buffer, pH 7.4. The sample-to-detector distance (2.7 m) covered a range of momentum transfer 0.1 < s < 6.0 nm⁻¹. No radiation damage was observed during a second 120-s exposure. Data from the detector were normalized to the transmitted beam intensity and averaged, and the scattering of buffer solutions was subtracted. Difference curves were scaled for solute concentration. Data manipulations were performed using PRIMUS [56]. Forward scattering I(0) and radius of gyration (Rg) were determined from Guinier analysis [57], assuming that at very small angles (s < 1.3/Rg) the intensity is I(s) = I(0)exp(-(sRg)²/3). These parameters were also estimated from the full scattering curves using the indirect Fourier transform method implemented in GNOM [58], along with the distance-distribution function p(r) and maximum particle dimension Dmax. Molecular weight was estimated by comparing extrapolated forward scattering with that of a reference solution of bovine serum albumin. Due to uncertainty in MWt estimation resulting from uncertainty in measured protein concentrations, an excluded volume of solutes was determined from the ab initio modeling program DAMMIF [59]. For globular proteins, this hydrated particle volume in nm³ is approximately 1.5 to 2 times the MWt in kDa. Low-resolution shape envelopes for CR1 constructs were determined by ab initio bead-modeling in DAMMIF [59]. The results of multiple DAMMIF reconstructions were aligned using SUPCOMB [60] to determine the most representative model from each of the ab initio methods.
Averaged DAMMIF models were also determined using DAMAVER [56] and then adjusted, so that they agree with the experimentally determined excluded volume, using DAMFILT [61].
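As an illustration of the Guinier analysis and the forward-scattering molecular-weight estimate described above, the following is a minimal sketch and not the PRIMUS implementation. It assumes background-subtracted intensities sorted by increasing s, fits ln I(s) against s² by ordinary least squares, and iterates the fitting range until it satisfies s·Rg < 1.3. The function names, the starting window of ten points and the reference values are hypothetical.

```python
import math


def _linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def guinier_fit(s, intensity, limit=1.3, n_start=10, max_iter=50):
    """Estimate Rg and I(0) from ln I(s) = ln I(0) - (Rg^2 / 3) * s^2.

    The fitting window is re-chosen each round so that s * Rg < limit.
    Assumes s is ascending and the low-angle data decay with s (negative slope).
    """
    n = min(n_start, len(s))
    rg, i0 = float("nan"), float("nan")
    for _ in range(max_iter):
        slope, intercept = _linear_fit(
            [v * v for v in s[:n]], [math.log(i) for i in intensity[:n]]
        )
        rg = math.sqrt(-3.0 * slope)
        i0 = math.exp(intercept)
        n_new = max(3, sum(1 for v in s if v * rg < limit))
        if n_new == n:
            break
        n = n_new
    return rg, i0


def mw_from_forward_scattering(i0, conc, i0_ref, conc_ref, mw_ref):
    """Scale a reference protein's mass by concentration-normalized I(0)."""
    return mw_ref * (i0 / conc) / (i0_ref / conc_ref)
```

With s in nm⁻¹ the returned Rg is in nm. Because the mass estimate depends directly on the measured concentrations, it inherits their uncertainty, which is the reason an excluded-volume estimate was also used in the text.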

Surface plasmon resonance (SPR)
Duplicate SPR experiments were carried out (25 °C) on a Biacore T100 (GE Healthcare). C3b, C4b and C1q (Complement Technology, Texas) were immobilized using standard amine coupling, and dissociation constants measured as described [62]. Briefly, 1620 response units (RUs), 1518 RUs and 2125 RUs of C3b, C4b and C1q, respectively, were immobilized on three of four flow cells of a Biacore Series S carboxymethylated dextran (CM5) sensor chip. Experiments were performed by flowing over the chip, at 30 µl.minute⁻¹, analyte solution containing a range of concentrations of CR1 fragments that had been dialyzed into 10 mM Hepes-buffered 150 mM saline with 3 mM EDTA and 0.05% v/v polysorbate 20 (HBS-EP; pH 7.4). Regeneration was achieved by flowing 1.0 M NaCl over chip surfaces. Dissociation constants (KD values) were calculated (using Biacore T100 evaluation software v. 2.0) by fitting steady-state binding levels derived from background-subtracted traces to a one-to-one stoichiometry steady-state model. For blank subtractions, a reference surface was prepared by performing two consecutive dummy amine-coupling reactions in the absence of any proteins.
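The one-to-one steady-state model mentioned above has the form Req = Rmax·C/(KD + C). The following is a minimal sketch of such a fit, not the algorithm used by the Biacore evaluation software. It assumes background-subtracted steady-state responses, solves for Rmax in closed form at each trial KD, and searches KD on a logarithmic grid followed by a golden-section refinement. The function names and search bounds are hypothetical choices.

```python
import math


def _sse_and_rmax(kd, conc, resp):
    """Best-fit Rmax (linear least squares) and residual sum of squares at fixed KD."""
    x = [c / (kd + c) for c in conc]                 # fractional occupancy
    rmax = sum(r * xi for r, xi in zip(resp, x)) / sum(xi * xi for xi in x)
    sse = sum((r - rmax * xi) ** 2 for r, xi in zip(resp, x))
    return sse, rmax


def fit_steady_state(conc, resp, n_grid=200, n_refine=60):
    """Fit Req = Rmax * C / (KD + C); returns (KD, Rmax).

    KD is searched between 100-fold below the lowest and 100-fold above
    the highest analyte concentration (an assumed, adjustable range).
    """
    lo = math.log10(min(conc)) - 2.0
    hi = math.log10(max(conc)) + 2.0
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]

    def cost(log_kd):
        return _sse_and_rmax(10.0 ** log_kd, conc, resp)[0]

    best = min(range(n_grid), key=lambda i: cost(grid[i]))
    a = grid[max(best - 1, 0)]
    b = grid[min(best + 1, n_grid - 1)]

    golden = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(n_refine):                        # golden-section refinement
        c = b - golden * (b - a)
        d = a + golden * (b - a)
        if cost(c) < cost(d):
            b = d
        else:
            a = c

    kd = 10.0 ** ((a + b) / 2.0)
    return kd, _sse_and_rmax(kd, conc, resp)[1]
```

Concentrations are passed in molar units and responses in RU, so the returned KD is in molar units. A well-determined KD requires analyte concentrations that span values both below and above it.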

Parasite culture and growth inhibition assays
P. falciparum asexual stages were maintained in human O+ erythrocytes in RPMI-HEPES medium with 50 µg.ml⁻¹ hypoxanthine, 25 mM NaHCO3, 20 µg.ml⁻¹ gentamicin, and 0.5% w/v Albumax II (Gibco; Invitrogen) in 1% O2, 4% CO2, and 95% N2 at 37 °C and synchronized by standard methods. 3D7 is a cloned line derived from NF54 obtained from David Walliker, Edinburgh University. W2mef is a cloned line derived from Indochina III/CDC strain. W2mefΔRh4 was derived from W2mef by genetic disruption of the PfRh4 gene as described [63]. Growth inhibition assays (GIA) were performed as described, with the following modifications [64]. Neuraminidase (66.7 mU/ml)-treated or normal erythrocytes in culture medium were inoculated with late-trophozoite stage parasites to give a parasitemia of 1% and hematocrit of 1% in a volume of 50 µl. CR1 fragments or PBS were added at the beginning of the assay. Growth assays were performed in 96-well round-bottom microtiter plates (Becton Dickinson) over one cycle of parasite growth. After 48 hours, the parasitemia was determined by flow cytometry of ethidium bromide-stained trophozoite-stage parasites using a FACSCalibur (BD Biosciences) and a plate reader. For each well 40,000 cells or more were counted. Growth was expressed as mean parasitemia obtained from triplicate readings. Two to four independent assays were performed, each in triplicate. Growth (% of control) refers to the % parasitemia in the presence of CR1 constructs relative to the % parasitemia with the addition of PBS (arbitrarily set to be 100%).
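The normalization to the PBS control described above amounts to the following arithmetic. This is a small sketch with a hypothetical function name and made-up parasitemia values, shown only to make the definition of "Growth (% of control)" explicit.

```python
from statistics import mean


def growth_percent_of_control(treated_parasitemia, control_parasitemia):
    """Mean parasitemia with CR1 construct as a percentage of the PBS control mean.

    Both arguments are lists of replicate % parasitemia readings
    (triplicates in the assay described in the text).
    """
    return 100.0 * mean(treated_parasitemia) / mean(control_parasitemia)
```

For instance, hypothetical triplicates of 2.0% parasitemia with a construct against 4.0% with PBS would be reported as 50% of control growth.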

Rosetting assays
P. falciparum parasites (clone IT/R29) were cultured and selected for rosetting as described previously [39,65]. The red cells in which the parasites were grown for the rosetting experiments were all group O, and different donors were used for different replicate experiments. The red cells were used for parasite culture up to 10 days after drawing. Our unpublished data show that donor red cells form rosettes well up to 10 days, but rosetting starts to decline after that time. For rosette-disruption assays, parasite cultures were pre-stained with 25 µg.ml⁻¹ ethidium bromide then washed once and re-suspended at 2% hematocrit in bicarbonate-free RPMI (Sigma) containing 25 mM Hepes and 10% v/v normal human serum. Aliquots (25-50 µl) of pre-stained culture suspension in micro-centrifuge tubes were mixed with soluble recombinant CR1 proteins in KP buffer to give a final concentration of 20 µM. Mixtures were incubated at 37 °C for 30 minutes. During this incubation period, cells were re-suspended at ten-minute intervals by agitation. Subsequently a drop (~10 µl) of the pre-stained culture suspension was placed on a slide underneath a 22 × 22-mm cover slip and viewed with a fluorescence microscope. Infected erythrocytes (that fluoresce orange due to parasite DNA) could be discriminated from non-infected cells. A total of 200 infected red blood cells were counted and scored as either being in a rosette, or not, where a rosette is defined as "an infected cell with two or more uninfected red cells sticking to it". Only mature (that is, pigmented trophozoite or schizont) infected cells were counted. This allowed calculation of "rosette frequency" as the number of infected cells in rosettes expressed as a percentage of the total number of infected cells counted.
The rosette frequency in the presence of recombinant CR1 fragments was compared to that of negative controls with no added protein (an equivalent volume of PBS or RPMI binding medium or KP buffer was added instead of protein). Positive controls showed the rosette disruption obtained by 0.1 mg.ml⁻¹ of a monoclonal antibody to CR1 site 2 (J3B11) or 20 µM CR1 15-17 [27]. Rosette frequencies were compared by ANOVA using GraphPad Prism software (La Jolla, USA).
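The rosette-frequency calculation and the kind of comparison made between treatment groups can be sketched as follows. This is an illustrative outline only and not the GraphPad Prism procedure: it assumes a plain one-way ANOVA, returns the F statistic without a p-value, and the function names and replicate values in the usage note are hypothetical.

```python
def rosette_frequency(infected_in_rosettes, infected_counted=200):
    """Percentage of counted infected cells that are in rosettes."""
    return 100.0 * infected_in_rosettes / infected_counted


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over lists of replicate values.

    Returns (F, df_between, df_within). A p-value would require the
    F distribution, which is not implemented in this sketch.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

A count of 90 rosetting cells among 200 infected cells gives a rosette frequency of 45%. The replicate frequencies for each condition (buffer control, each CR1 variant, positive controls) would then be passed as separate lists to `one_way_anova_f`.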

ELISAs
For these experiments, C3b and C4b were purchased from Complement Technology (Tyler, Texas). To C3b-coated or C4b-coated micro-titer plates, solutions of 1 mg.ml⁻¹ sCR1 or CR1 15-25 variants (in 25 mM NaCl) were added. The primary antibody used for detection was rabbit anti-CR1, and the secondary antibody was a goat anti-rabbit IgG conjugated with horseradish peroxidase (HRP). Results of three separate experiments were averaged. For comparisons of C1q binding, sCR1 or CR1 15-25 variants were coated on micro-titer plates, and solutions of C1q (140 ng.ml⁻¹ in 75 mM NaCl) were added; results of three experiments were averaged. The primary antibody was rabbit anti-C1q and the secondary antibody was anti-rabbit IgG conjugated with HRP.

Figure S1 Particle size distribution according to dynamic light scattering. Shown are overlays of dynamic light scattering-derived particle size profiles for the indicated recombinant CR1 fragments (see key). Data were collected using a Zetasizer Nano S system (Malvern Instruments Ltd., UK) on samples of ~3.5 mg.ml⁻¹ protein in phosphate-buffered saline at 25 °C. (TIFF)

Figure S2 Binding of CR1 constructs to C3b by SPR. Sensorgrams (left) (for a concentration series, see Methods in main text) and response (response units (RU)) versus concentration (M) plots (right) for, from top to bottom, CR1 15-17 (site 2), CR1 1-3 (site 1), CR1 15-25 (ER), CR1 15-25 (KG) and sCR1. The fitted (see Methods in main text) KD values (shown in µM ± error, where the error applies to the last significant figure shown, e.g. for CR1 15-17, KD = 1.06 ± 0.01 µM) are displayed in the plots and also summarized in Table 2, main text. (TIFF)

Figure S3 Binding of CR1 constructs to C4b by SPR. As for Figure S2 except C4b replaced C3b. (TIFF)

Figure S4 Binding of CR1 constructs to C3b and C4b by ELISA. There are no significant differences between the CR1 15-25 variants in terms of their ability to bind to C3b and C4b according to an ELISA (see Methods in main text). (TIFF)

Figure S5 Binding of CR1 constructs to C1q by SPR. As for Figure S2 except C1q replaced C3b. (TIFF)