Structural Analysis of the C-Terminal Region (Modules 18–20) of Complement Regulator Factor H (FH)

Factor H (FH) is a soluble regulator of the human complement system affording protection to host tissues. It selectively inhibits amplification of C3b, the activation-specific fragment of the abundant complement component C3, in fluid phase and on self-surfaces and accelerates the decay of the alternative pathway C3 convertase, C3bBb. We have determined the crystal structure of the three carboxyl-terminal complement control protein (CCP) modules of FH (FH18–20) that bind to C3b, and which additionally recognize polyanionic markers specific to self-surfaces. These CCPs harbour nearly 30 disease-linked missense mutations. We have also deployed small-angle X-ray scattering (SAXS) to investigate FH18–20 flexibility in solution using FH18–20 and FH19–20 constructs. In the crystal lattice FH18–20 adopts a “J”-shape: a ∼122-degree tilt between the structurally highly similar modules 18 and 19 precedes an extended, linear arrangement of modules 19 and 20, as observed in previously determined structures of these two modules alone. However, under solution conditions FH18–20 adopts multiple conformations mediated by flexibility between CCPs 18 and 19. We also pinpoint the locations of disease-associated missense mutations on the module 18 surface and discuss our data in the context of the C3b:FH interaction.


Introduction
Complement factor H (FH), a 155-kDa soluble glycoprotein abundant in human plasma, is an important regulator of the complement system, the chief molecular component of innate immunity. FH is a member of the regulators of complement activation (RCA) family (reviewed in [1]) that are characterized by possession of repeating compact domains, known as complement control protein modules (CCPs), sushi domains or short consensus repeats [2,3]. In FH twenty CCPs are arranged in a tandem manner [4].
Human FH binds to, and regulates levels of, the first activation-specific cleavage product of complement component C3, C3b. Without regulation, C3b self-promulgates via formation of the complex, C3bBb, which proteolyses C3, producing more C3b. C3b can attach indiscriminately to surfaces via a thioester-containing domain (TED), and is an opsonin. It also triggers a proteolytic cascade terminating in self-assembly of cytolytic pores. FH is a cofactor for proteolysis of C3b to iC3b that no longer participates in the complement cascade but remains opsonic and is a ligand for receptors on B-cells and phagocytes. FH also competes with factor B for binding to C3b, and it accelerates the irreversible decay of C3bBb [5][6][7]. Furthermore, FH recognizes polyanionic markers, such as glycosaminoglycans (GAGs), that are common on self-surfaces but rare on pathogen surfaces [8][9][10]. This dual C3b and polyanion recognition allows FH to regulate complement activation effectively on self-surfaces but not on foreign ones [10].
In this study we describe the crystal structure of the three carboxyl-terminal CCP modules of human FH (CCPs 18-20) at a resolution of 1.8 Å, together with analysis of this region in solution by small-angle X-ray scattering (SAXS). This new structural information extends current knowledge based on the CCPs 19-20 structures and provides a more robust structural context for discussion of disease-linked mutations.

FH18-20 Crystallization and Data Collection
Crystals of FH18-20 were grown at 17 °C by vapor diffusion from hanging drops. Drops contained 1 µl of protein solution (16.8 mg/ml) in PBS with an equal volume of well solution (0.1 M sodium malonate, pH 4.0, 12% w/v polyethylene glycol 3350). Crystals grew within forty-eight hours and were flash frozen in liquid nitrogen after successive soaks in cryoprotectant solutions containing 10% and 25% v/v glycerol. Intensity data were collected (Q scans were 1° over 180°) to a resolution of 1.8 Å (the edge of the detector) on beamline I03 at the Diamond Light Source (Oxfordshire, UK). Data were indexed with Mosflm [38], and subsequently merged and scaled with SCALA [39].

FH18-20 Structure Determination
A previously elucidated structure of FH19-20 (PDB ID: 3OXU/chain F [18]) was used as a search model for molecular replacement using the program PHASER [40]. The resulting model underwent ten cycles of restrained refinement using the program REFMAC [41]. The remaining CCP module (FH18) was built using the PHENIX Autobuild Wizard [42] and the program COOT [43]. This model was subjected to further cycles of restrained refinement and, when appropriate, ligands and water molecules were added to the model using COOT. Disordered regions were carefully modeled into Fo−Fc electron density, and changes in R/Rfree (%) values were used to assess final model quality.
The final structure was composed of one FH18-20 molecule comprising 185 residues (Gly1045-Lys1230), 17 of which exhibit alternate conformations, 170 water molecules, four glycerol molecules and a phosphate ion. No clear electron density was observed for the first or last residues in the recombinant FH18-20 sequence (Ala1044 or Arg1231), while the Thr1184-Lys1188 region within CCP 20 was disordered; this region was modeled using the NMR-derived FH19-20 structure (PDB ID: 2BZM [37]). The R/Rfree values converged after twenty cycles of REFMAC at 18.2% and 22.6%, respectively. Data-reduction and refinement statistics are summarized in Table 1.
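As a reminder of what the R and Rfree percentages quoted above measure, the following minimal sketch computes the conventional crystallographic residual, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|. It assumes the amplitudes are already on a common scale; refinement programs such as REFMAC handle scaling and the work/free partition internally, so the function name and the toy amplitudes here are purely illustrative.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R value for paired amplitude lists.

    Assumes f_calc is already scaled to f_obs (an illustrative simplification).
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes, not data from this study.
work_obs, work_calc = [100.0, 200.0, 50.0], [90.0, 210.0, 55.0]
free_obs, free_calc = [120.0, 80.0], [100.0, 90.0]

r_work = r_factor(work_obs, work_calc)   # reflections used in refinement
r_free = r_factor(free_obs, free_calc)   # reflections held out of refinement
```

Rfree is computed with the same expression, but only over reflections excluded from refinement, which is why it is normally somewhat higher than R.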

Validation and Deposition of FH18-20 Structure
The geometry of the model was assessed using MolProbity [44]. Atomic coordinates and the experimental structure factors for our 1.8 Å structure of FH18-20 have been deposited in the Protein Data Bank (PDB ID: 3SW0).

Small-Angle X-ray Scattering
Synchrotron radiation X-ray scattering data were collected on the X33 beam line of the EMBL (DESY, Hamburg, Germany) using a Pilatus one-megapixel array detector (Dectris, Switzerland) and eight frames of 15-second exposure times. Solutions of FH18-20 and FH19-20 were measured at 20 °C in PBS, pH 7.4, 1 mM DL-dithiothreitol (DTT) at protein concentrations of 1.9, 3.8 and 7.6 mg/ml (for FH18-20) and 2.4, 4.6 and 7.9 mg/ml (for FH19-20). The sample-to-detector distance was 2.7 m, covering a range of momentum transfer 0.07 nm⁻¹ < s < 6.0 nm⁻¹ (s = 4π sin θ/λ, where 2θ is the scattering angle and λ = 0.15 nm is the X-ray wavelength). Reducing agents such as DTT serve as free radical scavengers and can significantly reduce radiation damage to biological samples [45]. Comparison of successive 15-second frames revealed no detectable radiation damage in the presence of DTT. However, in the absence of DTT, significant radiation damage occurred following the second frame. In our hands, 1 mM DTT is insufficient to reduce disulphide bonds within CCP-containing constructs (data not shown). Data from the detector were normalized to the transmitted beam intensity and averaged, and the scattering of buffer solutions was subtracted. Difference curves were scaled for solute concentration. All data manipulations were performed using the PRIMUS software package [46].
Fitting of the FH18-20 and FH19-20 (PDB ID: 3OXU [18]) crystal structures to the SAXS data was conducted using the program CRYSOL [47]. CRYSOL calculates the partial scattering amplitudes of proteins from their atomic coordinates, taking into account the hydration layer and excluded solvent volume. Low-resolution shape envelopes were determined from the solution scattering data using the program DAMMIF [48], and the most typical model from ten reconstructions was identified using DAMAVER [49]. Resulting bead models were converted to meshed envelopes and visualized using PYMOL (Version 1.3, Schrödinger, LLC). Superposition of available bead models on three-dimensional structures of FH18-20 or FH19-20, as appropriate, was carried out using the program SUPCOMB13 [50]. Rigid body modeling using the program CORAL (Complexes with Random Loops) was also conducted using the FH18-20 crystal structure, constraining either FH modules 18 and 19, or 19 and 20 as fixed, and refining the relative position and orientation of modules 20 or 18, respectively, against the SAXS data [51].
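The momentum-transfer range quoted for the SAXS measurements follows from the relation s = 4π sin θ/λ, with 2θ the scattering angle and λ = 0.15 nm. The short sketch below converts between s and the scattering angle; the function names are illustrative and not part of any SAXS package.

```python
import math

WAVELENGTH_NM = 0.15  # X-ray wavelength quoted in the text


def momentum_transfer(two_theta_deg, wavelength_nm=WAVELENGTH_NM):
    """Return s = 4*pi*sin(theta)/lambda in nm^-1 for a scattering angle 2*theta."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_nm


def scattering_angle_deg(s_inv_nm, wavelength_nm=WAVELENGTH_NM):
    """Invert the relation: return the full scattering angle 2*theta in degrees."""
    theta = math.asin(s_inv_nm * wavelength_nm / (4.0 * math.pi))
    return math.degrees(2.0 * theta)


# Angular limits corresponding to the quoted s range (0.07 to 6.0 nm^-1).
low_angle = scattering_angle_deg(0.07)
high_angle = scattering_angle_deg(6.0)
```

With these values the recorded range corresponds to scattering angles from roughly a tenth of a degree up to a little over eight degrees.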
Analysis of inter-domain flexibility in FH18-20 employed the ensemble optimization method (EOM) [52]. This uses a genetic algorithm to select, from a pool of randomly generated models, an ensemble of possible conformations whose combined theoretical scattering profiles best fit the experimental data. The CCP modules of FH18-20 were treated as rigid bodies and the linkers between them represented as flexible chains of dummy residues. For the pool, 10,000 models were generated from the input structures. A final ensemble of 20 conformations was selected by genetic algorithm after 50 cycles.
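The scoring idea behind ensemble selection can be illustrated with a toy example: the theoretical curve of an ensemble is taken as the average of its members' curves, and the ensemble whose averaged curve best matches the experiment is kept. The sketch below is not EOM; it replaces the genetic algorithm with an exhaustive search over a tiny hypothetical pool, omits the scaling factor, and uses invented intensities.

```python
import itertools
import math


def ensemble_profile(profiles):
    """Average the scattering curves of the ensemble members point by point."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def chi(i_exp, i_calc, sigma):
    """Discrepancy between curves (scale factor fixed at 1 for simplicity)."""
    total = sum(((e - c) / s) ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    return math.sqrt(total / (len(i_exp) - 1))


def best_ensemble(pool, i_exp, sigma, size):
    """Exhaustively pick the `size` pool members whose average fits best.

    A stand-in for genetic-algorithm selection; only feasible for tiny pools.
    """
    def score(indices):
        return chi(i_exp, ensemble_profile([pool[i] for i in indices]), sigma)

    return min(itertools.combinations(range(len(pool)), size), key=score)


# Hypothetical pool of four model curves sampled at three s values.
pool = [[10.0, 8.0, 6.0], [10.0, 7.0, 4.0], [10.0, 6.0, 3.0], [10.0, 4.0, 1.0]]
experimental = [10.0, 6.0, 3.5]   # equals the average of models 0 and 3
errors = [0.1, 0.1, 0.1]

selected = best_ensemble(pool, experimental, errors, size=2)
```

Here no single pool member reproduces the experimental curve exactly, but a mixture of an extended and a compact model does, which is the situation an ensemble analysis is designed to detect.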
The discrepancies (χ) between models/ensembles and the experimental data from CRYSOL, DAMMIF, CORAL and EOM are summarised in Table S1. This discrepancy is defined as:

$$\chi^2 = \frac{1}{N-1}\sum_{j=1}^{N}\left[\frac{I_{exp}(s_j) - c\,I_{calc}(s_j)}{\sigma(s_j)}\right]^2$$

where N is the number of experimental points, I_exp(s_j) and I_calc(s_j) are the experimental and calculated scattering intensities, c is a scaling factor and σ(s_j) is the experimental error at the momentum transfer s_j.
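For readers who wish to reproduce a discrepancy value from tabulated curves, the following sketch evaluates χ assuming the commonly used form χ² = (1/(N−1)) Σ [(I_exp(s_j) − c·I_calc(s_j))/σ(s_j)]². The least-squares choice of the scaling factor c is an assumption made here for illustration; the fitting programs named above may determine c differently.

```python
import math


def discrepancy(i_exp, i_calc, sigma, scale=None):
    """Return chi between experimental and calculated intensities.

    If `scale` is None, the factor c is chosen by weighted least squares
    (an illustrative assumption, not necessarily what a given program does).
    """
    n = len(i_exp)
    if n < 2 or len(i_calc) != n or len(sigma) != n:
        raise ValueError("need three lists of equal length with N >= 2")
    if scale is None:
        numerator = sum(e * c / s ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
        denominator = sum(c * c / s ** 2 for c, s in zip(i_calc, sigma))
        scale = numerator / denominator
    total = sum(((e - scale * c) / s) ** 2
                for e, c, s in zip(i_exp, i_calc, sigma))
    return math.sqrt(total / (n - 1))


# Hypothetical curves: each point deviates from the model by one sigma.
chi_example = discrepancy([11.0, 9.0, 11.0, 9.0],
                          [10.0, 10.0, 10.0, 10.0],
                          [1.0, 1.0, 1.0, 1.0],
                          scale=1.0)
```

A value near 1 indicates that the model agrees with the data to within the experimental errors, which is the sense in which the χ values reported in the Results are compared.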

Crystal Structure of FH18-20
FH18-20 crystals diffracted to a maximum resolution of 1.8 Å (see Table 1). FH18-20 data were indexed in the space group P22₁2₁, with one monomer in the asymmetric unit. The three CCP modules of FH18-20 form a 'J'-shaped structure (Figure 1A). CCPs 19 and 20 adopt an extended rod-like conformation in which CCP 20 is approximately aligned with CCP 19, consistent with previous structural studies carried out using wild-type or mutant forms of FH19-20 [17,18,22,37,53,54]; CCPs 19-20 of our FH18-20 structure (PDB ID: 3SW0) may be superimposed (residues 1109-1228, backbone atoms) with a root-mean-square (rms) deviation of 1.21 Å on a crystal structure of wild-type FH19-20 (PDB ID: 3OXU/chain F [18]). Both FH18 and FH19 are typical CCP modules with very similar structures (rms, alpha-carbon atoms, <1 Å) in line with their high sequence similarity, while CCP 20 exhibits great structural divergence (rms, alpha-carbon atoms, >2 Å to all but one other CCP module) (Table S2) [55]. While the long axes of CCPs 19 and 20 are approximately aligned, with only a ~32° tilt relative to one another, the long axis of CCP 18 is strongly tilted, by ~122° with respect to the long axis of CCP 19, and by ~151° with respect to CCP 20 (Table S3). This distinctive kink in the FH18-20 structure is facilitated by an extensive network of hydrogen bonds (Figure 1). Atomic distances consistent with hydrogen bond formation are observed between Gln1101 (CCP 18)-Gln1156 (CCP 19); Lys1103 (linker)-Asn1154 (CCP 19); and Lys1103 (linker)-Gln1156 (CCP 19). In addition, a single water molecule (B-factor = 25 Å²) forms three hydrogen bonds, one with the amide of Asp1104 (linker) and two with the backbone carbonyl oxygen atoms of Lys1108 (linker) and Gly1155 (CCP 19).
Further hydrogen bonds involving Asp1104 (linker)-Ser1105 (linker); Asp1104 (linker)-Gly1107 (linker); Thr1106 (linker)-Lys1108 (linker); and Arg1153 (CCP 19)-Gln1156 (CCP 19) (Figure 1B,C) also contribute to the kink between CCPs 18 and 19. This network of hydrogen bonds may also stabilize the observed CCP 18-CCP 19 inter-modular orientation under solution conditions; however, we cannot rule out the possibility that the distinctive kinked structure is induced by crystal contacts or, alternatively, by the low-pH conditions employed to crystallize the molecule. Furthermore, the surface area buried between CCPs 18 and 19 is only ~400 Å², compared to almost 700 Å² for CCPs 19 and 20 (Table S4).
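The inter-modular tilt values discussed above are angles between the long axes of two modules. As a minimal sketch, and assuming each axis is available as a three-component direction vector (how the axes were derived is not described in this section), the angle follows from the normalized dot product. The vectors below are hypothetical and chosen only to reproduce a 122° tilt.

```python
import math


def tilt_angle_deg(axis_a, axis_b):
    """Angle in degrees between two module long-axis direction vectors."""
    dot = sum(a * b for a, b in zip(axis_a, axis_b))
    norm_a = math.sqrt(sum(a * a for a in axis_a))
    norm_b = math.sqrt(sum(b * b for b in axis_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("axis vectors must be non-zero")
    # Clamp to [-1, 1] to guard against rounding just outside the acos domain.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))


# Hypothetical axes: the second is rotated 122 degrees from the first in the xy-plane.
axis_first = (1.0, 0.0, 0.0)
axis_second = (math.cos(math.radians(122.0)), math.sin(math.radians(122.0)), 0.0)
tilt = tilt_angle_deg(axis_first, axis_second)
```

On this convention, 0° corresponds to parallel long axes and 180° to anti-parallel ones.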

SAXS Analysis of FH18-20 and FH19-20
To investigate the conformation adopted by these three C-terminal CCP modules in solution, SAXS data were acquired on samples of both FH18-20 and FH19-20 (Table 2). For the double module, FH19-20, the overall parameters suggest that the sample is predominantly monomeric in solution (Table 2). The MW estimated from the forward scattering intensity, I(0), of the merged data extrapolated to infinite dilution is ~15 kDa and, along with estimates derived from the hydrated particle volumes and ab initio bead modelling, is consistent with monomeric FH19-20 in solution (Figure 2A). Fits of the crystal structure of this FH19-20 construct (PDB ID: 3OXU [18]) to the SAXS data using the program CRYSOL are shown in Figure 2. The structure of FH19-20 provides an excellent fit (χ = 1.4) to the merged scattering data extrapolated to infinite dilution (Figure 2B), supporting the extended structures previously solved by both X-ray crystallography and NMR for this fragment [17,18,22,37,53,54]. The presence of potential flexibility in the three-residue linker between CCPs 19 and 20 was investigated using an ensemble-optimization analysis conducted using the program EOM. The Rg distribution from this analysis is characteristic of an extended structure with a small degree of conformational flexibility relative to the pool of random conformers (Figure 2C).

Figure 1 (caption fragment; opening text missing). ... [17,18,21]. Residues contributing to the inter-domain packing between CCPs 18 and 19 are shown. B, Close-up of the kink that occurs between modules 18 and 19. The orientation of the FH18-20 molecule is the same as shown in 'A'. Electron density (2Fo−Fc map shown in grey, and contoured at 1.5σ) for residues contributing to the inter-modular packing is shown. Dashed lines represent hydrogen bonds between amino acid residues or between amino acid residues and water molecules. C, As for 'B' except the molecule is rotated about the y-axis by 180°. doi:10.1371/journal.pone.0032187.g001

Table 2 (footnote). Rg Guinier and Rg GNOM are the experimentally determined radius of gyration as calculated by Guinier analysis [62] and by indirect Fourier transform using the program GNOM [63], respectively; Dmax is the maximum particle dimension; I(0) is the forward scattering intensity; MW (SAXS) is the molecular weight determined by SAXS; Vol SAXS is the hydrated particle volume of solutes determined from the SAXS patterns; and Vol DAM is the excluded volume of solutes determined using the ab initio modeling program DAMMIF [48]. Data merged and extrapolated to infinite dilution are referred to in the table as "mer". doi:10.1371/journal.pone.0032187.t002
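The Guinier estimates of Rg and I(0) referred to for Table 2 rest on the low-angle approximation ln I(s) ≈ ln I(0) − (Rg²/3)·s². A minimal sketch of that analysis is a straight-line fit of ln I against s²; the code below uses an unweighted least-squares fit and synthetic data, both of which are simplifications relative to dedicated SAXS software.

```python
import math


def guinier_fit(s_values, intensities):
    """Fit ln I = ln I0 - (Rg^2 / 3) * s^2 and return (Rg, I0).

    Unweighted least squares; the caller is expected to pass only low-angle
    points (a common rule of thumb is s * Rg below about 1.3).
    """
    x = [s * s for s in s_values]
    y = [math.log(i) for i in intensities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0.0:
        raise ValueError("non-negative slope: not a valid Guinier region")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic, noise-free curve with Rg = 2.0 nm and I(0) = 100 (illustrative only).
s_points = [0.1, 0.2, 0.3, 0.4, 0.5]
curve = [100.0 * math.exp(-(2.0 ** 2) * s * s / 3.0) for s in s_points]
rg_fit, i0_fit = guinier_fit(s_points, curve)
```

The fitted intercept gives the forward scattering I(0), from which molecular weight estimates of the kind reported here are typically derived by comparison with a standard.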
For FH18-20, the MW estimate of ~22 kDa from the merged data extrapolated to infinite dilution agreed with that expected for a monomeric form of this three-module protein. Volume estimates were also consistent with a monomeric state, with volumes of 30 and 32 nm³ measured for the Vol SAXS and Vol DAM values, respectively (corresponding to estimated MWs of 19±5 kDa and 16±5 kDa). Interestingly, though, the experimentally derived scattering curves did not fit well to data back-calculated from the 'J'-shaped crystal structure (discrepancy χ = 2.4) (Figure 3); nor did ab initio low-resolution shape envelopes generated using the program DAMMIF demonstrate the same acute angle between CCPs 18 and 19 (Figure 3A) [48]. Furthermore, SAXS-based rigid body models of FH18-20 generated using the program CORAL, in which the position and orientation of CCP 18 were refined against the SAXS data, resulted in an average solution conformation which was more extended than the crystal structure (Figure 3A) and which better fit the scattering data (χ = 1.4) (Figure S1). By contrast, refinement of the position and orientation of CCP 20 yielded no improvement in the fit to the SAXS data compared to that of the crystal structure (χ = 2.1) (Figure S1). Overall, these SAXS results suggest FH18-20 has a more extended conformation in solution than that observed in the crystal lattice. The most straightforward explanation is that under the conditions used to collect the SAXS data, FH18-20 is more flexible than might be inferred from its crystal structure.
To investigate potential flexibility of the linker regions within CCPs 18-19 and CCPs 19-20, an analysis was carried out with EOM [52]. When all inter-module linkers were defined as flexible, the genetic algorithm selected an ensemble of conformations providing a superior fit (χ = 1.2) to the SAXS data, compared to that of the crystal structure (Figure 3B). The Rg distribution of the selected ensemble from this analysis is shifted toward extended structures while the width of the distribution is smaller than that of the pool (Figure 3C), suggesting partial flexibility. To investigate the location of potential flexibility and reduce the number of degrees of freedom of the EOM analysis, two additional runs were conducted. In these, the linker between CCPs 18 and 19 or between 19 and 20 was fixed as in the crystal structure, allowing either CCP 20 or CCP 18, respectively, (via the remaining non-fixed linker) to sample conformational space. The SAXS data were fit well by an ensemble in which the 19-20 linker was fixed (χ = 1.3) (Figure 3D), but were fit poorly by the ensemble in which the 18-19 linker was fixed (χ = 2.4). These data, which are entirely consistent with the experiments performed on FH19-20, suggest that the six-residue 18-19 linker, but not the three-residue 19-20 linker, is significantly flexible. The Rg distribution of the selected ensemble in which the 19-20 linker was fixed coincides well with that of the respective pool, being both broad (and thus considerably flexible) and skewed toward extended structures (Figure 3E).
In summary, these data are consistent with a solution of, on average, more extended FH18-20 conformations (compared to the crystal structure) with a rigid 19-20 linker and a highly flexible 18-19 linker. It is possible, therefore, that the kinked crystal-derived FH18-20 structure in which CCP 18 folds back towards CCP 19 reflects a snapshot of one of several conformations available to these three modules (Figure 4).

Discussion
We have extended the structural information available for the key soluble human complement regulator, FH. We elucidated the crystal structure of FH18-20, and have complemented this with the acquisition of solution SAXS data for FH19-20 and FH18-20. These C-terminal CCPs encompass the key self versus non-self discriminating region of this protein [10]. While several high-resolution structures of the C3b/GAG-contacting CCPs 19 and 20 were already available, CCP 18 is also of interest since it too is a site of disease-associated mutations (Figure 5) [31,35,36]. Moreover, the inter-modular angles between CCPs 18 and 19 are important because they determine the path of the carboxyl-terminal region of the FH molecule as it exits the C3b:FH complex before looping back to re-engage with the same C3b molecule via its amino-terminal CCPs [18].

Figure 2 (caption fragment; opening text missing). ... [18]). Shape envelopes were determined using the ab initio bead-modelling program DAMMIF [48] and superposition of the FH19-20 envelope on the corresponding crystal structure was carried out utilizing the program SUPCOMB13 [50]. B, Fit of the X-ray crystal structure of FH19-20 (solid black line) to the SAXS data extrapolated to infinite dilution (black open circles). The fit of the selected ensemble of conformations from EOM is also shown (solid red line). C, The Rg distribution from the ensemble analysis using EOM (pool in grey, selected ensemble in red). doi:10.1371/journal.pone.0032187.g002

FH18-20, in the crystal lattice, adopts a distinctive 'J'-shaped conformation. The short (three-residue) linker between CCPs 19 and 20 (i.e. between the last cysteine of CCP 19 and the first of CCP 20), along with numerous inter-modular contacts, imposes an approximately linear rod-like structure on this (GAG/C3b-binding) part of the molecule; the longer (six-residue) 18-19 linker permits a sharp kink to form in the molecule, also stabilized by inter-modular and module-to-linker interactions, but with a smaller inter-modular interface than observed between modules 19 and 20. Interestingly, a previously published SAXS-based model of FH15-19 also contained a kink between CCPs 18 and 19 [18]. On the other hand, our SAXS-derived analysis of FH18-20 and FH19-20 in solution revealed the presence of conformational mobility at the 18-19 inter-modular junction, but little flexibility between CCPs 19 and 20. These data are consistent with the low buried surface area observed between CCPs 18 and 19 (Table S4). Varying levels of inter-modular flexibility have been noted previously in FH and other RCAs [13,56,57,58].
Taking these data together, it appears that modules 19 and 20 are rigidly aligned such that their rod-like conformation is independent of the addition of CCP 18 or the presence of ligands. Modules 18 and 19, on the other hand, can adopt a bent-back conformation, supported by numerous non-covalent interactions, that would bring residues on the surface of module 18 in close proximity to C3b within a FH:C3b complex rather than projecting module 18 (and the preceding modules of FH) clear of the complex ( Figure 4); but this junction is much less rigid than that between modules 19 and 20. Occupation of a solitary N-glycan site on Asn1095 within CCP 18 ( Figure 5) would not preclude this conformational flexibility due to its remoteness from the CCP 18-CCP 19 linker.
The carboxyl-terminal region of FH is a hotspot for disease-associated mutations which have been linked to increased risk for the development of aHUS, early onset drusen (basal laminar drusen) and age-related macular degeneration (AMD) [31][32][33][34][35][36]. To date, at least twenty-eight such missense mutations have been documented in FH18-20, four of which occur in CCP 18: N1050Y [31]; V1060A [36]; Q1076E [32,34]; and R1078S [31]. All of these substitutions are located on the surface of CCP 18 (Figure 5A) and none is likely to result in significant structural perturbations; two of them (Q1076E and R1078S), however, alter the electrostatic potential of CCP 18 and are, additionally, in close proximity to an electronegative patch on this module (Figure 5B).
While direct binding to C3b occurs mainly through CCP 19 and the CCP 19-20 inter-modular junction, residues exposed on CCP 18 could nonetheless play a role in the encounter between the C terminus of FH and C3b and therefore modulate the ability of FH to control C3b amplification on host surfaces. In previous work, reversal-of-charge mutations in CCPs 19 and 20 were found to influence affinity of FH19-20 for C3d/C3b even when they did not lie directly in the interface between these two molecules as visualized in the crystal structure of the FH19-20:C3d complex [17,18,59,60]. Such observations were attributed to electrostatic steering. It has also been suggested that some disease-associated mutations in FH19-20 modulate self-association [54]. The wild-type FH18-20 protein is monomeric under the conditions used for SAXS. Mutagenesis combined with binding and biophysical studies would be needed to explore the hypothesis that residues in CCP 18 exert electrostatic steering effects, or that mutations in CCP 18 can influence self-association. Alternatively, interactions with other less well established FH ligands might be directly affected by mutations in CCP 18; the inflammatory biomarker C-reactive protein, for example, has been reported to bind to a FH16-20 construct [61].

Figure S1 Fit of FH18-20 rigid body models refined against the SAXS data using the program CORAL.

Table S2 (legend fragment; opening text missing). ... [22]. For each CCP, inclusive module boundaries were one residue before Cys-I and the third residue after Cys-IV. In cases where structures have been solved by both NMR and X-ray diffraction, the higher resolution X-ray structure was used for comparison. Where both liganded and unliganded structures were available, the highest resolution unliganded X-ray or NMR structure was used. A few residues were missing in the crystal structure of C1r CCP 2, and hence in this case, the structure with the most determined residues was employed for both modules.
Colour key used in table: Blue: 0-1.99 Å; Green: 2.00-2.99 Å; Red: 3.00-3.99 Å; Brown: Alignment lengths <40 amino acids. Abbreviations used in table: C4BPα = C4b-binding protein α-chain; CR = complement receptor; CRRY = rat complement receptor 1-related protein Y; DAF = decay-accelerating factor; FB = factor B; FH = factor H; MASP1/2 = mannan-binding lectin-associated serine proteases 1/2; MCP = membrane cofactor protein; VCP = Vaccinia virus complement control protein. Some residues were not present (solved) in the electron density map for the C1r CCP 2 module crystal structure, and this explains the short structural alignment length (shown in brown). (DOC)

Table S4 Accessible surface area and interface buried surface area calculation. The web server VADAR version 1.8 [1] was used and the surface area (SA) that was buried was calculated as: (SA Module 1 + SA Module 2) − SA Bi-module. All units are in Å². The linker length was defined as the number of residues between the C-terminal Cys of the preceding CCP module and the N-terminal Cys of the following CCP module. For CCP , boundaries were considered from one residue before the first Cys till three residues after the last Cys and for CCP 19 (in CCP 18-CCP 19 and CCP 19-CCP 20), boundaries were considered from three residues before the first Cys till one residue after the last Cys (and one residue before the first Cys till one residue after last Cys in CCP 19-CCP 20, and two residues before first Cys till one residue after last Cys in CCP 20). For bi-modules, boundaries were considered from one residue before the first Cys of Module 1 to one residue after the last Cys of Module 2. (DOC)

Figure 5. Location of disease-associated missense mutations within CCP 18. A, Shown are the alpha-carbons (red spheres) of residues for which missense mutations associated with aHUS or basal laminar drusen have been reported. Residue numbers are: 1050 (basal laminar drusen variant N1050Y) [31]; 1060 (aHUS-associated variant V1060A) [36]; 1076 (aHUS-associated variant Q1076E) [32,34]; and 1078 (basal laminar drusen-associated variant R1078S) [31]. Also indicated in magenta is the alpha-carbon of residue 1095, the Asn of the sole N-glycosylation consensus sequence located within FH18-20. B, Electrostatic surface representation of FH18-20. Positively and negatively charged areas are indicated in blue and red, respectively. Also shown as a red mesh is a negative isosurface map contoured at −2 kT/e. This figure was generated using the APBS plug-in for PyMOL [64]. doi:10.1371/journal.pone.0032187.g005
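The buried-surface-area expression given in the Table S4 legend, (SA Module 1 + SA Module 2) − SA Bi-module, can be written as a one-line calculation. The sketch below assumes the three accessible surface areas have already been obtained from a surface-area program; the input numbers are hypothetical and were chosen only so that the result matches the order of magnitude discussed in the text.

```python
def buried_surface_area(sa_module_1, sa_module_2, sa_bimodule):
    """Surface area (in square angstroms) buried when two modules are joined.

    Implements (SA Module 1 + SA Module 2) - SA Bi-module from the Table S4 legend.
    """
    buried = (sa_module_1 + sa_module_2) - sa_bimodule
    if buried < 0:
        raise ValueError("bi-module area exceeds the sum of the isolated modules")
    return buried


# Hypothetical accessible surface areas, not values from Table S4.
example_buried = buried_surface_area(4100.0, 3900.0, 7600.0)
```

Because each isolated module exposes the surface that is later covered by its neighbour, the difference counts the contribution from both sides of the interface.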