Inferring Epitopes of a Polymorphic Antigen Amidst Broadly Cross-Reactive Antibodies Using Protein Microarrays: A Study of OspC Proteins of Borrelia burgdorferi

Epitope mapping studies aim to identify the binding sites of antibody-antigen interactions to enhance the development of vaccines, diagnostics and immunotherapeutic compounds. However, mapping is a laborious process employing time- and resource-consuming ‘wet bench’ techniques or epitope prediction software that is still in its infancy. For polymorphic antigens, another challenge is characterizing cross-reactivity between epitopes, teasing out distinctions between broadly cross-reactive responses, limited cross-reactions among variants and the truly type-specific responses. A refined understanding of cross-reactive antibody binding could guide the selection of the most informative subsets of variants for diagnostics and multivalent subunit vaccines. We explored the antibody binding reactivity of sera from human patients and Peromyscus leucopus rodents infected with Borrelia burgdorferi to the polymorphic outer surface protein C (OspC), an attractive candidate antigen for vaccines and improved diagnostics for Lyme disease. We constructed a protein microarray displaying 23 natural variants of OspC and quantified the degree of cross-reactive antibody binding between all pairs of variants, using Pearson correlation calculated on the reactivity values using three independent transforms of the raw data: (1) logarithmic, (2) rank, and (3) binary indicators. We observed that the global amino acid sequence identity between OspC pairs was a poor predictor of cross-reactive antibody binding. Then we asked if specific regions of the protein would better explain the observed cross-reactive binding and performed in silico screening of the linear sequence and 3-dimensional structure of OspC. This analysis pointed to residues 179 through 188 of the fifth C-terminal helix of the structure as a major determinant of type-specific cross-reactive antibody binding.
We developed bioinformatics methods to systematically analyze the relationship between local sequence/structure variation and cross-reactive antibody binding patterns among variants of a polymorphic antigen, and this method can be applied to other polymorphic antigens for which immune response data is available for multiple variants.


Introduction
Exploitation of the specificity of antibodies' recognition of antigenic targets is the core of immunodiagnostic, immunotherapeutic and vaccine technologies. B-cell epitopes, which are recognized by antibodies or B-cells, can be divided into linear or conformational. For linear epitopes of polypeptides, the binding site is typically 10-15 contiguous residues on the antigen's molecule [1], whereas conformational epitopes may be formed by residues that are brought together on the 3-dimensional surface of the antigen. Epitopes may be unique or conserved amongst several antigenic targets. Epitope mapping studies aim to identify these binding sites so that antibody-antigen interactions of interest can be isolated to enhance the development of vaccines, diagnostics and immunotherapeutic compounds. However, the mapping of epitopes for antibodies is a time- and resource-consuming process, employing synthesis of overlapping peptides, controlled proteolysis, or genetic manipulations of the encoding sequence that yield amino acid substitutions, deletions, or polypeptide truncations. Another, potentially more rapid and cost-effective approach is the use of epitope prediction programs that utilize information derived from the primary amino acid sequence or its known or predicted secondary and tertiary structures [2][3][4].
A different challenge is cross-reactivity between epitopes, that is, those shared between two or more antigens, which otherwise can be distinguished by their type-specific epitopes. Meeting this challenge means teasing out the distinctions between broadly cross-reactive responses, limited cross-reactions among clusters of variants of the same protein, and the truly type-specific responses.
A more refined understanding of cross-reactive antibody binding between polymorphic antigens could guide the process of selecting the most informative subsets of variants for diagnostics and multivalent subunit vaccines. But is it possible to parse out the limited cross-reactivity from the broad cross-reactive responses? One suitable model system to explore these issues is the binding of antibodies to the highly polymorphic protein OspC of the Lyme disease (LD) agent Borrelia burgdorferi. OspC is a surface-exposed lipoprotein that elicits an immunodominant antibody response early in infection [5][6][7][8]. There are at least 25 types of OspC proteins represented in the U.S. as a whole, though the number of ospC genotypes prevalent in any given geographic area ranges between 10 and 15 [9]. After the conserved N-terminal signal peptide is cleaved, amino acid sequence identities for all pairs of known OspC types are between 63% and 90% [9,10]. In experimental animal infections, immunization with purified OspC provides protection against challenge [11][12][13][14][15][16], but usually only for the strain expressing the same OspC type [8,12,14,15,16,17,18].
Despite this evidence of OspC-type specific immunity and for type-specific epitope antibodies, a single OspC type in immunodiagnostic assay preparations has provided for reasonably good sensitivity [19][20][21]. This performance level is attributable to cross-reactivity in OspC proteins, especially when they are presented as isolated polypeptides on matrices such as blot membranes or microtiter plates [22,23]. However, the sensitivity of OspC-based assays could plausibly be improved by the inclusion of multiple OspC proteins, ones that more fully represent the diversity of types that at-risk humans are likely to encounter [21,24]. An equally desirable feature for an OspC-based immunodiagnostic assay would instead take advantage of strain-specific epitopes to discern the infecting strain of B. burgdorferi. This inference would be potentially useful for clinical management because B. burgdorferi strains, which are definable by their ospC genotypes [25], differ in their propensities to disseminate in the body, thus contributing to different disease manifestations in patients [26][27][28][29] and experimental models [30,31].
Our approach to this challenge began with development of a protein microarray displaying purified recombinant proteins of several naturally-occurring variants of OspC in North America. Microarrays have been used to probe immune responses to proteomes of several human pathogens [32][33][34][35] including B. burgdorferi [36]. We obtained a panel of sera from LD patients and exposed it to the OspC variants on the array, and the resulting experimental data was used to quantify the degree of cross-reactive antibody binding between all pairs of variants. The goal was to relate these data to the amino acid sequence variation between OspC pairs to identify the region of the protein molecule most likely responsible for the cross-reactivity observed. For this aim, we developed a systematic computational analysis of the relationship between the cross-reactivity data and variation in subsets of either linear sequences or predicted 3-dimensional structures. These data and analyses provide a comprehensive study of cross-reactivity of antibody binding to an immunodominant protein antigen.
Notations and abbreviations used throughout this article are detailed in Text S1.

Characterization of Sera from Patients with LD and from Controls
The 55 patient sera comprised 12 samples from early LD, 25 samples from patients with disseminated and late disease stages, and 18 samples from LD patients with persistent oligoarticular arthritis. All were seropositive by standard criteria for the diagnosis of LD by whole-cell ELISA and then confirmatory immunoblot. Sera from patients were significantly more reactive than sera from controls against each of several antigens. The mean (95% confidence intervals) for array binding in pixels per spot for patient sera and control sera were 11,748 (10,000 to 13,803) and 2,818 (1,995 to 3,981) for the B31 strain whole cell lysate; 1,905 (1,122 to 3,235) and 114 (91 to 147) for Decorin Binding Protein B (DbpB); 5,011 (3,467 to 7,244) and 416 (251 to 691) for the flagellin FlaB; and 2,818 (2,041 to 3,890) and 977 (776 to 1,230) for the VlsE outer membrane protein, respectively. Table S1 lists the results of the binding of antibodies of patients and controls to other B. burgdorferi antigens.
The protein microarray developed for this study displayed 23 different OspC variants including types K, A, B, N, and U, which were the most prevalent in nymphal ticks in the northeastern U.S. in a recent survey [25]. OspC types also included I, H, C and M, which are associated with more invasive infections [26][27][28]. Overall, sera from LD patients had significantly higher antibody binding to OspC proteins than naïve controls. The mean (95% confidence interval) binding intensity to all OspC spots was 1,406 (1,135 to 1,677) and 76 (66 to 87) for patient and control sera, respectively. Figure S1 summarizes the degree of antibody binding to each OspC protein by sera of LD patients or the control group. The raw quantitative output of pixel intensity of antibody binding to OspC proteins on the microarray by the sera sets used in this study is provided in Table S2, along with their respective log 10 , rank and binary transforms.

Cross-reactivity among OspC Variants
Each LD serum sample showed positive antibody binding to more than one OspC type present on the array. The correlation coefficient Pearson's r was used as an indicator of in vitro antibody cross-reactive binding between OspC proteins, and the r values calculated for each possible pairing populated the cross-reactive antibody binding correlation matrices (M D ) shown in Figure 1. Each heat map presents the matrix calculated using one of the three data transforms (log 10 , rank and binary); the respective numerical r values are available in Table S3.
The 20 most cross-reactive OspC pairs are shown in Table 1, ranked by the average r from the three matrices. For sera from patients with LD, the OspC pair A, I3 had the highest cross-reactivity value in all three matrices, followed by the pairs I, M; C3, M; C3, E3; H, I3 and C3, I. The remaining 247 OspC pairs had average r values <0.80, with a frequency count for the following ranges: 38 pairs with r values between 0.70-0.80, 94 between 0.60-0.70, 73 between 0.50-0.60, 33 between 0.40-0.50, and 9 between 0.30-0.40. The complete list of pairwise OspC cross-reactivity values for the sera sets studied is provided in Table S4. Randomization of the linkages between antibody binding and individual OspC proteins yielded r values with a mean near zero, an indication that the correlations found for the observed values reflect the range of antibody binding to OspC proteins resulting from the specificity of the immune response and did not arise by chance. Histograms of the correlations from the actual data matrices and the randomized matrices are presented in Figure 2.
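The randomization control described above can be illustrated with a short sketch. It is not the authors' implementation (their scripts are written in Perl); it assumes one possible reading of the randomization, namely that each OspC row of reactivity values is shuffled independently across sera before the pairwise correlations are recomputed. The function names `pearson` and `null_correlations`, the number of rounds, and the seed are hypothetical.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson's r between two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def null_correlations(rows, rounds=200, seed=0):
    """Pairwise r values after shuffling every OspC row across sera.

    rows maps an OspC type to its list of reactivity values (one per serum).
    Assumption: rows are shuffled independently, which breaks the link
    between a serum and its reactivity to each variant.
    """
    rng = random.Random(seed)
    null_r = []
    for _ in range(rounds):
        shuffled = {}
        for name, row in rows.items():
            copy = list(row)
            rng.shuffle(copy)
            shuffled[name] = copy
        for a, b in combinations(sorted(shuffled), 2):
            null_r.append(pearson(shuffled[a], shuffled[b]))
    return null_r
```

Comparing a histogram of the values returned by `null_correlations` with the observed pairwise r values would correspond to the comparison summarized in Figure 2.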

Global Sequence Identity and Cross-reactivity
OspC proteins have both conserved and variable regions of amino acid sequences amongst types. The multiple sequence alignment (MSA) of the 23 OspC proteins is presented in Figure S2. The alignments in the MSA were used to calculate the sequence identity for each OspC pair, and the resulting values were used to populate the global amino acid sequence identity matrix (M S-Global ). On average, OspC proteins shared 72% (68-76%) amino acid sequence identity for the processed protein, with identity ranging from 90% (OspC types F and I3) to 63% (OspC types E and L). The values for the complete M S-Global matrices are provided in Table S5.
The relationship between the global amino acid sequence identity (M S-Global ) and the cross-reactive antibody binding (M D ) between OspC pairs was calculated, with resulting correlation values for r(M S-Global , M D ) of 0.16, 0.07, and 0.07, for the log 10 , rank and binary transforms, respectively. This result was an indication that the degree of amino acid identity shared between two OspC proteins does not account for most of the observed cross-reactive antibody binding between them.
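As an illustration of how a quantity such as r(M S-Global , M D ) can be computed, the following minimal sketch treats each matrix as a mapping from an unordered OspC pair to a value and correlates the two sets of values over the shared pairs. The dictionary representation and the names `pearson` and `matrix_correlation` are assumptions made for this example, not the authors' Perl code.

```python
import math


def pearson(x, y):
    """Pearson's r between two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def matrix_correlation(m_s, m_d):
    """Correlate two pairwise matrices over the pairs present in both.

    m_s: {(type_a, type_b): sequence identity}
    m_d: {(type_a, type_b): cross-reactivity r}
    Each unordered pair is expected to appear once, with the same key
    orientation in both dictionaries (an assumption of this sketch).
    """
    pairs = sorted(set(m_s) & set(m_d))
    return pearson([m_s[p] for p in pairs], [m_d[p] for p in pairs])
```

With 23 variants each matrix contributes 253 pair values, so a correlation near zero, as reported above, indicates that the pair ordering by global identity carries little information about the pair ordering by cross-reactivity.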

OspC Terminal Regions and Antibody Cross-reactivity
This analysis was repeated focusing on the N- and C-terminal regions of the OspC molecule, by calculating the correlations between the cross-reactivity matrices and sequence identity matrix using only the terminal regions.

Local Sequence Windows and Cross-reactivity
The relationship between regions of more divergent sequence across OspC variants and cross-reactivity between pair members was calculated using local sequence identity matrices (M S-Local ) and the cross-reactivity matrices. The highest r(M S-Local , M D ) values resulted from a window size of 7 residues centered on position 182 in the fifth helix, and were 0.38, 0.39 and 0.30 for the log, rank and binary transforms, respectively. The heat maps summarizing these results are presented in Figure 4; the respective numerical values are available in Table S6.
The sequence window of 7 residues in the restricted MSA poly corresponded to 10 positions in the full MSA. This region spanned residues 179 through 188 of the OspC A index, including 7 polymorphic and 3 conserved positions (L183, K185, A187), and is located in the center of the fifth and last alpha helix, as highlighted in Figure S2. The distance between the Cβ atoms of residues 179 and 188 is 14.8 Å, as determined by UCSF Chimera [37].
Until now all calculations considered r values using all-versus-all OspC types. However, when the relationship between local sequence identity and antibody cross-reactivity was calculated for an individual OspC type versus all others, the correlation between cross-reactivity and positions 179 through 188 of the fifth helix was more evident. For instance, for OspC type A, the r values were 0.75, 0.75, and 0.73 for log, rank and binary transforms, and the corresponding values for OspC type D3 were 0.67, 0.65, and 0.60. The central position of the 7-residue window most correlated with antibody cross-reactivity is shown on the solvent-accessible surface model of the 3D structure constructed from the MSA presented in Figure 5. The heat maps in Figure S3 summarize the correlation results for individual residues, highlighting the highest r value for each OspC protein in white boxes; the corresponding source values are provided in Table S7.
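The sequence scanning idea in this section can be sketched as follows. The example slides a fixed-size window along the aligned sequences, computes the local identity of every OspC pair within the window, and correlates those identities with the pairwise cross-reactivity values. It is a simplified illustration under stated assumptions: the alignment is passed as equal-length strings (in the study the scan was run over the polymorphic columns, MSA poly ), and the names `pearson`, `local_identity` and `scan_windows` are hypothetical rather than taken from the authors' Perl scripts.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson's r between two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def local_identity(seq_a, seq_b, columns):
    """Fraction of the given alignment columns at which two sequences match."""
    columns = list(columns)
    return sum(1 for i in columns if seq_a[i] == seq_b[i]) / len(columns)


def scan_windows(msa, cross_reactivity, window=7):
    """Return (center_column, r) for every window position.

    msa: {type: aligned sequence}, all of equal length.
    cross_reactivity: {(type_a, type_b): r}, keyed by sorted type pairs.
    """
    pairs = list(combinations(sorted(msa), 2))
    d = [cross_reactivity[p] for p in pairs]
    length = len(next(iter(msa.values())))
    results = []
    for start in range(length - window + 1):
        cols = range(start, start + window)
        s = [local_identity(msa[a], msa[b], cols) for a, b in pairs]
        results.append((start + window // 2, pearson(s, d)))
    return results
```

The window with the largest r, for example `max(scan_windows(msa, m_d), key=lambda t: t[1])`, plays the role of the 7-residue window centered on position 182 reported above. A one-versus-all variant would restrict `pairs` to those containing a single OspC type.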

Local Structure Clusters and Cross-reactivity
To assess the relationship between cross-reactivity and a subset of residues in close proximity to one another in 3-dimensional space, a sequence identity matrix using only the residue cluster (M S-Local3D ) was generated and correlated with the averaged cross-reactivity matrices. The highest r(M S-Local3D , M D-avg3 ) value, 0.38, was found using a sphere with predicted diameter of 8 Å, which encompassed the polymorphic positions 56, 63, 180, 181, 182, 184, 186, and 188 of the OspC A index. Positions 56 and 63 are part of the first helix, while the remaining 6 positions are in the fifth helix. The fifth helix positions are the same as 6 of the 7 positions (the exception being position 179) that were identified by the sequence scanning approach as being most highly correlated with cross-reactivity. All correlation results using sphere sizes of 4 to 40 Å are available in Table S8.
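A minimal sketch of the structure scanning idea is given below. Each position is used in turn as the center of a spatial cluster, positions within a distance threshold are collected, pairwise identities are computed over the cluster, and the result is correlated with cross-reactivity. The coordinate dictionary, the use of one representative atom per position, and the names `residue_cluster`, `cluster_identity` and `scan_structure` are assumptions made for illustration; the authors' implementation is in Perl and may differ in detail.

```python
import math


def pearson(x, y):
    """Pearson's r between two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def residue_cluster(coords, center, threshold):
    """Positions whose representative atom lies closer than threshold (in Å)
    to the atom of the central position; the center itself is included."""
    origin = coords[center]
    return sorted(p for p, xyz in coords.items() if math.dist(xyz, origin) < threshold)


def cluster_identity(seq_a, seq_b, cluster):
    """Fraction of cluster positions at which two aligned sequences match."""
    return sum(1 for i in cluster if seq_a[i] == seq_b[i]) / len(cluster)


def scan_structure(msa, coords, cross_reactivity, thresholds):
    """Return (r, center, threshold, cluster) for the best-correlated cluster.

    coords: {alignment column: (x, y, z)} for the scanned positions.
    cross_reactivity: {(type_a, type_b): r}.
    """
    pairs = sorted(cross_reactivity)
    d = [cross_reactivity[p] for p in pairs]
    best = None
    for t in thresholds:
        for center in coords:
            cluster = residue_cluster(coords, center, t)
            s = [cluster_identity(msa[a], msa[b], cluster) for a, b in pairs]
            r = pearson(s, d)
            if best is None or r > best[0]:
                best = (r, center, t, cluster)
    return best
```

Calling `scan_structure` with `thresholds=range(4, 44, 4)` would mirror the 4 to 40 Å range reported in Table S8.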

Reactivity to OspC I3, a Chimera of Types F and A
A naturally occurring chimeric OspC protein provided an opportunity to directly evaluate the importance of the fifth helix for cross-reactivity. OspC I3 comprises helices 1, 2 and 3 of OspC F, and helices 4 and 5 of OspC A [38]. The alignment of the 3 proteins together with the locations of helices 2-5 is shown in Figure 6, panel A. Global amino acid sequence identity between OspC types I3 and F is 90%, while between A and I3 is 80%. Figure 6, panel B shows the pairwise identities according to the 3D structural model, with the sequence matches and mismatches for the OspC pairs indicated by green and red. Only 17 positions differ in the pair F, I3 and all but one occur in the fifth helix. In contrast, the pair A, I3 contains 36 mismatches and all of them are in that portion of OspC proximal to the fifth helix.
For two sets of sera examined, from patients with LD and from P. leucopus rodents experimentally infected with B. burgdorferi, the pair I3 and A had the highest ranking correlations, with averaged r values of 0.91 and 0.95, respectively, while the I3 and F pair ranked 118th and 190th, respectively, out of the 253 possible pairs (Table S4). In Figure 7, the binding of antibodies to the 3 proteins is compared against each other. For both sets of sera the highest coefficients of determination (R 2 ) were between I3 and A, further evidence of the immunodominance of the fifth helix over global sequence identity.

Discussion
We described here a computational protocol for analysis of the binding of antibodies to a diverse population of variants of an antigen presented in an array format. The set of proteins are homologous but diverse enough to feature both type-specific epitopes and cross-reactive epitopes. Accurately distinguishing cross-reactive epitopes from type-specific epitopes on the basis of amino acid sequence is a challenging problem. Our analytic approach automatically generates testable hypotheses regarding which specific sets of residues of the full-length protein comprise immunodominant linear or conformational epitopes. As a model system for development of the protocol, we used 23 variants of the polymorphic OspC surface-exposed protein of B. burgdorferi and asked whether cross-reactive antibody binding was influenced by the degree of global identity at amino acid level or by specific smaller regions of the protein. To this end, we performed an automated systematic analysis of the relationship between the variation among subsets of positions adjacent in sequence or 3D space and the experimentally observed antibody cross-reactivity produced by a set of sera from individuals with documented LD. We found that cross-reactivity between specific pairs of OspC proteins is determined by sequence identity at positions 179 through 188 of the C-terminal fifth alpha helix, rather than how much global identity is shared between the pair.
A limitation of the study was that the infecting strain (or strains) for patients with LD was not known. If the infecting type were known for each sample, then the quantitative measure of cross-reactivity between pairs of OspC variants could be calculated more directly using only samples infected with specific types. Additionally, the possibility that patients could be infected with more than one strain of B. burgdorferi could bias cross-reactivity results; however, multiple strain infections seem to be uncommon in humans [39]. In the context of unknown infecting type we use similarity of antibody binding patterns for OspC pairs over the entire set of samples as a proxy for the ideal quantitative measure. On the other hand, absence of knowledge of the infecting strain is by far the most common circumstance during medical management of LD at present and will likely be for the near-term future, until the means to identify infecting strains become feasible and widely adopted.
Another limitation of the study was the dependence on an assay that measures binding of antibody to purified protein on a matrix and not to an in situ protein at the surface of a living bacterium. Presumably not all antibodies directed against an OspC protein are equal in their effector functions, such as direct neutralization or opsonization. Moreover, as previous studies of strain-specificity of protective immunity have indicated [8,17,18], only a portion of the anti-OspC antibodies are likely to be functionally active in this regard. On the basis of the established utility of a single OspC protein for immunodiagnostic assays, the study's array-based assay might not have been expected to tease out subtle type-specific responses. Nevertheless, we showed this was possible in a previous study using this array and experimentally-infected rodents [40], and the differences in reactivity over a range of diverse OspC proteins observed in this study are evidence that even under the conditions where binding by antibodies of little or no functional consequence occurs, we could still detect type-specific binding. This suggests to us that the array-based assays are informative for questions of vaccine or diagnostic design even with a high background of cross-reactivity.
Thus despite the near ubiquitous reactivity to the conserved N-terminal first helix of the OspC protein [40], we determined that the fifth helix is also an immunodominant epitope, as several epitope mapping studies using other approaches have indicated [8,[41][42][43][44][45]. The independent validation of our results by traditional techniques adds merit to our procedure; however, our high-throughput approach is not a substitute for traditional experimental methods of epitope mapping, but it may be a valuable complement to these.
Although our study represents the broadest effort to determine the immunodominant regions responsible for cross-reactive antibody binding between variants of the OspC protein, the bioinformatics approach described here can be applied in the study of polymorphic immunodominant antigens in other human pathogens. For instance, Plasmodium falciparum antigens are promising targets for analysis due to the established links between antigen polymorphism and development of resistance only after exposure to many circulating strains, and the ongoing large-scale effort to investigate immune responses of individuals and populations suffering from malaria [46,47].
The two types of data that are necessary for performing the cross-reactivity analysis for a set of variants are: (1) quantitative measurements of antibody binding to each variant for multiple patient samples and (2) a multiple sequence alignment of the corresponding sequence variants. For structure scanning, a 3D structure model is also required. The methods for performing the systematic terminal region scanning, sequence scanning, and structure scanning are implemented in a suite of Perl scripts. These scripts, as well as sample input and output files from the OspC project, are publicly available at http://download.igb.uci.edu#ospc.

Sera
The 55 sera from adult patients with different stages of LD and the 25 sera from naïve adults were described in detail previously [36]. In brief, 27 patient and 5 control sera were provided by the Centers for Disease Control and Prevention, Fort Collins, CO, and 28 patient and 20 control sera were provided by Allen Steere, Harvard University. Sera from the 23 P. leucopus experimentally infected with B. burgdorferi isolates HB19, Sh-2-82, IDS, TBO2, WQR and 27577 and the 7 control sera were described in detail in [40]. Briefly, adult female pathogen-free, closed-colony outbred P. leucopus (LL stock; Peromyscus Genetic Stock Center, University of South Carolina) were inoculated intraperitoneally with fresh plasma from CB-17 SCID mice containing host-adapted B. burgdorferi cells. Animals were terminally exsanguinated 5 weeks post-infection. Samples were kept frozen at -80 °C until use.

Ethics Statement
Sera from human donors were originally collected for other studies for which informed consent had been obtained; patient identifier information had been removed. Rodent serum samples were obtained as described in [40], and the study was carried out in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. The protocol was approved by the Institutional Animal Care and Use Committee of the University of California Irvine (IACUC protocol 1999-2080).

Preparation, Construction and Probing of the OspC Protein Microarray
Amplification and cloning of ospC alleles. Table S9 provides the sources, geographic origins, accession numbers and references for the ospC alleles cloned. Table S10 lists the name and nucleotide sequence of primers utilized to amplify these genes. ospC ORFs coding for protein sequence without signal peptide were cloned into the pXT7 expression vector containing an amino-terminal 10X-Histidine fusion tag, using the in vivo recombination cloning method [35]. For details regarding PCR reactions and cloning methods, please refer to Text S1.
OspC protein expression and purification. BL21(DE3)pLysS E. coli cells transformed with pXT7-ospC plasmids were cultured in Terrific Broth (MP Biomedicals, Solon, OH) supplemented with kanamycin until reaching an OD600 of 0.4-0.6. Recombinant protein expression was induced with IPTG (RPI, Mt. Prospect, IL), and incubation was continued for an additional 4 hours. Cells were harvested and the supernatant containing His-tagged OspC fusion protein was incubated with Ni-coupled magnetic beads (MagneHis kit, Promega, Madison, WI) for protein purification. Recombinant protein purity was estimated to be 80-90% by densitometry of Coomassie Blue-stained protein bands on sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gels, and concentration was determined by BCA Protein Assay kit (Pierce, Rockford, IL). Purified OspC protein samples were aliquoted and stored at -80 °C until use. For details on recombinant protein purification, refer to Text S1.
OspC protein array printing. Purified recombinant OspC proteins were printed on nitrocellulose-coated glass FAST slides (Whatman, Piscataway, NJ) using an Omnigrid 100 apparatus (Digilab, Holliston, MA), in duplicate spots and at approximately 10 pg and 30 pg of protein per spot. Protein storage buffer alone was printed on the array to serve as a background signal control.
OspC protein microarray probing. Serum samples were diluted 1:200 (LD patient sera) or 1:100 (P. leucopus) in Protein Array Blocking (PAB) buffer (Whatman Inc, Sanford, ME) supplemented with 10% (vol/vol) DH5α E. coli lysate (MCLAB, San Francisco, CA). Incubation and washing procedures are described in Text S1. Cy3-conjugated secondary antibodies, goat anti-human IgG heavy and light chain or goat anti-Peromyscus leucopus IgG heavy and light chain (KPL, Gaithersburg, MD), were used to detect serum antibody binding to OspC proteins. Probed array slides were scanned in a Perkin Elmer ScanArray Express HT and output RGB TIFF files were quantitated using ProScanArray Express software (Perkin Elmer, Waltham, MA) with spot-specific background correction. The array data is deposited in NCBI's Gene Expression Omnibus [48] and is accessible through GEO Series accession number GSE45996 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE45996).

Protein Microarray Data Analysis
Primary analysis. Inclusion criterion for cross-reactivity analysis was a minimum reactivity corresponding to a z score of 2 to at least one OspC protein. For analysis of antibody binding to OspC proteins on the microarray the following steps were taken: (i) raw values of antibody binding measured as the mean pixel intensity of spots of printed protein were log 10 -transformed; raw values less than 1.0 were set to 0; (ii) the mean, standard deviation, 95% confidence intervals and z-scores of antibody binding intensity to each OspC type were calculated for the LD patient, P. leucopus and respective control sera groups.
Cross-reactivity analysis. For a given OspC type x the row of all individual raw reactivity values is denoted by D x (e.g., D A contains reactivity to OspC type A). For a given pair of OspC variants the Pearson's correlation r between the corresponding rows of serum reactivity values was used as the quantitative measure of cross-reactive antibody binding. The correlations were calculated using three forms of transformed data: log 10 (reactivity values were log 10 -transformed); rank (for each OspC type, ranks sera from lowest (1) to highest reactivity value (55 for human sera or 23 for P. leucopus sera)); and binary (values above the global median were set to 1 and values below were set to 0). For each of the transforms, all possible pairwise correlations between two OspC types were calculated and saved in the corresponding antibody cross-reactivity matrices: M D-log , M D-rank, M D-binary. Please refer to Abbreviations section in Text S1 for further explanation.
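The three transforms and the resulting cross-reactivity matrices can be sketched as follows. This is an illustrative Python rendering under stated assumptions, not the authors' Perl code: tied values receive their average rank, values equal to the global median are set to 0 in the binary transform, and the function names are hypothetical.

```python
import math
import statistics
from itertools import combinations


def pearson(x, y):
    """Pearson's r between two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def log10_transform(row):
    # Raw values below 1.0 are set to 0, as in the primary analysis.
    return [math.log10(v) if v >= 1.0 else 0.0 for v in row]


def rank_transform(row):
    # Rank sera from lowest (1) to highest; ties share the average rank (assumption).
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def binary_transform(row, global_median):
    # 1 above the global median, 0 otherwise (values equal to the median: assumption).
    return [1 if v > global_median else 0 for v in row]


def correlation_matrix(rows):
    """Pearson's r for every pair of OspC types, keyed by sorted type pair."""
    return {(a, b): pearson(rows[a], rows[b]) for a, b in combinations(sorted(rows), 2)}


def cross_reactivity_matrices(raw):
    """raw maps an OspC type x to its row D x of raw reactivity values."""
    median = statistics.median([v for row in raw.values() for v in row])
    return {
        "log": correlation_matrix({t: log10_transform(r) for t, r in raw.items()}),
        "rank": correlation_matrix({t: rank_transform(r) for t, r in raw.items()}),
        "binary": correlation_matrix({t: binary_transform(r, median) for t, r in raw.items()}),
    }
```

Averaging the three values for each pair would give a quantity analogous to the averaged r used for ranking pairs in Table 1.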

OspC Sequence and Structure Analysis
Structure-based multiple sequence alignment. A draft multiple sequence alignment (MSA) of the OspC proteins was assembled using PSI-BLAST [49] and then manually adjusted to accommodate insertions and deletions. The MSA comprises residues 31 to 206 of OspC A using the indexes of Kumaran et al. [50], denoted as OspC A Index. The modeled consensus sequence consisted of 183 residues; whereas the individual sequences ranged from 175 to 179 residues over the aligned positions, considering gaps. The pairwise alignments from the MSA were used for all global and local amino acid sequence identity calculations and the aligned gaps between OspC pairs were counted as identities.
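The pairwise identity calculation can be illustrated with a short sketch. It assumes that the aligned sequences are equal-length strings with `-` as the gap character, and it interprets "aligned gaps counted as identities" as meaning that a column in which both sequences have a gap counts as a match; that interpretation, and the names `pairwise_identity` and `identity_matrix`, are assumptions of this example rather than details confirmed by the text.

```python
from itertools import combinations


def pairwise_identity(seq_a, seq_b, positions=None):
    """Fraction of aligned columns at which two sequences carry the same symbol.

    A gap aligned with a gap compares equal and therefore counts as an
    identity (assumed reading of the rule stated above). `positions`
    optionally restricts the calculation to a subset of columns, as needed
    for the local identity matrices.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    cols = list(range(len(seq_a)) if positions is None else positions)
    matches = sum(1 for i in cols if seq_a[i] == seq_b[i])
    return matches / len(cols)


def identity_matrix(msa, positions=None):
    """Identity for every pair of variants, keyed by sorted type pair."""
    return {
        (a, b): pairwise_identity(msa[a], msa[b], positions)
        for a, b in combinations(sorted(msa), 2)
    }
```

Called without `positions`, `identity_matrix` yields a global matrix of the M S-Global kind; passing a window or a spatial cluster of columns yields the local variants.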
3D structure model and inter-residue distances. A 3-dimensional model of the MSA was constructed using Modeller 9.1 [51] with the structures of OspC A (pdb 1GGQ), OspC E (pdb 1G5Z) and OspC I (pdb 1F1M) [50,52] as templates, and the consensus sequence (ignoring gaps) as the target sequence to be modeled. Distances between Cβ atoms (Cα atoms for glycine) in the model were used to define inter-residue distances for determining spatial clusters of residues.
Structure scanning. Structural clusters of residues were defined using only the polymorphic positions in MSA poly . Each position was used as the central residue for defining a cluster of residues in 3-dimensional space where membership in the cluster was defined by proximity of less than 4 Å to the central residue. The correlations between the corresponding sequence identity matrix calculated using the residue cluster (M S-Local3D ) and the cross-reactivity matrices were calculated, using distance thresholds of 4 to 40 Å in 4 Å increments.

Supporting Information
Text S1 Detailed information on Materials and Methods.
(DOC)