Energetic Selection of Topology in Ferredoxins

Models of early protein evolution posit the existence of short peptides that bound metals and ions and served as transporters, membranes or catalysts. The Cys-X-X-Cys-X-X-Cys heptapeptide located within bacterial ferredoxins, enclosing an Fe4S4 metal center, is an attractive candidate for such an early peptide. Ferredoxins are ancient proteins and the simple α+β fold is found alone or as a domain in larger proteins throughout all three kingdoms of life. Previous analyses of the heptapeptide conformation in experimentally determined ferredoxin structures revealed a pervasive right-handed topology, despite the fact that the Fe4S4 cluster is achiral. Conformational enumeration of a model CGGCGGC heptapeptide bound to a cubane iron-sulfur cluster indicates both left-handed and right-handed folds could exist and have comparable stabilities. However, only the natural ferredoxin topology provides a significant network of backbone-to-cluster hydrogen bonds that would stabilize the metal-peptide complex. The optimal peptide configuration (alternating αL,αR) is that of an α-sheet, providing an additional mechanism where oligomerization could stabilize the peptide and facilitate iron-sulfur cluster binding.


Introduction
Metals in proteins play important roles in stabilizing structure, promoting electron transfer and performing catalysis. Whole-genome analyses of phylogenetically diverse microorganisms suggest the earliest proteins incorporated metals and that metal usage over biological history evolved to match the availability of inorganic components in the environment [1,2,3]. The mechanisms by which the ligand environment modulates metal affinity and specificity are of significant interest in the study of metalloprotein evolution, function and design. Geometric requirements of metal coordination are predicted to impose specific constraints on the structure and topology of a bound polypeptide chain. In this study, we computationally model the accessible conformations of a ferredoxin-like peptide bound to an Fe4S4 cubane cluster in order to better understand how a putative early metalloprotein may have evolved.
It has been proposed that a set of core genes encode proteins that carry out key redox reactions essential for promoting life and driving biogeochemical cycles [4]. These proteins would be among the earliest to emerge in the ancient oceans. Identifying members of this set of core genes is an important step in understanding the evolution of microbial metabolism and emergent biogeochemical cycles. A number of features of ferredoxins make them attractive candidates as key players in the evolution of redox active proteins. Sequence analysis suggests that ferredoxins evolved very early in the origins of biological catalysis of redox reactions [5,6]. All ferredoxins have a simple, conserved fold that binds two Fe4S4 clusters and is composed of fifty to sixty amino acids. Sequence and structural symmetry suggest the fold may have evolved by duplication of a gene encoding a thirty amino acid sequence, with each copy capable of binding one iron-sulfur cluster [7,8,9,10]. An early study of the ferredoxin sequence by Eck and Dayhoff in 1966 revealed even shorter repeats of four amino acids [5], suggesting a prebiotic "protoferredoxin" was potentially composed of a primeval subset of the twenty amino acids [11,12]. Midpoint potentials (−700 to −300 mV) of ferredoxins are lower than those of most other proteins, consistent with the mildly reducing early oceans [13,14].
It has been speculated that the iron-sulfur cluster utilized in many redox proteins [15] may be an evolutionary relic of prebiotic chemistry catalyzed by mineral surfaces. Mineral surfaces can effectively adsorb and concentrate organic molecules and catalyze various chemical reactions implicated in the origin of nonequilibrium redox reactions. Chiral mineral surfaces can selectively interact with chiral amino acids, and thus have been extensively studied as a potential origin of life on Earth [16]. Iron-sulfur mineral surfaces especially have gained much attention in the context of deep-sea iron-sulfur rich hydrothermal vents where the earliest biologically relevant redox reactions are postulated to have occurred [17,18].
Assuming ferredoxin is one of the select core genes that originated from a mineral surface catalyst, what might intermediates in this progression from mineral to protein look like (Figure 1)? (A) Iron-sulfur minerals such as pyrite and mackinawite can spontaneously catalyze carbon fixation to generate essential organic molecules for life [19,20,21,22]. (B) The regular mineral concentrates amino acids [23], permitting new chemistry or enhancing existing reactions. (C) Condensation of small polypeptides occurs at the water-mineral interface [24]. These polypeptides could have sequences similar to Dayhoff's proposed tetrapeptides [25] and would be capable of stabilizing specific oxidation states of bound iron-sulfur fragments. (D) Small polypeptides are used as components of ferredoxin-like proteins. This is the transition from prebiotic chemistry to life and could occur within the context of existing models for such a transition, such as an RNA world where peptides are co-opted by small RNA hairpins [26]. (E) Ferredoxin is retained in all kingdoms and becomes a domain of larger proteins that include many of the core redox genes of life. Although each of these stages is poorly understood and arguably controversial, this conceptual framework allows the design of specific simulations and experiments to explore the feasibility of ferredoxin evolution from a mineral precursor.
The structural properties of a putative proto-ferredoxin peptide in Stage C have implications beyond origins of life models to metalloprotein design. Although several iron-sulfur binding sites have been designed into existing proteins [27,28] and de novo folds [29,30,31], very few have shown any significant stability to cycles of oxidation-reduction, diminishing their utility in catalysis or bioenergy applications [32,33]. By elucidating the geometric and energetic constraints on a polypeptide bound to an iron-sulfur cluster, one can potentially understand the physical rules governing biological redox reactions and the design of novel protein structures.
In the ferredoxin fold, the iron-sulfur cluster has a quasi-tetrahedral structure with four coordination sites, which are most commonly occupied by four cysteine thiolates. The iron-sulfur cluster itself is achiral and the protein topology is mainly dependent on how the cysteine groups from a peptide chain are linked with the four iron atoms in the cluster [34]. Topologically, two different modes of protein-cluster interaction, right-handed or left-handed, are possible (Figure 2). These two topological states cannot be superimposed onto each other by bending or stretching the representative molecular graphs [34]. Previous studies analyzing iron-sulfur proteins in the Protein Data Bank (PDB) reported that all redox active proteins had a right-handed fold, although left-handed configurations existed for redox inactive proteins [35].
Herein, we examine why a right-handed heptapeptide topology may have evolved, in the context of metal-protein energetics.

Definition of handedness in protein folds
The achiral iron-sulfur (Fe4S4) cluster has D2d point group symmetry and is generally bonded to four cysteine thiolate groups [36,37]. Three of the coordination sites are occupied by cysteine thiolates from a conserved heptapeptide sequence motif (CXXCXXC) and the remaining fourth coordination site is occupied by an outlier cysteine, which is most frequently followed by a proline (CP) [38]. This particular binding motif accounts for approximately 25% (36 out of 137) of iron-sulfur binding motifs from 104 crystal structures available from the PDB (Table S1). Among the CXXCXXC motifs, about 85% (31 out of 36) have a ferredoxin fold and the remaining approximately 15% have globin-like or other folds as defined by the Structural Classification of Proteins (SCOP) [39]. Topologically, the CXXCXXC heptapeptide motif can interact with an iron-sulfur cluster in two different ways, right-handed or left-handed (Figure 2). For the discussion of these topological states, we quantitatively describe the handedness of the fold using a "topology angle", θ, aligning the outlier cysteine on the z-axis of an internal coordinate frame (Figure 3). Once the outlier cysteine is specified, handedness in this study is defined relative to the N- to C-terminus chain direction, either proceeding clockwise (right-handed: 0° < θ < 90°) or counterclockwise (left-handed: 90° < θ < 180°) around the cluster (Figure 4). The outlier cysteine residue can be located before or after the CXXCXXC motif (CP…CXXCXXC or CXXCXXC…CP).

Author Summary
The ferredoxin fold is one of the oldest structures capable of catalyzing electron transfer reactions. In nature, only a right-handed topology exists in the ferredoxin fold. To understand how a specific fold-handedness was selected, we analyzed the structural motif using the tools of de novo protein design, searching in an unbiased fashion for backbone geometries that can favorably interact with the tetrahedral iron-sulfur cluster. In silico, we found that both left-handed and right-handed folds can be formed. However, the right-handed folds provide up to six hydrogen bonds that can stabilize the reduced iron-sulfur cluster, whereas left-handed folds form at most three hydrogen bonds. The difference in electrostatic conformational energy may have influenced selection of topology early in the evolution of iron-sulfur cluster containing proteins. This observation led us to establish a fundamental protein design principle that only right-handed peptide folds can properly interact with the cluster while maintaining redox function. Our results provide guidance in the creation of artificial proteins capable of carrying out redox reactions.

Analysis of experimentally determined protein structures from PDB
Since the initial analysis of the protein structure database [35], the number of solved protein structures has increased at an exponential rate. A non-redundant subset (30% sequence similarity filter) of the PDB was searched for structures with an iron-sulfur (Fe4S4) cluster coordinated by a CXXCXXC sequence. The topology angle, θ, was calculated from the PDB coordinates (Figure 4). A histogram of the topology angles reveals that only right-handed folds are involved in iron-sulfur cluster binding (Figure 5). The CXXCXXC motif always has a topology angle around 75°.

Computational simulation with protCAD & AMBER
Left-handed configurations of CXXCXXC were not observed, leading us to examine whether such configurations were energetically plausible. An ensemble of CGGCGGC polypeptide configurations was generated. Glycine was chosen for non-Cys positions due to its high backbone flexibility, ensuring the primary conformational constraints came from metal-peptide interactions. The protCAD software platform (protein Computer Assisted Design) [29,40] was used to exhaustively enumerate all combinations of backbone and sidechain torsions in 60° intervals for φ,ψ and 120° intervals for the cysteine χ1 rotamer (Figure 6 and Figure 7). Out of 5.8 × 10¹⁰ (3³ × 6¹²) configurations, 232 exhibited net-favorable van der Waals interactions (less than 0 kcal/mol), Fe(cluster)···Sγ distances (<3 Å) and Cβ-Sγ···Fe(cluster) angles (120° to 180°) that would permit binding to an iron-sulfur cluster. The protein structures were then minimized in AMBER to reduce strain from distortions caused by discrete conformation sampling [41].
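The size of the search space quoted above follows directly from the sampling intervals. The short Python sketch below reproduces that arithmetic and shows one way such a torsion grid could be enumerated lazily. It is an illustration only, not the protCAD implementation: the grid origin at −180° and all function and variable names are assumptions made here.

```python
from itertools import product

PHI_PSI_STEP = 60   # degrees, sampling interval for backbone phi/psi (from the text)
CHI1_STEP = 120     # degrees, sampling interval for cysteine chi1 (from the text)
N_BACKBONE = 12     # six phi plus six psi torsions in the heptapeptide
N_CHI1 = 3          # one chi1 per cysteine


def torsion_grid(step):
    """Torsion values in degrees covering one full turn.

    Starting the grid at -180 is an assumption of this sketch.
    """
    return list(range(-180, 180, step))


phi_psi_values = torsion_grid(PHI_PSI_STEP)  # 6 values per backbone torsion
chi1_values = torsion_grid(CHI1_STEP)        # 3 values per chi1 torsion

# Total number of grid points: 3^3 * 6^12
total = len(chi1_values) ** N_CHI1 * len(phi_psi_values) ** N_BACKBONE


def enumerate_conformations():
    """Lazily yield every (phi/psi..., chi1...) combination as a 15-tuple."""
    axes = [phi_psi_values] * N_BACKBONE + [chi1_values] * N_CHI1
    return product(*axes)


first = next(enumerate_conformations())
print(f"{total:.3e} conformations; first grid point: {first}")
```

Because the grid holds tens of billions of points, a generator is used so that each conformation can be scored and discarded without storing the full ensemble.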
Topology angles of the computationally generated dataset clustered into two distinct populations (right-handed and left-handed folds), suggesting the CGGCGGC heptapeptide could bind to the iron-sulfur cluster with either topology (Figure 8). In fact, the simulation identified more left-handed structures (67%) than right-handed structures (32%), indicating left-handed topologies were entropically favorable. Conducting the same simulation on CAACAAC resulted in 54% left-handed and 46% right-handed structures, suggesting that the steric hindrance of amino acid side chains itself is not sufficient to discriminate the handedness of the topological state. A histogram of the energy distributions for left-handed and right-handed topologies shows no significant difference (Figure 9), indicating intrinsic stability of the fold alone is unlikely to account for evolution of a unique topology.

Hydrogen bonds in iron-sulfur proteins
The reduced state of the iron-sulfur cluster can be stabilized by hydrogen bonds contributed by nearby backbone amides [42]. The number of hydrogen bonds around the iron-sulfur cluster is also related to the solvent accessibility of the cluster, thereby tuning the midpoint potential [43,44]. A typical ferredoxin fold exhibits six such interactions with backbone amides directing the proton toward the cluster. Hydrogen bond formation is at the expense of unfavorable backbone dihedral angles, particularly the positive φ values at the X2 and X3 positions (Table S2).
For the analysis of the hydrogen bonding environment of computationally generated structures, interactions were counted based on discrete distance and angular cutoffs: a hydrogen-sulfur distance of less than 3.5 Å and an N-H···S angle between 120° and 180° [45]. The number of hydrogen bonds between nitrogen and sulfur was also counted using looser cutoffs of 3.8 Å and 110° to 180°. Right-handed folds could accommodate six hydrogen bonds, but a maximum of three hydrogen bonds was found in structures with left-handed folds (Figure 10).
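The counting rule described above can be sketched in a few lines of Python. This is a minimal illustration of a distance-plus-angle cutoff, not the code used in the study: the function names, the input layout (lists of coordinate tuples) and the choice to count every amide-sulfur pair that passes both cutoffs are assumptions of this sketch. The default cutoffs are the 3.5 Å and 120° values quoted in the text.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(x * x for x in v))


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1, v2 = _sub(a, b), _sub(c, b)
    cos_angle = sum(x * y for x, y in zip(v1, v2)) / (_norm(v1) * _norm(v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def count_nh_s_bonds(amides, sulfurs, max_h_s=3.5, min_angle=120.0):
    """Count N-H...S contacts that satisfy both cutoffs.

    amides  : list of (N_xyz, H_xyz) pairs for backbone amides
    sulfurs : list of S_xyz coordinates (cluster or cysteine sulfurs)
    """
    count = 0
    for n_xyz, h_xyz in amides:
        for s_xyz in sulfurs:
            close_enough = _norm(_sub(s_xyz, h_xyz)) < max_h_s
            well_aligned = angle_deg(n_xyz, h_xyz, s_xyz) >= min_angle
            if close_enough and well_aligned:
                count += 1
    return count


# Toy geometry: one amide and three sulfurs, only the first of which qualifies.
amides = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))]
sulfurs = [(3.5, 0.0, 0.0), (1.0, 3.0, 0.0), (10.0, 0.0, 0.0)]
n_bonds = count_nh_s_bonds(amides, sulfurs)
```

In the toy example the second sulfur is within range but sits at a 90° N-H···S angle, and the third is well aligned but too far away, so only one contact is counted.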

Energetics of peptide-cluster interactions
The electrostatic stabilization of a bound cluster by proximal backbone amides was estimated by comparing the total energies of charged versus uncharged clusters in the context of a coordinating peptide. The net contribution of hydrogen bonds can be represented in several ways: as the average of pairwise distances between hydrogen and sulfur atoms (Figure 11A) or as the discrete number of hydrogen bonds plotted against the peptide-cluster interaction energies (Figure 11B). The interaction energy improves as the distances between sulfur and hydrogen atoms are reduced. The result also indicates that the right-handed peptide-cluster interaction can have a stabilization effect of up to −80 kcal/mol, whereas a left-handed fold can only achieve −50 kcal/mol. For comparison, we generated a CGGCGGC peptide using coordinates from experimental ferredoxin structures, including proteins with a non-ferredoxin fold (Figure 11A inset, Supplementary data). The right-handed topology in natural ferredoxin and non-Fd proteins presents a network of stabilizing backbone amides that interact strongly with the Fe4S4 cluster. The result shows the best right-handed structure contributes more stabilizing hydrogen bonds than the best left-handed structure. Additionally, the inset to Figure 11A reveals tightly clustered experimental results, all of which cluster around the same right-handed configuration and present six hydrogen bonds, suggesting the right-handed heptapeptide topology is a unique entactic state.

Conclusion
A microscopic analysis of the Fe4S4 binding region of ferredoxin provides some insights into the predicted features of an ancient, short proto-ferredoxin. The right-handed topology observed in redox-active iron-sulfur proteins is not dictated by the peptide chain. In fact, left-handed chain topologies are entropically favored and have slightly improved stabilities in the absence of the cluster. Only when considering electrostatic interactions with the cofactor is the natural right-handed topology the optimal solution. Thus short CXXCXXC peptides alone are unlikely to serve as early redox active species without additional external stabilizing interactions. These may have taken the form of longer sequences with super-secondary structure such as those in designed peptide maquettes [38,46]. It is interesting to note that the model conformation with the best peptide-cluster interaction energy and the ferredoxin-like conformations are both an α-sheet, characterized by residues in alternating αL and αR conformations. This motif was first described by Pauling and Corey as the 'pleated sheet' [47]. α-sheets are thought to be intermediates in a number of protein aggregation disorders [48,49]. The conformation is also implicated in early peptides due to its anion binding properties [50]. It is possible that stabilization of α-sheets provides the entactic state required for favorable cluster binding. The identification of a specific iron-sulfur binding topology may point the way to a mechanism by which the first core metalloproteins evolved.

Topology angle
To obtain a quantitative measure of fold-handedness, a reference plane was defined by two vectors constructed from the Cα coordinates of the three cysteine residues. The topology angle was then defined as the angle between the normal vector of this plane and the vector from the middle cysteine Cα to the cluster. By definition, the topology angle can take any value from 0° to 180°.
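As an illustration of this definition, the following Python sketch computes a topology angle from four points. The text does not specify the order in which the two in-plane vectors are crossed, so the sign convention of the normal used here (first-Cys vector crossed with last-Cys vector, both originating at the middle Cα) is an assumption, as are the function names and the use of a single point to represent the cluster. The right-handed and left-handed ranges follow the 0° to 90° and 90° to 180° definition given in the main text.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def topology_angle(ca_first, ca_middle, ca_last, cluster_center):
    """Angle (degrees) between the Cys C-alpha plane normal and the
    vector from the middle Cys C-alpha to the cluster.

    The normal is v1 x v2 with v1, v2 pointing from the middle C-alpha to
    the first and last C-alpha; this ordering is an assumed convention.
    """
    v1 = _sub(ca_first, ca_middle)
    v2 = _sub(ca_last, ca_middle)
    normal = _cross(v1, v2)
    to_cluster = _sub(cluster_center, ca_middle)
    cos_theta = _dot(normal, to_cluster) / (
        math.sqrt(_dot(normal, normal)) * math.sqrt(_dot(to_cluster, to_cluster))
    )
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def handedness(theta):
    """Classify a topology angle using the ranges given in the text."""
    if theta < 90.0:
        return "right-handed"
    if theta > 90.0:
        return "left-handed"
    return "ambiguous"


# Toy coordinates: C-alpha atoms in the xy-plane, cluster above the plane.
theta = topology_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 0, 1))
label = handedness(theta)
```

With real structures, the four inputs would be the Cα coordinates of the three motif cysteines and a representative cluster position taken from the PDB file.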
protCAD
Iron-sulfur cluster coordinates were extracted from the PDB file 2FDN. We created a hybrid artificial amino acid residue (Clu) by linking an iron-sulfur cluster to a cysteine residue. The artificial amino acid was added to the amino acid library of protCAD. Initially a peptide ensemble (Cys-Gly-Gly-Cys-Gly-Gly-Cys) was created and subsequently the central Cys was substituted with Clu. For a given ensemble, there are six φ (C′-N-Cα-C′) and six ψ (N-Cα-C′-N) dihedral angles. The three cysteine residues contribute three χ1 (N-Cα-Cβ-Sγ) dihedral angles, one per residue. For the central iron-sulfur cluster fused cysteine residue, there are two additional dihedral angles, χ2 (Cα-Cβ-Sγ-Fe(Clu)) and χ3 (Cβ-Sγ-Fe(Clu)-S(Clu)). All φ and ψ dihedral angles were increased by a step size of 60° and all χ dihedral angles were set at −180°, −60°, or 60°. The entire protein structural space was searched by the permutations of seventeen dihedral angles. Plausible protein structures were then determined by geometric parameters, such as the distance from Sγ to Fe(Clu) with a cutoff (<3.0 Å). Energy parameters calculated based on a Lennard-Jones equation [45] were also used to detect feasible structures (total energy < 0 kcal/mol).

AMBER 11
The structures obtained from the protCAD simulations were subjected to energy minimization calculations using Amber 11 [51], with a generalized Born solvent model [52,53]. Protein atoms were described with the parm99SB [54,55,56] force field parameterization. The atomic charges were modified so that an oxidized [Fe4S4]2+ cluster bound to three Cys had a net charge of −1, yielding the following charges: qFe = 0.6518 e, qS (cluster) = −0.5552 e, qSG (cysteine) = −0.6042 e. The maximum number of minimization cycles was set to 10⁵, and the structures were considered minimized when the root-mean-square of the Cartesian elements of the gradient was less than 10⁻⁴ kcal/(mol·Å). To compare the degree of electrostatic stabilization of the cluster in the different peptide models, the charge of the S atoms of the Fe4S4 cluster was set to zero, and a single point energy calculation was performed. A number of structures converged to an identical structure after the energy minimization process. The redundant structures were then removed with the MMTSB (Multiscale Modeling Tools in Structural Biology) k-clustering algorithm [57].
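The convergence criterion stated above is the root-mean-square of all Cartesian gradient components. The Python sketch below shows how that quantity can be computed and compared against the 10⁻⁴ kcal/(mol·Å) threshold. It only illustrates the formula and is not Amber code: the function names and the representation of the gradient as a list of per-atom (gx, gy, gz) tuples are assumptions of this sketch.

```python
import math

RMS_GRADIENT_TOLERANCE = 1e-4  # kcal/(mol*Angstrom), threshold quoted in the text


def rms_gradient(gradient):
    """Root-mean-square over all Cartesian gradient components.

    gradient: list of per-atom (gx, gy, gz) tuples in kcal/(mol*Angstrom).
    """
    components = [g for atom in gradient for g in atom]
    return math.sqrt(sum(g * g for g in components) / len(components))


def is_minimized(gradient, tolerance=RMS_GRADIENT_TOLERANCE):
    """True when the RMS gradient falls below the tolerance."""
    return rms_gradient(gradient) < tolerance


# Toy gradients for a single atom.
nearly_flat = [(1e-5, 0.0, 0.0)]
still_strained = [(1e-3, 0.0, 0.0)]
```

For N atoms the RMS is taken over 3N components, so a single large component on one atom can keep a structure above the threshold even when the rest of the gradient is close to zero.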

Supporting Information
Table S1
List of structures collected from the Protein Data Bank (PDB). Structures containing a CXXCXXC binding motif were collected after filtering at 30% sequence similarity. The most common iron-sulfur cluster binding motif is CXXCXXC with two types of outlier positions: Type A (CXXCXXC…C) and Type B (C…CXXCXXC). (DOC)

Table S2
Alternating αL,αR secondary structure, also known as alpha-sheet, characterized by positive φ dihedral angles in the C-X1-X2-C-X3-X4-C motif (protein structures from the PDB). Alpha-left (αL) friendly amino acids (e.g. Asp, Asn, His, Lys) (1) are color coded pale blue, and residues that are unlikely to accommodate a positive φ dihedral angle are marked in orange. Glycine and cysteine are colored pale green and yellow, respectively. (DOC)