
Mechanical Strength of 17 134 Model Proteins and Cysteine Slipknots

  • Mateusz Sikora ,

    Contributed equally to this work with: Mateusz Sikora, Joanna I. Sułkowska, Marek Cieplak

    Affiliation: Institute of Physics, Polish Academy of Sciences, Warsaw, Poland

  • Joanna I. Sułkowska ,

    Contributed equally to this work with: Mateusz Sikora, Joanna I. Sułkowska, Marek Cieplak

    Affiliations: Institute of Physics, Polish Academy of Sciences, Warsaw, Poland, Center for Theoretical Biological Physics, University of California, San Diego, California, USA

  • Marek Cieplak

    Contributed equally to this work with: Mateusz Sikora, Joanna I. Sułkowska, Marek Cieplak

    mc@ifpan.edu.pl

    Affiliation: Institute of Physics, Polish Academy of Sciences, Warsaw, Poland

PLOS
  • Published: October 30, 2009
  • DOI: 10.1371/journal.pcbi.1000547

Abstract

A new theoretical survey of proteins' resistance to constant-speed stretching is performed for a set of 17 134 proteins as described by a structure-based model. The proteins selected have no gaps in their structure determination and consist of no more than 250 amino acids. Our previous studies have dealt with 7510 proteins of no more than 150 amino acids. The proteins are ranked according to the strength of the resistance. Most of the predicted top-strength proteins have not yet been studied experimentally. Architectures and folds which are likely to yield large forces are identified. New types of potent force clamps are discovered. They involve disulphide bridges and, in particular, cysteine slipknots. An effective energy parameter of the model is estimated by comparing the theoretical data on characteristic forces to the corresponding experimental values combined with an extrapolation of the theoretical data to the experimental pulling speeds. These studies provide guidance for future experiments on single-molecule manipulation and should lead to selection of proteins for applications. A new class of proteins, involving cysteine slipknots, is identified as one that is expected to lead to the strongest force clamps known. This class is characterized through molecular dynamics simulations.

Author Summary

The advances in nanotechnology have allowed for manipulation of single biomolecules and determination of their elastic properties. Titin was among the first proteins studied in this way. Its unravelling by stretching requires a 204 pN force. The resistance to stretching comes mostly from a localized region known as a force clamp. In titin, the force clamp is simple as it is formed by two parallel β-strands that are sheared on pulling. Studies of a set of under a hundred proteins accomplished in the last decade have revealed a variety of force clamps that lead to forces ranging from under 20 pN to about 500 pN. This set comprises only a tiny fraction of the known proteins. Thus one needs guidance as to what proteins should be considered for specific mechanical properties. Such guidance is provided here through simulations within simplified coarse-grained models on 17 134 proteins that are stretched at constant speed. We correlate their unravelling forces with two structure classification schemes. We identify proteins with large resistance to unravelling and characterize their force clamps. Quite a few top-strength proteins owe their sturdiness to a new type of force clamp: the cysteine slipknot, in which the force peak is due to dragging of a piece of the backbone through a closed ring formed by two other pieces of the backbone and two connecting disulphide bonds.

Introduction

Atomic force microscopy, optical tweezers, and other tools of nanotechnology have enabled induction and monitoring of large conformational changes in biomolecules. Such studies are performed to assess the structure of biomolecules, their elastic properties, and their ability to act as nanomachines in a cell. Stretching studies of proteins [1] are of particular current interest and they have been performed for fewer than a hundred systems. Interpretation of some of these experiments has been helped by all-atom simulations, such as those reported in refs. [2],[3]. These simulations are limited to time scales of order 100 ns and thus require unrealistically large constant pulling speeds. However, they often elucidate the nature of the force clamp – the region responsible for the largest force of resistance to pulling, F_max. All of the experimental and all-atom simulational studies address merely a tiny fraction of the proteins that are stored in the Protein Data Bank (PDB) [4]. Thus it appears worthwhile to consider a large set of proteins and determine their F_max within an approximate model that allows for fast and yet reasonably accurate calculations. Structure-based models of proteins, as pioneered by Go and his collaborators [5] and used in several implementations [6]–[13], seem to be suited to this task especially well since they are defined in terms of the native structures away from which stretching is imposed.

There are many ways, all phenomenological, to construct a structure-based model of a protein. 504 possible variants are enumerated and 62 are studied in detail in ref. [14]. The variants differ by the choice of effective potentials, nature of the local backbone stiffness, energy-related parameters, and of the coarse-grained degrees of freedom. The most crucial choice relates to making a decision about which interactions between amino acids count as native contacts. Comparing F_max to the corresponding experimental values in 36 available cases selects several optimal models [14]. Among them, there is one which is very simple and which describes a protein in terms of its Cα atoms, as labeled by the sequential index i. This model is denoted by which stands for, respectively, the Lennard-Jones native contact potentials, local backbone stiffness represented by harmonic terms that favor the native values of local chiralities, the contact map in which there are no i,i+2 contacts, and the amplitude of the Lennard-Jones potential, ε, is uniform. The contact map is determined by assigning the van der Waals spheres to the heavy atoms (enlarged by a factor to account for attraction) and by checking whether spheres belonging to different amino acids overlap in the native state [15],[16]. If they do, a contact is declared as native. Non-native contacts are considered repulsive. Application of this criterion frequently selects the i,i+2 contacts as native. If the contact map includes these contacts the resulting model will be denoted here as . On average, it performs worse than because the i,i+2 contacts usually correspond to the weak van der Waals couplings, as can be demonstrated in a sample of proteins by using software [17] which analyses atomic configurations from the chemical perspective on molecular bonds. Thus the i,i+2 couplings are better removed from the contact map (in most cases).
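The overlap criterion for native contacts described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `native_contacts`, the input layout, the default enlargement factor `scale`, and the sequence-separation cutoff `min_separation` are all assumptions made for illustration (the actual enlargement factor is not reproduced in this text).

```python
import numpy as np

def native_contacts(residues, scale=1.2, min_separation=3):
    """Return residue pairs whose enlarged atomic spheres overlap.

    residues: list of (coords, radii) per amino acid, where coords has
        shape (n_atoms, 3) and radii has shape (n_atoms,), both in angstroms.
    scale: hypothetical enlargement factor applied to the radii
        (placeholder value, not taken from the paper).
    min_separation: hypothetical cutoff; pairs closer than this along
        the sequence are skipped.
    """
    contacts = []
    for i in range(len(residues)):
        xi, ri = residues[i]
        for j in range(i + min_separation, len(residues)):
            xj, rj = residues[j]
            # All atom-atom distances between residue i and residue j.
            dist = np.linalg.norm(xi[:, None, :] - xj[None, :, :], axis=-1)
            # Two enlarged spheres overlap when the distance between their
            # centres is smaller than the sum of the enlarged radii.
            reach = scale * (ri[:, None] + rj[None, :])
            if np.any(dist < reach):
                contacts.append((i, j))
    return contacts
```

Each residue contributes all of its heavy atoms, so a single overlapping atom pair is enough to declare a contact between the two amino acids; in a coarse-grained model the contact is then assigned to the corresponding pair of beads.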

The survey to determine F_max in 7510 model proteins with the number of amino acids, N, not exceeding 150 and 239 longer proteins (with N up to 851) has been accomplished twice. First within the model [18] and soon afterwards within the model [19]. The first survey also comes with many details of the methodology whereas the second just presents the outcomes. The two surveys are compared in more detail in refs. [14],[20]. The results differ, particularly when it comes to ranking of the proteins according to the value of F_max, but they mutually provide the error bars on the findings. They both agree, however, on predicting that there are many proteins whose strength should be considerably larger than the frequently studied benchmark – the sarcomere protein titin (F_max of order 204 pN [21],[22]). Near the top of the list, there is the scaffoldin protein c7A (the PDB code 1aoh) which has been recently measured to have F_max of about 480 pN [23]. Other findings include establishing correlations with the CATH hierarchical classification scheme [24],[25], such as that there are no strong α proteins, and identification of several types of force clamps. The large forces most commonly originate in parallel β-strands that are sheared [26]. However, there are also clamps with antiparallel β-strands, unstructured strands, and other kinds.

The two surveys have been based on the structure download made on July 26, 2005 when the PDB comprised 29 385 entries. Many of them correspond to nucleic acids, complexes with nucleic acids and with other proteins, carbohydrates, or come with incomplete files; hence the much smaller number of proteins that could be used in the molecular dynamics studies. Here, we present results of still another survey which is based on a download of December 18, 2008 which contains 54 807 structure files and leads to 17 134 acceptable structures with N not exceeding 250 (instead of 150). These structures are then analyzed through simulations based on the model. The numerical code has been improved to allow for acceleration of calculations by a factor of 2.

The 190 structures (or 1.1% of all structures considered) with the top values of F_max in units of ε/Å are shown in Table 1 (the first 81 entries for which ) and Table S1 of the SI (proteins ranked 82 through 190), together with the values of titin (1tit) and ubiquitin (1ubq) to provide a scale. As argued in the Materials and Methods section, the unit of force, ε/Å, is now estimated to be of order 110 pN. All of the corresponding proteins are predicted to be much stronger than titin and none but two of them (1aoh, 1g1k [23]) have been studied experimentally yet. In addition to the types of force clamps identified before, we have discovered two new mechanisms of sturdiness. One of them involves a cysteine slipknot (CSK) and is found to be operational in all of the 13 top-strength proteins. In this motif, a slip-loop is pulled out of a cysteine knot-loop. Another involves dragging of a single fragment of the main chain across a cysteine knot-loop. The two mechanisms are similar in spirit since both involve dragging of the backbone. However, in the CSK case, two fragments of the backbone are participating.

Table 1. The predicted list of the strongest proteins.

doi:10.1371/journal.pcbi.1000547.t001

We make a more systematic identification of the CATH-classified architectures that are linked to mechanical strength and then analyze correlations of the data to the SCOP-based grouping (version 1.73) [27]–[29]. The previous surveys did not relate to the SCOP scheme.

We identify the CATH-based architectures and SCOP-based folds that are associated with the occurrence of a strong resistance to pulling. A general observation, however, is that each such group of structures may also include examples of proteins that unravel easily. The dynamics of a protein are very sensitive to mechanical details that are largely captured by the contact map and not just by the appearance of a structure. On the other hand, if one were to look for mechanically strong proteins then the architectures and folds identified by us should provide a good starting point. We also study the dependence of F_max on the pulling velocity and characterize the dependence on N through distributions of the forces.

The current third survey has been performed within the same model as the second survey [19]. However, we reuse and extend it here because the editors of Biophysical Journal retracted the second survey [30]. All of the values of F_max are deposited at the website www.ifpan.edu.pl/BSDB (Biomolecule Stretching Database) and can be accessed through the PDB structure code.

Results/Discussion

Distribution of Forces

The distribution of all values of F_max for the full set of proteins is shown in Figure 1. Despite the larger limit on N now allowed, the distribution is rather similar to that obtained in ref. [19] for the smaller number of proteins (and with the smaller sizes). The similarity is primarily due to the fact that the size-related effects, discussed below, are countered by new types of proteins that are now incorporated into the survey. The distribution is peaked around of which constitutes about 60% of the strength associated with titin. The distribution is non-Gaussian: it has a zero-force peak and a long high-force tail. The zero-force peak arises in some proteins with covalent disulphide bonds. In the model, such bonds are represented by strong harmonic bonds. Stretching of such a protein may not result in any force peak before a disulphide bond gets stretched indefinitely and hence F_max is considered to be vanishing then. The tail, on the other hand, corresponds to the strong proteins. The strongest 1.1% of all proteins are listed in Tables 1 (in the main text) and S1 (in the SI).

Figure 1. Probability distribution of the maximal forces obtained in the set of 17 134 model proteins (solid line).

The shaded histogram corresponds to the 7510 proteins studied in ref. [19]. The insets show similar distributions for the CATH-based classes indicated. The numbers underneath the class symbols give the size of the set of the proteins considered.

doi:10.1371/journal.pcbi.1000547.g001

The insets of Figure 1 show similar distributions for proteins belonging to the particular CATH-based classes. There are four such classes: α, β, α–β, and proteins with no apparent secondary structures. It is seen that none of the 3240 α proteins exceeds the peak force obtained for titin within our model. This observation is in agreement with experiments on several proteins that are listed in the Materials and Methods section. All strong proteins are seen to involve the . The peak in the probability distribution for the proteins is observed to be shifted towards the bigger values of F_max compared to the one for the proteins. At the same time, the high force tail of the distribution for the proteins is substantially more populated than the corresponding tail for the proteins.

Figure 2 is similar to Figure 1 in spirit, but now the structures are split into particular ranges of the protein sizes: between 40 and 100 (the dotted line), between 100 and 150 (thin solid line), and between 200 and 250 (the thick solid line). The curve for the range from 150 to 200 lies between the curves corresponding to the neighboring ranges and is not shown in order not to crowd the Figure. The distributions are seen to shift to the right as the range of the values of N increases, indicating that the bigger the number of amino acids, the more likely a protein is to have a large value of F_max. This observation holds for all classes of the proteins, as evidenced by the insets in Figure 2.

Figure 2. Similar to Figure 1 but for proteins belonging to specific ranges of the sequential sizes, as indicated by the symbols a, b, and c.

doi:10.1371/journal.pcbi.1000547.g002

In most cases, the major force peak arises at the beginning of stretching where the Go-like model should be applicable most adequately. One can characterize the location of F_max during the stretching process by a dimensionless parameter which is defined in terms of the end-to-end distance, as spelled out in the caption of Table 1. This parameter is equal to 0 in the native state and to 1 in the fully extended state. In 25% of the proteins studied in this survey, this parameter was less than 0.25 and in 52% it was less than 0.5. There are very few proteins for which it exceeds 0.8.
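The exact definition of the dimensionless location parameter is given in the caption of Table 1, which is not reproduced here. As a rough illustration only, the simplest form consistent with the stated end points (0 in the native state, 1 at full extension) is a linear rescaling of the end-to-end distance. The function name and arguments below are hypothetical, and the linear form itself is an assumption.

```python
def stretch_fraction(l_current, l_native, l_full):
    """Linearly rescale an end-to-end distance to the interval [0, 1].

    l_current: end-to-end distance at which the force peak occurs.
    l_native:  end-to-end distance in the native state.
    l_full:    end-to-end distance of the fully extended chain.
    All three are lengths in the same unit (for example angstroms).
    This linear form is an assumption, not the definition from the paper.
    """
    if l_full <= l_native:
        raise ValueError("fully extended length must exceed native length")
    return (l_current - l_native) / (l_full - l_native)
```

With this convention, a force peak occurring a quarter of the way between the native and the fully extended end-to-end distances gives a value of 0.25.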

Table 1 does not include any (non-cysteine-based) knotted proteins. The full list of 17 134 proteins contains 42 such proteins but they come with moderate values of F_max. However, knotted proteins with larger N may turn out to have different properties.

Biological properties of the strongest proteins

A convenient way to learn about the biological properties of the proteins listed in Tables 1 and S1 is through the Gene Ontology data base [31] which links such properties with the PDB structure codes. The properties are divided into three domains. The first of these is “molecular function” which describes a molecular function of a gene product. The second is “biological processes” and it covers sets of molecular events that have well-defined initial and final stages. The third is “cellular component” and it specifies a place where a given gene product is most likely to act.

Our findings are summarised in Table 2. It can be seen that most of the 190 strongest proteins are likely to be found in an extracellular space where conditions are much more oxidizing than within cells. Larger mechanical stability is advantageous under such conditions. 90 out of the strongest proteins exhibit hydrolase activity. 39 of these 90 are serine-type endopeptidases. These findings seem to be consistent with expectations regarding proteins endowed with high mechanical stability. For instance, proteases, which are well represented in Table 2, should be more stable to prevent self-cleavage.

Table 2. Gene Ontology terms for the top 190 proteins.

doi:10.1371/journal.pcbi.1000547.t002

CATH-based architectures

The classification of proteins within the CATH (Class, Architecture, Topology, Homology) data base is done semi-automatically by applying numerical algorithms to structures that are resolved to better than 4 Å [24],[25]. The four classes of proteins in the CATH system are split into architectures, depending on the overall spatial arrangement of the secondary structures, the numbers of in various motifs, and the like. The next finer step in this hierarchical scheme is into topologies and it involves counting contacts between amino acids which are sequentially separated by more than a threshold. The further divisions into homologous superfamilies and then sequence family levels involve studies of the sequence identity.

We have found that only six architectures contribute to F_max larger than . These are ribbons – 2.10 (41.8% of the proteins listed in Table 1), β-barrels – 2.40 (8.9%), β-sandwiches – 2.60 (16.3%), α–β rolls – 3.10 (5.4%), 3-layer (aba) sandwiches – 3.40 (5.4%), and those with no CATH classification to date (21.8%). The corresponding distributions of forces are shown in the top six panels of Figure 3 and the topologies involved are listed and named in Table 3.

Figure 3. The top six panels show probability distributions of F_max for the architectures that contribute to the pool of proteins with large forces.

The architectures are indicated by their names and the accompanying CATH numerical symbol. The numbers underneath the symbols of the architecture give the number of cases contributing to the distribution. The bottom two panels show examples of architectures that are predicted to yield only small values of F_max.

doi:10.1371/journal.pcbi.1000547.g003

Table 3. CATH classes (C), architectures (A), and topologies (T) contributing to the top strength proteins.

doi:10.1371/journal.pcbi.1000547.t003

Examples of architectures that are dominant contributors to a low force behavior are the orthogonal bundle (the right bottom panel of Figure 3), the up-down bundle, and the (the left bottom panel of Figure 3).

SCOP-based classes and folds

The SCOP (Structural Classification of Proteins) data base [27]–[29] is curated manually and it relies on making comparisons to other structures through a visual inspection. This classification scheme is also hierarchical and the broadest division is into seven classes and three quasi-classes. The classes are labelled a through g and these are as follows: mainly α (a), mainly β (b), α/β which groups proteins in which helices and β-strands are interlaced (c), α+β with the helices and β-strands grouped into clusters that are separated spatially (d), multidomain proteins (e), membrane and cell-surface proteins (f), and small proteins that are dominated by disulphide bridges or the heme metal ligands (g). The quasi-classes are labelled h through j and they comprise coiled-coil proteins (h), structures with low resolution (i), and peptides and short fragments (j). The classes are then partitioned into folds that share spatial arrangement of secondary structures and the nature of their topological interlinking. Folds are then divided into superfamilies (same fold but small sequence identity) and then families (two proteins are said to belong to the same family if their sequence identity is at least 30%). Families are then divided into proteins – a category that groups similar structures that are linked to a similar function. Proteins comprise various protein species.

Each structure assignment comes with an alphanumeric label, as shown in Tables 1, S1, and 4, which reflects the placement in the hierarchy. At the time of our download, there were 92 972 entries in the SCOP data base that are assigned to 34 495 PDB structures. These entries are divided into 3464 families, 1777 superfamilies and 1086 unique folds. A given structure may have several entry labels but the dominant assignment is listed first. We use the primary assignment in our studies. The same rule is also applied to the CATH-based codes.
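To make the hierarchical labels concrete, the following Python sketch splits a dotted label such as b.47.1.2 (SCOP) or 2.60.120.200 (CATH) into its cumulative levels. The function and constant names are hypothetical, and the sketch assumes that the first four dot-separated fields correspond to the levels named in the text.

```python
from typing import Dict, Sequence

# Level names follow the hierarchy described in the text; treating the first
# four dot-separated fields as these levels is an assumption of this sketch.
SCOP_LEVELS = ("class", "fold", "superfamily", "family")
CATH_LEVELS = ("class", "architecture", "topology", "homologous_superfamily")

def parse_label(label: str, levels: Sequence[str]) -> Dict[str, str]:
    """Map each hierarchy level to the label prefix that identifies it."""
    parts = label.strip().split(".")
    if not all(parts) or len(parts) > len(levels):
        raise ValueError(f"unexpected label: {label!r}")
    # Level k is identified by the first k+1 fields joined by dots.
    return {level: ".".join(parts[: k + 1]) for k, level in enumerate(levels[: len(parts)])}

scop = parse_label("b.47.1.2", SCOP_LEVELS)
cath = parse_label("2.60.120.200", CATH_LEVELS)
```

Grouping structures by `scop["fold"]` or `cath["architecture"]` is then enough to build per-fold or per-architecture force distributions of the kind discussed in this section. Shorter labels, such as the topology-level code 3.10.10, simply yield fewer levels.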

Table 4. SCOP classes (C) and folds (F) contributing to the top strength proteins.

doi:10.1371/journal.pcbi.1000547.t004

Figure 4 shows the distributions of forces for the SCOP-based classes of proteins. The results are consistent with the CATH-based classes since the α–β class of CATH basically encompasses the α/β and α+β classes of SCOP. However, there are proteins which are classified only according to one of the two schemes. Thus there are 4431 α–β proteins out of which only a total of 3368 are SCOP-classified as belonging to the α/β and α+β classes. At the same time, the total number of proteins in the α/β and α+β classes that we have is 4795.

Figure 4. Distributions of F_max for the SCOP-based classes for which there are more than 60 structures that could be used in molecular dynamics studies.

The cases that are not shown are: class e (27 structures), quasi-class i (5 structures), and quasi-class j (52 structures). The bottom right panel corresponds to structures which have no assigned SCOP-based structure label. The numbers indicate the corresponding numbers of structures studied.

doi:10.1371/journal.pcbi.1000547.g004

It should be noted that the peak in the distribution for is shifted to higher forces by about from the peak for . At the same time, the zero-force peak is virtually absent in . The SCOP-based classification also reveals that its class contributes across the full range of forces and, in particular, it may lead to large values of F_max. It should be noted, as also evidenced by Table 1, that there is a substantial number of strong proteins that have no class assignment.

Figures 5 and 6 refer to the distributions of F_max across specific folds. The first of these presents results for the folds that give rise to the largest forces. The names of such folds are specified in Figure 5. The percentage-wise assessment of the folds contributing to big forces is presented in Table 4. The top contributor is found to be the b.47 fold (trypsin-like serine proteases). Figure 6 gives examples of folds that typically yield low forces.

Figure 5. Distributions of F_max for eight folds that may give rise to a large resistance to pulling.

doi:10.1371/journal.pcbi.1000547.g005

Figure 6. Distribution of F_max for eight folds that are likely to yield a small resistance to pulling.

doi:10.1371/journal.pcbi.1000547.g006

It is interesting to note that distributions corresponding to some folds are distinctively bimodal, as in the case of the trypsin-like serine protease fold (b.47). This particular fold is dominated by the eukaryotic protease family (b.47.1.2; 352 structures) which contributes both to the high and low force peaks in the distribution. The remaining families (b.47.1.1, b.47.1.3, and b.47.1.4) contribute only to the low force peak. The dynamical bimodality of the b.47.1.2 family can be ascribed to the fact that the strong subset comes with one extra disulphide bond relative to the weak subset. This extra bond provides substantial additional mechanical stability when stretching is accomplished by the termini. We illustrate sources of this bimodality in the SI (Figure S1) for two proteins from this family: 1bra which is strong and 1elc which is weak. In ref. [18], we have noted that various sets of proteins with identical CATH codes (e.g., 3.10.10) may give rise to bimodal distributions without any dynamical involvement of the disulphide bonds. The reason for this is that even though the contact maps for the two modes are similar, the weaker subset misses certain longer ranged contacts which pin the structure. Mechanical stability is thus sensitive to structural and dynamical details that are not provided by standard structural descriptors.

Force clamps

Shearing motif.

The most common type of force clamp identified in the literature is illustrated in the top left panel of Figure 7 corresponding to the 14th-ranked protein 1c4p. In this case, the strong resistance to pulling is due to a simultaneous shearing of two β-strands which are additionally immobilised by short that adhere to the two strands. Similar motifs appear in 1qqr(15), 1j8s(17), 1j8r(19), 1f3y(20), 2pbt(29), 2fzl(15), 1aoh(19), where the numbers in brackets indicate the ranking as shown in Table 1. It is interesting to note that the β-strands responsible for the mechanical clamp in 1j8s and 1j8r display an additional twist. Undoing the twist enhances F_max. (There is a similar mechanism that seems to be operational in the case of a horseshoe conformation found in ankyrin [32],[33]). The force clamps are identified by investigating the effect of removal of various groups of contacts on the value of F_max [12],[18].
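The contact-removal test mentioned above can be organised as a simple loop: recompute the peak force with one group of native contacts removed at a time and rank the groups by the resulting drop. The Python sketch below shows only this bookkeeping. The callable `peak_force`, which would stand for a full stretching simulation returning the maximal force, is a hypothetical placeholder, as are all names used here.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

Contact = Tuple[int, int]

def rank_contact_groups(
    contacts: Iterable[Contact],
    groups: Dict[str, Iterable[Contact]],
    peak_force: Callable[[FrozenSet[Contact]], float],
) -> List[Tuple[str, float]]:
    """Rank contact groups by how much their removal lowers the peak force.

    peak_force is a placeholder for a stretching simulation that takes a
    contact map and returns the maximal force; it is not a real API.
    """
    full_map = frozenset(contacts)
    baseline = peak_force(full_map)
    drops = {}
    for name, group in groups.items():
        reduced_map = full_map - frozenset(group)
        drops[name] = baseline - peak_force(reduced_map)
    # Groups whose removal causes the largest drop are clamp candidates.
    return sorted(drops.items(), key=lambda item: item[1], reverse=True)

# Toy stand-in for a simulation: the force is a sum of per-contact weights.
toy_weights = {(1, 20): 3.0, (2, 19): 2.5, (30, 40): 0.5}
toy_force = lambda cmap: sum(toy_weights[c] for c in cmap)
ranking = rank_contact_groups(
    toy_weights,
    {"strand_pair": [(1, 20), (2, 19)], "helix_cap": [(30, 40)]},
    toy_force,
)
```

In the toy example the "strand_pair" group comes first in the ranking, which is how a sheared pair of strands would be singled out as the force clamp. A real stretching simulation is not additive in the contacts, so the drops are only a qualitative guide.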

Figure 7. Examples of force clamps found in the top strength proteins.

The relevant disulphide bonds are shown in gray shade. The PDB codes of the examples of the proteins that show the particular type of a clamp are indicated. In the case of the CSK, the numbers indicate sequential locations of the amino acids participating in a disulphide bridge in the 13th-ranked 1vpf.

doi:10.1371/journal.pcbi.1000547.g007

There are, however, new types of force clamps that we observe in the proteins listed in Tables 1 and S1. They arise from entanglements resulting from the presence of the disulphide bonds which cannot be ruptured by forces accessible to atomic force microscopy. We note that about 2/3 of the proteins listed in Tables 1 and S1 contain disulphide bonds. Many of these bonds do not carry much dynamical relevance when pulling by the termini. However, in certain situations they are the essence of the force clamp. The disulphide bonds have already been identified as leading to formation of the cysteine knot (CK) motifs [34],[35] (such proteins are found in the toxins of spiders and scorpions) and the cyclic CK motifs [36],[37]. Here, we find still another motif – that of the CSK, which is similar to that found in slipknotted proteins [38]–[40] which do not contain disulphide bonds. This motif is found in the top 13 proteins. The cysteine loop, knot, and slipknot motifs are shown schematically in the remaining panels of Figure 7. It is convenient to divide these motifs into two categories: shallow (S) and deep (D) (according to the classification used for knotted proteins [41],[42]), depending on whether the motif is spanning most of the sequence or is instead localized in a small fraction of it.

Shearing connected with a cysteine loop.

In this case, the mechanical clamp arises from shearing between a β-strand belonging to a deep cysteine loop and another strand located outside the loop (the left bottom panel of Figure 7). Existence of the disulphide bond before the shearing motif allows one to decompose direct tension onto the making the protein resist stretching much more effectively than what would be expected from a simple shearing motif. Additionally, the disulphide bonds prevent the onset of any rotation in the protein conformation which otherwise might create an opportunity for unzipping. This motif appears in 1dzj(40,D), 1vsc(37,D), 1dzk(35,D), 1i04(81,D), 1hqp(83,D), 1oxm(98,D), 2a2g(175,D), 2boc(179,D), and many other proteins. The middle panel of Figure 8 gives an example of the corresponding force–displacement pattern as obtained for 1dzj.

Figure 8. Examples of the force patterns corresponding to proteins with the disulphide bonds.

doi:10.1371/journal.pcbi.1000547.g008

Shearing and dragging out of a cysteine loop.

This motif consists of two parts. The first is formed by a rather small and deep cysteine loop which is located very close to one terminus with the second terminus located across the cysteine loop. The motif arises when almost all of the protein backbone is dragged across the cysteine loop on stretching. The protein structure also contains a few β-strands which get sheared before dragging takes place. This motif is seen in 1kdm(23,D), 1q56(24,D), 1qu0(33,D), 1f5f(34,D), and we call this geometry of pulling geometry I. It should be pointed out that, in all such cases, pulling by the N terminus takes place within (or very near) the plane formed by the cysteine loop. A small change in such a geometry, e.g. the one arising from pulling not by the last amino acid but by the penultimate bead, may cause the backbone to escape the cysteine loop and result in a very different unfolding pathway with a distinctly different value of F_max. In this other kind of pulling set-up, denoted as geometry II, the loop is bypassed and the resistance to pulling is provided only by the shearing mechanism.

Dragging arises from overcoming steric constraints and generates an additional contribution to the strength of the standard shearing mechanical clamp. By using geometry II and also by eliminating the native contacts between the sheared β-strands we can estimate the topological contribution of the dragging effect to the value of F_max. For proteins 1kdm, 1q56, 1qu0, 1f5f, it comes out to be around 25%. The force patterns corresponding to these two geometries of pulling are shown in the top panel of Figure 9.

Figure 9. Top: Two trajectories arising in protein 1qu0.

Dragging occurs when the backbone is pulled across the cysteine loop. Shearing occurs when the pull across the cysteine loop does not take place. Bottom: The force-displacement pattern corresponding to the CSK force clamp in 2h64 (thick line). The thin line shows the corresponding pattern when one removes the attractive contacts that are slipknot related.

doi:10.1371/journal.pcbi.1000547.g009

In the survey, there are other proteins which also have disulphide bonds and belong to the 2.60.120.200 category. These proteins have a cysteine which is either very shallow or deep, but is located in the middle of the protein backbone so that there is no possibility to form a long . In this case, the dragging effects are much smaller. For instance, for 1pz7(D) and 1cpm(S), is close to .

Shearing inside of a cysteine knot.

This motif is created by a loosely packed CK (two or more spliced cysteine loops) with at least two parallel strands that are present within the knot. Pulling the protein by its termini exerts tension on the entire CK and thus produces an indirect shearing force on the β-strands inside the entangled part of the protein. In this case, elimination of the native contacts between the β-strands reduces F_max only partially, indicating that the mechanical clamp is also created by the CK. A simple CK is found in 2bzm(42) and many other proteins, e.g. in 2g7i(77,S), 1hfh(103,S), 2g4x(136,D), 2g4w(169,D). The force-displacement patterns for 2bzm and 2g4x are shown in the bottom panel of Figure 8. More complex structures or higher order CKs (with more than two cysteine bonds) can be identified in 1afk(85), 1afl(117), or 1aqp(135). Inside this group of proteins there are also examples of proteins – 1qoz(88,S) – in which a cysteine loop is braided to a CK by some native contacts.

Cysteine slipknot force-clamp is observed in the strongest 13 proteins.

The top-strength protein is 1bmp (bone morphogenic protein) with the predicted F_max of , which should correspond to about 1100 pN (see Materials and Methods). This strength should be accessible to standard experiments as atomic force microscopy has already been used to rupture covalent N-C and C-C bonds by forces of 1500 and 4500 pN respectively [43].
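The conversion between the reduced force unit of the model and piconewtons is simple arithmetic. The Python sketch below uses the estimate quoted earlier in the text, namely that one reduced unit is of order 110 pN. The function names are hypothetical, and the factor is an order-of-magnitude estimate rather than an exact constant.

```python
# Estimate quoted in the text: one reduced force unit is of order 110 pN.
PN_PER_REDUCED_UNIT = 110.0

def to_piconewtons(force_reduced: float) -> float:
    """Convert a force from reduced model units to piconewtons."""
    return force_reduced * PN_PER_REDUCED_UNIT

def to_reduced_units(force_pn: float) -> float:
    """Convert a force from piconewtons to reduced model units."""
    return force_pn / PN_PER_REDUCED_UNIT

# The roughly 1100 pN quoted for the top-ranked protein, in reduced units.
top_protein_reduced = to_reduced_units(1100.0)
```

The same helper expresses the covalent-bond rupture forces cited from ref. [43] in reduced units, which makes it easy to check how far the predicted peak forces lie below the bond-breaking regime.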

In our discussion, we focus on the 13th-ranked 1vpf (a vascular endothelial growth factor) with the predicted $F_{max}$ of . The CSK motif arises from two loops [40]: the knot-loop and the slip-loop, where the slip-loop can be threaded across the knot-loop. One needs at least three disulphide bonds for this motif to arise.

In the case of 1vpf, the knot-loop is created by the disulphide bonds between amino acids 57 and 102 and between 61 and 104, together with the protein backbone between amino acids 57–61 (GLY, GLY, CYS) and 102–104 (GLU). The slip-loop is created by the protein backbone between sites 61–102 and is stabilized by 12 hydrogen bonds between two parallel β-strands. In the CSK motif, the force peak is due to dragging of the slip-loop through the knot-loop, making the native hydrogen contacts only marginally responsible for the mechanical resistance. Thus the force peak arises, to a large extent, from overcoming steric constraints, i.e. it is due to repulsion resulting from the excluded volume. The pattern for this novel type of force clamp is shown in the top panel of Figure 8. Another example of such a pattern for a CSK is shown in the bottom panel of Figure 9 for the 22nd-ranked 2h64 (a human transforming growth factor). The leading role of the steric constraints is verified by checking the reduction of $F_{max}$ when all the slipknot-related contacts (inside the slip-loop and between the slip-loop and the knot-loop) are converted to be purely repulsive. As a result of this bond removal, the force peak persists, though it gets shifted and becomes smaller. This is summarized in Table S2 in the SI. It is a new and unexpected result.

Another way to establish the role of the CSK motif is to create the disulphide-deficient mutants, as accomplished experimentally [44] for 1vpf. The two mutants, 1mkk (C61A and C104A) and 1mkg (C57A and C102A), have structures similar to 1vpf but contain no knot-loops and thus there is no slipknot. Muller et al. [44] show that the mutants' thermodynamic stability is not reduced but their folding capacity is. Our work shows that the mutants have a reduced resistance to pulling compared to 1vpf: drops from for 1mkk and 1mkg respectively.

We note that the CSK topology is a subgroup inside the CK class (represented mostly by 2.10.90.10) and that the CSK force clamp need not arise for a particular way of pulling. For instance, proteins 1afk(68), 1afl(100) or 1aqp(118) have up to four disulphide bonds and yet the CSK motif does not play any dynamical role in pulling by the terminal amino acids. In the case of the CSK, we observe a formidable dispersion in the values of $F_{max}$. For example, it ranges between for various trajectories in 1vpf, 2h64, and 2c7w respectively. We now examine the CSK geometry in more detail.

Cysteine slipknot motif is distinct from the slipknot motif in several ways.

The left-most panel of Figure 10 shows a slipknot with three intersections at sequential locations , , and . This geometry is topologically trivial since when one pulls by the termini, the apparent entanglement may untie and become a simple line. The entanglement would form the trefoil knot if the intersection was removed by redirecting the corresponding segment of the chain (thin line) away from the loop. Such slipknot motifs have been observed in native states of several proteins [38][40]. In contrast, the CSKs are not present in the native state but arise as a result of mechanical manipulation. The middle panel of Figure 10 shows a schematic representation of a native conformation with three cysteine bonds: between and , between and , and between and . The of the bonds are counted as being closer to the N-terminus. The three bonds are in a specific arrangement as shown in the panel. In particular, the bond must cross the loop . This loop consists of two pieces of the backbone ( and ) that are linked to form a closed path by the two remaining cysteine bonds – it is the cysteine knot-loop. The average radius of this loop is denoted by .

Figure 10. Geometry of a slipknot and a cysteine slipknot.

The top panel corresponds to a genuine slipknot. The bottom left panel is a schematic representation of the native geometry that yields the cysteine slipknot on stretching. The resulting cysteine slipknot motif is shown in the bottom right panel.

doi:10.1371/journal.pcbi.1000547.g010

The arrangement shown in the middle panel has no entanglements that could be considered as knots in the topological sense. However, on pulling by the termini, the chain segment adjacent to gets threaded through the knot-loop since is rigidly attached to , as illustrated in the rightmost panel of Figure 10. Pulling by also results in generating another loop – the cysteine slip-loop – since the segment around gets bent strongly to form a cigar-like shape with the radius of curvature at the denoted by . This loop extends between and . It should be pointed out that the cysteine knot-loop in the CSK is stiff whereas in a slipknotted protein (such as the thymidine kinase) its size is variable (as it can be tightened on the protein backbone [40] in analogy to tightening a knot [45] by pulling).

The dynamics of pulling depends on the relationship between the radius of the knot-loop, $R_k$, and the radius of curvature of the slip-loop, $R_s$, as the “cigar” may either go through or get stuck. In the former case a related force peak would arise. If the system were a homogeneous polymer, dragging would be successful when $R_k$ was bigger than $R_s$. The corresponding force would be related to the work against the elasticity that was needed to bend the slip-loop to the appropriate curvature. This work is proportional to the square of the curvature. Thus the total elastic energy involved in bending the segment is of order $l/R_s^2$, up to a stiffness prefactor [46], where $l$ is the arc distance. Dividing this energy by the distance of pulling would yield an estimate of the force measured if thermal fluctuations were neglected.

The geometrical condition for dragging in proteins is more complicated because of the presence of the side groups and the related non-homogeneities and variability across the hydrophobicity scale. The diameter of the “rope” that the knot-loop is made of should not exceed the maximum linear extension, $t_k$, of the amino acids. Thus the effective inner radius of the knot-loop is $R_k - t_k$. Similarly, the size of the outer circle that is tangential to the tightest slip-loop is $R_s + t_s$, where $t_s$ is the thickness of the slip-loop. (Both thicknesses can be considered as being site dependent and including possible hydration layer effects near polar amino acids.) Thus the slip-knot can be driven through the cysteine knot-loop provided

$$R_k - t_k > R_s + t_s \qquad (1)$$

In our simulations, the successful threading situations correspond to $R_k$ and $R_s$ of around 7 and 3 Å. The amino acids in the knot-loop are mostly Gly, Ala, or Cys with their side groups pointing outside of the loop. One may then estimate $t_k$ to be about 1.5 Å. On the other hand, the linear size of the amino acids in the slip-loop, $t_s$, can be determined to be close to 2.5 Å. These estimates indicate that $R_k - t_k$ can be very close to $R_s + t_s$, so the possibility of slipping through the knot-loop is borderline. In fact, slipping might be forbidden within the framework of the tube-picture of proteins [47],[48] in which the effective thickness of the tube is considered to be 2.7 Å.
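The geometric argument can be checked numerically. The following is a minimal sketch, assuming that threading requires the effective inner radius of the knot-loop (its radius minus the thickness of its "rope") to exceed the outer radius of the bent slip-loop (its radius of curvature plus its thickness). The function and argument names are illustrative and are not taken from the simulation code of the survey. Applying the tube thickness of 2.7 Å to both loops is likewise an assumption made only for this illustration.

```python
def threading_margin(r_knot, r_slip, t_knot, t_slip):
    """Clearance in Å between the knot-loop opening and the slip-loop.

    Positive values mean that the slip-loop fits through the knot-loop.
    """
    inner_radius = r_knot - t_knot   # opening left inside the knot-loop
    outer_radius = r_slip + t_slip   # circle enclosing the bent slip-loop
    return inner_radius - outer_radius


def can_thread(r_knot, r_slip, t_knot, t_slip):
    """True if the slip-loop is strictly smaller than the opening."""
    return threading_margin(r_knot, r_slip, t_knot, t_slip) > 0.0


# Values quoted in the text (Å): radii of about 7 and 3,
# thicknesses of about 1.5 (knot-loop) and 2.5 (slip-loop).
side_group_margin = threading_margin(7.0, 3.0, 1.5, 2.5)

# Assumption: the tube-picture thickness of 2.7 Å is used for both loops.
tube_margin = threading_margin(7.0, 3.0, 2.7, 2.7)

print(f"side-group estimate: margin = {side_group_margin:.1f} Å")
print(f"tube picture:        margin = {tube_margin:.1f} Å")
```

With the side-group estimates the clearance vanishes, which is the borderline situation described above, whereas the tube-picture thickness gives a negative clearance under the stated assumption.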

The CSK motifs give rise to a force peak in 1vpf, 2h64(22,S), 1rv6(25,S), 1waq(26,S), 1reu(27,S), 1tgj(28), 2h62(30,S), 1tgk(31), 2c7w(38,D), 2gyr(39,S), 1lx5(95,D), and many other proteins. In these cases, the typical value of the knot-loop radius, $R_k$, is about 7 Å. However, specificity may result in somewhat smaller values of $R_k$ which may cause only smaller segments of the slip-loop to be threaded. If the passage is blocked, there will be no isolated force peak as happens in 1tgj and 1vpp.

Types of the force–displacement patterns for proteins with the disulphide bonds.

In the case of proteins with very shallow cysteine knot, loop or slipknot motifs, the force $F$ increases very rapidly with the displacement $d$ and an isolated force peak does not arise (). Such cases are represented, e.g., by 1bmp, 1rnr, 1ld5, and 1wzn where the slipknots are either very tight or the cysteine loop is very shallow. In the case of a shallow motif, however, a force peak can sometimes be isolated as in the case of the 13th-ranked protein 1vpf (Figure 8) and in several other proteins, like 1xzg and 1dzk. In this case, the value of takes into account tension on the cysteine bonds and it is not obvious whether such a strong elastic background should be subtracted from the value of when determining or not. In this survey, we do not subtract the backgrounds. It should be noted that in our previous surveys we missed the CSK-related force peaks because we attributed the rapid force rises at the end of pulling just to stretching of the backbone without realizing the existence of structure in some such rises.

For a deep motif, the pattern may have several small force peaks before the final rise of the force, as observed for 2g4s and 1bj7. When the CSK motif is very deep, it usually does not have any influence on the shape of the pattern apart from a much steeper final rising force. Such a situation is seen in the case of, e.g., 1j8r and 1j8s.

Concluding remarks

This survey identifies a host of proteins that are likely to be sturdy mechanically. Many of them involve disulphide bridges which bring about entanglements that are complicated topologically such as CSKs and CKs. The distinction between the two is that the former can depart from its native conformation and the latter cannot.

Our survey made use of a coarse grained model so it would be interesting to reinvestigate some of the proteins identified here by all-atom simulations, especially in situations when the CSK is involved. The CSK motifs may reveal different mechanical properties when studied in a more realistic model. Of course, a decisive judgment should be provided by experiment.

The very high mechanical resistance of the CSK proteins should help one to understand their biological function. The superfamily of cysteine-knot cytokines (in class small proteins and fold cysteine-knot cytokines) includes families of the transforming growth-factor and the polypeptide vascular endothelial growth factors (VEGFs) [49],[50]. The various members of this superfamily, listed in Table 5, have distinct biological functions. For instance, VEGF-B proteins, which regulate blood vessel and lymphatic angiogenesis, bind only to one tyrosine kinase receptor, VEGFR-1. On the other hand, VEGF-A proteins bind to two receptors, VEGFR-1 and VEGFR-2. All of these proteins form a dimer structure. The members of this family are endowed with remarkably similar monomer structures but differ in their mode of dimerisation and thus in their propensity to bind ligands. Additionally, all dimers possess almost the same cyclic arrangement of cysteine residues which are involved in both intra- and inter-chain disulphide bonds. These inter-chain disulphide bonds create the knot and slip-loops, where the intra-chain disulphide bonds give rise to a CSK motif when the slip-loop gets dragged across the knot-loop upon pulling.

Table 5. Members of the cysteine-knot cytokines superfamily.

doi:10.1371/journal.pcbi.1000547.t005

It has been shown experimentally [51] that such cysteine related connectivities bring the key residues involved in receptor recognition into close proximity of each other. They also provide a primary source of stability of the monomers due to the lack of other hydrogen bonds between two beta strands at the dimer interface.

The nontrivial topological connection between the monomers allows for mechanical separation of the two monomers by a distance of about half of the size of the slip-loop. Our results suggest, however, that the force needed for the separation may be too high to arise in the cell.

Materials and Methods

The input to the dynamical modeling is provided by PDB-based structures. The structure files may often contain several chains. In this case, we consider only the first chain that is present in the PDB file. Likewise, the first NMR determined structure is considered. If a protein consists of several domains, we consider only the first of them.

The modeling cannot be accomplished if a structure has regions or strings of residues which are not sufficiently resolved experimentally. Essentially all structure-disjoint proteins have been excluded from our studies. Exceptions were made for the experimentally studied scaffoldin 1aoh and for proteins in which small defects in the established structure (such as missing side groups) were confined within cysteine loops and were thus irrelevant dynamically. In these situations, the missing contacts have been added by a distance-based criterion [23] in which the threshold was set at 7.5 Å. One of the tests used to weed out inadequate structures involved determining the distances between consecutive Cα atoms. A structure was rejected if these distances were found to be outside of the range of 3.6–3.95 Å. An exception was made for proline, which in its native state can accommodate the cis conformation. In that case, the distance between a proline and its subsequent amino acid usually falls in the range between 2.8 and 3.85 Å. For a small group of proteins which slipped through our structure quality checking procedure, but were found to be easily fixed (e.g. 1f5f, 1fy8, and 2f3c), we used the publicly available software BBQ [52] to rebuild the locations of the missing residues. The limited accuracy of this prediction procedure seems to be adequate for our model due to its coarse-grained nature.
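The distance test described above can be sketched in a few lines. This is a minimal illustration, not the screening code used in the survey: the function names are hypothetical, the coordinates are taken to be those of the Cα atoms, and a pair that starts at a proline is accepted if it falls in either of the two quoted ranges (an assumption about how the exception is applied).

```python
import math

GENERAL_RANGE = (3.6, 3.95)   # Å, consecutive C-alpha atoms
PROLINE_RANGE = (2.8, 3.85)   # Å, pair that starts at a proline


def link_is_acceptable(first_residue, ca_first, ca_second):
    """Check one pair of consecutive C-alpha positions."""
    distance = math.dist(ca_first, ca_second)
    in_general = GENERAL_RANGE[0] <= distance <= GENERAL_RANGE[1]
    if first_residue == "PRO":
        in_proline = PROLINE_RANGE[0] <= distance <= PROLINE_RANGE[1]
        return in_general or in_proline
    return in_general


def chain_is_acceptable(residues, ca_coords):
    """Reject the chain if any consecutive pair is out of range."""
    return all(
        link_is_acceptable(residues[i], ca_coords[i], ca_coords[i + 1])
        for i in range(len(ca_coords) - 1)
    )


# Hypothetical three-residue fragment with one short (3.0 Å) link.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.8, 0.0, 0.0)]
print(chain_is_acceptable(["PRO", "GLY", "ALA"], coords))
print(chain_is_acceptable(["ALA", "GLY", "ALA"], coords))
```

The short link is tolerated only when it starts at a proline, which mirrors the exception made for the cis conformation.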

The modeling of dynamics follows our previous implementations [11],[12],[18] within model except that the contact map is as in ref. [19], i.e. with the contacts excluded. There is also a difference in the description of the disulphide bonds. In refs. [14],[19] they were treated as an order-of-magnitude enhancement of the Lennard-Jones contacts in all proteins. In ref. [18] the different treatment of the disulphide bonds was applied to the proteins that were found to be strong mechanically without any enhancements. Here, on the other hand, we consider such bonds as harmonic in all proteins, in analogy to the backbone links between the consecutive Cα atoms. The native contacts are described by the Lennard-Jones potential $V_{ij} = 4\varepsilon\left[(\sigma_{ij}/r_{ij})^{12} - (\sigma_{ij}/r_{ij})^{6}\right]$, where $r_{ij}$ is the distance between the Cα atoms in amino acids $i$ and $j$ whereas $\sigma_{ij}$ is determined pair-by-pair so that the minimum in the potential is located at the experimentally established native distance. The non-native contacts are repulsive below $r_{ij}$ of 4 Å.
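The pair-by-pair choice of the length parameter can be illustrated as follows. This is a sketch that assumes the standard 12-6 form of the Lennard-Jones potential, for which the minimum lies at $2^{1/6}$ times the length parameter; the function names and the native distance of 6 Å are hypothetical and serve only as an example.

```python
def sigma_from_native_distance(r_native):
    """Length parameter that places the 12-6 minimum at r_native."""
    return r_native * 2.0 ** (-1.0 / 6.0)


def native_contact_energy(r, r_native, epsilon=1.0):
    """12-6 Lennard-Jones energy of a native contact at separation r."""
    sigma = sigma_from_native_distance(r_native)
    sixth = (sigma / r) ** 6
    return 4.0 * epsilon * (sixth * sixth - sixth)


# Hypothetical native C-alpha separation of 6 Å, energies in units of epsilon.
r_native = 6.0
for r in (5.5, 6.0, 6.5, 8.0):
    print(f"r = {r:.1f} Å  V = {native_contact_energy(r, r_native):+.3f}")
```

At the native separation the energy equals minus the well depth, and it rises on both sides of that distance, so each contact is most favourable in its experimentally determined geometry.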

The implicit solvent is described by the Langevin noise and damping terms. The amplitude of the noise is controlled by the temperature, $T$. All simulations were done at , where $k_B$ is the Boltzmann constant. Newton's equations of motion are solved by the fifth order predictor-corrector algorithm. The model is considered in the overdamped limit so that the characteristic time scale, $\tau$, is of order 1 ns as argued in refs. [6],[53]. Stretching is implemented by attaching elastic springs to two amino acids. The spring constant used has a value of which is close to the elasticity of experimental cantilevers. One of the springs is anchored and the other spring is moving with a constant speed, $v_p$. Choices in the value of the spring constant have been found to affect the look of the force-displacement patterns and thus the location of the transition state [54],[55], but not the values of $F_{max}$ [10],[12],[18].

The dependence of $F_{max}$ on the pulling speed $v_p$ is protein-dependent and it is approximately logarithmic in $v_p$, as evidenced by Figure 11 for several strong proteins. The logarithmic dependence has been demonstrated experimentally, for instance, for polyubiquitin [56],[57]. It can be written as $F_{max} = p\,\ln(v_p/v_0) + q$, where $v_0$ is the unit of speed. The approximate validity of this relationship is demonstrated in Figure 11 for three proteins with big values of $F_{max}$. We observe that the larger the value of $F_{max}$, the greater the probability that the dependence on $v_p$ is large. When we make a fit to this relationship for 1vpf, 1c4p, and 1j8s, we get the parameter $p$ to be equal to respectively (the values of $q$ are correspondingly). However, some strong proteins may have $p$ as low as 0.04.

Figure 11. Dependence of $F_{max}$ on the pulling velocity for the proteins indicated.

corresponds to which is of order . The data for several top strength proteins are shown.

doi:10.1371/journal.pcbi.1000547.g011

When making the survey, we have used of and stretching was accomplished by attaching the springs to the terminal amino acids (there is an astronomical number of other choices of the attachment points).

In order to estimate an effective experimental value of the energy parameter $\varepsilon$, we have correlated the theoretical values of $F_{max}$ with those obtained experimentally. The experimental data points used in ref. [14] have been augmented by entries pertaining to 1emb (117–182), 1emb (182–212) [58] (where the numbers in brackets indicate the amino acids that are pulled) and 1aoh, 1g1k, and 1amu [23]. The full list of the experimental entries is provided by Table 6. Unlike the previous plots [14] that cross correlate the experimental and theoretical values of $F_{max}$, we now extrapolate the theoretical forces to the values that should be measured at the pulling speeds that are used experimentally. We assume that the unit of speed, $v_0$, is of order 1 Å/ns and consider 10 speeds to make a fit to the logarithmic relationship. The values of the fit parameters $p$ and $q$ for the proteins studied experimentally are listed in Table 6.
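The extrapolation step can be sketched as a straight-line fit of the force against the logarithm of the speed. The example below is only an illustration under stated assumptions: the ten speeds and forces are synthetic (generated from a slope of 0.2 and an intercept of 4.0 in model units), the names `fit_log_speed` and `extrapolate` are hypothetical, and the experimental speed of 600 nm/s is an illustrative value converted with the assumed unit of speed of 1 Å/ns, i.e. 10^8 nm/s.

```python
import math


def fit_log_speed(speeds, forces):
    """Least-squares fit of force = slope * ln(speed) + intercept."""
    xs = [math.log(v) for v in speeds]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(forces) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, forces))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate(slope, intercept, speed):
    """Force predicted by the logarithmic fit at the given speed."""
    return slope * math.log(speed) + intercept


# Synthetic data: ten speeds in units of the assumed 1 Å/ns.
speeds = [0.001 * 2 ** k for k in range(10)]
forces = [0.2 * math.log(v) + 4.0 for v in speeds]

slope, intercept = fit_log_speed(speeds, forces)

# Illustrative experimental speed: 600 nm/s expressed in units of 1e8 nm/s.
experimental_speed = 600.0 / 1.0e8
extrapolated_force = extrapolate(slope, intercept, experimental_speed)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"extrapolated force = {extrapolated_force:.3f}")
```

Because experimental speeds are several orders of magnitude below the simulated ones, the extrapolated force is lower than any of the simulated values, and the uncertainty of the fitted slope carries over into the extrapolated force.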

Table 6. The experimental and theoretical data on stretching of proteins.

doi:10.1371/journal.pcbi.1000547.t006

The main panel of Figure 12 demonstrates the relationship between the extrapolated theoretical and experimental values of $F_{max}$. The best fit, indicated by the solid line, corresponds to a slope of 0.0091. The inverse of this slope yields 110 pN as an effective equivalent of the theoretical force unit of $\varepsilon$/Å. The Pearson correlation coefficient is 0.832, the rms percent error is 1.02, and the Theil coefficient (discussed in ref. [14]) is 0.281. The inset shows a similar plot obtained when the extrapolation to the experimental speeds is not done. The resulting unit of the force would be equivalent to 110 pN which differs from the previous estimate of 71 pN (shown by the dotted line in the main panel) because of the inclusion of the newly measured proteins and implementation of the extrapolation procedure. The statistical measures of error here are . These measures are better compared to the case with the extrapolation because the extrapolation procedure itself brings in additional uncertainties. Nevertheless, implementing the procedure seems sounder physically. The spread between these various effective units of the force suggests an error bar of order 30 pN on the currently best value of 110 pN.
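The conversion from the slope to the force unit is a one-line calculation, sketched below. The slope of 0.0091 is the value quoted in the text; the function names, the three data points, and the choice of a least-squares line through the origin as the meaning of "best slope" are assumptions made for illustration only.

```python
QUOTED_SLOPE = 0.0091  # theoretical force (ε/Å) per pN, as quoted in the text


def force_unit_in_pn(slope=QUOTED_SLOPE):
    """Effective value of one ε/Å in pN, i.e. the inverse of the slope."""
    return 1.0 / slope


def slope_through_origin(experimental_pn, theoretical):
    """Least-squares slope of a line constrained to pass through zero."""
    numerator = sum(x * y for x, y in zip(experimental_pn, theoretical))
    denominator = sum(x * x for x in experimental_pn)
    return numerator / denominator


print(f"1 ε/Å corresponds to about {force_unit_in_pn():.1f} pN")

# Hypothetical data points: experimental forces in pN, model forces in ε/Å.
experimental = [100.0, 200.0, 300.0]
theoretical = [0.9, 1.9, 2.7]
fitted_slope = slope_through_origin(experimental, theoretical)
print(f"fitted slope = {fitted_slope:.5f} (ε/Å) per pN")
print(f"implied unit = {force_unit_in_pn(fitted_slope):.1f} pN")
```

With this conversion a model force of 10 ε/Å would correspond to roughly 1100 pN, which is the scale quoted for the strongest proteins in the survey.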

Figure 12. Theoretical $F_{max}$ extrapolated to the pulling speeds used experimentally vs. the corresponding experimental value.

The solid line indicates the best slope of 1/(110 pN). The dotted line corresponds to the previous result of 1/(71 pN) obtained in ref. [14] where no extrapolation was made. The inset shows a similar plot in which the extrapolation is not implemented (denoted as in Table 6). The list of the proteins used is provided by Table 6. It comprises almost all cases considered in ref. [14] but it also includes the recent data points obtained for the scaffoldin proteins [23] and the GFP [58]. The numerical symbols used in the Figure match the listing number in Table 6.

doi:10.1371/journal.pcbi.1000547.g012

Supporting Information

Figure S1.

(a) Structure of trypsin 1bra (N = 245). The mechanically crucial disulphide bond between sites 128 and 232 is highlighted in red. (b) Structure of elastase 1elc (N = 255) which belongs to the same fold b.47.1.2 as 1bra. This structure does not contain two disulphide bonds that 1bra does. (c) The force-displacement plot for 1bra. Fmax corresponds to 3.7 ε/Å. The thinner line is obtained when the 128–232 disulphide bond is eliminated: Fmax drops to 2.7 ε/Å. When one more disulphide bond is cut, stretching continues to distances shown in panel (d) without affecting Fmax. (d) The force-displacement plot for 1elc. The corresponding Fmax is 2.0 ε/Å. In the case of 1elc, stretching results in the terminal helix pulling β strands from the inside of the protein and thus causing the inner β-barrel to unfold. In the case of 1bra (with the disulphide bridge), the terminal helix pulls the neighbouring loop. After this event, resistance grows linearly and forms one major force peak. After the peak, the whole structure opens suddenly, rupturing contacts between strands in the β-barrel and in the neighbouring loops.

doi:10.1371/journal.pcbi.1000547.s001

(4.07 MB EPS)

Table S1.

Continuation of Table 1 of the main text.

doi:10.1371/journal.pcbi.1000547.s002

(0.04 MB PDF)

Table S2.

Identification of a mechanical clamp Fmax for selected proteins.

doi:10.1371/journal.pcbi.1000547.s003

(0.02 MB PDF)

Acknowledgments

The idea of making surveys of proteins using Go-like models arose in a very stimulating discussion with J. M. Fernandez in 2005. More recent discussions and suggestions by M. Carrion-Vazquez, particularly about the cysteine knots, are warmly appreciated.

Author Contributions

Conceived and designed the experiments: MS JIS MC. Performed the experiments: MS JIS MC. Analyzed the data: MS JIS MC. Contributed reagents/materials/analysis tools: MS JIS MC. Wrote the paper: MS JIS MC.

References

  1. Carrion-Vazquez M, Cieplak M, Oberhauser AF (2009) Protein mechanics at the single-molecule level. In Encyclopedia of Complexity and Systems Science 7026–7050. Editor-in-chief R. A. Meyers, Springer, New York, ISBN: 978-0-387-75888-6.
  2. Lu H, Schulten K (1999) Steered molecular dynamics simulation of conformational changes of immunoglobulin domain I27 interpret atomic force microscopy observations. J Chem Phys 247: 141–153.
  3. Paci E, Karplus M (1999) Forced Unfolding of Fibronectin Type 3 Modules: An Analysis by Biased Molecular Dynamics Simulations. J Mol Biol 288: 441–459.
  4. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, et al. (2000) The Protein Data Bank. Nucl Acids Res 28: 235–242.
  5. Abe H, Go N (1981) Noninteracting local-structure model of folding and unfolding transition in globular proteins. II. Application to two-dimensional lattice proteins. Biopolymers 20: 1013–1031.
  6. Veitshans T, Klimov D, Thirumalai D (1997) Protein folding kinetics: time scales, pathways, and energy landscapes in terms of sequence-dependent properties. Folding Des 2: 1–22.
  7. Hoang TX, Cieplak M (2000) Sequencing of folding events in Go-like proteins. J Chem Phys 113: 8319–8328.
  8. Clementi C, Nymeyer H, Onuchic JN (2000) Topological and energetic factors: what determines the structural details of the transition state ensemble and “on-route” intermediates for protein folding? An investigation for small globular proteins. J Mol Biol 298: 937–953.
  9. Karanicolas J, Brooks CL III (2002) The origins of asymmetry in the folding transition states of protein L and protein G. Protein Science 11: 2351–2361.
  10. Cieplak M, Hoang TX, Robbins MO (2002) Folding and stretching in a Go-like model of titin. Proteins: Struct Funct Bio 49: 114–124.
  11. Cieplak M, Hoang TX (2003) Universality classes in folding times of proteins. Biophys J 84: 475–488.
  12. Cieplak M, Hoang TX, Robbins MO (2004) Thermal effects in stretching of Go-like models of titin and secondary structures. Proteins: Struct Funct Bio 56: 285–297.
  13. Tozzini V, Trylska J, Chang C, McCammon JA (2007) Flap opening dynamics in HIV-1 protease explored with a coarse-grained model. J Struct Biol 157: 606–615.
  14. Sułkowska JI, Cieplak M (2008) Selection of optimal variants of Go-like models of proteins through studies of stretching. Biophys J 95: 3174–3191.
  15. Tsai J, Taylor R, Chothia C, Gerstein M (1999) The packing density in proteins: Standard radii and volumes. J Mol Biol 290: 253–266.
  16. Settanni G, Hoang TX, Micheletti C, Maritan A (2002) Folding pathways of prion and doppel. Biophys J 83: 3533–3541.
  17. Sobolev V, Sorokin A, Prilusky J, Abola EE, Edelman M (1999) Automated analysis of interatomic contacts in proteins. Bioinformatics 15: 327–332.
  18. Sułkowska JI, Cieplak M (2007) Mechanical stretching of proteins – a theoretical survey of the Protein Data Bank. J Phys Cond Mat 19: 283201.
  19. Sułkowska JI, Cieplak M (2008) Stretching to understand proteins – A survey of the Protein Data Bank. Biophys J 94: 6–13.
  20. Cieplak M, Sułkowska JI (2009) Tests of the Structure-based models of proteins. Acta Phys Polonica 115: 441–445.
  21. Rief M, Gautel M, Oesterhelt F, Fernandez JM, Gaub HE (1997) Reversible unfolding of individual titin immunoglobulin domains by AFM. Science 276: 1109–1112.
  22. Carrion-Vazquez M, Oberhauser AF, Fowler SB, Marszalek PE, Broedel PE, et al. (1999) Mechanical and chemical unfolding of a single protein: a comparison. Proc Natl Acad Sci U S A 96: 3694–3699.
  23. Valbuena A, Oroz J, Hervas R, Vera AM, Rodriguez D, et al. (2009) On the remarkable robustness of scaffoldins. Proc Natl Acad Sci U S A - in press (to arrive on-line on July 27).
  24. Orengo CA, Michie AD, Jones S, Jones DT, Swindells MB, et al. (1997) CATH - A hierarchical classification of protein domain structures. Structure 5: 1093–108.
  25. Pearl F, Todd A, Sillitoe I, Dibley M, Redfern O, et al. (2005) The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis. Nucl Acid Res 33: D247–51.
  26. Brockwell DJ, Paci E, Zinober RC, Beddard G, Olmsted PD, Smith DA, Perham RN, Radford SE (2003) Pulling geometry defines mechanical resistance of β-sheet protein. Nat Struct Biol 10: 731–737.
  27. Murzin AG, Brenner SE, Hubbard T, Chothia C (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 247: 536–40.
  28. Andreeva A, Howorth D, Chandonia J-M, Brenner SE, Hubbard TJP, et al. (2008) Data growth and its impact on the SCOP database: new developments. Nucl Acid Res 36: D419–25.
  29. Lo Conte L, Brenner SE, Hubbard TJP, Chothia C, Murzin AG (2002) SCOP database in 2002: refinements accommodate structural genomics. Nucl Acid Res 30: 264–7.
  30. Sułkowska JI, Cieplak M (2008) Stretching to understand proteins - a survey of the Protein Data Bank (Retraction). Biophys J 95: 5487–5487.
  31. The Gene Ontology Consortium (2000) Gene ontology: tool for the unification of biology. Nat Genet 25: 25–29.
  32. Lee G, Abdi K, Jiang Y, Michaely P, Bennett V, et al. (2006) Nanospring behavior of ankyrin repeats. Nature 440: 246–249.
  33. Li LW, Wetzel S, Pluckthun A, Fernandez JM (2006) Stepwise unfolding of ankyrin repeats in a single protein revealed by atomic force microscopy. Biophys J 90: 30–32.
  34. Craik DJ, Daly NL, Waine C (2001) The cysteine knot motif in toxins and implications for drug design. Toxicon 39: 43–60.
  35. Gruber CW, Cemazar M, Anderson MA, Craik DJ (2007) Insecticidal plant cyclotides and related cysteine knot toxins. Toxicon 49: 561–575.
  36. Craik DJ, Daly NL, Bond TJ, Waine C (1999) Plant cyclotides: A unique family of cyclic and knotted proteins that defines the cyclic cystine knot structural motif. J Mol Biol 294: 1327–1336.
  37. Rosengren KJ, Daly NL, Plan MR, Waine C, Craik DJ (2003) Twists, knots, and rings in proteins - Structural definition of the cyclotide framework. J Biol Chem 278: 8606–8616.
  38. Yeates TO, Norcross TS, King NP (2007) Knotted and topologically complex proteins as models for studying folding and stability. Curr Opinion in Chem Biol 11: 595–603.
  39. Taylor WR (2007) Protein knots and fold complexity: Some new twists. Comp Biol and Chem 31: 151.
  40. Sułkowska JI, Sulkowski P, Onuchic JN (2009) Jamming proteins with slipknots and their free energy landscapes. Submitted.
  41. Virnau P, Mirny LA, Kardar M (2006) Intricate Knots in Proteins: Function and Evolution. PLOS Comp Biol 2: 1074–1079.
  42. Sułkowska JI, Sulkowski P, Szymczak P, Cieplak M (2008) Stabilizing effect of knots on proteins. Proc Natl Acad Sci U S A 105: 19714–19719.
  43. Grandbois M, Beyer M, Rief M, Clausen-Schaumann H, Gaub H (1999) How Strong Is a Covalent Bond? Science 283: 1727–1730.
  44. Muller YA, Heiring C, Misselwitz R, Welfle K, Welfle H (2002) The cystine knot promotes folding and not thermodynamic stability in vascular endothelial growth factor. J Biol Chem 277: 43410.
  45. Sułkowska JI, Sulkowski P, Szymczak P, Cieplak M (2008) Tightening of knots in the proteins. Phys Rev Lett 100: 058106.
  46. Landau LD, Lifshitz EM (1986) Theory of Elasticity - Course of Theoretical Physics, 3rd edition. Oxford: Butterworth-Heinemann.
  47. Banavar JR, Hoang TX, Maritan A, Seno F, Trovato A (2004) Unified perspective on proteins: A physics approach. Phys Rev E 70: 041905.
  48. Lezon TR, Banavar JR, Maritan A (2006) The origami of life. J Phys: Cond Matter 18: 847–888.
  49. Iyer S, Scotney PD, Nash AD, Acharya KR (2006) Crystal Structure of Human Vascular Endothelial Growth Factor-B: Identification of Amino Acids Important for Receptor Binding. J Mol Biol 359: 76.
  50. Stroud RM, Wells JA (2004) Mechanistic Diversity of Cytokine Receptor Signaling Across Cell Membranes. Sci STKE 231: re7.
  51. Greenwald J, Groppe J, Gray P, Wiater E, Kwiatkowski W, Vale W, Choe S (2003) The BMP7/ActRII Extracellular Domain Complex Provides New Insights into the Cooperative Nature of Receptor Assembly. Mol Cell 11: 605.
  52. Gront D, Kmiecik S, Kolinski A (2007) Backbone building from quadrilaterals. A fast and accurate algorithm for protein backbone reconstruction from alpha carbon coordinates. J Comput Chemistry 28: 1593–1597.
  53. Szymczak P, Cieplak M (2006) Stretching of proteins in a uniform flow. J Chem Phys 125: 164903.
  54. Evans E, Ritchie K (1999) Strength of Weak Bond Connecting Flexible Polymer Chains. Biophys J 76: 2439–2447.
  55. Seifert U (2000) Rupture of multiple parallel molecular binds under dynamic loading. Phys Rev Lett 84: 2750–2753.
  56. Carrion-Vazquez M, Li HB, Lu H, Marszalek PE, Oberhauser AF, et al. (2003) The mechanical stability of ubiquitin is linkage dependent. Nat Struct Biol 10: 738–743.
  57. Chyan CL, Lin FC, Peng H, Yuan JM, Chang CH, et al. (2003) Reversible mechanical unfolding of single ubiquitin molecules. Biophys J 87: 3995–4006.
  58. Dietz H, Berkemeier F, Bertz M, Rief M (2006) Anisotropic deformation response of single protein molecules. Proc Natl Acad Sci U S A 103: 12724–12728.
  59. Watanabe K, Muhle-Goll C, Kellermayer MSZ, Labeit S, Granzier HL (2002) Different molecular mechanics displayed by titin's constitutively and differentially expressed tandem Ig segment. Struct Biol J 137: 248–258.
  60. Watanabe K, Nair P, Labeit D, Kellermayer MSZ, Greaser M, et al. (2002) Molecular mechanics of cardiac titins PEVK and N2B spring elements. J Biol Chem 277: 11549–11558.
  61. Li HB, Fernandez JM (2003) Mechanical design of the first proximal Ig domain of human cardiac titin revealed by single molecule force spectroscopy. J Mol Biol 334: 75–86.
  62. Yang G, Cecconi C, Baase WA, Vetter IR, Breyer WA, et al. (2000) Solid-state synthesis and mechanical unfolding of polymers of T4 lysozyme. Proc Nat Acad Sci U S A 97: 139–144.
  63. Lenne PF, Raae AJ, Altmann SM, Saraste M, Horber JKH (2000) States and transition during unfolding of a single spectrin repeat. FEBS Lett 476: 124–128.
  64. Carrion-Vazquez M, Oberhauser AF, Fisher TE, Marszalek PE, Li HB, et al. (2000) Mechanical design of proteins studied by single-molecule force. Prog Biophys Mol Biol 74: 63–91.
  65. Best RB, Li B, Steward A, Daggett V, Clarke J (2001) Can non-mechanical proteins withstand force? Stretching barnase by atomic force microscopy and molecular dynamics simulation. Biophys J 81: 2344–2356.
  66. Brockwell DJ, Godfrey S, Beddard S, Paci E, West DK, et al. (2005) Mechanically unfolding small topologically simple protein L. Biophys J 89: 506–519.
  67. Dietz H, Rief M (2004) Exploring the energy landscape of the GFP by single-molecule mechanical experiments. Proc Natl Acad Sci U S A 101: 16192–16197.
  68. Dietz H, Rief M (2006) Protein structure by mechanical triangulation. Proc Natl Acad Sci U S A 103: 1244–1247.
  69. Li L, Han-Li Huang H, Badilla CL, Fernandez JM (2005) Mechanical unfolding intermediates observed by single-molecule spectroscopy in fibronectin type III module. J Mol Biol 345: 817–826.
  70. Oberhauser AF, Badilla-Fernandez C, Carrion-Vazquez M, Fernandez JM (2002) The mechanical hierarchies of fibronectin observed with single molecule AFM. J Mol Biol 319: 433–447.
  71. Oberdorfer Y, Fuchs H, Janshoff A (2000) Conformational analysis of native fibronectin by means of force spectroscopy. Langmuir 16: 9955–9958.
  72. Oberhauser AF, Marszalek PE, Erickson HP, Fernandez JM (1998) The molecular elasticity of the extracellular matrix protein tenascin. Nature 14: 181–185.
  73. Cao Y, Li HB (2007) Polyprotein of GB1 is an ideal artificial elastomeric protein. Nature Mat 6: 109–114.