Fig 1.
The six types of register shifts for βαβ-motifs and their preferences in naturally occurring protein structures.
(A) A βαβ-motif structure. A βαβ-motif is composed of two β-strands joined by an α-helix, with the α-helix packed against the paired strands. The first and second strands are colored cyan and orange, respectively, and the α-helix is colored green. (B) A naturally occurring protein structure containing βαβ-motifs (the response regulator Spo0F, PDB ID: 1peyA). One of the βαβ-motifs in the structure is shown in color. (C) A graphical explanation of the register shifts using a schematic representation of a βαβ-motif consisting of two β-strands (S1 and S2) and an α-helix, with two neighboring β-strands, S1’ and S2’. Because S1’ and S2’ can run in either direction in the present database analysis, they are depicted by two-headed arrows. The blue and red arrows indicate the most N-terminal and the most C-terminal residue pairs that form a cross-strand residue pairing, respectively. (D) Close-up view of the βαβ-motif colored in (B), with the neighboring strands (left), and its register shifts (right). (E) Observed frequencies of each of the six register shifts in naturally occurring protein structures. The bars at a register shift of zero, which corresponds to the peak of each distribution, are filled in black.
Fig 2.
Observed frequency of loop types in βα (A) and αβ (B) units, as classified based on the ABEGO representation and packing geometry.
The horizontal axis is sorted by frequency of occurrence and the 10 most frequently observed loops are shown. The packing geometry, i.e., parallel or anti-parallel, was defined based on the orientation of the Cα-Cβ vector of the strand residue closest to the helix relative to the vector from the first to the second secondary structure element [4]. If the two vectors were parallel, the orientation was denoted as “(P)”, and if anti-parallel, the orientation was denoted as “(A)”. The character “:” denotes the zero-length loop.
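The parallel/anti-parallel classification described above can be sketched as a sign test on a dot product. This is a minimal illustration, not the procedure of Ref. [4]: the function name `packing_orientation`, the use of one representative point per secondary structure element, and the "positive dot product means parallel" criterion are all assumptions made here for clarity.

```python
def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def packing_orientation(ca, cb, first_element, second_element):
    """Classify a loop as "P" (parallel) or "A" (anti-parallel).

    ca, cb         : Calpha and Cbeta coordinates of the strand residue
                     closest to the helix.
    first_element  : representative point of the first secondary
                     structure element (assumption: e.g. its centroid).
    second_element : representative point of the second element.
    """
    side_chain_vector = _sub(cb, ca)
    element_vector = _sub(second_element, first_element)
    # Assumed criterion: same general direction means parallel.
    return "P" if _dot(side_chain_vector, element_vector) > 0 else "A"


# Hypothetical coordinates, for illustration only.
label = packing_orientation((0.0, 0.0, 0.0), (0.0, 0.0, 1.5),
                            (0.0, 0.0, 0.0), (0.0, 0.0, 10.0))
```

Here `label` is "P" because both vectors point along +z; swapping the two element points gives "A".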
Fig 3.
The distribution of register shifts of the two most frequently observed loop types with anti-parallel orientation in αβ-units in the dataset.
(A) and (B) Distributions of two of the register shifts. (C) Schematic structure of a βαβ-motif with the most frequent values of the register shifts shown in (A) and (B). (D) and (E) Distributions of two of the register shifts. (F) Schematic structure of a βαβ-motif with the most frequent values of the register shifts shown in (D) and (E). In (C) and (F), a square with a circle inside represents a single amino acid residue whose side chain is located on the proximal side, and a colored filled square represents a residue whose side chain is located on the opposite side.
Fig 4.
Schematic representations of the nine target systems for all-atom simulation.
Panels (A)–(I) show schematic representations of systems I–IX, respectively. Each system consists of the αβ-unit and a three-stranded β-sheet structure, with different register shifts. The arrows represent β-strands and the rectangles represent α-helices. In the β-strands, residues located at the C-terminus are indicated by short arrows, and other residues are shown as squares. A square (or a short arrow) with a circle inside represents a single amino acid residue whose side chain is on the proximal side, and a colored filled square represents a residue whose side chain is on the opposite side. The register shifts of systems I to IX were (−1, −1), (0, −1), (1, −1), (−1, 0), (0, 0), (1, 0), (−1, 1), (0, 1), and (1, 1), respectively. Residues shown in blue are those whose ϕ and ψ angles were exhaustively sampled. The black dotted lines indicate hydrogen bonds that were fixed during the exhaustive sampling, while the red dotted lines indicate hydrogen bonds that can be broken.
Table 1.
List of intra-chain hydrogen bonds that must/must not be satisfied for the systems I–IX.
The HB1 bond is defined as the hydrogen bond between the donor of the most N-terminal residue of the S1 strand in system III, VI, or IX and the acceptor of the first residue of the GB loop. The HB2 and HB3 bonds are defined as the hydrogen bonds required for systems with two particular register shifts, respectively. Graphical explanations of HB1, HB2, and HB3 are given in S20 Fig. The symbol “b” means that the bond must be formed, “n” means that it must not be formed, and “-” means that the hydrogen bond is ignored because the given system does not include such a donor-acceptor pair. HB1 is “n” in systems III, VI, and IX because, if HB1 were formed, the N-terminus of S2 would be extended by one residue, and the structure would no longer have the register shift required for these systems.
Fig 5.
The results of exhaustive conformational sampling and identification of structures satisfying the conditions.
(A) Frequency of structures that satisfy the three conditions among the exhaustively generated structures for systems I to IX. (B) and (C) Distributions of the two computed register shifts. The vertical-axis values in (B) and (C) were obtained by marginalizing the results in (A).
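The marginalization mentioned in this caption can be sketched as follows. The mapping from systems I to IX to register-shift pairs is taken from the Fig 4 caption; the function name `marginalize`, the generic labels "first" and "second" for the two components of each pair, and the counts used in the example are hypothetical and only illustrate the arithmetic.

```python
from collections import defaultdict

# Register-shift pair of each system, as listed in the Fig 4 caption.
SYSTEM_SHIFTS = {
    "I": (-1, -1), "II": (0, -1), "III": (1, -1),
    "IV": (-1, 0), "V": (0, 0), "VI": (1, 0),
    "VII": (-1, 1), "VIII": (0, 1), "IX": (1, 1),
}


def marginalize(joint):
    """Sum a per-system quantity over each register-shift component.

    joint maps a system name ("I" to "IX") to a count or frequency.
    Returns two dicts: totals keyed by the first component of the
    register-shift pair, and totals keyed by the second component.
    """
    first = defaultdict(int)
    second = defaultdict(int)
    for system, value in joint.items():
        shift_1, shift_2 = SYSTEM_SHIFTS[system]
        first[shift_1] += value
        second[shift_2] += value
    return dict(first), dict(second)


# Hypothetical counts, not values from the paper.
example_joint = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5,
                 "VI": 6, "VII": 7, "VIII": 8, "IX": 9}
by_first, by_second = marginalize(example_joint)
```

With these made-up counts, `by_first` sums systems that share the first component (for example I, IV, and VII for −1), and `by_second` sums systems that share the second component (for example I, II, and III for −1).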
Fig 6.
Frequency of structures that satisfied condition (a); (b); (c); (a) and (b); (b) and (c); (c) and (a); and all three conditions, among the 1,586,681 exhaustively generated structures of systems I to IX.
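The tally behind this kind of plot can be sketched as below. It is an illustration under stated assumptions: each structure is represented by three booleans for conditions (a), (b), and (c); a structure counts toward a combination when it satisfies at least the conditions named in it, regardless of the others; and the function name `combination_frequencies` and the example data are hypothetical.

```python
# The seven combinations listed in the caption, in the same order.
COMBINATIONS = [
    ("a",), ("b",), ("c",),
    ("a", "b"), ("b", "c"), ("c", "a"),
    ("a", "b", "c"),
]


def combination_frequencies(structures):
    """Return the fraction of structures satisfying each combination.

    structures is a list of dicts mapping "a", "b", "c" to booleans.
    """
    if not structures:
        raise ValueError("at least one structure is required")
    total = len(structures)
    frequencies = {}
    for combination in COMBINATIONS:
        # Count structures that satisfy every condition in the combination.
        satisfied = sum(
            all(structure[name] for name in combination)
            for structure in structures
        )
        frequencies[combination] = satisfied / total
    return frequencies


# Four made-up structures, for illustration only.
example = [
    {"a": True, "b": True, "c": True},
    {"a": True, "b": False, "c": True},
    {"a": False, "b": False, "c": False},
    {"a": True, "b": True, "c": False},
]
freqs = combination_frequencies(example)
```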
Fig 7.
Frequency of structures with inter-atomic collisions between a given pair of secondary structure elements for each system (I to IX).
The collision probability between the α-helix and S2 is not shown, because structures with such inter-atomic collisions were precluded before the implantation of the αβ-unit into the three-stranded β-sheet.
Fig 8.
Disfavored structures of a GB loop with various three-stranded β-sheets identified by the simulation.
The structures are categorized according to their register shifts. (A) Conformation of system V exhibiting an inter-atomic collision between the helix and S2’. This collision was one of the dominant factors reducing the number of structures that satisfy condition (b), not only in system V but also in certain other systems. (B)–(D) Disfavored conformations of system VIII. (B) Conformation with an inter-atomic collision between S1 and the GB loop and (C) one with a collision between the helix and S2’. (D) Unsatisfied polar groups and how frequently each was unsatisfied. (E) Conformation in which the most frequently unsatisfied hydrogen bond donor did not meet the hydrogen bond satisfaction condition. (F)–(I) Disfavored conformations of system IX. (F) Frequency of structures that satisfied the various conditions imposed on the structures corresponding to the survivors of system VIII. (G) Polar groups that violate condition (a) or (c) and their frequencies. (H) Conformation that did not satisfy condition (a). (I) Conformation in which the most frequently unsatisfied hydrogen bond donor did not meet condition (c). In (A), (B), (C), (E), and (I), protein atoms and virtual water molecules that collide with other atoms are depicted as transparent spheres colored in CPK and purple, respectively.
Fig 9.
The degree of agreement between the blueprints and the NMR structures for the five de novo-designed proteins reported in Ref. [3].
Left to right: the blueprints, the target structures, the blueprints colored according to their consistency with the NMR structures, the NMR structures, and the frequency of hydrogen bond formation in the NMR structures. In the rightmost column, hydrogen bonds with a formation probability of 50% or less are indicated by gray bars, and the others are indicated by orange bars. Correspondingly, in the third column from the left, hydrogen bonds with a formation probability of 50% or less are indicated by gray lines, the other hydrogen bonds are indicated by orange lines, and residues that were designated as β-strands in the blueprint but form a β-strand with a probability of 50% or less are colored gray. (A) Fold-I, (B) Fold-II, (C) Fold-III, (D) Fold-IV, and (E) Fold-V.
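The gray/orange coloring rule in this caption amounts to a threshold on the fraction of NMR models in which each blueprint hydrogen bond is formed. The sketch below assumes that hydrogen bonds have already been detected in each model by some external method and are given as sets of identifiers; the function name `classify_hbonds`, the identifiers, and the example ensemble are hypothetical.

```python
def classify_hbonds(blueprint_bonds, models, threshold=0.5):
    """Color each blueprint hydrogen bond by its formation frequency.

    blueprint_bonds : iterable of hydrogen-bond identifiers expected
                      from the blueprint.
    models          : list of sets, one per NMR model, holding the
                      identifiers of bonds detected in that model.
    Returns a dict mapping each bond to (frequency, color), where the
    color is "gray" for a frequency of 50% or less and "orange"
    otherwise, following the caption.
    """
    if not models:
        raise ValueError("at least one NMR model is required")
    result = {}
    for bond in blueprint_bonds:
        frequency = sum(bond in model for model in models) / len(models)
        color = "orange" if frequency > threshold else "gray"
        result[bond] = (frequency, color)
    return result


# Made-up ensemble of four models, for illustration only.
ensemble = [{"hb1", "hb2"}, {"hb1"}, {"hb1"}, {"hb1", "hb2"}]
colors = classify_hbonds(["hb1", "hb2", "hb3"], ensemble)
```

Note that a bond formed in exactly half of the models is colored gray, because the caption states "50% or less".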