Figure 1.

A Hierarchy of Protein Complexes of Known Three-Dimensional Structure

The hierarchy has 12 levels, namely, from top to bottom: QS topology, QS family, QS, QS20, QS30, …, QS100. At the top of the hierarchy, there are 192 QS topologies. One particular QS topology (orange circle) with four subunits is expanded below. It comprises 161 QS families in total, of which two are detailed: the E. coli lyase and the H. sapiens hemoglobin γ4. All complexes in the E. coli lyase QS family are encoded by a single gene and therefore correspond to a single QS. However, the hemoglobin QS family contains two QSs: one with a single gene, the hemoglobin γ4, and one with two genes, the hemoglobin α2β2 from H. sapiens. The last level in the hierarchy indicates the number of structures found in the complete set (PDB). There are 30 redundant complexes corresponding to the lyase QS, four corresponding to the hemoglobin γ4 QS, and 80 to the hemoglobin α2β2 QS. We also see that there are 9,978 monomers, 6,803 dimers, 814 triangular trimers, etc. Note that there are intermediate levels using sequence identity thresholds (fourth to twelfth level) between the QS level and the complete set, which are not shown in detail here.

Figure 2.

Representing Protein Complexes as Graphs

(A) Each protein complex is transformed into a graph where nodes represent polypeptide chains and edges represent biological interfaces between the chains.

(B) All complexes are compared with each other using a customized graph-matching procedure. Complexes with the same graph topology are grouped to form the top level of the hierarchy, as shown by the green boxes. If, in addition, the subunit structures are related by their SCOP domain architectures, they are grouped at the second level, shown by the red boxes. Structures were rendered with VMD [51].
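
The grouping by graph topology described in (B) can be illustrated with a minimal sketch. The caption does not detail the customized graph-matching procedure, so the code below is not that procedure: it is a plain brute-force isomorphism test over chain relabelings, practical only for small complexes, and it ignores SCOP domain architectures. The names `make_graph` and `same_topology` and the toy complexes are hypothetical.

```python
from itertools import permutations


def make_graph(n_chains, interfaces):
    """Build an undirected graph: nodes are chains 0..n_chains-1,
    edges are chain pairs that share an interface."""
    return n_chains, {frozenset(pair) for pair in interfaces}


def same_topology(g1, g2):
    """Return True if the two graphs are isomorphic.

    Assumption: brute force over all relabelings of the chains,
    which is only reasonable for complexes with few subunits.
    """
    n1, edges1 = g1
    n2, edges2 = g2
    if n1 != n2 or len(edges1) != len(edges2):
        return False
    for perm in permutations(range(n1)):
        relabelled = {frozenset(perm[i] for i in edge) for edge in edges1}
        if relabelled == edges2:
            return True
    return False


# Hypothetical tetramers: two rings with different chain labels,
# a linear chain, and a star with one central subunit.
ring = make_graph(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
ring_relabelled = make_graph(4, [(0, 2), (2, 1), (1, 3), (3, 0)])
chain = make_graph(4, [(0, 1), (1, 2), (2, 3)])
star = make_graph(4, [(0, 1), (0, 2), (0, 3)])

print(same_topology(ring, ring_relabelled))  # True
print(same_topology(chain, star))            # False
```

In this sketch the two rings would fall into the same top-level group, whereas the chain and the star, despite having the same numbers of subunits and interfaces, would not.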

Table 1.

Criteria for Comparison and Classification of Protein Complexes

Figure 3.

Examples of Quaternary Structure Topologies

(A) All QSTs for complexes with up to nine subunits are shown, accounting for more than 96% of the nonredundant set of QSs and more than 98% of all complexes in PDB. Topologies compatible with a symmetrical complex are annotated with an s, and topologies where all subunits have the same number of interfaces (edges) are annotated by a star (*).

(B) Examples of large complexes that are the single representatives of their respective topologies (QSTs). PDB codes are given. 1pf9, E. coli GroEL-GroES-ADP; 1eaf, synthetic construct, pyruvate dehydrogenase; 1shs, Methanococcus jannaschii small heat shock protein; 1b5s, Bacillus stearothermophilus dihydrolipoyl transacetylase; 1j2q, Archaeoglobus fulgidus 20S proteasome alpha ring. Note that the graph layouts resemble the spatial arrangements of the subunits.

(C) Likely errors in the PDB Biological Units: QSTs of homomers with different numbers of contacts amongst the subunits. The number of erroneous QSs in each topology is provided above each graph.
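
The star annotation in (A) and the criterion in (C) both depend on whether every subunit has the same number of interfaces. A minimal sketch of that check follows, assuming a complex is given as a chain count plus a list of interacting chain pairs. The function names and the toy trimers are hypothetical, and the check is a simplified reading of the caption rather than the authors' implementation.

```python
from collections import Counter


def interface_counts(n_chains, interfaces):
    """Number of interfaces (graph degree) for each chain."""
    counts = Counter()
    for a, b in interfaces:
        counts[a] += 1
        counts[b] += 1
    return [counts[i] for i in range(n_chains)]


def all_subunits_equivalent(n_chains, interfaces):
    """True if every chain has the same number of interfaces,
    i.e. the topology would carry the star annotation."""
    return len(set(interface_counts(n_chains, interfaces))) <= 1


# Hypothetical homotrimers: a closed ring and an open linear arrangement.
ring_trimer = [(0, 1), (1, 2), (2, 0)]
linear_trimer = [(0, 1), (1, 2)]

print(interface_counts(3, ring_trimer))           # [2, 2, 2]
print(all_subunits_equivalent(3, ring_trimer))    # True
print(interface_counts(3, linear_trimer))         # [1, 2, 1]
print(all_subunits_equivalent(3, linear_trimer))  # False
```

Under the criterion in (C), a homomer like the linear trimer, whose subunits have different numbers of contacts, would be a candidate for a likely Biological Unit error and worth manual inspection.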

Figure 4.

Distribution of Protein Complex Size in the Hierarchy

Histogram of the number of subunits per protein complex. Smaller complexes are more abundant than larger complexes, and complexes with even numbers of subunits tend to be more abundant than complexes with odd numbers of subunits, at both levels of the hierarchy.

Table 2.

Twelve Largest Quaternary Structures with Two or More Subunits

Figure 5.

Redundancy in the Protein Data Bank at Several Levels of Sequence Similarity

(A) The number of structures at each level of the 3D Complex database, from 192 QSTs to the total number of structures in the PDB (21,037). The tick marks on the line below the graph indicate the consecutive pairs of levels that are plotted in (B–E).

(B) Number of QS30 per QS. Note that QS families are almost identical to QSs. The first bar in the histogram shows that about 2,500 QSs each correspond to one QS30; the second bar represents 250 QSs that each correspond to two QS30.

(C) Number of QS90 per QS30.

(D) Number of QS100 per QS90.

(E) Number of complexes in the complete set per QS100.

All distributions display scale-free behaviour, in the sense that a large proportion of groups are identical at any two consecutive levels, whereas a small number are very redundant. Adding symmetry information does not change this trend, as shown in Table 1.
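
The histograms in (B–E) count, for each group at one level, how many groups it contains at the next level down. A minimal sketch of that tally follows, assuming each lower-level group is mapped to its parent group. The function name and the identifiers in `toy_mapping` are hypothetical and are not taken from the database.

```python
from collections import Counter


def children_per_parent_histogram(child_to_parent):
    """Map k -> number of parent groups that contain exactly k children."""
    children_per_parent = Counter(child_to_parent.values())
    histogram = Counter(children_per_parent.values())
    return dict(sorted(histogram.items()))


# Hypothetical mapping from lower-level groups (e.g. QS30) to parents (e.g. QS).
toy_mapping = {
    "a1": "A",
    "b1": "B", "b2": "B",
    "c1": "C",
    "d1": "D", "d2": "D", "d3": "D",
}

print(children_per_parent_histogram(toy_mapping))  # {1: 2, 2: 1, 3: 1}
```

In the toy data, two parents have a single child and are therefore identical at both levels, which corresponds to the first bar of each histogram.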

Figure 6.

Cyclic and Dihedral Symmetries

(C2) Cyclic symmetry: two subunits are related by a single 2-fold axis, shown by a dashed line. An ellipse at the end of the symmetry axis marks a 2-fold axis. Nearly all homodimers have C2 symmetry. C2 symmetry is termed “2” in the crystallographic Hermann-Mauguin nomenclature, shown in red beneath C2.

(C4) Cyclic symmetry: four subunits are related by one 4-fold axis. A square at the end of the symmetry axis marks a 4-fold axis.

(D2) Dihedral symmetry: four subunits are related by three 2-fold axes. D2 symmetry can be constructed from two C2 dimers. Note the difference between the D2 and C4 symmetries: two symmetry types that both have four subunits.

(D4) Dihedral symmetry: eight subunits are related to each other by one 4-fold axis and two 2-fold axes. Note that D4 symmetry can be constructed by stacking two C4 tetramers as shown, or four C2 dimers (not shown).

Figure 7.

The Size of Homomeric Complexes in the Protein Data Bank and in SwissProt

The histogram shows the relative abundances of monomers and homo-oligomers of different sizes in the PDB and in SwissProt. Two PDB sets are shown: the complete set and the nonredundant set of QSs. Three SwissProt sets are shown: the complete SwissProt and the Human and E. coli subsets. The trend in all the sets is similar and highlights the importance of the mechanism of self-assembly, which is linked to many functional possibilities discussed in the text. The oligomeric state of proteins in SwissProt was extracted from the subunit annotation field, and annotations inferred by similarity were not considered.
