The Origins of Specificity in Polyketide Synthase Protein Interactions
Figure 2
Docking Domain Compatibility Classes
(A) Docking domains are clustered according to sequence similarity (Text S1, Section 2). Each node represents a particular head (left) or tail (right) domain; two domains are connected by a line if their BLAST e-value is less than a defined cutoff (2.0e-4 for heads, and 1.0e-4 for tails). Head and tail domains each independently assort into three phylogenetic clusters, labeled H1, H2, H3, and T1, T2, T3, respectively. For the moment, cluster coloring is arbitrary.
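The graph construction described in (A) can be illustrated with a minimal sketch. It assumes that clusters are read off as connected components of the thresholded graph; the actual clustering procedure is given in Text S1, Section 2, and may differ. The domain names and e-values below are hypothetical placeholders, not values from the dataset; only the head cutoff of 2.0e-4 comes from the caption.

```python
from collections import defaultdict


def cluster_by_evalue(evalues, cutoff):
    """Return connected components of the graph in which two domains are
    joined whenever their pairwise e-value is below `cutoff`.

    `evalues` maps an unordered pair (a, b) of domain names to an e-value.
    Taking connected components is an assumption made for illustration.
    """
    graph = defaultdict(set)
    for (a, b), e in evalues.items():
        # Touch both keys so that unconnected domains still appear as nodes.
        graph[a]
        graph[b]
        if e < cutoff:
            graph[a].add(b)
            graph[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from `start`.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical pairwise e-values for five made-up head domains.
toy_evalues = {
    ("h_a", "h_b"): 1e-10,
    ("h_b", "h_c"): 5e-6,
    ("h_a", "h_c"): 0.3,
    ("h_c", "h_d"): 0.02,
    ("h_d", "h_e"): 1e-7,
}

HEAD_CUTOFF = 2.0e-4  # cutoff quoted in the caption for head domains
head_clusters = cluster_by_evalue(toy_evalues, HEAD_CUTOFF)
print(head_clusters)
```

Note that h_a and h_c fall in the same cluster even though their direct e-value exceeds the cutoff, because both are linked to h_b. This transitivity is a property of the connected-component assumption, not necessarily of the published method.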
(B) Examples of PKS multiprotein chains. Each row shows a different PKS pathway, with names as defined in Dataset S1. Proteins are represented as chevrons, with C-terminal head domains (pointed) and N-terminal tail domains (notched) now colored according to their phylogenetic group, as defined in Figure 2A. The pathway termini, as well as domains that could not be clustered, are colored grey. Note that interactions predominantly occur between docking domains of the same color. There are only two exceptions to this rule, one of which is shown in the nanch multiprotein chain. (The other, in the nidda multiprotein chain, involves a domain that lies at the boundary of its parent cluster, indicating that it has probably been misclassified by our clustering algorithm; see Dataset S2.) Actinobacterial pathways tend to use domain pairs of type H1–T1 and H2–T2, while myxobacterial and cyanobacterial pathways use domain pairs of type H3–T3 alone. (Again, the few exceptions to this rule, such as in the epoth multiprotein chain, are likely due to misclassification.)
(C) Phylogenetic clusters coincide with docking domain compatibility classes. Each PKS pathway in our dataset gives us a list of known interactors, as well as a list of known noninteractors. For example, the amphotericin pathway contains five internal head domains and five internal tail domains; of the 25 possible head–tail pairings of these domains, five represent interactors (three H1–T1 and two H2–T2), while the remaining 20 represent noninteractors (six H1–T1, six H1–T2, six H2–T1, and two H2–T2). We tallied such interaction and noninteraction information over the 42 PKS pathways in our dataset, and summarized our results in a single table. Each row corresponds to a head cluster, and each column to a tail cluster. The top-left entry of every cell reports the number of head–tail pairs of the given variety known to be interactors; the bottom-right entry reports the number of head–tail pairs known to be noninteractors. For example, we know one interactor and 73 noninteractors of the H1–T2 variety. The correspondence between the head and tail clusters is clear: we find large numbers of interactors within compatible clusters (on-diagonal, highlighted in orange) and large numbers of noninteractors between mismatched clusters (off-diagonal, highlighted in purple). This defines a one-to-one pairing of head and tail clusters into three compatibility classes, H1–T1 (green), H2–T2 (red), and H3–T3 (blue), and justifies the common coloring used in Figure 2A. This division into compatibility classes is a useful predictive tool for actinobacterial pathways, since they tend to contain docking domains of multiple varieties. For example, since the ampho pathway has three H1–T1 domain pairs and two H2–T2 domain pairs, only 12 of the 120 possible ways of pairing them are compatible (2!3! out of 5!). Of the 33 actinobacterial pathways in our dataset, 19 contain both H1–T1 and H2–T2 varieties (Dataset S2). For these mixed pathways, on average fewer than a third of all possible ways of pairing their domains are compatible.
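The amphotericin arithmetic in (C) can be checked with a short sketch. It assumes a simplified representation in which heads[i] is the known partner of tails[i]; the ordering of the domains below is hypothetical, and only the class counts (three H1–T1 pairs and two H2–T2 pairs) are taken from the caption. The function names are illustrative and do not come from the original analysis.

```python
from collections import Counter
from math import factorial

# Cluster labels of the five internal heads and tails of the ampho pathway.
# Assumption: heads[i] interacts with tails[i]; the order itself is made up.
heads = ["H1", "H1", "H1", "H2", "H2"]
tails = ["T1", "T1", "T1", "T2", "T2"]


def tally_pairs(heads, tails):
    """Count interactors (i == j) and noninteractors (i != j) per class pair."""
    interactors = Counter()
    noninteractors = Counter()
    for i, head in enumerate(heads):
        for j, tail in enumerate(tails):
            if i == j:
                interactors[(head, tail)] += 1
            else:
                noninteractors[(head, tail)] += 1
    return interactors, noninteractors


def count_compatible_pairings(heads):
    """Number of one-to-one pairings in which every head meets a tail of its
    own class, assuming each class has as many tails as heads."""
    total = 1
    for class_size in Counter(heads).values():
        total *= factorial(class_size)
    return total


interactors, noninteractors = tally_pairs(heads, tails)
compatible = count_compatible_pairings(heads)   # 3! * 2!
all_pairings = factorial(len(heads))            # 5!

print(dict(interactors))
print(dict(noninteractors))
print(compatible, "of", all_pairings, "pairings are compatible")
```

Summing such per-pathway counters over all pathways would give the kind of table described in (C), with interactor counts concentrated in the H1–T1, H2–T2, and H3–T3 cells.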