Figure 1.
The protein interaction network of T. pallidum.
A. High-confidence protein interaction network (TPA HCI 0.5) comprising 576 proteins and 991 interactions. Nodes are color-coded according to TIGR main roles. Links are color-coded by their logistic regression score (shown as a spectral scale). Proteins involved in DNA metabolism (Figure 4) are shown as enlarged red circles; note their distributed topology. See Table S1 for all interactions and scores. B. Comparison of the approximated power-law degree distributions of the T. pallidum networks. Node degrees k and their relative frequencies P(k) are plotted on a log-log scale and fitted by linear regression. “TPA all” is the complete T. pallidum network; “TPA 50” and “TPA HCI” are sub-networks filtered by “preycount” and by logistic regression score, respectively. The inset shows the node degree distribution of the high-confidence T. pallidum network (TPA HCI 0.5) on a linear scale.
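The log-log fit described for panel B can be illustrated with a minimal sketch. It assumes the network is available as a plain list of interacting protein pairs. The function names and the toy edge list are hypothetical, and the ordinary least-squares fit is one plausible reading of "fitted by linear regression", not necessarily the authors' exact procedure.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Return {k: P(k)} for an undirected edge list of (a, b) pairs."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        if b != a:
            degree[b] += 1  # assumption: a self-interaction counts once
    counts = Counter(degree.values())
    n_nodes = len(degree)
    return {k: c / n_nodes for k, c in sorted(counts.items())}


def fit_power_law(pk):
    """Least-squares line through (log10 k, log10 P(k)).

    Returns (slope, intercept); for P(k) ~ k^(-gamma) the slope is -gamma.
    Needs at least two distinct degrees.
    """
    xs = [math.log10(k) for k in pk]
    ys = [math.log10(p) for p in pk.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy example (hypothetical identifiers, not data from the paper).
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]
toy_pk = degree_distribution(toy_edges)
print(toy_pk, fit_power_law(toy_pk))
```

A fit on the raw P(k) values is sensitive to sparsely populated high-degree bins, so the exponent from such a sketch should be read as approximate.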
Table 1.
Topological properties of presented interaction networks.
Figure 2.
Genomic locations linked by protein interactions.
A, B. Certain genomic locations are especially tightly linked via protein interactions when compared to randomized networks. Genomic location links are visualized for the “TPA 50” protein interaction dataset (A) and for bioinformatic associations from the String database (B, “StringDB 700”, protein links with combined score > 0.7) [18]. Grey lines indicate all individual protein interactions/associations connecting genes on the circular chromosome of T. pallidum (1.14 Mbp total size). Tightly connected clusters comprising 5 or more neighbouring genes (thick lines) were identified by a computational method based on comparison to re-wired versions of the network (see methods). The number of linking interactions between two clusters is indicated by the color of their connecting line, and the enrichment compared to randomly re-wired networks is indicated by a Z-value (in the outer circle at the positions of the clusters). Because the String database incorporates genomic neighbourhood links (and for clarity), self-links between genomic locations are removed in the “StringDB 700” representation. C. As an example, the region flanking FliS (TP0943) is connected to the region of TP0046–TP0048, linking motility and sugar metabolism (TP0943–TP0946) to a cluster of uncharacterized proteins around TP0047, which appears to be involved in motility as well [14].
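The Z-value mentioned above compares an observed number of links between two gene clusters with the counts obtained from re-wired networks. The following is a minimal sketch of that idea only. It assumes degree-preserving edge swaps as the re-wiring scheme and a simple undirected network without duplicate edges. The function names, parameters, and toy data are hypothetical, and the authors' actual method (see methods) may differ.

```python
import random
import statistics


def count_links(edges, cluster_a, cluster_b):
    """Count edges with one end in cluster_a and the other in cluster_b."""
    return sum(
        1 for u, v in edges
        if (u in cluster_a and v in cluster_b) or (u in cluster_b and v in cluster_a)
    )


def rewire(edges, n_swaps, rng):
    """Degree-preserving re-wiring: swap partners of two randomly chosen edges."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        # Skip swaps that would create self-links or duplicate edges.
        if len(new1) < 2 or len(new2) < 2 or new1 == new2:
            continue
        if new1 in present or new2 in present:
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def link_z_score(edges, cluster_a, cluster_b, n_random=200, seed=0):
    """Z = (observed - mean of re-wired counts) / std of re-wired counts."""
    rng = random.Random(seed)
    observed = count_links(edges, cluster_a, cluster_b)
    null = [
        count_links(rewire(edges, 10 * len(edges), rng), cluster_a, cluster_b)
        for _ in range(n_random)
    ]
    sd = statistics.pstdev(null)
    if sd == 0:
        return float("nan")  # no variation among the re-wired networks
    return (observed - statistics.mean(null)) / sd


# Toy example (hypothetical gene names): two clusters linked by three interactions.
toy_edges = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
toy_edges += [(f"x{i}", f"y{i}") for i in range(10)]
print(link_z_score(toy_edges, {"a1", "a2", "a3"}, {"b1", "b2", "b3"}))
```

Edge swapping keeps every protein's number of interaction partners fixed, so a high Z-value reflects which partners are connected rather than how many partners each protein has.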
Figure 3.
Interactions between functional classes in pro- and eukaryotes.
Connections between functional classes mediated by protein interactions in Y2H datasets (TPA HCI = T. pallidum high-confidence interactions, CJE HCI = Campylobacter jejuni high-confidence interactions) and in two comprehensive coAP/MS datasets from E. coli [5] and yeast [3]. For each dataset and each class combination, a functional class association index (fCAI) was calculated (see methods), which scores the interaction density between two functional classes in a dataset of given size and class coverage. The matrices show the significance of each enriched functional class link (see color key). Results obtained from genome-wide Y2H (top) or coAP/MS (bottom) experiments are compared between bacteria and yeast (see color key).
Figure 4.
An expanded view on DNA metabolism.
A. The DNA metabolism network for T. pallidum based on the integration of several experimental and bioinformatic datasets (see methods). T. pallidum proteins with a DNA metabolism-related function (red nodes) are linked by interactions from several high-confidence protein interaction datasets. The color of the interactions indicates their source (see color key), e.g., all blue interactions were identified in our study (i.e. in T. pallidum) and are part of the high-confidence interaction dataset for T. pallidum (for a detailed list see Table S3b). Proteins of other functional classes are included when their association is supported by at least one additional line of evidence. Grey lines indicate support of an interaction by bioinformatic predictions (String database score > 0.4). Proteins with orange borders have been shown to localize to the nucleoid. Proteins with a hexagonal shape have a tight bioinformatic link to a DNA metabolism protein (String database score > 0.8). Proteins that are discussed in the text are shown in larger, blue font. B. Co-immunoprecipitation (coIP) experiments for a number of selected DNA metabolism interactions are shown (thick lines in network). The coIP is conducted with an anti-Myc antibody. For each coIP, the total input and the fractions after coIP are analyzed by Western blot, probing with an anti-HA and an anti-Myc antibody as indicated on the left of each blot. The empty Myc-tag vector “M-” is used as a control for unspecific binding of the HA-tagged protein. HA-tagged proteins are labeled with “H” and their gene name or gene number, e.g. “H4” in the first coIP corresponds to HA-tagged protein TP0004. Accordingly, Myc-tagged proteins are labeled with “M”, e.g. M-gyrB corresponds to Myc-tagged GyrB protein.
Table 2.
Novel functional assignments based on protein network and additional evidence.
Figure 5.
Interacting clusters of orthologous groups (“iCOG”) show phylogenetically conserved interaction patterns.
Each row of the shown profile corresponds to a species and each column corresponds to a pair of interacting protein families (i.e. iCOG), for which an interaction was found in the high-confidence T. pallidum data set. The protein families were defined based on the “cluster of orthologous genes” approach (COG) (see methods). With this, the profile shows for each interaction of the T. pallidum data set whether both interacting proteins, only one interacting protein or none of the interacting proteins are conserved in a given species (given row). For each species from the shown taxonomy (y-axis) and each iCOG, a conservation value is shown in the matrix. This conservation value indicates whether both COGs are conserved/absent in a given species or whether only one or the other COG is conserved (see left upper corner for color key). Overall, three distinct conservation regions are visible in the clustered matrix: #1, #2, and region #3-#6, which we subdivided somewhat arbitrarily into individual clusters #3-#6 with increasing conservation from left to right (note branches on tree above). This figure is also available as zoomable Figure S1 in PDF format in which individual species names and iCOGs can be seen.
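The conservation value described above can be illustrated with a minimal sketch. It assumes each species is represented by the set of COGs found in its genome. The function name, the four category labels, and the COG and species identifiers are hypothetical placeholders, not the encoding or data used for the figure.

```python
def conservation_profile(icogs, species_cogs):
    """Build a species-by-iCOG profile.

    icogs: list of (cog_a, cog_b) pairs, one per interacting COG pair.
    species_cogs: {species name: set of COGs present in that species}.
    Returns {species name: [category per iCOG]}, where each category is
    "both", "first only", "second only", or "none".
    """
    profile = {}
    for species, cogs in species_cogs.items():
        row = []
        for cog_a, cog_b in icogs:
            has_a, has_b = cog_a in cogs, cog_b in cogs
            if has_a and has_b:
                row.append("both")
            elif has_a:
                row.append("first only")
            elif has_b:
                row.append("second only")
            else:
                row.append("none")
        profile[species] = row
    return profile


# Toy example with placeholder identifiers (not real COG assignments).
toy_icogs = [("COG_A", "COG_B"), ("COG_A", "COG_C")]
toy_species = {
    "species_1": {"COG_A", "COG_B"},
    "species_2": {"COG_C"},
}
print(conservation_profile(toy_icogs, toy_species))
```

Clustering the columns of such a profile would then group iCOGs with similar conservation patterns, as in the regions #1 to #6 described in the caption.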
Figure 6.
Prediction of interactions for other species based on T. pallidum high-confidence data sets.
Species (y-axis) are ordered according to taxonomy (broad groups are indicated) and the number of predicted interactions for each species based on two confidence score cut-offs (HCI 0.5 with score>0.5 and HCI 0.7 with score>0.7) is shown.