Local Function Conservation in Sequence and Structure Space
Figure 4
(A) Using TM to identify the nearest neighbor of the sample query protein 1ve3 yields protein domain d1vlma. For d1vlma the TM scores were pre-computed, resulting in the neighborhood illustrated here with Kruskal's non-metric multidimensional scaling [44] (where similar protein structures are depicted close together). Domain d1vlma has several molecular functions attached; for this illustration we selected GO:0008757 (S-adenosylmethionine-dependent methyltransferase activity). Protein domains having this function are colored yellow; domains not annotated with this function are colored grey. (B) TM scores with respect to d1vlma are sorted along the x-axis. Protein domains annotated with molecular function GO:0008757 are assigned a y coordinate of 1 (drawn in yellow); domains not annotated with this function are assigned a y coordinate of 0 (drawn in grey). Unlabeled domains are from the 200 nearest neighbors of d1vlma. A logistic curve is fit through these points (drawn in orange). Evaluating the logistic curve at a given TM score yields the raw function conservation score for that score.
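The fit in panel (B) can be sketched as a one-dimensional logistic regression of the 0/1 annotation labels on the TM scores. The following is a minimal illustration only: the caption does not state how the curve was fit, so the gradient-descent fitting, the small L2 penalty, the function names (`fit_logistic`, `conservation_score`), and the TM scores and labels are all assumptions made for this example, not data or code from the study.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(tm_scores, labels, lr=0.5, steps=20000, l2=1e-3):
    """Fit P(label = 1 | tm) = sigmoid(a * tm + b) by gradient descent.

    The L2 penalty on the slope is an assumption of this sketch; it keeps
    the slope finite if the labels happen to be perfectly separable.
    """
    a, b = 0.0, 0.0
    n = len(tm_scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(tm_scores, labels):
            err = sigmoid(a * x + b) - y  # gradient of the log-loss
            grad_a += err * x
            grad_b += err
        a -= lr * (grad_a / n + l2 * a)
        b -= lr * (grad_b / n)
    return a, b

def conservation_score(tm, a, b):
    """Evaluate the fitted curve at a TM score (raw conservation score)."""
    return sigmoid(a * tm + b)

# Hypothetical neighborhood of a query domain: TM scores to the query and
# whether each neighbor carries the function of interest (1) or not (0).
tm = [0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.70, 0.80]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

a, b = fit_logistic(tm, labels)
low = conservation_score(0.30, a, b)
high = conservation_score(0.90, a, b)
print(f"slope={a:.2f} intercept={b:.2f} score(0.30)={low:.3f} score(0.90)={high:.3f}")
```

With labels that become more frequent at higher TM scores, the fitted slope is positive, so the raw conservation score increases with structural similarity to the query domain.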