Quantitative Protein Localization Signatures Reveal an Association between Spatial and Functional Divergences of Proteins
(A) Schematic showing the major components of Protein Localization Analysis and Search Tools (PLAST). (B) Example images of GFP-tagged Saccharomyces cerevisiae strains from the UCSF dataset. The intensity of each image has been scaled to the same range. (C) Multidimensional scaling plot based on the dissimilarity scores (d_p) among all the P-profiles_SVM constructed for the UCSF dataset. ORFs manually assigned to the “nucleus”, “cytoplasm”, or “mitochondrion” categories by UCSF are shown as purple, red, or green dots, respectively. (D) Multidimensional scaling plot of 20 representative protein localization patterns, or “exemplars” (dots), identified using an affinity-propagation clustering algorithm. The radius of the circle around each dot is proportional to the number of ORFs assigned to the exemplar. Each exemplar is colored and named according to the most enriched UCSF category among its assigned ORFs (Supplementary Fig. S4A). The exemplars of MC2 (Cox8), CP3 (Rbg1), and NC3 (Hda2) are shown in B. (E) Comparison of the performance of P-profiles with that of quantitative features extracted using two previous analysis frameworks (“Chen07” and “Huh09”) in classifying ORFs according to UCSF categories. The accuracies shown were estimated using a multi-class SVM classifier with 5-fold cross-validation and averaged over all UCSF categories.
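
The evaluation protocol in (E) can be illustrated with a minimal sketch. It assumes that "averaged over all UCSF categories" means that accuracy is computed per category over the held-out folds and the per-category values are then averaged. To stay dependency-free, a nearest-centroid classifier stands in for the multi-class SVM used in the figure, and the feature vectors are synthetic. All function names and data below are hypothetical and are not part of PLAST.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, spreading each category evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    folds = [[] for _ in range(k)]
    for lab in sorted(by_class):
        idxs = by_class[lab]
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds


def fit_centroids(X, y):
    """Stand-in classifier (NOT an SVM): one mean feature vector per category."""
    grouped = defaultdict(list)
    for vec, lab in zip(X, y):
        grouped[lab].append(vec)
    return {lab: [sum(col) / len(vecs) for col in zip(*vecs)]
            for lab, vecs in grouped.items()}


def predict(centroids, vec):
    """Return the category whose centroid is closest in squared Euclidean distance."""
    def sq_dist(lab):
        return sum((a - b) ** 2 for a, b in zip(vec, centroids[lab]))
    return min(centroids, key=sq_dist)


def category_averaged_accuracy(X, y, k=5, seed=0):
    """k-fold cross-validation; returns (mean of per-category accuracies, per-category dict)."""
    correct, total = defaultdict(int), defaultdict(int)
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        centroids = fit_centroids([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            total[y[i]] += 1
            if predict(centroids, X[i]) == y[i]:
                correct[y[i]] += 1
    per_class = {lab: correct[lab] / total[lab] for lab in total}
    return sum(per_class.values()) / len(per_class), per_class


# Synthetic, well-separated 2-D "features" for three categories (illustrative only).
rng = random.Random(1)
centers = {"nucleus": (0.0, 0.0), "cytoplasm": (10.0, 0.0), "mitochondrion": (0.0, 10.0)}
X, y = [], []
for lab, (cx, cy) in centers.items():
    for _ in range(10):
        X.append([cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)])
        y.append(lab)

mean_acc, per_class = category_averaged_accuracy(X, y, k=5)
print(mean_acc, per_class)
```

Averaging per category rather than per ORF keeps large categories from dominating the reported accuracy, which matters when category sizes are uneven. On real features, the stand-in classifier would be replaced by a multi-class SVM.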