Quantitative Protein Localization Signatures Reveal an Association between Spatial and Functional Divergences of Proteins
(A) An example of how PLAST assigns compartments to an ORF, YDR110W. Black curve = estimated probability distribution of the dp scores between the ORF and a catalog of 73 major subcellular compartments; dashed red vertical line = the local maximum of the distribution with the highest dp value; red curve = estimated “null” distribution of the dp scores between the ORF and non-specifically localized compartments; blue vertical line = threshold for compartments with dp significantly less than the null distribution at Bonferroni-adjusted P̃ < 2.5×10⁻⁴. The estimated mean and standard deviation of the null distribution are used to standardize the dp scores between the ORF and all compartments. (B) A subcellular localization map showing the standardized P-profile dissimilarity scores () between 4066 ORFs (x-axis) and the 73 major subcellular compartments (y-axis) in a budding yeast cell. The compartments (rows) were ordered using a hierarchical clustering algorithm with cosine dissimilarity scores and labeled with color codes according to their known functions or localizations (“common” compartments = compartments assigned to large numbers of ORFs). A fully annotated map is shown in Supplementary Fig. S7. (C) Using a Bonferroni-adjusted threshold of P̃ < 1.0×10⁻¹², we assigned compartments to every ORF. Among the 73 compartments, we found 22 whose known components and “non-components” assigned by PLAST share at least one common, significantly enriched GO biological process (P̃ < 0.05 with false-discovery-rate adjustment, hypergeometric test). Shown are the percentages of known components and non-components among all the ORFs assigned to these compartments by PLAST. The (up to three) common enriched GO biological processes for each compartment are also listed (pol. = polymerase, reg. = regulation, RNP = ribonucleoprotein).
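The standardization and thresholding described in panel (A) can be illustrated with a minimal Python sketch. It is not the PLAST implementation. It assumes that the null distribution is approximately normal, that the one-sided test looks for scores below the null, and that the Bonferroni factor equals the number of compartments tested. The function names (`standardize`, `lower_tail_p`, `assign_compartments`) are hypothetical, and the default threshold is simply the value quoted for panel (A).

```python
import math
from statistics import mean, stdev


def standardize(scores, null_scores):
    """Standardize scores with the mean and SD estimated from the null scores."""
    mu = mean(null_scores)
    sigma = stdev(null_scores)
    return [(s - mu) / sigma for s in scores]


def lower_tail_p(z):
    """One-sided P(Z <= z) for a standard normal (assumed null shape)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def assign_compartments(scores, null_scores, p_threshold=2.5e-4):
    """Return indices of compartments whose score is significantly below the null.

    Assumption: one test per compartment, so the Bonferroni-adjusted
    P value is the raw P value times len(scores), capped at 1.
    """
    n_tests = len(scores)
    assigned = []
    for i, z in enumerate(standardize(scores, null_scores)):
        adjusted_p = min(1.0, lower_tail_p(z) * n_tests)
        if adjusted_p < p_threshold:
            assigned.append(i)
    return assigned
```

Calling `assign_compartments(scores, null_scores)` with one dissimilarity score per compartment returns the positions of the compartments that pass the threshold. A stricter cutoff, such as the one quoted for panel (C), can be passed through `p_threshold`.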