Protein prediction models support widespread post-transcriptional regulation of protein abundance by interacting partners
Fig 6
Directed graphs of protein and transcript interrelationships identify candidate regulatory genes.
A-C. Examples of directed graphs constructed from genome-wide relationships of transcript-predicted proteins, containing members of A. the propionyl-CoA carboxylase complex; B. the cytochrome c oxidase, mitochondrial complex; C. the PI4K2A-WASH complex, the RICH1/AMOT polarity complex, and others. In each subgraph, orange nodes have outflow edges only (i.e., they are contributing transcripts in the prediction models). Blue nodes have at least one inflow edge (i.e., they represent proteins, and also transcripts if they have outflow edges). Orange edges represent positive coefficients of the transcripts for the target proteins in the elastic net models; gray edges represent negative coefficients. All edges are directed from transcript to protein, and edge widths are scaled by the coefficient weight. D. A highly connected subgraph of mitochondrial ribosome subunits containing 73 nodes and 834 edges. E. Persistent community detection and network representation of preferential node connections, showing a hierarchical relationship between the 28S and 39S subcomplexes and the assembled 55S mitochondrial ribosome. F. Network representation of hub nodes, defined as the top 15% of nodes ranked by betweenness centrality, suggesting a potential role for LACTB as a critical hub upstream of multiple large and small mitochondrial ribosomal subunit proteins. Node pie-chart colors correspond to the GO biological process terms listed in the table. SHAP values for three proteins (MRPL20, MRPL19, MRPS34) are shown, highlighting the top model contributors.
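The graph conventions described in this legend can be illustrated with a minimal sketch. It is not the authors' pipeline: the coefficient values are invented for illustration, the names `coefficients`, `build_edges`, and `color_nodes` are hypothetical, and the sketch assumes that zero-valued coefficients produce no edge and that edge width is the absolute coefficient. Only the gene names and the color rules come from the legend.

```python
# Hypothetical elastic net coefficients: target protein -> {contributing transcript: coefficient}.
# Gene names appear in the legend; the numeric values are invented for illustration only.
coefficients = {
    "MRPL20": {"LACTB": 0.18, "MRPS34": -0.07},
    "MRPS34": {"LACTB": 0.11},
    "MRPL19": {"MRPL20": 0.09, "LACTB": 0.0},
}


def build_edges(coefs):
    """Return directed (transcript, protein, weight, color) edges for nonzero coefficients."""
    edges = []
    for protein, contributors in coefs.items():
        for transcript, weight in contributors.items():
            if weight == 0:
                continue  # assumption: a zero coefficient means no edge
            color = "orange" if weight > 0 else "gray"  # positive vs. negative coefficient
            edges.append((transcript, protein, weight, color))
    return edges


def color_nodes(edges):
    """Blue if a node has at least one inflow edge, orange if it has outflow edges only."""
    has_inflow = {target for _, target, _, _ in edges}
    nodes = {source for source, _, _, _ in edges} | has_inflow
    return {node: ("blue" if node in has_inflow else "orange") for node in nodes}


edges = build_edges(coefficients)
node_colors = color_nodes(edges)
# Assumption: edge width is proportional to the absolute coefficient.
edge_widths = {(source, target): abs(weight) for source, target, weight, _ in edges}
```

In this toy input, LACTB contributes to other proteins but is not itself a prediction target, so it is the only orange node; MRPL20 is blue because it is a predicted protein, even though its transcript also contributes to MRPL19.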