Fig 1.
Graphical representation of the INTEGRATE pipeline.
Three heterogeneous experimental datasets are integrated into the input metabolic model to obtain three comparable datasets regarding transcriptomics, fluxomics and metabolomics. The concordance level between pairs of these intermediate output datasets is the final output of INTEGRATE.
Table 1.
Characterization of the investigated non-tumorigenic and cancer breast cell lines in terms of their subtype and origin.
ER stands for estrogen receptor, PR stands for progesterone receptor, whereas HER2 stands for human epidermal growth factor receptor-2. Cell lines may be positive (plus sign) or negative (minus sign) for each of the described subtypes. Breast cancer also includes Luminal A and B subtypes.
Fig 2.
Experimental metabolic measurements at balanced growth phase.
A) Number of cells over time. B) Protein content over time. C) Correlation plots between protein content and number of cells for experimental observations in the 0–48 h time windows. D) Dot plot representing the normalized mean (Z-score) of the metabolite abundances within each cell line (visualized by color) and the fraction of samples of each cell line with abundance above the mean of all cell line samples (visualized by the size of the dot). For each cell line, the five metabolites that best distinguish it from the others, according to t-test p-value, are reported. E) t-SNE dimensionality reduction of intracellular metabolomics profiles. F) Extracellular flux ratios, derived from spent medium measurements.
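The two quantities encoded in the dot plot of panel D can be sketched for a single metabolite as follows. This is a minimal illustration, not the authors' code: it assumes that abundances are standardized against all samples of all cell lines and then averaged within each line, and the function name `dot_plot_stats`, the cell line labels and the abundance values are hypothetical.

```python
from statistics import mean, pstdev


def dot_plot_stats(abundance_by_line):
    """Return {cell_line: (mean Z-score, fraction of samples above the global mean)}.

    abundance_by_line maps a cell line label to the abundances of ONE metabolite
    measured in the samples of that line.
    Assumption: Z-scores are computed against all samples of all cell lines.
    """
    all_samples = [x for samples in abundance_by_line.values() for x in samples]
    global_mean = mean(all_samples)
    global_sd = pstdev(all_samples)

    stats = {}
    for line, samples in abundance_by_line.items():
        # Color of the dot: mean of the standardized abundances of this line.
        z_scores = [(x - global_mean) / global_sd if global_sd > 0 else 0.0
                    for x in samples]
        # Size of the dot: fraction of this line's samples above the global mean.
        fraction_above = sum(x > global_mean for x in samples) / len(samples)
        stats[line] = (mean(z_scores), fraction_above)
    return stats


# Hypothetical abundances of one metabolite in two cell lines.
example = {"lineA": [1.0, 2.0, 3.0], "lineB": [5.0, 6.0, 7.0]}
result = dot_plot_stats(example)
```

In this toy input every sample of `lineB` lies above the global mean and every sample of `lineA` below it, so the two lines would appear as a large warm dot and a vanishing cold dot, respectively.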
Fig 3.
Evaluation of the effect of the different types of constraints on ENGRO2 feasible solutions.
Effect of constraints on A) nutrient availability (type 1), B) nutrient availability and extracellular fluxes (type 1+2), C) intracellular fluxes based on transcriptomics data (type 3) and D) all together (type 1+2+3) in segregating the five investigated cell lines. A two-dimensional map of the FFDs of the five cell lines in each setting is shown. For reversible reactions, the net flux is considered. For computational reasons, only 10000 steady-state solutions sampled within the feasible region of each model were plotted. E) Correlation between the experimental and in silico growth yield on glucose is reported for each of the four settings in panels A, B, C and D. The Spearman correlation coefficient and p-value are reported on top of each plot.
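The Spearman coefficient reported in panel E is the Pearson correlation of the ranks of the two variables. The sketch below shows that computation using only the Python standard library. The growth yield values are hypothetical placeholders, not data from the study, and the p-value is omitted (in practice a statistics library would normally supply both).

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical experimental vs. in silico growth yields for five cell lines.
experimental_yield = [0.10, 0.20, 0.30, 0.40, 0.50]
in_silico_yield = [1.0, 3.0, 2.0, 4.0, 5.0]
rho = spearman(experimental_yield, in_silico_yield)
```

Because only ranks enter the calculation, the coefficient is insensitive to the different scales of experimental and simulated yields, which is presumably why a rank correlation suits a comparison over five cell lines.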
Fig 4.
Variation concordance analysis.
A) RPSvsFFD (x-axis) and the RPSvsRAS (y-axis) scores of the 81 metabolic reactions for which quantification of all substrate abundances was available. The points are coloured as a function of the RASvsFFD scores. We reported the names of the reactions having at least one of the scores greater than 0.2 (i.e. fair concordance). B) Heatmap showing the RPSvsRAS and the RPSvsFFD concordance scores, for reactions having a level of concordance between RPS and FFD greater than 0.2. C) Q-Q plot between the empirical probability of agreement between two independent datasets and INTEGRATE Cohen’s kappa distribution related to the comparison between RPS and FFD.
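The concordance scores above are Cohen's kappa values, for which 0.2 marks the lower bound of fair concordance. As a minimal sketch of how such a score could be computed for one reaction, the code below assumes that each dataset has been reduced to a categorical label per pairwise cell line comparison (+1 for an increase, -1 for a decrease, 0 for no variation). This labeling, the function name `cohen_kappa` and the example labels are assumptions made for illustration; the exact procedure used by INTEGRATE may differ.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both sources.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, from the marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both sources use one identical label throughout: kappa is undefined.
        # Returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical variation signs of one reaction over the 10 pairwise
# comparisons among five cell lines, according to two datasets.
rps_signs = [1, 1, -1, 0, -1, 1, 0, -1, 1, -1]
ffd_signs = [1, 1, -1, 0, 1, 1, 0, -1, -1, -1]
kappa = cohen_kappa(rps_signs, ffd_signs)
```

Unlike the raw fraction of matching labels, kappa discounts the agreement that two unrelated datasets would reach by chance, so a reaction is only flagged as concordant when its score clears the 0.2 threshold after that correction.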
Fig 5.
A) Normalized average RPS and median FFD for reactions in Fig 4B. B) Left: same as Fig 3D (constraints 1+2+3) but with dot coloring representing the flux of cytosolic ACONT. Right: distribution of cytosolic ACONT flux values within the five cell lines. Histogram colors represent cell line labels. C) Same as B for the RPI reaction.
Fig 6.
Variation concordance analysis for Recon3D.
A) RPSvsFFD (x-axis) and the RPSvsRAS (y-axis) scores of the metabolic reactions for which quantification of all substrate abundances was available. We reported the names of the reactions having at least one of the scores greater than 0.2 (i.e. fair concordance). B) Heatmap showing the RPSvsRAS and the RPSvsFFD concordance scores, for reactions having a level of concordance between RPS and FFD greater than 0.2. C) Q-Q plot between the empirical probability of agreement between two independent datasets and INTEGRATE Cohen’s kappa distribution related to the comparison between RPS and FFD.