Fig 1.
Radiogenomics pipeline used in the analysis of association between imaging features and gene signatures in patients with breast cancer.
First, mammograms and tumor biopsy samples were acquired before surgery or treatment. A trained radiologist delineated the lesion region of interest (ROI), from which image features were calculated. For tumor biopsy samples, RNA was extracted and gene expression was measured using microarray technology; the PAM50 molecular subtype and OncotypeDX recurrence score were then estimated. Univariate analysis based on correlation was used to show that image features are associated with the gene signatures. Multivariate analysis was used to fit predictive models using cross-validation strategies and a feature selection algorithm. A similar procedure was applied to contralateral images to evaluate whether the associations were tumor-specific.
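The univariate step described above can be illustrated with a rank correlation between one image feature and one signature score. The following is a minimal sketch, assuming a Spearman coefficient computed with average ranks for ties; the function names (`average_ranks`, `pearson`, `spearman`) and the five-patient values are hypothetical and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five patients (illustration only).
feature = [0.12, 0.40, 0.35, 0.80, 0.55]   # one image feature
score = [10.0, 22.0, 18.0, 41.0, 30.0]     # one signature score
rho = spearman(feature, score)
```

In this toy example the feature and the score have the same ordering across patients, so `rho` equals 1. Repeating the calculation for every feature against each signature would give one coefficient per feature.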
Table 1.
Summary of patients per risk group.
Table 2.
Summary of the 539 features obtained from each region of the mammogram.
Fig 2.
Distribution of test correlations in the cross-validation multivariate feature selection.
White boxplots correspond to tumor ROIs, whereas gray boxplots correspond to contralateral ROIs used as controls. Note that the test correlations for contralateral ROIs are lower than those for the corresponding tumor ROIs.
Fig 3.
Characteristics of the model obtained for OncotypeDX.
(A) A heat map representation of the features associated with the OncotypeDX RS. The figure shows the features selected by LASSO (vertical axis), with their univariate Spearman coefficient and rank, across samples (horizontal axis) ordered by the OncotypeDX recurrence score. The top of the figure includes common clinical indicators. The image data were scaled to z-scores to highlight differences. (B) Comparison of the estimated OncotypeDX recurrence score with the score predicted by the image model in (A). Each dot represents a sample. Colors represent subtypes, and filled or open circles represent younger or older patients.
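The heat map preparation mentioned above (samples ordered by score, feature values scaled to z-scores) can be sketched as follows. This is only an illustration under stated assumptions: the helper names `zscore` and `heatmap_rows` are hypothetical, each feature is scaled across samples, and the population standard deviation is used; the caption does not say which convention the authors applied.

```python
from statistics import mean, pstdev


def zscore(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no contrast; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def heatmap_rows(features, scores):
    """Order samples by score, then z-score each feature across samples.

    features: dict mapping a feature name to one value per sample
    scores:   one score per sample (same sample order as the features)
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return {
        name: zscore([values[i] for i in order])
        for name, values in features.items()
    }


# Hypothetical three-sample example (illustration only).
rows = heatmap_rows({"texture_a": [3.0, 1.0, 2.0]}, [30.0, 10.0, 20.0])
```

Each entry of `rows` is one heat map row whose columns run from the lowest to the highest score.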
Fig 4.
Characteristics of the model obtained for PAM50.
(A) A heat map representation of the features associated with the PAM50 risk of recurrence (ROR). The figure shows the features selected by LASSO (vertical axis), with their univariate Spearman coefficient and rank, across samples (horizontal axis) ordered by the PAM50 ROR score. The top of the figure includes common clinical indicators. The image data were scaled to z-scores to highlight differences. Orange dots at the right mark features also present in the OncotypeDX model. (B) Comparison of the estimated PAM50 recurrence score with the score predicted by the image model in (A). Each dot represents a sample. Colors represent subtypes, and filled or open circles represent younger or older patients.
Fig 5.
Color representation of the distribution of the local fractal dimension.
High values indicate heterogeneous textures, whereas low values represent uniform distributions of the tissue signal. Shades of blue, green, and red represent high, medium, and low dimension values, respectively.