Quantifying intra-tumoral genetic heterogeneity of glioblastoma toward precision medicine using MRI and a data-inclusive machine learning algorithm

Background and objective: Glioblastoma (GBM) is one of the most aggressive and lethal human cancers. Intra-tumoral genetic heterogeneity poses a significant challenge for treatment. Biopsy is invasive, which motivates the development of non-invasive, MRI-based machine learning (ML) models to quantify intra-tumoral genetic heterogeneity for each patient. This capability holds great promise for enabling better therapeutic selection to improve patient outcome.
Methods: We proposed a novel Weakly Supervised Ordinal Support Vector Machine (WSO-SVM) to predict regional genetic alteration status within each GBM tumor using MRI. WSO-SVM was applied to a unique dataset of 318 image-localized biopsies with spatially matched multiparametric MRI from 74 GBM patients. The model was trained to predict the regional genetic alteration of three GBM driver genes (EGFR, PDGFRA and PTEN) based on features extracted from the corresponding region of five MRI contrast images. For comparison, a variety of existing ML algorithms were also applied. The classification accuracy for each gene was compared between the different algorithms. The SHapley Additive exPlanations (SHAP) method was further applied to compute contribution scores of different contrast images. Finally, the trained WSO-SVM was used to generate prediction maps within the tumoral area of each patient to help visualize the intra-tumoral genetic heterogeneity.
Results: WSO-SVM achieved 0.80 accuracy, 0.79 sensitivity, and 0.81 specificity for classifying EGFR; 0.71 accuracy, 0.70 sensitivity, and 0.72 specificity for classifying PDGFRA; and 0.80 accuracy, 0.78 sensitivity, and 0.83 specificity for classifying PTEN. These results significantly outperformed the existing ML algorithms. Using SHAP, we found that the relative contributions of the five contrast images differ between genes, which is consistent with findings in the literature. The prediction maps revealed extensive intra-tumoral region-to-region heterogeneity within each individual tumor in terms of the alteration status of the three genes.
Conclusions: This study demonstrated the feasibility of using MRI and WSO-SVM to enable non-invasive prediction of intra-tumoral regional genetic alteration for each GBM patient, which can inform future adaptive therapies for individualized oncology.


Introduction
Glioblastoma (GBM) is one of the most aggressive and lethal human cancers, with a median overall survival of only about 15 months despite best available standard therapy [1]. Intra-tumoral genetic heterogeneity is a major contributor to poor clinical outcomes [2]. Each tumor is comprised of genetically distinct subpopulations with different sensitivities to treatment, and genetic targets from one biopsy location may not accurately reflect those from other parts of the same tumor [3]. Moreover, due to the invasive nature of the disease, diffusely invaded GBM cells are always left behind in the brain after resection, and these residual regions may be genetically distinct from the biopsy samples collected during surgery [4,5].
The region-to-region genetic variability within a single tumor provides potential mechanisms for therapeutic escape and makes single targeted therapies less effective [6].
There are substantial challenges for quantifying intra-tumoral genetic heterogeneity of GBM.
Ideally, one would want to take biopsy samples from many different regions of a tumor and perform genetic analysis of each sample. This, however, is infeasible due to the invasive nature of biopsy. Although the central tumor mass can often be surgically removed, the invasive portions of the tumor are often left unresected and unbiopsied given the risk to adjacent neurologic structures. Thus, biopsy alone is insufficient to characterize the full landscape of the intra-tumoral heterogeneity [2][7].
Neuroimaging techniques, such as MRI, provide data of the entire tumor and even the whole brain in a non-invasive manner. The emerging field of radiogenomics has demonstrated the feasibility of using MRI features to predict genetic characteristics of GBM via machine learning (ML). For example, Kha et al. [8] proposed an eXtreme Gradient Boosting (XGBoost)-based model to predict the 1p/19q codeletion status in a binary classification task for lower-grade gliomas. Lam et al. [9] developed a hybrid machine learning-based radiomics model that incorporates a genetic algorithm and an XGBoost classifier to classify low-grade glioma molecular subtypes. Akbari et al. [10] used a Support Vector Machine (SVM) to predict Epidermal Growth Factor Receptor (EGFR)-vIII mutation based on multiparametric MRI features extracted from tumor regions. Tykocinski et al. [11] predicted EGFR-vIII mutation based on features extracted from perfusion-weighted MRI using multivariable logistic regression. Kickingereder et al. [12] utilized a stochastic gradient boosting machine, random forest, and logistic regression to predict the copy number variants (CNVs) of several GBM driver genes such as EGFR, Platelet-Derived Growth Factor Receptor Alpha (PDGFRA), and Phosphatase and Tensin Homolog (PTEN) based on multiparametric MRI. Chen et al. [13] developed a convolutional neural network to predict PTEN mutation using multiparametric MRI.
However, these existing studies focus on predicting overall or average genetic status for the entire tumor, so they are suitable for relatively homogeneous tumors where genetic status does not significantly vary region-to-region. Although these studies have demonstrated the predictive utility of MRI, they fall short of identifying intra-tumoral or regional genetic heterogeneity within each tumor. This paper aims to develop an ML model that can predict the genetic status region-by-region within a tumoral area of interest (AOI) of each patient using MRI. The model, denoted as $f: \mathbf{x} \to y$, takes as input a vector $\mathbf{x}$ consisting of MRI features extracted from each region within a tumoral AOI and outputs the genetic status of that region, $y$, where $y = 1$ or $2$ represents that the gene is non-altered or altered, respectively. The resulting regional predictions can be used to generate a prediction map that reveals the intra-tumoral heterogeneity across the AOI.
To train the ML model $f$, a binary classification approach can be considered by using a training set consisting of $(\mathbf{x}_i, y_i)$ for $n$ biopsy samples. However, the biopsy sample size is often small, and a more robust approach is to use semi-supervised learning (SSL) [14]. SSL trains $f$ by including both the biopsy/labeled samples $(\mathbf{x}_i, y_i)$ and unlabeled tumoral samples, $(\mathbf{x}_j)$, i.e., samples from the unbiopsied regions of the tumoral AOI. Additionally, it is possible to leverage samples from outside the tumoral AOI (i.e., the normal brain area), $(\mathbf{x}_k)$. To include these normal brain samples, one option is to treat them as a third class (class 0), in addition to non-altered gene (class 1) and altered gene (class 2) within the tumoral AOI, and train a three-class classifier. The other option, which may be more appropriate, is to train an ordinal classifier [15][16][17] by considering that classes 0, 1, and 2 have an intrinsic order of increasing abnormality.
WSO-SVM is a novel ordinal classifier based on SVM. Unlike the existing algorithms that only utilize labeled samples from each class (e.g., normal brain samples as class 0, and biopsy samples as classes 1 and 2), WSO-SVM introduces a unique optimization formulation to allow the incorporation of unlabeled tumoral samples (class 1 or 2, not 0). This helps identify accurate classification boundaries and improve prediction performance. The development of WSO-SVM is significant as it represents the first method capable of integrating multiple data sources, including biopsy samples, unlabeled tumoral samples, and normal brain samples, to train a robust classifier for predicting regional genetic status using MRI. In our case study, we demonstrate the superior performance of WSO-SVM compared to a variety of ML algorithms. The clinical utility of this work lies in the non-invasive quantification of intra-tumoral genetic heterogeneity using MRI for individual patients. WSO-SVM enables the generation of regional prediction maps for GBM driver genes such as EGFR, PDGFRA, and PTEN across the entire tumoral AOI for each patient. These maps have practical implications in guiding therapy selection and predicting response to targeted therapies, such as EGFR inhibitors [7]. Furthermore, the predictive maps reveal the co-existence of genomically distinct tumor subpopulations within individual tumors, which can enhance our understanding and support the development of new approaches, such as adaptive therapy, that leverage the interplay and competition between different molecular subpopulations for therapeutic benefit [18].

Data collection
This study used data from a cohort of 74 GBM patients with IRB approval from Barrow Neurological Institute (BNI) and Mayo Clinic Arizona (MCA). These patients were prospectively recruited for the study. The recruitment period is from February 29, 2012, until present. All patients provided written informed consent. The data were accessed for research purposes from February 29, 2012, until present. A total of 318 biopsy samples were acquired from these patients (average: 4; range: 1-13). Each patient went through a pre-operative multiparametric MRI exam, from which five contrast images were obtained: T1-weighted contrast-enhanced image (T1+C), T2-weighted image (T2), mean diffusivity (MD), fractional anisotropy (FA), and relative cerebral blood volume (rCBV).

Biopsy sample analysis
Array CGH data was obtained for a subset of biopsy samples [19]. Whole exome sequencing (WES) was performed on the remaining biopsies and paired blood samples. Quality control was performed using the MultiQC toolkit. The aligned paired-end clean reads were processed using Burrows-Wheeler Aligner and GATK to remove low-quality reads and realign around indels. Somatic SNVs and indels were detected using a combination of six variant calling algorithms: Freebayes, MuTect2, TNhaplotyper, TNscope, TNsnv, and VarScan2. Somatic copy number and tumor purity were estimated from WES data using PureCN. GISTIC2 analysis was performed to identify recurrently amplified or deleted genomic regions by integrating the results from individual patients.
We focused on three GBM driver genes: EGFR, PDGFRA, and PTEN. For each gene, we considered the gene altered (class 2) if it has an abnormal CNV or is mutated, and non-altered (class 1) otherwise. For EGFR and PDGFRA, we followed the literature [19] and considered amplification as abnormal CNV; for PTEN, deletion or loss was considered as abnormal CNV [20]. To maximize the sample size in ML training, we included all available samples for each gene. There are 130/171, 53/238, and 206/109 biopsy samples with altered/non-altered EGFR, PDGFRA, and PTEN, respectively.
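The labeling rule above can be written as a small function. This is a minimal sketch, not the authors' pipeline: the CNV call strings (`"amplification"`, `"deletion"`, `"loss"`, `"neutral"`) and the function name `alteration_class` are hypothetical names chosen for illustration.

```python
# Abnormal CNV type per gene, as described in the text:
# amplification for EGFR and PDGFRA; deletion or loss for PTEN.
# The string labels themselves are assumptions made for this sketch.
ABNORMAL_CNV = {
    "EGFR": {"amplification"},
    "PDGFRA": {"amplification"},
    "PTEN": {"deletion", "loss"},
}


def alteration_class(gene: str, cnv_call: str, is_mutated: bool) -> int:
    """Return 2 (altered) if the gene has an abnormal CNV or is mutated, else 1."""
    has_abnormal_cnv = cnv_call in ABNORMAL_CNV[gene]
    return 2 if (has_abnormal_cnv or is_mutated) else 1
```

For example, a PTEN sample with an amplification call and no mutation would be labeled class 1 under this rule, because amplification is not treated as an abnormal CNV for PTEN.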

MRI preprocessing and feature extraction
Detailed MRI protocols and preprocessing approaches can be found in S1 Appendix. The same approaches have been used in our prior publications [2][7][21], which have shown robust performance.
The MRI features corresponding to each biopsy sample were extracted from a defined "region", i.e., an 8×8 pixel² window centered at the sampling location. This window size was chosen because it approximately matches the physical size of biopsy samples, which aligns the MRI features with the genetic status derived from the biopsy. Moreover, prior research findings have supported the suitability of this window size for effectively capturing the intra-tumoral heterogeneity of GBM [2][7][19].
From this window, we extracted 280 features from the five aforementioned MRI contrast images, which included statistical features and texture features using two well-established texture analysis algorithms, Gray-Level Co-occurrence Matrix (GLCM) [22] and Gabor Filters (GF) [23]. The names of these features are listed in S1 Appendix. The tumoral AOI [19] is the union of the contrast-enhancing portion (CE) and the non-enhancing portion (NE) of the tumor. The contralateral AOI is located on the opposite side of the brain from the tumor and is considered "normal". To extract MRI features for samples from these AOIs, the same approach as that used for biopsy samples was adopted.
The selection of unlabeled tumoral samples and normal brain samples was based on several considerations: (a) Representation of tumoral heterogeneity: Biologically, a GBM tumor includes a contrast-enhancing portion (CE) and a non-enhancing portion (NE). The former harbors proliferative tumor cells, while the latter harbors tumor cells invading the surrounding brain tissue [28]. To ensure our unlabeled samples capture this biological heterogeneity of each tumor, an equal number of samples were taken from CE and NE. (b) Avoidance of outlier samples: We were careful to avoid selecting samples from areas that could be considered outliers. Notably, we excluded regions like necrosis, where the tissue characteristics significantly differ [28]. Additionally, for tumors located near fixed brain structures like the skull or cerebrospinal fluid, precautions were taken to prevent sample overlap with these structures. (c) Model accuracy and efficiency: Since unlabeled tumoral samples and normal brain samples are "auxiliary" samples to biopsies, their number should not greatly exceed that of biopsies, even though acquiring these samples is much easier than acquiring biopsies. This is to prevent sample imbalance and potential dilution of the predominant influence of biopsy samples on model training. Therefore, we kept an equal number of unlabeled tumoral samples and normal brain samples, with their combined total aligning with that of biopsy samples. This choice also ensures the computational efficiency of model training.
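The sizing rule in considerations (a) and (c) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the rounding rule (integer division by four) and the function names are hypothetical, and the candidate pixel lists are assumed to have already had necrosis and border regions removed as described in (b).

```python
import random


def auxiliary_sample_sizes(n_biopsy):
    """Split the auxiliary budget so that CE == NE, unlabeled == normal,
    and the combined total is (up to rounding) the number of biopsies."""
    per_region = n_biopsy // 4  # assumed rounding rule
    return {"CE": per_region, "NE": per_region, "normal": 2 * per_region}


def draw_auxiliary_samples(ce_pixels, ne_pixels, normal_pixels, n_biopsy, seed=0):
    """Randomly draw sampling locations from pre-filtered candidate lists."""
    rng = random.Random(seed)
    sizes = auxiliary_sample_sizes(n_biopsy)
    return {
        "CE": rng.sample(ce_pixels, sizes["CE"]),
        "NE": rng.sample(ne_pixels, sizes["NE"]),
        "normal": rng.sample(normal_pixels, sizes["normal"]),
    }
```

With 318 biopsies, this rule would give 79 CE, 79 NE, and 158 normal brain samples, for a combined total of 316.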
Moreover, as depicted in Fig 2, when the trained WSO-SVM is applied to a patient, the goal is to generate a regional prediction map of the genetic status within the tumoral AOI. To accomplish this, an 8×8 pixel² sliding window with a stride size of one pixel was placed at each pixel within the tumoral AOI, and MRI features were extracted from each window.
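The sliding-window step can be illustrated with a short sketch. It makes several assumptions that are not stated in the text: the image and AOI mask are plain nested lists, an even-sized window "centered" at pixel (r, c) spans rows r-4 to r+3, windows that cross the image border are skipped, and the window mean stands in for the 280 features used in the study.

```python
def sliding_windows(mask, size=8, stride=1):
    """Yield (row, col, top, left) for each AOI pixel whose window fits in the image."""
    n_rows, n_cols = len(mask), len(mask[0])
    half = size // 2
    for r in range(0, n_rows, stride):
        for c in range(0, n_cols, stride):
            if not mask[r][c]:
                continue  # pixel is outside the tumoral AOI
            top, left = r - half, c - half
            if top < 0 or left < 0 or top + size > n_rows or left + size > n_cols:
                continue  # assumed border handling: skip incomplete windows
            yield r, c, top, left


def window_mean(image, top, left, size=8):
    """Placeholder feature: mean intensity inside one window."""
    total = sum(image[i][j]
                for i in range(top, top + size)
                for j in range(left, left + size))
    return total / (size * size)


def mean_map(image, mask, size=8):
    """Map each AOI pixel to the placeholder feature of its window."""
    return {(r, c): window_mean(image, top, left, size)
            for r, c, top, left in sliding_windows(mask, size)}
```

In a full pipeline, the placeholder feature would be replaced by the statistical and texture features, and the trained classifier would be applied to each window's feature vector to fill in the prediction map.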
It is important to note that WSO-SVM is different from ordinal SVM in its ability to incorporate unlabeled tumoral samples. This is achieved by introducing a constraint in Eq. (5) to prevent the classification of these samples as normal brain samples (class 0). The inclusion of unlabeled tumoral samples helps better identify the classification boundary $b_0$, and also contributes to the estimation of the weight vector $\mathbf{w}$, indirectly aiding in a better identification of $b_1$.
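To make the role of the two boundaries concrete, the sketch below shows a generic threshold-based ordinal decision rule. It is an assumption-laden illustration, not the WSO-SVM discriminant functions themselves: it assumes a scalar score (written here as `score`, standing in for the model output), two ordered thresholds `b0 < b1`, and classes assigned in order of increasing score.

```python
def ordinal_class(score, b0, b1):
    """Assign class 0, 1, or 2 from a scalar score and two ordered thresholds.

    Assumed convention: scores below b0 are class 0 (normal brain),
    scores between b0 and b1 are class 1 (non-altered),
    and scores at or above b1 are class 2 (altered).
    """
    if b0 >= b1:
        raise ValueError("thresholds must satisfy b0 < b1")
    if score < b0:
        return 0
    if score < b1:
        return 1
    return 2
```

Under this convention, a constraint that keeps unlabeled tumoral samples out of class 0 amounts to requiring their scores to lie on the tumoral side of `b0`, which is how such samples can inform the first boundary without needing a class 1 or class 2 label.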

Fig. 4: A graphical illustration of the model formulation of WSO-SVM
It is easier to solve the WSO-SVM optimization in its dual form, which is given in Proposition 1.
Once the optimal solutions of the dual problem are obtained, we can obtain the optimal coefficients in the primal problem, $\mathbf{w}$, and further get $h(\mathbf{x}) = -\sum_i \alpha_i^{(1)} K(\mathbf{x}, \mathbf{x}_i^{(1)}) + \sum_i \alpha_i^{(12)} K(\mathbf{x}, \mathbf{x}_i^{(12)})$. Also, $b_0$ and $b_1$ can be estimated as: $b_0 = h(\mathbf{x}) - y$ for any $(\mathbf{x}, y)$ belonging to normal brain samples (or biopsy and unlabeled tumoral samples) whose corresponding $\alpha^{(0)}$ (or $\alpha^{(12)}$) satisfies $0 \le \alpha^{(0)}$ (or $\alpha^{(12)}$) $\le C_2$; $b_1 = h(\mathbf{x}) - y$ for any $(\mathbf{x}, y)$ belonging to non-altered biopsy samples (or altered biopsy samples) whose corresponding $\alpha^{(1)}$ (or $\alpha^{(2)}$) satisfies $0 \le \alpha^{(1)}$ (or $\alpha^{(2)}$) $\le C_1$. Then, we can obtain the discriminant functions for any new sample.
Training and cross validation (CV). We used 10-fold CV to mitigate the risk of overfitting. To further reduce potential bias in evaluating model performance due to the specific fold division in CV, we repeated the CV procedure 30 times. We reported the model's average performance and the standard deviation across the 30 repetitions, with the latter capturing uncertainty. Specifically, the biopsy samples were divided into 10 folds. In each iteration, WSO-SVM was trained based on 9 folds of the biopsy samples and randomly selected unlabeled tumoral samples and normal brain samples of the same size according to the considerations illustrated in Sec. 2.3.
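The repeated CV protocol can be sketched as below. This is a minimal illustration, not the authors' implementation: `fit_predict` is a hypothetical callable standing in for training WSO-SVM (including the draw of auxiliary samples) and predicting the held-out fold, and pooling predictions across folds before computing accuracy is an assumption.

```python
import random
import statistics


def kfold_indices(n_samples, n_folds, rng):
    """Shuffle sample indices and deal them into n_folds disjoint folds."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def confusion_metrics(y_true, y_pred, positive=2):
    """Return (accuracy, sensitivity, specificity) with class 2 (altered) as positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


def repeated_cv(X, y, fit_predict, n_folds=10, n_repeats=30, seed=0):
    """Repeat k-fold CV and return the mean and standard deviation of accuracy."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_repeats):
        y_pred = [None] * len(y)
        for test_idx in kfold_indices(len(y), n_folds, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            preds = fit_predict([X[i] for i in train_idx],
                                [y[i] for i in train_idx],
                                [X[i] for i in test_idx])
            for i, p in zip(test_idx, preds):
                y_pred[i] = p
        accuracies.append(confusion_metrics(y, y_pred)[0])
    return statistics.mean(accuracies), statistics.stdev(accuracies)
```

The same loop can collect sensitivity and specificity per repetition; only accuracy is summarized here to keep the sketch short.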
Choice of tuning parameters: There are two key tuning parameters for WSO-SVM according to Proposition 1, $C_1$ and $C_2$. $C_1$ affects the classification boundary between biopsy samples in class 1 (gene not altered) and class 2 (gene altered). $C_2$ affects the classification boundary between class 1 or 2 (comprising tumoral samples, both unlabeled and labeled) and class 0 (normal brain samples). Our experiments found that distinguishing between class 1/2 and class 0 was relatively easy, which also aligned with the intuition that discerning tumoral samples from normal brain samples should inherently be a less formidable task than distinguishing altered from non-altered tumoral samples.
Therefore, we tuned $C_2$ on a coarser grid within the range of 0.01 to 100 and kept multiple settings that yielded >80% accuracy in differentiating class 1/2 from class 0. At each setting, we tuned $C_1$ on a finer grid between 0.01 and 100, and selected the $C_1$ with the highest accuracy in differentiating class 1 from class 2.
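The two-stage tuning procedure can be sketched as follows. The grid sizes and log spacing are assumptions (the text only gives the range 0.01 to 100 and says one grid is coarser than the other), and the two accuracy callables are hypothetical stand-ins for cross-validated evaluations of a trained model.

```python
import math


def log_grid(low, high, n_points):
    """Return n_points values evenly spaced on a log10 scale from low to high."""
    lo, hi = math.log10(low), math.log10(high)
    step = (hi - lo) / (n_points - 1)
    return [10 ** (lo + k * step) for k in range(n_points)]


def two_stage_search(acc_tumor_vs_normal, acc_altered_vs_not,
                     coarse_points=5, fine_points=17, threshold=0.80):
    """Keep coarse-grid settings that separate class 1/2 from class 0 with
    accuracy above the threshold, then pick the best fine-grid value for each."""
    selected = {}
    for c2 in log_grid(0.01, 100, coarse_points):
        if acc_tumor_vs_normal(c2) <= threshold:
            continue  # this setting does not separate tumor from normal well enough
        scored = [(acc_altered_vs_not(c1, c2), c1)
                  for c1 in log_grid(0.01, 100, fine_points)]
        best_acc, best_c1 = max(scored)
        selected[c2] = (best_c1, best_acc)
    return selected
```

The function returns one selected fine-grid value per retained coarse-grid setting, which mirrors the "at each setting" wording above; choosing a single final pair from these candidates is left to the caller.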
Generation of a regional predictive map of genetic status for each patient. To personalize the model toward each patient's data, we re-trained WSO-SVM under the previously found optimal tuning parameter setting but using randomly selected unlabeled tumoral samples and normal brain samples from the specific patient. Next, we applied the model to predict the gene status for each sliding window within the tumoral AOI of the patient, based on MRI features extracted from that window. The resulting predictions formed the predictive map for that patient.
Time complexity in training and deployment. As WSO-SVM adopted SVM as its base model, its time complexity in model training is similar to that of SVM [29], which ranges between $O(n^2 \times d)$ and $O(n^3 \times d)$, where $n$ is the sample size and $d$ is the feature dimension. Currently, we used quadratic programming to solve the WSO-SVM optimization, which can be further expedited by using more advanced optimization algorithms such as sequential minimal optimization [30] and stochastic gradient descent [31]. While SVM-type models are not the most computationally efficient, the training time complexity is acceptable and the performance gain over more efficient methods has made them an appealing choice for large datasets in various applications. In our application, the model training is done offline, which makes it feasible to train WSO-SVM on large datasets. During deployment, the trained model generates regional genetic characteristics within the tumoral area on a patient-by-patient basis. The time required to produce the prediction map for an individual patient is less than 30 seconds when executed on a standard desktop computer. This level of efficiency aligns well with the clinical use case, ensuring that the model can be deployed in a timely and practical manner.

Model interpretation
It is important to understand the contribution of different MRI features to the prediction made by WSO-SVM. While WSO-SVM can use either a linear or non-linear kernel, we found that a non-linear kernel produced better performance. Also, previous studies have shown that the relationship between MRI features and genetic status is highly non-linear [32]. To interpret the non-linear WSO-SVM, we utilized a popular, model-agnostic method called SHapley Additive exPlanations (SHAP) [33]. Essentially, SHAP estimates the contribution of a feature, referred to as the SHAP value, by computing the difference in the model's prediction when the feature is present versus absent. The higher the absolute SHAP value of a feature, the greater its impact on the prediction. In our study, we were more interested in the contribution of each MRI contrast image rather than individual features. Thus, we aggregated the feature-wise SHAP values to the contrast level.
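The contrast-level aggregation can be illustrated with a short sketch. The aggregation rule used here (sum the absolute feature-wise SHAP values within each contrast, then average over samples) is an assumption, since the text does not specify it, and the function and argument names are hypothetical.

```python
def aggregate_by_contrast(shap_values, feature_contrast):
    """Aggregate feature-wise SHAP values to the contrast level.

    shap_values: list of rows, one per sample, each a list of per-feature SHAP values.
    feature_contrast: list giving the contrast name (e.g. "T2") of each feature.
    Returns, per contrast, the mean over samples of the summed absolute SHAP values.
    """
    totals = {}
    for row in shap_values:
        for value, contrast in zip(row, feature_contrast):
            totals[contrast] = totals.get(contrast, 0.0) + abs(value)
    n_samples = len(shap_values)
    return {contrast: total / n_samples for contrast, total in totals.items()}
```

Because absolute values are used, positive and negative contributions within the same contrast do not cancel, which matches the reading of "absolute SHAP value" as a measure of impact.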

Competing methods
We compared the performance of WSO-SVM with existing algorithms in several categories (using the same CV process):
• Binary classifiers: SVM, random forest (RF).
• Ordinal classifiers: ordinal SVM, ordinal RF.
• Multi-task learning (MTL): regularized MTL (regMTL) [38], MTL Gaussian Process (MTL-GP) [39], MTL RF (MTL-RF) [40]. These are multi-class classification algorithms that couple the models of the three GBM driver genes together.
WSO-SVM did not have the low-sensitivity issue that most existing algorithms showed under the heavy class imbalance for PDGFRA. Among all the competing algorithms, random forest types of methods performed better in most cases. Moreover, the standard deviation of WSO-SVM is among the smallest over all the methods being compared. The magnitude of the standard deviation is also small, indicating that the model performance is quite stable (i.e., less uncertainty).

Results
To assess the statistical significance of the performance gain for WSO-SVM, we performed a one-sided Wilcoxon rank-sum test to compare WSO-SVM against the competing algorithm with the overall best accuracy. For EGFR, WSO-SVM significantly outperformed multi-class RF in accuracy, sensitivity, and specificity (p<0.001, p<0.001, p=0.002). For PTEN, WSO-SVM significantly outperformed binary RF in accuracy, sensitivity, and specificity (p<0.001, p<0.001, p<0.001). For PDGFRA, WSO-SVM had significantly higher accuracy and sensitivity than MTL-RF (p=0.04, p<0.001), but its specificity was not significantly higher.
WSO-SVM performed significantly better than the overall best competing algorithm in accuracy (p<0.001), sensitivity (p<0.001), and specificity (p<0.001) using a Wilcoxon rank-sum test.
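A one-sided rank-sum comparison of two sets of repeated-CV scores can be sketched as below. This is an illustration under stated assumptions rather than the authors' procedure: it uses the large-sample normal approximation without tie or continuity correction, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_greater(x, y):
    """One-sided rank-sum test of whether x tends to be larger than y.

    Returns (z, p) using the normal approximation to the rank-sum statistic.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2          # expected rank sum under the null
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = 0.5 * math.erfc(z / math.sqrt(2))    # upper-tail probability
    return z, p
```

In this setting, `x` would hold the 30 repeated-CV scores of WSO-SVM for one metric and `y` the 30 scores of the competing algorithm, so a small p-value supports the one-sided claim that WSO-SVM scores higher.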

Table 1: Classification performance of EGFR using CV based on biopsy samples
Furthermore, Fig 6 shows the absolute SHAP values of the five MRI contrast images. It is evident that all contrast images contribute to the classification of each gene, but their relative contributions vary between genes. Further discussion will be provided in the next section.

Discussion
Our results demonstrated that WSO-SVM surpasses a variety of existing ML algorithms for predicting the regional status of three GBM driver genes using MRI. To interpret WSO-SVM, the SHAP values in Fig 6 revealed the importance of each contrast in influencing WSO-SVM's prediction for each gene. Specifically, the model's predictions on EGFR were primarily influenced by T2 and rCBV, which aligns with prior research that found significant correlations of EGFR with T2 [19][41] and rCBV [11][19][42]. T1+C demonstrated the highest contribution to PDGFRA prediction. This is consistent with previous studies indicating that PDGFRA subpopulations tend to localize in CE with relatively less infiltration into NE, in comparison to EGFR [43]. For PTEN, the model's prediction received the greatest contribution from rCBV. Prior studies have highlighted the correlation between PTEN and rCBV, particularly when co-existing with EGFR alterations [44].
The prediction maps in Fig 7 and in S1 Fig for other patients in our dataset provided strong evidence of the extensive intra-tumoral genetic heterogeneity in each patient. While intra-tumoral genetic heterogeneity in GBM is well-documented in the literature, practical methods for quantifying this heterogeneity are lacking. Biopsy samples, which can only be obtained from a few locations of the brain, leave many regions uncharacterized. This study introduces WSO-SVM as a non-invasive approach to predict regional genetic status across the entire tumoral AOI for each patient using MRI.
The clinical utility of the prediction maps for GBM driver genes, EGFR, PDGFRA, and PTEN, is multi-fold. First, these driver genes have been investigated as therapeutic targets for GBM. EGFR is one of the most commonly altered gene drivers in GBM and has been implicated in several pathogenic mechanisms.
Targeted drug therapies, including those directed at EGFR and other receptor tyrosine kinases (RTKs) like PDGFRA, have been developed [11][12]. However, the clinical outcomes of current therapies are unsatisfactory for most patients due to the limited information obtained from sparse biopsy samples, which cannot fully capture the genetic landscape of each patient's tumor. With the capability provided by WSO-SVM, there is an opportunity to optimize therapy selection for each patient and provide better prognostic information regarding their response to treatment. This holds great potential for improving patient outcomes and tailoring therapies to individual genetic characteristics.
Moreover, this study goes beyond individual gene predictions and allows for the simultaneous prediction of multiple GBM driver genes. Interactions between tumor subpopulations within GBM tumors are increasingly acknowledged for their impact on biological behavior, therapeutic response, and local phenotypic expression. Although such interactions have been extensively studied in non-CNS tumors, their exploration in GBM remains limited. Existing studies have primarily focused on the heterogeneous expression of receptor tyrosine kinase (RTK) aberrations, such as EGFR and PDGFRA amplifications. For instance, Inda et al. [45] showed that a minority subpopulation expressing EGFR-vIII could potentiate a majority subpopulation expressing wild-type EGFR to enhance growth, survival, and drug resistance.
Szerlip et al. [46] observed cooperation between subpopulations expressing EGFR or PDGFRA amplifications, requiring combined inhibition for pathway attenuation in vitro. Fiorenzo et al. [47] suggest that in vivo and human studies are needed to fully understand the impact of subpopulation interactions on tumor growth. These interactions between subpopulations pose significant challenges for current treatment strategies and clinical trials that focus on single drug targets, such as EGFR [48]. By providing the capability to predict multiple GBM driver genes simultaneously, our study offers insights into these complex interactions and addresses the need for a more comprehensive understanding of tumor heterogeneity in GBM to develop future, advanced therapy [18,49].
This study has several limitations. First, the biopsy sample size is relatively small. This is due to the highly invasive nature of acquiring these samples from patients' brains. In the literature on integrating MRI and brain biopsy data for machine learning models, the typical sample size falls within the range of 82-244 [2][5][7][11]-[13]. While our study included 318 biopsies, more than these existing studies, the sample size remains modest when compared to domains where sample collection is more accessible. To alleviate this problem, the WSO-SVM model was designed to incorporate unlabeled tumoral samples and normal brain samples. However, further research is imperative to validate the generalizability of WSO-SVM on a more extensive and diverse population. A related issue is that our performance evaluation was based on CV. Using external datasets to further validate our model is highly necessary.
There is currently no publicly available dataset of the same nature as our dataset, due to the invasive nature of biopsy acquisition and the time-consuming process of patient consent, surgical procedures, genetic analysis, and image preprocessing. Nevertheless, our team is currently collecting more data and preparing for subsequent validations of the model. This paper serves as a starting point in addressing the critical issue of non-invasive quantification of intra-tumoral genetic heterogeneity using MRI and a novel machine learning model, WSO-SVM.
Second, it is important to acknowledge that while our study establishes correlations between genetic alterations and imaging-phenotypic features, it does not establish causal relationships. Experimental validation of causal relationships, which may involve creating specific genetic alterations in animal models and observing their effects on imaging phenotypes, remains a critical step to confirm and gain a deeper understanding of the underlying cancer mechanisms.
Third, while we have provided some discussion of the potential of using the method developed in this paper to help therapeutic selection and develop advanced therapy to improve patient outcomes, this paper focused on the research phase of the method development. Clinical validation in real-world settings is necessary to establish the actual utility and benefit of the proposed method. Such validation could encompass clinical trials designed to compare patient outcomes, such as treatment response and survival, between cohorts undergoing standard clinical protocols for therapeutic selection and those benefitting from the additional guidance provided by the regional genetic prediction maps generated by our method.
Last but not least, the WSO-SVM model has several aspects for improvement. For instance, WSO-SVM can incorporate unlabeled tumoral samples and normal brain samples. Currently, these samples were selected based on considerations illustrated in Sec. 2.3. This selection method can be refined by integrating more advanced computational strategies that take uncertainty and diversity into account [50] and by considering patient demographic information [51]. Also, WSO-SVM relies on texture features extracted from MRI as input, which may be influenced by imaging quality. Uncertainty quantification of WSO-SVM predictions considering input uncertainty is important, and a Bayesian version of the model could address this issue. Also, developing robust predictive models that are insensitive to input uncertainty would have greater clinical utility.

Conclusion
We developed a data-inclusive WSO-SVM model to predict regional genetic alteration status within each GBM tumor using MRI. This study demonstrated the feasibility of using MRI and WSO-SVM to enable non-invasive prediction of regional genetic alteration for each patient, which can inform future adaptive therapies for individualized oncology.

MRI protocols, parametric maps, and image co-registration
The MRI images used in this study were obtained through standard protocols and went through preprocessing steps for quality control, which were described in detail in our previous publications [1]-[3].
Here we provide an excerpt of the detailed approaches from a prior paper [1].
We performed all imaging at 3 T field strength (Signa HDx, GE Healthcare, Waukesha, Wisconsin; Ingenia, Philips Healthcare, Best, Netherlands; Magnetom Skyra, Siemens Healthcare, Erlangen, Germany) within
Fig 1 illustrates the different modeling options. However, none of these models can include all available data. To address this gap, we propose a new model called Weakly-Supervised Ordinal SVM (WSO-SVM), which is designed to integrate unlabeled tumoral samples and normal brain samples beyond just biopsy (labeled) samples to enhance the model's learning capacity.

Fig. 1 Different data sources that can be leveraged by WSO-SVM and existing ML algorithms.

Fig 2 shows a pipeline of the proposed method whose components are discussed in subsequent sections.

Fig. 2 Pipeline of the proposed method. Left: model training; Right: model deployment.
Fig 3 depicts the biological connection between genetic alterations and these imaging-phenotypic features.These features have been widely used in the radiomics literature for GBM to aid in diagnosis, prognosis, and prediction of genetics-related tumor characteristics, such as genetic subtypes and copy number variations [7][19][24-27].

Fig. 3 Biological connection between genetic alterations and imaging-phenotypic features.

Fig. 6 Contributions of MRI contrast images to the classification of (a) EGFR, (b) PDGFRA, and (c) PTEN.

Fig. 7: EGFR & PDGFRA prediction map (left column) and PTEN prediction map (right column)

Tables 1-3 summarize the average CV performance and standard deviation over 30 repeated experiments for each gene. Fig 5 compares WSO-SVM against the competing algorithm with the best accuracy in each category. WSO-SVM achieved the highest accuracy, sensitivity, and specificity for EGFR and PTEN. For PDGFRA, WSO-SVM achieved the highest accuracy and sensitivity, while its specificity is the second highest after MTL-RF. However, the sensitivity of MTL-RF is very low (only 0.5). Due to the heavy class imbalance for PDGFRA, most existing algorithms struggle to achieve a reasonable sensitivity.

Table 2: Classification performance of PDGFRA using CV based on biopsy samples
* Best competing algorithm in each category
** Overall best competing algorithm
For EGFR (Table 1), WSO-SVM performed significantly better than the overall best competing algorithm in accuracy (p<0.001), sensitivity (p<0.001), and specificity (p=0.002) using a Wilcoxon rank-sum test.
For PDGFRA (Table 2), WSO-SVM performed significantly better than the overall best competing algorithm in accuracy (p=0.04) and sensitivity (p<0.001) using a Wilcoxon rank-sum test.

Table 3: Classification performance of PTEN using CV based on biopsy samples
* Best competing algorithm in each category
** Overall best competing algorithm
Fig. 5 Classification performance of WSO-SVM in comparison with the best competing algorithm in each category. The overall best competing algorithm is highlighted by **.