Radiomics in predicting mutation status for thyroid cancer: A preliminary study using radiomics features for predicting BRAFV600E mutations in papillary thyroid carcinoma

Purpose To evaluate whether ultrasonography (US)-based radiomics enables prediction of the presence of BRAFV600E mutations among patients diagnosed with papillary thyroid carcinoma (PTC). Methods From December 2015 to May 2017, 527 patients who had been treated surgically for PTC were included (training: 387, validation: 140). All patients had BRAFV600E mutation analysis performed on the surgical specimen. Feature extraction was performed using preoperative US images of the 527 patients (mean size of PTC: 16.4 mm ± 7.9, range, 10–85 mm). A Radiomics Score was generated by using the least absolute shrinkage and selection operator (LASSO) regression model. Univariable/multivariable logistic regression analysis was performed to evaluate factors, including the Radiomics Score, in predicting BRAFV600E mutation. Subgroup analysis including conventional PTC <20-mm (n = 389) was performed (training: 280, validation: 109). Results Of the 527 patients diagnosed with PTC, 428 (81.2%) were positive and 99 (18.8%) were negative for BRAFV600E mutation. In both the total 527 cancers and the 389 conventional PTCs <20-mm, Radiomics Score was the single factor showing significant association with the presence of BRAFV600E mutation on multivariable analysis (all P<0.05). C-statistics for the validation set were lower than those for the training set in both the total cancers, 0.629 (95% CI: 0.516–0.742) versus 0.718 (95% CI: 0.650–0.786), and the conventional PTCs <20-mm, 0.567 (95% CI: 0.434–0.699) versus 0.729 (95% CI: 0.632–0.826). Conclusion Radiomics features extracted from US have limited value as a non-invasive biomarker for predicting the presence of BRAFV600E mutation in PTC, regardless of size.


Introduction
During the past decade, the incidence of thyroid cancer has rapidly increased worldwide, regardless of demographic group [1][2][3]. The majority of newly detected thyroid cancers are papillary thyroid cancers (PTC) [4,5], a subtype comprising more than 80% of all differentiated thyroid carcinomas [5,6]. In general, PTC is known to have excellent patient outcomes, with 5-year survival rates approaching 98-99% [7,8], but approximately 10-15% of patients show aggressive tumor behavior, local recurrence or distant metastasis after treatment, or mortality [9][10][11][12]. At present, even with well-known poor prognostic factors such as age over 45 years, male gender, and radioactive iodine resistance [9], it is difficult to predict which patients have more aggressive forms of PTC, and efforts have been made to use various biomarkers to identify PTC patients with poor outcomes.
With advances in molecular genetics, various genetic alterations have been revealed and used as adjunctive diagnostic methods or for predicting patient prognosis [8,13,14]. BRAF V600E mutation, the most frequent oncogenic mutation in PTC, has been reported to be associated with aggressive clinical features such as large tumor size, extrathyroidal extension, and presence of lymph node metastasis [8,14,15,16], leading to recurrence or mortality. However, despite the ability of this mutation to identify aggressive cancer types, genetic analysis requires tissue specimens, which are mostly obtained through invasive surgical procedures. Aside from the information obtained from conventional imaging, radiomics, which converts medical images into high-dimensional, mineable, quantitative imaging features, has been applied to reveal tumor physiology. Other studies have linked imaging features to molecular properties of tumors in various organs [17][18][19][20], but to the best of our knowledge, no studies have applied radiomics to predict the molecular status of thyroid cancer, which could in turn be used to predict tumor aggressiveness. Based on this, we evaluated whether ultrasonography (US)-based radiomics enables prediction of the presence of BRAF V600E mutations among patients diagnosed with PTC.

Materials & methods
This retrospective study has been approved by the institutional review board (IRB) of Severance Hospital, Yonsei University (approval number: 4-2018-0172), with a waiver for patient consent due to the retrospective study design. Signed informed consent was obtained from all patients prior to biopsy or surgical procedures. Images used for data extraction were fully anonymized before data processing according to the instructions of our IRB.

Patients
We included 527 patients who had been treated surgically for cytologically proven or suspicious thyroid cancer between December 2015 and May 2017 at Severance Hospital, Seoul, Korea. All patients had BRAF V600E mutation analysis performed on the surgical specimen. The 387 consecutive patients who had surgery from December 2015 to December 2016 were used as the training cohort: 300 women and 87 men, mean age 42.1 years ± 14.0 (range, 15-82 years).
As the frequency of BRAF V600E mutation has been reported to be associated with tumor size and the conventional PTC subtype [21], subgroup analysis was performed including thyroid cancers confirmed as conventional PTCs measuring <20-mm (n = 389). Mean age of the 389 patients was 42.9 years ± 13.1 (range, 15-80 years). Mean size of the conventional PTCs was 14.9 mm ± 4.6 (range, 10-19 mm). Clinicopathologic data regarding tumor size and lymph node metastasis were obtained from review of medical records. Imaging features of the thyroid masses used for analysis were obtained from an institutional database.

US image selection and feature extraction
One radiologist (J.Y.K.) reviewed the preoperative US examinations of the 527 patients on the picture archiving and communication system (PACS) and selected representative transverse or longitudinal images of the tumor. The selected representative images were converted into JPEG files for manual segmentation. One radiologist (J.H.Y.) with 9 years of experience in thyroid imaging manually drew a region-of-interest (ROI) along the boundary of the selected tumor using the Paint software of Windows (Fig 1). Since marking the ROI with a colored brush in Paint alters the original image intensities, manual ROI segmentation was performed on duplicates of the collected JPEG files. Before ROI extraction, all images were normalized for fair comparison. First, the location of the ROI marking (the coordinates of the red curves in Fig 1) was identified and then applied to the original JPEG image to extract the ROI only. This procedure ensured that the original intensities of the image were not affected by the ROI extraction process. Once the ROI was extracted, a total of 730 features were obtained. The 730 features included first-order statistics (energy, entropy, kurtosis, skewness, and so on), second-order statistics (features derived from the gray level co-occurrence matrix (GLCM) and the gray level run-length matrix (GLRLM)), and features from four discrete one-level wavelet decompositions. The detailed calculation of these features can be found in [22]. In-house code written in MATLAB 2018b was used to compute the features. Here, 256 bins with a bin width of 1 were used for the intensity histogram, and 4 angles of 0, 45, 90, and 135 degrees were used for GLCM and GLRLM analysis.
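As an illustration of the feature families named above, the following is a minimal Python sketch, not the in-house MATLAB code used in this study. It computes four first-order statistics from a 256-bin histogram and a normalized GLCM for each of the four angles. It assumes, for simplicity, a rectangular 8-bit ROI with non-constant intensities and a pixel distance of 1; the function names and the symmetric GLCM convention are choices made for this sketch only.

```python
import numpy as np

def first_order_features(roi):
    """Energy, entropy, skewness and kurtosis of an 8-bit ROI (256 bins, width 1)."""
    x = roi.astype(np.float64).ravel()
    hist = np.bincount(roi.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins before taking the log
    mean, std = x.mean(), x.std()     # assumes std > 0 (non-constant ROI)
    return {
        "energy": float(np.sum(x ** 2)),
        "entropy": float(-np.sum(p * np.log2(p))),
        "skewness": float(np.mean((x - mean) ** 3) / std ** 3),
        "kurtosis": float(np.mean((x - mean) ** 4) / std ** 4),
    }

# (row, column) offsets to the neighbouring pixel for each angle, distance 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(roi, angle, levels=256):
    """Symmetric, normalized gray level co-occurrence matrix for one angle."""
    dr, dc = OFFSETS[angle]
    rows, cols = roi.shape
    # Crop so that every reference pixel has its neighbour inside the ROI.
    r0, r1 = max(0, -dr), rows - max(0, dr)
    c0, c1 = max(0, -dc), cols - max(0, dc)
    ref = roi[r0:r1, c0:c1]
    nbr = roi[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
    m = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(m, (ref.ravel(), nbr.ravel()), 1.0)   # count co-occurring pairs
    m = m + m.T                                     # count each pair in both directions
    return m / m.sum()

roi = np.array([[0, 1], [0, 1]], dtype=np.uint8)    # toy 2x2 "ROI"
features = first_order_features(roi)
matrices = {angle: glcm(roi, angle) for angle in OFFSETS}
```

In practice the ROI is an irregular region, so a mask would be needed to exclude pixels outside the tumor boundary before counting pairs; that step is omitted here.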

BRAF V600E mutational analysis
Direct DNA sequencing of the surgical specimen was used for mutation analysis. Exon 15, which contains the BRAF V600E mutation, was amplified by PCR with the forward primer AGGAAAGCATCTCACCTCATC and the reverse primer GATCACACCTGCCTTAAATTGC. The PCR parameters were as follows: 94°C for 5 minutes, 35 cycles at 94°C for 0.5 minutes, 60°C for 0.5 minutes, and 72°C for 10 minutes. The amplified products were purified with a QIAGEN PCR purification kit and sequenced using the forward primer described above with Big Dye Terminator (ABI Systems, Applied Biosystems, Foster City, CA) and an ABI PRISM 3100 Avant Genetic Analyzer (Perkin-Elmer).

Data & statistical analysis
For feature selection, a LASSO logistic regression model was applied to the 730 texture features extracted from the US images, and a Radiomics Score was calculated for each patient as a linear combination of the selected features weighted by their respective coefficients.
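The selection step can be illustrated with a short Python sketch. This is not the 'glmnet' fit used in the study: glmnet uses coordinate descent, whereas the sketch below substitutes a plain proximal-gradient (ISTA) solver because it fits in a few lines. It assumes feature columns that have already been standardized; the penalty value `lam`, the synthetic data, and all function names are hypothetical.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink each weight towards zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def lasso_logistic(X, y, lam, n_iter=2000):
    """Minimize mean logistic loss + lam * ||w||_1 (intercept not penalized)."""
    n, p = X.shape
    Xa = np.hstack([X, np.ones((n, 1))])            # last column is the intercept
    step = 4.0 * n / np.linalg.norm(Xa, 2) ** 2     # 1 / Lipschitz bound of the gradient
    theta = np.zeros(p + 1)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(Xa @ theta)))
        theta = theta - step * (Xa.T @ (prob - y)) / n
        theta[:p] = soft_threshold(theta[:p], step * lam)
    return theta[:p], theta[p]

def linear_score(X, w, b):
    """Linear combination of features weighted by the fitted coefficients."""
    return X @ w + b

# Synthetic data: 20 standardized features, only the first one is informative.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-3.0 * X[:, 0]))).astype(float)

w, b = lasso_logistic(X, y, lam=0.05)
selected = np.flatnonzero(w)          # indices of features with nonzero coefficients
scores = linear_score(X, w, b)
```

Features whose coefficients are shrunk exactly to zero drop out of the score, which is how a small subset can be retained from a much larger feature pool.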
Univariable and multivariable logistic regression analyses were performed to calculate odds ratios with 95% confidence intervals (CI) for patient age, gender, tumor size, and Radiomics Score. For internal validation, bootstrapping with 1,000 resamples was used. Calibration curves were plotted to assess the calibration of the model built from these factors, together with the Hosmer-Lemeshow test. Harrell's C-index was measured to evaluate the model's discrimination ability.
R software (version 3.4.2, http://www.R-project.org) with the R package 'glmnet' was used for statistical analysis.
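For a binary outcome, the C-index is the probability that a randomly chosen mutation-positive case receives a higher score than a randomly chosen mutation-negative case. The Python sketch below illustrates this quantity together with a percentile bootstrap interval. It is not the R code used in this study, and the percentile method, the function names, and the toy data are assumptions made for illustration.

```python
import numpy as np

def c_statistic(score, y):
    """Fraction of positive/negative pairs ranked correctly (ties count as 0.5)."""
    pos = score[y == 1]
    neg = score[y == 0]
    diff = pos[:, None] - neg[None, :]        # all positive-minus-negative differences
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)

def bootstrap_ci(score, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-statistic."""
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)      # resample cases with replacement
        if y[idx].min() == y[idx].max():
            continue                          # skip resamples containing one class only
        stats.append(c_statistic(score[idx], y[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
score = np.array([0.1, 0.4, 0.35, 0.8])
y = np.array([0, 0, 1, 1])
c = c_statistic(score, y)
ci = bootstrap_ci(score, y, n_boot=200)
```

A value of 0.5 corresponds to chance-level discrimination, which is the reference point for reading the validation-set results.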

Feature selection and calculation of radiomics score
Eight potential features were selected among 730 texture features in the training cohort with nonzero coefficients in the LASSO logistic regression model (Fig 2A and 2B). These 8 texture features were presented in the calculation formula below used to calculate the Radiomics Score,
\[
\begin{aligned}
\text{Radiomics Score (total)} = {} & 0.3715483 - 0.0179227 \times \text{mad\_6\_0} - 0.0202624 \times \text{sv\_43\_0} \\
& - 0.0000068 \times \text{HL\_ene\_1\_0} - 0.0000041 \times \text{HL\_rln\_48\_0} \\
& - 0.0769504 \times \text{LL\_uni\_13\_0} - 0.0013692 \times \text{LL\_lrlgle\_54\_0} \\
& + 0.0025444 \times \text{LL\_se\_42\_45} + 0.5554316 \times \text{LL\_se\_42\_90}
\end{aligned}
\]
For the conventional PTCs measuring <20-mm, 4 potential features were selected among the 730 texture features in the training cohort (Fig 2C and 2D). These 4 texture features were presented in the calculation formula below used to calculate the Radiomics Score (cPTC<20-mm),
\[
\begin{aligned}
\text{Radiomics Score (cPTC<20-mm)} = {} & -2.2001791 + 11.4205518 \times \text{LH\_srlgle\_52\_0} - 0.7666155 \times \text{LL\_uni\_13\_0} \\
& + 0.8461400 \times \text{LL\_se\_42\_90} - 0.0001180 \times \text{LL\_lrhgle\_55\_90}
\end{aligned}
\]
Table 2 summarizes the results of univariable and multivariable logistic regression analysis for predicting the presence of BRAF V600E mutations. In the training cohort of the total thyroid cancers, tumor size and Radiomics Score were factors with statistical significance on univariable analysis. Among the training cohort including conventional PTCs measuring <20-mm, Radiomics Score was the single factor showing statistical significance. In both total cancers and the conventional PTC<20-mm, Radiomics Score was the single factor showing significant association to the presence of BRAF V600E mutation on multivariable analysis (all P<0.05). The calibration curve of the prediction model for the presence of BRAF V600E mutation demonstrated good agreement between prediction and observation in the training cohort among the thyroid cancers. The Hosmer-Lemeshow test yielded statistics of P = 0.502, suggesting good calibration (Fig 3A).
The C-statistic was 0.718 (95% CI: 0.650-0.786) for the training set and 0.629 (95% CI: 0.516-0.742) for the validation set (Table 3). The calibration curve of the prediction model for the presence of BRAF V600E mutation among conventional PTCs <20-mm demonstrated good calibration, with the Hosmer-Lemeshow test yielding P = 0.257 (Fig 3B). The C-statistic among the conventional PTCs <20-mm was 0.729 (95% CI: 0.632-0.826) for the training set and 0.567 (95% CI: 0.434-0.699) for the validation set.
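For readers who wish to apply the Radiomics Score (total) formula reported above, the following Python sketch evaluates it as a plain linear combination. The intercept and coefficients are copied from that formula. The dictionary keys are an assumed reading of the feature names printed there, and the input values are hypothetical placeholders rather than real patient features.

```python
# Intercept and coefficients copied from the Radiomics Score (total) formula.
# The feature keys are an assumed reading of the names printed in that formula.
INTERCEPT_TOTAL = 0.3715483
COEF_TOTAL = {
    "mad_6_0": -0.0179227,
    "sv_43_0": -0.0202624,
    "HL_ene_1_0": -0.0000068,
    "HL_rln_48_0": -0.0000041,
    "LL_uni_13_0": -0.0769504,
    "LL_lrlgle_54_0": -0.0013692,
    "LL_se_42_45": 0.0025444,
    "LL_se_42_90": 0.5554316,
}

def radiomics_score_total(features):
    """Intercept plus the coefficient-weighted sum of the eight selected features."""
    missing = sorted(set(COEF_TOTAL) - set(features))
    if missing:
        raise KeyError(f"missing features: {missing}")
    return INTERCEPT_TOTAL + sum(c * features[k] for k, c in COEF_TOTAL.items())

# Hypothetical inputs: all features zero, then one feature set to 1.
zeros = {name: 0.0 for name in COEF_TOTAL}
example = dict(zeros, LL_se_42_90=1.0)
print(radiomics_score_total(zeros))
print(radiomics_score_total(example))
```

The features must be computed with the same normalization and extraction settings as in the training cohort for the score to be comparable.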

Discussion
One major challenge in thyroid cancer is how to distinguish patients who need aggressive treatment to survive from those who do not. There are no consistent predictors that reliably identify aggressive PTCs. In addition to the lack of prospective data on appropriate treatment for PTC, owing to its generally excellent survival [9], issues regarding overtreatment of low-risk patients who will not experience PTC-related mortality have surfaced and been debated in recent years. This reflects the need for a more effective and accurate biomarker for predicting aggressive PTCs, including molecular analysis such as BRAF V600E mutation testing. Mutation analysis requires invasive procedures such as biopsy or surgical resection to retrieve the specimen to be analyzed. Among non-invasive imaging biomarkers, radiomics is an emerging method that has the potential to predict molecular characteristics of tumors, using quantitative imaging features extracted with data-characterization algorithms. The most widely used imaging modalities in radiomics have been computed tomography (CT) and magnetic resonance imaging (MRI); however, US is the most sensitive and accurate imaging modality for the thyroid, and it is the modality we used in this study.
For feature selection in obtaining a Radiomics Score, the LASSO logistic regression model was used, which enables selecting features based on their strength of association on univariable analysis and combining the selected features into a radiomics signature [23]. The Radiomics Score showed significant association with the presence of BRAF V600E mutation on both univariable and multivariable analysis, and it was the single factor that remained significant on multivariable analysis. Among the clinical variables, higher rates of BRAF V600E mutation were seen in thyroid cancers of smaller size, which showed significant association on univariable analysis. As tumor size and subtype of thyroid cancer have been reported to be associated with the presence of BRAF V600E mutation [21], we performed a subgroup analysis using a separate Radiomics Score calculated from the 730 texture features in a subset of 389 thyroid cancers confirmed as conventional PTC<20-mm. With the Radiomics Score (cPTC<20-mm), results were similar to those for the total 527 PTCs: Radiomics Score (cPTC<20-mm) was the single factor showing significance on both univariable and multivariable analysis (all P<0.001), with a C-statistic of 0.729 (95% CI: 0.632-0.826) for the training set and a lower value of 0.567 (95% CI: 0.434-0.699) for the validation set. This supports that US radiomics has limited value in predicting BRAF V600E mutation in PTC patients, regardless of size.
There are several limitations to this study. First, as the mutation analysis was performed in a selected group of PTC patients, the results of our study do not represent the mutation features of the general thyroid cancer population. Second, 81.2% of the PTCs in this study were positive for BRAF V600E mutation, which may have affected our results. PTCs in our population are known for their high prevalence of BRAF V600E mutation [26], and results may differ in other populations. Last, US images were used for feature extraction in obtaining a Radiomics Score for predicting the presence of BRAF V600E mutation. The inherent observer variability of US compared to computed tomography (CT) or magnetic resonance imaging (MRI) may have affected our results, but since US is currently the generally applied imaging modality for detecting and differentiating thyroid nodules, US images may be the more appropriate source of radiomics data in thyroid imaging. Also, ROIs for feature extraction were drawn by one radiologist, and observer variability among different radiologists was not considered in data analysis.
In conclusion, our results show that radiomics features extracted from US have limited value as a non-invasive biomarker for predicting the presence of BRAF V600E mutation in PTC, regardless of size.