
Optimization and validation of 18F-DCFPyL PET radiomics-based machine learning models in intermediate- to high-risk primary prostate cancer

  • Wietske I. Luining ,

    Roles Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing

    w.i.luining@amsterdamumc.nl

    Affiliations Department of Urology, Amsterdam University Medical Centers, Prostate Cancer Network Netherlands, Amsterdam, The Netherlands, Department of Radiology & Nuclear Medicine, Amsterdam University Medical Centers, Cancer Center Amsterdam, Amsterdam, The Netherlands

  • Daniela E. Oprea-Lager,

    Roles Supervision, Writing – review & editing

    Affiliation Department of Radiology & Nuclear Medicine, Amsterdam University Medical Centers, Cancer Center Amsterdam, Amsterdam, The Netherlands

  • André N. Vis,

    Roles Supervision, Writing – review & editing

    Affiliation Department of Urology, Amsterdam University Medical Centers, Prostate Cancer Network Netherlands, Amsterdam, The Netherlands

  • Reindert J. A. van Moorselaar,

    Roles Writing – review & editing

    Affiliation Department of Urology, Amsterdam University Medical Centers, Prostate Cancer Network Netherlands, Amsterdam, The Netherlands

  • Remco J. J. Knol,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Nuclear Medicine, Northwest Clinics, Alkmaar, The Netherlands

  • Maurits Wondergem,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Nuclear Medicine, Northwest Clinics, Alkmaar, The Netherlands

  • Ronald Boellaard,

    Roles Conceptualization, Formal analysis, Methodology, Resources, Software, Supervision, Writing – review & editing

    Affiliation Department of Radiology & Nuclear Medicine, Amsterdam University Medical Centers, Cancer Center Amsterdam, Amsterdam, The Netherlands

  • Matthijs C. F. Cysouw

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Software, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Department of Radiology & Nuclear Medicine, Amsterdam University Medical Centers, Cancer Center Amsterdam, Amsterdam, The Netherlands

Abstract

Introduction

Radiomics extracted from prostate-specific membrane antigen (PSMA)-PET modeled with machine learning (ML) may be used for prediction of disease risk. However, validation of previously proposed approaches is lacking. We aimed to optimize and validate ML models based on 18F-DCFPyL-PET radiomics for the prediction of lymph-node involvement (LNI), extracapsular extension (ECE), and postoperative Gleason score (GS) in primary prostate cancer (PCa) patients.

Methods

Patients with intermediate- to high-risk PCa who underwent 18F-DCFPyL-PET/CT before radical prostatectomy with pelvic lymph-node dissection were evaluated. The training dataset included 72 patients, the internal validation dataset 24 patients, and the external validation dataset 27 patients. PSMA-avid intra-prostatic lesions were delineated semi-automatically on PET and 480 radiomics features were extracted. Conventional PET-metrics were derived for comparative analysis. Segmentation, preprocessing, and ML methods were optimized in repeated 5-fold cross-validation (CV) on the training dataset. The trained models were tested on the combined validation dataset. Combat harmonization was applied to external radiomics data. Model performance was assessed using the area under the receiver-operating-characteristic curve (AUC).

Results

The CV-AUCs in the training dataset were 0.88, 0.79 and 0.84 for LNI, ECE, and GS, respectively. In the combined validation dataset, the ML models could significantly predict GS with an AUC of 0.78 (p<0.05). However, validation AUCs for LNI and ECE prediction were not significant (0.57 and 0.63, respectively). Conventional PET metrics-based models had comparable AUCs for LNI (0.59, p>0.05) and ECE (0.66, p>0.05), but a lower AUC for GS (0.73, p<0.05). In general, Combat harmonization improved external validation AUCs (-0.03 to +0.18).

Conclusion

In internal and external validation, 18F-DCFPyL-PET radiomics-based ML models predicted high postoperative GS but not LNI or ECE in intermediate- to high-risk PCa. Therefore, the clinical benefit seems to be limited. These results underline the need for external and/or multicenter validation of PET radiomics-based ML model analyses to assess their generalizability.

Introduction

Risk stratification of primary prostate cancer (PCa) is important for determining prognosis and selecting optimal treatment strategies. Currently, this is based on clinical tumor stage (cT-stage), prostate-specific antigen (PSA) serum level, and the International Society of Urologic Pathology (ISUP) score in prostate biopsies [1]. These data are typically included in preoperative nomograms in patients without distant metastatic disease on conventional imaging (i.e., bone scintigraphy and computed tomography (CT)) and, more recently, prostate-specific membrane antigen (PSMA) positron emission tomography (PET) [2, 3]. These nomograms calculate the risk of lymph node involvement (LNI) and determine if extended pelvic lymph node dissection (ePLND) is indicated. However, ePLND remains the gold standard for determining lymph node status [1].

In recent years, modern imaging modalities, such as multiparametric magnetic resonance imaging (mpMRI) and PSMA-PET/CT, have been implemented in PCa care. In parallel with the growth of artificial intelligence (AI) in medical imaging, this has triggered many investigations into AI-based image analysis for tumor staging and risk classification in primary PCa. One such approach is the use of machine learning (ML) models applied to radiomics data from prostate MRI or PSMA-PET/CT [4–7]. In radiomics, high-dimensional data are extracted from radiological images. ML algorithms can be trained to transform these radiomics data into clinically applicable predictions [8, 9].

In a previous analysis, we performed a machine learning-based analysis of 18F-DCFPyL (an 18F-labeled PSMA radioligand) PET/CT radiomics in patients with intermediate- to high-risk PCa scheduled for robot-assisted radical prostatectomy (RARP) with ePLND to predict LNI and high-risk tumor features, and observed excellent cross-validated predictive values [10]. However, this approach has not yet been validated to confirm its generalizability.

A known limitation of radiomics analyses is the often limited reproducibility caused by sensitivity to differences in imaging procedures (e.g., acquisition, reconstruction, segmentation) and factors related to radiomics calculation [9]. Therefore, in this study, we optimized and validated machine learning models based on 18F-DCFPyL PET radiomics, including Combat harmonization to compensate for possible center effects, in patients with intermediate- to high-risk primary PCa scheduled for RARP with ePLND [10].

Patients and methods

Patients

Patients with newly diagnosed, biopsy-proven intermediate- to high-risk PCa who underwent 18F-DCFPyL PET/CT imaging prior to RARP and ePLND were included. ePLND was performed if there were high-risk factors (i.e., serum PSA level >20 ng/mL, ISUP score ≥4, clinical or radiological tumor stage ≥3) or a nomogram-based risk of LNI of 8% or greater [1]. Patients underwent 18F-DCFPyL PET/CT at two centers in the Netherlands (Amsterdam University Medical Center (Amsterdam UMC) and the Northwest Clinics (NWZ) Alkmaar) between April 2017 and June 2020. Baseline clinical and pathology data were registered (i.e., PSA level, clinical tumor stage, biopsy ISUP score, and the percentage of positive prostate biopsies).

Training and validation datasets

We used a training dataset from Amsterdam UMC (n = 72 of the 76 patients from the previously published internal cross-validation (CV), excluding 4 patients with distant metastases [10]) to optimize methodologies and train the machine learning models. The models were applied to internal (n = 24) and external (n = 27) validation datasets from Amsterdam UMC and NWZ, respectively. This resulted in a combined validation dataset of 51 patients.

Preoperative clinical and pathology data for baseline models

The radiomics-based machine learning models were compared to clinically used methods (henceforth referred to as baseline models) to assess their added value. The Memorial Sloan Kettering Cancer Center (MSKCC) nomogram was applied to calculate the risk of LNI and ECE and the corresponding AUCs [2]. The AUC of the biopsy Gleason score (GS) baseline model was calculated by using the biopsy GS as input.

Postoperative pathology data

The surgical tissue specimens of the prostate and lymph nodes were examined according to the international guidelines of uropathologists for primary tumor and nodal staging [1]. The three reference outcomes selected for analysis were dichotomized: postoperative GS (<8 vs. ≥8), presence of ECE (≤pT2b vs. ≥pT3a), and LNI (pN0 vs. pN1).

Image acquisition and interpretation

PET input data (i.e., dose, calibration, injection, and scan times) were collected. At the Amsterdam UMC, patients were scanned on a time-of-flight PET/CT-scanner (Ingenuity or Vereos, Philips Healthcare) with European Association of Nuclear Medicine (EANM) Research Ltd. (EARL) accreditation [11]. Whole-body PET-images were obtained from mid-thigh to skull base at 4 minutes per bed position, at 119 minutes (interquartile range (IQR) 116–125) post-injection (p.i.) of a median dose of 308 MBq (IQR 293–318) 18F-DCFPyL. Data were reconstructed using the vendor-provided iterative reconstruction algorithm with 3 iterations and 33 subsets and 4 mm isotropic voxels (EARL-compliant). PET-images were combined with a non-contrast-enhanced low-dose CT-scan (30–110 mAs at 120 kV).

At the NWZ, scans were performed on a Biograph-16 TruePoint PET/CT-scanner (Siemens Healthcare). At 114 minutes (IQR 110–122) p.i. of a median activity of 302 MBq (IQR 270–338, depending on body mass) 18F-DCFPyL, whole-body PET from mid-thigh to skull base was acquired. The acquisition time was 5 minutes per bed position. Images were reconstructed using iterative ordered-subset expectation maximization reconstruction (4 iterations, 16 subsets) with 2.67x2.67x4.0 mm voxels. PET-scans were combined with a contrast-enhanced CT-scan (110 mAs at 130 kV).

PET-images were corrected for scatter, decay, randoms, and attenuation. In addition to the originally reconstructed images, the Lucy-Richardson iterative deconvolution was applied for partial volume correction (PVC) [12].
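
For illustration, below is a minimal sketch of image-based partial volume correction via Lucy-Richardson (Richardson-Lucy) deconvolution, assuming a Gaussian point-spread function and using the scikit-image implementation; the FWHM, kernel size, and iteration count are illustrative placeholders rather than the settings used in this study.

```python
# Minimal sketch: Richardson-Lucy deconvolution of an SUV-scaled PET volume,
# assuming a Gaussian point-spread function (PSF). FWHM, kernel size, and the
# number of iterations are illustrative placeholders.
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.restoration import richardson_lucy

def gaussian_psf(fwhm_mm, voxel_mm, size=9):
    """Build a 3D Gaussian PSF with the given FWHM (mm) on the voxel grid."""
    sigma_vox = fwhm_mm / (2.355 * voxel_mm)  # FWHM -> sigma, in voxels
    psf = np.zeros((size, size, size))
    psf[size // 2, size // 2, size // 2] = 1.0
    psf = gaussian_filter(psf, sigma_vox)
    return psf / psf.sum()

def partial_volume_correct(pet_suv, fwhm_mm=7.0, voxel_mm=4.0, n_iter=10):
    """Apply Richardson-Lucy deconvolution to a 3D SUV volume."""
    psf = gaussian_psf(fwhm_mm, voxel_mm)
    # clip=False keeps SUV values outside the default [-1, 1] range
    return richardson_lucy(pet_suv, psf, n_iter, clip=False)
```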

Tumor segmentation

The latest version of the ACCURATE tool was used for tumor segmentation on PET-images [13]. A mask was manually drawn around the PSMA-avid prostatic lesion on PET to prevent the inclusion of activity from other structures, such as the urinary bladder or rectum. The masks were reviewed by a second observer. If necessary, the masks were jointly adjusted. Subsequently, the tumor was delineated semi-automatically using a region-growing isocontour corresponding to 50%, 55%, 60%, 65%, and 70% of the peak Standardized Uptake Value (SUV) with or without correction for local background uptake. The original and PVC images were delineated separately.
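
For illustration, the sketch below shows one way a peak-relative, region-growing isocontour of this kind can be implemented; it is a simplified stand-in for (not a reproduction of) the ACCURATE tool, and the SUVpeak definition and background handling are assumptions.

```python
# Minimal sketch: region-growing isocontour at a fraction of SUVpeak inside a
# manually drawn mask, with optional local-background correction. Simplified
# illustration; not the ACCURATE implementation.
import numpy as np
from scipy import ndimage

def isocontour_segmentation(suv, mask, frac=0.5, background=0.0):
    """suv: 3D SUV image; mask: boolean search region (manual box);
    frac: isocontour level (e.g. 0.5 = 50% of SUVpeak);
    background: local background SUV (0.0 = no background correction)."""
    roi = np.where(mask, suv, 0.0)
    peak_idx = np.unravel_index(np.argmax(roi), roi.shape)
    # Simplified SUVpeak: mean over a 3x3x3 neighbourhood around the hottest voxel
    nb = tuple(slice(max(i - 1, 0), i + 2) for i in peak_idx)
    suv_peak = roi[nb].mean()
    threshold = frac * (suv_peak - background) + background
    above = (roi >= threshold) & mask
    # Region growing: keep only the connected component containing the peak voxel
    labels, _ = ndimage.label(above)
    return labels == labels[peak_idx]
```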

Radiomics feature extraction

Radiomics features were calculated from the segmented region using the Image Biomarker Standardization Initiative (IBSI)-compliant RaCaT software [14, 15]. Images were resampled to 2 mm isotropic voxels using tri-linear interpolation with aligned edges [16, 17], and voxel intensities were scaled to SUV. Before calculating textural features, discretization was applied with a fixed bin width of 0.25 SUV units. A total of 480 radiomics features on texture (n = 408), intensity (n = 50) and morphology (n = 22) were extracted per patient. Additionally, conventional PET metrics (e.g., SUV values, volume, and total PSMA uptake) were derived for comparative analysis.
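
As an illustration of these preprocessing steps, the sketch below shows isotropic resampling with tri-linear interpolation and fixed-bin-width discretization at 0.25 SUV; it assumes a bin origin of 0 SUV and stands in for, rather than reproduces, the RaCaT implementation.

```python
# Minimal sketch: resampling to a 2 mm isotropic grid and fixed-bin-width
# discretisation of SUV values, assuming a bin origin of 0 SUV (illustrative,
# not the RaCaT implementation).
import numpy as np
from scipy.ndimage import zoom

def resample_isotropic(suv, voxel_mm, target_mm=2.0):
    """Resample a 3D SUV volume to isotropic voxels (tri-linear interpolation)."""
    factors = [v / target_mm for v in voxel_mm]  # e.g. (4, 4, 4) mm -> 2 mm
    return zoom(suv, factors, order=1)  # order=1 -> tri-linear

def discretise_fixed_bin_width(suv_values, bin_width=0.25):
    """Map SUV values to 1-based integer bin numbers with a fixed bin width."""
    return np.floor(np.asarray(suv_values) / bin_width).astype(int) + 1
```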

Machine learning

Random Forest (RF) and Logistic Regression (LR) were used as machine learning classifiers. For data normalization, we applied i) z-score standardization and ii) Yeo-Johnson transformation [18]. Dimension reduction was performed using i) Principal Component Analysis, ii) Recursive Feature Elimination, iii) univariate feature selection, and iv) the Least Absolute Shrinkage and Selection Operator [19]. The Synthetic Minority Oversampling Technique (SMOTE) was applied to correct for data imbalance [20]. To mitigate the center effect, Combat harmonization (using the neuroCombat implementation) was applied to the external radiomics data, using the training data as reference dataset [21, 22]. Also, the impact of filtering features (based on linear correlations) before dimension reduction was evaluated. Lastly, the influence of adding clinical parameters (initial PSA and biopsy ISUP score) to the data after radiomics feature selection was evaluated.
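
A minimal sketch of one such candidate configuration, chaining Yeo-Johnson normalization, LASSO-based feature selection, SMOTE oversampling, and a Random Forest classifier with scikit-learn and imbalanced-learn, is shown below; the hyperparameter values are illustrative and do not correspond to the configurations reported in Table 1.

```python
# Minimal sketch of one candidate configuration: Yeo-Johnson normalisation,
# LASSO-based feature selection, SMOTE oversampling (applied to training folds
# only), and a random forest classifier. Hyperparameters are illustrative.
from sklearn.preprocessing import PowerTransformer
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.ensemble import RandomForestClassifier
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline  # sampler-aware pipeline

pipeline = Pipeline(steps=[
    ("normalise", PowerTransformer(method="yeo-johnson")),
    ("select", SelectFromModel(Lasso(alpha=0.01, max_iter=10000))),
    ("oversample", SMOTE(random_state=42)),
    ("classify", RandomForestClassifier(n_estimators=500, random_state=42)),
])
# Usage: pipeline.fit(X_train, y_train); pipeline.predict_proba(X_val)[:, 1]
```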

The optimal combination of tumor segmentation settings, use of PVC, and machine learning methods (data normalization, dimension reduction, oversampling) was selected using 10-times repeated 5-fold CV on the training dataset. Model hyperparameters were optimized in nested cross-validation. The models with optimal machine learning configurations were then trained on the entire training dataset and tested on the validation datasets (settings in Table 1).
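
This selection scheme can be sketched as follows, with 10-times-repeated 5-fold cross-validation in the outer loop and grid-search hyperparameter tuning in the inner loop; the synthetic feature matrix and parameter grid are placeholders standing in for the radiomics data and the grids actually searched.

```python
# Minimal sketch: nested cross-validation with a 10x-repeated 5-fold outer loop
# (CV-AUC estimation) and a 5-fold inner loop (hyperparameter tuning).
# The synthetic data and parameter grid are placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import (GridSearchCV, RepeatedStratifiedKFold,
                                     StratifiedKFold, cross_val_score)

# Placeholder data standing in for the radiomics feature matrix and labels.
X_train, y_train = make_classification(n_samples=72, n_features=480,
                                       n_informative=10, random_state=0)

param_grid = {"max_depth": [3, 5, None], "n_estimators": [200, 500]}
inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
outer_cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)

# Inner loop tunes hyperparameters; outer loop estimates the CV-AUC.
tuned_model = GridSearchCV(RandomForestClassifier(random_state=0),
                           param_grid, scoring="roc_auc", cv=inner_cv)
cv_auc = cross_val_score(tuned_model, X_train, y_train,
                         scoring="roc_auc", cv=outer_cv).mean()
# After model selection, the chosen configuration is refit on the full training set.
```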

Table 1. The optimal machine learning configuration based on the cross-validated AUCs of the training dataset.

https://doi.org/10.1371/journal.pone.0293672.t001

Statistical analysis

The area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, and specificity were calculated to assess model performance. In CV, the mean AUC (CV-AUC) was calculated. In each CV iteration, the probability threshold at the Youden index was determined. The mean probability threshold was then used to calculate sensitivity and specificity in the validation datasets. The DeLong test was performed to compare the AUCs of different models applied to the validation data [23]. Machine learning analyses were executed in Python 3.7 using the scikit-learn package [24]. Statistical analysis was performed using SPSS version 28 and GraphPad Prism version 9. Statistical significance was considered at p<0.05.
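
For illustration, the sketch below shows how a Youden-index probability threshold can be derived from a ROC curve in one CV fold and subsequently applied to validation data to obtain sensitivity and specificity; variable names are placeholders.

```python
# Minimal sketch: probability threshold at the Youden index, and sensitivity/
# specificity at a given threshold. Variable names are placeholders.
import numpy as np
from sklearn.metrics import roc_curve

def youden_threshold(y_true, y_score):
    """Return the probability threshold maximising sensitivity + specificity - 1."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    return thresholds[np.argmax(tpr - fpr)]

def sensitivity_specificity(y_true, y_score, threshold):
    """Dichotomise predicted probabilities at `threshold` and compute both metrics."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return tp / (tp + fn), tn / (tn + fp)
```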

Results

Patients and clinical characteristics

Pre- and postoperative clinicopathological characteristics are listed in Tables 2 and 3, respectively. The distribution of intermediate- versus high-risk disease according to the European Association of Urology (EAU) guidelines was 38.9% (28/72) versus 61.1% (44/72) in the training dataset and 27.5% (14/51) versus 72.5% (37/51) in the combined validation dataset [1].

Table 2. Preoperative characteristics of all included patients per dataset.

https://doi.org/10.1371/journal.pone.0293672.t002

Table 3. Final histopathological results of all included patients who underwent radical prostatectomy and extended pelvic lymph node dissection per dataset.

https://doi.org/10.1371/journal.pone.0293672.t003

Model validation

The highest CV-AUCs in the training dataset were 0.88 for LNI, 0.79 for ECE, and 0.84 for GS (all using the Random Forest classifier; see Table 1 for configurations). The radiomics models outperformed the conventional metrics models in CV (LNI 0.78, ECE 0.72, and GS 0.80). Compared to the CV-AUCs, in the combined validation dataset the radiomics-based machine learning models yielded lower, non-significant AUCs for LNI (0.57) and ECE (0.63), but a significant AUC of 0.78 for GS (p<0.05). In the combined validation dataset, the models based on conventional PET metrics had similar AUCs for LNI (0.59, p>0.05) and ECE (0.66, p>0.05), and a lower, but significant AUC for GS (0.73, p<0.05). No significant differences were found between the radiomics- and conventional PET metrics-based AUCs in the combined validation dataset. The AUCs, sensitivity, and specificity of the radiomics-based machine learning models are listed in Table 4. The ROC-curves for these analyses are given in Fig 1. The AUCs of the different machine learning classifiers with/without Combat harmonization or the addition of clinical parameters for predicting LNI, ECE, and GS are summarized in S1 Table.

Fig 1. ROC-curves of the machine learning models using radiomics features or conventional PET-metrics for predicting lymph node involvement (LNI; A,B), extracapsular extension (ECE; C,D), and Gleason score (GS; E,F) in the training dataset or combined validation cohort.

Training dataset AUC is the mean cross-validation AUC.

https://doi.org/10.1371/journal.pone.0293672.g001

Table 4. The AUC, sensitivity and specificity of the radiomics-based models per outcome.

https://doi.org/10.1371/journal.pone.0293672.t004

Impact of combat harmonization

The effect of Combat harmonization on the performance of the radiomics-based machine learning models in the external validation dataset for LNI, ECE, and GS predictions is illustrated in Fig 2. In general, Combat harmonization improved the external validation AUCs with a range of -0.03 to +0.18.

Fig 2. The effect of Combat harmonization in the external validation dataset on model performance per machine learning classifier for lymph node involvement (LNI), extracapsular extension (ECE), and Gleason score (GS) prediction.

https://doi.org/10.1371/journal.pone.0293672.g002

Impact of adding clinical data

Adding clinical parameters (initial PSA and biopsy ISUP score) to the radiomics data after feature selection did not improve model performance for LNI, ECE, or GS prediction for the combined validation dataset (Fig 3).

Fig 3. The effect of adding clinical parameters to radiomics data on model performance for the combined validation dataset.

https://doi.org/10.1371/journal.pone.0293672.g003

Comparison with baseline models

The AUC of the MSKCC-nomogram for predicting LNI was 0.68 (p>0.05) in the training dataset and 0.51 (p>0.05) in the combined validation dataset. The AUC of the MSKCC-nomogram for predicting ECE was 0.68 (p<0.05) in the training dataset and 0.69 (p<0.05) in the combined validation dataset. The AUC of the biopsy GS baseline model for predicting final GS was 0.80 (p<0.05) in the training and 0.85 (p<0.05) in the combined validation dataset. The ROC-curves of the baseline models are illustrated in Fig 4.

Fig 4. ROC-curves of the baseline models in the training dataset and the combined validation dataset.

The pre-radical prostatectomy MSKCC-nomogram for lymph node involvement (LNI; A) and extracapsular extension (ECE; B) prediction and the biopsy baseline model for postoperative Gleason score (GS; C) prediction.

https://doi.org/10.1371/journal.pone.0293672.g004

Discussion

In this study, we optimized and validated the performance of a previously proposed machine-learning approach using 18F-DCFPyL PET radiomics for predicting high-risk tumor features in patients with intermediate- to high-risk PCa in a multicenter dataset. In this validation, we found that 18F-DCFPyL PET radiomics could predict high postoperative GS but not LNI and ECE. Therefore, the clinical benefit seems to be limited. Still, this implies that radiomics-based machine learning models could enhance preoperative GS determination in clinical practice and potentially help clinicians in their daily decision-making process. The discrepancy between positive results from internal cross-validation and negative results from multicenter validation emphasizes the importance of such validation studies for radiomics analyses.

Newly diagnosed patients with PCa are typically stratified into three risk groups (i.e., low-, intermediate-, and high-risk PCa) based on PSA level, clinical T-stage, and biopsy GS [1]. Accurate prediction of the GS is essential for treatment guidance, to avoid over- or undertreatment. Not all patients undergo radical prostatectomy, so the GS obtained with biopsy is usually used in daily practice despite the known discordance between biopsy and prostatectomy GS [25, 26]. However, since the introduction of targeted prostate biopsies, pathological upgrading at prostatectomy has decreased significantly. A recent systematic review and meta-analysis by Goel et al. evaluated the concordance between systematic or MRI-targeted prostate biopsies and prostatectomy GS. They found that pathologic upgrading at prostatectomy was significantly less frequent with targeted biopsy (23.3%) compared to systematic biopsy alone (42.7%), without significant differences in pathological downgrading between the two biopsy techniques [26]. In our present study, the radiomics-based models predicted postoperative GS with an AUC of 0.78, and the AUC of our baseline model was 0.85. The high AUC of the baseline model might be explained by the improved accuracy of (targeted) prostate biopsies in determining the GS. However, biopsies are subsamples of the prostate, making them subject to sampling error, and minor complications (i.e., bleeding or infection) are frequent. Therefore, radiomics-based machine learning models could be a helpful non-invasive tool to assist in GS determination in patients with primary PCa.

Several studies have assessed the value of PSMA-PET radiomics to predict GS in PCa [27–31]. However, multicenter validation is often lacking [32]. Zamboglou et al. reported a similar AUC for GS discrimination (GS 7 vs. ≥8) with 68Ga-PSMA-PET radiomics in their internal validation cohort [31]. Papp et al. found that radiomics combined with machine learning could discriminate between low- and high-risk PCa lesions on 68Ga-PSMA-PET/MR [28]. The use of 68Ga-PSMA-PET/MR radiomics was also investigated by Solari et al., who reported an AUC of 0.75 for predicting postsurgical GS with their PET radiomics model. Their method differed from ours in that they segmented the whole prostate gland and trained their model to predict three instead of two GS categories (GS <8, 8, and >8) [29]. Recently, Aksu et al. evaluated whether 68Ga-PSMA-PET/CT radiomics (without machine learning) could predict GS ≥8 and found an AUC of 0.90 [27]. Using 18F-PSMA-1007 PET radiomics, Yao et al. found a similar predictive performance for GS characterization (training AUC 0.82; testing AUC 0.80) [30]. In summary, our results are consistent with previous research, suggesting that PSMA-PET radiomics could be of added value for the preoperative prediction of postoperative GS.

In our training dataset, we found high cross-validation prediction scores for each outcome, whereas in the validation dataset, the radiomics-based models could not predict LNI and ECE (AUCs of 0.57 and 0.63). The same applied to the models using conventional PET metrics as input, implying that the issue is not merely related to the robustness of texture-based radiomics features. Also, two baseline models (MSKCC nomogram predictions for LNI and ECE) reached higher or equal AUCs in the training dataset compared to the validation cohort. Taken together, this may indicate that the low prediction AUCs in the (combined internal and external) validation datasets are due to inherent characteristics of the validation dataset instead of merely overfitted or biased models. In contrast, the performance of the biopsy baseline models to predict postoperative GS reached higher AUCs in the validation dataset compared to the training dataset. This prediction is based only on the biopsy GS, which relies on various factors such as the biopsy technique (systematic versus targeted), the number of biopsies taken, and the physician’s experience performing the procedure.

As is well known, radiomics features (especially those based on texture) are sensitive to variations in PET systems, image acquisition protocols, reconstruction settings, post-processing, and segmentation methods. As the purpose of this study was a multicenter validation, the validation dataset consisted of patients from two different institutions with differences in PET systems, reconstruction software, and imaging protocols. A radiomics-based machine learning model trained with data from one center may not apply directly to data from another. Therefore, to minimize the influence of a center effect, Combat harmonization was applied to harmonize the radiomics features. Indeed, this improved our AUCs in the external validation dataset. To date, the use of Combat harmonization in PCa radiomics analyses is limited. However, the Combat harmonization method has been used in several contexts, and its efficacy in PET imaging has been demonstrated [22, 33–35].
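
To illustrate the underlying idea, the sketch below performs a simplified, ComBat-like location-scale alignment of external radiomics features to the training (reference) data; it deliberately omits the empirical-Bayes shrinkage and covariate adjustment of the actual neuroCombat implementation used in this study.

```python
# Minimal sketch: ComBat-like location-scale harmonisation of an external batch
# to a reference (training) batch, per feature. Simplified illustration without
# the empirical-Bayes shrinkage or covariate adjustment of neuroCombat.
import numpy as np

def harmonise_to_reference(X_ref, X_ext, eps=1e-8):
    """Align the mean and standard deviation of each external feature (column)
    to those observed in the reference (training) dataset."""
    ref_mean, ref_std = X_ref.mean(axis=0), X_ref.std(axis=0)
    ext_mean, ext_std = X_ext.mean(axis=0), X_ext.std(axis=0)
    return (X_ext - ext_mean) / (ext_std + eps) * ref_std + ref_mean
```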

Our study has several limitations. First, most patients had high-risk disease according to the EAU guidelines, and our results should also be validated in patient cohorts with lower-risk disease. Second, we only selected patients who had undergone radical prostatectomy, as we chose postoperative GS as one of the reference outcomes. However, patients undergoing radiotherapy instead of radical prostatectomy are not expected to differ much from the current population. Third, the anatomical locations of the prostate tumors delineated on PET were not directly compared to the prostatectomy specimen. Less PSMA-avid lesions on PET may therefore be missed, for example in multifocal tumors. In clinical practice, however, anatomical information from the prostatectomy specimen is not available at preoperative imaging. We chose to delineate the most PSMA-avid foci within the prostate based on semi-automatic lesion segmentation. This could impact the radiomics feature values and, consequently, our results. Still, our results are in line with other data where whole-prostate radiomics were derived [29]. Also, semi-automatic segmentation analyses suffer less from observer variability than manual segmentations [36]. In future prospective research, additional information on disease recurrence during follow-up may help evaluate the utility of the machine-learning algorithm in treating patients with PCa.

Conclusion

The 18F-DCFPyL radiomics-based machine learning models could predict high postoperative GS but not LNI or ECE. Still, these results are in line with the baseline models. This implies that radiomics-based machine learning models could enhance preoperative GS determination in clinical practice and potentially help clinicians in their daily decision-making process. Furthermore, these results underline the need for external and/or multicenter validation of PET radiomics-based machine learning model analyses to investigate their reproducibility and clinical applicability.

Supporting information

S1 Table. The different AUCs for the prediction of lymph node involvement (LNI), extracapsular extension (ECE), and Gleason score (GS) per dataset with or without the addition of clinical features, omission of correlated radiomics features, and application of Combat harmonization.

AUC = area under the curve; RF = random forest; LR = logistic regression; LNI = lymph-node involvement; GS = Gleason score; ECE = extracapsular extension; SD = standard deviation; int. = internal; ext. = external; val. = validation; RFE = Recursive Feature Elimination; LASSO = Least Absolute Shrinkage and Selection Operator. *The number of radiomics features remaining after dimension reduction.

https://doi.org/10.1371/journal.pone.0293672.s001

(DOCX)

S2 Table. Selected features after dimension reduction for the prediction of lymph node involvement (LNI), extracapsular extension (ECE), and Gleason score (GS).

Derived from a Random Forest and Logistic Regression using different machine-learning configurations (Table 1).

https://doi.org/10.1371/journal.pone.0293672.s002

(DOCX)

References

  1. EAU Guidelines. Edn. presented at the EAU Annual Congress Amsterdam 2022. ISBN 978-94-92671-16-5. https://uroweb.org/ 2022.
  2. Memorial Sloan Kettering Cancer Centre. Prostate cancer nomograms: pre-radical prostatectomy. 2020.
  3. Gandaglia G, Fossati N, Zaffuto E, Bandini M, Dell’Oglio P, Bravi CA, et al. Development and Internal Validation of a Novel Model to Identify the Candidates for Extended Pelvic Lymph Node Dissection in Prostate Cancer. Eur Urol. 2017;72:632–40. pmid:28412062
  4. Ghezzo S, Bezzi C, Presotto L, Mapelli P, Bettinardi V, Savi A, et al. State of the art of radiomic analysis in the clinical management of prostate cancer: A systematic review. Critical Reviews in Oncology/Hematology. 2022;169:103544. pmid:34801699
  5. Guglielmo P, Marturano F, Bettinelli A, Gregianin M, Paiusco M, Evangelista L. Additional Value of PET Radiomic Features for the Initial Staging of Prostate Cancer: A Systematic Review from the Literature. Cancers; 2021. pmid:34885135
  6. Ma K, Harmon SA, Klyuzhin IS, Rahmim A, Turkbey B. Clinical Application of Artificial Intelligence in Positron Emission Tomography: Imaging of Prostate Cancer. PET Clinics. 2022;17:137–43. pmid:34809863
  7. Spohn SKB, Bettermann AS, Bamberg F, Benndorf M, Mix M, Nicolay NH, et al. Radiomics in prostate cancer imaging for a personalized treatment approach—current aspects of methodology and a systematic review on validated studies. Theranostics. 2021;11:8027–42. pmid:34335978
  8. Lambin P, Rios-Velazquez E, Leijenaar R, Carvalho S, Van Stiphout RG, Granton P, et al. Radiomics: extracting more information from medical images using advanced feature analysis. European Journal of Cancer. 2012;48:441–6. pmid:22257792
  9. Zwanenburg A. Radiomics in nuclear medicine: robustness, reproducibility, standardization, and how to avoid data analysis traps and replication crisis. Eur J Nucl Med Mol I. 2019;46:2638–55. pmid:31240330
  10. Cysouw MCF, Jansen BHE, van de Brug T, Oprea-Lager DE, Pfaehler E, de Vries BM, et al. Machine learning-based analysis of [(18)F]DCFPyL PET radiomics for risk stratification in primary prostate cancer. Eur J Nucl Med Mol Imaging. 2020. pmid:32737518
  11. Boellaard R, Delgado-Bolton R, Oyen WJG, Giammarile F, Tatsch K, Eschner W, et al. FDG PET/CT: EANM procedure guidelines for tumour imaging: version 2.0. Eur J Nucl Med Mol I. 2015;42:328–54. pmid:25452219
  12. Cysouw MCF, Kramer GM, Hoekstra OS, Frings V, de Langen AJ, Smit EF, et al. Accuracy and Precision of Partial-Volume Correction in Oncological PET/CT Studies. Journal of Nuclear Medicine. 2016;57:1642. pmid:27230933
  13. Boellaard R. Quantitative oncology molecular analysis suite: ACCURATE. Soc Nuclear Med; 2018.
  14. Pfaehler E, Zwanenburg A, de Jong JR, Boellaard R. RaCaT: An open source and easy to use radiomics calculator tool. PLoS One. 2019;14:e0212223. pmid:30785937
  15. Zwanenburg A, Leger S, Vallières M, Löck S. Image biomarker standardisation initiative. arXiv preprint arXiv:1612.07003. 2016.
  16. Hatt M, Tixier F, Pierce L, Kinahan PE, Le Rest CC, Visvikis D. Characterization of PET/CT images using texture analysis: the past, the present… any future? Eur J Nucl Med Mol I. 2017;44:151–65.
  17. Pfaehler E, van Sluis J, Merema BB, van Ooijen P, Berendsen RC, van Velden FH, et al. Experimental multicenter and multivendor evaluation of PET radiomic features performance using 3D printed phantom inserts. Journal of Nuclear Medicine. 2019.
  18. Yeo IK, Johnson RA. A new family of power transformations to improve normality or symmetry. Biometrika. 2000;87:954–9.
  19. Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological). 1996;58:267–88.
  20. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research. 2002;16:321–57.
  21. Fortin J-P, Cullen N, Sheline YI, Taylor WD, Aselcioglu I, Cook PA, et al. Harmonization of cortical thickness measurements across scanners and sites. NeuroImage. 2018;167:104–20. pmid:29155184
  22. Orlhac F, Eertink JJ, Cottereau A-S, Zijlstra JM, Thieblemont C, Meignan M, et al. A Guide to ComBat Harmonization of Imaging Biomarkers in Multicenter Studies. Journal of Nuclear Medicine. 2022;63:172. pmid:34531263
  23. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–45. pmid:3203132
  24. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research. 2011;12:2825–30.
  25. Cohen MS, Hanley RS, Kurteva T, Ruthazer R, Silverman ML, Sorcini A, et al. Comparing the Gleason prostate biopsy and Gleason prostatectomy grading system: the Lahey Clinic Medical Center experience and an international meta-analysis. Eur Urol. 2008;54:371–81. pmid:18395322
  26. Goel S, Shoag JE, Gross MD, Al Hussein Al Awamlh B, Robinson B, Khani F, et al. Concordance Between Biopsy and Radical Prostatectomy Pathology in the Era of Targeted Biopsy: A Systematic Review and Meta-analysis. Eur Urol Oncol. 2020;3:10–20. pmid:31492650
  27. Aksu A, Vural Topuz Ö, Yılmaz G, Çapa Kaya G, Yılmaz B. Dual time point imaging of staging PSMA PET/CT quantification; spread and radiomic analyses. Ann Nucl Med. 2022;36:310–8. pmid:34988888
  28. Papp L, Spielvogel CP, Grubmuller B, Grahovac M, Krajnc D, Ecsedi B, et al. Supervised machine learning enables non-invasive lesion characterization in primary prostate cancer with [(68)Ga]Ga-PSMA-11 PET/MRI. Eur J Nucl Med Mol Imaging. 2020. pmid:33341915
  29. Solari EL, Gafita A, Schachoff S, Bogdanović B, Villagrán Asiares A, Amiel T, et al. The added value of PSMA PET/MR radiomics for prostate cancer staging. Eur J Nucl Med Mol Imaging. 2022;49:527–38. pmid:34255130
  30. Yao F, Bian S, Zhu D, Yuan Y, Pan K, Pan Z, et al. Machine learning-based radiomics for multiple primary prostate cancer biological characteristics prediction with 18F-PSMA-1007 PET: comparison among different volume segmentation thresholds. La Radiologia Medica. 2022. pmid:36018488
  31. Zamboglou C, Carles M, Fechter T, Kiefer S, Reichel K, Fassbender TF, et al. Radiomic features from PSMA PET for non-invasive intraprostatic tumor discrimination and characterization in patients with intermediate- and high-risk prostate cancer—a comparison study with histology reference. Theranostics. 2019;9:2595–605. pmid:31131055
  32. Buvat I, Orlhac F. The Dark Side of Radiomics: On the Paramount Importance of Publishing Negative Results. J Nucl Med. 2019;60:1543–4. pmid:31541033
  33. Dissaux G, Visvikis D, Da-ano R, Pradier O, Chajon E, Barillot I, et al. Pretreatment 18F-FDG PET/CT Radiomics Predict Local Recurrence in Patients Treated with Stereotactic Body Radiotherapy for Early-Stage Non–Small Cell Lung Cancer: A Multicentric Study. Journal of Nuclear Medicine. 2020;61:814. pmid:31732678
  34. Ferrández MC, Eertink JJ, Golla SSV, Wiegers SE, Zwezerijnen GJC, Pieplenbosch S, et al. Combatting the effect of image reconstruction settings on lymphoma [18F]FDG PET metabolic tumor volume assessment using various segmentation methods. EJNMMI Research. 2022;12:44. pmid:35904645
  35. Hotta M, Minamimoto R, Gohda Y, Miwa K, Otani K, Kiyomatsu T, Yano H. Prognostic value of 18F-FDG PET/CT with texture analysis in patients with rectal cancer treated by surgery. Annals of Nuclear Medicine. 2021;35:843–52. pmid:33948903
  36. Kocak B, Durmaz ES, Ates E, Kilickesmez O. Radiomics with artificial intelligence: a practical guide for beginners. Diagn Interv Radiol. 2019;25:485–95. pmid:31650960