A machine learning algorithm predicts molecular subtypes in pancreatic ductal adenocarcinoma with differential response to gemcitabine-based versus FOLFIRINOX chemotherapy

Purpose Development of a supervised machine-learning model capable of predicting clinically relevant molecular subtypes of pancreatic ductal adenocarcinoma (PDAC) from diffusion-weighted-imaging-derived radiomic features. Methods The retrospective observational study assessed 55 surgical PDAC patients. Molecular subtypes were defined by immunohistochemical staining of KRT81. Tumors were manually segmented and 1606 radiomic features were extracted with PyRadiomics. A gradient-boosted-tree algorithm was trained on 70% of the patients (N = 28) and tested on 30% (N = 17) to predict KRT81+ vs. KRT81- tumor subtypes. A gradient-boosted survival regression model was fit to the disease-free and overall survival data. Chemotherapy response and survival were assessed stratified by subtype and radiomic signature. Radiomic feature importance was ranked. Results The mean±STDEV sensitivity, specificity and ROC-AUC were 0.90±0.07, 0.92±0.11, and 0.93±0.07, respectively. The mean±STDEV concordance indices between the disease-free and overall survival predicted by the model based on the radiomic parameters and actual patient survival were 0.76±0.05 and 0.71±0.06, respectively. Patients with a KRT81+ subtype experienced significantly diminished median overall survival compared to KRT81- patients (7.0 vs. 22.6 months, HR 4.03, log-rank-test P = <0.001) and a significantly improved response to gemcitabine-based chemotherapy over FOLFIRINOX (10.14 vs. 3.8 months median overall survival, HR 2.33, P = 0.037) compared to KRT81- patients, who responded significantly better to FOLFIRINOX over gemcitabine-based treatment (30.8 vs. 13.4 months median overall survival, HR 2.41, P = 0.027). Entropy was ranked as the most important radiomic feature. Conclusions The machine-learning based analysis of radiomic features enables the prediction of subtypes of PDAC, which are highly relevant for disease-free and overall patient survival and response to chemotherapy.


Feature extraction
Tumors were manually segmented by consensus reading as detailed in the main manuscript.
Radiomic features were extracted from the quality-controlled 3-dimensional volumes as follows: Intensity discretization was performed to a fixed bin number of 32 bins for all scans as described in (1). No normalization was performed. Value plausibility was ascertained for all parameters. Images were spatially resampled to 3x3x3mm using the BSpline interpolator. No resegmentation was performed. Radiomic features were derived using PyRadiomics v. 2.1.1 (2) yielding 19 first order statistics, 16 3D shape-based, 10 2D shape based, 24 Gray Level Cooccurence Matrix, 16 Gray Level Run Length Matrix, 16 Gray Level Size Zone Matrix, 5 Neighbouring Gray Tone Difference Matrix and 14 Gray Level Dependence Matrix features as well as Laplacian of Gaussian-filtered, wavelet-decomposition-based (using the coiflet 1 function), square, exponential, gradient, square-root, logarithm and local binary-pattern filtered versions of these features. Laplacian of Gaussian-filtering was performed with a kernel (sigma) of 3mm. GLCM and GLRLM were extracted using the default settings (separately for each direction then averaged). Feature descriptions can be found in the PyRadiomics documentation (3). Feature extraction and processing was carried out according to the IBSI manual v. 7 (4). 1606 features were extracted in total.
Because of the small dataset size, the use of automatic hyperparameter tuning and nested cross-validation was eschewed. Some manual hyperparameter modifications were made to reduce model variance and increase overfitting resistance: tree complexity penalization was applied to the leaf values in the form of L2 regularization and shrinkage regularization was introduced by reducing the learning rate.
For training and testing, the estimator was fit and tested using stratified shuffle/split crossvalidation with 10 splits of 70%/30% (train/test) of the dataset. For testing survival prediction, a manual train/test set was built using interleaving to preserve the time-series structure and a stochastic Gradient Boosted Regression Tree algorithm was used. The learning rate was set at 0.01 and individual trees were built using 90% of the rows. Cox proportional hazards was used as a loss function to optimize and trees were split based on the Friedman MSE criterion.
The concordance index between the predictions of the model on the testing set and the actual survival times was calculated.
For all other survival analyses, the stratified shuffle/split cross-validation technique was used.