Fig 1.
(a) A 3D plot showing the result of the principal component analysis (PCA) using all 28 physiological and immunological parameters of non-stressed (SHC or No Stress; red) and stressed (CSC or Stress) mice. The first three components shown here cumulatively explain 70.5% of the variation in the dataset and segregate the SHC and CSC mice. (b) A bi-plot showing the PCA results (as in Fig 1a) with additional information on the loadings of each variable in the analysis. The direction and length of each arrow indicate the sign and magnitude, respectively, of the coefficient of that variable in the PC1 and PC2 coordinates. The size of the balls indicates the PC3 coefficient loadings on a scale of 1 to 5 (shown on the right side of the plot). Light red shades and red balls represent SHC samples, while purple shades and light blue balls indicate CSC mice. The amount of variation explained by each component is given in brackets. Details of the numeric labels for each arrow are given in S3 Table.
Table 1.
Class prediction accuracy of the top scoring pair (tsp) model (left adrenal weight [mg] and relative thymus weight [mg/g]).
Only one mouse in the SHC group (No Stress; n = 19) and two in the CSC group (Stress; n = 18) were misclassified.
Table 2.
Tsp model prediction accuracy.
The model correctly predicted the stress status of 34 of the 37 mice in the validation set (~92%).
Fig 2.
(a) Scatter plot based on the top scoring pairs (tsp) analysis of the training dataset, aimed at classifying stressed (red) and non-stressed (blue) samples. The Left Adrenal Weight [mg] (LAWmg; x-axis) and Relative Thymus Weight [mg/g] (RTWmg/g; y-axis) were identified as the most relevant pair. The fitted line represents the linear function that discriminates between the two groups. (b) The model was tested using an independent validation set of 37 samples; the results are shown here as a scatter plot. Correctly predicted samples (34/37 or 91.89%) are colored to match the colors of the test set (red = No Stress; blue = Stress). Misclassified samples (n = 3) are shown in grey.
Table 3.
Stress status prediction based on the support vector machine (SVM) model.
Five mice in the SHC group (No Stress) and one in the CSC group (Stress) were incorrectly predicted.
Table 4.
SVM model prediction accuracy.
Of the 37 mice in the validation set, the model correctly assigned 31 individuals to their appropriate groups (~84%).
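The accuracy figures quoted in Tables 1 to 4 follow directly from the misclassification counts. A minimal Python sketch of that arithmetic is given below. The function name `summarize` is hypothetical, and the 19/18 group split used for the SVM validation set is an assumption carried over from Table 1, since Table 3 does not restate the group sizes.

```python
def summarize(n_shc, n_csc, wrong_shc, wrong_csc):
    """Overall and per-group prediction accuracy from misclassification counts."""
    total = n_shc + n_csc
    correct = total - wrong_shc - wrong_csc
    return {
        "correct": correct,
        "total": total,
        "accuracy": correct / total,
        "shc_accuracy": (n_shc - wrong_shc) / n_shc,
        "csc_accuracy": (n_csc - wrong_csc) / n_csc,
    }

# tsp model: 1 of 19 SHC and 2 of 18 CSC mice misclassified (Table 1).
tsp = summarize(n_shc=19, n_csc=18, wrong_shc=1, wrong_csc=2)
# SVM model: 5 SHC and 1 CSC misclassified (Table 3); group sizes assumed as above.
svm = summarize(n_shc=19, n_csc=18, wrong_shc=5, wrong_csc=1)

print(f"tsp: {tsp['correct']}/{tsp['total']} = {tsp['accuracy']:.2%}")  # tsp: 34/37 = 91.89%
print(f"svm: {svm['correct']}/{svm['total']} = {svm['accuracy']:.2%}")  # svm: 31/37 = 83.78%
```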
Fig 3.
(a) Contour plot depicting the support vector machine (SVM) analysis results using the LAWmg (x-axis) and RTWmg/g (y-axis) parameters (see Methods for details of the parameter settings used for the analysis). Red contours represent the boundaries of the non-stressed samples, while blue contours cover the stressed samples. Points represent samples. (b) Correctly predicted samples (31/37 or 83.78%) are shown in this plot and are colored to match the colors of the test set (red = No Stress; blue = Stress). Misclassified samples (n = 6) are colored grey.
Fig 4.
(a) A plot showing the null hypothesis distribution generated using Monte Carlo simulation (blue density and scatter plots) and the bootstrap-based mean and confidence interval estimates of the prediction error of the tsp model (red density and scatter plots). The x-axis shows the number of iterations/replicates, while the y-axis shows the prediction error. Dark blue lines, in both the density plots and the scatter plots, represent the estimated mean of the null hypothesis ([Ho]). The full red line is the mean prediction error ([bootstrap]) as estimated by the bootstrap analysis, while the dashed red lines are the corresponding 95% confidence intervals (95% CI [bootstrap]). The black line indicates the empirical prediction error (EPE) obtained in the validation set analysis. (b) A plot of the statistical significance and confidence interval of the prediction analysis of the SVM model (labels are as described in Fig 4a).
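The comparison described in Fig 4 can be illustrated with a small, self-contained Python sketch. It is not the authors' procedure: the null distribution here is obtained by shuffling the true labels (one common Monte Carlo choice), the confidence interval is a simple percentile bootstrap, and the label vectors are synthetic, constructed only to match the counts reported for the tsp model (19 SHC, 18 CSC, three misclassified). All function names are hypothetical.

```python
import random

def error_rate(y_true, y_pred):
    """Fraction of samples whose predicted label differs from the true label."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

def null_errors(y_true, y_pred, n_iter, rng):
    """Monte Carlo null: shuffle the true labels so predictions carry no information."""
    shuffled = list(y_true)
    errors = []
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        errors.append(error_rate(shuffled, y_pred))
    return errors

def bootstrap_errors(y_true, y_pred, n_iter, rng):
    """Bootstrap: resample (truth, prediction) pairs with replacement."""
    pairs = list(zip(y_true, y_pred))
    errors = []
    for _ in range(n_iter):
        sample = rng.choices(pairs, k=len(pairs))
        errors.append(sum(t != p for t, p in sample) / len(sample))
    return errors

def percentile_ci(values, level=0.95):
    """Percentile confidence interval taken from the sorted replicates."""
    s = sorted(values)
    lo = s[int((1 - level) / 2 * (len(s) - 1))]
    hi = s[int((1 + level) / 2 * (len(s) - 1))]
    return lo, hi

# Synthetic labels (0 = SHC, 1 = CSC): 1 SHC and 2 CSC mice misclassified.
y_true = [0] * 19 + [1] * 18
y_pred = [0] * 18 + [1] * 1 + [1] * 16 + [0] * 2

rng = random.Random(0)  # fixed seed so the sketch is reproducible
epe = error_rate(y_true, y_pred)  # empirical prediction error, 3/37
null = null_errors(y_true, y_pred, 2000, rng)
boot = bootstrap_errors(y_true, y_pred, 2000, rng)

null_mean = sum(null) / len(null)
boot_mean = sum(boot) / len(boot)
ci_lo, ci_hi = percentile_ci(boot)
# One-sided Monte Carlo p-value: how often chance does at least as well as the model.
p_value = (1 + sum(e <= epe for e in null)) / (1 + len(null))

print(f"EPE={epe:.3f}  null mean={null_mean:.3f}  "
      f"bootstrap mean={boot_mean:.3f}  95% CI=({ci_lo:.3f}, {ci_hi:.3f})  p={p_value:.4f}")
```

In this sketch a model performs better than chance when the upper bootstrap confidence limit lies below the mean of the null distribution, which mirrors the visual comparison of the red and blue lines in the figure.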