Toward characterizing cardiovascular fitness using machine learning based on unobtrusive data

doi:10.1371/journal.pone.0282398

Table 1.

Characteristics of volunteers, peak variables obtained during cardiopulmonary exercise testing (CPET), and the mean response of the variables obtained by the wearable.

Fig 1.

The wearable system has embedded cardiac, respiratory, and movement sensors that measure unsupervised and unobtrusive biological data.

These raw data are processed, filtered, and averaged. The mean responses of heart rate (μHR), breathing rate (μBR), minute ventilation (μVE), tidal volume (μVt), total hip acceleration (μAcc), and walking cadence (μCad), as well as sex, age, weight, height, and body mass index (BMI), were used as inputs to predict the maximum oxygen uptake (V̇O₂max). The resulting prediction model is a black box because of its high complexity and low explainability; explainable methods are therefore necessary to extract meaningful knowledge that might have clinical applications.

Fig 2.

Evaluation protocol adopted for this study.

Given a dataset of wearable data from 43 participants, we first use k-fold cross-validation (k = 9) to evaluate how well the regression models generalize. We then use the R-squared measure to select the best model, which serves as the basis for an explainable model built with Shapley values to assess the feature contributions. We also computed the average mean absolute error (MAE) and the Pearson correlation to assess the accuracy of the regression models.
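The evaluation protocol can be sketched in a few lines of plain Python. This is an illustration only, not the study's code: the helper names (`k_fold_indices`, `cross_validate`, `mean_baseline`) are hypothetical, the shuffling seed is arbitrary, and the baseline that predicts the training mean is merely a stand-in for the regression models the study compared.

```python
import math
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def cross_validate(X, y, fit_predict, k=9, seed=0):
    """Return out-of-fold predictions and the MAE averaged over the k folds.

    fit_predict(X_train, y_train, X_test) is a hypothetical callable that
    trains a model on the training split and returns test predictions.
    """
    folds = k_fold_indices(len(y), k, seed)
    y_pred = [None] * len(y)
    fold_maes = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        preds = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        for i, p in zip(test_idx, preds):
            y_pred[i] = p
        fold_maes.append(mae([y[i] for i in test_idx], preds))
    return y_pred, sum(fold_maes) / k


def mean_baseline(X_train, y_train, X_test):
    """Stand-in model (assumption): predicts the training mean for everyone."""
    mean_y = sum(y_train) / len(y_train)
    return [mean_y] * len(X_test)


# With 43 participants and k = 9, every participant is held out exactly once.
folds = k_fold_indices(43, 9)
print([len(fold) for fold in folds])
```

Because 43 is not divisible by 9, the folds cannot all be the same size; in this sketch seven folds hold five participants and two hold four. Model selection would then compare `r_squared` across candidate models, each evaluated with the same folds.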

Fig 3.

Flowchart of the screening, evaluation, and inclusion and exclusion criteria for the study.

This flow diagram illustrates the sample size and the volunteer characteristics. DM: diabetes mellitus; COPD: chronic obstructive pulmonary disease; CAD: coronary artery disease.

Fig 4.

Linear correlation between the maximum oxygen uptake measured during CPET and the maximum oxygen uptake predicted by the machine learning technique (panel A), and Bland-Altman plot of measured versus predicted maximum oxygen uptake, with the bias and the 95% confidence interval (CI₉₅) (panel B). SVR: support vector regression; R: Pearson coefficient.
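The bias and interval shown in a Bland-Altman plot can be computed as follows. This is a minimal sketch under stated assumptions: differences are taken as measured minus predicted, the interval is the conventional bias ± 1.96 sample standard deviations, and the function name and the example values are hypothetical rather than taken from the study.

```python
import statistics


def bland_altman(measured, predicted):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumptions: differences are measured - predicted, and the interval
    is bias +/- 1.96 * sample standard deviation of the differences.
    """
    diffs = [m - p for m, p in zip(measured, predicted)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical values in mL/kg/min, for illustration only.
measured = [30.0, 35.0, 40.0, 45.0]
predicted = [28.0, 36.0, 41.0, 43.0]
bias, lower, upper = bland_altman(measured, predicted)
print(bias, lower, upper)
```

A bias close to zero indicates no systematic over- or under-prediction, while the width of the interval shows how far an individual prediction may deviate from the CPET measurement.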

Fig 5.

Shapley values (importance level) of the inputs used to predict cardiovascular fitness.

A: Median and 25th-75th percentile of the Shapley values of the inputs from the support vector regression (SVR) model. * Significant difference between age and BR (p < 0.001), between age and BMI (p < 0.001), and between age and Cad (p = 0.006). † Significant difference between HR and BR (p = 0.003). ‡ Significant difference between height and BR (p = 0.004). § Significant difference between weight and BR (p = 0.049). B: Mean ± SD of the weighted average of the Shapley values of the domains from the SVR model. * Significant difference between Hemodynamic and Physical Activity (p = 0.010), and between Hemodynamic and Pulmonary (p = 0.003). ‡ Significant difference between Anthropometric and Pulmonary (p = 0.023). HR: heart rate; Acc: total hip acceleration; BMI: body mass index; BR: breathing rate; Vt: tidal volume; Cad: walking cadence; Ve: minute ventilation.
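To make the notion of a Shapley value concrete, the sketch below computes exact Shapley values for one prediction by enumerating every subset of features. It is an illustration under assumptions, not the study's procedure: absent features are replaced by a fixed baseline value, the two-feature linear `toy_model` and its coefficients are invented for the example, and practical tools usually approximate this sum rather than enumerate it.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature for the prediction predict(x).

    Features outside a coalition take their baseline value (an assumption;
    other conventions for "missing" features exist).
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical model: fitness falls with age (z[0]), rises with z[1]."""
    return 50.0 - 0.3 * z[0] + 0.1 * z[1]


x = [60.0, 100.0]         # the individual being explained (made-up values)
baseline = [40.0, 90.0]   # reference individual (made-up values)
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The values always sum to the difference between the prediction for `x` and the prediction for the baseline, which is what allows a prediction to be split into per-input contributions. Enumeration costs grow as 2^n, which is still tractable for the 11 inputs listed in Fig 1.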