Fig 1.
Flow chart used to divide Chest X-Ray impressions into three categories.
The scans were preprocessed and then used as input for categorization. The categories are Normal, Stable, and New findings.
Table 1.
COVID-19 positive and negative patients belonging to “New finding”, “Stable”, and “Normal” categories*.
Table 2.
Distribution of COVID-19 positive and negative patients by demographics (sex, race, ethnicity, and age), along with the statistical significance of each comparison.
These patients had CXRi, RoS, and a history of past diseases, all of which were considered for our final model*.
Fig 2.
Frequencies of each feature in COVID-19 positive and negative patients.
Fig 3.
Random forest classifier performance for identifying COVID-19 infection.
The ROC curves show the ability of each ML model to classify pediatric patients with COVID-19 infection.
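As a reading aid for the ROC curves, the area under a ROC curve equals the probability that a randomly chosen positive patient receives a higher model score than a randomly chosen negative patient. The minimal sketch below computes it from that definition using only the Python standard library. The function name `roc_auc` and the example labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0 (negative) or 1 (positive).
    scores: model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores, for illustration only.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example_auc)
```

The pairwise form is quadratic in the number of patients, which is acceptable for small cohorts. A rank-based computation would be preferable for large ones.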
Table 3.
Random Forest Classifier features for model 1 in decreasing order of their importance.
The features listed were the most prominent in Chest X-Ray impressions of patients infected with COVID-19. The table also reports the odds ratio with a 95% confidence interval, obtained using Fisher’s exact test because the sample size for these features was small.
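To illustrate why Fisher's exact test suits small counts, the sketch below computes a two-sided p-value and the sample odds ratio for a single 2x2 table (feature present or absent against COVID-19 positive or negative) using only the Python standard library. It is an assumption-laden illustration rather than the authors' analysis code. The function names and the counts are hypothetical, and the exact conditional confidence interval is omitted from this sketch.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over every table that is no more likely than the observed one.
    p_value = sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(p_value, 1.0)


def sample_odds_ratio(a, b, c, d):
    """Unconditional (sample) odds ratio ad/bc for the same table."""
    if b * c == 0:
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


# Hypothetical counts, not from the study:
# rows = feature present / absent, columns = COVID-19 positive / negative.
p = fisher_exact_two_sided(3, 1, 1, 3)
odds = sample_odds_ratio(3, 1, 1, 3)
print(round(p, 4), odds)
```

Because the p-value is computed from the exact hypergeometric distribution, it does not rely on the large-sample approximation behind the chi-square test.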
Fig 4.
SHAP values for the most important features in model 5, shown in decreasing order of importance along the Y-axis.
Feature names in upper case denote RoS data; lower-case names without an underscore denote features obtained from CXRi, alongside some additional demographic features. The top 20 features for the Random Forest classifier, out of a total of 76 features, are shown using model 4. Each dot represents the importance value of the corresponding feature for one patient, plotted along the X-axis. The horizontal position of each dot indicates whether the feature is positively or negatively associated with the output. The color of each dot indicates whether the feature value is high (red) or low (blue).
Fig 5.
SHAP values for two patients without (a) and with (b) COVID-19 infection. The SHAP values indicate the impact of a feature taking a given value, compared with the prediction that would be made if the feature took some baseline value. As observed in (a), the absence of pneumonia, atelectasis, and small airways disease points toward the absence of COVID-19 infection, whereas the presence of these features in (b) points toward its presence.
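The baseline comparison described in this caption can be made concrete with exact Shapley values on a toy scoring function. The sketch below is illustrative only. It enumerates all feature subsets, which is feasible for a handful of features but not for the 76 used in the paper, where SHAP tooling for tree models relies on faster algorithms. The function `toy_score`, its coefficients, and the three binary inputs are hypothetical stand-ins, not the study's Random Forest.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside the current subset are set to their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a subset of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j]
                             for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_score(v):
    # Hypothetical score over three binary findings; coefficients are made up.
    return 0.1 + 0.5 * v[0] + 0.3 * v[1] + 0.2 * v[0] * v[2]


patient = [1, 1, 1]    # all three findings present
baseline = [0, 0, 0]   # all three findings absent
phi = shapley_values(toy_score, patient, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the patient and the prediction for the baseline, which is the property that lets plots such as (a) and (b) be read as a decomposition of one patient's prediction.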