Predicting the causative pathogen among children with pneumonia using a causal Bayesian network

Background Pneumonia remains a leading cause of hospitalization and death among young children worldwide, and the diagnostic challenge of differentiating bacterial from non-bacterial pneumonia is the main driver of antibiotic use for treating pneumonia in children. Causal Bayesian networks (BNs) serve as powerful tools for this problem as they provide clear maps of probabilistic relationships between variables and produce results in an explainable way by incorporating both domain expert knowledge and numerical data.

Methods We combined domain expert knowledge and data iteratively to construct, parameterise and validate a causal BN to predict causative pathogens for childhood pneumonia. Expert knowledge elicitation occurred through a series of group workshops, surveys and one-on-one meetings involving 6-8 experts from diverse domain areas. The model performance was evaluated based on both quantitative metrics and qualitative expert validation. Sensitivity analyses were conducted to investigate how the target output is influenced by varying key assumptions that carry a particularly high degree of uncertainty in the data or domain expert knowledge.

Results Designed to apply to a cohort of children with X-ray confirmed pneumonia who presented to a tertiary paediatric hospital in Australia, the resulting BN offers explainable and quantitative predictions on a range of variables of interest, including the diagnosis of bacterial pneumonia, detection of respiratory pathogens in the nasopharynx, and the clinical phenotype of a pneumonia episode. Satisfactory numeric performance has been achieved, including an area under the receiver operating characteristic curve of 0.8 in predicting clinically confirmed bacterial pneumonia, with sensitivity 88% and specificity 66% given certain input scenarios (i.e., information that is available and entered into the model) and trade-off preferences (i.e., relative weightings of the consequences of false positive versus false negative predictions). We specifically highlight that a desirable model output threshold for practical use depends strongly on the input scenario and trade-off preferences. Three commonly encountered scenarios are presented to demonstrate the potential usefulness of the BN outputs in various clinical pictures.

Conclusions To our knowledge, this is the first causal model developed to help determine the causative pathogen for paediatric pneumonia. We have shown how the method works and how it would help decision making on the use of antibiotics, providing insight into how computational model predictions may be translated to actionable decisions in practice. We discuss key next steps including external validation, adaptation and implementation. Our model framework and methodological approach can be adapted beyond our context to a broader range of respiratory infections and to other geographical and healthcare settings.
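The dependence of the output threshold on trade-off preferences, noted in the Results, can be illustrated with a minimal sketch. It is not the procedure used in this study: the function name `best_threshold`, the cost weights and the toy predictions are all hypothetical, and the sketch simply picks the cut-off that minimises a weighted count of false negatives and false positives.

```python
def best_threshold(probs, labels, fn_cost, fp_cost):
    """Return (threshold, cost) minimising a weighted misclassification cost.

    probs   : predicted probabilities of bacterial pneumonia (hypothetical)
    labels  : 1 = bacterial, 0 = non-bacterial (hypothetical reference labels)
    fn_cost : relative weight of a missed bacterial case
    fp_cost : relative weight of an unnecessary positive call
    """
    best = None
    # Scan a fixed grid of candidate thresholds from 0.01 to 0.99.
    for i in range(1, 100):
        t = i / 100
        fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < t)
        fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= t)
        cost = fn_cost * fn + fp_cost * fp
        # Keep the first threshold that attains the lowest cost.
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Toy data, invented purely for illustration.
probs = [0.1, 0.3, 0.4, 0.6, 0.8]
labels = [0, 0, 1, 0, 1]

# Missing a bacterial case weighted five times worse than over-treating.
t_cautious, _ = best_threshold(probs, labels, fn_cost=5, fp_cost=1)
# Over-treating weighted five times worse than missing a bacterial case.
t_sparing, _ = best_threshold(probs, labels, fn_cost=1, fp_cost=5)
print(t_cautious, t_sparing)
```

On the same predictions, weighting false negatives more heavily selects a lower threshold than weighting false positives more heavily, which is the sense in which a single fixed cut-off cannot suit every set of preferences.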


Variation in detection pathways & organisms
Here is a list of organisms that were sought and detected in the PneumoWA project. There were two pathways for obtaining a detection result: study and clinic. Quantification was performed for some organisms via the study pathway. More detailed information is provided in the

Are there any other organisms that you think important to consider for respiratory infection in children in Australia? If yes, please specify:

With regard to each organism listed above (V1-7, B1-5), along with the ones you have just specified, please answer the following questions:
- Where can the organism reside/be present without necessarily causing disease? Please delete sites as appropriate.

Survey - Overview
In the previous knowledge elicitation session in October 2020, we focused on identifying important pathogens that can cause infection in the respiratory tract. These pathogens include:

As illustrated in the figure above, given the presence of an organism in one's nasopharyngeal area (organism in NP), there is a chance for this organism to become pathogenic and infect the host's lower respiratory tract (causing LRTI). Infection of the LRT can either occur directly (the vertical arrow), potentially by direct aspiration/inhalation, or it can occur indirectly via secondary expansion from another site/s (e.g. secondary to URTI).
It is important to note that we explicitly distinguish the actual presence of an organism at a body site (yellow nodes) from its successful detection by sampling that site (orange node), although we only ever know about the former via information about the latter. In the model, we treat the actual presence as a latent node, meaning that it cannot be observed directly. This allows us to capture any factors that might affect the sensitivity and specificity of a detection method, such as age, specimen type, assay type, density of organism at the site and collection performance.
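The relationship between a latent presence node and its observed detection node can be sketched with Bayes' rule. This is a simplified two-node illustration, not the study's BN: the function name `posterior_presence` and the prior, sensitivity and specificity values are hypothetical, and the real model lets factors such as age or specimen type modify the detection characteristics.

```python
def posterior_presence(prior, sensitivity, specificity, detected):
    """P(organism truly present | detection result), by Bayes' rule.

    prior       : P(present) before testing (hypothetical value)
    sensitivity : P(detected | present)
    specificity : P(not detected | absent)
    detected    : True for a positive result, False for a negative one
    """
    if detected:
        true_pos = sensitivity * prior
        false_pos = (1 - specificity) * (1 - prior)
        return true_pos / (true_pos + false_pos)
    false_neg = (1 - sensitivity) * prior
    true_neg = specificity * (1 - prior)
    return false_neg / (false_neg + true_neg)


# Illustrative numbers only, not estimates from the study data.
p_pos = posterior_presence(prior=0.3, sensitivity=0.9, specificity=0.95, detected=True)
p_neg = posterior_presence(prior=0.3, sensitivity=0.9, specificity=0.95, detected=False)
print(round(p_pos, 3), round(p_neg, 3))
```

Even with a positive result the probability of true presence stays below 1, and with a negative result it stays above 0, which is why presence is modelled as latent rather than equated with detection.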

Figures 2-4 below show the PCR detection of different groups of organisms in children in PneumoWA with X-ray confirmed pneumonia (defined as cases; top row) and non-pneumonia controls (bottom row). The detection of bacteria in the NP gradually decreases as age increases, but it does not differ between cases and controls (Figure 2). The detection of pathogenic viruses in the NP is strongly associated with being a case, while decreasing with age (Figure 3). The detection of less pathogenic viruses does not differ notably between cases and controls (Figure 4).

Figure 2. Bacteria refers to Streptococcus pneumoniae, Haemophilus influenzae, Moraxella catarrhalis, Staphylococcus aureus and/or Mycoplasma pneumoniae (B1-B5 in Table 1).
Figure 3. Pathogenic virus refers to influenza, respiratory syncytial virus, human metapneumovirus and/or parainfluenza (V1-V4 in Table 1).
[...] Table 1).

Figure 5 demonstrates how the varying levels of organism-specific pathogenicity can be described using models. The presence of bacteria, pathogenic virus and less pathogenic virus in the NP is associated with a 1.4%, 9.2% and 2.5% risk of pneumonia, respectively (these numbers are illustrative and not inferred from data).

Figure 5. Preliminary models that illustrate varying levels of organism-specific pathogenicity

Questions
The following set of questions aims to better understand the pathogenicity of different organisms and the implications for antibiotic use. Please provide your responses in the tables that follow, for each organism listed in the first column of the tables.
1. Imagine you randomly swab a child in the community without knowing anything about whether they have signs or symptoms of RTI. For each organism, order from highest to lowest your belief that its presence in the NP (whether detected or not) predicts that the child has (or will soon develop) a LRTI (which does not need to be caused solely by that specific organism). Please enter your order in the 2nd column, using 1 for the highest; ties are allowed. Precise answers are not required. Provide your degree of confidence in your answer for each organism in the 3rd column, from highest (1) to lowest (5).