Abstract
Numerous recent studies have reported predictive models of disease severity in COVID-19 patients. Herein, we propose a highly predictive model of disease severity that integrates routine laboratory findings and plasma metabolites, including cytosine as a potential biomarker of COVID-19 disease severity. One model was developed and internally validated on the basis of ROC-AUC values. The predictive accuracy of the model was 0.996 (95% CI: 0.989 to 1.000) with an optimal cut-off risk score of 3 from among 6 biomarkers, including five laboratory findings (D-dimer, ferritin, neutrophil counts, Hp, and sTfR) and one metabolite (cytosine). The model has high predictive power, requires a small number of variables that can be acquired at minimal cost and effort, and can be applied independently of non-empirical clinical data. The metabolomics profiling data and the modeling work stemming from it, as presented here, could help to further explain COVID-19 disease prognosis and inform patient management.
Citation: Soares NC, Hussein A, Muhammad JS, Semreen MH, ElGhazali G, Hamad M (2023) Plasma metabolomics profiling identifies new predictive biomarkers for disease severity in COVID-19 patients. PLoS ONE 18(8): e0289738. https://doi.org/10.1371/journal.pone.0289738
Editor: Konlawij Trongtrakul, Faculty of Medicine Chiang Mai University, THAILAND
Received: October 5, 2022; Accepted: July 25, 2023; Published: August 10, 2023
Copyright: © 2023 Soares et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The metabolomics data have been deposited in the Metabolomics Workbench repository (https://www.metabolomicsworkbench.org/) with data ID 3469.
Funding: This work was supported by research grant CoV19-0305 (MH), seed grant 2001110138, University of Sharjah, UAE. This research is part of the Human Disease Biomarkers Discovery Research Group study. The authors wish to acknowledge the generous support of the Research Institute for Medical and Health Sciences, University of Sharjah, UAE. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
The SARS-CoV-2 pandemic, which has gripped the world over the last three years, has resulted in more than 530 million reported infections and 6.3 million deaths worldwide so far [https://covid19.who.int/]. The pandemic has also caused unimaginable suffering to individuals, families, communities and countries across the globe. At its peak, the pandemic stressed healthcare systems in different parts of the globe to their limit, disrupted supply chains, destroyed businesses, and resulted in massive unemployment and poverty and a never-before-seen upward redistribution of wealth, which will collectively continue to impact life on earth for possibly generations to come [1–3]. Although it is arguable whether the pandemic was foreseen and/or could have been avoided or even better managed, the way in which it was handled speaks of utter incompetence and an indisputable lack of preparedness at all levels, from governments and healthcare authorities all the way to the scientific community [1–3]. Therefore, the world needs to learn its lesson, not only in terms of how to deal with future epidemics and pandemics but also how to study and understand them and how to use cutting-edge technologies in such endeavors [1,2,4].
One of the puzzling questions about the COVID-19 pandemic that still lingers is how and why some seemingly healthy (low-risk) individuals succumbed to the disease while others, possibly in poorer health, recovered and survived [5–7]. Indeed, most people would agree that the worst of this pandemic is, more or less, behind us, but efforts to uncover infection and disease correlates that may have contributed to its outcomes remain timely and needed [5,6]. In this context, there is a real need to develop and test disparate data-integrating approaches and data-based models to understand the various aspects of COVID-19 and to easily and quickly enlist such models in combating future epidemics and pandemics.
Polymerase chain reaction (PCR) testing of nasopharyngeal swabs for the presence of SARS-CoV-2 RNA continues to be the gold standard for identifying infected individuals [8]. Based on global data, almost 80% of SARS-CoV-2-infected individuals have either no symptoms or mild-to-moderate symptoms [9]. Serological testing in the form of differential blood count along with inflammatory marker testing has proven partially successful in identifying patients at high risk of disease severity or death. The experience with COVID-19 has demonstrated that testing for serum IL-6, D-dimer, lactate dehydrogenase (LDH) and other analytes is helpful in identifying patients at risk of severe or fatal complications [10]. That said, there is still a need for more specific predictive parameters of severe infection beyond serum ferritin, prothrombin time, and fibrin degradation products (FDP) [11].
Serum/plasma metabolomics profiling using liquid chromatography-mass spectrometry (LC-MS) has proven useful in identifying diagnostic, prognostic and therapeutic biomarkers in infectious diseases. Studies have shown that serum levels of citrate, malate and succinate increase in response to S. aureus and S. pneumoniae infections [12,13]. In a study on COVID-19 intensive care unit (ICU) patients, the ratio of the plasma metabolites kynurenine and arginine was reported to be helpful in predicting COVID-19 disease irrespective of age, gender or hospital admission [14]. However, the role of these findings in COVID-19 prognosis remains limited given that only ICU patients were assessed. In another study, the metabolites cytosine and tryptophan-nicotinamide were reported to be moderately sensitive in discriminating COVID-19 patients from healthy individuals [15]. It is to be expected that metabolomic changes resulting from SARS-CoV-2 infection vary widely among patients owing to differences in patient health profiles, vis-à-vis comorbidities, medications, diet, lifestyle, etc. Accordingly, the search for profiles of metabolic biomarkers may provide higher sensitivity and specificity in assessing disease prognosis [16]. In this study, we retrospectively recruited COVID-19 patients with no known comorbidities and divided them into three groups based on disease severity: asymptomatic, mild and severe. We performed LC-MS metabolomics profiling in serum samples of these patients and identified eight predictive biomarkers of COVID-19 disease severity. We then integrated patients’ laboratory findings and metabolomics profiles to generate a predictive model of disease severity.
Material & methods
Sample collection and processing
In this retrospective cohort study, blood samples were collected from donors who tested positive for COVID-19 and presented with no, mild or severe symptoms between March 20 and July 17, 2020. Patients were diagnosed with COVID-19 using a nasal swab PCR test and later divided into three groups (asymptomatic, mild, and severe) based on their clinical presentation. Each donor gave a 10 mL blood sample, one half of which was collected in a plain tube and the other half in an EDTA vacutainer. A total of 85 samples were collected (30 COVID-19-positive asymptomatic, 10 COVID-19-positive with mild symptoms, and 45 COVID-19-positive with severe symptoms) for the purpose of this study. COVID-19-positive asymptomatic individuals were identified as a result of the national screening campaigns. Symptomatic COVID-19 patients were classified as mild or severe based on guidelines published by the Abu Dhabi Department of Health (circular number 33, 19th April 2020). Patients with mild disease presented with upper respiratory tract infection and symptoms such as fever, dry cough, sore throat, runny nose, and muscle and joint pains without shortness of breath. Patients with severe disease presented with severe pneumonia and symptoms such as fever, cough, dyspnea and fast breathing (>30 per minute), in addition to oxygen saturation <90%. Patient records showed that many of the patients with symptoms were self-medicating with aspirin prior to their hospital visit and that some of the patients with moderate-severe symptoms were placed on dexamethasone and/or heparin subsequent to hospital admission. Immediately upon sample collection, the hospital laboratory staff separated and tested the serum for CRP, D-dimer, ferritin, IL-6 and LDH; a complete blood count was also performed on each sample. Whole blood samples were also aliquoted and frozen at −80 °C for subsequent processing and analysis.
The study was jointly approved by the Ministry of Health, Abu Dhabi and Dubai Health Authority (DOH/CVDC/2020/1949) on the understanding that samples will be number-coded to hide patient’s identity, that no personal information will be shared with a third party and that no sample analysis can be performed by entities other than the Research Institute of Medical and Health Sciences (RIMHS), the University of Sharjah (UOS) without prior written approval. No informed consent was required as per the ethical approval decision (DOH/CVDC/2020/1949); in compliance with the said decision, all samples were fully anonymized before accessing or receiving them.
Serum levels of hepcidin and soluble transferrin receptor (sTfR), sCD163 and haptoglobin (Hp) concentration and phenotype distribution
Upon receipt of frozen samples at RIMHS, UOS laboratories, whole blood samples were thawed and centrifuged; serum was separated and levels of hepcidin (Cat No. 733228; MyBiosource, San Diego, California, United States), sTfR (Cat No. 750294; MyBiosource), sCD163 (MBS508555) and Hp (MBS763395) were measured using commercially available colorimetric assay kits; absorbance was read at 450 nm on a microplate reader. Hp phenotypes were determined by vertical polyacrylamide gel electrophoresis, and the bands were visualized by staining with benzidine solution as previously described [17].
Liquid chromatography tandem mass spectrometry (LC-MS/MS)
Plasma was obtained by collecting samples into heparinized tubes followed by centrifugation for 5 minutes (3000 x g). The samples were kept at –80ºC for long-term storage until metabolomics analysis. An aliquot of each plasma sample was placed into a microcentrifuge tube, cold methanol was added at 3:1 v/v (i.e., 90 μL cold methanol for a 30 μL sample), and the mixture was vortexed and then stored at –20ºC for two hours. Next, the samples were centrifuged at 20,817 x g for 15 min at 4ºC, and the supernatant was transferred to a new microcentrifuge tube. Usually, a supernatant volume equal to three times the original sample volume was transferred (i.e., for a 30 μL sample with 90 μL cold methanol added, 90 μL of supernatant was transferred). The sample was dried down using a Speed vac at 30–40°C. The dried sample was then either stored in a –80ºC freezer for further use or dissolved in solvent for LC-MS/MS analysis. Metabolites were analyzed by HPLC-Q-TOF MS/MS using the auto MS/MS positive scan mode as described in our recent publications [18,19]. Briefly, samples were chromatographically separated using a Hamilton® Intensity Solo 2 C18 column (100 mm x 2.1 mm x 1.8 μm) and eluted using 0.1% formic acid in water (A) and 0.1% formic acid in ACN (B) with the following gradient at a flow rate of 0.250 mL/min: 1% B from 0–2 min, then gradient elution to 99% B from 2–17 min, hold at 99% B from 17–20 min, then re-equilibration to 1% B from 20–30 min using a flow rate of 0.350 mL/min. The autosampler temperature was set at 8°C and the column oven temperature at 35°C. The ESI source dry nitrogen gas flow was 10 L/min, and the drying temperature was 220°C, with the nebulizer gas pressure set to 2.2 bar. The capillary voltage of the ESI was 4500 V and the plate offset 500 V. The MS acquisition scan range was set at 20–1300 m/z and the collision energy at 7 eV. Sodium formate was injected as an external calibrant between 0.1 and 0.3 minutes.
A total sample volume of 10 μL was injected into the TIMS-TOF MS.
Data processing was performed using MetaboScape® 4.0 (Bruker Daltonics). Analyte bucketing and identification were done using the T-ReX 2D/3D workflow provided with the software, with the following parameters: an intensity threshold greater than 1000 counts and a peak length of 7 spectra or greater. Feature quantitation was performed using peak area; features present in at least 3 (of 12) samples (per cell type) were considered for statistical analysis. Analyte MS2 spectra were averaged on import, and only features eluting between 0.3 and 25 min with m/z between 50 and 1000 were considered. For metabolite identification, both MS2 spectra and retention time (RT) were used, with an MS/MS spectral match as the minimum criterion for a positive hit. For the set of compounds meeting this criterion (either MS/MS alone or MS/MS with RT), annotation using Bruker’s implementation of the Human Metabolome Database (HMDB-4.0) was performed; all selected compounds were matched against this library. Where a particular database entry was matched by multiple features, putatively matching features were filtered by considering each of the features against the highest annotation quality score (AQ score) among other putative matches for the same metabolite; i.e., features exhibiting the best fit across the greatest number of factors such as retention time, MS/MS, m/z values, analyte list and spectral library matching were ranked first for the associated identification, as per our previous publications [18,19]. Pathway enrichment analyses were performed using MetaboAnalyst V5 (https://www.metaboanalyst.ca). Pathway enrichment evaluates overall pathway impact by considering the relative importance of altered metabolites based on their position in the pathway map.
All data, including raw files, have been deposited in the Metabolomics Workbench Repository (https://www.metabolomicsworkbench.org/).
Data analysis
The metabolomics data were first tabulated in Microsoft Excel format and then exported to the Statistical Package for the Social Sciences (SPSS) software, version 27 [20]. Demographic, clinical and metabolite data were all merged into one SPSS dataset. Descriptive statistics were used to conduct univariate analysis; frequencies and relative frequencies were used to summarize categorical data, while measures of central tendency were computed for scale data. Normality of scale data was first assessed graphically, using Q-Q plots and histograms, and then tested statistically using the Kolmogorov-Smirnov test. Means and standard deviations (SD) were reported for normally distributed scale variables, whereas medians and interquartile ranges (IQR: Q1-Q3) were used to summarize variables with skewed data. The chi-square test was performed to test for associations between categorical variables, with the strength of an association measured using the odds ratio (OR). To study the relationship between a normally distributed outcome and a dichotomous categorical predictor, the independent t-test was used. If the predictor had more than two groups, the one-way ANOVA test was used to compare three or more means. For skewed outcome variables, similar analyses were conducted using the non-parametric Mann-Whitney U test and Kruskal-Wallis test, respectively. The Spearman correlation coefficient was computed to investigate the correlation between two variables with skewed scale data. P-values less than or equal to 0.05 indicated statistical significance. Bonferroni correction was used to adjust p-values in pairwise comparisons.
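As an illustration of the Bonferroni adjustment mentioned above, the following minimal Python sketch multiplies each unadjusted p-value by the number of pairwise comparisons and caps the result at 1. The function name and the three p-values are hypothetical and are not taken from the study data; the analyses reported here were run in SPSS, not with this code.

```python
from typing import List


def bonferroni_adjust(p_values: List[float]) -> List[float]:
    """Multiply each p-value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted p-values for the three pairwise group comparisons
# (asymptomatic vs. mild, asymptomatic vs. severe, mild vs. severe).
raw_p = [0.40, 0.001, 0.02]
adjusted_p = bonferroni_adjust(raw_p)

# Significance is judged against the 0.05 threshold after adjustment.
significant = [p <= 0.05 for p in adjusted_p]
print(adjusted_p, significant)
```

In this toy example only the second comparison remains significant after adjustment, since the third p-value (0.02) becomes roughly 0.06 once multiplied by three.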
Receiver operating characteristic (ROC) curve analysis, with calculation of the area under the curve (AUC), was performed to identify, from among the clinical and metabolite tests, significant diagnostic biomarkers for predicting the severity of COVID-19 infection. An ROC AUC value above 0.70 indicated a moderate to high level of prediction accuracy. For each test’s AUC value, statistical significance was assessed against chance by calculating its 95% confidence interval (CI) and associated p-value. For each significant diagnostic test/biomarker showing a high/moderate level of prediction accuracy (AUC > 0.70), a data-driven approach was used to determine the optimal cut-off value, specifically, by maximizing Youden’s index (Sensitivity + Specificity − 1) [21]. Next, the sensitivity (SN) and specificity (SP), along with their 95% confidence intervals, were calculated for each diagnostic test.
Optimal cut-off values were used to dichotomize each biomarker into low and high levels. A low level was coded as 0 and a high level was coded as 1. After excluding biomarkers that were linearly related, predictors were identified to develop a risk scoring system to define a diagnostic model for COVID-19 severity based on a combination of important biomarkers used as predictors. The risk score was calculated by counting, for each patient, the number of biomarkers that were at high levels. ROC curve analysis and Youden’s index were then used to identify the optimal risk score for predicting the severity of COVID-19. Demographics, clinical and serum metabolite laboratory test results were first compared between the three levels of COVID-19 infection severity (asymptomatic, mild and severe). Preliminary analysis showed that the asymptomatic and mild groups were comparable on most clinical and metabolite results; no significant differences were observed between the two groups. Accordingly, the two groups (asymptomatic and mild) were clustered into a single group (asymptomatic/mild), which was then compared to the severe COVID-19 group to conduct the analysis reported in this manuscript.
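The cut-off selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure used in the study: the function names and the toy biomarker values are hypothetical, observed values are assumed to serve as candidate cut-offs, and a value is assumed to be "high" when it is greater than or equal to the cut-off.

```python
from typing import List, Tuple


def roc_auc(values: List[float], labels: List[int]) -> float:
    """AUC as the probability that a random severe case (label 1) has a
    higher value than a random non-severe case (label 0); ties count 0.5."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def youden_cutoff(values: List[float], labels: List[int]) -> Tuple[float, float, float, float]:
    """Return (cutoff, J, sensitivity, specificity) for the cut-off that
    maximizes Youden's index J = sensitivity + specificity - 1.
    The first (lowest) cut-off is kept when several reach the same J."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= c)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < c)
        sn = tp / n_pos
        sp = tn / n_neg
        j = sn + sp - 1
        if best is None or j > best[1]:
            best = (c, j, sn, sp)
    return best


# Hypothetical biomarker values; label 1 = severe, 0 = asymptomatic/mild.
toy_values = [0.2, 0.3, 0.4, 0.6, 0.5, 0.9, 1.3, 2.0]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

auc = roc_auc(toy_values, toy_labels)
cutoff, j, sn, sp = youden_cutoff(toy_values, toy_labels)
print(auc, cutoff, j, sn, sp)
```

Statistical packages may place candidate cut-offs midway between adjacent observed values rather than on the observed values themselves, so the exact cut-off returned by this sketch can differ slightly from software output on the same data.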
Results
Study population demographics
In this study, we analyzed data pertaining to a total of 85 COVID-19 cases (Fig 1), of whom 35.3% (n = 30) were asymptomatic, 11.8% (n = 10) were mild and 52.9% (n = 45) were severe. Males constituted the majority of patients in this study (84.7%, n = 72), with females accounting for the remainder (15.3%, n = 13). Mean age of patients was 42 years (SD = 7.73), with a minimum of 27 years and a maximum of 62 years. Age was categorized into two groups, where 43.5% (n = 37) were 40 years or younger and 56.5% (n = 48) were older than 40 years (Table 1). Age group and gender distributions were comparable between the asymptomatic/mild and severe groups (Table 1).
Laboratory findings (clinical tests) and severity of COVID-19 disease
In the study sample as a whole, inflammatory markers including the C-reactive protein (CRP) and D-dimer had median values of 18.2 mg/L (Q1-Q3: 5.45–115.49) and 0.60 μg/ml (Q1-Q3: 0.31–2.24) respectively and mean values of lymphocyte and neutrophil counts of 1.45 (SD = 0.72) and 7.56 (SD = 4.13) cells/μL respectively (Table 1). The majority of the inflammatory markers were found to be significantly higher in the severe COVID-19 group relative to the asymptomatic/mild group. For example, CRP had a median value of 5.90 mg/L in the asymptomatic/mild group and 131.22 mg/L in the severe group (U = 1627.50, p-value<0.001). Similarly, D-dimer had median values of 0.28 μg/ml in the asymptomatic/mild group and 1.27 μg/ml in the severe group (U = 161050, p-value<0.001). While lymphocytes were significantly higher in the asymptomatic/mild group (mean = 1.94 cells/μL) than in the severe group (mean = 1.00 cells/μL; t = 7.880, p-value<0.001), neutrophils were significantly higher in the severe (mean = 9.35 cells/μL) as compared to the asymptomatic/mild group (mean = 5.58 cells/μL; t = -4.773, p-value<0.001) (Table 1). No significant differences were found between the asymptomatic/mild group vs. the severe COVID-19 group, vis-à-vis median values of serum hepcidin or sCD163. However, Hp and soluble sTfR levels were significantly higher (p-value<0.001) in the severe vs. the asymptomatic/mild group; Hp median values were 138.02 vs. 115.73 ng/ml and sTfR median values were 31.61 vs. 21.46 ng/ml (Table 1).
Metabolomics profiles of COVID-19 patients according to disease severity
To investigate the possibility of identifying serum metabolites that help in assessing the prognosis of disease severity, metabolomics profiling of plasma samples from patients with no, mild and severe symptoms was performed. A total of 99 metabolites were measured and compared between the asymptomatic, mild and severe cases. Pairwise comparisons showed comparable results for asymptomatic vs. mild patients; hence, data obtained from asymptomatic patients and patients with mild symptoms were grouped as one “asymptomatic/mild” group, much the same as was done in the previous section. Out of the 99 metabolites (S1 Table), 68 showed significantly different median values between the asymptomatic/mild group and the severe group. Of these 68, eight (8) metabolites (K_4_Aminophenol, Acetaminophen glucuronide, Cytosine, Elaidic acid, Glycine, Isobutyric, Paracetamol sulfate and Succinylacetone) were significantly higher in the severe group. Additionally, sixty (60) metabolites showed significantly higher values in the asymptomatic/mild group vs. the severe group (Table 2). Next, we conducted an enrichment analysis of the Biological Process gene ontology terms linked with those metabolites. The enrichment pathway analysis using the "small molecule pathway database (SMPDB)" (available in MetaboAnalyst 5.0 software) revealed that the pathways for which the differentially abundant metabolites were most enriched included the citrate cycle, phenylalanine metabolism, phenylalanine, tyrosine, and tryptophan biosynthesis, pantothenate and coenzyme A biosynthesis, tryptophan, glycine, and serine (Fig 2A). Additionally, the same data set produced disease-enriched groups for Hartnup disease, acute seizures, critical illness (major trauma, severe septic shock, or cardiogenic shock), and hyperbaric oxygen exposure when it was searched against the "blood disease signatures database" (available in MetaboAnalyst 5.0 software).
As further detailed in the text, the bulk of the diseases or conditions that emerged from this research have symptoms that are consistent with those listed among the most severe COVID-19 cases (Fig 2B).
(A) Enrichment pathway analysis against the "small molecule pathway database (SMPDB)" (available in the MetaboAnalyst 5.0 software); the pathways for which the differentially abundant metabolites were most enriched were the citrate cycle, phenylalanine metabolism, phenylalanine, tyrosine, and tryptophan biosynthesis, pantothenate and coenzyme A biosynthesis, and tryptophan. (B) Enrichment analysis against the "blood disease signatures database" (available in the MetaboAnalyst 5.0 software), which generated disease-enriched groups for Hartnup disease, acute seizures, critical sickness (serious trauma, severe septic shock, or cardiogenic shock), and others. Nodes are coloured according to the level of significance for the enrichment (–log10(p)) and sized according to the number of associated dysregulated members (metabolites).
Determining Cut-off values for Biomarkers
To identify clinical biomarkers of disease severity, ROC curve analysis was performed and the AUC was calculated for each clinical laboratory finding (CRP, ferritin, sTfR, LDH, etc.) and metabolite test. For the clinical laboratory findings, the highest accuracy level in predicting disease severity was for LDH (AUC = 1.000) followed by Ferritin (AUC = 0.988, 95% CI: 0.966 to 1.000), D-dimer (AUC = 0.936, 95% CI: 0.876 to 0.997), CRP (AUC = 0.904, 95% CI: 0.842 to 0.966), IL-6 (AUC = 0.919, 95% CI: 0.831 to 1.000), Hp (AUC = 0.792, 95% CI: 0.696 to 0.889), sTfR (AUC = 0.756, 95% CI: 0.653 to 0.858) and Neutrophils (AUC = 0.749, 95% CI: 0.643 to 0.854) (Table 3, Fig 3A). Hepcidin and Lymphocytes showed either insignificant or low AUC values (<0.70), indicating low accuracy in predicting COVID-19 severity. Optimal cut-off values for the laboratory findings as determined by Youden’s index were 226 for LDH (SN = 100%, SP = 100%), 365 for Ferritin (SN = 95.6%, SP = 100%), 0.545 for D-dimer (SN = 93.0%, SP = 90.0%), 33.95 for IL-6 (SN = 87.1%, SP = 100%), 58.35 for CRP (SN = 73.3%, SP = 100%), 124.37 for Hp (SN = 79.5%, SP = 72.5%), 24.67 for sTfR (SN = 82.2%, SP = 60%), and 8.94 for Neutrophils (SN = 59.1%, SP = 85.0%) (Table 3).
(A) Clinical tests showing adequate AUC values (>0.70), (B) metabolites showing adequate AUC values (>0.70), and (C) ROC curve of the predictive model.
Of all the 99 metabolite tests, five were identified as significant diagnostic tests in predicting COVID-19 disease severity; namely, K_4_Aminophenol (AUC = 0.883, 95% CI: 0.803 to 0.962), Acetaminophen (AUC = 0.949, 95% CI: 0.894 to 1.000), Acetaminophen glucuronide (AUC = 0.791, 95% CI: 0.539 to 1.000), Cytosine (AUC = 0.784, 95% CI: 0.680 to 0.887), and Paracetamol sulfate (AUC = 0.836, 95% CI: 0.660 to 1.000) (S1 Table, Fig 3B). Cut-off diagnostic value for K_4_Aminophenol was 381.5 (SN = 82.2%, SP = 90.0%), for Acetaminophen was 1595.5 (SN = 90.9%, SP = 91.2%), for Acetaminophen glucuronide was 1416.0 (SN = 78.0%, SP = 85.7%), for Cytosine was 818.0 (SN = 68.2%, SP = 90.0%), and for Paracetamol sulfate was 652.5 (SN = 81.0%, SP = 88.9%) (Table 3). Each biomarker was then dichotomized into two groups, low and high, based on its determined cut-off value.
Predicting COVID-19 disease severity
Predicting the severity of COVID-19 was done at two levels, first by using a single biomarker and then by using a combination of biomarkers. All clinical and metabolite tests were significantly and strongly associated with the severity of COVID-19. The proportion of patients with severe COVID-19 in the high group of each clinical/metabolite test was significantly higher than that in the low group. The strength of association, as measured by the odds ratio, between disease severity and the clinical and metabolite groups was lowest for CRP (OR = 4.3, 95% CI: 2.6 to 7.1) and highest for D-dimer (OR = 120.0, 95% CI: 25.1 to 572.9) (Table 4).
A risk scoring system was developed to define a diagnostic model for COVID-19 disease severity based on a combination of important biomarkers used as predictors. All clinical laboratory findings and metabolites that were significantly associated with disease severity were considered as important biomarkers. After excluding biomarkers that were linearly related, and based on the statistical and clinical importance of all identified diagnostic biomarkers, we selected six predictors to construct the risk-scoring predictive model. This model included five laboratory findings (D-dimer, Ferritin, Neutrophils, Hp, and sTfR) and one metabolite (cytosine). A score was calculated for each patient. This score corresponded to the number of biomarkers with levels above their respective cut-off values (high group). The accuracy of predicting disease severity in the risk-scoring predictive model was reflected in a highly significant AUC value of 0.996 (95% CI: 0.989 to 1.000) (Fig 3C). The optimal cut-off risk score for the risk-scoring predictive model, as determined by Youden’s Index, was 3, with a perfect sensitivity of 100% and a specificity of 92.5% (Table 3). Accordingly, all patients with high levels of at least three of the six predictors would be predicted as developing severe COVID-19, and none of the severe cases would be missed.
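For readers less familiar with how an odds ratio and its confidence interval follow from a dichotomized biomarker, a minimal Python sketch is given here. It assumes the standard log-based (Woolf) interval, which is not stated in the text. The 2x2 counts are hypothetical: they were chosen only because they yield figures consistent with those reported above for D-dimer, and they are not taken from Table 4.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with a log-based (Woolf) confidence interval.

    a: severe, biomarker high      b: severe, biomarker low
    c: non-severe, biomarker high  d: non-severe, biomarker low
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (assumption, for illustration only).
or_value, ci_low, ci_high = odds_ratio_ci(a=40, b=3, c=4, d=36)
print(round(or_value, 1), round(ci_low, 1), round(ci_high, 1))
```

The wide interval in this example reflects the small counts in the off-diagonal cells (3 and 4), which dominate the standard error of the log odds ratio.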
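The scoring rule described above can be expressed as a short Python sketch. The cut-off values are those reported in the Results; the dictionary keys, function names and the two example patients are hypothetical, and biomarker values are assumed to be in the same units as the reported cut-offs.

```python
from typing import Dict

# Cut-off values reported in the Results for the six selected predictors.
CUTOFFS: Dict[str, float] = {
    "d_dimer": 0.545,
    "ferritin": 365.0,
    "neutrophils": 8.94,
    "hp": 124.37,
    "stfr": 24.67,
    "cytosine": 818.0,
}

# Optimal risk-score cut-off reported for the model.
RISK_SCORE_CUTOFF = 3


def risk_score(patient: Dict[str, float]) -> int:
    """Count the biomarkers whose level is above its cut-off (high group)."""
    return sum(1 for name, cut in CUTOFFS.items() if patient[name] > cut)


def predicted_severe(patient: Dict[str, float]) -> bool:
    """Predict severe disease when at least three predictors are high."""
    return risk_score(patient) >= RISK_SCORE_CUTOFF


# Two hypothetical patients (values are illustrative, not study records).
patient_a = {"d_dimer": 1.27, "ferritin": 900.0, "neutrophils": 9.35,
             "hp": 138.02, "stfr": 31.61, "cytosine": 500.0}
patient_b = {"d_dimer": 0.28, "ferritin": 120.0, "neutrophils": 5.58,
             "hp": 115.73, "stfr": 21.46, "cytosine": 300.0}

print(risk_score(patient_a), predicted_severe(patient_a))
print(risk_score(patient_b), predicted_severe(patient_b))
```

Because each predictor contributes equally (0 or 1), the score needs no regression coefficients, which is what allows it to be computed by hand from six routine measurements.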
Correlating ferritin with other laboratory findings and plasma metabolites
Among all patients, ferritin values were significantly correlated with the values of all other clinical tests except for hepcidin and sCD163. The significant correlations were all positive except for lymphocytes, which showed a moderate inverse correlation with ferritin (rho = -0.520, p-value<0.001, 95% CI: -0.664 to -0.338). The strongest direct correlations of ferritin were found with LDH (rho = 0.785, p-value<0.001, 95% CI: 0.682 to 0.858), followed by D-dimer (rho = 0.630, p-value<0.001, 95% CI: 0.475 to 0.748) and CRP (rho = 0.618, p-value<0.001, 95% CI: 0.461 to 0.737); the weakest correlation was between ferritin and sTfR (rho = 0.282, p-value = 0.009, 95% CI: 0.067 to 0.472) (Table 5). Moreover, there were significant, mild to moderate inverse correlations between ferritin and several metabolites, with serotonin showing the strongest inverse correlation (rho = -0.697, p-value<0.001, 95% CI: -0.836 to -0.474). Ferritin was also found to have a significant positive correlation with acetaminophen (rho = 0.670, p-value<0.001, 95% CI: 0.521 to 0.780), K_4_Aminophenol (rho = 0.573, p-value<0.001, 95% CI: 0.405 to 0.704) and cytosine (rho = 0.416, p-value<0.001, 95% CI: 0.215 to 0.583) (S2 Table).
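The rank-based correlation used in this section can be illustrated with a minimal Python sketch that computes Spearman's rho as the Pearson correlation of average ranks. The function names and the paired ferritin and lymphocyte values are hypothetical and serve only to show how a negative rho arises; they are not study data.

```python
import math
from typing import List


def average_ranks(xs: List[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x: List[float], y: List[float]) -> float:
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


# Hypothetical paired measurements: ferritin rises as lymphocytes fall.
toy_ferritin = [100.0, 250.0, 400.0, 900.0, 1500.0]
toy_lymphocytes = [2.1, 1.8, 1.9, 1.0, 0.7]

rho = spearman_rho(toy_ferritin, toy_lymphocytes)
print(round(rho, 3))
```

Working on ranks rather than raw values is what makes the coefficient suitable for skewed variables such as ferritin, since extreme values affect only their rank position.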
Discussion
In this study, the concentration of several serum analytes and metabolites was measured in COVID-19 patients with no, mild or severe symptoms. Consistent with numerous previous studies, the serum concentration of analytes that are routinely measured in infected individuals, including CRP, ferritin, IL-6, D-dimer and LDH, was significantly elevated in patients with severe COVID-19 [22–26]. Also consistent with previous work was the observation that the levels of hepcidin [22,25], sTfR [22,23], and Hp [22,24] were either slightly-moderately elevated or not changed in COVID-19 patients with severe disease. Contrary to the suggestion of Zhou et al. [25], our analysis showed that hepcidin is not a true predictor of disease severity in COVID-19 patients. Additionally, while data presented here show that the levels of sCD163 were reduced in severely ill patients, other studies have shown that sCD163 levels increase with disease severity [27]. This discrepancy could be a reflection of variations in methodology, sample collection timing, and/or differences in macrophage activity [28]. Irrespective of these discrepancies, variations in sCD163 concentration seem to have little, if any, impact on COVID-19 disease severity. Our data also showed that Hp phenotype distribution was similar in severe vs. asymptomatic/mild groups, which is in agreement with previous work suggesting that Hp phenotype has no bearing on COVID-19 disease severity [29].
With regard to the metabolomics profiling of COVID-19 patients and as noted earlier, 60 metabolites decreased in the severe cases relative to the asymptomatic/mild patient group. The list of identified metabolites included several amino acids, vitamins and a few fatty acids (Table 1). These are regarded as the fundamental elements that support the rise in cellular demands during illness. However, through catabolic pathways, they are also involved in innate and adaptive immune responses to infection [30,31]. Therefore, the decrease in some of the reported metabolites is consistent with previous studies that reported lower levels of amino acids in hospitalized COVID-19 patients compared to asymptomatic ones [32,33]. The outcomes do in fact support the previously noted negative correlation between amino acids and indicators of immune responsiveness and hyper-inflammation [34], which is characteristic of severe COVID-19. For example, the levels of Kynurenic acid in severe cases were found to be lower than those in asymptomatic/mild cases. Previous studies have suggested that the Tryptophan catabolite/Kynurenine pathway may play a significant role in COVID-19 and critical COVID-19 [35]. Moreover, it appears that the increased level of kynurenine and the ratio of kynurenine to tryptophan are strongly correlated with disease severity in COVID-19 patients [32,36,37]. Interestingly, the ratio of Kynurenic acid/Kynurenine did not significantly differ between COVID-19 patients and non-COVID-19 controls, indicating no significant changes in Kynurenic acid activity, according to a systematic review [35]. This is indeed consistent with our finding that in patients with severe COVID-19, tryptophan and Kynurenic acid levels were significantly lower than in the counter group (Table 3).
Over the last three years, much time and effort has been spent on identifying serum biomarkers that could predict disease severity in COVID-19 patients with high accuracy. Elevated levels of serum biomarkers such as ferritin, IL-6, D-dimer and lactate dehydrogenase, among others, were reported to be valuable predictors of disease severity and death [22–26]. However, not all COVID-19 patients showing elevated levels of one or more of these biomarkers ended up with severe disease and death [38]. In other words, relying on one or more serum analytes tends to yield low prediction accuracy, as evidenced by the fact that such approaches could account for a significant percentage of cases but not all of them. In this context, numerous predictive models that relied on overlapping sets of variables drawn from COVID-19 patients’ demographic data, clinical signs and symptoms, chest X-ray imaging and co-morbidities were proposed (reviewed in [39]). However, many of these severity predictive models suffer from a high degree of subjectivity and a high likelihood of bias [39]. For example, a disease severity predictive model based on the static demographics (age, gender, occupation, urban vs. rural living, socio-economic status, profile, etc.) of >50000 Irish patients showed that modeling such parameters could predict hospitalization [AUC 0.816 (95% CI 0.809–0.822)], admission to ICU [AUC 0.885 (95% CI 0.88–0.89)] and death [AUC 0.955 (95% CI 0.951–0.959)] [40]. In the same study, body mass index (BMI≥40) was shown to be a risk factor for ICU admission [OR 19.630] and death [OR 10.802]. Moreover, while rural living was found to be associated with increased risk for hospitalization [OR 1.200 (95% CI 1.143–1.261)], urban living was found to be associated with increased risk for ICU admission [OR 1.533 (95% CI 1.606–1.682)].
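To make the odds ratios quoted above easier to interpret, the following minimal Python sketch shows how an unadjusted odds ratio and its Wald 95% confidence interval can be obtained from a 2×2 table. The function name and the counts are hypothetical and are not taken from the cited study, which may well have reported adjusted estimates from a multivariable model.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only
or_value, ci_low, ci_high = odds_ratio_with_ci(120, 880, 100, 900)
print(f"OR {or_value:.3f} (95% CI {ci_low:.3f}-{ci_high:.3f})")
```

Because the interval is symmetric on the log scale, the point estimate always lies between the two bounds, which offers a quick consistency check when reading reported values.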
Another study, which developed an artificial intelligence (AI)-based model from 41 variables relating to patient demographics, physical measurements, initial vital signs, comorbidities and laboratory findings in a cohort of 5628 Korean COVID-19 patients, yielded a predictive power of >0.93 when 6 variables were used [40]. Besides the fact that this model could be skewed by its demographic components, making it more population-specific than desired, achieving 93% accuracy by relying on 6 variables is cumbersome and difficult to apply in many poor countries and rural settings. Another AI-based model was developed by relying on laboratory findings including LDH, IL-6, D-dimer, fibrinogen, glucose, monokine induced by gamma interferon (MIG) and macrophage derived cytokine (MDC) levels in 60 Russian COVID-19 patients [41]. The model described by the study relied on eight parameters (creatinine, glucose, monocyte number, fibrinogen, MDC, MIG and CRP) to yield a predictive power of 83–87% [41]. In other words, this laboratory findings-based model failed to account for 13–17% of patients at risk of severe disease.
With six predictive biomarkers at our disposal, we sought to develop a high accuracy prediction model based on integrating disparate data; namely, patient laboratory findings and plasma metabolomics profiles. The statistical model developed and tested was based on ROC-AUC values. Although some metabolites, including acetaminophen, acetaminophen glucuronide and paracetamol sulfate, significantly predicted COVID-19 severity, it was unlikely that these metabolites were involved in the pathophysiology of COVID-19. Rather, severe cases of the disease were more likely to receive more paracetamol than asymptomatic/mild cases. Therefore, these metabolites were excluded from the predictive models. Accordingly, the predictive model was developed, and its prediction accuracy was internally validated, using six biomarkers that were not linearly related; namely, D-dimer, ferritin, neutrophil counts, Hp, sTfR and cytosine. Prediction accuracy of disease severity using this model was 0.996 (95% CI: 0.989 to 1.000) with an optimal cut-off risk score of three. In other words, among 100 patients with severe COVID-19, a significant change in at least three of the six biomarkers would predict disease severity in nearly all 100 patients. The model has the advantage of yielding high predictive power with a small set of variables (three laboratory findings) that can be easily and quickly acquired at minimal cost. Moreover, the predictive model can be applied independent of non-empirical clinical data (co-morbidities, signs and symptoms and loss of taste or smell, among others) and can be dynamically reapplied as the disease progresses, making timely and proper clinical interventions possible. That said, the utility of the model remains limited by the fact that the study was conducted retrospectively on a small number of samples.
Another limitation of our study is that, with the number of COVID-19 cases gradually dwindling to almost zero in the UAE as in most parts of the world, we were not able to compile a new independent dataset with the same set of predictors as a means of validating our prediction model. Future studies are recommended to test the validity of the suggested model on multiple datasets to ensure its generalizability.
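To illustrate how a count-based risk score of the kind discussed above could be evaluated on an independent dataset, the following minimal Python sketch computes a per-patient score, the ROC-AUC of that score, and a cut-off chosen by the Youden index. It is a sketch under stated assumptions rather than the procedure used in this study: the per-biomarker thresholds are placeholders and not clinical reference values, each biomarker is assumed to score one point when it exceeds its threshold, the Youden index is assumed as the cut-off criterion, and all function and variable names are hypothetical.

```python
from typing import Dict, Sequence, Tuple

# Placeholder thresholds (hypothetical; NOT clinical reference values)
CUTOFFS: Dict[str, float] = {
    "d_dimer": 1.0, "ferritin": 1.0, "neutrophils": 1.0,
    "hp": 1.0, "stfr": 1.0, "cytosine": 1.0,
}


def risk_score(patient: Dict[str, float], cutoffs: Dict[str, float]) -> int:
    """Count the biomarkers whose value exceeds its threshold (assumed rule)."""
    return sum(1 for name, cut in cutoffs.items() if patient[name] > cut)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC-AUC as the probability that a severe case (label 1) outscores a
    non-severe case (label 0); ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (cut-off, J), where 'score >= cut-off' predicts severe disease
    and J = sensitivity + specificity - 1 is maximal (first maximum kept)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical patients: values expressed as multiples of each placeholder threshold
patients = [
    ({"d_dimer": 2.0, "ferritin": 3.0, "neutrophils": 1.5, "hp": 0.8, "stfr": 1.2, "cytosine": 2.5}, 1),
    ({"d_dimer": 1.4, "ferritin": 1.1, "neutrophils": 1.3, "hp": 0.9, "stfr": 0.7, "cytosine": 0.6}, 1),
    ({"d_dimer": 0.5, "ferritin": 1.2, "neutrophils": 0.9, "hp": 0.8, "stfr": 0.7, "cytosine": 0.4}, 0),
    ({"d_dimer": 0.3, "ferritin": 0.6, "neutrophils": 0.8, "hp": 0.5, "stfr": 0.9, "cytosine": 0.2}, 0),
]
scores = [risk_score(values, CUTOFFS) for values, _ in patients]
labels = [label for _, label in patients]
print("scores:", scores)
print("AUC:", auc(scores, labels))
print("Youden cut-off and J:", youden_cutoff(scores, labels))
```

Applied to a new cohort with the same six predictors, the same functions would give an external estimate of discrimination that can be compared with the internally validated value.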
Conclusion
By integrating laboratory findings and metabolomic profiling data, a model to predict disease severity in COVID-19 patients was generated. The accuracy of the model was high (>98%), and it has the advantage of requiring three biomarkers to yield high sensitivity and specificity in predicting disease severity. The suggested model may prove useful in better managing COVID-19 patients at high risk of severe disease. Lastly, the fact that the model included cytosine as a biomarker, and that cytosine is not usually included in routine laboratory testing for COVID-19 patients, merits further work on developing reliable and highly sensitive, yet quick and easy to perform, assays to measure serum cytosine concentration.
Supporting information
S1 Table. Statistical Significance of ROC AUC of metabolites in predicting severity of COVID-19.
https://doi.org/10.1371/journal.pone.0289738.s001
(DOCX)
S2 Table. Correlations between Ferritin and the different metabolites.
https://doi.org/10.1371/journal.pone.0289738.s002
(DOCX)
References
- 1. Ball P. What the COVID-19 pandemic reveals about science, policy and society. Interface focus. 2021;11(6):20210022. pmid:34956594
- 2. Horton R. Offline: COVID-19 and the NHS-"a national scandal". Lancet. 2020;395(10229):1022. pmid:32222186
- 3. Singh VB. The human costs of COVID-19 policy failures in India. Nature Human Behaviour. 2021;5(7):810–1. pmid:34040171
- 4. Baros-Steyl SS, Al Heialy S, Semreen AH, Semreen MH, Blackburn JM, Soares NC. A review of mass spectrometry-based analyses to understand COVID-19 convalescent plasma mechanisms of action. Proteomics. 2022;22(18):e2200118. pmid:35809024
- 5. Cheng B, Hu J, Zuo X, Chen J, Li X, Chen Y, et al. Predictors of progression from moderate to severe coronavirus disease 2019: a retrospective cohort. Clinical microbiology and infection: the official publication of the European Society of Clinical Microbiology and Infectious Diseases. 2020;26(10):1400–5. pmid:32622952
- 6. Phua J, Weng L, Ling L, Egi M, Lim CM, Divatia JV, et al. Intensive care management of coronavirus disease 2019 (COVID-19): challenges and recommendations. Lancet Respir Med. 2020;8(5):506–17. pmid:32272080
- 7. Sanyaolu A, Marinkovic A, Prakash S, Abbasi AF, Patidar R, Williams M, et al. A Look at COVID-19 Global Health Situation, 1-Year Post Declaration of the Pandemic. Microbiology insights. 2022;15:11786361221089736. pmid:35464119
- 8. Sethuraman N, Jeremiah SS, Ryo A. Interpreting Diagnostic Tests for SARS-CoV-2. Jama. 2020;323(22):2249–51. pmid:32374370
- 9. Reilev M, Kristensen KB, Pottegård A, Lund LC, Hallas J, Ernst MT, et al. Characteristics and predictors of hospitalization and death in the first 11 122 cases with a positive RT-PCR test for SARS-CoV-2 in Denmark: a nationwide cohort. International journal of epidemiology. 2020;49(5):1468–81.
- 10. Zhang C, Wu Z, Li JW, Zhao H, Wang GQ. Cytokine release syndrome in severe COVID-19: interleukin-6 receptor antagonist tocilizumab may be the key to reduce mortality. International journal of antimicrobial agents. 2020;55(5):105954. pmid:32234467
- 11. Tang N, Li D, Wang X, Sun Z. Abnormal coagulation parameters are associated with poor prognosis in patients with novel coronavirus pneumonia. Journal of thrombosis and haemostasis: JTH. 2020;18(4):844–7.
- 12. Slupsky CM, Cheypesh A, Chao DV, Fu H, Rankin KN, Marrie TJ, et al. Streptococcus pneumoniae and Staphylococcus aureus pneumonia induce distinct metabolic responses. Journal of proteome research. 2009;8(6):3029–36. pmid:19368345
- 13. Ping F, Li Y, Cao Y, Shang J, Zhang Z, Yuan Z, et al. Metabolomics Analysis of the Development of Sepsis and Potential Biomarkers of Sepsis-Induced Acute Kidney Injury. Oxidative medicine and cellular longevity. 2021;2021:6628847. pmid:33981387
- 14. Fraser DD, Slessarev M, Martin CM, Daley M, Patel MA, Miller MR, et al. Metabolomics Profiling of Critically Ill Coronavirus Disease 2019 Patients: Identification of Diagnostic and Prognostic Biomarkers. Critical care explorations. 2020;2(10):e0272. pmid:33134953
- 15. Blasco H, Bessy C, Plantier L, Lefevre A, Piver E, Bernard L, et al. The specific metabolome profiling of patients infected by SARS-COV-2 supports the key role of tryptophan-nicotinamide pathway and cytosine metabolism. Scientific reports. 2020;10(1):16824. pmid:33033346
- 16. Hoerr V, Zbytnuik L, Leger C, Tam PP, Kubes P, Vogel HJ. Gram-negative and Gram-positive bacterial infections give rise to a different metabolic response in a mouse model. Journal of proteome research. 2012;11(6):3231–45. pmid:22483232
- 17. Awadallah S, Hamad M. The prevalence of type II diabetes mellitus is haptoglobin phenotype-independent. Cytobios. 2000;101(398):145–50. pmid:10755213
- 18. Alsoud LO, Soares NC, Al-Hroub HM, Mousa M, Kasabri V, Bulatova N, et al. Identification of Insulin Resistance Biomarkers in Metabolic Syndrome Detected by UHPLC-ESI-QTOF-MS. Metabolites. 2022;12(6). pmid:35736441
- 19. Jalaleddine N, Hachim M, Al-Hroub H, Saheb Sharif-Askari N, Senok A, Elmoselhi A, et al. N6-Acetyl-L-Lysine and p-Cresol as Key Metabolites in the Pathogenesis of COVID-19 in Obese Patients. Frontiers in immunology. 2022;13:827603. pmid:35663953
- 20. IBM Corp. Released 2020. IBM SPSS Statistics for Windows VA, NY: IBM Corp.
- 21. Perkins NJ, Schisterman EF. The inconsistency of "optimal" cutpoints obtained using two criteria based on the receiver operating characteristic curve. American journal of epidemiology. 2006;163(7):670–5. pmid:16410346
- 22. Taneri PE, Gómez-Ochoa SA, Llanaj E, Raguindin PF, Rojas LZ, Roa-Díaz ZM, et al. Anemia and iron metabolism in COVID-19: a systematic review and meta-analysis. European journal of epidemiology. 2020;35(8):763–73. pmid:32816244
- 23. Sonnweber T, Boehm A, Sahanic S, Pizzini A, Aichner M, Sonnweber B, et al. Persisting alterations of iron homeostasis in COVID-19 are associated with non-resolving lung pathologies and poor patients’ performance: a prospective observational cohort study. Respiratory Research. 2020;21(1):276. pmid:33087116
- 24. Maisonnasse P, Poynard T, Sakka M, Akhavan S, Marlin R, Peta V, et al. Validation of the Performance of A1HPV6, a Triage Blood Test for the Early Diagnosis and Prognosis of SARS-CoV-2 Infection. Gastro hep advances. 2022;1(3):393–402. pmid:35174366
- 25. Zhou C, Chen Y, Ji Y, He X, Xue D. Increased Serum Levels of Hepcidin and Ferritin Are Associated with Severity of COVID-19. Medical science monitor: international medical journal of experimental and clinical research. 2020;26:e926178. pmid:32978363
- 26. Terpos E, Ntanasis-Stathopoulos I, Elalamy I, Kastritis E, Sergentanis TN, Politou M, et al. Hematological findings and complications of COVID-19. American journal of hematology. 2020;95(7):834–47. pmid:32282949
- 27. Zingaropoli MA, Nijhawan P, Carraro A, Pasculli P, Zuccalà P, Perri V, et al. Increased sCD163 and sCD14 Plasmatic Levels and Depletion of Peripheral Blood Pro-Inflammatory Monocytes, Myeloid and Plasmacytoid Dendritic Cells in Patients With Severe COVID-19 Pneumonia. Frontiers in immunology. 2021;12:627548. pmid:33777012
- 28. Chauvin P, Morzadec C, de Latour B, Llamas-Gutierrez F, Luque-Paz D, Jouneau S, et al. Soluble CD163 is produced by monocyte-derived and alveolar macrophages, and is not associated with the severity of idiopathic pulmonary fibrosis. Innate immunity. 2022;28(3–4):138–51. pmid:35522300
- 29. Delanghe JR, Speeckaert MM, De Buyzere ML. COVID-19 infections are also affected by human ACE1 D/I polymorphism. Clinical Chemistry and Laboratory Medicine (CCLM). 2020;58(7):1125–6. pmid:32286246
- 30. McGaha TL, Huang L, Lemos H, Metz R, Mautino M, Prendergast GC, et al. Amino acid catabolism: a pivotal regulator of innate and adaptive immunity. Immunological reviews. 2012;249(1):135–57. pmid:22889220
- 31. Sikalidis AK. Amino acids and immune response: a role for cysteine, glutamine, phenylalanine, tryptophan and arginine in T-cell function and cancer? Pathology oncology research: POR. 2015;21(1):9–17. pmid:25351939
- 32. Karu N, Kindt A, van Gammeren AJ, Ermens AAM, Harms AC, Portengen L, et al. Severe COVID-19 Is Characterised by Perturbations in Plasma Amines Correlated with Immune Response Markers, and Linked to Inflammation and Oxidative Stress. Metabolites. 2022;12(7). pmid:35888742
- 33. D’Alessandro A, Thomas T, Akpan IJ, Reisz JA, Cendali FI, Gamboni F, et al. Biological and Clinical Factors Contributing to the Metabolic Heterogeneity of Hospitalized Patients with and without COVID-19. Cells. 2021;10(9):2293. pmid:34571942
- 34. Schrijver B, Assmann J, van Gammeren AJ, Vermeulen RCH, Portengen L, Heukels P, et al. Extensive longitudinal immune profiling reveals sustained innate immune activation in COVID-19 patients with unfavorable outcome. European cytokine network. 2020;31(4):154–67. pmid:33648924
- 35. Almulla AF, Supasitthumrong T, Tunvirachaisakul C, Algon AAA, Al-Hakeim HK, Maes M. The tryptophan catabolite or kynurenine pathway in COVID-19 and critical COVID-19: a systematic review and meta-analysis. BMC Infectious Diseases. 2022;22(1):615. pmid:35840908
- 36. Richard VR, Gaither C, Popp R, Chaplygina D, Brzhozovskiy A, Kononikhin A, et al. Early prediction of COVID-19 patient survival by targeted plasma multi-omics and machine learning. Mol Cell Proteomics. 2022:100277. pmid:35931319
- 37. Cihan M, Doğan Ö, Ceran Serdar C, Altunçekiç Yıldırım A, Kurt C, Serdar MA. Kynurenine pathway in Coronavirus disease (COVID-19): Potential role in prognosis. Journal of clinical laboratory analysis. 2022;36(3):e24257. pmid:35092710
- 38. Samprathi M, Jayashree M. Biomarkers in COVID-19: An Up-To-Date Review. Frontiers in Pediatrics. 2021;8. pmid:33859967
- 39. Wynants L, Van Calster B, Collins GS, Riley RD, Heinze G, Schuit E, et al. Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal. BMJ (Clinical research ed). 2020;369:m1328. pmid:32265220
- 40. Oh B, Hwangbo S, Jung T, Min K, Lee C, Apio C, et al. Prediction Models for the Clinical Severity of Patients With COVID-19 in Korea: Retrospective Multicenter Cohort Study. Journal of medical Internet research. 2021;23(4):e25852. pmid:33822738
- 41. Krysko O, Kondakova E, Vershinina O, Galova E, Blagonravova A, Gorshkova E, et al. Artificial Intelligence Predicts Severity of COVID-19 Based on Correlation of Exaggerated Monocyte Activation, Excessive Organ Damage and Hyperinflammatory Syndrome: A Prospective Clinical Study. Frontiers in immunology. 2021;12:715072. pmid:34539644