Metabolomics based predictive biomarker model of ARDS: A systemic measure of clinical hypoxemia

Despite advancements in ventilator technologies, lung supportive and rescue therapies, the outcome and prognostication in acute respiratory distress syndrome (ARDS) remains incremental and ambiguous. Metabolomics is a potential insightful measure to the diagnostic approaches practiced in critical disease settings. In our study patients diagnosed with mild and moderate/severe ARDS clinically governed by hypoxemic P/F ratio between 100–300 but with indistinct molecular phenotype were discriminated employing nuclear magnetic resonance (NMR) based metabolomics of mini bronchoalveolar lavage fluid (mBALF). Resulting biomarker prototype comprising six metabolites was substantiated highlighting ARDS susceptibility/recovery. Both the groups (mild and moderate/severe ARDS) showed distinct biochemical profile based on 83.3% classification by discriminant function analysis and cross validated accuracy of 91% using partial least squares discriminant analysis as major classifier. The predictive performance of narrowed down six metabolites were found analogous with chemometrics. The proposed biomarker model consisting of six metabolites proline, lysine/arginine, taurine, threonine and glutamate were found characteristic of ARDS sub-stages with aberrant metabolism observed mainly in arginine, proline metabolism, lysine synthesis and so forth correlating to diseased metabotype. Thus NMR based metabolomics has provided new insight into ARDS sub-stages and conclusively a precise biomarker model proposed, reflecting underlying metabolic dysfunction aiding prior clinical decision making.


Introduction
Acute respiratory distress syndrome (ARDS), despite many ventilator and therapeutic measures persists with a death toll of more than 40%,which has grabbed the attention of intensivists worldwide [1]. With a incidence of 58.7 cases per 100,000 population [2], efforts are needed to understand ARDS progression, aid prognostication and predict outcome with better a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 follow up and surveillance. It is clinically manifested by acute respiratory hypoxemia, bilateral pulmonary infiltrates and noncardiogenic pulmonary edema resulting from direct or indirect pulmonary injury [3,4]. Since ARDS first depiction in 1967 [5], several studies have been carried out to address its underlying heterogeneity, complex etiological factors and epidemiology which misleads diagnosis, risk stratification and subsequent therapeutic intervention [6]. Despite large number of rescue therapies, [7][8][9][10][11]the diagnostic criteria do not suffice in the management of ARDS [12][13][14][15]. The new diagnostic criteria based on Berlin definition [4] convened in 2012categorizes ARDS patient on the basis of partial pressure of oxygen to the fraction of inspired oxygen (P/F) ratio between 200-300, 100-200 and below 100 into mild, moderate and severe group respectively. But the disparity in clinical definitions so far proposed with no specific pharmacotherapy and many predisposing factors (sepsis, pneumonia, trauma, aspiration, pancreatitis) [16,17]have added to ARDS morbidity, thus necessitating symptomatic biomarkers. In continuum biomarkers of ARDS (interleukins, angiopoietin-2, and surfactant proteins) [18][19][20] have been explored for high risk prediction and to predict the mortality outcome. Biomarkers with predictive value, accuracy, specificity and sensitivity would permit a comprehensive evaluation of underlying predisposition to disease [21,22]. In search of biomarkers with high discriminatory ability, different biofluids have been studied extensively for better prognostication which has always served as a conventional indicator of disease pathophysiology [23,24].
The basis of further exploratory studies in ARDS relies on omics approaches by using biofluids to monitor the cellular metabolism in the diseased state. Though genomics [25] and proteomics [26]based studies have made significant progress in targeting pathological derangements but metabolomics has complemented and facilitated the contemporary understanding of dynamic, systemic and functional aspects of ARDS pathophysiology [27]. Metabolomics gives a comprehensive overview of the metabotype in response to exogenous, endogenous and environmental stimuli thus holds the novel aspect of providing readout of real time in-vivo phenotype from the top-down biochemical phenomena [28]. Many disease implicated markers of ARDS and related pulmonary diseases have been reported and probed in serum [29], exhaled breath condensate [30], undiluted pulmonary edema fluid [31], plasma [32], whole blood for gene expression analysis [33], urine [34], bronchoalveolar lavage fluid (BALF) [35][36][37][38][39][40][41] employing metabolomics. The simultaneous targeted and untargeted metabolite profiling, computing the inherent variance and its quantitation in response to disease or drug relies on robust, reproducible and time conducive analytical platform requiring minimal sample preparation [42]. The multiple precipitating factors of ARDS has gained notable interest among clinicians to rely on method that is unbiased, unambiguous and renders multiparametric snapshot of molecular level changes aiding both qualitative and quantitative therapeutic response to disease. Nuclear magnetic resonance (NMR) based metabolomics has emerged out as a reliable method for detecting distinct chemical signature of each metabolite and its relative perturbation in diseased state non-invasively with least sample processing thus enabling dynamic reflection of pathophysiology in biofluids [43,44]. ARDS which is radiological and clinically attributed by edema, infiltrates and inflammation of lungs requires localized epithelial lining fluid to best capture the cellular and biochemical changes [45]. A recent study has established mini Bronchoalveolar lavage fluid (mBALF) [46]from the proximal alveoli as diagnostic biofluid reflecting pulmonary metabolome for biomarker studies by NMR based metabolomics [47].Aiming to resolve the long standing bottleneck of ARDS prognostication and progression, NMR based phenotyping using mBALF though in its infancy has resulted in differentiating metabolites of acute lung injury (ALI) and ARDS with respect to controls [48]. But the explicit role of these metabolites as surrogate markers in terms of clinical utility and applicability is yet to be validated for clinical decision making and clinical trials.
The severity classification as per the new Berlin definition is based on the clinical index mainly P/F ratio which rules out some key ancillary variables of high risk profile thereby stressing more reliable and unambiguous insightful studies. Our study aims to better interpret ARDS severity classification using NMR based metabolomics and thereby set up metabolite credentials for a biomarker model aiding ARDS disease management and follow up. Our present study gives the distinct biochemical profile of mild and moderate/severe ARDS group based on endogenous metabolic profile. Differential biomarker candidates were categorized based on chemometric and pattern recognition methods from lung alveoli using NMR spectroscopy of mBALF. Comprehensive univariate and multivariate analysis resulted in the pivotal role of metabolites such as taurine, threonine, glutamate, proline, and lysine/arginine to metabolic perturbations in acute and mild stages of ARDS. Substantial role of these metabolites in ARDS intricate metabolic network was found to be governed by the dysregulated arginine and proline metabolism, lysine synthesis and degradation, taurine and hypotaurine metabolism, glycine, serine and threonine metabolism and glutamine and glutamate metabolism. The above mentioned analysis led to a putative biomarker model predictive of ARDS susceptibility or recuperation and the underlying dynamic biochemical behavior. Potential and prospects of the proposed biomarker model comprising these six metabolites is also evident from receiver operating characteristic (ROC) curve and significant pathway analysis impact greater than equal to 0.1. Thus metabolomics studies can correlate clinical endpoints with diseased specific markers that will be predictive, preventive and participatory. Correspondingly such biomarker model validated in large sample size holds application for early identification of high risk individuals providing metabolic and clinical variables of ARDS progression left unmapped.

Patients
Total 36 samples were included in the study with 23 samples from patients diagnosed with moderate/severe ARDS and 13 samples taken from mild ARDS patients enrolled at Intensive Care Unit (ICU) admission of Sanjay Gandhi Post Graduate Institute of Medical Sciences (SGPGIMS). Patients included in the study group were suffering from mild and moderate/ severe ARDS as stipulated by Berlin definition convened in the year 2012. The diagnostic criteria for moderate/severe ARDS [49] was based on the P/F ratio between 100-200 and 100 with bilateral opacities on chest radiograph that need not be explained by effusions or collapse, positive end-expiratory pressure(!10cm H 2 O) and respiratory system compliance. Similar diagnostic measures were followed for mild patients except the P/F ratio was taken between 200-300. Lung protective ventilation as per the ARDSNet protocol was practiced in the ICU. Exclusion criteria in the current study include patients with age less than 18 years, pregnancy, chronic obstructive pulmonary disease (COPD) patients, bronchial asthma, interstitial lung disease, and other chronic respiratory ailments.

Study protocol
The cross-sectional study was performed in ICU of a tertiary care medical center in SGPGIMS, Lucknow, and Centre of Biomedical Research (CBMR), Lucknow, India. The experimental protocol was approved by the SGPGIMS ethical committee and written informed consent was obtained from all subjects or their surrogate decision makers. The mBALF sample from mechanically ventilated patients was collected using the standardized non bronchoscopic "catheter in catheter" technique [50] in ICU within 24 hours of diagnosis. For mBALF suctioning, a soft suction, nontoxic, pyrogen free, graduated 16 French gauge/8 French gauge catheter in catheter set was employed. The sample was collected in mucus trap using 10ml sterile distilled water, transferred to a collection vial following all the sterile aseptic precautions and then quenched in liquid nitrogen till further processing. The mBALF sampling and experiments were performed in accordance with the approved guidelines and regulations of the institute's ethical committee.

mBALF sampling and processing
The samples were vortexed and then centrifuged at 16000 rpm for 10min at 4˚C to remove cellular debris, bacteria and then supernatant preserved at -80˚C inside freezer for storage till NMR experiments were performed. The acquisition parameters were optimized for 1 H NMR mBALF spectra of mild and moderate/severe ARDS group in 800MHz NMR spectrometer (BrukerBiospin). The process of NMR spectral acquisition begins with preparation of 550μl sample including 200μl buffer to minimize the variation in pH with Trimethylsilylpropanoic acid (TSP), D 2 O and 350μl BALF sample. TSP (6.53mM) was added in the buffer (0.1 M Na 2 HPO 4 /NaH 2 PO 4 , pH-7.4) for internal chemical shift reference along with 10% D 2 O to provide a field frequency lock NMR spectroscopy NMR spectra of diseased sample were recorded at 800-MHz Bruker NMR spectrometer encompassed with a cryogenically cooled triple-resonance TCI ( 1 H, 13 C, 15 N, and 2 H lock) probe at 300.15K for one dimensional (1D) proton ( 1 H) NMR data acquisition of mBALF samples. All 1D NMR spectra was based on water pre saturation pulse sequence and acquired using a 90˚flip angle, a 20ppm spectral width and a relaxation delay of 5 seconds, 32 transients were collected into 64k data points with an acquisition time (T aq ) of 1.99 seconds and 16 dummy scans. 1 H 1D spectra were referenced to the TSP signal (δ = 0.00 ppm). The Free induction decays (FIDs) were multiplied by an exponential weighting function corresponding to a line broadening function of 0.3 Hz and zero filled before Fourier transformation. The acquired spectra were phased and baseline corrected and manually integrated with respect to TSP for chemical shift calibration and calculating relative resonance intensities of small molecular weight metabolites. Small metabolite resonances present in 1Dspectra were assigned and confirmed by using two dimensional (2D) NMR spectra including HSQC (Figure A in S1 Text), TOCSY ( Figure B in S1 Text), COSY and Biological magnetic resonance bank (BMRB) database. For 1 H-13 C HSQC, a total of 2048 data points were collected and 24 scans were averaged for each of 826 increments covering spectral width of 16 ppm and 165 ppm in 1 H and 13 C dimension respectively. Prior to Fourier transform, zero filling up to 2k data points with forward linear prediction to 128 points and shifted sine-bell-squared apodization function were used in both t2 and t1 dimension. TOCSY spectra was acquired in phase sensitive mode and mixing time of 80 ms was set using MLEV spin lock. A total of 2k data points and 256 increments were acquired with 60 scans per increment. Prior to Fourier transform, FID was weighted in both dimension by sine-bell shaped window function and zero filled to 1k data points with 2s relaxation delay. Resonances present in 1 H NMR spectra were assigned to 29 small molecular weight metabolites in the diseased spectra. Among these, 17 metabolites with significant aberrance in concentration reported from our previous study [36] were manually integrated using Topspin 2.1.

Statistical analysis
In order to compute potential discriminating markers of mild and moderate/severe ARDS group, a series of univariate and multivariate analysis was performed using the software Metaboanalyst [51] and SPSS 16.0. Data was Pareto scaled and log transformed, a pre-treatment method to obtain Gaussian distribution. Discriminant function analysis(DFA) was performed to get the accuracy of the model with 17 significant metabolites. PCA and PLS-DA analysis of these discriminating variables were performed to reduce the dimensionality of matrix and to maximize the covariance between predictor variables (metabolite intensities) and the response variables (class labels). These biomarker candidates were further validated by Independent Sample T-test at the univariate level to highlight the important variable with a threshold p value of < 0.05. For statistical inference VIP values (weighted sum of squares of PLS loadings) greater than 1 was used to specify discriminating variables. Sorted 9 metabolites were plotted in bar chart to get an overview of their mean concentration. DFA, PCA, PLS-DA were performed in succession, using the resonance intensities of these 9 metabolites to categorize the model and to project the latent variables (components) which best infer the response variables. These putative metabolites were further narrowed down to selective differential biomarkers by stepwise DFA. Using projection algorithms PCA and PLS-DA, these metabolites were checked for improved separation between the groups. Feature selection tools like supervised hierarchical clustering (HC), empirical Bayesian analysis of metabolites (EBAM), volcano plot and random forest (RF) led to the conclusive results. A detail of these statistical models is provided in the supplementary information. The Pathway analysis module was performed using Metaboanalyst, which incorporates pathway enrichment analysis and topology analysis based on KEGG database and employs novel algorithms and concepts to identify the utmost affected pathway in the diseased state. Area under the ROC curve(AUROC) was used as a benchmark to assess the diagnostic parameters (sensitivity and specificity) of candidate biomarker prototype responsible for validating the possible confounders within the stages of diseased groups.

Patient characteristics
The clinical and baseline characteristics of the patients at the time of sampling are provided in Table 1 which includes demographic profile and illness severity scores like Acute Physiology and Chronic Health Evaluation-II(APACHE II) and Sequential Organ Failure Assessment (SOFA) which was taken at the time of admission. The mild and moderate/severe patients were age matched and reported with more or less similar co-morbidities with a higher total leukocyte count (TLC) and procalcitonin value in ARDS.

NMR snapshot of ARDS substages
A representative NMR spectrum showing the difference between mild and moderate/severe ARDS is depicted in Fig 1. The NMR spectrum from moderate/severe stage of ARDS are mostly dominated by resonances from amino acids, lactate, glucose, intermediates of tricarboxylic acid cycle and many other low molecular weight endogenous metabolites while the mild group is predominated by glutamate and choline resonances in the complex mixture of mBALF sample as per the Fig 1. In order to obtain a significant comparison across all subjects with interpretation of metabolic signatures predictive of disease, data were subjected to chemometric analysis. Out of the 29 metabolites assigned 18 were found to have anomaly in their intensity values as reported from our earlier study [36]. Due to the large variation exhibited by lactate resonance which hindered the possible interpretation of significant results, it was excluded from further statistical analysis. Lactate is also nonspecific and intrigued in terms of its diagnostic value as its upsurge is common phenomena associated with tissue hypoperfusion together with many etiologies [52]. So in the present study (36 disease samples) the 17 prominent metabolites which contributed to the separation within different categories of ARDS were selected to get a snapshot of their relevance on a larger dataset. These metabolites were further substantiated for their paramount role in distinguishing 13 mild from 23 moderate/severe patients by various chemometric approaches. The study made progress with screening of 9 significant variables from 17 selected and further narrowing down to 6 potential biomarkers. Since single a priori statistical tool is susceptible to underlying data structure therefore successive univariate and multivariate analysis were carried out to corroborate the robustness and reproducibility of the six metabolites linked to the diseased group with their anomaly interconnected to downstream metabolic pathway.

Class separation based on significant metabolites
Although clinical disposition of ARDS and it varying heterogeneous causes and outcome adds to the uninduced biological variation, hence standardized protocols were practiced in ICU by clinicians and expertise to mitigate the amount of technical variation in sampling. All the intubated patients in ICU underwent sedation and analgesia thereby corresponding to minimal stress repercussions. Sampling has been carried out using a standardized non bronchoscopic "catheter in catheter" technique and a fixed volume of sterile distilled water was used for suctioning of tracheal aspirates by trained person. The visual interpretation of NMR generated multidimensional datasets reflects the underlying induced biological variation. This necessitates data mining and other exploratory data analysis by chemometrics with requisite data pretreatment measures [53] to remove the heteroscedasticity due to uninduced biological variation[54] mentioned above. To strengthen the classification and prediction of differential metabolites, NMR data was manually referenced to TSP and Pareto scaled [55] and log transformed [56] removing the skewed distribution because of heteroscedasticity thereby obtaining a symmetric data for further confirmatory statistical analysis. Firstly, the DFA test classified the model with 17 metabolites as 94.4% accurate with lower Wilks lambda, significant classification as apparent from chi square test, with greater Eigen value and significant value of 0.01 (Table A in S1 Text). All the data were log transformed and Pareto scaled ( Figure C in S1 Text). The PCA and PLS-DA ( Figure D in S1 Text) further assured that the experimental parameters (17 metabolites) can be employed to probe for the attributes contributing to separation between mild and moderate/severe stages of ARDS. Details of PCA and PLS-DA component contributing to model significance are provided in the supplementary information. The lower values of R2 and Q2 (accuracy = 0.88, R2 = 0.78 and Q2 = 0.54) can be correlated with the separation which is being sorted within the diseased group. Amid all the 17 metabolites scrutinized, the determinant metabolites were ranked by the VIP score. High VIP score of more than 1 was used as a cutoff to include variables with discriminatory power ( Table B in S1 Text). Phenylalanine, threonine, taurine, proline, lysine/arginine, alanine, glycine, branched chain amino acids (BCA) were included in the classification. In accordance with the Figure E in S1 Text all the metabolites except glutamate and acetate were notable in the mild group. The observed significant metabolites obtained in independent sample T-test with unequal variance (threshold p value <0.05 and confidence interval 95.0%) were analogous with the VIP ranked metabolites (Table B in S1 Text). These results further corroborated these set of metabolites as significant.

Chemometrics based screening of metabolites
These eight metabolites were further screened by an iterative multivariable data mining strategies. The discriminating variables were again subjected to DFA with nine metabolites including glutamate as its prominence was shown in the mild group. The mean concentration of these nine metabolites is also assessed in arbitrary unit (au) with respect to the concentration of TSP with standard error shown as error bars ( Figure F in S1 Text). An elevated level of all the metabolites is depicted in the moderate/severe group excluding glutamate which is showing an increased concentration in mild group. The PCA and PLS-DA showed improved separation with glutamate and yielded conclusive results in subsequent analysis with DFA which yielded 88.9% correct classification (Table C in S1 Text).

Six differentiating markers of ARDS disease course
Finally, stepwise DFA sorted out 6 metabolites (proline, lysine/arginine, taurine, threonine, glutamate) with 83.3% correct classification suggesting their predominance in ARDS (Table D in S1 Text). Their discriminating power is reflected both in PCA, PC 1 = 35.1% and PC 2 = 20.8% (Fig 2a) and PLS-DA Component 1 = 20.8% and Component 2 = 26% (Fig 3a). Each spectrum represents an observation (scores) and by examining the proximity of these observation due to variables contribution (loadings) under each observation within and between the cluster of Mild ARDS and Moderate/Severe ARDS the similarity and dissimilarity in metabolic profile of mild ARDS and moderate/severe ARDS can be explained. Projection of scores and loadings in two dimensional score plot of PCA and PLS-DA (Figs 2 and 3), visually revealed variance and separation between mild ARDS and moderate/severe ARDS. On the basis of PCA scores and loadings as points and arrows, a biplot (planar graph) (Fig 2b) and loading plot (Fig 2c) was plotted which provided the explanatory pattern in the data set. The six most important metabolites contributing to the 60% PCA variance were interpreted on the basis of loadings. Further biplot, a scatter plot representing an overlay of scores and loading was reduced along two principal components with arrows indicating the loadings of six metabolites, making the observation (scores) and the corresponding metabolite correlation more obvious and significant in mild ARDS and moderate/severe ARDS. Longer arrow outlines the relevance of these metabolite and distance of projection depicts its level in discriminatory group as per the Fig 2b. The Fig 2b explains the role of glutamate in mild ARDS whereas threonine, taurine, lysine/arginine, and proline contribution was found towards moderate/severe ARDS group. An improved accuracy = 0.91, predicted (R2 = 0.72) and expected variance (Q2 = 0.60) were in accordance with the above stated result using leave one out cross validation (LOOCV) (Fig 3b and Table E in S1 Text). The PLS-DA model validation by permutation test was determined by separation distance based on sum of squares between and sum of squares within (B/W) ratio. The random class assignment prediction based on 100 permutations is plotted in histogram (Fig 3c). The p valueless than 0.01 and B/W ratio of real class assignment is calculated as shown in Fig 3c. If the real B/W ratio is not a part of the random distribution, it can be deduced that the difference between the respective groups is significant as seen in the Fig 3c. Validation of predictive biomarker model The selected subsets of disease specific markers (proline, lysine, arginine, threonine, taurine and glutamate) were further supported by various feature selection tools. The results were confirmed with volcano plot, EBAM (Figure G in S1 Text), random forest analysis ( Figure H in S1 Text) and hierarchical clustering analysis (HCA). The details of these analyses have been provided in the supplementary information. In HCA, represented as tree dendrogram a discrete grouping is observed which signifies biological variance using the Pearson correlation and wards linkage (Fig 4a). The groups were segregated on the basis of metabolic intensities of 6 metabolites with the height of branches representing the distance between the groups. Red cluster denoting mild group and green as moderate/severe ARDS. Heat map of metabolite intensities of the 6 selective biomarkers shows correlation between the 1 H NMR spectral regions of selective metabolites to respective mild and moderate/severe ARDS group (Fig 4b).
The predictive accuracy, specificity and sensitivity of a single biomarker is rather speculative thus an ensemble of biomarkers is sorted for more elaborate understanding of pathophysiology which governs its clinical utility. Subsequently ROC study was applied for the proposed biomarker prototype comprising proline, taurine, lysine/arginine, threonine and glutamate which came out to be 0.95 (Fig 4c).

Metabolome interaction network
The putative biomarker model comprising narrowed down metabolites pooled from biostatistics were further evaluated by generating metabolite pathway analysis highlighting the most correlated and discriminant pathway under study. The pathway analysis helped in establishing the underlying connectivity of the proposed biomarker prototype with the affected biological process based on signature specificity and metabolite impact !0.1. The most relevant and significant metabolic pathway mirroring ARDS biochemical phenomena included arginine and proline metabolism, lysine biosynthesis, lysine degradation, aminoacyl-tRNA biosynthesis, taurine and hypotaurine metabolism, glycine, serine and threonine metabolism and D-glutamine and D-glutamate metabolism, alanine, aspartate and glutamate metabolism (Fig 5a). The interaction between metabolites and the corresponding biological process involved in the diseased condition can help to probe the possible biological roles of the upregulated/ downregulated metabolites (Fig 5b). The metabolites comprising biomarker model (proline, lysine/ arginine, taurine, threonine, glutamate) were also found with significant impact in pathway analysis thereby further large sample study will help in predicting their role in susceptibility/ recovery of ARDS disease course.
A comparison study was followed up to understand the role of these biomarkers in lung physiology. The novel findings in this study arise from the fact that:(1) NMR based metabolomics can differentiate mild and moderate/severe ARDS group on the basis of discrete metabolic signature distinct from the clinical and other ancillary variables. (2) This is the first differential study executed on such sample size of ARDS employing the New Berlin definition.

Discussion
Lung injury due to its many etiologies results in complex metabolic perturbation, consequently metabolites linked to the diseased state shows a surge of metabolites interlinked with the dysregulation of metabolism that can be used to address the various clinical manifestations. ARDS pathophysiology governing inflammatory reactions results in hypoxia, reactive oxygen species (ROS) production and oxidative stress which trigger oxidative attack on lysine, arginine, proline, threonine leading to carbonylated proteins. Carbonylated proteins which are oxidated proteins have found to be implicated in ARDS [57]similar to what we found elevated in our study. Glutamate was found in increased concentration in the mild group indicating its association with less diseased lung of ARDS. Falling in the class of excitatory neurotransmitters glutamate evidential role has been implicated in acute lung injury [58]. The role of glutamate agonist N-methyl-D-aspartate (NMDA) has been reported in the rat lungs. [59] NMDA mediates excitotoxicity through NMDA receptors present possibly in the alveolar capillary area by causing pulmonary edema [60]. Glutamate signaling can in turn evoke nitric oxide (NO) production and apoptosis attributed by caspase 3 productions. Further studies on the physiological role of glutamate signaling in lung toxicity can unravel more restorative measures for ARDS.
Macrophage and neutrophil sequestration results in the collateral release of ROS, proinflammatory cytokines hampering epithelial and endothelial functions with increased microvascular permeability due to endothelial barrier disruption. The consequent free radical production during the pathogenesis of ARDS is likely reduced by the high concentration of taurine. Taurine is an endogenous sulphur containing amino acid found in neutrophil. Taurine has already been attributed for its protective role as an antioxidant with anti-inflammatory properties. Taurine has also been found imperative in alleviating IL-2 induced lung injury by attenuating neutrophil-endothelial interaction [61]. Thus taurine escalation in ARDS is an index of rescue mechanism to check inflammation.
Another metabolite found proliferated in ARDS patients in our study was arginine. The key role of arginine has been signified in the intermediary metabolism of critically ill patients with salient effect in the periods of hypermetabolic stress [62]. Arginine is the only biosynthetic precursor of NO. Arginine upregulation can be correlated with the body defense mechanism to increase NO production to relieve pulmonary hypertension associated with hypoxia [63]. NO a potent endothelial derived relaxing factor produced by lungs, has got pertinent roles in lung physiology and pathobiology of lung diseases [64]. Arginine can be prospective target to modulate the endothelial NO production. Lysine level was also found to be increased in ARDS and may be ascribed for its viable defensive role in ARDS. Its role has been found evident in decreasing NO production thus increasing vascular resistance as reported in neonatal pigs [65].
Another metabolite upregulated in the diseased group is threonine. Threonine a gluconeogenic amino acid is found to be associated with sepsis [66] which accounts to be a major aetiological factor of ARDS. It regulates immune responses by the production of antibodies, triggers lymphocyte proliferation with a significant role in the inhibition of apoptosis [67]. There is a paucity of literature suggesting the immunoregulatory role of threonine in the critical setting of lung physiology. Proline, a proteogenic secondary amino acid is a key precursor of collagen formation and mesenchyme in the lungs. It showed a significant increase in the ARDS group. Collagen is found to be associated with branching in airways. Airway remodelling in chronic lung diseases is reported with the production of proline due to the activity of arginase from alveolar macrophages [68]. Proline residues are vulnerable to oxidation by ROS resulting in fragmentation and apparent loss of function. Proline displays a relevant role as a stress substrate in the microenvironment of inflammation [69].
These metabolites (lysine, arginine, proline, threonine, taurine, glutamate) can be used as a possible measure to illustrate the complex pathophysiology associated with lung injury and thus exemplifying the different metabolic pattern and its associated biochemical alterations exhibited in mild and moderate/severe ARDS. The above pilot study helped in identifying diagnostic signatures based on ARDS substages. But the subsequent biomarker credentials based on severity and progression requires more such follow up and outcome studies with validation in large sample size. The above mentioned limitation is on the pipeline to further establish the role of the proposed biomarker model in the clinical setting.

Conclusion
In conclusion, NMR based metabolomics study was performed to fingerprint the metabolites depicting different stages of ARDS and their impact in the associated biological pathway suggesting their possible role in susceptibility/recuperation. Iterative statistics led to the compilation of 6 key metabolites which could differentiate mild and moderate/severe ARDS, thus arriving at a conclusive predictive biomarker model. The pivotal biological roles of these biomarker candidates investigated through pathway analysis can further establish their contribution to ARDS complex heterogeneity and disease manifested systemic response. The prior research findings are based on pilot study which necessitates streamlining the putative biomarker model in future prospective cohort studies to establish its diagnostic accuracy and potency. Further research into these biomarkers can give better understanding of the pathophysiology, progression and their possible utility and applicability as therapeutic targets. Our study infers NMR based metabolomics as a plausible tool with multidisciplinary applications to provide biomarker model having potent implication in ARDS management with distinct clinical relevance and downstream effect of metabolome in ARDS stages.
Supporting information S1 Text. (Figure A)