Novel breath biomarkers identification for early detection of hepatocellular carcinoma and cirrhosis using ML tools and GCMS

According to WHO 2019, hepatocellular carcinoma (HCC) is the fourth leading cause of cancer death worldwide. More precise diagnostic models are needed to speed up the early diagnosis of HCC and cirrhosis and to improve treatment and survival. Volatile organic compounds (VOCs) in exhaled air can serve as breath biomarkers for rapid, precise, and painless diagnosis. Gas chromatography-mass spectrometry (GCMS) is used to detect the VOCs associated with HCC and cirrhosis. In this investigation, metabolically generated VOCs in breath samples from HCC patients (n = 35), cirrhotic patients (n = 35), and controls (n = 30) were detected via GCMS with solid-phase microextraction (SPME). This study also aims to identify diagnostic VOCs that distinguish HCC from cirrhosis, two closely related liver conditions that are easily confused during diagnosis. However, quantifying VOCs by GCMS is time-consuming and error-prone because it requires an expert. To verify the GCMS data analysis, we present an in-house R-based array of machine learning models that applies deep learning pattern recognition to discover VOCs automatically from raw data, without human intervention. The all-machine-learning diagnostic model offers 80% sensitivity, 90% specificity, and 95% accuracy, with an AUC of 0.9586. Our results demonstrate the validity and utility of GCMS-SPME combined with ML models for early detection of HCC- and cirrhosis-specific VOCs, which are potential diagnostic breath biomarkers, and show differentiation between HCC and cirrhosis. With these insights, handheld e-nose sensors could be built to detect HCC and cirrhosis through breath analysis, and this approach can support diagnosis by reducing integration time and costs without compromising accuracy or consistency.


Introduction
Hepatocellular carcinoma (HCC) is the most common type of liver cancer and the fourth leading cause of cancer death globally. HCC remains a health problem around the world, and a million new cases are expected by 2025 [1]. It accounts for 90% of liver cancer cases, and cirrhosis is the root cause of HCC [2]. Chronic viral hepatitis B or C, alcohol, and toxins such as aflatoxin and pyrrolizidine alkaloids are all linked to HCC and cirrhosis. These risk factors cause oxidative stress (ROS) and alter the liver's molecular mechanisms [3]. The pathophysiology of HCC involves failure of the cytochrome P450 (CYP450) enzyme, which initiates molecular processes that increase the risk of HCC [4]. Ultrasound scans (US), alpha-fetoprotein (AFP) levels, and liver biopsies can be used to diagnose HCC lesions [5]. Data from AFP testing and liver biopsies showed that 30% and 70% of liver cancer patients, respectively, had poor outcomes [6]. Early cancer detection is a very active area of research, and for good reason: it improves patient outcomes and lowers the cost of treatment. However, US, MRI, and AFP screening are unable to diagnose the disease at earlier stages. Moreover, these techniques also fail to differentiate closely related diseases such as liver cirrhosis and HCC [7].
Finding new biomarkers that can detect HCC and other diseases early is important for lowering the mortality rate [8]. HCC is a vascularized tumor, and CYP450 produces volatile organic compounds (VOCs) as a byproduct [9]. VOCs can be as big as DNA or as small as H2 molecules, and most of the time they can be found in breath [10]. VOCs are a new class of biomarkers spanning a wide variety of compounds, and they provide information about whether an individual's body is in a healthy or a diseased state. A non-invasive breath analysis could find new biomarkers (VOCs) linked to the inflammatory pathways in cancer [11].
Disease monitoring using exhaled breath analysis (EBA) has been the subject of much research. VOCs originate both inside the body (endogenous VOCs) and outside it, for example from food (exogenous VOCs), and are commonly found in exhaled breath, sweat, urine, and sometimes blood. Endogenous VOCs are generated by oxidative stress, inflammation, and microbes in the body. The VOCs are then released into the bloodstream, where they diffuse into the lungs and are exhaled [12]. Oxidative stress and inflammation in diseased organs change the VOC content of exhaled breath. Gas chromatography-mass spectrometry combined with solid-phase microextraction (GCMS-SPME) is widely used to analyze these VOCs. GCMS is a gold standard for VOC measurement in breath samples because of its low limits of detection, orthogonal data structure, and molecular structural features [13]. However, GCMS analysis has a few drawbacks: it requires a high level of expertise, it is time-consuming, and it is prone to error.
Machine learning-based algorithms are now also used in combination with metabolomics for accurate early diagnosis of disease through breath biomarkers [14]. Multivariate methods such as principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and random forest mapping are the gold standards in machine learning for VOC profiling and screening. Moreover, artificial intelligence (AI) and machine learning (ML) applications have shown significant promise in recent years for improving healthcare and critical care [15]. The present study used a microextraction technique (GCMS-SPME) to collect VOCs from the breath of HCC and cirrhosis patients. In-house R-based machine learning algorithms were used to validate the GCMS analysis and classify these VOCs accurately. Another goal of this research is to categorize the VOCs of closely related medical conditions such as cirrhosis and HCC correctly, so that incorrect diagnoses can be ruled out at an early stage. Figs 1 and 2 show the schematic and graphical presentation of this study.

Study plan
Breath samples were analyzed by GCMS-SPME [31]. This study compared cirrhotic and HCC patients with healthy people and identified promising VOCs for use in e-nose sensors.

Patients & subjects
This study comprises cirrhotic, HCC, and healthy subjects. Table 1 shows the demographics of the patient groups. A total of 100 breath samples were taken from patients and healthy volunteers, including students, security guards, hospital staff, and workers.
Inclusion criteria. Untreated hepatitis B and C patients were included, and their diseases were confirmed cytologically and histologically. Cirrhosis and HCC patients, drinkers, both smokers and nonsmokers, and fasting individuals were included in this study.
Exclusion criteria. Pregnant women, people under 18, and those with hepatitis A, renal infections, lung disorders, stomach difficulties, heart ailments, or various cancers were excluded.

Ethics statement
Protocol # 121 was approved by the National University of Sciences & Technology (NUST) Institutional Review Board (IRB) committee. Before the sample collection, patients or their

Breath sampling procedures
Patients were given three instructions: 1) inhale deeply; 2) hold for 5 seconds; 3) exhale slowly into a 0.5-liter Tedlar bag and a 20 ml glass tube. All vials were sterilized and flushed with nitrogen.

GCMS for VOCs extraction and analysis
The VOCs were analyzed by GCMS with a few modifications to the protocol [22]. A preconditioned SPME fiber was used to extract the breath VOCs before GCMS measurement. Before being exposed to the gas phase, the SPME fiber (PDMS) was first placed in headspace (HS) mode to allow the analytes to be absorbed. The target VOCs responded most strongly to the 75 μm CAR/PDMS fiber. The SPME fiber was placed in the Tedlar bags for 15-20 minutes to absorb the VOCs. After extraction, the fiber was immediately exposed in the GC injector at 250˚C for 5 minutes to desorb the extracted VOCs. Thermal desorption of the analytes required three minutes. A Shimadzu GCMS-QP 2010 with a 30 m × 0.25 mm internal diameter capillary column coated with a 0.5 μm film of 5% phenyl methyl siloxane was used to measure VOCs. High-quality helium gas C60 (China) flowed at 1.0 ml/min. The oven was held at 40˚C for 2 minutes, then ramped at 5˚C/min to 200˚C (held for 5 minutes), then to 280˚C (held for 1 min). The transfer line was at 280˚C, the manifold at 40˚C, and the trap at 180˚C. Six scans per minute covered 40-400 m/z. A 50 μA emission current and relative electron multiplier mode were used to autotune the method. The maximum ionization time was 25,000 seconds at the rate of 35 m/z.

Data analysis of VOCs
In this study, VOCs were screened by GCMS analysis, and the VOCs of HCC patients, cirrhotic patients, and controls were compared using univariate and multivariate analysis. Significant VOCs were analyzed with unsupervised and supervised machine learning (ML) models, including principal component analysis (PCA), random forest (RF) mapping, and the Gini mean score. These methods were used to reduce the number of variables in the large original data set. To assess model sensitivity, specificity, and diagnostic accuracy, a leave-one-out cross-validation (LOOCV) model was set up in R software. We used mean classification accuracy, sensitivity, specificity, and balanced accuracy (BA) to evaluate the model's prediction performance. All analyses were run in IBM SPSS 20.0, Minitab, and R software.
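The evaluation metrics named above can be illustrated with a short sketch. The study itself used R; the Python function below, its name `diagnostic_metrics`, and the confusion-matrix counts are assumptions chosen for illustration and are not taken from the study's data.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute standard diagnostic metrics from confusion-matrix counts.

    tp/fn: diseased subjects classified correctly/incorrectly
    tn/fp: healthy subjects classified correctly/incorrectly
    """
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts for a 35-patient vs. 30-control comparison.
example = diagnostic_metrics(tp=28, fn=7, tn=27, fp=3)
```

Overall accuracy is a weighted average of sensitivity and specificity, so it always falls between the two, whereas balanced accuracy weights both classes equally regardless of group size.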

Patients demographics
This study includes a total of 70 diseased patients and 30 healthy individuals.

VOCs screening
Screening with SPME-GCMS. The GCMS spectral library was used to identify endogenous and exogenous VOCs in exhaled breath samples from cirrhotic patients, HCC patients, and healthy volunteers. Numerous VOCs were detected in the GCMS chromatograms of HCC and cirrhotic patients, but only a few were significant based on area percentage and prevalence rate. The significant biomarkers in HCC breath samples were phenol, 2,2'-methylenebis[6-(1,1-dimethylethyl)-4-methyl-] (MBMBP) (compound 1), sulfurous acid hexyl octyl ether (compound 2), 1,3-benzene dicarboxylic acid (compound 3), and toluene (compound 4); those in cirrhotic breath samples were 3-hydroxy-2-butanone (compound 5), methane sulfonyl chloride (compound 6), styrene (compound 7), di-n-octyl phthalate (compound 8), and d-limonene (compound 9). The peak locations of these biomarkers by retention time (RT) interval are shown in S1 Table in S1 File. In Fig 3, graphs IA and IC show the d-limonene peak at 10.74 minutes in the cirrhosis group and the standard solution, whereas this peak is absent in the control chromatogram (IB); to verify the peak, a standard solution of d-limonene was run with the same GCMS parameters. Similarly, graphs IIA and IIC show the MBMBP peak at 27.4 minutes in the HCC group and the standard solution, whereas this peak is absent in the GCMS graph of the control individual (IIB). The graph of the MBMBP standard solution again confirms that the MBMBP peak at 27.4 minutes is significant. These results suggest that MBMBP for HCC and d-limonene for cirrhosis are candidate diagnostic breath biomarkers. The other compounds mentioned in this study were also found as clear biomarkers in the GCMS results. However, the individual graph of each biomarker is difficult to show in Fig 3. To corroborate the validity of the data, the results were examined further with machine learning analysis, such as principal component analysis.

Unsupervised machine learning analysis-Principal multivariate analysis
In this research, multivariate statistical analysis and machine learning (ML) methods, both supervised and unsupervised, were used to study HCC and cirrhotic VOCs. After the significant VOCs were identified through statistical analysis, ML models were used to screen the potent VOCs stepwise and obtain the most sensitive VOCs for early diagnosis of HCC and cirrhosis.

Unsupervised machine learning models
Principal Component Analysis (PCA). This study screened thousands of VOCs using the GCMS approach and analyzed them further by PCA, in which variance and dimensionality reduction were used to categorize and reduce the VOCs.

Supervised machine learning models
Gini mean score plot: HCC vs. control. The Gini mean score plot gave more accurate results overall, with minimal errors. In this group, the NPV was 85% and the PPV was 100%. Based on this NPV, the model successfully recognized negative metabolites. The kappa classification accuracy in this group was 86.4%, 800 trees were used, and the confidence level was 95%. The model's p-value was <0.005, demonstrating its significance. The model's error rate of 1.89% is low compared with the other ML analyses, indicating a well-performing model.
Cirrhosis vs. control. The model's best test group was cirrhosis vs. control, with 100% accuracy and precision. This model's PPV and NPV were 85% and 100%, with 93.3% accuracy. The NPV of 100% shows that most cirrhosis VOCs were also identified in other disease groups. VOCs specific to cirrhosis had an 85% effect on the disease group. The model reached 100% for BA, NPV, and kappa classification, with a p-value <0.00005, indicating high significance. The model's error rate was 1.56% and the group's confidence level was 95%, suggesting stability. Fig 6B compares cirrhosis and control compounds: compounds 5, 6, 9, 8, and 10 (9-octadecamide Z) showed a high mean decrease in accuracy in this model.
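For readers less familiar with the quantities reported above, the following sketch shows how PPV, NPV, and Cohen's kappa follow from a 2×2 classification table. It is written in Python rather than the R used in the study, and the function name and counts are hypothetical.

```python
def predictive_values_and_kappa(tp, fp, fn, tn):
    """Return (PPV, NPV, Cohen's kappa) for a 2x2 classification table."""
    n = tp + fp + fn + tn
    ppv = tp / (tp + fp)   # share of positive calls that are truly diseased
    npv = tn / (tn + fn)   # share of negative calls that are truly healthy
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return ppv, npv, kappa


# Hypothetical table: 35 patients (5 missed) and 30 controls (none misclassified).
ppv, npv, kappa = predictive_values_and_kappa(tp=30, fp=0, fn=5, tn=30)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is typically lower than the plain accuracy computed from the same table.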
Test score. LOOCV (leave-one-out cross-validation) is a cross-validation approach in which each observation serves in turn as the validation set. The random forest model's sensitivity and specificity for HCC vs. control were both 80%. For cirrhosis vs. control, the model's sensitivity was 60-80% and its specificity was 100%. The HCC vs. cirrhosis group had 100% specificity, but sensitivity was 60-80%. The overall accuracy of the LOOCV model for the three groups (HCC vs. control, cirrhosis vs. control, and HCC vs. cirrhosis) was 93%, 100%, and 89%, respectively. The performance and validation of the RF model are explained in the S1 Data.
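The LOOCV procedure described above can be sketched as follows. This is a minimal Python illustration, not the study's R code: a simple nearest-centroid classifier stands in for the random forest so that the example stays self-contained, and the feature values in `toy_X` are invented.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x.

    Stand-in classifier (assumption); the study used a random forest.
    """
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, predict=nearest_centroid_predict):
    """Hold out each observation once, train on the rest, and score it."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Invented peak-area features for two VOCs in six subjects.
toy_X = [[1.0, 0.2], [0.9, 0.1], [1.1, 0.3], [0.1, 1.0], [0.2, 0.9], [0.0, 1.1]]
toy_y = ["HCC", "HCC", "HCC", "control", "control", "control"]
toy_accuracy = loocv_accuracy(toy_X, toy_y)
```

Because every subject is held out exactly once, LOOCV uses all available samples for both training and validation, which suits small cohorts such as the one in this study.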

VOCs analysis through GCMS
Melanoma protein and gene alterations promote cancer and induce membrane peroxidation [16]. This disrupts blood chemistry, alters liver function, and causes VOCs to be emitted in breath [17]. VOCs are known for enabling non-invasive, low-cost early disease detection. Metabolomics combines several headspace VOC extraction techniques with statistical analysis to find novel biomarkers in sweat, breath, blood, and urine for diagnosing disease. Exhaled breath from cancer patients showed significant VOCs compared with that of healthy people [18]. Catarina L. Silva et al. (2016) used GCMS to study VOCs in breast cancer and found potential biomarkers [19]. Stavropoulos et al. (2021) identified 1-octene with an aldehyde in lung cancer patients' breath using GCMS [20]. The current study also used GCMS to compare HCC and cirrhosis breath samples with healthy controls. It also examined whether specific breath VOCs observed in liver cirrhosis and HCC patients might be useful for differential diagnosis at early stages. For breath VOC analysis, subjects were classified into HCC, cirrhotic, and healthy groups. The results show that various VOCs are more significant in HCC patients than in cirrhotic patients and healthy individuals.
Changes in VOC amounts are attributed to the liver's lipid metabolism and cellular amino acids. Compound 1, compound 2, and compound 4 peaks were the major VOCs in the HCC GCMS chromatogram. These VOCs have a higher peak area and height than in controls. Compounds 5, 6, 7, 8, and 9 were found in high amounts in cirrhotic patients' breath samples. The above analysis was based on all patients' mean VOC accuracy. The VOCs were chosen based on area percentages and peak heights. A VOC found to be significant in all HCC patients was then compared with the breath samples of all cirrhotic patients and healthy individuals. The VOCs were chosen mainly on two factors: first, the prevalence rate of the specified VOC in the relevant patient group, and second, the comparison of that VOC with the results of all other groups. A Venn diagram program was used to refine all the GCMS results in this study.
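The two selection factors described above, prevalence within a group and comparison against the other groups, amount to a filter followed by a set difference, which is what a Venn diagram displays. The Python sketch below illustrates the idea; the 0.6 prevalence cutoff, the function names, and the per-patient detections are assumptions made for illustration, since the text does not state a numeric threshold.

```python
def prevalent_vocs(samples, min_prevalence):
    """Return VOCs detected in at least min_prevalence of the samples.

    samples: list of sets, one set of detected VOC names per subject.
    """
    counts = {}
    for detected in samples:
        for voc in detected:
            counts[voc] = counts.get(voc, 0) + 1
    n = len(samples)
    return {voc for voc, c in counts.items() if c / n >= min_prevalence}


def group_specific(target, *others):
    """VOCs in the target group that appear in none of the other groups."""
    return target.difference(*others)


# Invented detections for three subjects per group.
hcc = [{"MBMBP", "toluene", "acetone"}, {"MBMBP", "toluene"}, {"MBMBP", "acetone"}]
cirrhosis = [{"d-limonene", "styrene", "acetone"}, {"d-limonene", "acetone"},
             {"d-limonene", "styrene"}]
control = [{"acetone"}, {"acetone", "isoprene"}, {"acetone"}]

cutoff = 0.6  # hypothetical prevalence threshold
hcc_set = prevalent_vocs(hcc, cutoff)
cir_set = prevalent_vocs(cirrhosis, cutoff)
ctl_set = prevalent_vocs(control, cutoff)

hcc_only = group_specific(hcc_set, cir_set, ctl_set)
cirrhosis_only = group_specific(cir_set, hcc_set, ctl_set)
```

In this toy data, acetone is prevalent in every group and therefore drops out, leaving only the group-specific candidates.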
In the present work, numerous statistical analyses and supervised and unsupervised machine learning (ML) models were also applied to the GCMS data to refine the results and identify the potential biomarkers based on classification and clustering. ML models allow the distinction between the cirrhotic and HCC stages to be observed. Both cirrhosis and HCC are liver diseases, and it is plausible that the former is the underlying cause of the latter. It has been noted that most individuals with cirrhosis die in the advanced stages of the disease, long before HCC develops. The two diseases are therefore distinct, and telling them apart is crucial. In this investigation, only a handful of VOCs stood out as significantly differentiating the two disease groups. However, one VOC, methane sulfonyl chloride, is more prevalent in cirrhotic individuals but is seen in only a few HCC patients, so additional research into this biomarker is needed. This study suggests that the emission of this VOC begins in the early stages of cirrhosis and persists through the last stages of HCC. Multivariate analysis is therefore required to investigate and confirm this theory and to provide more reliable results.

Multivariate analysis
Unsupervised machine learning models-PCA analysis. In one study, PCA found seven metabolites with 93.59% sensitivity and 71.62% specificity for breast cancer. Seven biomarkers from PCA predicted DCIS with 80% sensitivity and 100% specificity [21]. In the present study, PCA based on clustering and classification revealed HCC- and cirrhosis-specific biomarkers. In the PCA shown in Fig 4A(I-III), three HCC VOCs, compounds 1, 2, and 4, were found significant, with sensitivity and specificity of 80% and 100%, respectively. Five cirrhosis VOCs, compounds 5, 6, 7, 8, and 9, were found in the PCA with 100% sensitivity and 80% specificity compared with the control group. In all groups, PC1 had a larger variance than PC2 and PC3. In all 3D PCA graphs, VOCs that lie away from the origin line of the graph have a substantial impact on disease progression. Compounds 1 and 4 showed a larger impact on HCC progression in the 3D PCA shown in Fig 4A(III) because they lie away from the origin line, and these compounds are positively correlated with each other. Compounds 4 and 8 in the cirrhosis group were negatively correlated, so when one increased, the other decreased. Compound 4 is positively correlated with VOCs in the same PC group, while VOCs in the opposite direction are negatively correlated. In the 3D graph of cirrhosis vs. control, displayed in Fig 4B(III), compound 4 is positively correlated with compounds 6 and 7 and negatively correlated with compound 8. Thus, the PCA results of the first two patient groups identified crucial biomarkers by reducing data dimensionality, and these were cross-verified in the third group, HCC vs. cirrhosis, as shown in Fig 4C(III).
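The PCA ideas used above (a first component that carries the largest variance, and loadings whose signs indicate positive or negative correlation) can be illustrated with a small sketch. This Python example is not the study's R workflow: it finds only the first principal component by power iteration on the covariance matrix, and the three-column data set `toy_X` is invented.

```python
import math


def center(X):
    """Subtract the column mean from every value."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def covariance(Xc):
    """Sample covariance matrix of already-centered data."""
    n, p = len(Xc), len(Xc[0])
    return [[sum(row[a] * row[b] for row in Xc) / (n - 1) for b in range(p)]
            for a in range(p)]


def first_component(C, iters=500):
    """Leading eigenvector and eigenvalue of C via power iteration."""
    p = len(C)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(C[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return v, eigenvalue


# Invented peak areas: columns 1 and 2 rise together, column 3 barely varies.
toy_X = [[1, 2, 0.5], [2, 4.1, 0.4], [3, 5.9, 0.6], [4, 8.2, 0.5], [5, 9.9, 0.5]]
C = covariance(center(toy_X))
loadings, eigenvalue = first_component(C)
# Share of total variance captured by PC1 (total variance = trace of C).
ratio = eigenvalue / sum(C[i][i] for i in range(len(C)))
```

Loadings with the same sign on a component correspond to variables that increase together, which is the sense in which two compounds are described as positively correlated in a PCA plot.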
Hybrid heat maps. In recent studies, significant VOCs of diseases such as cancer, Crohn's disease, ulcerative colitis, and preeclampsia were screened based on heat map analysis [22,23]. In one of these studies, the significant VOCs of Crohn's disease, ulcerative colitis, and preeclampsia were similar [23]. In the current study, the hybrid heat map results depend on the sensitivity and specificity of the VOCs for their respective disease groups, which are based on variance and interrelation. More analysis with supervised machine learning models is needed to refine the results and identify the most significant VOCs.

Supervised machine learning models
Random forest mapping & Gini mean score plot. RF mapping is a sensitive supervised machine learning approach. Hershberger et al. (2021) evaluated breath VOCs using a random forest model. In that study, disease VOCs combined with age and sex helped classify disease conditions more precisely. The RF model in that research predicted acetoin and di-n-octyl phthalate with 85% accuracy, 80% sensitivity, and 95% specificity [24]. In this investigation, health conditions were reflected in the significant VOCs found in the patients' breath. This demonstrates that VOC levels vary between the breath samples of HCC patients, cirrhotic patients, and controls. The sensitivity and specificity of exhaled VOCs in RF models were used to predict HCC disease status. The VOCs at the top of Fig 6A-6C had a substantial impact on the disease, as they had a high mean decrease in accuracy. Overall, the RF results are more accurate than the results of the other ML models. The rise in accuracy with RF suggests that this method evaluates VOCs and their relevance more accurately.
The Gini mean score plot was also used to screen significant VOCs based on mean accuracy [25]. RF mapping methods were used to categorize VOCs and find the most important compounds. Only one or two compounds with a high mean decrease in accuracy are considered diagnostic VOCs for each disease. Compounds 1, 2, and 4 in the HCC vs. control group showed a higher mean decrease in accuracy, indicating that these VOCs are significant for diagnosing HCC, as shown in Fig 6A. Compound 1 had a higher mean score than compounds 2 and 4, so it was chosen as the diagnostic compound for HCC.
The Gini mean score plot was also used to screen five significant VOCs from the cirrhotic vs. control group. Compound 9 had the highest mean decrease in accuracy among the VOCs in the cirrhotic group (Fig 6B), so it is considered the diagnostic compound for cirrhotic patients. Although the benefits of random forests in this context have been noted in prior sections, it is important to remember that the RF model may find correlations and high sensitivity between VOCs. Similarly, Fig 6C showed the same diagnostic VOCs for the HCC and cirrhosis groups, compound 1 and compound 9, respectively.
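As background for the Gini-based ranking discussed above, the sketch below shows the quantity a random forest accumulates for each variable: the drop in Gini impurity produced by splitting on it. It is a single-split Python illustration with invented values, not the study's R model, which averages such decreases over many trees.

```python
def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(values, labels, threshold):
    """Impurity reduction from splitting one VOC's values at a threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


labels = ["HCC", "HCC", "HCC", "control", "control", "control"]
informative_voc = [9.0, 8.5, 9.2, 1.0, 1.3, 0.8]    # separates the groups
uninformative_voc = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0]  # unrelated to the groups

informative_gain = gini_decrease(informative_voc, labels, threshold=5.0)
uninformative_gain = gini_decrease(uninformative_voc, labels, threshold=3.0)
```

A VOC that separates the groups cleanly yields a large decrease, so it ranks near the top of a Gini plot, while an unrelated VOC yields a decrease close to zero.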
ROC curves. Zhang et al. (2020) employed ROC curves to identify breast cancer and DCIS. They found that breath biomarkers had 93.59% sensitivity and 71.62% specificity. ROC curves and the AUC describe a test's discriminative power. The higher the AUC and the closer the curve lies to the upper-left corner, the better the test discriminates diseased from healthy subjects. The AUC ranges from 0 to 1: a perfect diagnostic test has AUC = 1.0, whereas a nondiscriminatory test has AUC = 0.5 [26,27].
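The AUC values described above can be computed directly from classifier scores: the AUC equals the probability that a randomly chosen diseased subject receives a higher score than a randomly chosen healthy one, with ties counted as one half. The Python sketch below uses invented scores and a hypothetical function name.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (diseased, healthy) pairs ranked correctly."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))


# Invented model scores for three patients and three controls.
toy_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
perfect_auc = auc([0.9, 0.8], [0.2, 0.1])
chance_auc = auc([0.5, 0.5], [0.5, 0.5])
```

The pairwise definition gives the same value as the area under the plotted ROC curve, and it makes the two reference points in the text concrete: 1.0 when every patient outranks every control, and 0.5 when the scores carry no information.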
The metabolization of VOCs in the human liver. In healthy people, the P450 enzyme converts saturated hydrocarbons to alcohols. Liver injury reduces this enzyme in the body and disrupts exhaled VOCs. Oxidative stress (ROS) and lipid peroxidation are two mechanisms of free radical-driven VOC emission in the breath. Chronic hepatic conditions reduce CYP450, liver functional units, portal blood flow, and liver extraction capacity [28,29]. Although the molecular mechanism is uncertain [30], interleukin 6 (IL-6) could be key.
Our study supports the concept in the literature that differences between the VOCs of healthy controls and liver disease patients are linked to altered activity of CYP450, the CYP2E1 enzyme, and the CYP2B6 isoform caused by liver injury.
Sulfur transamination in liver. Incomplete transamination of sulfur-containing amino acids in the liver through the P450 enzyme produces sulfur-containing molecules. Sulfur compounds are associated with liver damage and give cirrhotic patients a unique odor [31]. In healthy people, the P450 enzyme converts saturated hydrocarbons to alcohols; in liver damage, this enzyme is reduced, which disturbs the amount of VOCs in the breath. Oxidative stress and lipid peroxidation are the two overall free radical-driven mechanisms of VOC emission, as shown in Fig 7. The sulfur-containing breath biomarkers in this study are methane sulfonyl chloride (compound 6) and sulfurous acid hexyl octyl ether (compound 2), which were elevated in cirrhotic and HCC patients, respectively.
The metabolization of ketonic compounds in liver. The increase in compound 5 and compound 10 in liver disorders has been attributed to insulin resistance. During lipolysis, unsaturated fats, triglycerides, and ketones are released [32]. As depicted in Fig 7, the ketonic molecule (3-hydroxy-2-butanone, also known as acetoin; compound 5) is elevated when the liver enzyme CYP2E1 is inhibited. In the current study, the p-values for these compounds are below 0.05 (p<0.05), showing a substantial increase in compounds 10 and 5 (fatty acids and ketones).
The metabolization of D-limonene (compound 9) in liver. According to the results of this research, cirrhotic patients had a higher level of limonene in their breath than healthy individuals and HCC patients. Recently, Fernandez Del Rio et al. (2017 & 2020) found that restoring liver function with a liver transplant reduces compound 9 levels in the breath and returns them to normal [14,33], and one study has stated that the compound 9 level is elevated in the breath of cirrhotic patients because of CYP450 enzyme inactivity, as seen in Fig 7. The results of this study support the theory that differences in CYP450 enzyme activity between healthy and cirrhotic individuals increase the level of compound 9 in the exhaled breath of cirrhotic patients. Certain studies [34] examined VOCs in the breath of patients with liver disease, and certain VOCs have been proposed as biomarkers for cirrhosis. Sulfur compounds are associated with liver disease and are responsible for the distinct odor of patients' breath known as foetor hepaticus. Limonene, methanol, and 2-pentanone have been identified as biomarkers of liver cirrhosis. These VOCs not only discriminated cases from controls but also differed significantly in a subset of transplant patients before and after liver transplantation. Limonene is converted in the liver by the P450 enzymes CYP2C9 and CYP2C19 to the metabolites perillyl alcohol, trans-carveol, and trans-isopiperitenol. Cirrhosis patients have decreased levels of the enzyme CYP2C19, and these levels are inversely correlated with cirrhosis severity.
The metabolization of styrene (compound 7) in liver. Exogenous sources of styrene include plastics, cigarette smoke, exhaust fumes, and food [30]. As depicted in Fig 7, CYP2E1 is the primary enzyme responsible for styrene metabolism in humans. In this study, compound 7 levels are much greater in cirrhosis patients than in healthy individuals. In addition, the statistical results confirmed that the amount of compound 7 is elevated in smokers with cirrhosis. According to the statistical data, the p-value of compound 7 in cirrhotic patients is less than 0.0001 (p<0.0001), which is consistent with CYP2E1 being dysfunctional in liver disease.
The metabolization of phenolic compounds in liver. Compound 1 is a prominent HCC breath biomarker belonging to the phenolic family. The breakdown of tyrosine and tryptophan in the liver releases phenol and indole, which subsequently bind to albumin in the blood [35]. In liver disease such as HCC, the loss of albumin means that indole and phenol lose their binding ability and cannot be broken down. Individuals with reduced liver function therefore have larger quantities of free tyrosine and tryptophan. As a result, molecules such as indole and phenol are raised in the blood of these individuals, which shows that these biomarkers are mediators of liver damage. According to a recent study [36], oxidation of the MBMBP or MOCA compound is carried out by the liver enzyme P450. That study carried out several experiments on rat livers and found that liver microsomes were important in the oxidation of MBMBP. In HCC, inactivity of the cytochrome P450 enzyme inhibited proper oxidation of MBMBP, resulting in a high concentration of MBMBP. The compound 4,4'-methylenebis(2-chloroaniline), known as MOCA, is given as the alternate name for MBMBP [37]. The elevated level of compound 1 in this study thus supports the notion that albumin is reduced in the blood of HCC patients. The p-value of compound 1 in the statistical results of this investigation is <0.0001, indicating its high significance.
The metabolization of toluene (compound 4). Compound 4 is metabolized in the liver by the P450 enzyme system, which is essential for the biotransformation of many endogenous and foreign chemicals [38]; its CYP2E1 isoform degrades both endogenous and exogenous substrates. Compound 4 levels in the breath of HCC patients and healthy individuals differed significantly in this study. Higher levels of compound 4 in patients than in healthy controls indicate abnormal activity of the liver enzyme CYP2E1.
Effect of confounding factors on the liver. Confounding factors significantly affect the progression and severity of liver cancer. The main confounding factors are smoking, age, gender, hepatitis B and C, and occasionally other environmental factors [39,40]. In this study, we also examined the impact of smoking, age, gender, and hepatitis B and C on the course of liver disease. Table 2 shows that the p-values for smoking, hepatitis B, and hepatitis C in HCC and cirrhotic patients are less than 0.05 (p<0.05). The statistical analysis of drinking, age, and sex revealed p-values greater than 0.05, indicating that these confounding factors are less significant.

Fig 3. Comparison of GCMS chromatograms of cirrhosis patients, controls, and standard solution (IA, IB & IC) and of HCC patients, controls, and standard solution (IIA, IIB & IIC). The D-limonene peak was visualized at a retention time of 10.74 minutes, which indicates the authenticity of the results. However, this peak was absent in the control chromatogram.

Fig 6. Forest model: comparison of the Gini mean decrease for (A) HCC vs. control, (B) cirrhosis vs. control, (C) HCC vs. cirrhosis. https://doi.org/10.1371/journal.pone.0287465.g006
Fig 4B(I-III). PCA results in HCC and cirrhotic patients show high sensitivity and unique clustering of VOCs. PCA clustering classified both groups' VOCs in Fig 4C(I-III), whereas the 3-D PCA plot shows the exact dimensions and angles of each breath biomarker for each PC component.
Fig 7 depicts the flow chart of VOC metabolization in the human liver, and S1 Table in S1 File details the statistical analysis of HCC and cirrhotic VOCs with the concentrations and p-values found in this study.