Abstract
Background
Obesity, smoking, and lipid imbalances are well-established predictors of all-cause mortality, but conventional definitions of these factors commonly lead to misinterpretation in observational studies. This study aims to evaluate three key refinements to the definitions of obesity, smoking exposure, and lipid profiles in the context of all-cause mortality using two large-scale Korean cohorts.
Methods
This retrospective cohort study analyzed 659,494 participants from the Korean National Health Insurance Service-National Sample Cohort (NHIS-NSC), with external validation in 10,477 participants from the Korean National Health and Nutrition Examination Survey (KNHANES), both linked to mortality records. Obesity was classified by BMI and abdominal obesity criteria, smoking exposure was assessed using the pack-year to age ratio, and lipid abnormalities were measured using composite lipid ratios (total cholesterol/HDL≥5.0, triglyceride/HDL > 6.0, LDL/HDL≥5.0). Cox proportional hazards models were used for primary analyses, with time-dependent Cox models as sensitivity analyses.
Results
In primary analyses using general Cox models, underweight individuals showed significantly elevated mortality risk across all age groups. In those younger than 60 years, underweight without abdominal obesity carried a markedly elevated risk (adjusted hazard ratio [AHR] = 2.42, 95% confidence interval [CI]: 2.16–2.71), whereas the estimate for underweight with abdominal obesity was imprecise (AHR = 1.36, 95% CI: 0.19–9.67). In those aged 60 years or older, being underweight without abdominal obesity was the strongest predictor (AHR = 1.79, 95% CI: 1.67–1.91). A pack-year to age ratio ≥1 was significantly associated with an increased risk of mortality (AHR = 1.65, 95% CI: 1.51–1.81). Individuals with one or more high-risk lipid profiles had an increased risk of death (AHR = 1.04, 95% CI: 1.01–1.07). Sensitivity analyses using time-dependent Cox models showed directionally consistent patterns with the primary analyses, and the findings were validated in an independent cohort (KNHANES).
Conclusion
Refining obesity, smoking exposure, and lipid profile definitions led to more interpretable and clinically consistent mortality risk estimates, avoiding paradoxical findings common in observational studies. Future research should explore whether these refined metrics improve predictive accuracy compared to conventional definitions and validate their applicability in diverse populations.
Citation: Lee B, Im S, Won S (2026) Refined obesity, smoking exposure, and lipid metrics in mortality risk assessment: a nationwide cohort analysis. PLoS One 21(6): e0348128. https://doi.org/10.1371/journal.pone.0348128
Editor: Ozra Tabatabaei-Malazy, Tehran University of Medical Sciences, IRAN, ISLAMIC REPUBLIC OF
Received: July 24, 2025; Accepted: April 9, 2026; Published: June 24, 2026
Copyright: © 2026 Lee et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data underlying this study are third-party data owned and maintained by two public institutions in the Republic of Korea and were not collected or owned by the authors. The authors had no special access privileges that other researchers would not have. All interested researchers can request access to the same datasets through the standard application procedures described below. (1) Korean National Health Insurance Service – National Sample Cohort (NHIS-NSC 2.2) is maintained by the National Health Insurance Service (NHIS), Republic of Korea. Researchers may submit a data request through the NHIS National Health Information Data Request System (https://nhiss.nhis.or.kr) or by contacting bigdata@nhis.or.kr. Applicants must submit a research proposal, which is reviewed by the NHIS Data Provision Deliberation Committee. Approved researchers may access fully anonymized data only within the NHIS-controlled secure analytic environment. (2) Korea National Health and Nutrition Examination Survey (KNHANES), linked to cause-of-death records, is maintained by the Korea Disease Control and Prevention Agency (KDCA), Republic of Korea. Researchers may submit a data request through the KNHANES Data Request System (https://knhanes.kdca.go.kr) or by contacting knhanes@korea.kr. Applicants must submit a research proposal and a data use agreement. Approved researchers may access fully anonymized data within the KDCA-controlled secure analytic environment for the period granted by the committee. The authors confirm that any qualified researcher would be able to access these datasets in the same manner as the authors did, following the publicly documented application procedures of NHIS and KDCA. The datasets are de-identified by the respective data custodians prior to release, and analyses must be performed within the institutions’ secure servers. 
Because of these governance rules imposed by NHIS and KDCA, the authors are not permitted to redistribute the individual-level data or to upload it to a public repository.
Funding: This work was supported by Global-Learning & Academic research institution for Master’s, PhD students, and Postdocs (LAMP) Program of NRF funded by Ministry of Education (No. RS-2023-00285353 to BL) and Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2025-25441324 to SW, Development of AI-Enabled, Phenome Data-Driven Precision Health Prediction and Management Solutions for Customized Life Coaching of Healthy Adults).
Competing interests: The authors declare that this study was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.
Introduction
Obesity, smoking, and dyslipidemia are well-established risk factors for all-cause mortality. However, the conventional variables used to assess these factors can introduce biases, leading to misinterpreted or counterintuitive results, especially in large-scale observational studies. For instance, relying solely on body mass index (BMI) to define obesity, categorizing smoking status only as never/former/current, or using total cholesterol (TC) alone for dyslipidemia may overlook critical nuances. Non-linear associations, unmeasured confounders, and methodological constraints such as multicollinearity further complicate these analyses, while overly broad variable definitions can impede accurate modeling and leave residual bias [1–5]. A well-known illustration is the “obesity paradox”, in which higher BMI paradoxically appears associated with lower mortality risk [6,7]. This counterintuitive finding may stem from BMI’s inability to distinguish between lean mass and fat mass, as it does not differentiate among normal-weight individuals with visceral fat, metabolically healthy obese individuals, and those with sarcopenic obesity. Similarly, simplified smoking-status categories can produce misleading conclusions; for example, some studies report higher mortality among former smokers than current smokers, potentially due to survivor bias, selective cessation in high-risk individuals, and residual effects of cumulative smoking exposure, even after cessation [8,9]. Moreover, although many studies examine multiple lipid markers such as triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C), analyzing these correlated markers simultaneously often introduces multicollinearity, complicating the interpretation of their individual effects on mortality [10,11].
A large-scale study by Ahn et al. (2017) exemplifies these methodological challenges, investigating predictors of all-cause mortality within the Korean National Health Screening Cohort (NHIS-HEALS) [6]. Although it offered valuable epidemiological insights, the study highlighted several measurement limitations. Specifically, obesity was assessed only by BMI, TC was the sole lipid measure, and smoking exposure was captured via multiple overlapping measures such as smoking status, total pack-years, smoking amount, and smoking duration, leading to potential redundancies. These observations underscore the need for more refined, integrated approaches that can better address the complex interactions among obesity, smoking, and dyslipidemia when evaluating mortality risk.
In the present study, we address these limitations by introducing three key refinements in measuring obesity, smoking, and dyslipidemia within population-based cohort studies. First, rather than relying solely on BMI, we incorporate abdominal obesity to provide a clearer indication of body fat distribution and its age-related effects on mortality [12,13]. Since BMI alone cannot distinguish between lean and fat mass, this additional measure allows for a more precise estimation of adiposity-related risk. Second, instead of simple smoking status or total pack-years alone, we employ the pack-year to age ratio to quantify smoking exposure more accurately. This refined metric accounts for cumulative smoking burden relative to age and adjusts for potential survivor bias [14]. Finally, recognizing the strong interrelationship among lipid parameters, we explore composite lipid ratios (TC/HDL-c, TG/HDL-c, LDL-c/HDL-c) as alternative predictors of mortality. These ratios may serve as more robust indicators of cardiovascular and metabolic risk compared to individual lipid markers or total cholesterol alone [4,15–17].
Using a large-scale nationwide cohort with over 12 years of follow-up, we systematically examine how these enhanced metrics (abdominal obesity, the pack-year to age ratio, and composite lipid ratios) relate to all-cause mortality. To ensure the robustness and generalizability of our findings, we subsequently validate the results using an external dataset. The specific aims are to:
- Determine how general and abdominal obesity affect mortality across different age groups
- Assess whether the pack-year to age ratio provides a more interpretable result compared to traditional smoking metrics
- Evaluate the role of composite lipid ratios in mortality risk, despite simultaneous inclusion of all four lipid indices
- Validate these findings in an external dataset to ensure generalizability.
Methods
Data source
This study utilized data from the NHIS-National Sample Cohort (NHIS-NSC 2.2), a large-scale longitudinal retrospective cohort. As of 2022, 97.1% of Korea’s 51.4 million citizens were covered by the National Health Insurance System (NHIS), and NHIS-NSC 2.2 includes 1,137,861 individuals, which accounts for 2% of the total Korean population, sampled based on sex, age, insurance type, income level, and region [18]. NHIS-NSC 2.2 covers health records from 2002 to 2021, providing information on demographics, lifestyle factors, clinical measurements, and health examination results. For individuals with chronic diseases, the dataset includes detailed medical history, such as diagnosis dates, treatments, and disease progression.
For external validation, we used data from the Korean National Health and Nutrition Examination Survey (KNHANES), which has been conducted since 1998. KNHANES provides nationally representative data on the health and nutritional status of the Korean population, supporting public health policies [19]. This dataset includes both health examinations and self-reported surveys. Additionally, we incorporated an external validation dataset by linking KNHANES data with cause-of-death records for participants aged ≥19 years from 2007 to 2018.
To ensure privacy and confidentiality, both datasets were anonymized by the NHIS and KNHANES authorities prior to researcher access, and all the analyses were conducted within secure, restricted servers. For this study, access to the NHIS data server was granted from September 14, 2022, for a duration of six months, and to the KNHANES data server from April 15, 2024, for a period of one week.
This study was approved by the Public Institutional Review Board designated by the Ministry of Health and Welfare, Korea (Approval No.: IRB-P01-202107-21-006 for NHIS and IRB-P01-202303--01–005 for KNHANES) and conducted in accordance with relevant ethical standards and regulations. Given the retrospective nature of the study and the use of fully anonymized secondary data from established databases (NHIS-NSC and KNHANES), the requirement for informed consent was waived by the IRB.
Study population
Significant differences were identified in health examination items before and after 2009. Accordingly, this study focused on 700,851 individuals who underwent the NHIS Medical and Health Examination between 2009 and 2019. We excluded subjects who (1) had health examination records within one year before and after pregnancy, (2) did not have a health examination record after turning 20 years old, (3) died from accidental or unintentional injuries, congenital disorders, or perinatal conditions [20,21], or (4) had missing covariate values. After these exclusions, 659,494 individuals remained for analysis and were followed from baseline examination (2009–2019) until death or December 31, 2021, contributing a mean follow-up of 9.5 years (range: 0.1–12.9 years; median [interquartile range, IQR]: 9.2 years [6.8–11.3 years]).
For external validation, we used KNHANES data linked to mortality records, which included 12,081 participants from 2010 and 2011. After applying the same exclusion criteria, 10,477 individuals were followed until death or the end of mortality linkage, contributing a mean follow-up of 7.4 years (range: 0.1–8.0 years; median [IQR]: 7.3 years [6.9–7.9 years]). The flow chart is presented in S1 Fig.
Obesity classification
Individuals were classified based on both general obesity, measured by body mass index (BMI), and abdominal obesity, measured by waist circumference, to provide a comprehensive assessment of body fat distribution and its age-dependent impact on mortality [22]. This combined classification was pre-specified based on evidence that BMI alone cannot adequately distinguish metabolic risk, as it does not differentiate between lean mass and fat mass or capture visceral adiposity [12,13].
General obesity was defined using BMI thresholds recommended by the World Health Organization (WHO) for the Asian population [23]: underweight (BMI < 18.5 kg/m²), normal weight (18.5 ≤ BMI < 23.0 kg/m²), overweight (23.0 ≤ BMI < 25.0 kg/m²), or obese (BMI ≥ 25.0 kg/m²). For detailed analysis, we further stratified the obese category into subcategories (25.0–27.5, 27.5–30.0, 30.0–35.0, ≥35.0 kg/m²) to examine dose-response relationships.
Abdominal obesity was defined as waist circumference ≥ 90 cm in men and ≥ 85 cm in women, based on Korean-specific cutoff values [24]. By cross-classifying BMI with abdominal obesity status, we created combined phenotypes that distinguish individuals with different patterns of fat distribution, thereby addressing limitations of BMI-only classifications that can produce paradoxical findings in mortality studies [6,7].
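The cross-classification described above can be sketched in Python as follows. This is a minimal illustration and not the authors' analysis code: the function names (`bmi_category`, `abdominal_obesity`, `obesity_phenotype`) and the string labels are hypothetical, and only the four main BMI categories and the sex-specific waist cutoffs stated in the text are encoded.

```python
def bmi_category(bmi: float) -> str:
    """Main BMI categories for the Asian population as stated in the text (kg/m^2)."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 23.0:
        return "normal"
    if bmi < 25.0:
        return "overweight"
    return "obese"


def abdominal_obesity(waist_cm: float, sex: str) -> bool:
    """Korean-specific cutoffs from the text: >= 90 cm in men, >= 85 cm in women."""
    if sex not in ("male", "female"):  # hypothetical coding of sex
        raise ValueError("sex must be 'male' or 'female'")
    cutoff = 90.0 if sex == "male" else 85.0
    return waist_cm >= cutoff


def obesity_phenotype(bmi: float, waist_cm: float, sex: str) -> tuple:
    """Combined phenotype: (BMI category, abdominal obesity flag)."""
    return (bmi_category(bmi), abdominal_obesity(waist_cm, sex))
```

For example, `obesity_phenotype(22.0, 92.0, "male")` yields a normal-BMI phenotype with abdominal obesity, the kind of group a BMI-only classification would not separate out.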
Smoking exposure
We hypothesized that the effect of smoking exposure on mortality accumulates over an individual’s lifespan [25]. Consequently, we employed the pack-year to age ratio – a metric that quantifies cumulative smoking exposure relative to age – to reduce potential survivor bias [26]. The pack-year to age ratio was calculated as follows:
\[ \text{pack-year to age ratio} = \frac{\text{total pack-years}}{\text{age (years)}} \]
where total pack-years = (number of cigarette packs smoked per day) × (number of years smoked).
Based on this metric, participants were classified into three categories: non-smokers (0 pack-year), low exposure (pack-year to age ratio <1.0), and high exposure (pack-year to age ratio ≥1.0). By incorporating age as a denominator, the pack-year to age ratio more accurately reflects the long-term effects of smoking relative to an individual’s lifespan, even after cessation [27,28]. A pack-year to age ratio ≥1.0 indicates that an individual has, on average, smoked at least one pack per day throughout their entire lifetime, representing particularly intensive smoking exposure. For instance, a ratio of 1.0 could correspond to smoking one pack per day for 40 years in a 40-year-old or smoking two packs per day for 30 years in a 60-year-old. This approach also helps account for the fact that the same absolute smoking burden (e.g., 30 pack-years) represents a substantially different proportion of lifetime exposure for a 40-year-old versus a 60-year-old, and may better reflect the biological impact of smoking intensity relative to age.
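The ratio and the three exposure categories can be illustrated with a short Python sketch. The function names and category labels are hypothetical, and the inputs are assumed to be packs per day, years smoked, and age in years; the cohort variables themselves are not shown in the text.

```python
def pack_year_age_ratio(packs_per_day: float, years_smoked: float, age_years: float) -> float:
    """Total pack-years divided by age, following the definition in the text."""
    if age_years <= 0:
        raise ValueError("age_years must be positive")
    if packs_per_day < 0 or years_smoked < 0:
        raise ValueError("smoking inputs must be non-negative")
    pack_years = packs_per_day * years_smoked
    return pack_years / age_years


def smoking_exposure(packs_per_day: float, years_smoked: float, age_years: float) -> str:
    """Three categories from the text: 0 pack-years, ratio < 1.0, ratio >= 1.0."""
    ratio = pack_year_age_ratio(packs_per_day, years_smoked, age_years)
    if ratio == 0:
        return "non-smoker"
    return "high" if ratio >= 1.0 else "low"
```

Under these assumptions, the same 30 pack-years gives a ratio of 0.75 at age 40 and 0.5 at age 60, which is the age dependence the metric is meant to capture.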
Lipid profile
Due to the high correlation among individual lipid markers (TC, LDL, HDL, TG), simultaneous inclusion of these variables in statistical models can result in conflicting interpretations. Therefore, we employed composite lipid ratios to provide clearer and clinically relevant cardiovascular and metabolic risk assessments.
Composite lipid ratios have been shown to be stronger predictors of cardiovascular disease and mortality than individual lipid parameters alone [17,28–29]. Specifically, we defined high-risk lipid profiles using the following established thresholds: TC/HDL ≥ 5.0, LDL/HDL ≥ 5.0, and TG/HDL > 6.0.
A high-risk lipid profile was defined as meeting at least one of these three criteria [30]. This approach alleviates multicollinearity, a common challenge when analyzing multiple lipid markers simultaneously, thereby improving the stability and interpretability of the statistical models [11].
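As a minimal sketch of this definition, the flag below returns true when at least one of the three ratio criteria is met. The function name is hypothetical, and the sketch assumes all four lipid values are supplied in the same units (mg/dL is assumed here; the text does not state the units used for the ratios).

```python
def high_risk_lipid_profile(tc: float, hdl: float, ldl: float, tg: float) -> bool:
    """At least one of: TC/HDL >= 5.0, LDL/HDL >= 5.0, TG/HDL > 6.0."""
    if hdl <= 0:
        raise ValueError("hdl must be positive")
    return (tc / hdl >= 5.0) or (ldl / hdl >= 5.0) or (tg / hdl > 6.0)
```

Note that the TG/HDL criterion is a strict inequality, so a ratio of exactly 6.0 does not qualify, whereas a TC/HDL or LDL/HDL ratio of exactly 5.0 does.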
Outcome
The primary outcome was all-cause mortality, which was identified by linkage with the Korean National Death Registry [31]. Mortality data were recorded using ICD-10 codes, and follow-up time was measured from the baseline health examination to death or the end of the study period (December 31, 2021).
Covariates
Blood test parameters included fasting blood sugar (FBS), hemoglobin (Hgb), estimated glomerular filtration rate (eGFR), aspartate aminotransferase (AST), and gamma-glutamyl transpeptidase (γ-GTP). FBS levels were categorized into five groups following the American Diabetes Association guidelines: low (<50 mg/dL), healthy (55–99 mg/dL), prediabetes (100–125 mg/dL), diabetes (126–199 mg/dL), and hyperglycemic crisis (≥200 mg/dL) [32]. Hemoglobin, eGFR, AST, and γ-GTP were categorized based on clinical reference ranges [33].
Urine protein was assessed using dipstick tests, with results classified as proteinuria or albuminuria (≥2+) [34].
History of hypertension, diabetes mellitus, heart disease, stroke, and cancer was determined using self-reported data and NHIS-NSC diagnostic records, classified according to ICD-10 codes (e.g., diabetes: E10-E14, hypertension: I10-I15). Due to the low frequency of individual diseases, any history of these conditions was considered a risk factor for mortality. Alcohol consumption was categorized into non-heavy drinkers without disease, heavy drinkers without disease, and those with at least one disease or classified as heavy drinkers [35].
Physical activity was categorized into two groups: no participation vs. engaging in exercise at least once per week. This binary classification was adopted because detailed information on exercise intensity and duration varied across survey years and between NHIS-NSC and KNHANES, precluding reliable granular classification. Weekly participation in any exercise represents a meaningful threshold and ensures comparability across the entire study period (2009–2021) and between cohorts.
Statistical analysis
Distributions of categorical variables were presented as frequencies and percentages. Time-to-death for each participant was summed as person-years to compute the incidence rates (IRs) with 95% confidence intervals (CIs) [36].
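An incidence rate per 1,000 person-years can be computed as in the sketch below. The interval method is an assumption: the text does not specify how the 95% CIs were obtained, so this example uses the common log-scale normal approximation for a Poisson count, and the function name is hypothetical.

```python
import math


def incidence_rate_per_1000(deaths: int, person_years: float, z: float = 1.96):
    """Return (rate, lower, upper) per 1,000 person-years.

    Assumption: CI from the log-scale normal approximation,
    rate * exp(+/- z / sqrt(deaths)); the source's method may differ.
    """
    if deaths <= 0 or person_years <= 0:
        raise ValueError("deaths and person_years must be positive")
    rate = deaths / person_years * 1000.0
    factor = math.exp(z / math.sqrt(deaths))
    return rate, rate / factor, rate * factor
```

With this approximation the interval is symmetric on the log scale, so the point estimate is the geometric mean of the two limits.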
Cox proportional hazards (PH) models were applied to estimate the association between risk factors and all-cause mortality [37]. Adjusted hazard ratios (AHRs) with 95% CIs were calculated after adjusting for age, sex, obesity status, history of cancer, smoking pack-year/age ratio, alcohol drinking, physical activity, blood pressure, fasting blood glucose, hemoglobin, AST, γ-GTP, eGFR, urine protein, and lipid ratios. To assess the robustness of our findings and account for potential time-varying confounding, we additionally fitted a time-dependent Cox PH model. Time-varying confounding occurs when covariates such as obesity status, lipid profiles, and other health indicators change during the follow-up period, potentially influencing mortality risk differently at various time points. This time-varying model provides a sensitivity analysis to validate the primary findings from the general Cox models.
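For reference, the standard textbook form of the two models can be written as follows. The notation (baseline hazard h_0, coefficient vector β, covariate vector x) is generic and is not taken from the authors' analysis code.

```latex
% Cox proportional hazards model with baseline (time-fixed) covariates x
\[
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}\right),
\qquad
\mathrm{AHR}_j = \exp(\beta_j),
\qquad
95\%\ \mathrm{CI}: \exp\!\left(\hat{\beta}_j \pm 1.96\,\mathrm{SE}(\hat{\beta}_j)\right)
\]
% Time-dependent extension: covariates are allowed to change during follow-up
\[
h\bigl(t \mid \mathbf{x}(t)\bigr) = h_0(t)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}(t)\right)
\]
```

In the first form each AHR is constant over follow-up, which is the proportional hazards assumption; in the second, the covariate values are updated over time while the coefficients remain fixed.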
The proportional hazards assumption for Cox models was not formally tested using diagnostic methods such as Schoenfeld residuals due to computational constraints with the large dataset. However, we employed time-dependent Cox models as a sensitivity analysis to assess whether covariate effects varied over time. The consistency of results between general and time-dependent models provides indirect evidence supporting the validity of the proportional hazards assumption for our primary analyses.
Statistical analyses were conducted at a two-sided significance level of 0.05 using SAS (version 9.4; SAS Institute Inc., Seoul, Korea), Rex (version 3.6.0.2), and R statistical software (version 4.2.3; R Foundation for Statistical Computing, Vienna, Austria) [38–40].
Results
Baseline characteristics
In the NHIS-NSC and KNHANES cohorts, 47.83% and 39.03% of participants, respectively, were aged 40 to <60 years, while 49.94% in NHIS-NSC and 45.65% in KNHANES were male (Table 1). Regarding general and abdominal obesity, most participants had a BMI between 18.5 and <27.5 without abdominal obesity (72.53% in NHIS-NSC; 68.03% in KNHANES), whereas a smaller proportion (10.19% in NHIS-NSC; 15.84% in KNHANES) had a BMI in the same range but with abdominal obesity.
The prevalence of self-reported cancer history was very low (0.52% in NHIS-NSC; 0.51% in KNHANES). The proportion of participants decreased as the smoking pack-year to age ratio increased, but heavy smokers (ratio ≥1) comprised 12.92% of NHIS-NSC, notably higher than the 0.91% in KNHANES.
Lipid profile abnormalities were more prevalent in KNHANES than in NHIS-NSC. The proportion of individuals with TC/HDL ≥ 5.0 was higher in KNHANES (19.45%) than in NHIS-NSC (11.66%), and TG/HDL > 6.0 was also more frequent in KNHANES (14.86% vs. 10.82%). Overall, the prevalence of individuals with at least one abnormal lipid ratio was higher in KNHANES (24.77%) than in NHIS-NSC (16.93%).
Despite these differences in smoking pack-year to age ratio and lipid abnormalities, the overall distribution trends across categories remained consistent in both cohorts. This discrepancy likely reflects differences in sampling frames and survey methodologies between the two cohorts. NHIS-NSC includes all health insurance subscribers over a 12-year period (2009–2021), capturing cumulative smoking exposure across the entire adult lifespan, while KNHANES data come from a cross-sectional survey conducted in 2010–2011 with shorter exposure ascertainment windows. Additionally, KNHANES may have different response rates among heavy smokers, and the two surveys may use slightly different methods for collecting smoking history data.
Association between obesity and all-cause mortality
Table 2 outlines the association between risk factors and all-cause mortality using general Cox models, with complete results in Supplementary Table S1. In the NHIS-NSC, underweight individuals (BMI < 18.5) showed significantly elevated mortality risk. Among those aged < 60 years without abdominal obesity, the AHR was 2.42 (95% CI: 2.16–2.71) compared to normal weight. Severe obesity with abdominal obesity (BMI ≥35) also increased risk (AHR = 2.11, 95% CI: 1.61–2.76). For those aged ≥ 60 years, underweight without abdominal obesity showed the highest incidence rate (IR) (56.27 per 1,000 person-years, Supplementary Table S1) and remained a strong mortality predictor (AHR = 1.79, 95% CI: 1.67–1.91). Moderate obesity showed lower mortality risk than the reference group, potentially reflecting survival bias.
These findings were reflected in the KNHANES validation with consistent patterns despite smaller sample size. Time-dependent models generally confirmed the results from the general Cox models (Supplementary Table S1). While the time-dependent model suggested a stronger association for younger underweight individuals with abdominal obesity (AHR = 6.43, 95% CI: 2.41–17.17), this estimate should be interpreted with caution given the small sample size in this subgroup and the wide confidence interval observed in the primary Cox model (AHR = 1.36, 95% CI: 0.19–9.67).
Association between smoking exposure and mortality
A dose-dependent relationship was observed between cumulative smoking exposure and all-cause mortality (Table 2). In NHIS-NSC, compared to non-smokers, individuals with low exposure (pack-year to age ratio <1) exhibited a significantly higher mortality risk (AHR = 1.36, 95% CI: 1.32–1.40), while those with high exposure (pack-year to age ratio ≥1) demonstrated the greatest risk (AHR = 1.65, 95% CI: 1.51–1.81). The results from KNHANES showed a consistent direction, though the estimate for high exposure was not statistically significant due to the smaller sample size. Time-dependent models confirmed the dose-dependent relationship: AHR = 1.31 (95% CI: 1.27–1.35) for low exposure and AHR = 1.75 (95% CI: 1.59–1.93) for high exposure (S1 Table).
Association between lipid profile and mortality
Participants with at least one high-risk lipid ratio (TC/HDL ≥ 5.0, LDL/HDL ≥ 5.0, TG/HDL > 6.0) had an elevated risk of mortality in NHIS-NSC (AHR = 1.04, 95% CI: 1.01–1.07; Table 2). KNHANES showed similar estimates (AHR = 1.10, 95% CI: 0.92–1.32), though non-significant due to the smaller sample size. Time-dependent models showed a stronger association (AHR = 1.15, 95% CI: 1.11–1.18; S1 Table), suggesting composite lipid ratios provide meaningful prognostic value beyond individual markers.
Reassessment across age groups
To investigate how mortality risk factors vary over different life stages, we categorized participants into three age groups: younger than 40 years, 40–59 years, and 60 years or older. Then, the three key refinements were reassessed in mortality risk (Table 3).
Individuals classified as underweight (BMI < 18.5) had an elevated mortality risk across all age groups, with the highest risk observed in those aged 40–59 years (AHR = 2.22, 95% CI: 2.04–2.41), followed by individuals aged ≥60 years (AHR = 1.98, 95% CI: 1.91–2.04) and those younger than 40 years (AHR = 1.28, 95% CI: 1.01–1.62). The relative risk was higher in older adults, though still elevated in younger individuals. However, after age 40, the excess mortality risk associated with underweight status showed a gradual decline compared to individuals with normal weight. Abdominal obesity was associated with increased mortality risk across all age groups, with a greater impact in individuals under 40 years.
Smoking exposure demonstrated a dose-dependent association with mortality risk. Compared to non-smokers, the highest mortality risk was observed in heavy smokers (pack-year to age ratio ≥1), with AHRs of 2.26 (age < 40), 1.76 (40–59 years), and 1.69 (≥60 years). Notably, younger heavy smokers (<40 years) exhibited the largest relative risk increase (AHR = 2.26, 95% CI: 1.12–4.58).
For lipid profiles, individuals with at least one high-risk lipid ratio had an increased risk of mortality in older adults (AHR = 1.08, 95% CI: 1.05–1.10). Interestingly, among younger individuals (<40 years), a high lipid ratio was associated with a slightly lower mortality risk (AHR = 0.86, 95% CI: 0.75–0.99), which may reflect residual confounding, metabolic compensation, or survival bias.
Despite slight differences in specific risk estimates across age groups, the overall trends in obesity, smoking exposure, and lipid profile effects on mortality remained consistent.
Discussion
This study examined the association of obesity, smoking exposure, and lipid abnormalities with all-cause mortality using refined definitions of these risk factors. Unlike conventional approaches that rely solely on BMI, total pack-years, and individual lipid markers, this study incorporated a combined classification of general and abdominal obesity, the pack-year to age ratio for smoking exposure, and composite lipid ratios [25,41,42]. These modifications were intended to make the results more clinically interpretable and to minimize misleading findings common in observational studies.
Obesity’s effect on mortality varied by age and adiposity distribution. Among individuals younger than 60 years, underweight without abdominal obesity was a strong predictor of mortality (AHR = 2.42, 95% CI: 2.16–2.71 in primary Cox models; Table 2). Although the time-dependent Cox model suggested a notably elevated risk for underweight individuals with abdominal obesity in this age group (AHR = 6.43, 95% CI: 2.41–17.17; Supplementary Table S1), the corresponding estimate from the primary model was underpowered due to the small sample size (AHR = 1.36, 95% CI: 0.19–9.67; Table 2). Among individuals aged 60 and older, underweight without abdominal obesity remained the strongest mortality predictor (AHR = 1.79, 95% CI: 1.67–1.91; Table 2). These findings suggest that underweight status is a consistent and strong predictor of mortality across age groups, and that the role of abdominal obesity may differ by age, although estimates for rare phenotypes such as underweight with abdominal obesity require further investigation with larger samples [43,44].
By cross-classifying BMI with abdominal obesity status, our approach distinguished several clinically important adiposity phenotypes: (1) sarcopenic obesity (underweight with abdominal obesity), characterized by loss of muscle mass with visceral fat accumulation and associated with the highest mortality risk in younger adults; (2) metabolically unhealthy normal weight (normal BMI with abdominal obesity), reflecting central adiposity despite normal body weight; (3) metabolically healthy obesity (elevated BMI without abdominal obesity), representing obesity without excess visceral fat; and (4) various gradations of obesity with central adiposity. This refined classification helps explain the ’obesity paradox’ by revealing that not all individuals within the same BMI category have equivalent metabolic risk, and that visceral adiposity–rather than total body weight alone–is a critical determinant of mortality risk. Additionally, the combined effects of general and abdominal obesity on mortality differed across age groups, with a stronger association observed in younger individuals. These findings highlight the important of considering both BMI and abdominal obesity simultaneously, particularly in age-stratified analyses.
For smoking exposure, the pack-year to age ratio was introduced to allow for a more detailed assessment of cumulative smoking burden. The results confirmed a dose-dependent relationship between smoking exposure and mortality, with heavy smokers (pack-year to age ratio ≥ 1) consistently having the highest mortality risk across age group (AHR = 2.26, 95% CI: 1.12–4.58 for age < 40; AHR = 1.76, 95% CI: 1.56–1.98 for 40–59 years; AHR = 1.69, 95% CI: 1.57–1.82 for ≥60 years). Notably, the strongest relative risk was observed in young smokers, which strengthens the long-term consequences of early smoking exposure [45,46]. Unlike total pack-years, which does not take into account differences in smoking duration relative to age [47], the pack-year to age ratio provided results that were easier to interpret and more consistent with established clinical knowledge on smoking risk.
For lipid profiles, this study examined composite lipid ratios (TC/HDL, LDL/HDL, TG/HDL) to provide a clinically meaningful interpretation while minimizing the interpretative conflicts that arise when correlated lipid markers carry opposing risk implications. Individuals with at least one high-risk lipid ratio (TC/HDL≥5.0, LDL/HDL≥5.0, TG/HDL > 6.0) showed an increased risk of mortality in primary analyses (AHR = 1.04, 95% CI: 1.01–1.07), and this association was particularly pronounced in older adults (AHR = 1.04, 95% CI: 1.05–1.10). Several prior studies support the superior predictive power of these ratios for cardiovascular and metabolic diseases [4,15,17]. Therefore, the use of lipid ratios is not merely a statistical approach but a clinically driven decision aimed at enhancing interpretability and validity in risk assessment models.
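The composite definition can be expressed directly from the three thresholds quoted above. The following is a minimal sketch; it assumes all four lipid values share one unit (for example mg/dL), and the function names are hypothetical.

```python
def high_risk_lipid_ratios(tc: float, ldl: float, tg: float, hdl: float) -> dict:
    """Flag each composite ratio against the thresholds quoted in the text.

    ASSUMPTION: all four inputs share the same unit (e.g., mg/dL).
    """
    if hdl <= 0:
        raise ValueError("HDL must be positive")
    return {
        "TC/HDL": tc / hdl >= 5.0,
        "LDL/HDL": ldl / hdl >= 5.0,
        "TG/HDL": tg / hdl > 6.0,  # strict inequality, as stated in the text
    }


def has_any_high_risk_ratio(tc: float, ldl: float, tg: float, hdl: float) -> bool:
    """True when at least one of the three ratios is in the high-risk range."""
    return any(high_risk_lipid_ratios(tc, ldl, tg, hdl).values())


# Hypothetical profile: TC/HDL = 5.0 (flagged), LDL/HDL = 3.0, TG/HDL = 6.0 (not > 6.0).
print(high_risk_lipid_ratios(tc=200, ldl=120, tg=240, hdl=40))
```

Note that the TG/HDL criterion uses a strict inequality while the other two are inclusive, so a value exactly at 6.0 is not flagged.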
Among individuals younger than 40 years, those with high-risk lipid ratios showed paradoxically lower mortality risk (AHR = 0.86, 95% CI: 0.75–0.99). This counterintuitive finding may reflect several factors. First, residual confounding from unmeasured lipid-lowering treatments or lifestyle interventions initiated after baseline may have reduced mortality risk in this group. Second, survival bias may have selected individuals with protective genetic factors. Third, the latency period for lipid-related cardiovascular mortality typically spans decades, and our follow-up may have been insufficient to observe the full impact of dyslipidemia in younger adults. Finally, metabolic compensation mechanisms may be more robust in younger individuals. Importantly, this paradoxical association was limited to those under 40 years and did not persist in older age groups, where high-risk lipid ratios showed the expected positive association with mortality. This age-dependent pattern suggests that the inverse association in younger adults reflects methodological limitations or biological latency rather than a true protective effect.
While this study demonstrated that refined metrics produce more interpretable and clinically consistent mortality risk estimates, we did not formally compare model performance (e.g., c-statistics, Akaike Information Criterion) between refined and conventional definitions. Such comparisons would require a different study design specifically aimed at predictive model development, whereas our focus was on improving the validity and interpretability of epidemiological associations. The refined metrics were evaluated based on: (1) consistency across general and time-dependent Cox models, (2) validation in an independent cohort, and (3) alignment with established clinical knowledge. Future studies could build upon these findings to formally compare predictive performance and develop clinical risk prediction tools.
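For readers unfamiliar with the two performance measures named above, the sketch below shows how each is computed. It is an illustration of the general definitions only, not an analysis performed in this study: the pairwise concordance function is a simplified form of Harrell's c that skips pairs with tied times, the toy data are invented, and the function names are hypothetical.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Simplified Harrell's c: share of comparable pairs ranked correctly.

    A pair is comparable when the subject with the shorter observed time
    had an event. Pairs with tied times are skipped (a simplification).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher predicted risk failed first
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Invented toy data: follow-up time, event indicator (1 = died), predicted risk.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
risk = [4.0, 3.0, 2.0, 1.0]
print(harrell_c(times, events, risk))
print(aic(log_likelihood=-100.0, n_params=3))
```

For a Cox model, the log-likelihood passed to `aic` would be the partial log-likelihood, so AIC values are comparable only between models fitted to the same participants and outcome.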
Several limitations should be acknowledged. First, formal diagnostic testing of the proportional hazards assumption using Schoenfeld residuals was not performed because of computational constraints associated with the large dataset (N = 659,494). This remains a methodological limitation. However, time-dependent Cox models were employed as a complementary approach, and the consistency of results between the primary and time-dependent models provides indirect evidence supporting the proportional hazards assumption. Future studies should incorporate conventional diagnostics, potentially using subsampling or stratified approaches to manage the computational burden. Second, this study was limited to Korean populations. The generalizability of our refined metrics to other ethnic groups with different body composition patterns, smoking behaviors, and lipid profiles remains to be established. External validation in diverse populations is needed. Third, our analysis focused on all-cause mortality and did not examine cause-specific mortality. Fourth, we were unable to incorporate information on lipid-lowering medication use or changes in treatment during follow-up, which may have influenced the observed associations. Fifth, despite adjustment for multiple confounders, residual confounding from unmeasured factors (e.g., dietary patterns, genetic factors, socioeconomic variables) cannot be entirely excluded in observational studies. Additionally, history of chronic diseases including cancer, heart disease, stroke, hypertension, and diabetes was combined into a single binary variable due to low individual frequencies. This pragmatic approach may mask important differences in mortality risk profiles among these conditions, as the prognostic implications of a cancer history likely differ from those of hypertension alone. Future studies with larger sample sizes should consider modeling these conditions separately.
Across all analyses, refinement of obesity, smoking exposure, and lipid abnormalities led to mortality risk estimates that were more aligned with clinical expectations and less prone to the paradoxical findings commonly reported in observational studies. For instance, in contrast to studies where obese individuals appeared to have lower mortality risk than normal-weight individuals, incorporating abdominal obesity in the classification helped reduce these inconsistencies. Similarly, while some previous studies have reported unexpectedly lower mortality rates among smokers, the use of the pack-year to age ratio allowed for a more clinically plausible gradient of risk, showing a consistent relationship between cumulative smoking burden and increased mortality risk.
Conclusion
This study demonstrated that refining obesity, smoking exposure, and lipid profile measurements led to more clinically interpretable mortality risk estimates. Combining BMI with abdominal obesity distinguished metabolically distinct phenotypes and reduced paradoxical findings. The pack-year to age ratio provided clearer smoking dose-response relationships. Composite lipid ratios enabled evaluation of multiple correlated markers without multicollinearity issues.
Although this study did not directly compare predictive performance with conventional definitions, the refined metrics were validated through modeling consistency, independent cohort replication, and alignment with clinical knowledge. These findings demonstrate that carefully refined risk factor definitions can improve the interpretability and validity of epidemiological research, addressing longstanding paradoxes in observational mortality studies.
Supporting information
S1 Table. Risk assessment for mortality.
Time-dependent Cox model results referenced in the main text.
https://doi.org/10.1371/journal.pone.0348128.s002
(PDF)
Acknowledgments
We thank the National Health Insurance Service and the Korea Disease Control and Prevention Agency for providing access to the NHIS-NSC and KNHANES datasets.
References
- 1. Pischon T, Boeing H, Hoffmann K, Bergmann M, Schulze MB, Overvad K, et al. General and abdominal adiposity and risk of death in Europe. N Engl J Med. 2008;359(20):2105–20. pmid:19005195
- 2. Ng R, Sutradhar R, Yao Z, Wodchis WP, Rosella LC. Smoking, drinking, diet and physical activity-modifiable lifestyle risk factors and their associations with age to first chronic disease. Int J Epidemiol. 2020;49(1):113–30. pmid:31329872
- 3. Antonopoulos AS, Oikonomou EK, Antoniades C, Tousoulis D. From the BMI paradox to the obesity paradox: the obesity-mortality association in coronary heart disease. Obes Rev. 2016;17(10):989–1000. pmid:27405510
- 4. Mohammadshahi J, Ghobadi H, Matinfar G, Boskabady MH, Aslani MR. Role of lipid profile and its relative ratios (cholesterol/hdl-c, triglyceride/hdl-c, ldl-c/hdl-c, wbc/hdl-c, and fbg/hdl-c) on admission predicts in-hospital mortality covid-19. Journal of Lipids. 2023;2023:6329873.
- 5. Lavie CJ, De Schutter A, Milani RV. Healthy obese versus unhealthy lean: the obesity paradox. Nat Rev Endocrinol. 2015;11(1):55–62. pmid:25265977
- 6. Ahn C, Hwang Y, Park SK. Predictors of all-cause mortality among 514,866 participants from the Korean National Health Screening Cohort. PLoS One. 2017;12(9):e0185458. pmid:28957371
- 7. Byun AR, Lee SW, Lee HS, Shim KW. What is the most appropriate lipid profile ratio predictor for insulin resistance in each sex? A cross-sectional study in Korean populations (The Fifth Korea National Health and Nutrition Examination Survey). Diabetol Metab Syndr. 2015;7:59. pmid:26146523
- 8. Reitsma MB, et al. Smoking prevalence and attributable disease burden in 195 countries and territories, 1990–2015: a systematic analysis from the Global Burden of Disease Study 2015. The Lancet. 2017;389(10082):1885–906.
- 9. Lee Y-H, Shin M-H, Kweon S-S, Choi J-S, Rhee J-A, Ahn H-R, et al. Cumulative smoking exposure, duration of smoking cessation, and peripheral arterial disease in middle-aged and older Korean men. BMC Public Health. 2011;11:94. pmid:21310081
- 10. Jacobs D, et al. Report of the Conference on Low Blood Cholesterol: Mortality Associations. Circulation. 1992;86(3):1046–60.
- 11. Emerging Risk Factors Collaboration, Di Angelantonio E, Gao P, Pennells L, Kaptoge S, Caslake M, et al. Lipid-related markers and cardiovascular disease prediction. JAMA. 2012;307(23):2499–506. pmid:22797450
- 12. Cembrowska P, Stefańska A, Odrowąż-Sypniewska G. Obesity phenotypes: normal-weight individuals with metabolic disorders versus metabolically healthy obese. Medical Research Journal. 2017;1(3):95–9.
- 13. Yang HK, Han K, Kwon H-S, Park Y-M, Cho J-H, Yoon K-H, et al. Obesity, metabolic health, and mortality in adults: a nationwide population-based study in Korea. Sci Rep. 2016;6:30329. pmid:27445194
- 14. Inoue-Choi M, Liao LM, Reyes-Guzman C, Hartge P, Caporaso N, Freedman ND. Association of Long-term, Low-Intensity Smoking With All-Cause and Cause-Specific Mortality in the National Institutes of Health-AARP Diet and Health Study. JAMA Intern Med. 2017;177(1):87–95. pmid:27918784
- 15. Jung HW, Hong SP, Kim KS. Comparison of apolipoprotein B/A1 ratio, TC/HDL-C, and lipoprotein (a) for predicting outcomes after PCI. PLoS One. 2021;16(7).
- 16. Sun T, Chen M, Shen H, Ping Y, Fan L, Chen X, et al. Predictive value of LDL/HDL ratio in coronary atherosclerotic heart disease. BMC Cardiovasc Disord. 2022;22(1):273. pmid:35715736
- 17. Assmann G, Schulte H, von Eckardstein A, Huang Y. High-density lipoprotein cholesterol as a predictor of coronary heart disease risk. The PROCAM experience and pathophysiological implications for reverse cholesterol transport. Atherosclerosis. 1996;124 Suppl:S11-20. pmid:8831911
- 18. Kim YI, Kim Y-Y, Yoon JL, Won CW, Ha S, Cho K-D, et al. Cohort Profile: National health insurance service-senior (NHIS-senior) cohort in Korea. BMJ Open. 2019;9(7):e024344. pmid:31289051
- 19. Yun S, Oh K. The Korea National Health and Nutrition Examination Survey data linked Cause of Death data. Epidemiol Health. 2022;44:e2022021. pmid:35167742
- 20. Kang YM, Kim Y-J, Park J-Y, Lee WJ, Jung CH. Mortality and causes of death in a national sample of type 2 diabetic patients in Korea from 2002 to 2013. Cardiovasc Diabetol. 2016;15(1):131. pmid:27618811
- 21. Lee S-H, Kim D-H, Park J-H, Kim S, Choi M, Kim H, et al. Association between body mass index and mortality in the Korean elderly: A nationwide cohort study. PLoS One. 2018;13(11):e0207508. pmid:30444893
- 22. Wan EYF, Yu EYT, Chin WY, Barrett JK, Mok AHY, Lau CST, et al. Greater variability in lipid measurements associated with cardiovascular disease and mortality: A 10-year diabetes cohort study. Diabetes Obes Metab. 2020;22(10):1777–88. pmid:32452623
- 23. Consultation W. Waist circumference and waist-hip ratio. Geneva: World Health Organization. 2008.
- 24. Lee SY, Park HS, Kim DJ, Han JH, Kim SM, Cho GJ, et al. Appropriate waist circumference cutoff points for central obesity in Korean adults. Diabetes Res Clin Pract. 2007;75(1):72–80. pmid:16735075
- 25. Flegal KM, Kit BK, Orpana H, Graubard BI. Association of all-cause mortality with overweight and obesity using standard body mass index categories: a systematic review and meta-analysis. JAMA. 2013;309(1):71–82. pmid:23280227
- 26. Thun MJ, Carter BD, Feskanich D, Freedman ND, Prentice R, Lopez AD, et al. 50-year trends in smoking-related mortality in the United States. N Engl J Med. 2013;368(4):351–64. pmid:23343064
- 27. Emerging Risk Factors Collaboration, Di Angelantonio E, Sarwar N, Perry P, Kaptoge S, Ray KK, et al. Major lipids, apolipoproteins, and risk of vascular disease. JAMA. 2009;302(18):1993–2000. pmid:19903920
- 28. Kannel WB, Castelli WP, Gordon T, McNamara PM. Serum cholesterol, lipoproteins, and the risk of coronary heart disease. The Framingham study. Ann Intern Med. 1971;74(1):1–12. pmid:5539274
- 29. Cullen P. Evidence that triglycerides are an independent coronary heart disease risk factor. Am J Cardiol. 2000;86(9):943–9. pmid:11053704
- 30. Ridker PM, Hennekens CH, Buring JE, Rifai N. C-reactive protein and other markers of inflammation in the prediction of cardiovascular disease in women. N Engl J Med. 2000;342(12):836–43. pmid:10733371
- 31. Jee SH, Sull JW, Park J, Lee S-Y, Ohrr H, Guallar E, et al. Body-mass index and mortality in Korean men and women. N Engl J Med. 2006;355(8):779–87. pmid:16926276
- 32. Association AD. Standards of medical care in diabetes—2016 abridged for primary care providers. Clinical diabetes: a publication of the American Diabetes Association. 2016;34(1):3.
- 33. Coco B, Oliveri F, Maina AM, Ciccorossi P, Sacco R, Colombatto P, et al. Transient elastography: a new surrogate marker of liver fibrosis influenced by major changes of transaminases. J Viral Hepat. 2007;14(5):360–9. pmid:17439526
- 34. Lamb EJ, MacKenzie F, Stevens PE. How should proteinuria be detected and measured? Annals of clinical biochemistry. 2009;46(3):205–17.
- 35. Hernández-Vásquez A, Chacón-Torrico H, Vargas-Fernández R, Grendas LN, Bendezu-Quispe G. Gender Differences in the Factors Associated with Alcohol Binge Drinking: A Population-Based Analysis in a Latin American Country. Int J Environ Res Public Health. 2022;19(9):4931. pmid:35564326
- 36. Porta M. A dictionary of epidemiology. Oxford University Press. 2014.
- 37. Kalbfleisch JD, Schaubel DE. Fifty Years of the Cox Model. Annu Rev Stat Appl. 2023;10(1):1–23.
- 38. Lee B, An J, Lee S, Won S. Rex: R-linked EXcel add-in for statistical analysis of medical and bioinformatics data. Genes & Genomics. 2023;45(3):295–305.
- 39. Rodriguez RN. Sas. Wiley Interdisciplinary Reviews: Computational Statistics. 2011;3(1):1–11.
- 40. Ripley BD. The R Project in Statistical Computing. MSOR Connections. 2001;1(1):23–5.
- 41. Carter BD, Abnet CC, Feskanich D, Freedman ND, Hartge P, Lewis CE, et al. Smoking and mortality--beyond established causes. N Engl J Med. 2015;372(7):631–40. pmid:25671255
- 42. Kromhout D, Bosschieter EB, Drijver M, de Lezenne Coulander C. Serum cholesterol and 25-year incidence of and mortality from myocardial infarction and cancer. The Zutphen Study. Arch Intern Med. 1988;148(5):1051–5. pmid:3365076
- 43. Hamer M, O’Donovan G, Stensel D, Stamatakis E. Normal-Weight Central Obesity and Risk for Mortality. Ann Intern Med. 2017;166(12):917–8. pmid:28437799
- 44. Carmienke S, Freitag MH, Pischon T, Schlattmann P, Fankhaenel T, Goebel H, et al. General and abdominal obesity parameters and their combination in relation to mortality: a systematic review and meta-regression analysis. Eur J Clin Nutr. 2013;67(6):573–85. pmid:23511854
- 45. Fa-Binefa M, Clará A, Pérez-Fernández S, Grau M, Dégano IR, Marti-Lluch R, et al. Early smoking-onset age and risk of cardiovascular disease and mortality. Prev Med. 2019;124:17–22. pmid:31054906
- 46. Wang J-L, Yin W-J, Zhou L-Y, Wang Y-F, Zuo X-C. Association Between Initiation, Intensity, and Cessation of Smoking and Mortality Risk in Patients With Cardiovascular Disease: A Cohort Study. Front Cardiovasc Med. 2021;8:728217. pmid:34977166
- 47. Peto J. That the effects of smoking should be measured in pack-years: misconceptions 4. Br J Cancer. 2012;107(3):406–7. pmid:22828655