Association between red cell distribution width and all-cause mortality in patients with breast cancer: A retrospective analysis using MIMIC-IV 2.0

Purpose To investigate the association between red cell distribution width (RDW) and all-cause mortality in patients with breast cancer, and to evaluate the potential clinical prognostic value of RDW. Methods Patients with breast cancer in the Medical Information Mart for Intensive Care (MIMIC-IV) database were categorized into quartiles based on the RDW index. The primary outcomes were in-hospital mortality from all causes during the first six months, the first year, and the first three years. Cox proportional hazards regression and restricted cubic spline (RCS) models were developed to investigate the effects of RDW on the primary outcomes. Results The study included 939 patients, all female. The 6-month, 1-year, and 3-year mortality rates were 14.0%, 21.4%, and 28.4%, respectively. Multivariate Cox proportional hazards analyses demonstrated that RDW was independently associated with an increased risk of all-cause mortality. After adjusting for confounders, higher RDW quartiles were significantly associated with 6-month mortality (adjusted hazard ratio (HR), 3.197; 95% confidence interval (CI), 1.745–5.762; P < 0.001), 1-year mortality (adjusted HR, 2.978; 95% CI, 1.867–4.748; P < 0.001), and 3-year mortality (adjusted HR, 2.526; 95% CI, 1.701–3.750; P < 0.001). The RCS curves demonstrated that high RDW (> 14.6) was associated with a greater risk of all-cause mortality. Subgroup analyses revealed no statistically significant interactions between the subgroups. Conclusion The study revealed a strong association between RDW and overall mortality, indicating its potential as an independent prognostic factor for increased mortality among patients with breast cancer.


Introduction
Breast cancer is a common cancer affecting women worldwide. The American Cancer Society predicts that breast cancer will account for approximately 32% of newly diagnosed cancer cases in women in 2024 [1]. Despite continuous progress in medical innovation that has led to a gradual reduction in mortality rates, breast cancer remains the foremost contributor to cancer-related fatalities in the female population [2]. Consequently, it is imperative to employ efficient screening techniques to accurately evaluate the risk of mortality in the clinical diagnosis of breast cancer, as this greatly influences treatment decisions and patients' clinical outcomes. In the assessment of breast cancer, the commonly utilized prognostic factors include age, tumor size, axillary lymph node status, and histological characteristics (particularly histological grade and lymphatic invasion) [3]. Additionally, molecular subtypes including human epidermal growth factor receptor 2 (HER2) [4], estrogen receptor (ER) [5], progesterone receptor (PR) [6], and antigen Ki-67 [7] are considered important for prognostic evaluation. Currently, the identification of histological features and molecular subtypes relies upon pathological biopsies, which are invasive, time-consuming, and relatively expensive, limiting their widespread clinical use. Consequently, there is a crucial need to explore alternative indicators such as routine complete blood counts, which can provide quick and straightforward insights to assist clinicians in determining the prognosis of patients with breast cancer.
The red cell distribution width (RDW) is a straightforward hematologic parameter that represents the heterogeneity of red blood cell volume [8]. The higher the RDW value, the greater the variation in red blood cell size. Fluctuations in RDW have been reported in many pathophysiological conditions; for example, elevated RDW is associated with acute and chronic heart failure [9], coronary artery disease, cerebral infarction, and acute myocardial infarction [10]. Previous studies have reported an association between elevated RDW levels in elderly patients and unfavorable outcomes in terms of overall survival (OS) and disease-free survival (DFS) [11]. Furthermore, based on data from a pilot study, Seretis et al. suggested that RDW may serve as a potential biomarker of breast cancer activity [12]. However, research on the role of RDW in breast cancer prognosis is limited, particularly large-scale studies. Takeuchi et al. analyzed 299 patients with breast cancer and found no significant association between RDW and DFS [13]. In contrast, Huang et al. identified RDW as a relevant inflammatory marker in patients with breast cancer, potentially associated with DFS and OS in young females [14]. Additionally, Yoo et al. demonstrated that preoperative elevation of RDW (>13.5) in patients with breast cancer has the strongest predictive ability for postoperative mortality, with the risk of recurrence and death increasing by approximately 1.7 times once RDW exceeds the critical threshold [15]. Researchers therefore hold divergent views on the impact of RDW in patients with breast cancer. To address this, our study selected a cohort of patients with breast cancer across all adult age groups from the Medical Information Mart for Intensive Care (MIMIC-IV) database. We adjusted for a series of confounding variables and further stratified the patients into specific subgroups to validate the robustness of the analytical results. We aimed to investigate the potential relationship between RDW and overall mortality in patients with breast cancer, elucidating the role of RDW in the prognosis of breast cancer.

Study population
This retrospective study investigated health-related data from the MIMIC-IV database version 2.0, a comprehensive and extensive single-center database administered by the Laboratory of Computational Physiology at the Massachusetts Institute of Technology (MIT). The MIMIC-IV database is a valuable resource that offers a substantial collection of meticulously documented medical records of patients at the Beth Israel Deaconess Medical Center in Boston from 2008 to 2019, relating specifically to patients admitted to the intensive care unit (ICU) [16]. Access to MIMIC-IV requires passing the Protecting Human Research Participants online course and exams from the National Institutes of Health. Data extraction was mainly completed by one of the authors (Jie Xiao) after obtaining access to the datasets (certification number: 56775311). The MIMIC-IV database was approved for research by the Institutional Review Boards of the Massachusetts Institute of Technology and Beth Israel Deaconess Medical Center (BIDMC); thus, this study adhered to the ethical standards outlined in the Helsinki Declaration and received a waiver of ethical approval and informed consent. Patients with breast cancer were selected from the MIMIC-IV database based on the diagnostic criteria outlined in the 9th and 10th editions of the International Classification of Diseases. For patients admitted to the ICU multiple times, only the initial admission data were analyzed. The exclusion criteria for this study were as follows: (1) individuals under the age of 18 years at initial admission; (2) patients with incomplete RDW information recorded on the first day of admission; (3) hospitalization lasting less than 1 day; (4) death within 7 days of hospitalization. Finally, a total of 939 participants were enrolled and subsequently allocated to four groups based on the RDW quartile (Fig 1).
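The selection steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' extraction code: the record fields (`patient_id`, `admit_day`, `age`, `rdw_day1`, `stay_days`, `days_to_death`) are hypothetical names, the boundary handling for "within 1 day" and "within 7 days" is a guess, and the quartile cut points would in practice be computed from the cohort itself.

```python
def select_cohort(admissions):
    """Keep each patient's first ICU admission, then apply the four exclusions.

    Each admission is a dict with hypothetical keys: patient_id, admit_day,
    age, rdw_day1 (None if missing), stay_days, days_to_death (None if alive).
    """
    first = {}
    for adm in sorted(admissions, key=lambda a: a["admit_day"]):
        first.setdefault(adm["patient_id"], adm)  # earliest admission wins

    kept = []
    for adm in first.values():
        if adm["age"] < 18:                # (1) under 18 at initial admission
            continue
        if adm["rdw_day1"] is None:        # (2) no first-day RDW
            continue
        if adm["stay_days"] < 1:           # (3) stay shorter than 1 day (assumed boundary)
            continue
        dtd = adm["days_to_death"]
        if dtd is not None and dtd <= 7:   # (4) death within 7 days (assumed inclusive)
            continue
        kept.append(adm)
    return kept


def rdw_quartile(value, cuts):
    """Map an RDW value to quartile 1-4 given three ascending cut points.

    Values equal to a cut point fall in the lower group (an assumption).
    """
    return 1 + sum(value > c for c in cuts)
```

For example, with the cut points 13.4, 14.6, and 16.4 (the cohort's reported 25th, 50th, and 75th percentiles), `rdw_quartile(17.0, [13.4, 14.6, 16.4])` places a patient in the highest quartile.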

Data collection
Baseline characteristics were extracted from the MIMIC-IV database using PostgreSQL (version 15.3) and Navicat Premium (version 16). The available variables fell into four groups as follows: (1) demographic information encompassing age, race, sex, height, weight, and body mass index (BMI); (2) treatment procedures, such as radiation, medication, and breast surgery; (3) comorbidities, including myocardial infarction (MI), congestive heart failure (CHF), peripheral vascular disease (PVD), cerebrovascular disease (CVD), chronic pulmonary disease, rheumatic disease, mild liver disease, and renal disease; (4) laboratory indicators, including anion gap (AG), bicarbonate, white blood cells (WBC), red blood cells (RBC), platelets (PLT), hemoglobin (Hb), hematocrit (HCT), chloride, serum calcium, serum potassium, serum sodium, glucose (GLU), serum creatinine, blood urea nitrogen (BUN), mean corpuscular hemoglobin (MCH), mean corpuscular volume (MCV), mean corpuscular hemoglobin concentration (MCHC), and RDW. The value of the RDW index was obtained as (standard deviation [SD] of red cell volume / MCV) × 100 [17]. Data extracted from the ICU included laboratory variables reported within the initial 24-hour period of patient admission. Follow-up commenced on the day of admission and continued until the day of death. Owing to the occurrence of missing data in MIMIC-IV, a single imputation approach was employed to address any gaps. To circumvent the potential bias resulting from directly imputing missing values, variables with a missing rate exceeding 20% were transformed into dummy variables within the models. Additionally, variables with >25% missing values were excluded (S1 Table).
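The missing-data rules above can be expressed as a small decision function. The following is a minimal sketch, assuming missing values are represented as `None`; the function names are hypothetical, and the median is used purely as an illustrative single-imputation value because the text does not state which value was imputed.

```python
from statistics import median


def missing_rate(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def plan_missing_data(columns, dummy_above=0.20, drop_above=0.25):
    """Decide how each variable is handled, following the thresholds in the text.

    columns maps a variable name to its list of values (None = missing).
    """
    plan = {}
    for name, values in columns.items():
        rate = missing_rate(values)
        if rate > drop_above:
            plan[name] = "exclude"             # >25% missing: dropped
        elif rate > dummy_above:
            plan[name] = "missing-indicator"   # >20% missing: dummy variable
        elif rate > 0:
            plan[name] = "single-imputation"
        else:
            plan[name] = "complete"
    return plan


def impute_median(values):
    """Single imputation with the observed median (illustrative choice only)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]
```

Keeping the decision (`plan_missing_data`) separate from the imputation itself makes it easy to report, per variable, which rule was applied.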

Statistical analysis
The study population was divided into four groups based on the quartiles of the RDW index on the first day of ICU admission. Continuous variables were expressed as mean ± SD or median with interquartile range, while categorical variables were expressed as frequency and percentage (%). The normality of continuous parameters was evaluated using the Kolmogorov-Smirnov test. When a normal distribution was met, a t-test or ANOVA was used for comparison. When a non-normal distribution was encountered, the Mann-Whitney U test or Kruskal-Wallis test was used. Fisher's exact test or Pearson's chi-square test was used to examine the differences between groups for categorical variables. The incidence of primary end-events (6-month mortality, 1-year mortality, and 3-year mortality) was examined using Kaplan-Meier survival analysis in groups with varying RDW index values (per unit and by quartile), and disparities between groups were assessed using log-rank tests. To determine the hazard ratio (HR) and 95% confidence interval (CI) between RDW and primary endpoints, Cox proportional hazards models were employed, with certain models being adjusted. Baseline variables were used as candidate predictors in the multiple regression models. To avoid model overfitting, we calculated the variance inflation factor (VIF) to quantify multicollinearity between variables and removed any variable with VIF ≥ 5. Ultimately, only clinically significant and prognostically impactful confounding factors were included in the multivariate model as follows: Model 1, unadjusted; Model 2, adjusted for age, race, and BMI; and Model 3, adjusted for age, race, BMI, radiation, medication, breast surgery, MI, CHF, PVD, CVD, chronic pulmonary disease, rheumatic disease, mild liver disease, and renal disease. To examine the non-linear relationship between RDW and all-cause mortality, we used a restricted cubic spline (RCS) regression model with four knots. Additionally, we employed Receiver Operating Characteristic (ROC) curves to establish the optimal threshold for RDW. In our analysis, the RDW index was included in the model as a continuous variable, and alternatively as a categorical variable. The lowest RDW quartile served as the reference group for the latter approach. To assess any potential trends, P-values were determined based on quartile levels. Moreover, we conducted additional stratified analyses considering age (≥ 65 years and < 65 years), race (white and nonwhite), BMI (< 30 kg/m² and ≥ 30 kg/m²), diabetes, and renal disease. The objective was to evaluate the consistent prognostic significance of the RDW index for the primary endpoints using likelihood ratio tests to examine any interactions between RDW and the stratified variables. All data analyses were conducted using the SPSS Statistical Software (IBM SPSS Statistics, version 29.0), Prism software (GraphPad Prism, version 9.4.0), and R software (R, version 4.3.6). A two-sided P < 0.05 was considered statistically significant.
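As an illustration of the Kaplan-Meier step, the sketch below computes the product-limit survival estimate for one group and shows how follow-up can be truncated at a fixed horizon to obtain the 6-month, 1-year, and 3-year endpoints. It is not the authors' SPSS or R code; the function names are hypothetical, and measuring time in days (with roughly 183 days for six months) is an assumption.

```python
def censor_at(times, events, horizon):
    """Truncate follow-up at `horizon`: later deaths become censored observations."""
    new_times = [min(t, horizon) for t in times]
    new_events = [bool(e) and t <= horizon for t, e in zip(times, events)]
    return new_times, new_events


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each distinct death time.

    times  : follow-up time per patient
    events : truthy if the patient died at that time, falsy if censored
    """
    curve, survival = [], 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Running `kaplan_meier` separately on each RDW quartile yields the curves that a log-rank test would then compare.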

Results
This study included 939 patients diagnosed with breast cancer, with an average age of 64.95 ± 14.05 years. All participants were female, with 657 (70.0%) identifying as white. The median RDW index for the entire population was 14.6, with an interquartile range (IQR) of 13.4 to 16.4. At a mean follow-up duration of 26.3 months, 350 (37.3%) patients had died from any cause. The mortality rates at 6 months, 1 year, and 3 years were 14.0%, 21.4%, and 28.4%, respectively (Table 1).
Differences in baseline characteristics between individuals who survived and those who did not survive their hospital stays are shown in Table 2. Individuals in the non-survival group were older on average, had a lower rate of breast surgery, and had a greater occurrence of CHF, chronic pulmonary disease, mild liver disease, renal disease, and metastatic solid tumors (P < 0.05). As for laboratory parameters, the non-survival group exhibited lower levels of Hb, chloride, serum sodium, GLU, and MCHC and higher levels of AG and BUN (P < 0.05). There were no significant differences in BMI, bicarbonate levels, RBC count, HCT, or MCH. Compared to the survival group, the non-survival group's RDW index was substantially higher (15.8% vs. 14.0%, P < 0.001). The distribution of the RDW level, stratified by mortality status, for deaths that occurred within six months, one year, and three years is shown in S1 Fig.

Primary outcomes
The Kaplan-Meier survival analysis curves for the incidence of the primary outcome in each quartile of RDW are shown in Fig 2. In general, patients with elevated RDW had an increased risk of in-hospital death. At 6-month, 1-year, and 3-year follow-up, individuals with a high RDW index exhibited significantly higher overall mortality rates than those with a low RDW index (all log-rank P < 0.001; Fig 2). ROC analysis was performed to evaluate the clinical predictive value of the RDW index for in-hospital mortality. However, we observed that the effectiveness of RDW in predicting all-cause mortality was suboptimal (AUC for 6-month death: 0.704, P < 0.001; AUC for 1-year death: 0.691, P < 0.001; and AUC for 3-year death: 0.679, P < 0.001). The cutoff values for RDW were 15.75, 15.35, and 14.55, respectively (S2 Fig). Cox proportional hazards analysis was performed to examine the relationship between RDW and overall mortality. When RDW was taken as a continuous variable, it was significantly associated with 6-month, 1-year, and 3-year mortality in the unadjusted model (HR, 1.226 [95% CI 1.166-1.289]; P < 0.001), the partially adjusted model (HR, 1.251 [95% CI 1.187-1.319]; P < 0.001), and the fully adjusted model (HR, 1.187 [95% CI 1.114-1.264]; P < 0.001). When RDW was considered as a categorical variable, patients within higher RDW quartiles were at a considerably elevated risk of 6-month death in all three models compared to participants in the bottom quartile of RDW: unadjusted model (HR, 5.136 [95% CI 2.919-9.037]; P < 0.001), partially adjusted model (HR, 5.321 [95% CI 3.011-9.403]; P < 0.001), and fully adjusted model (HR, 3.197 [95% CI 1.745-5.762]; P < 0.001). The risk also tended to rise with the RDW index (Table 3, Fig 3A). Similar outcomes were found in the multivariate Cox analyses conducted to evaluate the association between the RDW index and the 1-year and 3-year mortality rates (Table 3, Fig 3B and 3C).
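For readers who want to see how an AUC and an ROC-derived cutoff of the kind reported above can be obtained, here is a minimal pure-Python sketch. The text does not say which criterion was used to pick the optimal threshold; maximizing Youden's J (sensitivity + specificity - 1) is assumed here as one common choice, and the function names are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a random death has a higher RDW than a
    random survivor (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing Youden's J, classifying score >= cutoff as high risk."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= cut)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cut)
        j = tp / n_pos - fp / n_neg  # sensitivity - (1 - specificity)
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j
```

This sketch only considers observed values as candidate cutoffs, whereas statistical packages often report midpoints between adjacent observed values, so the cutoffs it returns need not match package output exactly.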
Furthermore, the application of RCS regression models helped ascertain a significant association, indicating that elevated RDW levels (> 14.6) were linked to an increased likelihood of mortality (Fig 4).
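To make the RCS analysis concrete, the sketch below builds a restricted cubic spline basis (the truncated-power form, which is constrained to be linear beyond the outer knots) and evaluates a hazard ratio relative to a reference RDW value. It is an illustration only: the knot locations and coefficients passed to these functions are hypothetical inputs, since the fitted values are not reported in the text, and a real analysis would estimate the coefficients in a Cox model.

```python
import math


def rcs_basis(x, knots):
    """Restricted cubic spline basis at x for k >= 3 knots.

    Returns k - 1 columns: x itself plus k - 2 nonlinear terms.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")

    def cube(u):
        return max(u, 0.0) ** 3  # truncated cubic (u)+^3

    span = t[-1] - t[-2]
    columns = [x]
    for j in range(k - 2):
        term = (cube(x - t[j])
                - cube(x - t[-2]) * (t[-1] - t[j]) / span
                + cube(x - t[-1]) * (t[-2] - t[j]) / span)
        columns.append(term)
    return columns


def hazard_ratio(x, reference, betas, knots):
    """HR at x versus the reference value, given one coefficient per basis column."""
    bx = rcs_basis(x, knots)
    br = rcs_basis(reference, knots)
    return math.exp(sum(b * (u - v) for b, u, v in zip(betas, bx, br)))
```

With four knots the basis has three columns, and by construction the hazard ratio at the reference level (14.6 in Fig 4) equals 1.0 regardless of the coefficients.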

Subgroup analysis
Because characteristics of the enrolled patients might influence the results, we examined in detail how the RDW level stratified risk for the primary outcomes across subgroups, including BMI, race, CHF, chronic pulmonary disease, and renal disease (Table 4). The RDW index was significantly associated with an increased risk of 6-month mortality in specific subgroups of patients with breast cancer. These subgroups included non-white individuals (HR, 4.599; 95% CI 1.043-20.286), individuals aged ≥ 65 years (HR 3.479; 95% CI 1.607-7.532), individuals with BMI ≥ 30 kg/m² (HR 2.441; 95% CI 0.082-7.433), individuals without CHF (HR 3.866; 95% CI 1.975-7.569), and individuals without renal disease (HR 3.631; 95% CI 1.925-6.850) (all P < 0.05). Similar associations were observed in the stratified analyses of the RDW index and the 1-year and 3-year mortality rates. However, there was no significant difference in the cross-stratification of RDW quartiles by age, BMI, race, chronic pulmonary disease, or renal disease (P for interaction > 0.05), suggesting that our subgroup analysis was relatively stable and less affected by confounding factors. Interestingly, the predictive ability of RDW appeared to be even more pronounced in patients without CHF than in those with CHF (HR 3.866 [95% CI 1.975-7.569] vs. HR 1.410 [95% CI 0.325-6.121], P for interaction = 0.026).

Discussion
Our study suggests that increased RDW is a robust and independent predictor of higher mortality in patients with breast cancer. The strong association between elevated RDW and all-cause mortality remains prominent even after adjusting for potential confounding factors.
Owing to its availability and cost-effectiveness in routine blood examinations, we propose that RDW could serve as a novel, reliable clinical indicator, helping to identify patients with breast cancer at risk of unfavorable prognosis. RDW is a common RBC measurement included in the complete blood count (CBC), reflecting the heterogeneity of the circulating red blood cell volume. An abnormal elevation in RDW suggests that inflammatory cytokines stimulate the premature release of immature large red blood cells into the peripheral blood circulation [18], leading to an increase in red blood cell volume variation. Clinical studies have shown that RDW values differ significantly from those of healthy subjects both in patients treated for cancer and in patients with non-cancer conditions such as hematologic [19], cardiovascular [10], and systemic diseases. Changes in RDW are particularly evident in cardiovascular diseases, and its elevation has been demonstrated as a reliable indicator of negative consequences in a variety of cardiovascular and cerebrovascular diseases, including heart failure, pulmonary embolism, ischemic stroke, hemorrhagic stroke, and coronary heart disease, among others [20]. Cancer-related chronic inflammation is a key feature of tumor development. RDW has emerged as a reliable marker for systemic inflammatory response in various malignancies, consistently linked to adverse outcomes in extensive research. In 2009, a community-based prospective study reported a strong and independent association between higher RDW and the risk of death from cancer [21]. A meta-analysis by Wang et al. identified an association between pre-treatment RDW above a threshold of 13%-14% and poor survival outcomes [22], while Ines et al. associated high RDW with adverse prognostic factors in patients with Hodgkin's lymphoma (HL) [23]. Warwick et al. examined the data of 917 patients who underwent surgery for non-small-cell lung carcinoma and confirmed that a preoperative RDW-CV of >15.3% was a significant risk factor for postoperative death (P = 0.001) and reduced survival (P = 0.0001) [24]. In colorectal cancer, a high RDW level (≥ 13.5%) was reported as an additional independent indicator of both cause-specific survival (CSS) and OS [25], with a significant reduction in 10-year OS among patients with high RDW [26]. These studies all suggest that RDW is an important biomarker for cancer. However, the exact biological mechanism underlying the association between RDW and all-cause mortality risk in patients with cancer remains unclear. One hypothesis is that increased oxidative stress may reduce red blood cell survival and increase immature red blood cells in circulation, leading to increased RDW [27]. Another plausible explanation is that elevated RDW levels in cancer patients may be attributed to prolonged inflammatory responses and increased circulating cytokine levels, possibly causing damage to red blood cell membranes and influencing erythropoietin production, ultimately leading to an increase in RDW [28]. Additionally, due to the limited specificity of RDW in cancer diagnosis, its application has certain constraints. Therefore, scholars have mainly focused on investigating the value of RDW in the prognostic assessment of cancer patients.
Researchers have disagreed on the predictive value of elevated RDW for breast cancer. Previous studies have shown that high RDW levels can be observed in patients with breast cancer [29], especially in postmenopausal women [30], compared to healthy individuals [31]. Accordingly, RDW levels have been reported to distinguish patients with breast cancer from healthy individuals effectively [32]. Furthermore, RDW levels were significantly higher in patients with breast cancer than in patients with benign breast fibroadenomas [33]. Seretis et al. observed elevated preoperative RDW in breast cancer patients compared to those with breast fibroadenoma. The heightened RDW correlated with tumor size, metastatic lymph nodes, and HER2 overexpression, suggesting its potential in distinguishing benign from malignant breast tumors [12]. Some scholars believe that RDW might act as a biomarker for evaluating the metastatic capability of tumors [34], and an elevated RDW before treatment was found to be an independent factor that negatively affected the survival rate of young females with breast cancer [14]. Moreover, Yao et al. showed that high pretreatment RDW levels in patients with breast cancer were associated with poorer OS and DFS [35], suggesting that RDW may be a potential predictor of poor prognosis in all patients. Yoo et al. [15] further investigated preoperative hematologic indicators in patients with breast cancer and found that a markedly elevated RDW >13.5% was the most robust predictor of postoperative mortality. Surpassing this RDW threshold was associated with about a 1.7-fold increase in both recurrence likelihood and death risk. In contrast, Zou et al. showed no difference in RDW values between patients with breast cancer and patients with breast fibroadenoma, but RDW levels were significantly negatively associated with the histological grade of breast cancer [36]. Another retrospective study showed that pre-treatment RDW values in patients with breast cancer were not significantly associated with survival, whereas increased RDW levels after surgery and adjuvant therapy were associated with poor DFS and OS [37,38]. Fu et al. [39] demonstrated that elevated RDW was significantly associated with poor prognosis in triple-negative breast cancer, but not in the Luminal A subtype. Although the conclusions from these studies vary, we should be aware of the value of RDW markers in adjuvant diagnosis, differentiation between benign and malignant breast tumors, and prognostic judgment.

Strengths and limitations
Our study observed a significant increase in short-term, medium-term, and long-term in-hospital mortality rates among patients with breast cancer exhibiting elevated RDW, aligning with findings from some previous studies. In contrast to prior studies, our study stands out in the exploration of the relationship between RDW and in-hospital mortality in patients with breast cancer through several distinctive features. Firstly, by drawing on a large sample of patients with breast cancer recorded in the MIMIC-IV database over an 11-year period, we used multivariate covariate analysis to account for distinct baseline characteristics linked to RDW, intending to eliminate the impact of confounding variables (such as age, race, BMI, medication, surgery, and comorbidities). Employing advanced statistical methods and meticulous adjustment for confounding factors enhances the robustness and reliability of our findings. The persistence of the significant association between RDW and in-hospital mortality, even after accounting for these confounding risk factors, underscores the utility of RDW as a tool for risk assessment in clinical practice. Secondly, our comprehensive subgroup analyses provide a nuanced understanding of the specific associations between RDW and in-hospital mortality in various subpopulations, offering valuable insights for personalized treatment decision-making. Additionally, our study extends beyond the conventional focus on younger patients or specific phases of breast cancer treatment, integrating RDW into the broader clinical context over an extended period. This approach enables a more comprehensive analysis of RDW's potential impact on patient care, risk stratification, and clinical decision-making. In conclusion, our study not only confirms and refines the association between RDW and in-hospital mortality in breast cancer patients but also contributes methodological nuances, broadening the understanding of RDW's significance through meticulous adjustment, subgroup analyses, and comprehensive clinical contextualization. The methodological rigor and comprehensive exploration distinguish our study, enhancing its applicability and contributing meaningfully to the scientific discourse in this field.
The study has several limitations. Firstly, the MIMIC database lacks staging information for patients with breast cancer, preventing the incorporation of this crucial factor into our analysis. The tumor stage is pivotal in understanding the extent of cancer progression and tailoring appropriate treatment strategies. To mitigate this limitation, future research endeavors could involve collaboration with multiple medical centers or databases that encompass comprehensive patient data, including detailed staging information. Secondly, because postoperative RDW records were available for only 11 samples, a clear distinction between preoperative and postoperative RDW data was challenging. Recognizing postoperative RDW as a dynamic indicator reflecting the interplay between systemic inflammatory and immune responses after surgery adds complexity to its interpretation and differentiation from preoperative RDW. Additionally, the reliance on data from a single center introduces limitations in terms of generalizability. To address these constraints and enhance the robustness of our findings, a future multicenter or prospective study with a larger sample size is essential.

Conclusion
In conclusion, our study suggests that RDW, as a simple, inexpensive, and readily available routine blood test, may serve as a significant risk predictor of all-cause mortality in patients with breast cancer.By understanding the relationship between RDW and the survival outcome, we can comprehensively assess the overall health status of breast cancer patients.This knowledge allows for strengthened monitoring and management of high-risk patients in future clinical practice.

Fig 4. Restricted cubic spline curves for the RDW hazard ratio. Heavy central lines represent the estimated adjusted hazard ratios, with shaded ribbons denoting 95% confidence intervals. An RDW index of 14.6 was selected as the reference level, represented by the vertical dotted lines. The horizontal dotted lines represent a hazard ratio of 1.0. (a) Restricted cubic spline for 6-month mortality. (b) Restricted cubic spline for 1-year mortality. (c) Restricted cubic spline for 3-year mortality. Abbreviations: RDW, red cell distribution width; HR, hazard ratio; CI, confidence interval. https://doi.org/10.1371/journal.pone.0302414.g004