Analysis of breast cancer survival in a northeastern Brazilian state based on prognostic factors: A retrospective cohort study

Breast cancer is a major health problem worldwide. Analysis of breast cancer epidemiology in emerging countries enables assessment of prognostic factors, cancer care quality, and the equity of resource distribution. We aimed to estimate the overall (OS) and cancer-specific survival (SS) of breast cancer patients in the northeastern Brazilian state of Sergipe and to identify independent prognostic factors. We analyzed a cohort for the factors age at diagnosis, place of residence, time to treatment, staging, and molecular classification, using the Kaplan–Meier method, log-rank test, Pearson’s chi-squared test and Cox regression model. The outcome was the vital status at the end of the study. Our analysis showed an OS probability of 0.72 and an SS probability of 0.75. In multivariate analysis, time to treatment within 60 days, stage IV, and triple-negative classification remained independent prognostic factors for both OS [unadjusted hazard ratio (HRp) 1.50 (1.21; 1.86), HRp 16.56 (8.35; 32.85), and HRp 2.73 (1.73; 4.29), respectively] and SS [HRp 1.43 (1.13; 1.81), HRp 20.53 (9.45; 44.56), and HRp 3.14 (1.88; 5.26), respectively]. Better survival was demonstrated for the following patients: those receiving their first treatment after 60 days, with an OS of 52.5 months (51.2; 53.8) and SS of 53.5 months (52.3; 54.7); stage I patients, with an OS of 58.8 months (57.7; 60.0) and SS of 59.2 months (58.1; 60.3); patients without nodal metastasis, with an OS of 54.2 months (53.0; 55.4) and SS of 55.6 months (54.5; 56.7); and patients with luminal A classification, with an OS of 56.8 months (55.0; 58.5) and SS of 57.8 months (56.2; 59.4). This study identified independent prognostic factors and showed that OS and SS were lower for patients from Sergipe than for patients in high-income areas. Therefore, determining the profiles of breast cancer patients in this population will inform specific cancer care.

Introduction Breast cancer is a major public health problem because it has high incidence rates with consequent high morbidity and mortality. It is the second most common cancer in women after nonmelanoma skin cancer worldwide, with over two million new cases annually [1][2][3]. Despite early detection resulting in favorable prognosis, breast cancer is still the leading cause of cancer death among women, especially in economically deprived regions [1,4,5].
In some high-income countries, breast cancer incidence and mortality rates have steadily increased, while in others these rates have decreased [6]. However, in low- and middle-income countries, breast cancer incidence and mortality rates have consistently increased [3,7,8]. According to estimates from the Brazilian National Cancer Institute, breast cancer was the most common type of cancer in women (excluding nonmelanoma skin cancer) in 2020, with over 66,000 cases and a mean age-standardized rate (ASR) of 43.74 per 100,000 women. In Northeast Brazil, an economically deprived region, the mean ASR was 43.74 per 100,000. In Sergipe, a northeastern Brazilian state, the mean ASR was 44.27 per 100,000 [3]. The Brazilian Mortality Information System recorded 16,593 deaths from breast cancer in 2019, and the North and Northeast regions displayed the highest rates [9].
Cancer statistics can reveal the effectiveness of public health policies, the equity of resource distribution, and the impact of predictive factors on survival. Therefore, assessing breast cancer survival can help establish criteria for the objective evaluation of patient prognosis and can contribute to the improvement of cancer control strategies [10].
In Brazil, factors such as study design, calendar year, and region and population studied might explain differences in survival [11][12][13]. Several other factors, such as staging, age at diagnosis, time from diagnosis to treatment, race, histology, and socioeconomic status, may also play a role; however, their contributions are still uncertain [13][14][15][16][17][18]. It is also noteworthy that breast neoplasms of similar histological subtype can present different outcomes, which may be consistent with the molecular subtype [18].
Based on these assumptions and considering that survival studies in economically deprived regions are scarce, the present study aimed to identify independent prognostic factors for breast cancer survival in the northeastern Brazilian state of Sergipe and to assess how they influenced survival in the study population.

Materials and methods
We collected data from a retrospective hospital cohort of breast cancer patients treated in the main cancer facility in the state of Sergipe, Brazil from 2010 to 2014. Patients were followed up for at least 60 months.
We used the hospital-based cancer registry (HCR) database to retrieve information from women with invasive breast cancer diagnosed at either the facility or elsewhere. This facility is the largest referral center for the study population; therefore, a considerable number of advanced cases are registered for management. The HCR personnel input demographic, tumor, and treatment variables into the HCR information system, using data obtained solely from medical records.
To define the vital status and to collect information concerning the date and underlying cause of death, the HCR was used to search the Brazilian Mortality Information System of the Ministry of Health. To complement information concerning death, the HCR was used to access the following databases: 1) the National Deceased Registry (CNF Brazil), 2) the Federal Revenue Service of Brazil, 3) the Brazilian Electoral System, and 4) the National Health Registry.
The Authorization for Outpatient Procedures database was accessed for additional information concerning staging, molecular classification, hormone therapy, chemotherapy, and radiotherapy. The Authorization for Hospital Admissions database of the Ministry of Health provided information on clinical and surgical admissions.
The variables used were as defined on the standard tumor registry form [19]. For age at diagnosis, we employed age groups according to the hormonal phases (<45 years, 45 to 54 years, 55 to 64 years, and ≥65 years). We selected other variables, such as place of residence (whether from the capital or countryside/outside), time from diagnosis to treatment, staging [20], histological type, molecular classification (defined as luminal A, luminal B, HER2/neu-enhanced, or triple-negative after immunohistochemical profiling and fluorescence in situ hybridization (FISH) test whenever necessary), and vital status. The time to treatment was set at ≤60 days and >60 days to conform to Law No. 12.732/2012 [21]. Missing data were considered confounding variables that might influence survival estimates.

Statistical analysis
We used the Kaplan-Meier method to estimate the survival probability of the cohort and then calculated OS and SS for each time interval as the number of women surviving divided by the total number at risk.
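The product-limit calculation described above can be sketched in a few lines. The following Python snippet is only an illustration (the analyses in this study were performed in R); the function name `kaplan_meier` and the toy cohort are hypothetical and are not study data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times  -- follow-up time in months for each patient
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # deaths and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # survivors of this interval divided by the number at risk
            survival *= (at_risk - d) / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy cohort of eight patients (NOT data from this study).
toy_times = [6, 12, 12, 20, 30, 45, 60, 60]
toy_events = [1, 1, 0, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"{t:>3} months: S(t) = {s:.3f}")
```

Censored patients contribute to the number at risk until their last follow-up but never count as deaths, which is why the estimate differs from a crude proportion of survivors.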
To appraise differences among survival distributions, we applied the log-rank test and checked whether any factor would influence the time to event. We used the Bonferroni method to compensate for the effect of multiple comparisons: it corrected the log-rank test results for comparisons among several subgroups of variables and controlled significance levels by adjusting P values.
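As a minimal sketch of the Bonferroni adjustment, each P value is multiplied by the number of comparisons and capped at 1. The function name `bonferroni` and the example P values are hypothetical; they are not results from this study.

```python
def bonferroni(p_values):
    """Bonferroni-adjust a list of P values for multiple comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical P values from three pairwise log-rank comparisons.
raw = [0.004, 0.020, 0.300]
for p, p_adj in zip(raw, bonferroni(raw)):
    print(f"raw P = {p:.3f} -> adjusted P = {p_adj:.3f}")
```

With three comparisons, a raw P value of 0.020 is no longer significant at the 5% level after adjustment, which illustrates how the correction controls the overall significance level.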
To evaluate the effect of multiple independent variables and the burden that some prognostic factors may impose upon the outcome, we used the Cox proportional hazards model. The independent variables were evaluated first by univariate analysis and then by multivariate analysis, identifying hazard ratios, adjusted (HR) and unadjusted (HRp), and 95% confidence intervals. To test the proportional hazards assumption, we employed the method based on scaled Schoenfeld residuals.
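The relation between a Cox regression coefficient and a reported hazard ratio with its 95% confidence interval can be illustrated as follows. This sketch assumes a Wald-type interval, exp(beta ± z·SE), which is the usual default but is not stated in the text; the helper names are hypothetical. It back-calculates the standard error from the interval reported in the abstract for time to treatment (1.50; 1.21 to 1.86) and then rebuilds the interval.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper, z=Z_95):
    """Standard error of log(HR) implied by a Wald-type interval (assumption)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def hazard_ratio_ci(beta, se, z=Z_95):
    """Hazard ratio and Wald interval from a Cox coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Values taken from the abstract: 1.50 (1.21; 1.86) for overall survival.
beta = math.log(1.50)
se = se_from_ci(1.21, 1.86)
hr, lower, upper = hazard_ratio_ci(beta, se)
print(f"beta = {beta:.3f}, SE = {se:.3f}")
print(f"HR = {hr:.2f} ({lower:.2f}; {upper:.2f})")
```

The interval is symmetric on the log scale, so the reported point estimate sits at the geometric mean of the two limits.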
Pearson's chi-squared test was used to analyze the differences in proportions between the categorized variables at a 5% significance level. The backward selection method selected variables that would fit the tests. We used R (R Core Team, 2020) to perform all the analyses.
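For readers unfamiliar with the test, Pearson's chi-squared statistic compares observed counts with the counts expected if the two variables were independent. The sketch below is illustrative Python, not the R code used for the study, and the 2 × 2 table is hypothetical.

```python
def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts (NOT study data): rows = two patient groups,
# columns = died / alive at the end of follow-up.
toy_table = [[30, 70], [50, 50]]
stat, dof = pearson_chi2(toy_table)
CRITICAL_5PCT_DF1 = 3.841  # chi-squared critical value, 1 degree of freedom
print(f"chi2 = {stat:.3f}, df = {dof}, significant at 5%: {stat > CRITICAL_5PCT_DF1}")
```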

Ethical considerations
The Research Ethics Committee of the Federal University of Sergipe approved this research. We conducted all methods in accordance with relevant guidelines and regulations. As patient databases remained anonymized, obtaining informed consent was not possible. Consequently, as specified in Resolution number 466, December 12, 2012, of the Ministry of Health of Brazil, the ethics committee granted exemption from the necessity for informed consent. In addition, all data remained confidential to be used exclusively for scientific purposes.

Results
We included 1,278 women with invasive breast cancer in this analysis. Of these, 966 were alive and 312 had died by the end of follow-up. Considering place of residence, 60.7% of the patients lived in the countryside. The median age at diagnosis was 55 years and patients were distributed similarly among the age groups. Invasive ductal carcinomas were the most frequent breast neoplasm subtype (90.1%). A high number of patients (47.6%) had their first treatment more than 60 days after diagnosis. Stage II was the most frequent stage (32.2%); however, this information was missing from the medical records of 22.9% of cases. The largest group of patients had no lymph node involvement (42.4%) but, again, the number of missing data points was high (32.4%). The most frequent molecular subtype was luminal B (30.6%). It should be noted that Ki67 was not stained for in 27.5% of cases, preventing determination as either luminal A or B; therefore, these cases were considered luminal X (Table 1).
Less favorable survival estimates were produced for patients who had their first treatment within 60 days, with an OS of 48 months (95% CI 46.3; 49.6) and SS of 50.1 months (95% CI  Table 2). Even though the Schoenfeld test rejects the hypothesis of hazard proportionality, S1 and S2 Figs show that hazards remain fairly constant throughout the follow-up period, except for the cancer-specific survival variables. Thus, to explain non-proportionality, time-dependent variables were presented (Table 3).

Discussion
The present study demonstrated that time to treatment, staging, and molecular classification of HER2 significantly impacted OS and SS in the univariate (unadjusted) analysis. In the multivariate (adjusted) analysis, time to treatment within 60 days, stage IV, and triple-negative classification remained independent prognostic factors. The survival estimates observed in the study were lower than those found in some affluent areas of Brazil [22][23][24], as well as in high-income countries [25,26] and China (89.4%) [27].
Patients in this study under the age of 40 years had a lower OS and SS than older patients. Some studies report that patients under 40 years of age usually present unfavorable prognostic characteristics, usually associated with advanced staging, HER2 overexpression and nodal metastasis [28]. Nixon et al. (1994) reported that women under 35 years of age had poor tumor differentiation, lymphatic involvement, necrosis, and estrogen receptor negativity; consequently, they had more recurrences and distant metastases [29]. In contrast, older women present less aggressive features but have several comorbidities that, when associated with advanced stages, might contribute negatively to survival [13], although this was not shown in our data.

PLOS ONE
An intriguing finding was that time to treatment ≤ 60 days can be considered as a prognostic factor. While reanalyzing our data, we assumed that this was because of factors such as younger age and more advanced and triple-negative tumors; however, it remained an independent prognostic factor after multivariate analysis. It is possible that improvement in health care plays a role in this finding. The difficulty in accessing diagnosis and treatment in low- and middle-income countries has to be overcome because better cancer survival is a consequence of early diagnosis and timely treatment [30,31].
Advanced staging remained an independent prognostic factor after multivariate analysis, and it became quite clear that both OS and SS decreased as staging progressed. Bulky tumors directly interfere with the quality of life of patients with breast cancer. Conversely, patients presenting at early stages undergo less aggressive modalities of treatment and face a lower risk of death [32]. In the present study, stage IV indicated an increased risk of death, as also determined by Höfelman et al. (2014) [32] and Fayer (2014) [33]. We were cautious in our analysis because of the high percentage of missing information, mainly in staging, which might have influenced the results. The group with missing data on staging was presented as an independent factor in the multivariate analysis, as also performed by Basílio (2011) [28] and Brito, Portela and Vasconcellos (2009) [34]. Ayala et al. (2019) warned that failure to register this information, especially in the Breast Cancer Information System (SISMAMA), would compromise data monitoring [13].
The different survival probability estimates for different molecular classifications might require different considerations. The different survival probabilities between luminal A and luminal X might be caused by the portion of HER2-overexpressing tumors that were not detected. In addition, missing data (approximately 10%) were shown to be an independent prognostic factor in multivariate analysis, also denoting a confounding factor for survival. Some studies report that a lack of this information may be associated with difficulty in accessing adequate diagnosis and treatment and may be correlated with social status [33,34]. Apart from that, triple-negative classification was a clear independent prognostic factor. Al-Thoubaity (2020) reported that HER2 overexpression and triple-negative features were the most frequently observed; they were associated with an early surge in young women, usually harboring bulky tumors and lymph node metastases [35].
In the present study, the most common histology type was invasive ductal carcinoma, while invasive lobular carcinoma comprised only a small part of the cohort, which was similar to the findings of Basílio (2011) [28]. In our study, lobular carcinomas indicated worse prognosis, which was also observed by others [36][37][38][39][40][41].
The independent prognostic factors identified in our multivariate analyses were in agreement with hospital-based studies [33,42]. Among these, lymph node metastasis only impacted OS and SS in univariate analysis. Some studies estimated that it indicated an increased risk of death of four to eight times [30,43].
Some limitations of this research should be considered, such as the use of retrospective secondary data from medical records without controls and a lack of standardization of the information in pathological reports. They might have interfered with the accuracy of the presented results.
Despite these limitations, we have determined prognostic factors and estimated the survival probabilities of cancer patients in a northeastern Brazilian state. The outcomes observed indicate the need to improve the cancer care system in this region. The data obtained support the implementation of targeted strategies to improve breast cancer survival irrespective of socioeconomic and cultural background, with the ultimate aim of healthcare equity.

Conclusions
In summary, our results indicate that independent prognostic factors, such as time to treatment ≤ 60 days, advanced stage, and triple-negative molecular classification, significantly impact OS and SS. In addition, a lack of information, such as staging and molecular classification, may compromise survival analysis and, consequently, jeopardize cancer care actions.
Estimating OS and cancer-specific survival provided a better understanding of the profile of breast cancer patients treated in the state of Sergipe. This emphasizes the need for specific health policies to improve access to cancer facilities for early diagnosis and timely treatment.