Temporal trends in breast cancer survival by race and ethnicity: A population-based cohort study

Introduction Differences in breast cancer survival by race and ethnicity are often assumed to be a fairly recent phenomenon, and are hypothesized to have arisen due to gaps in receipt of screening or therapy. The timing of the emergence of these differences has implications for identifying their origin. We sought to determine whether breast cancer survival differences by race or ethnicity arose in tandem with the advent of screening or therapeutic advances.
Materials and methods A cohort of women diagnosed with invasive breast cancer from 1975–2009 in 18 population-based registries was followed for five-year breast cancer cause-specific survival. Differences in survival according to race/ethnicity and estrogen receptor status were quantified in Cox proportional hazards models, with estimation of hazard ratios (HR), 95% confidence intervals (CI), and absolute risk differences. For 2010 diagnoses, we also assessed differences in survival by breast cancer subtypes defined by hormone receptor and Her2/neu status.
Results Among over 930,000 women, differences in five-year breast cancer-specific survival by race were already apparent among 1975–1979 diagnoses and continued to be evident, with stronger disparities apparent for Black vs. White Non-Hispanic (WNH) women and for estrogen receptor-positive vs. negative disease. Within breast cancer subtype, all included race/ethnic groups experienced disparate survival in comparison with WNH women for triple-negative disease. Black women had a consistent gap in absolute survival of .10-.12, compared with WNH women, from 1975–1979 through all included time periods, such that 5-year survival of Black women diagnosed in 2005–09 lagged more than 20 years behind that of WNH women.
Discussion Survival differed by race for diagnoses that predate the introduction of mammographic screening and most therapeutic advances. Absolute differences in survival by race and ethnicity have remained almost constant over 40 years of observation, suggesting early origins for some contributors.


Introduction
Breast cancer survival disparities by race and ethnicity have retrospectively been identified using cancer registry and mortality data, with differences in outcome generally believed to have started in the mid-to-late 1980s [1]. Investigators have also sought to understand whether specific events, including the advent of adjuvant endocrine therapy (1990) or taxane chemotherapy (1999) [2], have altered outcomes by race in specific eras, but distinct differences by calendar time have not been apparent. Thus, the emergence of survival differences by race or ethnicity has been attributed in part to increased diffusion of screening mammography since the early 1980s, and to changes in availability and use of adjuvant therapy that began just prior to the 1990s. As our understanding of the contribution of these sources has grown [2][3][4][5], they have accounted for 6% (treatment accounted for 0.8% of the 12.9% difference in mortality between whites and blacks) [2] to 25% (screening accounted for 8% and treatment accounted for up to 19% of the mortality difference between whites and blacks) [3] of the increased mortality, while the largest proportion of survival disparities increasingly appears to be attributable to a different source, tumor characteristics at diagnosis (24-66%) [2,5]. However, as the evaluation of survival disparities has occurred primarily in more recent data, earlier data that might hold clues to the origin of survival differences generally have not been explored. Such data may provide insight regarding ultimate sources of outcome divergence if survival disparities arise prior to or in tandem with initiation of new cancer diagnostics or therapeutics.
As early data regarding survival outcomes are an untapped resource, we evaluated temporal patterns in breast cancer survival disparities, comparing three race/ethnic groups with White Non-Hispanic women, in Surveillance, Epidemiology, and End Results (SEER) data beginning in 1975. We sought to identify temporal changes in survival differences, including in tumor marker subsets, that might yield fresh insights regarding their origin.

Study population
We conducted a retrospective analysis of a cohort of incident invasive female breast cancer cases diagnosed in 18 population-based sites of the National Cancer Institute-funded Surveillance, Epidemiology, and End Results (SEER) cancer registry system [6]. Eligible women were those with a first invasive breast cancer diagnosed from 1975-2010 who were of White, Black or American Indian/Alaska Native race (the latter includes Aleutian, Alaskan Native or Eskimo, and all indigenous populations of the Western hemisphere). Excluded were breast cancers identified only by autopsy or death certificate, those missing age at diagnosis, and those whose cause of death was unknown. White women of Hispanic ethnicity were included, but Black or American Indian/Alaska Native women of Hispanic ethnicity, who constituted a subgroup too small for analysis, and White women identified only by Hispanic surname were excluded.

Data collection
SEER utilizes certified tumor registrars to collect information regarding patient demographics, diagnosis, and first course of treatment (generally within four months of diagnosis). Included in this analysis is information ascertained from the medical chart regarding age, sex, race, Hispanic ethnicity, date of diagnosis, and tumor characteristics (tumor size, lymph node status, and metastasis information). Since 1990, status of the hormone receptors for estrogen and progesterone (positive, negative, borderline, unknown) has also been collected, and in 2010, the tumor marker Her2/neu was added. The three tumor markers allow tumors diagnosed in 2010 to be classified into four breast cancer subtypes: hormone receptor positive (HR+), Her2 negative (Her2-) (Luminal A-like); HR+, Her2 positive (Her2+) (Luminal B-like); HR-, Her2+ (Her2-overexpressing); and HR-, Her2- (Triple Negative). Hispanic ethnicity was determined by the North American Association of Central Cancer Registries (NAACCR) Hispanic Identification Algorithm (NHIA) as calculated by SEER. Invasive breast cancer cases were followed for breast cancer-specific death by linkage to local and state vital statistics registries, as well as the National Death Index. Vital status was confirmed by matching to the files of the Centers for Medicare and Medicaid Services. Women were followed until death, loss to follow-up, or December 31, 2014, the end of the study.
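The four-way subtype classification described above can be sketched as a simple lookup. This is an illustrative sketch only: the function name, the status strings, and the choice to leave borderline or unknown markers unclassified are assumptions made for the example, not the SEER coding scheme or the authors' code.

```python
from typing import Optional

# Subtype labels as named in the text, keyed by (HR status, Her2 status).
SUBTYPES = {
    ("positive", "negative"): "Luminal A-like",
    ("positive", "positive"): "Luminal B-like",
    ("negative", "positive"): "Her2-overexpressing",
    ("negative", "negative"): "Triple Negative",
}


def classify_subtype(hr_status: str, her2_status: str) -> Optional[str]:
    """Return the subtype label for a tumor, or None if it cannot be classified.

    Assumption: any status other than "positive" or "negative"
    (for example "borderline" or "unknown") leaves the tumor unclassified.
    """
    return SUBTYPES.get((hr_status.lower(), her2_status.lower()))


if __name__ == "__main__":
    print(classify_subtype("positive", "negative"))
    print(classify_subtype("negative", "negative"))
    print(classify_subtype("borderline", "positive"))
```

Keeping the mapping in a single dictionary makes the four categories easy to audit against the definitions in the text.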

Statistical analysis
Cox proportional hazards models were used to calculate five-year breast cancer-specific survival, with women dying of breast cancer-specific causes within that time period considered "events" and all others censored. Hazard ratios (HR) and 95% confidence intervals (CI) were estimated with an alpha level of .05. Each five-year diagnosis group (e.g., 1990-1994) was followed for a full 60 months (through 1995-1999), while breast cancer subtype diagnoses in 2010 were followed for up to 60 months (median 51 mo) through the end of 2014. All analyses were adjusted for age in ten-year age groups, and for SEER registry start date. Nine registries initiated data collection in 1973-75, four in 1992, and five in 2000. Interaction terms between registry group and race/ethnicity were fit by including the main effects and their product (race/ethnicity × registry group) to determine whether hazard ratios by race differed by registry group, which would suggest that addition of data from new registries may have altered HRs. Utilizing adjusted 5-year breast cancer-specific survival proportions from Cox models, absolute differences in survival were estimated, and trends with time were quantified using linear regression. Differences in trend by race/ethnicity were evaluated using generalized linear models (GLM). Analyses of estrogen receptor (ER) or progesterone receptor (PR) marker status were restricted to those of known status who were diagnosed in 1990 or later, and breast cancer subtypes to 2010 diagnoses only. The proportional hazards assumption was evaluated using cumulative sums of Martingale residuals. All analyses were conducted using the SEER Research Data 1975-2014 released in April 2017, based on the November 2016 submission [6]. These SEER research data contain race/ethnicity-specific data since inception (1975 diagnoses), and estrogen receptor data since the 1990 diagnosis year. All analyses were conducted in SAS (v. 9.4, Cary, N.C.).
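The event and censoring rule for five-year cause-specific survival can be sketched as follows. This is a minimal illustration and not the authors' SAS code: the function name and its arguments are hypothetical, and treating a breast cancer death at exactly 60 months as an event is an assumption.

```python
from typing import Tuple

FOLLOW_UP_MONTHS = 60  # five-year window used for each diagnosis group


def five_year_outcome(months_followed: float,
                      died_of_breast_cancer: bool) -> Tuple[float, int]:
    """Return (analysis time in months, event indicator) for one woman.

    A breast cancer death within the window is an event (1). Deaths from
    other causes, loss to follow-up, survival past 60 months, and breast
    cancer deaths after 60 months are all censored (0).
    """
    time = min(months_followed, FOLLOW_UP_MONTHS)
    event = int(died_of_breast_cancer and months_followed <= FOLLOW_UP_MONTHS)
    return time, event


if __name__ == "__main__":
    print(five_year_outcome(30, True))    # breast cancer death inside the window
    print(five_year_outcome(72, True))    # breast cancer death after the window
    print(five_year_outcome(18, False))   # other death or loss to follow-up
```

The resulting (time, event) pairs are the inputs a Cox model or any other survival estimator would consume.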
The data analyses were reviewed and approved by the Institutional Review Board of the University of New Mexico Health Sciences Center.

Results
Black women had an increased risk of breast cancer-specific mortality, relative to White Non-Hispanic women, beginning in the earliest included time period (1.6-fold) (Table 2). Black women continued to have elevated breast cancer-specific mortality compared to White Non-Hispanic women for every diagnosis period through the end of follow-up (2.4-fold).
Hispanic women had a significantly elevated risk of breast cancer-specific mortality beginning with diagnoses in 1990-1994 (1.2-fold), relative to White Non-Hispanic women, and risk remained elevated through 2005-2009 diagnoses (Table 2). While the hazard ratio for mortality appeared to peak in 1995-1999, the hazard ratios for that period and subsequent five-year time periods did not statistically differ from one another. For American Indian/Alaska Native women, breast cancer-specific mortality was elevated 1.4-fold compared to White Non-Hispanic women for diagnoses in 1980-1984, and again for 1990-1994 diagnoses; thereafter, American Indian/Alaska Native women continued to have an elevated hazard ratio through the end of follow-up in 2014 (2005-2009 diagnoses) (Table 2). Similar to Hispanic women, hazard ratios for mortality peaked in 1995-1999 then declined slightly, although more recent hazard ratio estimates did not differ significantly from those in the 1995-1999 diagnosis years.
Five-year breast cancer-specific absolute survival estimates increased with time for all race/ethnic groups (Fig 1). Notably, an absolute difference of approximately 12 percentage points between Black and White Non-Hispanic women persisted throughout most of the observation period, such that Black women diagnosed in 2005-2009 (.80) (most recent period with 5-year survival) had attained the survival experienced by White Non-Hispanic women after 1980-84 diagnoses (an over 20-year lag, or 22 years when modeled by exact year). Similarly, Hispanic (11 years) and American Indian/Alaska Native (15 years) absolute survival lagged by over a decade behind that of White Non-Hispanic women.
Collection of estrogen receptor (ER) information began with 1990 diagnoses (Table 3). Among ER positive women, Black, Hispanic and American Indian/Alaskan Native women all had an increased risk of dying of breast cancer-related causes (1.2-2.4-fold) relative to White Non-Hispanic women in 1990-1994 diagnoses, which continued through end of follow-up.
Five-year breast cancer-specific survival estimates ranged from .80 (Black women) in 1990-1994 to .92 (White Non-Hispanic women) in 2005-2009, with race/ethnic group survival estimates generally differing by 4 to 8 percentage points from White Non-Hispanics in each time period (Fig 2). Women with ER positive disease appeared to have a greater breast cancer survival disparity than their counterparts with ER negative disease (p-value for interaction = .0001 for Black, .006 for Hispanic, and .32 for American Indian/Alaska Native, relative to White Non-Hispanic women).
Among women with ER-negative tumors, breast cancer-specific survival was also lower among Hispanic, Black, and American Indian/Alaskan Natives in all time periods compared to White Non-Hispanics (Table 3). Breast cancer-specific survival estimates for 5-year . Absolute differences in survival proportion exhibited more variability among women with ER negative disease, reflecting the smaller sample size available despite the inclusion of five-year diagnosis groups, with differences between White Non-Hispanic and other race/ethnic groups ranging from 2 to 11 percentage points by time periods (Fig 3).
The proportional hazards assumption did not hold for a few diagnosis periods or race/ethnic subgroups, either overall or within ER-positive or ER-negative analyses. Specifically, violation of the assumption only occurred in 1990 or later diagnoses, when the initiation of data collection in new SEER registries caused some subgroups to double in size, thus facilitating the detection of seemingly small quantitative differences. Survival results are split into 1-24 and 25-60 months post-diagnosis for comparison (Table 4; assumption violations in bold), although a difference across only a few months sometimes appeared to prompt the violation. Absolute risk estimates for 5-year breast cancer-specific survival do not require the proportional hazards assumption.
To determine whether addition of data from new SEER registries influenced these results, an interaction term for grouped registry start date (1973-1975, 1992, 2000) and race/ethnicity was fit. For 1995-1999 and later years, survival was significantly greater than expected on a multiplicative scale for American Indian/Alaska Natives in registries that initiated data collection in 1992, suggesting that addition of new SEER registries may have increased survival estimates for American Indian/Alaska Native women (data not shown).
Table 4. Comparison of hazard ratio estimates for breast cancer survival in months 1-24 and 25-60 post diagnosis. Relationships that failed to meet the proportional hazards assumption are denoted in bold. All others (non-bold) are provided for completeness.
Five-year survival by race/ethnicity increased consistently throughout the observation period for White Non-Hispanic women (linear slope = .023; p-value for trend .001) (a slope of .02 suggests an increase in survival such as from .85 to .87 for each successive 5-year diagnosis period), Hispanic women (slope = .021; p-trend < .0001), Black women (slope = .026; p-trend < .0001), and American Indian/Alaska Native women (slope = .028; p-trend = .004). Slope did not differ by race/ethnic group (p-value = .68). Over the entire period of observation, 5-year survival for White Non-Hispanic and Hispanic women each increased by .13 (from .74 to .87 for Hispanic women, for instance), while American Indian/Alaska Native women gained .14 (from .72 to .86) and Black women gained .15 (from .65 to .80). ER-specific slopes were not estimated because statistically stable estimates required more than four time points.
Women diagnosed in 2010 and followed through the end of 2014 offered an opportunity to evaluate five-year breast cancer-specific survival by breast cancer subtype (Table 5). Black and Hispanic women with Luminal A or Luminal B-like disease experienced increased mortality (1.3-2.1-fold), relative to White Non-Hispanic women with the same subtypes. Among women with Her2-overexpressing tumors, only Black women had a greater risk of dying of breast cancer-related causes (1.4-fold) than White Non-Hispanic women. Black, Hispanic and American Indian/Alaska Native women with triple negative tumors all were more likely to die of their disease (1.3-2.3-fold) than White Non-Hispanic women with that subtype.
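The overall gains in 5-year survival reported for the full observation period can be cross-checked against the fitted slopes with simple arithmetic. The sketch below uses only the first and last survival proportions quoted in the text. It assumes seven consecutive five-year diagnosis periods from 1975-1979 through 2005-2009 (the 2000-2004 period is inferred, not quoted), and it computes an endpoint-based average, which is not the same as the regression slope the authors fit across all periods.

```python
# Assumed list of five-year diagnosis periods (seven periods, six intervals).
PERIODS = ["1975-1979", "1980-1984", "1985-1989", "1990-1994",
           "1995-1999", "2000-2004", "2005-2009"]

# First and last 5-year survival proportions quoted in the text.
ENDPOINTS = {
    "Hispanic": (0.74, 0.87),
    "American Indian/Alaska Native": (0.72, 0.86),
    "Black": (0.65, 0.80),
}


def mean_gain_per_period(first: float, last: float,
                         n_periods: int = len(PERIODS)) -> float:
    """Average change in survival per period, from the two endpoints only."""
    return (last - first) / (n_periods - 1)


if __name__ == "__main__":
    for group, (first, last) in ENDPOINTS.items():
        gain = last - first
        per_period = mean_gain_per_period(first, last)
        print(f"{group}: total gain {gain:.2f}, mean per period {per_period:.3f}")
```

The endpoint averages fall near .02 per period, in the same range as the reported slopes. Exact agreement is not expected because a regression slope depends on every period, not just the first and last.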

Discussion
Our findings provide a broad perspective on the development of breast cancer survival disparities by race/ethnicity. Our results suggest that gaps in survival by race are present even in the earliest available SEER data (1975-1979), and they illustrate that absolute increases over the 1975-2014 period were similar for each group of women, but the baseline survival (1975-1979) differed substantially, suggesting the nature of the gap that needs to be closed. Our findings should be evaluated with consideration of the strengths and limitations of the study. SEER registry staff collect race and ethnicity as indicated in medical records, and do not have independent means to verify recorded information. Similarly, while tumor marker classification may be derived from pathology reports, some may be abstracted from medical charts. The definition of ER and PR positivity changed from 10% to 1% positive staining in 2010 [7], and retrospective information that would allow results to be standardized to the current definition is not available. In addition, despite the inclusion of data from 18 SEER registries and 40 years of diagnoses, sample sizes were limited for some estimates, including for American Indian/Alaska Natives and rarer breast cancer subtypes.
When examined across 5-year diagnosis periods, breast cancer survival estimates by race/ethnicity present several contrasting observations. Since 1990-1994 diagnoses, gaps in survival on the hazard ratio scale have become evident for Hispanic and American Indian/Alaska Native women. One potential explanation is increasingly accurate reporting of race and ethnicity with time. Access to advances in healthcare may also be implicated. However, in 1992, additional SEER registries initiated data collection, more than doubling the number of breast cancer cases in those race/ethnic groups compared to 1985-89 (Table 2). While our analyses suggest that hazard ratios are homogeneous in both registry groups, the additional cases may have allowed detection of smaller but statistically significant differences.
Although the relative hazard estimates suggest an increase over time in survival disparities for Black women, an absolute risk difference of 10 to 12 percentage points in comparison with White Non-Hispanic women has been fairly constant for each 5-year observation period. Hazard ratio increases are apparent because the 12-point gap is measured against a reference group (White Non-Hispanic women) whose 5-year survival has risen to over 90% with time. While newly arisen factors may have contributed to the 12-point gap to a varying extent over the observation period, rather than a persistent contribution from baseline factors present in 1975-1979, the study data cannot distinguish between these hypotheses. The overall improvements in breast cancer-specific mortality, coupled with a lag, possibly of over 20 years in length, in attaining equivalent absolute survival for some groups, present a distinct challenge to policymakers, health care providers, and affected women.
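The way a constant absolute gap can produce a rising hazard ratio can be illustrated with a short calculation. Under proportional hazards, the survival of a comparison group equals the reference survival raised to the power HR, so HR = ln(S_group) / ln(S_reference). The sketch below uses the 5-year survival proportions for Black women quoted in the text (.65 and .80). The reference values are an assumption: they are obtained by adding an illustrative constant gap of .12, and they are not taken from the paper's tables. The resulting ratios are unadjusted illustrations and are not expected to match the adjusted hazard ratios reported in Table 2.

```python
import math

ASSUMED_GAP = 0.12  # illustrative constant absolute gap (assumption)

# 5-year survival for Black women in the first and last periods (from the text).
BLACK_SURVIVAL = {"1975-1979": 0.65, "2005-2009": 0.80}


def implied_hazard_ratio(surv_group: float, surv_reference: float) -> float:
    """Hazard ratio implied by two survival proportions at the same time point.

    Under proportional hazards, S_group = S_reference ** HR, which gives
    HR = ln(S_group) / ln(S_reference).
    """
    return math.log(surv_group) / math.log(surv_reference)


if __name__ == "__main__":
    for period, s_black in BLACK_SURVIVAL.items():
        s_ref = s_black + ASSUMED_GAP  # hypothetical reference survival
        hr = implied_hazard_ratio(s_black, s_ref)
        print(f"{period}: group {s_black:.2f}, reference {s_ref:.2f}, "
              f"implied HR {hr:.2f}")
```

With the same .12 gap in both periods, the implied ratio is larger in the later period, because the hazard in the reference group shrinks as its survival approaches 1.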
ER positive disease currently allows a wider range of treatment options and confers a more favorable prognosis. Thus, the findings raise questions regarding the source of the higher HR for race/ethnic disparities evident in less aggressive disease. Our results are consistent with those in other investigations [5,8-11], including several in which survival differences by race/ethnicity (on a multiplicative scale) are stronger among women with ER-positive disease. While endocrine therapy adherence for five or more years may only be attained by a fraction of those treated [12,13] and thus would constitute a compelling explanation for poor survival in this subgroup, Black or Hispanic women have had reduced adherence to such therapy in some previous studies [14,15] but adherence equivalent to White Non-Hispanic women in others [16].
Breast cancer subtype-specific survival information should further inform efforts to close survival gaps. Her2-overexpressing disease, not previously identified as subject to differential outcomes by race/ethnicity, may warrant attention directed to appropriate therapy receipt and adherence as in other subtypes. In several studies, women from racial/ethnic minority groups have had poorer outcomes despite either receipt of guideline-based treatments [17], or standard of care in clinical trials [8,18,19]. Thus, hitherto unrecognized differences in tumor biology [20] by subgroup may be driving some of the disparity in prognosis. Our results hint that a multi-pronged approach, uniting access to appropriate care and support for adherence with more precise tumor-specific therapeutic options, may pave the way to narrow and ultimately eliminate survival differences.
Supporting information S1 Fig. Flow