Comparison of Pathologic Response Evaluation Systems after Anthracycline with/without Taxane-Based Neoadjuvant Chemotherapy among Different Subtypes of Breast Cancers

Purpose: Several methods are used to assess the pathologic response of breast cancer after neoadjuvant chemotherapy (NAC) to predict clinical outcome. However, the clinical utility of these systems for each molecular subtype of breast cancer is unclear. Therefore, we applied six pathologic response assessment systems to specific subtypes of breast cancer and compared the results.
Patients and Methods: Five hundred and eighty-eight breast cancer patients treated with anthracycline with/without taxane-based NAC were retrospectively analyzed, and the ypTNM stage, residual cancer burden (RCB), residual disease in breast and nodes (RDBN), tumor response ratio, Sataloff's classification, and Miller-Payne grading system were evaluated. The results obtained for each assessment system were analyzed in terms of patient survival.
Results: In triple-negative tumors, all systems were significantly associated with disease-free survival and Kaplan-Meier survival curves for disease-free survival were clearly separated by all assessment methods. For HR+/HER2- tumors, systems assessing the residual tumor (ypTNM stage, RCB, and RDBN) had prognostic significance. However, for HER2+ tumors, the association between patient survival and the pathologic response assessment results varied according to the system used, and none resulted in distinct Kaplan-Meier curves.
Conclusion: Most of the currently available pathologic assessment systems used after anthracycline with/without taxane-based NAC effectively classified triple-negative breast cancers into groups showing different prognoses. The pathologic assessment systems evaluating residual tumors only also had prognostic significance in HR+/HER2- tumors. However, new assessment methods are required to effectively evaluate the pathologic response of HR+/HER2+ and HR-/HER2+ tumors to anthracycline with/without taxane-based NAC.


Introduction
Neoadjuvant chemotherapy (NAC) is often used to treat three categories of patient: those with locally advanced breast cancer; those with operable breast cancer who are not candidates for breast-conserving surgery; and those with proven lymph node metastases [1,2]. NAC induces a spectrum of morphologic changes in tumors and lymph nodes, including the complete disappearance of invasive cancer cells (pathologic complete response [pCR]), partial tumor regression, no response, or progressive tumor growth during treatment [3][4][5]. The pCR rate varies according to the molecular subtype of breast cancer and the therapeutic regimen [6,7], and correlates well with prolonged survival [7,8]. However, the majority of post-NAC breast cancer cases show residual tumor in the tumor bed.
Several pathologic response evaluation systems for residual cancer have been proposed. These evaluation systems can be roughly divided into two categories: absolute assessment of the residual tumor and relative assessment of the treatment response (comparing the cellularity or tumor size of post-NAC specimens with those of pre-NAC specimens or images) [9][10][11][12][13][14]. Parameters such as ypTNM stage, residual disease in breast and nodes (RDBN), and residual cancer burden (RCB) evaluate only residual tumor in the breast parenchyma and lymph nodes [6,13,15]. Conversely, Miller-Payne grading and Sataloff's classification compare the size and cellularity of the pre- and post-NAC tumor [9,10]. The recently developed tumor response ratio (TRR) compares tumor size on pre-NAC images with the microscopic tumor size after NAC [14]. Each evaluation system predicts survival outcome for breast cancer patients. Recent studies compared several of these classification systems and found that they yielded different predictive values [16,17]. However, no standardized and/or superior pathologic response evaluation system exists at the present time.
Breast cancers can be classified using immunohistochemistry-based approaches, the results of which correlate well with the molecular subtypes determined by microarray-based analyses of intrinsic gene expression [18]. For example, the luminal A subtype is estrogen receptor (ER)-positive, progesterone receptor (PR)-positive, and human epidermal growth factor receptor 2 (HER2)-negative (ER+/PR+/HER2-); the luminal B subtype is ER+/PR+/HER2+; the HER2-positive subtype is ER-/PR-/HER2+; and the triple-negative subtype is ER-/PR-/HER2-. These molecular classifications have some prognostic value [19,20]. Previously, we revealed that each subtype of breast cancer shows intrinsic morphologic differences and characteristic pathologic response patterns to anthracycline and taxane-based NAC [21]. Triple-negative tumors frequently presented as a single mass on pre-NAC MRI analyses, and pre-NAC biopsy specimens showed high overall and invasive cancer cellularity. Hormone receptor-negative (HR-) tumors showed higher nuclear and histologic grades, and denser lymphocytic infiltration, than HR+ tumors. The tumors within each subtype retained their morphologic features after NAC. For example, pushing margins, high grade, and high cellularity were observed in triple-negative breast cancers, whereas an infiltrative growth pattern and abundant in situ components were observed in HR+ subtypes. These differences might affect the classification of residual tumors according to different pathologic evaluation systems. Therefore, the most effective system for evaluating the NAC response might be different for each subtype of breast cancer. However, no studies have compared different pathologic evaluation systems for each subtype of breast cancer.
Therefore, the aims of this study were to compare pathologic response assessment systems and identify the one that is best for predicting outcome in patients with different subtypes of breast cancer.

Patients and treatments
In total, 588 female patients were diagnosed with primary breast cancer by core needle biopsy, and all underwent anthracycline with/without taxane-based NAC, followed by definitive surgical excision at Asan Medical Center (Korea) from 2010 to 2012. The patient group yielded 594 tumor specimens (the group included six cases of bilateral breast cancer). The NAC regimen, either anthracycline alone or anthracycline plus taxane, was determined according to the involvement of axillary lymph nodes. None of the patients received neoadjuvant trastuzumab. All patients underwent dynamic contrast-enhanced breast MRI before NAC to measure the number of masses and to determine tumor size.
Of the 588 patients included in the study, 147 (25%) received an anthracycline-based NAC regimen and 441 (75%) received an anthracycline and taxane-based NAC regimen. Anthracycline-based regimens included three to five cycles of 60 mg/m² adriamycin and 600 mg/m² cyclophosphamide. Anthracycline and taxane-based regimens included either four cycles of 75 mg/m² docetaxel plus 50 mg/m² adriamycin, or four cycles of 60 mg/m² adriamycin and 600 mg/m² cyclophosphamide followed by four cycles of 75 mg/m² docetaxel. Surgery was performed approximately 3-4 weeks after the final chemotherapy cycle. This study was conducted in compliance with the Declaration of Helsinki and approved by the Institutional Review Board of Asan Medical Center. The requirement for informed consent was waived.

Histologic evaluation
The entire tumor bed was submitted for pathologic evaluation. Pre-treatment biopsy and surgery specimens were histologically reviewed. The histologic grade was evaluated in the pre-NAC specimens; the overall pathologic cancer size (area of the primary tumor bed, including in situ carcinoma) and the size of the largest invasive cancer were evaluated in the post-NAC surgery specimens. Histologic type was defined according to the WHO criteria, and histologic grade was assessed using the modified Bloom-Richardson classification [22]. pCR was defined as the complete disappearance of invasive cancer cells from breast tissue and lymph nodes (ypT0/Tis, N0). The expression of ER, PR, and HER2 was examined in full sections that were immunostained at the time of diagnosis.
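The subtype grouping and the pCR definition used here can be written down as a short sketch. The function names and the string encoding of the ypT/ypN categories are hypothetical, chosen only for illustration. The rule that HR-positive means ER-positive and/or PR-positive is a common convention assumed here; the text does not state it.

```python
def subtype(er_positive: bool, pr_positive: bool, her2_positive: bool) -> str:
    """Group a tumor into one of the four subtypes analyzed in the study."""
    # Assumption (not stated in the text): HR+ means ER+ and/or PR+.
    hr_positive = er_positive or pr_positive
    if hr_positive:
        return "HR+/HER2+" if her2_positive else "HR+/HER2-"
    return "HR-/HER2+" if her2_positive else "triple-negative"


def is_pcr(ypt: str, ypn: str) -> bool:
    """pCR as defined in the text: no invasive cancer in breast or nodes."""
    # Residual in situ carcinoma alone (ypTis) still counts as pCR.
    # The "ypT0"/"ypTis"/"ypN0" strings are a hypothetical encoding.
    return ypt in {"ypT0", "ypTis"} and ypn == "ypN0"
```

For example, `subtype(False, False, True)` gives the HR-/HER2+ group, and `is_pcr("ypTis", "ypN0")` is true because only in situ disease remains.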

Statistical analysis
The results obtained for each assessment system were analyzed using the Kaplan-Meier method and time-dependent receiver operating characteristic (ROC) curves estimated using inverse probability of censoring weighting (IPCW) [24]. Comparisons between pairs of assessment systems were performed using the asymptotic Z test [25]. Kappa values were calculated after regrouping the classification categories of each system into four categories, numbered 1 to 4 (e.g., Miller-Payne grades 1 and 2 were combined to yield four category values rather than five). Kappa values were interpreted as poor (<0.20), fair (0.21-0.40), moderate (0.41-0.60), good (0.61-0.80), and very good (0.81-1.00). All statistical analyses were performed using SPSS software (version 18; SPSS Inc., Chicago, USA) and the R program (www.r-project.org). P<0.05 was considered significant.
... with/without taxane-based NAC. However, the response values for each cancer subtype showed a significantly different distribution (Table 2). Kappa values were calculated for each tumor subtype in an attempt to identify agreement among the various pathologic response evaluation systems (Table 3). ypTNM stage, RCB, and RDBN showed moderate to good agreement (kappa value, 0.401-0.791) for all subtypes, and ...
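The kappa agreement analysis described in the statistical methods can be illustrated with a minimal sketch. It assumes unweighted Cohen's kappa, since the text does not say whether a weighted variant was used. The regrouped Miller-Payne numbering, the upper-inclusive treatment of the interpretation bands, and all function names are likewise assumptions made for illustration.

```python
from collections import Counter

# Hypothetical regrouping: Miller-Payne grades 1 and 2 are merged so that
# five grades become four categories (the resulting numbering is assumed).
MILLER_PAYNE_TO_FOUR = {1: 1, 2: 1, 3: 2, 4: 3, 5: 4}


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa between two category lists of equal length."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the marginal category frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both systems use one category")
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa):
    """Map kappa to the bands in the text, treated as upper-inclusive."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"
```

With made-up category lists `[1, 1, 2, 2, 3, 4]` and `[1, 2, 2, 2, 3, 4]`, observed agreement is 5/6 and chance agreement is 10/36, so kappa is 10/13 (about 0.77), which falls in the "good" band.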

Survival outcomes for each subtype
The median follow-up period was 37.2 months. Kaplan-Meier survival analyses were used to estimate disease-free survival and to assess the prognostic significance of all six pathologic response evaluation systems in each breast cancer subtype. For HR+/HER2- tumors (Fig 1), the systems that assess the residual tumor in absolute terms (ypTNM stage, RCB, and RDBN) had prognostic significance. For HR+/HER2+ (Fig 2) and HR-/HER2+ tumors (Fig 3), the association between patient survival and pathologic response assessment results varied according to the evaluation system used. However, none of the evaluation systems yielded distinct Kaplan-Meier survival curves for those patients. On the other hand, Kaplan-Meier survival analysis revealed that all of the pathologic response evaluation systems had prognostic significance for triple-negative tumors in terms of disease-free survival (Fig 4). Each evaluation system yielded distinct Kaplan-Meier survival curves for patients with triple-negative breast cancer. Sataloff's N classification also had prognostic significance for those with HR+/HER2-, HR-/HER2+, and triple- ...
To compare the prognostic significance of the evaluation systems, time-dependent ROC curve estimation analysis was performed. In all subtypes, the area under the curve (AUC) values exceeded 0.5 for all assessment systems regardless of time (Fig 5). Only in the triple-negative subtype were the AUC values relatively constant over time. The ranking of the systems by predictive accuracy varied over time. In pairwise comparisons among the seven systems, no evaluation system was superior to the others at every time point (S1 Table).
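For readers less familiar with the Kaplan-Meier method used for the disease-free survival curves, the following is a minimal sketch of the product-limit estimator. It is an illustration under stated assumptions, not the SPSS or R procedure used in the study; the function name and the follow-up data in the usage note are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time for each patient (e.g., months)
    events -- True if the event (recurrence or death) was observed,
              False if the patient was censored at that time
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            # Patients censored at t still count as at risk at t.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve
```

With five hypothetical patients followed for 5, 10, 10, 20, and 30 months, of whom the second and last are censored, the estimate steps down to 0.8, 0.6, and 0.3 at months 5, 10, and 20. Distinct curves for the categories of one assessment system would be obtained by calling the function once per category.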

Discussion
The present study is the first to examine the prognostic significance of several pathologic response evaluation systems using specimens derived from breast cancer patients undergoing anthracycline with/without taxane-based NAC. We found significant differences in the distribution of response values depending on the subtypes. Kappa values were calculated for each tumor subtype to identify agreement among the various pathologic response evaluation systems. Systems that assessed residual tumor in breast tissue and lymph nodes (ypTNM stage, RCB, and RDBN) showed moderate to good agreement for all tumor subtypes. These three systems also showed fair to moderate agreement with the TRR because the size of the residual tumor in the breast also forms part of the TRR. However, the kappa values for the absolute and relative response evaluation systems were generally lower for HR+/HER2- tumors and higher for triple-negative tumors. This difference may be due to intrinsic differences in the morphology of these two tumor types. Triple-negative breast cancers usually have pushing margins and high cellularity, and tend to shrink in response to NAC without a large reduction in tumor cellularity, resulting in more compact tumors [21]. Therefore, a reduction in size is the main outcome measure of a tumor's response to NAC. These characteristics of triple-negative tumors may explain why we found better agreement between the absolute and relative response evaluation systems in such cases. Conversely, HR+/HER2- tumors usually show an infiltrative growth pattern and a therapeutic response in a relatively large area of the tumor bed, accompanied by a reduction in cellularity. Thus, tumors that remain large but show reduced overall cellularity may be more common than with triple-negative breast cancer. These features might contribute to the generally lower kappa values calculated for HR+/HER2- tumors between the absolute and relative response evaluation systems.
Although both absolute assessment of the amount of residual tumor and relative assessment of treatment responses (i.e., comparing post-NAC specimens with pre-NAC images or specimens) predict similar clinical outcomes for patients with triple-negative tumors, using absolute assessment systems might be more effective in routine practice. This is because pre-NAC images or biopsy specimens are not always available in a clinical setting; therefore, obtaining results using relative assessment systems might be difficult. In HR+/HER2- tumors as well, the systems that assess the residual tumor in absolute terms (ypTNM stage, RCB, and RDBN) showed prognostic significance. Therefore, absolute response assessment systems appear superior both in their availability to pathologists and in predicting the prognosis of patients with triple-negative and HR+/HER2- tumors after NAC based on anthracycline with/without taxane.
Even though the Miller-Payne grade and the TRR showed prognostic significance in some tumor types, neither system takes lymph node status into account. However, several studies show that integrating lymph node status is important [5,26,27]. Similarly, even among tumors with no metastatic tumor cells in the lymph nodes, we found that survival outcome differed according to the presence or absence of a therapeutic response of pre-existing tumor cells. The difference in survival outcome was particularly significant for those with triple-negative breast cancer. Therefore, additional prognostic information may be acquired if pathologic reports mention the presence or absence of a therapeutic response in the lymph nodes after anthracycline with/without taxane-based NAC.
Despite its originality and novelty, this study has some limitations. First, the follow-up period was relatively short, and the number of patients with HER2+ tumors was small. Therefore, further studies with a larger cohort and longer clinical follow-up are warranted. Second, our conclusions are limited to cases treated with anthracycline with/without taxane-based NAC. Cases treated with other regimens might show different results; further studies that include NAC regimens other than anthracycline with/without taxane, as well as other neoadjuvant anti-HER2 or hormonal treatments, are therefore also warranted.
In conclusion, most of the currently available pathologic assessment systems used after anthracycline with/without taxane-based NAC effectively classified triple-negative breast cancers into groups showing different prognoses. The pathologic assessment systems evaluating residual tumors only also had prognostic significance in HR+/HER2- tumors. However, new assessment methods are required to effectively evaluate the pathologic responses of HR+/HER2+ and HR-/HER2+ tumors to NAC, especially NAC based on anthracycline with/without taxane.

Anonymized data including subtype of breast cancers, the regimen of chemotherapy, and other information essential for the pathologic evaluation systems. (XLSX)
S1 Table. Comparison of area under the curve of pathologic response assessment systems in each subtype at each time point (P values). (DOCX)