The Value of Tumor Infiltrating Lymphocytes (TILs) for Predicting Response to Neoadjuvant Chemotherapy in Breast Cancer: A Systematic Review and Meta-Analysis

Background We carried out a systematic review and meta-analysis to evaluate the predictive role of tumor infiltrating lymphocytes (TILs) in the response to neoadjuvant chemotherapy (NAC) in breast cancer. Method A PubMed and Web of Science literature search was designed. Random or fixed effect models were adopted to estimate the summary odds ratio (OR). Heterogeneity and sensitivity analyses were performed to explore heterogeneity among studies and to assess the effects of study quality. Publication bias was evaluated using a funnel plot, Egger's test and Begg's test. We included studies in which the predictive significance of TILs and/or TILs subsets for pathologic complete response (pCR) was determined in NAC of breast cancer. Results A total of 13 published studies (including 3251 patients) were eligible. In pooled analysis, the detection of higher TILs numbers in pre-treatment biopsy was correlated with better pCR to NAC (OR = 3.93, 95% CI: 3.26–4.73). Moreover, TILs predicted higher pCR rates in triple negative (OR = 2.49, 95% CI: 1.61–3.83) and HER2 positive (OR = 5.05, 95% CI: 2.86–8.92) breast cancer, but not in estrogen receptor (ER) positive patients (OR = 6.21, 95% CI: 0.86–45.15). In multivariate analysis, TILs remained an independent marker for a high pCR rate (OR = 1.41, 95% CI: 1.19–1.66). Among TILs subsets, higher levels of CD8+ and FOXP3+ T-lymphocytes in pre-treatment biopsy each predicted a better pathological response to NAC (OR = 6.44, 95% CI: 2.52–16.46 and OR = 2.94, 95% CI: 1.05–8.26, respectively). Only FOXP3+ lymphocytes in post-NAC breast tissue were a predictive marker for a low pCR rate in univariate (OR = 0.41, 95% CI: 0.21–0.80) and multivariate (OR = 0.36, 95% CI: 0.13–0.95) analysis. Conclusion Higher TILs levels in pre-treatment biopsy indicated higher pCR rates for NAC. TILs subsets played different roles in predicting response to NAC.


Introduction
Breast cancer is one of the most common malignancies among women all over the world. In the USA, approximately 230,000 new cases of invasive breast cancer are expected to be diagnosed in 2014 [1]. However, due to both early diagnosis and improved systemic therapy, the mortality rates for this kind of tumor have decreased in recent decades. Early stage breast cancer may be cured with the future development of therapeutic approaches that are based on appropriate biomarkers. The immune system has been a promising new target for breast cancer diagnosis. Indeed, a large body of evidence has shown the existence of immune defects in breast cancer patients, and various studies have observed the heavy infiltration of tumors by immune cells [2,3]. These immune cells are primarily tumor-infiltrating lymphocytes (TILs) that are associated with good prognosis in various cancers, such as epithelial ovarian carcinoma [4,5], endometrial cancer [6][7][8][9][10], and also breast cancer [11][12][13][14]. These cells demonstrate that the host immune response plays an important role in tumor progression.
Systemic neoadjuvant therapy is the treatment of choice for patients with locally advanced breast cancer and is increasingly used to treat patients with operable breast cancer who are not candidates for breast-conserving surgery or who have proven lymph node metastases [15,16]. Although both chemotherapy and endocrine therapy have been administered in the neoadjuvant setting, cytotoxic chemotherapy is more commonly used because of a more extensive and rapid response. Anti-HER2 therapy, such as trastuzumab, has also been administered in the neoadjuvant setting in combination with chemotherapy in HER2 positive patients [17]. More importantly, patients showing a pathologic complete response (pCR) to neoadjuvant chemotherapy (NAC) may experience prolonged disease-free survival, especially in triple negative breast cancer patients [18][19][20]. Therefore, identifying effective biomarkers useful for predicting the pCR rate is a high priority.
Previous studies have shown that changes in Ki67 before and after neoadjuvant chemotherapy may indicate a higher pCR rate and good prognosis for breast cancer [21][22][23]. Some studies have also shown that lymphocyte infiltration of the tumor may have predictive value for NAC response [24][25][26][27][28][29][30][31][32][33][34] and may indicate good survival in the adjuvant setting [35,36], but the role of TILs in predicting the pCR rate in the neoadjuvant chemotherapy setting has not been confirmed. Therefore, we performed a systematic review and meta-analysis, aiming to establish pooled estimates for the pCR rate based on the presence of TILs in breast cancer overall and in its different subtypes. Since many studies identified TILs by CD3, CD4, CD8 and FOXP3, we also analyzed the predictive value of TILs subsets in response to NAC.

Search strategy
Original articles studying the predictive value of TILs in the neoadjuvant setting of breast cancer were sought in the PubMed and Web of Science databases using the following key words: 'breast cancer', 'lymphocytes, tumor-infiltrating', 'CD3-positive T-Lymphocytes', 'CD4-positive T-Lymphocytes', 'CD8-positive T-Lymphocytes', 'FOXP3-positive T-lymphocytes', 'neoadjuvant' and 'pathologic complete response'. Further articles were sought in the reference lists of selected papers and among the related articles suggested by PubMed. Review articles were also scanned for additional eligible studies.

Inclusion and exclusion criteria
Original and review articles published before May 2014 were extracted. The search results were then screened according to the following inclusion criteria: (1) published as original articles; (2) evaluated human subjects; (3) investigated the predictive value of TILs, CD3+, CD4+, CD8+, and FOXP3+ lymphocytes, including ratios between these subsets, in neoadjuvant chemotherapy settings; (4) reported the relationship between TILs and pCR rates; (5) contained the minimum information necessary to estimate the effects (i.e., odds ratio) and a corresponding measure of uncertainty (i.e., confidence interval, P-values, standard errors or variance); (6) written in English. As an additional criterion, when a single population was reported in multiple reports, only the report with the most complete data was included to avoid duplication.
Articles were excluded from the analysis if they met any of the following criteria: (1) non-original reports; (2) in vitro or animal studies; (3) absence of key information such as the odds ratio (OR), 95% CI and P value; (4) immunological clinical trials, because active immunotherapy aims to modify the presence or the composition of T-lymphocyte subsets.
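Inclusion criterion (5) requires enough information to recover an effect estimate and its uncertainty. As a minimal sketch, assuming the reported 95% CI is a Wald interval that is symmetric on the log-odds scale (an assumption, not something stated for every included study), the standard error of the log OR can be recovered from the CI bounds. The function name `log_or_and_se` is hypothetical; the numbers are the pooled multivariate estimate quoted in the abstract.

```python
import math

def log_or_and_se(or_value, ci_low, ci_high, z=1.96):
    """Return (log OR, standard error of log OR) from an OR and its 95% CI.

    Assumes a Wald-type interval, symmetric on the log scale.
    """
    log_or = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_or, se

# Pooled multivariate estimate from the abstract: OR = 1.41, 95% CI 1.19-1.66
log_or, se = log_or_and_se(1.41, 1.19, 1.66)
z_score = log_or / se  # Wald z statistic implied by the reported interval
```

A study reporting only an OR and a P value would need a different back-calculation, which is not sketched here.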

Data extraction and quality assessment
The selected articles were assessed independently by two reviewers (Y.M. and Q.Q.). The following data were collected from each of the included studies: study design, sample size, country, pCR definition, T lymphocyte subsets, T lymphocyte counting sites, treatment regimen, use of multivariate or univariate logistic model analysis, OR estimates (with the corresponding 95% CIs) for high versus low density of TILs or of each T-lymphocyte subset at certain locations within tumors (intratumoral, stromal and both sites), and the staining cutoff point (Table 1, S2 Table, Table). Discrepancies were resolved by discussion with a third reviewer (Y.Z.Z.) or, if necessary, by contacting content experts until the two reviewers reached consensus. The quality of each study was assessed using an established form that was first developed and applied by McShane et al. [37] and Hayes et al. [38]. The following seven domains were assessed and scored on a scale from 0 to 8: inclusion and exclusion criteria, study design (prospective or retrospective), patient and tumor characteristics, description of the method or assay, study endpoints, follow-up time with patients and the number of patients that dropped out during the follow-up period. Studies with a score of five or more were considered to be of high quality.

Statistical analysis
We evaluated the overall ORs and 95% CIs of eligible data for the predictive value of TILs in pCR to NAC in breast cancer. The OR extracted from each study provided an estimate of the ratio of the odds of pCR for high-density vs. low-density TILs and/or TILs subsets. We then performed subgroup analyses according to the location of lymphocyte infiltration, breast cancer subtype and TILs subset (CD3+, CD4+, CD8+, or FOXP3+). Heterogeneity was assessed with the chi-square based Q test and the I² statistic, and pooled ORs were obtained under one of two models. Presence of heterogeneity (p < 0.05 or I² > 50%) merited use of the random effects model; the fixed-effects model was used in its absence (p > 0.05 or I² < 50%). Sources of heterogeneity between studies were explored using sensitivity analysis. Publication bias was evaluated using the funnel plot with Egger's [39] and Begg's [40] tests. All statistical analyses were performed using STATA version 11.0 (Stata Corporation, College Station, TX, USA).
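The pooling procedure described above can be illustrated with a short sketch. It is not the STATA routine used in this study: it assumes inverse-variance weighting on the log-OR scale and the DerSimonian-Laird estimator for the random-effects model, neither of which is specified in the text. The function name `pool_odds_ratios` and the example inputs are hypothetical and are not data from the included studies. To stay within the standard library, the sketch applies only the I² part of the decision rule and omits the Q-test p-value.

```python
import math

def pool_odds_ratios(log_ors, ses):
    """Inverse-variance pooling of log odds ratios.

    Returns Cochran's Q, its degrees of freedom, I^2 (percent), tau^2, and
    the fixed-effect and DerSimonian-Laird random-effects pooled ORs, each
    as a tuple (OR, lower 95% limit, upper 95% limit).
    """
    k = len(log_ors)
    w = [1.0 / se ** 2 for se in ses]              # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance (assumed estimator)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    sw_re = sum(w_re)
    random = sum(wi * y for wi, y in zip(w_re, log_ors)) / sw_re

    def summary(est, total_weight):
        se = math.sqrt(1.0 / total_weight)
        return tuple(math.exp(v) for v in (est, est - 1.96 * se, est + 1.96 * se))

    return {"Q": q, "df": df, "I2": i2, "tau2": tau2,
            "fixed": summary(fixed, sw), "random": summary(random, sw_re)}

# Hypothetical study-level inputs, for illustration only
example_ors = [2.0, 3.5, 5.0, 1.2]
example_ses = [0.30, 0.25, 0.40, 0.35]
result = pool_odds_ratios([math.log(o) for o in example_ors], example_ses)

# I^2 part of the rule in the text: random effects if I^2 > 50%, else fixed
chosen_model = "random" if result["I2"] > 50 else "fixed"
pooled_or, ci_low, ci_high = result[chosen_model]
```

When all studies report the same OR, Q is zero, tau² collapses to zero, and the two models return the same pooled estimate.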

Search results
Fig. 1 presents the selection process for eligible studies. Briefly, a total of 1647 studies were identified for initial evaluation. After a series of exclusions, 13 studies involving 3251 patients were included in the meta-analysis. The agreement between the two authors was 98% for study selection and 93% for trial quality assessments.

Study characteristics
The characteristics of all included studies are summarized in Table 1. Of the thirteen studies, generalized TILs were reported in eight [25,26,31,33,34,41,42,43] and TILs subsets in seven [24,25,27,28,30,32,42], including two studies that evaluated both TILs and subsets [25,42]. Six [25-27,33,41,43] and seven [24,28,30-32,34,42] studies were performed in Europe and Asia, respectively. The total sample size from all studies was 3251, ranging from 56 to 840 patients, with a mean of 232 patients in each study. Ten studies enrolled more than 100 patients. All studies were published between 2008 and 2014. The eligible studies tested TILs before NAC (n = 9) or both before and after NAC (n = 4). The detection method for TILs was IHC (n = 6), H&E staining (n = 6) or gene signature (n = 1). The majority of neoadjuvant chemotherapy regimens contained an anthracycline and a taxane. In HER2 positive patients, trastuzumab or lapatinib was typically used. The most frequently used cutoff values were a 10% increment (n = 5), the median (n = 4), values calculated by semi-quantitative methods such as the staining score (n = 3), and the upper quartile (n = 1). The most common definition of pCR was the absence of residual invasive tumor cells in breast and lymph nodes (n = 7), whereas the remaining 6 studies used the definition of no invasive cancer in breast (n = 2), no malignant cells in breast (n = 2), no malignant cells in both breast and axillary specimens (n = 1), or the Japanese pathological response criteria (n = 1). The quality assessment of the individual studies is presented in S1 Table. Seven studies were of high quality (score greater than 4), while six studies were of low quality (score of 4 or less).
Since breast cancer has been divided into four main molecular subtypes in clinical practice, we analyzed the predictive value of TILs in different subtypes.
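The analysis indicated that TILs tested before NAC had predictive value in ER negative (OR = 3.30), triple negative (OR = 2.49, 95% CI, 1.61–3.83) and HER2 positive (OR = 5.05, 95% CI, 2.86–8.92) patients, but not in ER positive patients (OR = 6.21, 95% CI, 0.86–45.15) (Fig. 3). However, limited studies analyzed the relationship between TILs and the pCR rate in these subtypes. Therefore, more prospective studies are needed.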
Five studies [25,26,33,34,42] that adjusted for age, tumor size, tumor type, grade, hormonal receptor status, lymph node status or therapeutic regimen were pooled for multivariate analysis of TILs density. The cutoff values were a 10% increment (in intra-tumoral and stromal sites) and lymphocyte-predominant breast cancer (LPBC; in both sites). The results indicated that TILs infiltration was an independent predictive marker for a higher pCR rate (OR = 1.41, 95% CI, 1.19–1.66; Fig. 4).

Pooled analysis of TILs subset
The most commonly tested markers for TILs subsets in breast cancer are CD3+, CD4+, CD8+ and FOXP3+. Seven studies [24,25,27,28,30,32,42] were pooled for univariate analysis of the association between T lymphocyte subsets and the pCR rate. Five studies reported which subsets were present in pre-treatment biopsy, and the pooled analysis showed that higher levels of CD8+ (OR = 6.44, 95% CI, 2.52–16.46) and FOXP3+ (OR = 2.94, 95% CI, 1.05–8.26; Fig. 5B) T lymphocytes predicted a better pathological response to NAC. Although CD4+ infiltrating lymphocytes also indicated a good response to NAC in pre-treatment biopsy (OR = 7.33, 95% CI, 2.03–26.40, P = 0.002; Fig. 5B), only one study tested this relationship. More prospective studies are needed to address this issue. Three studies [24,27,28] reported the presence of TILs subsets in post-treatment breast tissue and indicated that higher levels of FOXP3+ lymphocyte infiltration after treatment were associated with a lower response to NAC (OR = 0.41, 95% CI, 0.21–0.80, P = 0.009; Fig. 6). Since there were not enough data available concerning the correlation of TILs subsets in different locations and subtypes with NAC response, we were unable to perform further analysis.

Sensitivity analysis and publication bias
Sensitivity analyses for the predictive roles of TILs and TILs subsets (CD8+, FOXP3+) by study design (prospective vs. retrospective), region (Europe vs. Asia), sample size (≤100 vs. >100), study quality (score ≤4 vs. >4), cutoff criteria (median vs. others) and location (intratumoral, stromal or both sites) are shown in Table 2. The patterns of differences were similar to those of the original analysis. The heterogeneity among the studies was significantly reduced in studies from Europe and in studies with a prospective design, large sample size, high quality, 10% as a cutoff, or intratumoral and stromal sites. For TILs subsets, all included studies were retrospective. For CD8+ lymphocytes, all included studies had low quality (score = 4). However, CD8+ cells in stromal sites or with 10% infiltration as a cutoff had no predictive value. All patients evaluated for CD8 staining were from Asia, and there was no difference between studies from Japan and Korea. For FOXP3+ lymphocytes in pre-NAC breast biopsy, studies from Japan, with a small sample size, with the median as a cutoff, or in stromal or intratumoral sites did not exhibit significant differences in the predictive role of FOXP3+ lymphocytes for the pCR rate. For FOXP3+ lymphocytes in post-NAC breast tissue, studies from Asia, with a large sample size, or of high quality did not exhibit significant differences in the predictive role of TILs for the pCR rate.
A funnel plot, Egger's test and Begg's test were performed to assess the publication bias of the selected studies for the pooled pCR rate analysis. The shapes of the funnel plots revealed little evidence of asymmetry for the pooled pCR analysis using the TILs level. Egger's test (P = 0.093) and Begg's test (P = 0.152) provided no evidence of publication bias in these 8 studies (Fig. 8A). For the comparison of the pCR rate across TILs subsets, there was some evidence of asymmetry in the funnel plot before treatment (Fig. 8B). However, only Egger's test before treatment was significant (P = 0.008; S4 Table). For TILs in post-NAC breast tissue, there were only three studies, so no funnel plot was drawn; Egger's test (P = 0.179) and Begg's test (P = 1.000) provided no evidence of publication bias.
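For readers unfamiliar with Egger's test, the following sketch shows the regression that underlies it, under the usual formulation in which each study's standardized effect (log OR divided by its standard error) is regressed on its precision (the reciprocal of the standard error). An intercept far from zero suggests funnel-plot asymmetry. The function name `egger_intercept` is hypothetical, this is not the STATA implementation used here, and the sketch stops at the t statistic because the p-value requires the t distribution, which is not in the Python standard library. Begg's rank correlation test is not sketched.

```python
import math

def egger_intercept(log_ors, ses):
    """Egger's regression of (effect / SE) on (1 / SE).

    Returns (intercept, SE of intercept, t statistic, degrees of freedom).
    """
    n = len(log_ors)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    y = [e / s for e, s in zip(log_ors, ses)]  # standard normal deviates
    x = [1.0 / s for s in ses]                 # precisions
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are equal; slope is undefined")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual variance with n - 2 degrees of freedom (ordinary least squares)
    resid_var = sum((yi - intercept - slope * xi) ** 2
                    for xi, yi in zip(x, y)) / (n - 2)
    se_int = math.sqrt(resid_var * (1.0 / n + mx ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("inf")
    return intercept, se_int, t_stat, n - 2
```

The t statistic would be compared with a t distribution on n - 2 degrees of freedom. With only three studies, as in the post-NAC analysis, a single degree of freedom remains and the test has very little power.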

Discussion
This systematic review and meta-analysis suggested that a higher level of TILs in pre-NAC biopsy was associated with approximately four-fold higher odds of pCR in univariate analysis, whether the TILs were detected in stromal sites, intra-tumoral sites or both. After adjustment for age, tumor size, tumor type, grade, hormonal receptor status, lymph node status or therapeutic regimen, the multivariate analysis indicated that TILs infiltration was still an independent marker for a higher pCR rate.
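Because the pooled estimate is an odds ratio, the corresponding change in the pCR rate itself depends on the baseline rate. The sketch below illustrates this using the pooled univariate OR of 3.93; the baseline pCR rates in the loop are hypothetical values chosen for illustration and are not taken from the included studies, and the function name `pcr_rate_from_or` is likewise hypothetical.

```python
def pcr_rate_from_or(baseline_rate, odds_ratio):
    """pCR probability in the high-TILs group implied by an OR and a
    baseline (low-TILs) pCR probability."""
    if not 0.0 < baseline_rate < 1.0:
        raise ValueError("baseline_rate must lie strictly between 0 and 1")
    odds = baseline_rate / (1.0 - baseline_rate) * odds_ratio
    return odds / (1.0 + odds)

# Hypothetical baseline pCR rates in the low-TILs group
for p0 in (0.05, 0.10, 0.20, 0.30):
    p1 = pcr_rate_from_or(p0, 3.93)
    print(f"baseline {p0:.0%} -> high-TILs {p1:.1%} (rate ratio {p1 / p0:.2f})")
```

The rate ratio is always smaller than the odds ratio and shrinks as the baseline rate rises, so an OR near 4 should not be read as a four-fold increase in the pCR rate.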
Since immunosuppression has been found in tumor development and progression, immunotherapy has attracted the interest of investigators. Historically, a number of parameters have been assessed as biomarkers for the host immune response in the tumor microenvironment. TILs have been an effective biomarker of anti-tumor immune response in a wide range of cancers and indicated improved overall survival in epithelial ovarian carcinoma [4,5], colorectal cancer [44], endometrial cancer [6][7][8][9][10] and breast cancer [11][12][13][14]. Compared to adjuvant chemotherapy, neoadjuvant chemotherapy has been shown to have an equivalent effect on overall survival [16,45,46,47]. Moreover, pCR, which is the most significant prognostic factor in breast cancer patients with neoadjuvant therapy, is considered to be a valid surrogate marker for overall survival and progression-free survival. More importantly, the neoadjuvant setting provides us with an opportunity to rapidly assess the therapeutic efficacy of a given regimen. Therefore, many studies have analyzed the relationship between the pCR rate and TILs in the neoadjuvant setting.
In an early study, the authors found that the number of intra-tumor CD3+ TILs in pre-treatment biopsy was significantly higher in patients who had pCR after neoadjuvant chemotherapy [48]. Since this study contained a limited sample, the predictive role of TILs in the pathologic response to NAC was subsequently confirmed in a large patient cohort [25]. Depending on the infiltrating sites, TILs may be separated into intratumoral TILs, where lymphocytes have infiltrated the tumor bed (in direct contact with tumor cells), and stromal TILs, where lymphocytes have infiltrated the tumor stroma [48]. In the validation cohort (GeparTrio trial, n = 840), independent of the training cohort (GeparDuo trial, n = 218), a higher percentage of intratumoral TILs was an independent marker of a higher pCR rate. The authors also found that stromal and intratumoral TILs had predictive value for a better pathological response in the validation cohort. Other studies [31,42,43] also found that TILs in the stroma could predict treatment response in a neoadjuvant setting. Five studies [25,26,33,34,41] reported that higher TILs numbers in both intratumoral and stromal sites in pre-treatment biopsy indicated a higher pCR rate. These results were consistent with our meta-analysis, in which a high number of TILs was a significant predictor of the pCR rate in response to neoadjuvant chemotherapy, whether detected in stromal sites, intra-tumoral sites or both. Most of the studies we included used the median or more than 10% as a cutoff value, although some studies used increments of 10% [25,26,42,43] or LPBC [25,26,41] as a cutoff. We also analyzed these studies, and the results showed that a 10% increment of TILs in stromal or intratumoral sites, or LPBC status, indicated an increase in the pCR rate in univariate and multivariate analysis. The sensitivity analysis indicated the robustness of the OR estimates. We found no publication bias in the TILs analysis.
It is well known that patients with different subtypes of breast cancer respond differently to NAC. A low pCR rate was observed in HR+ and HER2- patients [49][50][51], while a markedly higher pCR rate was reported in HER2+ or triple-negative breast cancer (TNBC) patients [52][53][54]. Whether TILs play different roles in different subtypes of breast cancer remains unknown. We further analyzed the predictive roles of TILs in subtypes of breast cancer. The results indicated that TILs detected in pre-treatment biopsy were associated with approximately 2.5-fold and 5-fold higher odds of pCR in triple negative and HER2 positive breast cancer patients, respectively, but not with a higher pCR rate in ER positive patients. Previous studies have mostly supported our analysis. Ono et al. [31] found a significant association between pCR and the TILs number in triple-negative patients, but not in other breast cancer subtypes. West et al. [33] also found that higher TILs numbers detected by eight-gene expression were related to the pCR rate in triple negative and HER2 positive patients. Although Yamaguchi et al. [34] found that TILs detected by IHC were not significantly correlated with pCR in triple negative patients, only a limited number of patients were included in that study.
Our pooled analysis showed that higher levels of lymphocyte infiltration in TNBC and HER2+ patients indicated a better pathologic response to NAC. In multivariate analysis, individual studies found that TILs in ER- [33] and HER2- [26] patients were an independent marker for pCR after adjustment for age, tumor size, grade and node status. TILs subsets in breast cancer, primarily CD3+, CD4+, CD8+ and FOXP3+ (regulatory T lymphocytes expressing forkhead box P3 protein) lymphocytes, have also been studied in relation to the pCR rate. Because TILs subsets have been shown to change after NAC, pre-treatment and post-treatment tissue were considered separately. We found that higher levels of CD8+, CD4+ and FOXP3+ T lymphocytes in pre-treatment biopsy were correlated with the pCR rate, while CD3+ lymphocytes in pre-treatment biopsy were not predictive in univariate analysis. After adjustment for age, HR status, HER2 status or Ki67 level, none of these subsets was an independent marker for the pCR rate. These findings came from a limited number of studies, so we have interpreted this result with caution. In post-treatment breast tissue, we found that higher levels of FOXP3+ T lymphocyte infiltration predicted a lower pCR rate in univariate and multivariate analysis. No study reported the relationship between infiltration by other T lymphocytes in post-NAC breast tissue and the pCR rate. The sensitivity analysis indicated that sample size, cutoff criteria, study quality and study origin (country) affected the robustness of the OR estimates for TILs subsets. Therefore, these results should be interpreted with caution, and more prospective studies are needed to confirm them.
In addition to the presence of TILs subsets, ratios between subsets and changes in them were also reported in previous studies. Ladoire et al. [27] found that a reduced Foxp3/CD8 ratio after neoadjuvant chemotherapy was strongly associated with the pCR rate. In their study, CD3+ and CD8+ infiltrates remained stable during the treatment, while the FOXP3+ infiltrate strongly declined, suggesting that Treg cells were more sensitive to the chemotherapeutic regimen than were conventional T cells. García-Martínez et al. [55] reported on the relationship between changes in TILs subsets and the pCR rate. This study demonstrated that changes greater than the median in CD3+ and CD4+ TILs subsets indicated a reduced chance of pCR in response to NAC. These observations suggested that patients with pCR experienced little change in CD3+ and CD4+ T lymphocyte infiltration after NAC, while changes in CD8+ and FOXP3+ T lymphocytes were not predictive. Because few studies have reported the relationship between the pCR rate and ratios or changes in TILs subsets before and after NAC, more prospective studies are needed.
Due to the small number of studies in each subgroup panel, the results of this analysis should be interpreted with caution. Our systematic review has some limitations. First, this is a meta-analysis of published trials, and only six studies included a randomized design, while four studies had a prospective design. The included studies cover a mixed population of operable and locally advanced breast cancer with different prognoses and responses to NAC. Second, different cell scoring methods resulted in bias regarding the assignment of high and low lymphocyte infiltration. In our meta-analysis, cutoff points were defined in several ways, with some studies choosing more than 10%, while other studies used the median, quartiles or various scores and related statistics. These differences might be responsible for the variability in reaching a standard threshold for the lymphocyte count. Moreover, in the analysis of TILs, most studies used H&E staining and one study used an 8-gene signature. Third, only a limited number of studies showed the correlation between TILs subsets and the pCR rate specifically in intratumoral or stromal sites, or in different subtypes of breast cancer. Therefore, more prospective studies are needed to identify these relationships. Moreover, other possible factors, such as hormonal receptor status, grade and Ki67, also affect the pCR rate. However, those potentially confounding variables varied considerably among individuals and thus yielded inconsistent prognostic results; multivariate analysis was performed in only a few of the included studies to obtain more precise estimates by adjusting for clinicopathological variables, and we also evaluated these ORs in our meta-analysis. Fourth, some studies concerning TILs subsets combined multiple markers and indicated that a particular combination of markers, such as the intra-tumoral CD8/FOXP3 ratio, was a more sensitive predictor for recurrence and survival than a single T lymphocyte type.
However, this finding must be investigated further because of the limited number of studies. Fifth, this is a literature-based analysis, and the majority of included trials are retrospective in nature. Finally, the NAC schemes comprised conventional and nonconventional schedules (e.g., EC-TX, epirubicin/cyclophosphamide-docetaxel/capecitabine; EC-T-X, epirubicin/cyclophosphamide-docetaxel-capecitabine; PM, paclitaxel/non-pegylated liposomal doxorubicin; PMCb, paclitaxel/non-pegylated liposomal doxorubicin/carboplatin; and trastuzumab therapy for HER2 positive patients) with slightly different durations. Most studies included both anthracycline- and taxane-based regimens. To the best of our knowledge, the present study represents the largest review ever published concerning TILs and NAC response. These results suggested that a higher TILs level indicated a higher pCR rate for anthracycline- and taxane-based regimens and also for platinum- and trastuzumab-containing regimens.
However, the results should still be interpreted with caution, because we may have failed to identify some published and unpublished studies with negative results that would have affected our pooled estimates. Although the funnel plots did not provide evidence of publication bias for pCR stratified by TILs or T lymphocyte subsets, we recognize that the use of relatively few studies may have reduced the power for detecting publication bias.

Conclusion
In summary, despite the above limitations, our findings suggest that TILs could serve as a robust marker for predicting the pCR rate to NAC, especially in HER2 positive and TNBC patients. Among TILs subsets, only FOXP3+ lymphocytes in post-NAC breast tissue had an independent predictive value. The predictive role of CD3+ and CD4+ lymphocytes was still unclear due to the limited number of studies addressing these markers. Future studies should use a prospective design to improve the quality of clinical data and should also consider the clinicopathological variables of the patient, including the tumor grade, Ki67, and other tumor microenvironment factors. TILs subsets from different locations, breast cancer subtypes and specific neoadjuvant chemotherapy regimens should also be investigated in future studies.