A New Prognostic Score for Elderly Patients with Diffuse Large B-Cell Lymphoma Treated with R-CHOP: The Prognostic Role of Blood Monocyte and Lymphocyte Counts Is Absent

Background Absolute lymphocyte count (ALC) and absolute monocyte count (AMC) have been documented as independent predictors of survival in patients with newly diagnosed Diffuse Large B-cell Lymphoma (DLBCL). Analysis of the prognostic impact of ALC and AMC in the context of the International Prognostic Index (IPI) and other significant variables in the elderly population treated with the R-CHOP regimen has not been carried out yet. Methodology/Principal Findings In this retrospective study, a cohort of 443 newly diagnosed DLBCL patients aged ≥60 years was analyzed. All patients were treated with R-CHOP therapy. An extensive statistical analysis was performed to identify risk factors of 3-year overall survival (OS). In multivariate analysis, only three predictors proved significant: Eastern Cooperative Oncology Group performance status (ECOG), age and bulky disease presence. These predictors were dichotomized (ECOG ≥1, age ≥70 years, bulk ≥7.5 cm) to create a novel four-level score. This score predicted 3-year OS of 94.0%, 77.4%, 62.7% and 35.4% in the low-, low-intermediate, high-intermediate and high-risk groups, respectively (P<0.001). Further, a three-level score was tested which stratifies the population better (3-year OS: 91.9%, 67.2%, 36.2% in the low-, intermediate- and high-risk groups, respectively) but is more difficult to interpret. Both the 3- and 4-level scores were compared to standard scoring systems and, in our population, were shown to be superior in terms of patient risk stratification with respect to 3-year OS prediction. The results were successfully validated on an independent cohort of 162 patients with similar group characteristics. Conclusions The prognostic role of baseline ALC, AMC or their ratio (LMR) was not confirmed in the multivariate context in the elderly population with DLBCL treated with R-CHOP. The newly proposed age-specific index stratifies the elderly population into risk groups more precisely than the conventional IPI and its existing variants.


Introduction
Diffuse large B-cell lymphoma (DLBCL) is one of the most frequent subtypes of lymphoma of the Western Hemisphere [1]. The median age at diagnosis is about 65 years and the majority of patients are sixty or older. Novel treatment with rituximab-containing regimens and better supportive care markedly improved the outcomes in elderly patients [2][3][4]. The improved prognosis of DLBCL in elderly patients may also be related to intrinsic biological features of the tumor [5]. In addition to clinical conditions related to age, the role of the conventional prognostic variables, included in the International Prognostic Index (IPI) [6] or novel revised IPI (R-IPI) [7], may be altered in this population. The IPI was postulated in the pre-rituximab era and some retrospective analyses show its limited predictive value: despite being a four-level score, the IPI usually identifies only two risk subgroups. Analyses published by Ziepert et al. [8] confirm IPI as a valid predictor when analyzing data from prospective trials with rituximab-based regimens. A subanalysis of older patient population (the RICOVER-60 study) [9] showed overlaps between the high-intermediate and high IPI categories. Moreover, two of the IPI variables (ECOG and Ann Arbor stage) did not reach statistical significance in the Cox regression model for progression-free survival (PFS) and overall survival (OS). The novel "recalculated" R-IPI is a more powerful tool for the whole population, however with a limited information value for patients older than sixty years. No patients over sixty are considered low risk due to their age. This fact, together with an increasing proportion of elderly patients in good physical conditions, advocates for age-specific prognostic tools. Advani et al. [10] published an analysis of patients older than 60 treated with R-CHOP in US intergroup studies. 
Their elderly IPI (E-IPI) considered age over 70 as a negative prognostic marker, and it showed a superior discrimination power compared to IPI and age-adjusted IPI (AA-IPI) [6] scores. Unfortunately, no extensive multivariate analysis of predictor variables was done. Prognostic stratification in older population should be more focused on the real "biological" age of patients and on primary variables that reflect tumor aggressiveness and immune interaction between the tumor and host. There is growing evidence of a strong predictive role of the absolute lymphocyte count (ALC), absolute monocyte count (AMC) or their ratio (lymphocyte to monocyte ratio, LMR). This supports the hypothesis that host innate immunity is critical in tumor growth control and it is a limiting factor for the efficacy of immunochemotherapy in patients with DLBCL [11][12][13]. The optimal cut-off levels of ALC and AMC may be different in various populations [14][15]. This fact should be taken into account when designing new ALC/AMC-based prognostic schemes [16][17][18].
This retrospective study analyzes the role of conventional clinical and laboratory parameters in an unselected cohort of elderly patients with DLBCL treated in the Czech Republic with rituximab-based chemotherapy. The original focus was on modifying the IPI score for elderly population, by incorporating the prognostic roles of AMC, ALC, and LMR. However, no prognostic role of baseline ALC, AMC or their ratio (LMR) was found in the multivariate context in elderly population with DLBCL treated with R-CHOP. On the other hand, two variants of a novel prognostic score were postulated for this population. The scores are based on age, performance status according to WHO (ECOG), and the presence of bulky disease. Both the novel scores are found to be superior to previously published schemes. The novel scores were successfully validated on an independent cohort of similar group characteristics.

Ethics Statement
The study was performed in accordance with the 2008 revision of the Declaration of Helsinki. All patients provided written informed consent to anonymous processing of data on their disease. The study was approved by the ethical committee of the Faculty Hospital in Prague.

Subjects
The Czech Lymphoma Study Group (CLSG) is a national scientific organization which provides a platform for cooperation among Czech hematologists, oncologists and hematopathologists. The Lymphoma Registry (LR) is a prospective online database, founded and operated by the CLSG, which has collected data from newly diagnosed lymphoma patients since 2000. The CLSG database covers up to 68% of all newly diagnosed lymphoma cases [19]. It currently contains 11,122 patients with lymphoma, including 627 DLBCL patients sixty years and older treated in the rituximab era. A cohort was selected to include all patients with a histologically confirmed diagnosis of DLBCL who were sixty years or older at the time of diagnosis and were treated with the R-CHOP regimen [20]. The cohort included all patients with newly diagnosed DLBCL recorded in LR between April 2002 and May 2010, to allow for at least three-year follow-up. Patients with central nervous system involvement were excluded from the study. All biopsies were reviewed by a reference hematopathologist and the final diagnosis was provided in compliance with the published World Health Organization (WHO) classification. A central review of all final diagnosis reports was carried out [21]. The cohort consists of 443 patients (clinical data summarized in Table 1). On this cohort, all univariate and multivariate statistical analyses (see below) were performed. Before constructing the predictive score, a further 64 patients were excluded because of missing data (i.e. at least one of the predictors used in the final score was missing). Consequently, the comparison with existing scores and the assessment of the score performance were done on a group of 379 patients. In the original CLSG query, only patients with complete data on ALC and AMC were selected. However, no prognostic role of these predictors was found (see Results). 
This enabled us to repeat the query without this constraint and thereby obtain a validation cohort of 162 patients from the same population. The validation cohort was selected about 1 year later than the original one.

Follow-up
OS was defined as the time from diagnosis of DLBCL to death from any cause. PFS was defined as the time from diagnosis to lymphoma relapse, progression or death from any cause. Analyses were fitted to detect differences in survival times after 3 years of follow-up. For all patients still alive at that point, OS and PFS were censored three years from the diagnosis. This was done because the prognostic factors allow for the best discrimination of the population at around three years from the diagnosis. In later years, DLBCL-unrelated factors may start outweighing the DLBCL-related ones in the OS and PFS.
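The three-year censoring rule described above can be illustrated with a minimal sketch. The function name `censor_at_horizon` and its arguments are hypothetical and were chosen only for this illustration; the study itself was analyzed in R. The sketch also assumes that an event occurring exactly at three years is still counted as an event.

```python
def censor_at_horizon(time_years, event_observed, horizon=3.0):
    """Apply administrative censoring at `horizon` years (hypothetical helper).

    time_years     -- follow-up time from diagnosis, in years
    event_observed -- True if the event (e.g. death for OS) was observed
    Returns the (time, event) pair to be used in a 3-year analysis.
    """
    if time_years > horizon:
        # The patient was still event-free at the horizon: truncate the
        # follow-up and treat the observation as censored.
        return horizon, False
    # Events and censorings before (or at) the horizon are kept unchanged.
    return time_years, event_observed


# A patient who died 4.2 years after diagnosis counts as alive at 3 years.
print(censor_at_horizon(4.2, True))
# A death at 1.5 years is unaffected by the rule.
print(censor_at_horizon(1.5, True))
```

With this rule, any event after the three-year mark does not contribute to the estimated hazard, which matches the stated focus on 3-year OS and PFS.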

Statistical methods
First, univariate analysis was performed to find out which of the risk factors are significant individual predictors of the 3-year OS. The Cox proportional hazards model was used. All individually significant predictors were subsequently used in multivariate Cox regression analysis. By stepwise elimination, the least significant predictors were excluded to arrive at the final model. Only non-significant predictors were excluded. The predictors included in the final model were further dichotomized to allow for the construction of a simple predictive score (see Results). Performance of the newly proposed score was compared to existing predictive scores by means of the concordance measure and Akaike's information criterion (AIC). Concordance measures the probability of agreement for any pair of patients, where agreement means that the patient with the shorter survival time also has the larger risk score. Comparison of survival times was performed by the Kaplan-Meier survival curve plots and log-rank tests. All statistical analyses were performed using the R software [22]. The significance level of all tests was set to 0.05. Validation was performed by means of the concordance measure and by comparing the proportional hazards of the respective risk groups in the training and the validation cohorts.
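To make the definition of concordance concrete, the following is a minimal sketch of a pairwise concordance index in the spirit of Harrell's C. It is an illustration under stated assumptions, not the routine used in the study (the analyses were done in R): a pair is treated as comparable only when the shorter time corresponds to an observed event, and pairs with tied risk scores are counted as one half. The function name `concordance_index` is hypothetical.

```python
def concordance_index(times, events, scores):
    """Fraction of comparable patient pairs in which the patient with the
    shorter survival time also has the larger risk score (illustrative sketch).

    times  -- follow-up times
    events -- True where the event was observed, False where censored
    scores -- risk scores (higher means worse expected outcome)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored time cannot be the "shorter survival" of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # assumption: tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: the two patients who died early carry the highest scores.
toy_times = [1.0, 2.0, 3.0, 3.0]
toy_events = [True, True, False, False]
print(concordance_index(toy_times, toy_events, [3, 2, 1, 0]))
```

A value of 0.5 corresponds to a score with no discriminative ability and 1.0 to perfect ranking, which is why a higher concordance is read as better discrimination.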

Regression analysis
Univariate Cox regression analysis was performed on all prognostic factors listed in Table 1. The regression analysis revealed that all the considered risk factors are significant individual predictors of 3-year OS, except for sex and AMC. In the multivariate analysis, we started by including all significant univariate predictors in a Cox proportional hazards model, and then gradually eliminated the insignificant ones. The final model contains only age, bulky disease presence (dichotomous) and ECOG performance status (0-4) (see Table 2). The AMC has no prognostic impact and the significance of ALC disappeared in the multivariate context.

Predictive score construction
The construction of a simple predictive score was based on the multivariate analysis results. In order to construct a simple score out of the three significant multivariate predictors, it is necessary to further discretize age and ECOG. Otherwise, the score would stratify the population into too many groups. Thus, we propose the following scheme: A patient gets one point for having age ≥70, one point for having bulky disease (bulk ≥7.5 cm) and one point for having ECOG ≥1. The cut-off level for age (70 years) was determined so that the hazard rate of the high risk group (age ≥70) relative to the low risk group (age below 70) is comparable to the hazard rates of the two remaining predictors. Moreover, the median age of the cohort is 70 years, and the E-IPI prognostic score [10] uses the same cut-off. The ECOG score was discretized according to Figure 2 which clearly differentiates patients with ECOG = 0 from the rest of the population. The three dichotomized predictors remain significant: the hazard rates (HR) for age ≥70, bulky disease and ECOG ≥1 are 2.20 (95% CI: 1.52-3.19, P<0.0001), 2.00 (95% CI: 1.39-2.87, P = 0.0002) and 3.18 (95% CI: 1.70-5.95, P = 0.0003), respectively.
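The point scheme above can be written down as a short sketch. The function and parameter names are hypothetical; in particular, `bulk_cm` is assumed here to be the tumor bulk measurement in centimeters. The returned value is the number of risk factors present (0 to 3).

```python
def count_risk_factors(age_years, bulk_cm, ecog):
    """Count the three dichotomized risk factors (illustrative sketch).

    One point each for age >= 70 years, bulk >= 7.5 cm and ECOG >= 1,
    following the cut-offs stated in the text.
    """
    points = 0
    if age_years >= 70:
        points += 1
    if bulk_cm >= 7.5:
        points += 1
    if ecog >= 1:
        points += 1
    return points


# A 72-year-old patient with an 8 cm bulk and ECOG 0 has two risk factors.
print(count_risk_factors(age_years=72, bulk_cm=8.0, ecog=0))
```

Because each predictor contributes a single point, the score needs no fitted coefficients; only the cut-offs are taken from the analysis.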
Presence or absence of these three binary predictors (risk factors) stratifies the entire population into eight groups (see Table 3). It is convenient to define a four-level prognostic score (here denoted as ABE4-Score to remember that it is derived from Age, Bulk, and ECOG), analogously to IPI, as the number of risk factors present in the patient. Thus, patients without any risk factor ('Group i') are assigned ABE4-Score = 0 (N = 51) and represent the low risk group, patients with 1 risk factor ('Groups ii-iv') are assigned

Table 3. Construction of the ABE4-Score.

Table 4 shows the hazard rates of the individual ABE4-Score groups calculated by means of the Cox proportional hazards model with the ABE4-Score as the only predictor. This prognostic score is easy to interpret (it represents the number of risk factors present). However, it is interesting to note that 'Group v' has significantly worse 3-year OS (HR with respect to 'Group i' is 14.8, P = 0.0004) than 'Group vi' (HR = 8.1, P = 0.0005) and 'Group vii' (HR = 6.6, P = 0.0022). The HR of 'Group v' is even comparable to the HR of the worst prognosis 'Group viii' (HR = 18.9, P<0.0001). Thus, it seems that the combination of age ≥70 and bulky disease, despite ECOG = 0, has a comparably pessimistic prognosis to the group where all the risk factors are present. This suggests defining another prognostic score, this time a three-level one: according to the results of the Cox regression analysis (see Table 3), there is no significant difference in HR of Groups i, ii and iii. Thus, Groups i, ii and iii are pooled into the low risk group and are assigned ABE3-Score = 0 (N = 87), Groups iv, vi, and vii represent the intermediate risk group and are assigned ABE3-Score = I (N = 231), and Groups v and viii are assigned ABE3-Score = II (N = 61) and represent the high risk group of patients (see Table 5). 
The Cox proportional hazards model provides the following results with respect to the ABE3-Score low risk group (see Table 4): HR of

Comparison to existing scoring systems
Let us now compare the ABE4-Score and ABE3-Score systems to existing scoring systems, namely the four-level scores IPI, age-adjusted IPI (AA-IPI), and elderly-IPI (E-IPI), and the three-level scores revised IPI (R-IPI) [7] and its ALC/RIPI form. We fitted a Cox proportional hazards model with each of the scores as the only predictor and calculated the measure of concordance and AIC for each model. Both the ABE4-Score and the ABE3-Score have the highest measures of concordance, which indicates better discrimination than the existing scoring systems. Apart from E-IPI, the ABE4 and ABE3-Scores also have the lowest AIC values in their group, which indicates a better fit (see the results in Table 6). The estimated 3-year OS by risk groups of individual scoring systems are provided in Table 4 together with the HR using the lowest risk group as the reference group. These results also show better stratification of the risk groups by the ABE4-Score and the ABE3-Score. For each of the scoring systems, the estimated OS distributions using the Kaplan-Meier curves are shown in Figures 3 and 4.

Validation
Validation was performed on an independent cohort selected from the same population approximately a year later than collecting the data for the ABE scores construction (see Methods). There is no overlap between the training and validation cohorts. The descriptive statistics of the validation cohort are shown in Table 7. The characteristics of the validation cohort are very similar to the training cohort except for bone marrow involvement, which is notably less present in the validation group (8.8% in the validation group compared to 18.7% in the training group). Also, the median follow-up is significantly lower (3.53 years for the surviving patients) in the validation cohort because it was selected about a year later than the training one. Most of the patients with a long follow-up had already been included in the training group; consequently, the validation cohort is biased towards patients with shorter follow-ups. Table 8 compares the hazard rates and the measures of concordance of the ABE3 and ABE4-Score groups in the training and validation cohorts. For ABE4, the validation hazard rates are well within the confidence intervals of the HR in the training cohort. For ABE3, the validation hazard rates are significantly lower. The measure of concordance for ABE4-Score (resp. ABE3-Score) on the validation cohort reads 0.66 (resp. 0.65). Both these values are well within the CI of the respective concordance measures on the training cohort.

Table 5. Construction of the ABE3-Score. Presence (1) or absence (0) of the three risk factors stratifies the population into eight groups (Group i-viii). The ABE3-Score pools certain groups to stratify the population into three risk groups. Note that the Groups i-viii (the first column) do not appear in ascending order in contrast to Table 3. doi:10.1371/journal.pone.0102594.t005

Discussion
Recent years have brought a lot of information about prognostic role of the absolute lymphocyte count (ALC) and absolute monocyte count (AMC), together with their ratio, LMR. Lymphocytopenia was found to be a strong negative prognostic marker which correlates strongly with the disease burden, patients' fitness and overall outcome. Negative prognostic roles of low ALC and, inversely, high AMC were explained as results of impaired host-tumor immunosurveillance mechanisms and probably also by the weakening of ADCC activity. Unfortunately, none of these studies used large classes of prognostic factors not included in the conventional IPI score [17], [23], [24]. The present study shows that, if more prognostic factors are included, the role of ALC, AMC, and LMR is overshadowed by different factors.
Diffuse large B-cell lymphoma is a disease of elderly patients, with median age at diagnosis of about 70 years [25]. Despite this fact, most of the predictive scores use the cut-off age of 60 and cover the whole population of DLBCL. Elderly population is markedly different from the younger patients who tend to be in a better physical condition. Consequently, some prognostic factors may have different impact on the overall outcome in the elderly population.
This study attempts to establish the roles of ALC and AMC in an unselected DLBCL population aged over 60, when the role of (at least) all IPI-related factors is taken into account. Analysis of the fourteen clinical and laboratory parameters found only three of them to be sufficient (multivariate) predictors of survival: age ≥70 years, bulk ≥7.5 cm and ECOG ≥1. Surprisingly, ALC, AMC, or LMR were not found to add any predictive power to the multivariate model. Even when tested in the univariate context (each factor as the only predictor of the OS), AMC was found insignificant. The analyses were performed both with continuous values of these variables and with dichotomized values (with the cut-off set to the median of each variable). We suggest that the lack of predictive power of the AMC and ALC can be explained by their close correlation with the bulk and ECOG predictors. These two predictors possibly overshadow the role of AMC and ALC in the final model. On the other hand, IPI-related factors were found to be strong predictors of OS. First, the cut-off for age was set to the median value of 70 years, in agreement with previously published data [10]. Second, the overall fitness of the patients seems to be more important in the elderly population. In contrast to IPI, patients with only a moderate performance status decrease (ECOG ≥1) showed significantly decreased survival times. Figure 2 shows that the standard dichotomization (ECOG ≤1 and ECOG ≥2) does not seem appropriate for the elderly population. Consequently, both the newly proposed ABE scores dichotomize ECOG into ECOG = 0 and ECOG ≥1. Another important finding is the strong prognostic role of the tumor bulk. This predictor is not included in the IPI score; its relevance has already been confirmed in younger DLBCL patients, but not in the older population treated with dose-dense regimens [9], [26].
According to the measure of concordance, the four-level ABE4-Score is superior to IPI, AA-IPI, and E-IPI in our dataset. Analogously, the ABE3-Score is superior to both R-IPI and ALC/RIPI. We advise caution when using the measure of concordance to compare a four-level score to a three-level score, however, even in this comparison, the ABE3-Score outperforms all the standard four-level scores (IPI, AA-IPI, and E-IPI). This interpretation is confirmed by the AIC that shows the ABE3-Score to be superior to all other scores except for the E-IPI. However, the stratification of the cohort according to E-IPI lacks the power of the ABE4-Score, because the hazard rates of the E-IPI groups are much lower than the hazard rates of the ABE4 groups. From the practical point of view, both ABE4 and ABE3 scores show the highest span (highest discrimination power) between low- and high-risk groups (59% and 56% difference in OS at 3 years, respectively) compared to all other scores tested. This fact is well captured in the Kaplan-Meier curves (see Figures 3 and 4). IPI, AA-IPI, E-IPI, R-IPI, and ALC/RIPI scores all exhibit some degree of overlapping among the Kaplan-Meier curves for the various risk groups but ABE4 and ABE3 scores show markedly differing outcomes.
It is important to understand the way in which our scores are "fitted" to the data. When fitting a regression model (i.e. tuning its parameters) it comes as no surprise that the fitted model outperforms many other models which were fitted on different datasets. However, in our case, there are no "tunable" parameters that can be fitted to our data. Our training dataset was used only to identify the important predictors and, in the case of ECOG, make a decision about their dichotomization. The ABE4-Score was successfully validated on an independent cohort selected from the same population. The score was shown to retain its high discriminatory power and high concordance measure. In the case of the ABE3-Score, the validation revealed significantly lower hazard rates in the intermediate and high risk groups. This, together with the simpler interpretation of the ABE4-Score (it represents the number of risk factors present), advocates for the use of the ABE4-Score.

Conclusions
Prognostic stratification in lymphoma is a "moving target" [16] and our tools should be under a continuous revalidation process. Elderly patients are an extremely heterogeneous population and the optimal treatment strategy must be adapted with respect to comorbidities and should reflect the true biological age. On the other hand, DLBCL is a curable disease even in the elderly population. Our goal was to postulate a simple, valid and robust prognostic tool for the population above the "arbitrary" age limit of sixty years, treated with R-CHOP. We have constructed two variants (three- and four-level) of a novel prognostic score. For routine practice, we recommend the four-level ABE4-Score, which is simple to interpret (it represents the number of risk factors present) and robust (it was validated successfully). In conclusion, this study represents the first large analysis of a wide spectrum of prognostic factors in an elderly, homogeneously treated population with DLBCL. The predictive value of lymphocyte or monocyte count has not been confirmed. The proposed scores based on age, bulk and ECOG were found to be superior to previously published schemes. Other researchers are invited to validate our findings on different populations of elderly patients, homogeneously treated for DLBCL.

Author Contributions
Conceived and designed the experiments: VP MT. Analyzed the data: VP TF JF. Contributed reagents/materials/analysis tools: VP RP AJ DB DS TP VC MT. Wrote the paper: TF JF VP. Proof reading: RP AJ DB DS TP VC MT.