A Multilevel Model to Estimate the Within- and the Between-Center Components of the Exposure/Disease Association in the EPIC Study

In a multicenter study, the overall relationship between exposure and the risk of cancer can be broken down into a within-center component, which reflects the individual level association, and a between-center relationship, which captures the association at the aggregate level. A piecewise exponential proportional hazards model with random effects was used to evaluate the association between dietary fiber intake and colorectal cancer (CRC) risk in the EPIC study. During an average follow-up of 11.0 years, 4,517 CRC events occurred among study participants recruited in 28 centers from ten European countries. Models were adjusted by relevant confounding factors. Heterogeneity among centers was modelled with random effects. Linear regression calibration was used to account for errors in dietary questionnaire (DQ) measurements. Risk ratio estimates for a 10 g/day increment in dietary fiber were equal to 0.90 (95%CI: 0.85, 0.96) and 0.85 (0.64, 1.14), at the individual and aggregate levels, respectively, while calibrated estimates were 0.85 (0.76, 0.94), and 0.87 (0.65, 1.15), respectively. In multicenter studies, over a straightforward ecological analysis, random effects models allow information at the individual and ecologic levels to be captured, while controlling for confounding at both levels of evidence.


Introduction
The relationship between aetiological factors and risk of disease in multicenter studies can be evaluated at the individual and at the aggregate (ecologic) levels [1][2][3], thus reflecting withinand between-center associations, respectively. In order to estimate simultaneously the two components of the exposure/disease association, multilevel models can be used [4]. In this framework, the role of (a possibly consistent list of) individual and aggregate level confounding variables can be accounted for. In addition, the use of random effects makes it possible to quantify the heterogeneity across centers of the association in multicenter studies, and to investigate the potential sources of such heterogeneity.
The characteristics of individual level and ecologic analyses have been extensively investigated [5][6][7]. Ecologic analysis is relatively efficient when the between-population exposure variability is large relative to the within-population variation [3,8]. However, individual and ecologic analyses are prone to bias due to confounding [3]. Generally, individual level confounders are assessed through specifically designed questionnaires, e.g. dietary or lifestyle. Furthermore, over individual level studies, ecologic analysis may be more robust to classical (random) measurement errors but prone to systematic exposure misclassification [8].
Among the rationale that motivated the European Prospective Investigation into Cancer and Nutrition (EPIC), a large multicenter cohort study on diet and cancer conducted in ten Western European countries [9], was the idea of complementing aetiological evidence at the individual level with information available at the aggregate level through a comparison of dietary mean intakes with mean incidence rates [2]. In EPIC, cohorts from populations with diverse dietary exposures were combined, in order to increase between-subject variability and thereby reduce the impact of measurement errors of individual level dietary assessments [2]. In addition, substantial effort was invested in the collection of dietary measures harmonized across recruitment centers, by means of 24-hour dietary recall (24-HDR) [10]. 24-HDR measurements are used in linear regression calibration models [11] to correct for random and systematic errors in center-specific dietary questionnaire measurements [12,13].
Multilevel analysis on survival data with random effects has been extensively described, including a gamma frailty model where the likelihood function was maximized by an EM algorithm [14], or a two-step algorithm to maximize a penalized partial likelihood [15], a proportional hazard models with random intercept and/or random slope that entails adaptation of penalized likelihood [16], Gibbs sampling [17,18], or EM algorithm [19].
In this work a random effects piecewise exponential proportional hazards model was used to estimate the individual and aggregate components of the association between fiber intake and risk of colorectal cancer (CRC). This individual-level relationship was recently evaluated in the EPIC study using a standard approach [20]. A multilevel random effects model was used to estimate aggregate and individual-level components of the dietary fiber and CRC relationship, thus accounting for study center heterogeneity, both in terms of disease occurrence and exposure/disease associations [21,22]. In order to correct for measurement errors, a linear regression calibration model was also employed [12]. The use of the random effects model is illustrated and discussed.

Materials and Methods
The rationale and methodology used in the EPIC study have been previously described in detail [9]. Briefly, in the EPIC cohort study, participants were recruited from 28 centers in 10 European countries: France (North-East, North-West, South, South-coast), Italy (Ragusa, Naples, Florence, Turin, Varese), Spain (Granada, Murcia, Navarra, San Sebastian, Asturias), United Kingdom (Oxford Health conscious and General population, Norfolk), The Netherlands (Bilthoven and Utrecht), Greece, Germany (Heidelberg and Potsdam), Sweden (Malmo and Umeå), Denmark (Aarhus and Copenhagen) and Norway (South-East and North-West). Between 1992 and 1998, information on dietary, anthropometric and lifestyle factors was collected from a total of 521,457 subjects (70% women), mostly aged from 35 to 70 years. Written informed consent was provided by all study participants. Ethical approval for the EPIC study was provided from the review boards of the International Agency for Research on Cancer (IARC) and local participating centers. Information on habitual individual dietary intake was assessed using different validated dietary questionnaires across participating countries to capture geographical specificity of diet [9]. A single standardized 24-hour recall (24-HDR) of actual food consumption during the previous day for a large (7% of the whole study) representative sub-sample of the entire EPIC cohort was also collected [10]. Lifestyle questionnaires were used to assess additional information from study participants, including physical activity, lifetime alcohol use, smoking history, level of education and medical history.
In this work, after systematic exclusions of participants with missing, incomplete, or implausible information, as detailed elsewhere [20], a total of 478,461 subjects were included in the analysis. Out of 5,262,998 person-years accumulated over an average follow-up time of 11.0 years, 4,517 CRC cases were identified. Cancer incidence data were coded in accordance with the 10 th Revision of the International Classification of Diseases (ICD-10) and the second revision of the International Classification of Disease for Oncology (ICDO-2).

Statistical analysis
Variability of exposure and confounders. The between-and within-center variability of dietary and non-dietary exposures were evaluated. For this purpose, variance components were estimated using a mixed model where the variable 'center' was modeled with a random effect, as detailed in S1 Appendix. Variance component terms were used to compute the intraclass correlation coefficient (ICC), which expresses the proportion of exposure heterogeneity at the center-level compared to total variability. The extent of between-and within-center variability was evaluated also for dichotomous exposure variables, and the associated ICC values.
The disease model. In this work, age was used as main the time scale variable, and it was assumed that the hazard function, λ(t), was piecewise constant in the s = 1,. . .,S age intervals, as Where h 0 is the baseline hazard, and g 1 = 1,g 2 ,g 3 ,. . .,g S , are the relative hazard. We considered S = 7 age intervals, i.e. for the categories <49 years, 50-54, 55-59, 60-64, 65-69, 70-74, >75 years with first and last age intervals being open-ended to avoid a sparse number of CRC cases. The representation of a classical survival likelihood with a piecewise constant hazard by a Poisson likelihood is well known [23]. Thus, the time-to-event data becomes a counting process of 0 events or 1 event occurring with an offset of the time experienced at risk for the event within each time interval. Each study subject i = 1,. . .,n j within center j = 1,. . .,J was represented by S i S observations equal to the number of time intervals up to failure or censoring. For each subject, the intervals S i ranged from the interval corresponding to age at entry (S i,min ) to the interval corresponding to age at failure/censoring (S i,max ). A variable indicating the presence of the event of interest was created, for interval k = S i.min ,. . .,S i,max as d ijk ¼ 0 for k < S i;max ; and d ijk ¼ 1 failure A random effects model was fitted as Indicated with τ k and τ k−1 , the extreme of age interval k, for k = S i.min−1 ,. . .,S i,max−1 , the terms t ijk are equal to 5 years, whereby for the first interval k = S i,min and for the last interval k = S i,max , the terms t ijk express the time unit spent in interval k, that is t ijk = τ kage at entry (i) , and t ijk = τ kage at exit (i) -τ k−1 , respectively. The terms α k = ln(g k ) indicate the logarithm of the relative hazard were estimated as coefficients of a list of indicator variables h ijk , i.e. one for each age interval except the category 60-64 years that was set as reference. The term δ 0j is a random effect that models the center effect on the logarithm of the baseline hazard among men. Sex was included as an indicator variable (0 = men, 1 = women), and modelled with a random effect coefficient δ 1j . The variability of the outcome is captured by the term s 2 0 in men, while in women it is equal to s 2 0 þ s 2 1 þ 2s 01 . In this way, the disease model allows the variability of the outcome across centres to be sex-specific. A negative value of the covariance term σ 01 permits the variability in women to be lower than in men.
The variable X ij is the exposure factor of interest, ie. dietary fiber from questionnaire measurements in this work; the terms X :j are the center-specific means, and Z ij and Z :j are vectors of confounding variables at the individual and aggregate levels, respectively. The coefficients (β W and β B ) model the relationship of dietary fiber and risk of colorectal cancer at the individual and aggregate levels, respectively. The terms β Wj (random slope) have a fixed (β W ) and a random (u wj ) component, which capture heterogeneity of the exposure/disease associations across centers. The terms σ W,0 and σ W,1 express the covariance between random intercepts and random slope in men, as σ W,0 and women, as σ W,0 + σ W,1 . In the case of a positive association, a positive (negative) covariance would suggest that the association between dietary fiber and CRC risk is more (less) pronounced in centers displaying higher CRC incidence. The evaluation of the heterogeneity of the exposure/risk of disease association across centers is of particular relevance in international collaborative studies. Unlike a fixed effects analysis, where this evaluation involves a large number of interaction terms, in a multilevel model, the variance associated to a random slope constitutes a succinct way to quantify the heterogeneity of associations. In addition, potential heterogeneity can be investigated in terms of aggregate level variables, thus exploring the extent of cross-level confounding in the association.The vector of coefficients, γ W and γ B , model the relationship between confounding factors and cancer risk at the individual and aggregate levels, respectively. For the sake of simplicity, γ W is assumed constant across centers, but this assumption could possibly be relaxed.
Adjustment for confounding variables was performed at the individual and at the aggregate level in the disease model. At the individual level the model included: red meat baseline alcohol consumption, energy from fat, energy from non-fat and non-alcohol sources, participant's height and weight, physical activity (inactive/moderately inactive, moderately active/active, unknown), smoking habits (never, former/current, unknown) and educational level (no degree to secondary school, university degree, not specified/missing), consistently with a recent EPIC study [20].
At the aggregate level, models included center-specific means of energy from sources other than fat and alcohol, baseline alcohol consumption, latitude, the percentage of subjects with university degrees and the percentage of former and current smokers. Moreover, latitude of participating centers was considered to account for potential residual ecologic confounding.
In this work, models with different degrees of complexity have been fitted and compared. First, a model with sex-specific random intercept terms was estimated (model (1)). Then, individual level variables were included (model (2)). Aggregate level variables were added, after centering individual level variables to center-specific means (model (3)). Finally, a model with random slopes for the term ðX ij À X :j Þ was fitted (model (4)). Each model was adjusted for individual-and aggregate-level confounding factors Z, as detailed in S2 Appendix.
For proper comparison of dietary measurements between cohorts, and to partially correct risk estimates for the effect of dietary measurement errors, a fixed-effect multivariate linear regression calibration model was used [11,24], in which 24-hour dietary recall data from an 8% sample from all participating cohorts [25] were regressed on dietary questionnaire (DQ) measurements with adjustment for the same list of covariates mentioned above, and further controlled for the day of the week day and the season of recall measurements [12]. In a linear regression calibration model, the conditional expectations of true intake given the questionnaire measurements and the set of covariates, E(T|Q,Z), are computed, under the assumption that 24-hour dietary recall measurements are unbiased (11). E(T|Q,Z) values are used in the disease model. The variability of parameters was corrected to account for uncertainty due to the estimation process of regression calibration. Ten bootstrap samples were randomly generated, and the models (1) to (4) were fitted for each of them. The Rubin's rule was used to obtain a correct estimate of the standard error by combining information across the ten different samples [26]. Linear regression calibration models were used for dietary fiber, red meat, energy from fat and energy from non-fat and non-alcohol sources. Alcohol intake was not corrected for measurement errors as the complexity of dealing with excess zero values in 24-HDRs due to the episodically consumption pattern of this factor, was beyond the scope of the present work.
The proportional hazards assumption was satisfied and evaluated via the inclusion into the disease model of interaction terms between dietary fiber and the time variable, t ijk (data not shown). Statistical tests were two-sided. Multivariate linear regression calibration and piecewise exponential proportional hazards models were fitted using MLwiN [27] and Stata [28]. The syntax to carry out the different steps of the analysis is reported in S3 Appendix.

Results
The center-specific frequencies of CRC cases, means and standard deviations of dietary fiber intake (g/day) using DQ measurements and after linear calibration are reported in Tables 1  and 2, for women and men respectively. In women, DQ dietary fiber means ranged from 18.7 g/day in Malmö to 27.0 g/day observed in Naples, and their ICC values were equal to 0.08 and 0.29 before and after linear calibration. In men, dietary fiber means went from 21.1 g/day in Malmo to values exceeding 30 g/day in the center of Murcia. ICC values were 0.12 and 0.30,  before and after linear calibration. As previously observed and discussed in the EPIC study [29], linear regression calibration reduces the variability of the within-center component, thus resulting in larger ICC values. Means and frequencies of dietary and lifestyle variables at the aggregate level are reported in S1 Table and S2 Table. ICC values ranged from 8% (weight and the percentage of smokers) to 26% (height) in women, and from 3% (weight and the percentage of physically active subjects) to 25% (red meat intake) in men. The large ICC values observed for height (0.25 and 0.24, in women and men respectively) reflects the diversity of the EPIC populations.

South coast of France
The association between dietary fiber intake and CRC risk quantified by rate ratios (RR) estimated in a model using individual-level variables only (model (2) in S2 Appendix) was 0.90 (95% confidence interval (CI): 0.85, 0. 96) for a 10 g/d increment, as reported in Table 3. In a random effects model with dietary fiber and confounding factors included at the individual and aggregate levels (model (3)), RR were 0.90 (0.85, 0.96) and 0.85 (0.64, 1.14), for individual and aggregate levels respectively.
In models with a between-center random slope (model (4) in S2 Appendix, S3 Table), the fixed component of the coefficient was very similar to the RR estimate in model (3). The variance associated to the random component was 0.003 (SE = 0.004, p = 0.5). In men, the correlation between the intercept and the slope residuals was estimated as 0.30, suggesting that the protective association between fiber consumption and CRC was stronger in centers with elevated CRC incidence rates. Random slope models can be easily extended to test whether aggregate level variables explain some of the between-centers heterogeneity in the fiber/CRC association (model (4)). This can be achieved by making the random slope coefficient dependent on aggregate level variables in a random slope model [4]. However, in our study, there was limited heterogeneity across centers in the exposure/disease associations.

Discussion
In this work, within-and between-center components of the association between dietary exposure and risk of disease were simultaneously estimated. A multilevel piecewise exponential proportional hazards model was used to complement standard evidence on the relationship between dietary fiber intake and risk of CRC at the individual level [20] with results obtained at the aggregate level. Results showed that dietary fiber intake was consistently negatively associated with the risk of CRC based on both levels of evidence.
It is very natural in the context of a multicenter study like EPIC to exploit and use each level of evidence to evaluate the relationship between dietary exposure and risk of disease. The use of information from the between-center component with a random effect model is based on the assumption that individual and aggregate components of the exposure/disease association are of similar magnitude. In the context of an etiological relationship that involves dietary factors, it is reasonable to assume that individual-and aggregate-level inference should provide consistent evidence, if aggregate-level confounding is adequately controlled. However, the two components can result in rather different estimates, possibly due to uncontrolled (residual) confounding and/or exposure measurement errors [8].
An important source of bias in ecological studies is the lack of information on confounding factors at the aggregate level [6]. In addition, confounding by center, typically arising from an uneven distribution of confounders across centers, was identified as one of the factors responsible of ecological bias or fallacy, that determines a difference between individual and ecologic risk estimates [5-7, 30, 31]. In the context of a multilevel model in a multicenter study, information is available at the aggregate level, thus potentially addressing both aggregate-level and individual-level confounding.
Inclusion of individual-level covariates explained an estimated 22% and 32% of the crosscenter variability in CRC incidence, in women and men respectively (results obtained by comparing the variance components in models (2) and (1) in S2 Appendix), suggesting that some of the observed heterogeneity across center in CRC incidence was explained by individual level variables. The association between alcohol consumption and CRC risk at the aggregate level was consistent with individuallevel associations observed in EPIC [32]. The percentage of subjects with university degrees and the percentage of former and current smokers were, respectively, negatively and positively associated with CRC risk. These variables are relevant determinants of unhealthy patterns of dietary and lifestyle factors [33,34]. Overall, the association of confounders at the aggregate level was stronger than individual level counterparts. A variable expressing study center latitude was included in the random effects models, in an effort to explore potential residual confounding not captured by the variable center, for example exposure to vitamin D.
Our strategy to include potential individual-level confounding factors followed the standard practice of including in the model variables both associated to dietary fiber and the risk of colorectal cancer. The choice of aggregate level confounding followed a very similar strategy. We consistently used the same confounders at the individual and at the aggregate levels. In our study, a general principle of parsimony was used to include confounder factors at the aggregate level, mainly to account for the fact that aggregate level analysis does not carry the same amount of information, in terms of sample size, ie. statistical units. Means of energy from nonfat and non-alcohol sources, and center-specific percentages of former and current smokers displayed weak associations with dietary fiber (results not shown), but were related to CRC risk at the aggregate level. In our working example, individual-level variables were centered on sex-and study centerspecific means to make disease parameters readily interpretable as within-and between-center estimates of associations. As a result, individual-and aggregate-level variables were orthogonal, de facto limiting the possibility of evaluating cross-level confounding in our study. In addition, the fact that individual-level effect estimates for fiber intake were very similar in models (2) (exposure and confounders included as 'overall' variables) and (3) (exposure and confounders included as 'centered' variables) may suggest that the extent of cross-level confounding was limited in our study.
Cross-level interaction can be explored by relating the variability of association across centers in terms of aggregate-level variables. In this way, random slope coefficients that capture the association (between dietary fiber and risk of colorectal cancer in our study) at the individual level are dependent on aggregate-level variables [4]. The lack of heterogeneity across centers of exposure/disease associations in our study suggested there was little such variability to be explained.
In agreement with Sheppard and Prentice [8], the parameter of interest for biological inference, particularly in nutritional epidemiology, is the individual-level association of fiber with colorectal cancer risk (β W ). The overall relationship between exposure and risk of disease was also fitted, in a model with no inclusion of the variable 'center'. Neuhaus and Kalbfleisch pointed out that this model can be biased, if between-and the within-center associations are heterogeneous, with the degree of bias also depending on the degree of clustering of the exposure [35]. In this study, the between-and within-center estimates were of similar magnitude. The overall RR estimate was equal to 0.89 (95%CI: 0.85, 0.94), mostly reflecting the within-center component of the association. After linear calibration, dietary fiber showed a higher degree of clustering (ICC*0.3), and the overall estimate was 0.87 (0.81, 0.94). It was noteworthy that the overall calibrated estimate displayed a gain in efficiency of 26% when compared to calibrated within-center estimates in model (3). This model did not include the possible increase in the standard errors due to clustering of the outcome, which was nevertheless low in our study.
Non-differential exposure misclassification may cause bias in either direction in multilevel models, depending upon the nature of the exposure (binary, or continuous) and the level of measurement, aggregate or individual [36]. DQ measurements are assumed to be affected by systematic misclassification, while the one single replicate of 24-HDR per subject are hypothesised to be relatively accurate but rather imprecise, as a result of random measurement errors [37,38], possibly reflecting systematic misclassification and day-to-day within-person variability randomly distributed between subjects [39]. As a result, in multilevel studies in nutritional epidemiology, the aggregate level relationship between a dietary variable and an outcome may be biased by errors mostly originating from the lack of comparability of different dietary questionnaires across EPIC centers [2]. In addition, the accuracy of DQ measurements may have a geographical component in terms of accuracy. To minimize the effect of measurement errors in DQs (individual level) and to correct the lack of comparability of DQ measurements across countries (aggregate level), a multivariate linear regression calibration model was used [12].
In the multilevel model, both the within-and between-center components of the association between dietary fiber intake and CRC risk showed an inverse association, with no statistically significant difference between the two estimates. The slightly stronger relation observed at the aggregate level may suggest that information derived from between-population sources can be more robust to classical measurement errors than the corresponding individual level analysis [8]. After linear calibration, the within-and between-center components of the association were closer in magnitude. RR estimates for individual-and aggregate-level dietary fiber intake were 0.85 (0.76, 0.94) and 0.87 (0.65, 1.15), respectively.
In this study, the multivariable piecewise exponential proportional hazards random effects model showed that individual-and aggregate-level evidence were consistent in magnitude and direction, particularly after measurement error correction. The EPIC study offers an ideal design to explore the use of advanced statistical techniques, information on disease outcomes and dietary exposure(s) being available at different levels of evidence. In a relatively simple multilevel model these different components can ultimately become very informative.