Estimation of a Common Effect Parameter from Follow-Up Data When There Is No Mechanistic Interaction

In a stratified analysis, the results from different strata are pooled, provided that the homogeneity assumption is met, to obtain a single summary estimate of the common effect parameter. However, an effect can appear homogeneous across strata under one measure but heterogeneous under another. Consequently, two researchers analyzing the same data can arrive at conflicting conclusions if they use different effect measures. In this paper, the author draws on the sufficient component cause model to develop a stratified-analysis method for a particular effect measure, the ‘peril ratio’. When there is no mechanistic interaction between the exposure under study and the stratifying variable (i.e., when they do not work together to complete any sufficient cause), the peril ratio is constant across strata. The author presents formulas for the estimation of such a common peril ratio. Three real data sets are re-analyzed for illustration. When the data is consistent with peril-ratio homogeneity in a stratified analysis, researchers can use the formulas in this paper to pool the strata.


Introduction
A central issue in epidemiology is characterizing the relationship between exposure and disease. As many other factors may confound or modify the effect of the exposure under study, epidemiologists often need to perform an analysis stratified by these confounders/modifiers. On the one hand, if heterogeneity of exposure effects is present (i.e., the effects differ across strata), then we report the stratum-specific estimates separately. On the other hand, if the data is consistent with homogeneity, then we pool the results from different strata to obtain a single summary estimate for the common effect parameter of the exposure [1].
However, the effect of an exposure can be measured either on a ratio scale, e.g., risk ratio, odds ratio and rate ratio, or on a difference scale, e.g., risk difference, odds difference and rate difference [1]. No single scale is better than the others, and none is universally endorsed. Worse, an effect can appear homogeneous across strata when using one measure and heterogeneous when using another. Consequently, two researchers analyzing the same data can arrive at conflicting conclusions if they use different measures, which is certainly undesirable.
The sufficient component cause model [1-13] can help to resolve this conflict. A sufficient cause is a combination of component causes. There may be many classes of sufficient causes for a disease, and any class with all of its components completed is sufficient to cause the disease. When there is no mechanistic interaction between the exposure under study and the stratifying variable (i.e., when they do not work together to complete any sufficient cause), a particular effect measure, the 'peril ratio', will be constant across strata [13]. In this paper, I present formulas for the estimation of such a common peril ratio. Three real data sets are re-analyzed for illustration.
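The constancy claim can be illustrated with a small numerical sketch. It assumes, beyond what is stated here, that the classes of sufficient causes are completed independently of one another, so that the disease-free probability is the product of the probabilities that each class stays incomplete. The three classes, their completion probabilities and the function names are hypothetical.

```python
# Hypothetical completion probabilities for three classes of sufficient causes.
# Assumption: classes act independently, so disease-free probabilities multiply.
P_BACKGROUND = 0.05      # class needing neither exposure E nor stratifier Z
P_NEEDS_E = 0.10         # class completed only when E is present
P_NEEDS_Z = 0.20         # class completed only when Z is present


def disease_free(e: int, z: int, p_needs_both: float = 0.0) -> float:
    """Probability that no sufficient cause is completed, given E=e and Z=z."""
    return ((1 - P_BACKGROUND)
            * (1 - P_NEEDS_E * e)
            * (1 - P_NEEDS_Z * z)
            * (1 - p_needs_both * e * z))  # class needing E and Z together


def peril_ratio(z: int, p_needs_both: float = 0.0) -> float:
    """Fold decrease in disease-free probability, unexposed versus exposed."""
    return disease_free(0, z, p_needs_both) / disease_free(1, z, p_needs_both)


# No class needs E and Z together: the peril ratio is the same in both strata.
no_interaction = [peril_ratio(z) for z in (0, 1)]
# A class needing both E and Z makes the peril ratio differ across strata.
with_interaction = [peril_ratio(z, p_needs_both=0.15) for z in (0, 1)]
print(no_interaction, with_interaction)
```

Under these assumptions the factors involving Z cancel in the ratio, which is why the first pair of values coincides while the second does not.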

Methods
Consider a dichotomous exposure and a dichotomous disease outcome in a population followed over a given time interval. The exposure status is assumed to be time-invariant, and the follow-up is assumed to be free of loss to follow-up and competing death. A stratified analysis is to be performed on a stratification variable with a total of $K$ strata ($i = 1, 2, \ldots, K$). Table 1 presents the data layout for the $i$th stratum.
The peril ratio (PR) for the $i$th stratum is defined as
$$PR_i = \frac{1 - \mathrm{Risk}_{\bar{E}i}}{1 - \mathrm{Risk}_{Ei}},$$
where $\mathrm{Risk}_{\bar{E}i}$ and $\mathrm{Risk}_{Ei}$ are the disease risks of the unexposed and the exposed in the $i$th stratum, respectively, so that the numerator and the denominator are the corresponding disease-free probabilities. Peril ratios are to be interpreted as 'fold decreases' [13]. Assuming that no residual confounding exists, $\widehat{PR}_i$ is the fold decrease in the disease-free probability for a subject in the $i$th stratum if he/she changes status from being unexposed to being exposed. Under the assumption that there is no mechanistic interaction between the exposure under study and the stratifying variable, the peril ratios are constant across strata [13]. This common peril ratio is the fold decrease in disease-free probability for anyone, regardless of the stratum, whose status changes from being unexposed to being exposed.
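A minimal sketch of the stratum-specific calculation follows. The counts are hypothetical, the function name is illustrative, and the variance of the log peril ratio is an assumption: the usual binomial delta-method variance of a log proportion, summed over the two independent exposure groups.

```python
import math


def peril_ratio(nondis_unexp, n_unexp, nondis_exp, n_exp, z=1.96):
    """Peril ratio for one stratum with a Wald interval on the log scale."""
    q_unexp = nondis_unexp / n_unexp      # disease-free proportion, unexposed
    q_exp = nondis_exp / n_exp            # disease-free proportion, exposed
    pr = q_unexp / q_exp
    # Assumed binomial (delta-method) variance of log(q) is (1 - q) / (n * q).
    var_log = (1 - q_unexp) / (n_unexp * q_unexp) + (1 - q_exp) / (n_exp * q_exp)
    half = z * math.sqrt(var_log)
    return pr, var_log, (pr * math.exp(-half), pr * math.exp(half))


# Hypothetical stratum: 90 of 100 unexposed and 80 of 100 exposed stay disease-free.
pr, var_log, ci = peril_ratio(90, 100, 80, 100)
print(round(pr, 4))  # 1.125
```

Here the disease-free probability of the unexposed is 1.125 times that of the exposed, i.e., a 1.125-fold decrease on becoming exposed.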
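To estimate the common peril ratio, one can use the inverse-variance weighted Woolf-type estimator [14]:
$$\log \widehat{PR}_W = \frac{\sum_{i=1}^{K} w_i^W \log \widehat{PR}_i}{\sum_{i=1}^{K} w_i^W},$$
where $w_i^W = \mathrm{Var}^{-1}\left(\log \widehat{PR}_i\right)$. The variance of the logarithm of this pooled estimate (under the large-stratum limiting model) is [14]
$$\mathrm{Var}\left(\log \widehat{PR}_W\right) = \frac{1}{\sum_{i=1}^{K} w_i^W}.$$
Alternatively, one can use the Mantel-Haenszel estimator [15]:
$$\widehat{PR}_{MH} = \frac{\sum_{i=1}^{K} w_i^{MH}\, \hat{q}_{\bar{E}i}}{\sum_{i=1}^{K} w_i^{MH}\, \hat{q}_{Ei}},$$
where $\hat{q}_{\bar{E}i}$ and $\hat{q}_{Ei}$ are the observed disease-free proportions (one minus the observed risk) of the unexposed and the exposed in the $i$th stratum, and
$$w_i^{MH} = \frac{N_{\bar{E}i} N_{Ei}}{N_{\bar{E}i} + N_{Ei}},$$
the Mantel-Haenszel weight, is the harmonic sum of the unexposed ($N_{\bar{E}i}$) and the exposed ($N_{Ei}$) population for the $i$th stratum. A variance formula for the Mantel-Haenszel estimator, which is valid under both the large-stratum and sparse-data limiting models, is [15]
$$\mathrm{Var}\left(\log \widehat{PR}_{MH}\right) = \frac{\sum_{i=1}^{K} w_i^{MH} \left[ \dfrac{N_{\bar{E}i}\hat{q}_{\bar{E}i} + N_{Ei}\hat{q}_{Ei}}{N_{\bar{E}i} + N_{Ei}} - \hat{q}_{\bar{E}i}\hat{q}_{Ei} \right]}{\left( \sum_{i=1}^{K} w_i^{MH} \hat{q}_{\bar{E}i} \right)\left( \sum_{i=1}^{K} w_i^{MH} \hat{q}_{Ei} \right)}.$$

In the above, we have invoked the assumption of peril-ratio homogeneity (equivalently, the assumption of no mechanistic interaction). In practice, this assumption needs to be checked using the data on hand. Here, I extend the PRISM (peril ratio index of synergy based on multiplicativity) test used in a previous paper [13] in order to deal with the present situation of $K \ge 2$. First, we calculate $\hat{\mathbf{d}}$, a $(K-1) \times 1$ column vector of the estimates of the logPRISMs with its $i$th element ($1 \le i < K$) being
$$\hat{d}_i = \log \widehat{PR}_i - \log \widehat{PR}_K,$$
and $\hat{\mathbf{V}}$, its $(K-1) \times (K-1)$ variance-covariance matrix, with
$$\hat{V}_{ii} = \mathrm{Var}\left(\log \widehat{PR}_i\right) + \mathrm{Var}\left(\log \widehat{PR}_K\right) \quad \text{and} \quad \hat{V}_{ij} = \mathrm{Var}\left(\log \widehat{PR}_K\right)$$
being its $i$th diagonal element and its $i$th row and $j$th column ($1 \le i \ne j < K$) off-diagonal element, respectively. Next, we calculate the following heterogeneity statistic (Het):
$$\mathrm{Het} = \hat{\mathbf{d}}^{\mathsf{T}} \hat{\mathbf{V}}^{-1} \hat{\mathbf{d}}.$$
Asymptotically (large-stratum limiting model), Het is distributed as a chi-square distribution with $K-1$ degrees of freedom (df) under the null hypothesis of peril-ratio homogeneity (no mechanistic interaction).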
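The pooling and heterogeneity steps can be sketched as follows, under several stated assumptions. The stratum counts are assumed values for the tolbutamide trial of the first example (they are not listed in this text). The variance of each log peril ratio is taken to be the binomial delta-method variance. Het is computed as the weighted sum of squares about the Woolf estimate, which is assumed equivalent to the quadratic form in the logPRISM contrasts. The Mantel-Haenszel variance is a Greenland-Robins-type formula applied to the non-diseased counts. All function names are illustrative.

```python
import math

# Each stratum: (non-diseased unexposed, N unexposed, non-diseased exposed, N exposed).
# Assumed counts for the tolbutamide trial (placebo = unexposed); not listed in this text.
STRATA = [(115, 120, 98, 106), (69, 85, 76, 98)]


def log_pr_and_var(c0, n0, c1, n1):
    """log peril ratio and its assumed binomial delta-method variance."""
    q0, q1 = c0 / n0, c1 / n1
    return math.log(q0 / q1), (1 - q0) / (n0 * q0) + (1 - q1) / (n1 * q1)


def woolf(strata):
    """Inverse-variance pooled peril ratio, Var(log PR_W), and Het."""
    stats = [log_pr_and_var(*s) for s in strata]
    weights = [1 / v for _, v in stats]
    log_pooled = sum(w * lp for w, (lp, _) in zip(weights, stats)) / sum(weights)
    # Weighted sum of squares about the pooled value; assumed equal to the
    # quadratic form in the logPRISM contrasts (K - 1 degrees of freedom).
    het = sum(w * (lp - log_pooled) ** 2 for w, (lp, _) in zip(weights, stats))
    return math.exp(log_pooled), 1 / sum(weights), het


def mantel_haenszel(strata):
    """Mantel-Haenszel pooled peril ratio and a Greenland-Robins-type variance."""
    num = den = var_num = 0.0
    for c0, n0, c1, n1 in strata:
        n = n0 + n1
        num += c0 * n1 / n            # w_MH * disease-free proportion, unexposed
        den += c1 * n0 / n            # w_MH * disease-free proportion, exposed
        var_num += n0 * n1 * (c0 + c1) / n ** 2 - c0 * c1 / n
    return num / den, var_num / (num * den)


def ci95(pr, var_log):
    """Wald 95% interval computed on the log scale."""
    half = 1.96 * math.sqrt(var_log)
    return pr * math.exp(-half), pr * math.exp(half)


pr_w, var_w, het = woolf(STRATA)
pr_mh, var_mh = mantel_haenszel(STRATA)
print(pr_w, ci95(pr_w, var_w), pr_mh, ci95(pr_mh, var_mh), het)
```

With the assumed counts, the sketch agrees to rounding with the pooled estimates, intervals and Het value reported for the first example, which lends some support to the assumed variance forms.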

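Table 1. Data layout for the $i$th stratum (columns: Non-diseased, Diseased, Total; rows: Unexposed, Exposed).

If a summary effect measure is the desired end but the assumption of peril-ratio homogeneity fails, then one can resort to standardization techniques [1] in order to pool the strata. In general, the resulting standardized effect measures have larger variances. Using the total population as the standard, the standardized peril ratio is calculated as
$$\widehat{PR}_S = \frac{1 - \widehat{\mathrm{Risk}}(\text{Set } \bar{E})}{1 - \widehat{\mathrm{Risk}}(\text{Set } E)} = \frac{\sum_{i=1}^{K} w_i^S\, \hat{q}_{\bar{E}i}}{\sum_{i=1}^{K} w_i^S\, \hat{q}_{Ei}},$$
where the 'Set X' operator dictates that the exposure status of each and every subject in the population is set to X, $\hat{q}_{\bar{E}i}$ and $\hat{q}_{Ei}$ are the observed disease-free proportions of the unexposed and the exposed in the $i$th stratum, and the weight, $w_i^S = N_{\bar{E}i} + N_{Ei}$, is the population size (the arithmetic sum of the unexposed and the exposed, cf. the harmonic sum in the Mantel-Haenszel weight) for the $i$th stratum. The variance of $\log \widehat{PR}_S$ under the large-stratum limiting model is
$$\mathrm{Var}\left(\log \widehat{PR}_S\right) \approx \frac{\sum_{i=1}^{K} \left(w_i^S\right)^2 \mathrm{Var}\left(\hat{q}_{\bar{E}i}\right)}{\left(\sum_{i=1}^{K} w_i^S\, \hat{q}_{\bar{E}i}\right)^2} + \frac{\sum_{i=1}^{K} \left(w_i^S\right)^2 \mathrm{Var}\left(\hat{q}_{Ei}\right)}{\left(\sum_{i=1}^{K} w_i^S\, \hat{q}_{Ei}\right)^2}.$$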
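The standardization step can be sketched as below. The data are hypothetical, the function name is illustrative, and the variance is a delta-method approximation that assumes binomial sampling within each exposure group of each stratum.

```python
def standardized_pr(strata):
    """Standardized peril ratio (total population as standard) and Var(log)."""
    s0 = s1 = v0 = v1 = 0.0
    for c0, n0, c1, n1 in strata:
        w = n0 + n1                        # stratum population size
        q0, q1 = c0 / n0, c1 / n1          # disease-free proportions
        s0 += w * q0                       # expected disease-free if all unexposed
        s1 += w * q1                       # expected disease-free if all exposed
        v0 += w ** 2 * q0 * (1 - q0) / n0  # assumed binomial variance of q0
        v1 += w ** 2 * q1 * (1 - q1) / n1  # assumed binomial variance of q1
    return s0 / s1, v0 / s0 ** 2 + v1 / s1 ** 2


# Hypothetical two-stratum data:
# (non-diseased unexposed, N unexposed, non-diseased exposed, N exposed)
strata = [(90, 100, 40, 50), (30, 50, 50, 100)]
pr_s, var_log = standardized_pr(strata)
print(pr_s, var_log)
```

With a single stratum the function reduces to the stratum-specific peril ratio and its log-scale variance, which is a quick sanity check on the weighting.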

Mortality Data from All causes for Tolbutamide and Placebo Treatment Groups
The first example consists of randomized, controlled trial data comparing all-cause mortality between tolbutamide treatment and placebo groups, taken from Table 15-1 in the textbook Modern Epidemiology [1]. The stratifying variable is age (two strata: age <55 and age 55+). Table 2 presents the peril ratios and their 95% confidence intervals (CIs) for the two strata. The heterogeneity statistic is calculated as $\mathrm{Het} = 0.0140$. This is to be referred to a chi-square distribution with $\mathrm{df} = K - 1 = 2 - 1 = 1$, and the p-value is 0.9057.
Because the data is consistent with peril-ratio homogeneity (no mechanistic interaction between treatment and age), I pooled the two strata in order to obtain a common peril ratio: $\widehat{PR}_W = 1.0383$ (95% CI: 0.9776, 1.1027) using Woolf's method, or $\widehat{PR}_{MH} = 1.0407$ (95% CI: 0.9688, 1.1179) using the Mantel-Haenszel method. This implies an approximately 4% reduction in survival for anyone, young or old, who chooses to take tolbutamide (though it is not significant, judging from the 95% CIs that cover the no-effect peril ratio of one). For this example, age is not an important confounder; the common peril ratio and the crude peril ratio (1.0523) differ very little. This is no surprise as the data is drawn from a randomized controlled trial.
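Referring Het to a chi-square distribution can be done with the standard library alone. The sketch below uses the closed-form upper-tail probability for integer degrees of freedom; the function name is illustrative. Because the input is the rounded value 0.0140, the result matches the reported p-value only to about three decimal places.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df."""
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{j < df/2} y^j / j!
        return math.exp(-y) * sum(y ** j / math.factorial(j) for j in range(df // 2))
    # Odd df: erfc(sqrt(y)) plus a finite correction series.
    tail = math.erfc(math.sqrt(y))
    tail += math.exp(-y) * sum(
        y ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, (df - 1) // 2 + 1)
    )
    return tail


p_value = chi2_sf(0.0140, df=1)
print(p_value)
```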
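The crude peril ratio quoted above comes from collapsing the strata into a single 2 x 2 table. In the sketch below the stratum counts are assumed values for the tolbutamide trial (they are not listed in this text).

```python
# Assumed counts for the tolbutamide trial (not listed in this text):
# (survivors on placebo, N placebo, survivors on tolbutamide, N tolbutamide)
STRATA = [(115, 120, 98, 106), (69, 85, 76, 98)]

# Collapse the strata into a single 2 x 2 table by summing each cell.
c0, n0, c1, n1 = (sum(col) for col in zip(*STRATA))
crude_pr = (c0 / n0) / (c1 / n1)
print(round(crude_pr, 4))  # 1.0523
```

The crude value sits close to both pooled estimates, consistent with age not being an important confounder here.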

Coronary Heart Disease Occurrence Data for Personality Type A and B Persons
The second example consists of cohort data comparing the occurrence of coronary heart disease (CHD) between personality type A and B persons, taken from Table 7-24 in the textbook Statistical Analysis of Epidemiologic Data [16]. The stratifying variable is age (a total of five strata). Table 3 presents the peril ratios and their 95% CIs for the five strata.
Table 3. Re-analysis of the coronary heart disease (CHD) occurrence data for personality type A and B persons.
Again, the data is consistent with peril-ratio homogeneity ($\mathrm{Het} = 6.7357$ and p-value = 0.1505, based on a chi-square distribution with $\mathrm{df} = K - 1 = 5 - 1 = 4$). I then pooled the five strata in order to obtain a common peril ratio: $\widehat{PR}_W = 1.0538$ (95% CI: 1.0335, 1.0744) using Woolf's method, or $\widehat{PR}_{MH} = 1.0622$ (95% CI: 1.0407, 1.0842) using the Mantel-Haenszel method. This implies a 5-6% reduction in CHD-free probability for a type A person when compared with a type B person of the same age. This reduction is significant, judging from the 95% CIs that do not cover the no-effect peril ratio of one.
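One way to translate a peril ratio into a percent reduction is sketched below. It assumes the reduction is measured relative to the disease-free probability of the unexposed, i.e., 1 - 1/PR; the function name is illustrative.

```python
def percent_reduction(pr: float) -> float:
    """Percent reduction in disease-free probability implied by a peril ratio."""
    return 100 * (1 - 1 / pr)


for label, pr in [("Woolf", 1.0538), ("Mantel-Haenszel", 1.0622)]:
    print(label, round(percent_reduction(pr), 1))
```

For the two pooled estimates this gives roughly 5.1% and 5.9%, in line with the 5-6% range quoted in the text.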

Subsequent Cocaine Use Data for Early Marijuana Users and Non-Users
The final example consists of twin follow-up data for subsequent cocaine use comparing exposed twin members (early marijuana users) with their unexposed co-twins (non marijuana users), and is taken from the paper of Cummings and McKnight [17]. Treating each twin pair (a total of 311 pairs) as one separate stratum, the data can be presented in a total of 311 'tables' (see Table 4).
This matched-pair data is in accord with the sparse-data limiting model; there are only two subjects in each stratum but the total number of strata is large. Therefore, I applied the Mantel-Haenszel method in order to pool the strata: $\widehat{PR}_{MH} = 1.4136$ (95% CI: 1.2711, 1.5720). (Both Woolf's method and the standardization method rely on the large-stratum limiting model and are not applicable to this example. The heterogeneity test also relies on the large-stratum limiting model. Therefore, in this example, the assumption of no mechanistic interaction has to be invoked if the strata are to be pooled, but the assumption itself is not amenable to testing.) This implies a significant, approximately 1.4-fold decrease in cocaine-naive probability in later years for an early marijuana user when compared with his/her non-marijuana-using co-twin. Ignoring the paired structure of the data, the crude peril ratio for this example is the same as $\widehat{PR}_{MH}$ but the variance is larger [$\mathrm{Var}(\log \widehat{PR}_{crude}) = 0.0041 > \mathrm{Var}(\log \widehat{PR}_{MH}) = 0.0029$].
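Why the Mantel-Haenszel and crude peril ratios coincide for one-to-one matched pairs can be seen in a short sketch. Every pair has the same Mantel-Haenszel weight of one half, so the weights cancel. The pair counts below are hypothetical (the twin data are not reproduced in this text), the function names are illustrative, and the Mantel-Haenszel variance is an assumed Greenland-Robins-type formula applied to non-diseased counts.

```python
# Hypothetical matched-pair data: (unexposed twin disease-free?, exposed twin
# disease-free?) coded 1/0, with the number of pairs showing each pattern.
PAIR_COUNTS = {(1, 1): 50, (1, 0): 30, (0, 1): 10, (0, 0): 10}


def mh_matched_pairs(pair_counts):
    """Mantel-Haenszel peril ratio and variance when each stratum is one pair."""
    num = den = var_num = 0.0
    for (c0, c1), n_pairs in pair_counts.items():
        # One unexposed and one exposed subject per pair: weight = 1*1/(1+1) = 0.5.
        num += n_pairs * 0.5 * c0
        den += n_pairs * 0.5 * c1
        # Assumed variance term; it is nonzero only for discordant pairs.
        var_num += n_pairs * ((c0 + c1) / 4 - c0 * c1 / 2)
    return num / den, var_num / (num * den)


def crude(pair_counts):
    """Crude peril ratio and binomial log-scale variance, ignoring the pairing."""
    n = sum(pair_counts.values())
    c0 = sum(k * a for (a, _), k in pair_counts.items())
    c1 = sum(k * b for (_, b), k in pair_counts.items())
    return (c0 / n) / (c1 / n), (1 / c0 - 1 / n) + (1 / c1 - 1 / n)


pr_mh, var_mh = mh_matched_pairs(PAIR_COUNTS)
pr_crude, var_crude = crude(PAIR_COUNTS)
print(pr_mh, pr_crude, var_mh, var_crude)
```

In this hypothetical data set the outcomes within a pair are positively associated, and the paired variance is smaller than the crude one, mirroring the pattern reported for the twin data.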

Discussion
Estimating the disease-free/survival probabilities and their variances is an important first step in a stratified analysis of peril ratios. If censoring (loss to follow-up and competing death) occurs in a follow-up study, then the disease-free/survival probabilities can be estimated using the Kaplan-Meier method, and their variances can be estimated using Greenwood's method [1]. Subsequently, one can proceed to use Woolf's method in order to obtain the common peril ratio and its variance (under the large-stratum limiting model) as described in this paper. However, a Mantel-Haenszel-type estimator and its variance for censored data, valid under both the large-stratum and sparse-data limiting models, await further study.
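The censored-data route can be sketched as follows: a Kaplan-Meier estimate of the disease-free probability at a fixed horizon, with Greenwood's variance expressed on the log scale so that it can feed a Woolf weight. The follow-up times are hypothetical and the function name is illustrative.

```python
def kaplan_meier(times, events, horizon):
    """Kaplan-Meier disease-free probability at `horizon` and Greenwood's
    variance of its logarithm. events[i] is 1 for disease, 0 for censoring.
    Assumes at least one subject remains at risk after each event time."""
    at_risk = len(times)
    surv, var_log = 1.0, 0.0
    for t in sorted(set(times)):
        if t > horizon:
            break
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        removed = sum(1 for ti in times if ti == t)
        if d > 0:
            surv *= 1 - d / at_risk
            var_log += d / (at_risk * (at_risk - d))   # Greenwood term
        at_risk -= removed
    return surv, var_log


# Hypothetical follow-up times for one stratum (event = 1, censored = 0).
s0, v0 = kaplan_meier([2, 3, 5, 5, 5], [1, 0, 0, 0, 0], horizon=5)  # unexposed
s1, v1 = kaplan_meier([1, 2, 4, 5, 5], [1, 1, 0, 0, 0], horizon=5)  # exposed
pr_i = s0 / s1              # stratum-specific peril ratio
woolf_weight = 1 / (v0 + v1)  # inverse variance of its logarithm
print(pr_i, woolf_weight)
```

The two exposure groups are treated as independent, so the variance of the log peril ratio is the sum of the two Greenwood terms.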
Under the rare-disease assumption that risk of disease is exceedingly low, a log peril ratio can be approximated by the difference between two risks or two odds [13]. For a follow-up study of a rare disease, one therefore has the option of applying existing stratified-analysis techniques in order to estimate the common risk difference [1]. For a case-control study of a rare disease, one should look for a common odds difference, when there is no mechanistic interaction between the exposure under study and the stratifying variable. Further studies are warranted in order to develop stratified-analysis methods regarding odds differences.
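The approximation can be made explicit with a first-order expansion. The symbols $R_{\bar{E}}$ and $R_E$ (risks of the unexposed and the exposed) and $O_{\bar{E}}$ and $O_E$ (the corresponding odds) are introduced here only for illustration.

```latex
\begin{align*}
\log PR &= \log\left(1 - R_{\bar{E}}\right) - \log\left(1 - R_{E}\right) \\
        &= \left(R_{E} - R_{\bar{E}}\right)
           + \tfrac{1}{2}\left(R_{E}^{2} - R_{\bar{E}}^{2}\right) + \cdots
         \;\approx\; R_{E} - R_{\bar{E}}, \\
O_{E} - O_{\bar{E}} &= \frac{R_{E}}{1 - R_{E}} - \frac{R_{\bar{E}}}{1 - R_{\bar{E}}}
         = \left(R_{E} - R_{\bar{E}}\right)
           + \left(R_{E}^{2} - R_{\bar{E}}^{2}\right) + \cdots
         \;\approx\; R_{E} - R_{\bar{E}}.
\end{align*}
```

Both second-order terms are negligible when the risks are very small, so the log peril ratio, the risk difference and the odds difference agree to first order.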

Author Contributions
Conceived and designed the experiments: WCL. Performed the experiments: WCL. Analyzed the data: WCL. Contributed reagents/materials/analysis tools: WCL. Wrote the paper: WCL.

Table 4. Re-analysis of the subsequent cocaine use data for exposed twin members (early marijuana users) and their unexposed co-twins (non marijuana users).