
## Abstract

Epidemiologists often use ratio-type indices (rate ratio, risk ratio and odds ratio) to quantify the association between exposure and disease. By comparison, less attention has been paid to effect measures on a difference scale (excess rate or excess risk). The excess relative risk (ERR) used primarily by radiation epidemiologists is of peculiar interest here, in that it involves both difference and ratio operations. The ERR index (but not the difference-type indices) is estimable in case-control studies. Using the theory of sufficient component cause model, the author shows that when there is no mechanistic interaction (no synergism in the sufficient cause sense) between the exposure under study and the stratifying variable, the ERR index (but not the ratio-type indices) in a rare-disease case-control setting should remain constant across strata and can therefore be regarded as a common effect parameter. By exploiting this homogeneity property, the related attributable fraction indices can also be estimated with greater precision. The author demonstrates the methodology (SAS codes provided) using a case-control dataset, and shows that ERR preserves the logical properties of the ratio-type indices. In light of the many desirable properties of the ERR index, the author advocates its use as an effect measure in case-control studies of rare diseases.

**Citation:** Lee W-C (2015) Excess Relative Risk as an Effect Measure in Case-Control Studies of Rare Diseases. PLoS ONE 10(4): e0121141. https://doi.org/10.1371/journal.pone.0121141

**Academic Editor:** Massimo Pietropaolo, University of Michigan Medical School, UNITED STATES

**Received:** July 4, 2014; **Accepted:** February 10, 2015; **Published:** April 28, 2015

**Copyright:** © 2015 Wen-Chung Lee. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability:** All relevant data are within the paper and its Supporting Information files.

**Funding:** This paper is partly supported by grants from the Ministry of Science and Technology, Taiwan (NSC 102-2628-B-002-036-MY3) and the National Taiwan University, Taiwan (NTU-CESRP-102R7622-8). No additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests:** The author has declared that no competing interests exist.

## Introduction

Quantifying the association between exposure and disease is a central issue in epidemiology [1]. To this end, epidemiologists often resort to ratio-type indices, such as rate ratio or risk ratio. Under the rare-disease assumption, these indices can be approximated by the odds ratio (also a ratio-type index), which can be conveniently estimated in (cumulative) case-control studies (where controls are selected from noncases at the end of a follow-up period) [1,2]. By comparison, less attention has been paid to effect measures on a difference scale, such as excess rate or excess risk.

In an authoritative textbook, Breslow and Day [3] cited several examples from the literature of cancer epidemiology demonstrating that the ratio-type indices (but not the difference-type) provide a stable measure of association in a wide variety of human populations. In addition to the empirical justifications, they also highlighted the logical properties of the ratio-type indices, stating that they (but not the difference-type indices) are “useful for appraising the extent to which the observed association may be explained by the presence of another agent, or may be specific to a particular disease entity”. For these reasons and their estimability in case-control studies, they are most epidemiologists’ indices of choice when it comes to quantifying exposure-disease associations.

The ‘excess relative risk’ (ERR) used primarily by radiation epidemiologists [4,5] is of peculiar interest here. Compared to the aforementioned indices, ERR is unique in that it involves both difference and ratio operations; it is the excess risk per unit of exposure (difference operation here) divided by the background risk (ratio operation here). ERRs correspond to the beta parameters of a ‘linear relative risk model’ [1,6,7]. For example with two exposures, the model is Risk = exp(*α*) × (1 + *β*_{1}*x*_{1} + *β*_{2}*x*_{2}), and the ERRs are *β*_{1}, for the first exposure, and *β*_{2}, for the second exposure, respectively. The model implies linear dose-response trends and additivity of effects due to different exposures (the addition operations). The model also separates a risk into two parts (the multiplication operation): the background risk, exp(*α*), and the relative risk function, (1 + *β*_{1}*x*_{1} + *β*_{2}*x*_{2}). Under the rare-disease assumption, the model becomes a ‘linear odds ratio model’: Odds = exp(*α*) × (1 + *β*_{1}*x*_{1} + *β*_{2}*x*_{2}) [1]. Therefore, the beta parameters (that is, the ERRs) are estimable from case-control data. (The ERR index should not be confused with the ‘relative excess risk’ (RER), an index previously proposed by Suissa [8]. RER is a *comparative* effect measure specifically designed to compare the effects of two different exposures (or treatments). In the linear relative risk or linear odds ratio model, RER = *β*_{1}/*β*_{2}. As this paper focuses on effect measures *proper*, we will not discuss the RER index any further.)
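As a minimal sketch of the linear relative risk model described above, the following Python snippet evaluates Risk = exp(*α*) × (1 + *β*_{1}*x*_{1} + *β*_{2}*x*_{2}) and recovers the ERR of the first exposure as the relative risk minus 1. The function name and all numerical values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def linear_relative_risk(alpha, betas, exposures):
    """Risk = exp(alpha) * (1 + sum_j beta_j * x_j), the linear relative risk model."""
    if len(betas) != len(exposures):
        raise ValueError("betas and exposures must have the same length")
    background = math.exp(alpha)                       # background risk
    relative_risk = 1.0 + sum(b * x for b, x in zip(betas, exposures))
    return background * relative_risk

# Hypothetical parameter values (assumptions for illustration only).
alpha = math.log(0.0005)   # background risk of 0.0005
betas = [0.5, 0.2]         # ERRs for the first and second exposure

risk_none = linear_relative_risk(alpha, betas, [0, 0])
risk_first = linear_relative_risk(alpha, betas, [1, 0])
risk_both = linear_relative_risk(alpha, betas, [1, 1])

# The ERR of the first exposure is the relative risk minus 1.
err_first = risk_first / risk_none - 1.0
print(err_first, risk_both / risk_none)
```

With both exposures present, the relative risk function is 1 + *β*_{1} + *β*_{2}, which illustrates the additivity of effects implied by the model.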

Although the ERR index appears to be a sensible effect measure for a radiation exposure, it has received little attention outside the field of radiation epidemiology. The purpose of this study is to promote its general use. Using the theory of sufficient component cause model [9–11], I will show that when there is no mechanistic interaction (no synergism in the sufficient cause sense) between the exposure under study and the stratifying variable, the ERR index in a rare-disease case-control study should remain constant across strata and can therefore be regarded as a common effect parameter. (This is true whether the exposure of interest is a radiation agent or not.) As a bonus, I will show that by exploiting this homogeneity property, the related attributable fraction indices can be estimated with greater precision. I will demonstrate the methodology using a case-control dataset. I will also show that ERR preserves the logical properties of the ratio-type indices.

## Methods

### Common Effect Parameters

Consider the relation between an exposure and a disease in a population. The exposure is binary (*E* = 1 indicating that a subject is exposed and *E* = 2 otherwise, with an exposure prevalence of *p*), and the disease is assumed to be rare (with a very low disease rate). A stratified analysis is to be performed based on a stratification variable *S* with a total of *L* strata. Let Peril_{s,e} denote the ‘peril’, and Risk_{s,e}, the disease risk, within a fixed duration of follow-up for subjects with *S* = *s* and *E* = *e* in the study population for *s* = 1,2,…,*L* and *e* = 1,2. [A peril is the reciprocal of a risk complement, that is, Peril_{s,e} = (1 − Risk_{s,e})^{−1}; see references 12 and 13.] Let Risk_{E = 1} (Risk_{E = 2}) be the marginal disease risk (collapsed over *S*) for the exposed (unexposed) population, with a marginal relative risk (mRR) of

$$\mathrm{mRR} = \frac{\mathrm{Risk}_{E=1}}{\mathrm{Risk}_{E=2}}.$$

Most epidemiologists are familiar with the concept of *multiplicative* interactions. If there is no multiplicative interaction between the exposure under study and the stratifying variable, we have $\mathrm{Risk}_{s,1}/\mathrm{Risk}_{s,2} = \text{constant}$, for *s* = 1,2,…,*L*. However, Lee [12, 13] showed that when there is no *mechanistic* interaction between the exposure under study and the stratifying variable, it would be the *peril* ratios (instead of the *risk* ratios) that are constant across strata for a fixed duration of follow-up, that is, $\mathrm{Peril}_{s,1}/\mathrm{Peril}_{s,2} = \text{constant}$, for *s* = 1,2,…,*L*.

In this paper, we impose the rare-disease assumption. A constant peril ratio therefore implies a constant excess risk (ER), that is, $\mathrm{ER} = \mathrm{Risk}_{s,1} - \mathrm{Risk}_{s,2} \approx \log \mathrm{Peril}_{s,1} - \log \mathrm{Peril}_{s,2} = \log(\mathrm{Peril}_{s,1}/\mathrm{Peril}_{s,2})$ is also a constant. The approximation, logPeril = −log(1 − Risk) ≈ Risk, has a relative bias of about 0.05% or less for risks below 0.001, which is the setting for studies on cancers, coronary heart diseases, etc. For more common diseases, such as hypertension or type 2 diabetes, the approximation breaks down and the method proposed here would be inapplicable.
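The size of the rare-disease approximation error can be checked numerically. The sketch below computes the relative bias of replacing logPeril = −log(1 − Risk) by Risk for a few risk levels; the helper names and the chosen risk values are illustrative assumptions, not part of the paper.

```python
import math

def log_peril(risk):
    """log Peril = -log(1 - Risk); log1p keeps precision for small risks."""
    return -math.log1p(-risk)

def relative_bias(risk):
    """Relative error incurred by approximating log Peril with Risk."""
    return (log_peril(risk) - risk) / risk

# Hypothetical risk levels, from rare to common diseases.
for risk in (0.0001, 0.0005, 0.0009, 0.01, 0.1):
    print(f"risk={risk:<7} relative bias={relative_bias(risk):.5%}")
```

For small risks the relative bias is close to Risk/2, so it grows roughly linearly with the risk and becomes non-negligible for common diseases.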

The ERR, the focal point of this paper, is

$$\mathrm{ERR} = \frac{\mathrm{ER}}{\mathrm{Risk}_{E=2}}.$$

ERR has an additional advantage over ER as an effect measure for the exposure under study; it is not only constant across strata but should be more stable for different durations of follow-up. (By contrast, ER will be roughly doubled if follow-up duration is doubled.) For a homogeneous (unstratified) population, ERR is simply RR minus 1.

For the exposed population, the ‘counterfactual’ disease risk (disease risk when each and every exposed subject, contrary to fact, is unexposed) is its factual disease risk (Risk_{E = 1}) minus a specific constant, ER, under the constant ERR model. (Note that because the stratification variable *S* may create a confounding effect, the counterfactual disease risk for the exposed population is in general not equal to the factual disease risk for the unexposed population, that is, Risk_{E = 1} – ER ≠ Risk_{E = 2} in general.) Under the constant ERR model, the population attributable fraction (PAF) is therefore

$$\mathrm{PAF} = \frac{p \times \mathrm{ER}}{p \times \mathrm{Risk}_{E=1} + (1-p) \times \mathrm{Risk}_{E=2}},$$

and the attributable fraction among the exposed population (AFE) is

$$\mathrm{AFE} = \frac{\mathrm{ER}}{\mathrm{Risk}_{E=1}}.$$

For a rare disease, PAF and AFE should also remain stable for different durations of follow-up.
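A small numerical sketch may help fix ideas. It assumes, following the counterfactual argument above, that PAF is the excess risk carried by the exposed fraction *p* divided by the overall risk, and that AFE is ER divided by Risk_{E = 1}. The function name and the input values are hypothetical.

```python
def attributable_fractions(p, risk_exposed, risk_unexposed, er):
    """Return (PAF, AFE) given exposure prevalence p, marginal risks and excess risk ER.

    Assumed forms (derived from the counterfactual risk Risk_{E=1} - ER):
      PAF = p * ER / (p * Risk_{E=1} + (1 - p) * Risk_{E=2})
      AFE = ER / Risk_{E=1}
    """
    overall_risk = p * risk_exposed + (1.0 - p) * risk_unexposed
    paf = p * er / overall_risk
    afe = er / risk_exposed
    return paf, afe

# Hypothetical inputs; ER differs from the crude risk difference because of confounding.
paf, afe = attributable_fractions(p=0.1, risk_exposed=0.0008,
                                  risk_unexposed=0.0005, er=0.0002)

# Doubling the follow-up roughly doubles every rare-disease risk and ER alike,
# which leaves both attributable fractions unchanged.
paf_2x, afe_2x = attributable_fractions(p=0.1, risk_exposed=0.0016,
                                        risk_unexposed=0.0010, er=0.0004)
print(paf, afe, paf_2x, afe_2x)
```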

### Estimation in Case-Control Studies

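The above ERR, PAF and AFE indices (but not the ER index) can be estimated from a case-control study conducted in the population. Assume that a case-control study recruited a total of *n*_{1} cases and *n*_{2} controls. Let CS_{s,e} (CN_{s,e}) denote the number of cases (controls) with *S* = *s* and *E* = *e* in the case-control sample. Recall that the case-control odds in a case-control study are a constant multiple (the reciprocal of the control sampling fraction of the case-control study, *f*) of the corresponding disease odds (and disease risks for a rare disease) in the population [1]. Therefore, the disease risks for the exposed population, the unexposed population, and subjects with *S* = *s* and *E* = *e* in the study population, can be estimated (if *f* is known) as $f \times \mathrm{CS}_{+,1}/\mathrm{CN}_{+,1}$, $f \times \mathrm{CS}_{+,2}/\mathrm{CN}_{+,2}$ and $f \times \mathrm{CS}_{s,e}/\mathrm{CN}_{s,e}$, respectively, where the plus sign in the subscript indicates a summation over the corresponding index. In general, *f* is of course unknown. However, the following three sets of parameters do not depend on *f*:

$$\widehat{\mathrm{ERR}}_s = \frac{\widehat{\mathrm{OD}}_s}{\mathrm{CS}_{+,2}/\mathrm{CN}_{+,2}}, \qquad \widehat{\mathrm{PAF}}_s = \frac{\mathrm{CN}_{+,1} \times \widehat{\mathrm{OD}}_s}{n_1} \qquad \text{and} \qquad \widehat{\mathrm{AFE}}_s = \frac{\mathrm{CN}_{+,1} \times \widehat{\mathrm{OD}}_s}{\mathrm{CS}_{+,1}},$$

respectively, where $\widehat{\mathrm{OD}}_s = \mathrm{CS}_{s,1}/\mathrm{CN}_{s,1} - \mathrm{CS}_{s,2}/\mathrm{CN}_{s,2}$ is the sample estimate of the odds difference among subjects with *S* = *s* in the case-control data.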
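The stratum-specific quantities can be sketched in a few lines of Python. The forms used here are assumptions derived from the definitions in this section (the odds difference in stratum *s*, divided by the marginal odds of the unexposed for ERR, and rescaled by the number of exposed controls over all cases or over exposed cases for PAF and AFE); the sampling fraction *f* cancels in each of them. The counts are hypothetical and are not the data analyzed in the paper.

```python
def stratum_estimates(cases, controls):
    """Stratum-specific OD, ERR, PAF and AFE from case-control counts.

    cases[s]    = (exposed cases, unexposed cases) in stratum s
    controls[s] = (exposed controls, unexposed controls) in stratum s
    """
    cs1 = sum(a for a, _ in cases)       # CS_{+,1}: exposed cases
    cs2 = sum(b for _, b in cases)       # CS_{+,2}: unexposed cases
    cn1 = sum(c for c, _ in controls)    # CN_{+,1}: exposed controls
    cn2 = sum(d for _, d in controls)    # CN_{+,2}: unexposed controls
    n1 = cs1 + cs2                       # total number of cases

    results = []
    for (a, b), (c, d) in zip(cases, controls):
        od = a / c - b / d               # odds difference in stratum s
        results.append({
            "OD": od,
            "ERR": od / (cs2 / cn2),     # assumed form: OD_s over marginal unexposed odds
            "PAF": od * cn1 / n1,        # assumed form
            "AFE": od * cn1 / cs1,       # assumed form
        })
    return results

# Hypothetical two-stratum data set (illustration only).
cases = [(10, 20), (15, 60)]
controls = [(50, 400), (30, 300)]
est = stratum_estimates(cases, controls)
for s, row in enumerate(est, start=1):
    print(s, row)
```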

### Exploiting the Homogeneity Property

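Homogeneity of ERR in the study population implies that the expected values of the odds differences in the case-control data are constant across strata, i.e., $E(\widehat{\mathrm{OD}}_s) = \text{constant}$, for *s* = 1,2,…,*L*. Therefore, we also have $E(\widehat{\mathrm{ERR}}_s) \approx \mathrm{ERR}$, $E(\widehat{\mathrm{PAF}}_s) \approx \mathrm{PAF}$ and $E(\widehat{\mathrm{AFE}}_s) \approx \mathrm{AFE}$, respectively, for *s* = 1,2,…,*L*. This suggests the following weighted-average estimators for ERR, PAF and AFE:

$$\widehat{\mathrm{ERR}} = \sum_{s=1}^{L} w_s \widehat{\mathrm{ERR}}_s, \qquad \widehat{\mathrm{PAF}} = \sum_{s=1}^{L} u_s \widehat{\mathrm{PAF}}_s \qquad \text{and} \qquad \widehat{\mathrm{AFE}} = \sum_{s=1}^{L} v_s \widehat{\mathrm{AFE}}_s,$$

where each set of weights (*w*_{s}, *u*_{s} and *v*_{s}) sums to one over the strata.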

S1 Exhibit presents the optimal weighting systems (in the sense of minimal variances for the weighted averages) and the variance formulas for these three indices. To set confidence limits, it helps to apply the log transformation [*y* = log(1 + *x*) with $x = \widehat{\mathrm{ERR}}$] to $\widehat{\mathrm{ERR}}$, and the complementary log transformation [*z* = −log(1 − *x*) with $x = \widehat{\mathrm{PAF}}$ or $\widehat{\mathrm{AFE}}$] to $\widehat{\mathrm{PAF}}$ and $\widehat{\mathrm{AFE}}$, for better approximations. The limits are then transformed back [*x* = exp(*y*) − 1 or *x* = 1 − exp(−*z*)] to the original scale.
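The transform-and-back-transform step for confidence limits can be sketched as follows. This is not the SAS code of S4 Exhibit: it assumes a first-order delta-method variance on the transformed scale (variance divided by (1 + *x*)^2 for the log transformation and by (1 − *x*)^2 for the complementary log transformation), and the function names are hypothetical. The inputs are the rounded point estimates and the two homogeneity-model variances quoted in the Results section, taken here to belong to PAF and AFE in that order.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval

def ci_log(x, var_x, z=Z_975):
    """Limits via y = log(1 + x), back-transformed with x = exp(y) - 1 (for ERR)."""
    y = math.log1p(x)
    se_y = math.sqrt(var_x) / (1.0 + x)          # delta-method standard error (assumption)
    return math.expm1(y - z * se_y), math.expm1(y + z * se_y)

def ci_complementary_log(x, var_x, z=Z_975):
    """Limits via z = -log(1 - x), back-transformed with x = 1 - exp(-z) (for PAF, AFE)."""
    t = -math.log1p(-x)
    se_t = math.sqrt(var_x) / (1.0 - x)          # delta-method standard error (assumption)
    return -math.expm1(-(t - z * se_t)), -math.expm1(-(t + z * se_t))

# Rounded estimates and variances as quoted in the Results section.
paf_lo, paf_hi = ci_complementary_log(0.048, 2.6436e-4)
afe_lo, afe_hi = ci_complementary_log(0.549, 0.6182e-2)
print(f"PAF 95% CI: {paf_lo:.3f} to {paf_hi:.3f}")
print(f"AFE 95% CI: {afe_lo:.3f} to {afe_hi:.3f}")

# Hypothetical ERR variance, only to exercise the log-transformed interval.
err_lo, err_hi = ci_log(0.57, 0.05)
```

Under these assumptions the limits land close to the intervals reported in the Results section; small differences are expected because the inputs are rounded. The back-transformed limits also respect the natural bounds (ERR above −1, attributable fractions below 1), which is one reason for working on the transformed scale.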

The proposed method is based on a constant ERR model (no mechanistic interaction between the exposure under study and the stratifying variable for a rare disease). In practice, this needs to be checked using the data on hand. S2 Exhibit presents a homogeneity test, which is a chi-square test with *L* − 1 degrees of freedom under the null hypothesis. The test may have low power if the number of degrees of freedom is too large.
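The exact test statistic is given in S2 Exhibit and is not reproduced here. As a stand-in, the sketch below computes a generic inverse-variance weighted heterogeneity statistic (a Cochran-type Q), which is likewise referred to a chi-square distribution with *L* − 1 degrees of freedom; it is an assumption for illustration and may differ from the author's test. The estimates and variances are hypothetical.

```python
def heterogeneity_q(estimates, variances):
    """Cochran-type Q: weighted squared deviations from the inverse-variance pooled mean.

    Returns (Q, degrees of freedom), with df = number of strata - 1.
    """
    if len(estimates) != len(variances) or len(estimates) < 2:
        raise ValueError("need matching lists with at least two strata")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * x for w, x in zip(weights, estimates)) / sum(weights)
    q = sum(w * (x - pooled) ** 2 for w, x in zip(weights, estimates))
    return q, len(estimates) - 1

# Hypothetical stratum-specific estimates and their variances.
q_stat, df = heterogeneity_q([0.5, 0.7, 0.6], [0.04, 0.04, 0.04])
print(q_stat, df)
```

A large Q relative to the chi-square reference distribution would argue against a common effect parameter; with many sparse strata the variances are poorly estimated and the comparison should be read cautiously.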

If the homogeneity assumption fails, ERR (and ER) will no longer be a meaningful effect measure for the exposure. However, the attributable fraction indices under heterogeneity (PAF and AFE) can still be estimated, albeit with larger variances, by using $q_s = \mathrm{CN}_{s,1}/\mathrm{CN}_{+,1}$, the proportion of exposed controls falling in stratum *s*, as the weighting system (S3 Exhibit).
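The heterogeneity weighting can be sketched as follows, assuming that the weights are applied to the same stratum-specific odds differences used under homogeneity and that the result is rescaled by the number of exposed controls over all cases (PAF) or over exposed cases (AFE). The details in S3 Exhibit may differ; the function name and the counts are hypothetical.

```python
def attributable_fractions_heterogeneous(cases, controls):
    """PAF and AFE using q_s = CN_{s,1} / CN_{+,1} as weights.

    cases[s]    = (exposed cases, unexposed cases) in stratum s
    controls[s] = (exposed controls, unexposed controls) in stratum s
    """
    cs1 = sum(a for a, _ in cases)                  # exposed cases
    n1 = sum(a + b for a, b in cases)               # all cases
    cn1 = sum(c for c, _ in controls)               # exposed controls

    q = [c / cn1 for c, _ in controls]              # share of exposed controls per stratum
    od = [a / c - b / d for (a, b), (c, d) in zip(cases, controls)]
    od_bar = sum(q_s * od_s for q_s, od_s in zip(q, od))

    paf = od_bar * cn1 / n1                         # assumed form
    afe = od_bar * cn1 / cs1                        # assumed form
    return q, paf, afe

# Hypothetical two-stratum data set (illustration only).
cases = [(10, 20), (15, 60)]
controls = [(50, 400), (30, 300)]
q, paf_het, afe_het = attributable_fractions_heterogeneous(cases, controls)
print(q, paf_het, afe_het)
```

With these weights the numerator reduces to the sum over strata of the observed exposed cases minus the exposed cases expected from the unexposed odds, so no common odds difference needs to be assumed.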

S4 Exhibit presents SAS codes for all the calculations.

## Results

Shapiro et al’s [14] case-control data of myocardial infarction (taken from Table 2–14 in the textbook *Case-Control Studies*: *Design*, *Conduct*, *Analysis* [15]) is reanalyzed here in order to demonstrate the methodologies. The study examined the age-specific relation of myocardial infarction to recent oral contraceptive use (a total of five age strata, see Table 1). The data is consistent with the constant ERR model (p-value = 0.2225; using the chi-squared test in S2 Exhibit).

Table 2 presents the optimal weighting systems under homogeneity (*w*_{s} for $\widehat{\mathrm{ERR}}$, *u*_{s} for $\widehat{\mathrm{PAF}}$ and *v*_{s} for $\widehat{\mathrm{AFE}}$, using the method detailed in S1 Exhibit). From Table 3, we see that the use of oral contraceptives incurs a 57% (95% confidence interval, CI: 16%~112%) increase in myocardial infarction risk. Population-wide, it accounts for 4.8% (95% CI: 1.5%~7.9%) of cases, or 54.9% (95% CI: 36.5%~67.9%) of exposed cases.

The weighting system under heterogeneity (*q*_{s}) is also presented in Table 2 for comparison. Without exploiting the homogeneity property, it results in much larger variances for $\widehat{\mathrm{PAF}}$ (6.6183 E-4 > 2.6436 E-4) and $\widehat{\mathrm{AFE}}$ (1.3197 E-2 > 0.6182 E-2), as compared to those presented in Table 3 under the homogeneity assumption.

## Discussion

Like the commonly used ratio-type indices (rate ratios, risk ratios and odds ratios), ERR also maintains the logical properties of Breslow and Day [3]. Using ERR to quantify association strengths, S5 Exhibit shows that for an observed exposure-disease association to be explained away by an unmeasured factor, the putative factor (if it exists) must be at least as strongly associated with exposure, and also as strongly associated with disease, as that seen between the exposure and disease under study. S6 Exhibit further shows that if the exposure under study is only associated with a specific disease entity, ERR for the exposure and this disease entity will be greater than that for the exposure and the disease as a whole.

S7 Exhibit shows that a constant ERR model and a constant risk ratio model cannot be reconciled except for a weak exposure or when disease risks vary little across strata. This raises the interesting possibility that genuine mechanistic interactions may be far more common than we thought, but that they went unrecognized by previous researchers because of the ratio-type indices used. Further studies are needed to investigate this postulate. The irreconcilability between the constant ERR and the constant risk ratio models also reveals the inappropriateness of using the risk ratio (or odds ratio) indiscriminately as a measure of exposure effect in all situations, a practice most (if not all) epidemiologists currently follow.

In summary, the ERR index enjoys the logical properties that were previously thought to be exclusive to the ratio-type indices. The ERR index (but not the difference-type indices) is estimable in case-control studies. For rare diseases and in the absence of (sufficient-cause) mechanistic interaction, the ERR index (but not the ratio-type indices) will remain constant across strata and can be regarded as a common effect parameter. Exploiting this homogeneity property, one can also estimate the attributable fraction indices with greater precision. In light of the many desirable properties of the ERR index, the author advocates its use as an effect measure in case-control studies of rare diseases.

## Supporting Information

### S1 Exhibit. Optimal weighting systems and variance formulas for excess relative risk (ERR), population attributable fraction (PAF) and attributable fraction among the exposed population (AFE).

https://doi.org/10.1371/journal.pone.0121141.s001

(PDF)

### S3 Exhibit. Estimation of population attributable fraction (PAF) and attributable fraction among the exposed population (AFE) under heterogeneity.

https://doi.org/10.1371/journal.pone.0121141.s003

(PDF)

### S5 Exhibit. A proof that for an observed exposure-disease association to be explained away by an unmeasured factor, the putative factor (if it exists) must be at least as strongly associated with exposure, and also as strongly associated with disease, as that seen between the exposure and disease under study.

https://doi.org/10.1371/journal.pone.0121141.s005

(PDF)

### S6 Exhibit. A proof that if the exposure under study is only associated with a specific disease entity, excess relative risk (ERR) for the exposure and this disease entity will be greater than that for the exposure and the disease as a whole.

https://doi.org/10.1371/journal.pone.0121141.s006

(PDF)

### S7 Exhibit. A proof that a constant excess relative risk (ERR) and a constant risk ratio (RR) models cannot be reconciled except for a weak exposure or when disease risks vary little across strata.

https://doi.org/10.1371/journal.pone.0121141.s007

(PDF)

## Author Contributions

Conceived and designed the experiments: WCL. Performed the experiments: WCL. Analyzed the data: WCL. Contributed reagents/materials/analysis tools: WCL. Wrote the paper: WCL.

## References

- 1. Rothman KJ, Greenland S, Lash TL, eds. Modern Epidemiology, 3rd ed. Philadelphia: Lippincott; 2008.
- 2. Hogue CJR, Gaylor DW, Schulz KF. Estimators of relative risk for case-control studies. Am J Epidemiol 1983;118:396–407. pmid:6613982
- 3. Breslow NE, Day NE. Statistical Methods in Cancer Research, Vol I, The Analysis of Case-Control Studies. Lyon: International Agency for Research on Cancer; 1980. pmid:7379018
- 4. Thompson DE, Mabuchi K, Ron E, Soda M, Tokunaga M, Ochikubo S, et al. Cancer incidence in atomic bomb survivors. part II: solid tumors, 1958–1987. Radiat Res 1994;137:s17–s67. pmid:8127952
- 5. Preston DL, Ron E, Tokuoka S, Funamoto S, Nishi N, Soda M, et al. Solid cancer incidence in atomic bomb survivors: 1958–1998. Radiat Res 2007;168:1–64. pmid:17722996
- 6. Richardson DB. A simple approach for fitting linear relative rate models in SAS. Am J Epidemiol 2008;168:1333–1338. pmid:18953061
- 7. Langholz B, Richardson DB. Fitting general relative risk models for survival time and matched case-control analysis. Am J Epidemiol 2010;171:377–383. pmid:20044379
- 8. Suissa S. Relative excess risk: an alternative measure of comparative risk. Am J Epidemiol 1999;150:279–282. pmid:10430232
- 9. VanderWeele TJ, Robins JM. The identification of synergism in the sufficient-component cause framework. Epidemiology 2007;18:329–339. pmid:17435441
- 10. VanderWeele TJ, Robins JM. Empirical and counterfactual conditions for sufficient cause interactions. Biometrika 2008;95:49–61.
- 11. VanderWeele TJ. Sufficient cause interactions and statistical interactions. Epidemiology 2009;20:6–13. pmid:19234396
- 12. Lee WC. Assessing causal mechanistic interactions: a peril ratio index of synergy based on multiplicativity. PLoS ONE 2013;8:e67424. pmid:23826299
- 13. Lee WC. Estimation of a common effect parameter from follow-up data when there is no mechanistic interaction. PLoS ONE 2014;9:e86374. pmid:24466062
- 14. Shapiro S, Slone D, Rosenberg L, Kaufman DW, Stolley PD, Miettinen OS. Oral-contraceptive use in relation to myocardial infarction. Lancet 1979;1:743–747. pmid:85989
- 15. Schlesselman JJ. Case-Control Studies: Design, Conduct, Analysis. Oxford: Oxford University Press; 1982.