Is Opium a Real Risk Factor for Esophageal Cancer or Just a Methodological Artifact? Hospital and Neighborhood Controls in Case-Control Studies

Background Control selection is a major challenge in epidemiologic case-control studies. The aim of our study was to evaluate using hospital versus neighborhood control groups in studying risk factors of esophageal squamous cell carcinoma (ESCC). Methodology/Principal Findings We compared the results of two different case-control studies of ESCC conducted in the same region by a single research group. Case definition and enrollment were the same in the two studies, but control selection differed. In the first study, we selected two age- and sex-matched controls from inpatient subjects in hospitals, while for the second we selected two age- and sex-matched controls from each subject's neighborhood of residence. We used the test of heterogeneity to compare the results of the two studies. We found no significant differences in exposure data for tobacco-related variables such as cigarette smoking, chewing Nass (a tobacco product) and hookah (water pipe) usage, but the frequency of opium usage was significantly different between hospital and neighborhood controls. Consequently, the inference drawn for the association between ESCC and tobacco use did not differ between the studies, but it did for opium use. In the study using neighborhood controls, opium use was associated with a significantly increased risk of ESCC (adjusted OR 1.77, 95% CI 1.17–2.68), while in the study using hospital controls, this was not the case (OR 1.09, 95% CI 0.63–1.87). Comparing the prevalence of opium consumption in the two control groups and a cohort enrolled from the same geographic area suggested that the neighborhood controls were more representative of the study base population for this exposure. Conclusions/Significance Hospital and neighborhood controls did not lead us to the same conclusion for a major hypothesized risk factor for ESCC in this population. Our results show that control group selection is critical in drawing appropriate conclusions in observational studies.


Introduction
Case-control studies are the design of choice for studying less common diseases such as esophageal cancer. Although esophageal cancer ranks 8th in incidence among all cancers [1], it is rare enough that even in many large cohorts it may take a long time to accumulate enough cases for statistical analysis [2,3,4]. Therefore, although consortia of cohorts can help to provide enough cases, case-control studies are still widely used to study the etiology of esophageal cancer.
Defining an appropriate sampling frame from which controls should be selected is arguably one of the most difficult tasks in designing a case-control study. The aim is to select a group of controls that is representative of the community from which cases have been selected. In their review of the methodological issues of case-control studies, Wacholder and colleagues stressed the importance of study base and control selection in case-control studies, and discussed several sources for control selection, including population controls, hospital or disease registry controls, controls from a medical practice, friend controls, relative controls, controls selected from case series, proxy respondents and deceased controls [5]. Neighborhood and hospital-based controls have been used in many studies. Each of these control types has advantages and disadvantages. For example, enrolling hospital controls is usually more convenient and less costly, and information collected from cases and controls is more comparable in the sense that both cases and controls respond in a medical setting. However, hospital controls also have the disadvantage that cases and controls may not be from the same study base, and the referral pattern for the disease of interest may be different. A comprehensive treatment of this subject is given elsewhere [6].
The Golestan Case-Control Study in northeastern Iran was carried out in two phases. In the pilot phase of the study, 130 incident esophageal squamous cell carcinomas (ESCCs) and 260 matched hospital controls were enrolled, while in the main phase of the study, 300 ESCC cases and 571 matched neighborhood controls were recruited [7]. In this manuscript, we compare the results obtained from the pilot phase of this study, which used hospital controls, and the results of the main phase, which used neighborhood controls, to evaluate tobacco-related variables and opium as risk factors for ESCC.

Ethics Statement
The study was approved by the Institutional Review Boards of the Digestive Disease Research Center of Tehran University of Medical Sciences and the US National Cancer Institute.

Case Selection
This study compares results from the pilot phase (March 2002-November 2003) and the main phase (December 2004-June 2007) of the Golestan Case-Control Study. Case selection and procedures in the pilot phase and the main phase of the study were the same. A detailed description of the case selection procedures has been published [8]. All cases were evaluated at Atrak Clinic in Khatam Hospital, located in Gonbad City, which is the only specialized clinic for upper gastrointestinal tract cancers in Golestan Province. Patients suspected of having upper gastrointestinal tract cancers were referred by local physicians to the clinic. A population-based cancer registry confirmed that about 70% of incident ESCC cases visited Atrak. All suspected cases underwent upper gastrointestinal endoscopy. Biopsy samples of the esophagus taken during endoscopy were reviewed by expert pathologists at the Digestive Disease Research Center, Tehran University of Medical Sciences. All cases enrolled in both phases had pathology-proven esophageal squamous cell carcinoma.

Control Selection
Hospital-based controls in the pilot phase study. Hospitalized subjects were individually matched to each case on age and sex. Because tobacco use, alcohol consumption, and diet are thought to be important risk factors for ESCC [9], controls were selected from inpatient subjects with diseases thought to be unrelated to tobacco use, alcohol consumption, or diet. When a case was diagnosed, the control selection team reviewed the list of the patients in hospital wards, especially trauma wards, prepared a roster of potential sex- and age-matched (±2 years) controls and randomly selected two patients. If either of the selected controls refused or was too ill to participate, the team selected another control from the roster. Control selection matched the expected pattern of cases presenting to the local hospitals. Most (78.5%) of the controls were selected from Khatam hospital, where Atrak Clinic is located, and the rest of the controls were selected from other hospitals in Gonbad City. Among the first selected patients, the response rate for hospital controls was 95%.
Neighborhood controls in the main phase study. Neighborhood controls were selected using the Iranian Family Health Census as the sampling frame. Two subjects were matched to each case by place of residence (urban neighborhood or village), age (±2 years), and sex. The interview team identified all of the potentially eligible controls in the case's village or urban area, and randomly selected two subjects to interview. If either of the selected controls could not be interviewed for any reason, another person on the list was randomly invited, and so forth. The total number of enrolled neighborhood controls was 571. Of the enrolled neighborhood controls 77% were the first randomly selected subjects, 11% were the second, and the remainder required more than two selections. In nearly all instances the reason that an eligible control did not participate in the study was the absence of the control at the time of invitation [10].
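The matching and random selection procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Person` record, the roster entries and the function names are hypothetical, and the actual sampling frame was the Iranian Family Health Census, whose structure is not described here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    person_id: str
    age: int
    sex: str
    area: str  # urban neighborhood or village (hypothetical coding)


def eligible_controls(case, roster, age_window=2):
    """Roster members matching the case on area, sex, and age within +/- age_window years."""
    return [
        p for p in roster
        if p.person_id != case.person_id
        and p.area == case.area
        and p.sex == case.sex
        and abs(p.age - case.age) <= age_window
    ]


def select_controls(case, roster, n_controls=2, seed=None):
    """Shuffle the eligible pool; the first n_controls are invited first,
    the remainder serve as randomly ordered replacements."""
    rng = random.Random(seed)
    pool = eligible_controls(case, roster)
    rng.shuffle(pool)
    return pool[:n_controls], pool[n_controls:]


# Hypothetical case and roster, for illustration only.
case = Person("case-1", 64, "M", "village-A")
roster = [
    Person("r1", 63, "M", "village-A"),
    Person("r2", 66, "M", "village-A"),
    Person("r3", 62, "M", "village-A"),
    Person("r4", 67, "M", "village-A"),  # outside the age window
    Person("r5", 64, "F", "village-A"),  # different sex
    Person("r6", 64, "M", "village-B"),  # different area
]
first_invited, replacements = select_controls(case, roster, seed=1)
print([p.person_id for p in first_invited], [p.person_id for p in replacements])
```

Shuffling the whole eligible pool once gives both the two first-invited subjects and a random order for replacements, which mirrors the "another person on the list was randomly invited" step.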

Data Collection
After obtaining written informed consent, a structured questionnaire with closed questions and pre-categorized responses was administered to both cases and controls by physician-researchers from Atrak Clinic. The questionnaire included detailed information on demographics, family history of cancer, history of tobacco, opium and alcohol use, drinking tea habits, oral health, and socioeconomic variables. More detailed information on the questionnaire is available elsewhere [8].

Statistical Analysis
Univariate and multivariable conditional logistic regression models were used to estimate crude and adjusted odds ratios (ORs) and 95% confidence intervals (CIs). In addition to the matching factors of age, sex, and place of residence, we also adjusted for education (as a determinant of socioeconomic status), ethnicity (Turkmen versus non-Turkmen) and consumption of cigarettes, hookah, Nass and opium (in those analyses in which these were not the main independent variable). We chose two previously identified risk factors for ESCC for comparing the two studies: use of different forms of tobacco and use of opium. We calculated the p-value of the heterogeneity test for each set of adjusted ORs. For significant differences between the two sets of controls, we also compared data from our two control series with the data from the Golestan Cohort Study, a large cohort study conducted in the same geographic area [11,12]. Age, sex and ethnicity distributions of the two control sets and the cohort were different from each other, so we used the cohort population as the standard population and used the indirect standardization method to calculate expected (standardized) prevalence rates for the other groups.
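The exact standardization calculation is not spelled out in the text, so the following Python sketch shows only the textbook form of indirect standardization: stratum-specific prevalences from the standard population are applied to the stratum sizes of the group being standardized to obtain an expected count, and the observed-to-expected ratio is then scaled by the crude prevalence of the standard. All strata, counts and rates below are hypothetical and are not study data.

```python
def indirect_standardization(study_strata, standard_rates, standard_crude):
    """Textbook indirect standardization.

    study_strata:   {stratum: (n_people, n_exposed)} for the group being standardized
    standard_rates: {stratum: exposure prevalence} in the standard population
    standard_crude: crude exposure prevalence in the standard population
    Returns (expected_count, observed/expected ratio, standardized prevalence).
    """
    observed = sum(exposed for _, exposed in study_strata.values())
    expected = sum(n * standard_rates[s] for s, (n, _) in study_strata.items())
    ratio = observed / expected
    return expected, ratio, ratio * standard_crude


# Hypothetical numbers for illustration only (not taken from the study).
standard_rates = {"men<60": 0.20, "men60+": 0.30, "women<60": 0.05, "women60+": 0.10}
standard_crude = 0.15
control_strata = {
    "men<60": (50, 12),
    "men60+": (100, 36),
    "women<60": (40, 2),
    "women60+": (70, 8),
}

expected, ratio, standardized = indirect_standardization(
    control_strata, standard_rates, standard_crude
)
print(f"expected = {expected:.1f}, ratio = {ratio:.2f}, standardized = {standardized:.3f}")
```

In practice the strata would be defined by age, sex and ethnicity, the three variables whose distributions differed between the control sets and the cohort.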

Results
A total of 130 ESCC cases and 260 hospital-based controls were enrolled in the pilot phase of the study, while the corresponding numbers of cases and neighborhood controls in the main phase of the study were 300 and 571, respectively. Demographic characteristics of the study participants are shown in Table 1.
Distributions of age, ethnicity, and place of residence were similar between the cases in the two studies, but hospital-based controls were more likely to be non-Turkmen and to live in urban areas than the cases or the neighborhood controls. Tables 2, 3, 4, 5, 6 show the exposure distributions among cases and controls in the two phases of the study. Crude and adjusted ORs and 95% CIs are presented for each phase of the study. Table 2 shows the results for smoking cigarettes. The striking feature of this table and the related results is the very low overall prevalence of smoking in this population, across cases, hospital controls and population controls. These rates were lower in the hospital-based study: 13% of the controls in the hospital-based study and 17% of the controls in the neighborhood-based study smoked cigarettes. However, the ORs were comparable in the two studies. Both studies showed an increase in the risk of ESCC with smoking, with an adjusted OR of less than 1.5. They also showed a dose-response trend with cumulative use of cigarettes. Table 3 shows the results for Nass use. The results were nearly identical in the two studies. Approximately 8 to 9% of the controls used Nass in each study, and in both, the adjusted ORs show that Nass use was associated with a 1.5-1.8-fold increased risk of ESCC.
The results for hookah are shown in Table 4. Only 7% of the controls in the pilot phase of the study and 4% of the controls in the main phase of the study used hookah. The point estimates for the adjusted ORs were 1.8-2.1. Table 5 shows the results for opium use. Twenty-eight percent of the hospital-based controls but only 18% of the neighborhood-based controls reported using opium. However, the percentage of ESCC patients who used opium was quite similar in the two phases of the study (35% and 30%, respectively) (P value > 0.05). The adjusted ORs were greater than 1 in both studies, but the OR was significant and much higher in the neighborhood-based study (OR 1.77) than in the hospital-based study (OR 1.09). This difference may be meaningful even though the test of heterogeneity was not significant.
Additionally, duration of use showed a dose-response association with ESCC risk in the neighborhood-based study, whereas it did not show such an association in the hospital-based study. In the hospital-based study, the questionnaire did not include average amount of opium used each day, so data on average amount and cumulative exposure were not available.
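To make the arithmetic behind the opium result concrete, the short Python sketch below computes unmatched crude odds ratios from the rounded exposure percentages quoted in the text (35% of cases versus 28% of hospital controls; 30% of cases versus 18% of neighborhood controls). It is an illustration only: it ignores matching and adjustment, so it is not expected to reproduce the conditional, adjusted ORs reported in Table 5.

```python
def crude_odds_ratio(p_cases, p_controls):
    """Unmatched crude odds ratio from exposure prevalences in cases and controls."""
    odds_cases = p_cases / (1 - p_cases)
    odds_controls = p_controls / (1 - p_controls)
    return odds_cases / odds_controls


# Rounded opium-use percentages quoted in the text.
hospital_or = crude_odds_ratio(p_cases=0.35, p_controls=0.28)
neighborhood_or = crude_odds_ratio(p_cases=0.30, p_controls=0.18)

print(f"hospital-based crude OR:     {hospital_or:.2f}")
print(f"neighborhood-based crude OR: {neighborhood_or:.2f}")
```

Because the case prevalences are similar in the two phases, most of the gap between the two crude ORs comes from the denominator, that is, from the higher opium prevalence among hospital controls.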
The calculated standardized prevalences of opium consumption were 0.17, 0.16 and 0.23 for the cohort subjects, the neighborhood controls and the hospital controls, respectively. We also pooled the data of the 430 cases from the two phases of the study and compared this pooled case data with that of the neighborhood controls and the hospital-based controls separately, using nonconditional logistic regression. The results showed that in the analysis of the 430 cases and 570 neighborhood controls, after adjusting for the confounding factors, using opium had a significant association with ESCC (OR = 2.05, 95% CI 1.43-2.93), while in the analysis of the 430 cases and 260 hospital controls, it did not (OR = 0.87, 95% CI 0.57-1.33). Table 6 shows that, among both sets of controls, smoking is by far the strongest determinant of opium use. Smoking is also a very strong determinant of ESCC, and so opium must be considered a confounder of the relationship between smoking and ESCC, and vice versa. Considering the results of Table 6, we added a multivariable analysis to Table 5 that assesses opium as a risk factor for ESCC after adjustment for smoking. The results showed that the observed association with opium use cannot be explained by a confounding effect of smoking.

Discussion
In this study, we compared the associations of tobacco-related variables and opium use with ESCC risk in two phases of a case-control study in the same population, one phase using hospital controls and one using neighborhood controls. We found that the results were similar for cigarette smoking and Nass consumption in the two phases of the study; both showed a similar magnitude of increased ESCC risk. It was difficult to compare the results for hookah use, as the prevalence of consumption in the study area is very low and our sample size was modest. We found a notable difference between the pilot and main phase results, however, when examining the effect of opium use. Compared to the neighborhood controls, hospital-based controls were more likely to use opium, at rates close to those seen in cases. Therefore, while the neighborhood-based study showed an increased risk of ESCC with opium use, the hospital-based study did not. Whether hospital-based or population-based controls better satisfy the comparability criteria has long been debated in epidemiologic texts and articles [13]. As Miller and colleagues have discussed, the answer may depend on the question(s) asked, and each study must evaluate the circumstances individually [14]. A number of studies have compared these two methods of control selection. In a case-control study of cervical cancer, where the exposures were variables related to pregnancy, marital status, intercourse, and smoking, West and colleagues showed that hospital controls were more 'case like' than population controls for all exposures, and this led to underestimating the effects [15]. In a case-control study of diet and colorectal cancer, however, there were no significant differences in conclusions using hospital or population controls, so that for most analyses the authors combined the two series [16].
In another study, Sadetzki and colleagues mentioned that the possibility of selection bias should be taken into consideration whenever hospital controls are used [17]. In one other study [18], Infante-Rivard compared population and hospital controls to study risk factors of leukemia in children. From comparisons with population survey data and socioeconomic data, this researcher concluded that the study groups came from the same base population but the distribution of exposures in hospital controls was closer to that of cases than those of population controls, which resulted in ORs closer to null when using hospital controls [18].
In the current study, the prevalence of smoking among the hospital controls was close to that of the neighborhood controls, and also close to that found in the pilot phase of the Golestan Cohort Study in the same population [11]. The ORs in the two case-control studies were similar, and from both we would conclude that cigarette smoking is a risk factor for ESCC, albeit a more modest one than in other populations [19]. The hospital controls were mostly selected from patients admitted for elective surgery (73%) or trauma (21%), but there were internal medicine patients (6%) too. These conditions were selected under the assumption that they were not related to smoking. Most of the surgery patients were hospitalized for benign prostatic hyperplasia (BPH) or hernia. There is a possible but controversial protective effect of low-dose smoking on BPH [20], but the association, if it exists, is small and would be unlikely to affect our results. There is also a report that has shown an association between smoking and hernia [21], but this needs further investigation to establish whether the association is real. Use of Nass in the hospital and neighborhood controls was also similar, but it was higher than what was found in the pilot phase of the Golestan Cohort Study [19]. However, we should not expect the prevalence in the case-control and cohort studies to be similar, since Nass use is a function of sex, age, and ethnicity (it is most commonly seen in Turkmen men), and the case-control and cohort studies were dissimilar in these demographic variables.
The pattern of association with hookah use was reasonably similar in the two phases of the study. However, as we mentioned above, the number of people who used hookah was very small, so it is difficult to draw any meaningful conclusions.
The main difference between the two studies was related to the association of opium consumption with ESCC risk. The cases in both studies had similar rates of opium consumption, but the hospital controls reported a higher prevalence of opium use than the neighborhood controls. As a result, the OR for opium use was statistically significant with the neighborhood control set but not with the hospital control set. Although the test of heterogeneity was not significant, the power of an interaction test of this kind is limited, so we think that a P value of 0.15 still merits discussion.
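The kind of heterogeneity test discussed here can be sketched in Python from the published estimates alone. The sketch assumes that the two studies are independent, that the reported 95% CIs are Wald intervals on the log scale (so a standard error can be recovered from the CI width), and that the neighborhood-based OR of 1.77 (1.17-2.68) and the hospital-based OR of 1.09 (0.63-1.87) are the relevant inputs. The function names are illustrative, and because the inputs are rounded, the result need not match the P value obtained from the unrounded model estimates.

```python
import math


def log_or_and_se(or_point, ci_low, ci_high, z_crit=1.96):
    """Recover log(OR) and its standard error from a reported 95% CI
    (assumes a Wald interval that is symmetric on the log scale)."""
    log_or = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_or, se


def heterogeneity_test(est_a, est_b):
    """Two-sided z-test for the difference between two independent log odds ratios."""
    (log_a, se_a), (log_b, se_b) = est_a, est_b
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under the normal distribution
    return z, p


neighborhood = log_or_and_se(1.77, 1.17, 2.68)
hospital = log_or_and_se(1.09, 0.63, 1.87)
z, p = heterogeneity_test(neighborhood, hospital)
print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

The standard error of the difference is dominated by the wider hospital-based interval, which is one way to see why a test of this kind has limited power when one of the two studies is small.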
There are several potential explanations for this difference in prevalence between the two control groups. Hospital controls may not be representative of the population because in this area opium has traditionally been used to treat pain and numerous ailments, including those which brought some of these controls to the hospital. In addition, a recent study has shown that regular users of opium are more prone to accidents [22]. This report is consistent with our finding that 48% of the hospital control patients admitted for trauma used opium. If either of these hypotheses is true, then using hospital controls would result in Berksonian bias and, in this instance, the estimation of opium risk would be biased toward the null. On the other hand, the differences in ORs between the two studies are not necessarily due to control selection, as there are many differences between the two case-control studies. The case groups were different, and the neighborhood-based study had about twice as many cases as the hospital-based study, so the hospital-based estimates are less robust; these differences may also have contributed to the apparently higher opium exposure in the hospital control group.
We compared standardized rates of opium use among the two sets of controls in the current study and the participants in the Golestan Cohort Study, which is the most representative survey that we have of the community from which the cases were selected, and we found that the prevalence of opium use in the cohort was much closer to that in the neighborhood controls than it was to the prevalence of opium use in the hospital controls. This suggests that the neighborhood controls were more representative of the study base than the hospital controls, at least for opium exposure. On the other hand, one might argue that the reported rates of opium consumption among neighborhood controls and participants in the cohort are lower than the real rates because they were not questioned in a medical setting or while they were sick, so they may not have answered the questions as truthfully, and this may result in information bias. However, a recent study by our group showed that the responses given to the questions in the pilot phase of the cohort study, also in a non-medical setting, were very close to the results found by testing urine for markers of codeine and morphine [23], so it seems that the questionnaire provides valid responses in this setting.
This study has several strengths and limitations. Notable strengths, especially for the comparisons in this paper, were the high participation rates of both hospital and neighborhood controls and the fact that the same team members interviewed all the cases and controls in both studies. One of the limitations is the modest sample size of the hospital-based study. Another is the fact that we cannot exclude the possibility that some of the exposures under study (particularly opium use) might be associated with the reasons that the hospital controls were hospitalized.
In summary, the results of this study show that neighbourhood controls were superior to hospital controls in assessing the risk of ESCC associated with opium exposure in this population. But, as