The association between environmental exposures during childhood and the subsequent development of Crohn’s disease: A score analysis approach

Background Environmental factors during childhood are thought to play a role in the aetiology of Crohn’s Disease (CD). In South Africa, recently published work based on an investigation of 14 childhood environmental exposures during 3 age intervals (0–5, 6–10 and 11–18 years) has provided insight into the role of timing of exposure in the future development of CD. The ‘overlapping’ contribution of the investigated variables however, remains unclear. The aim of this study was to perform a post hoc analysis using this data and investigate the extent to which each variable contributes to the subsequent development of CD relative to each aforementioned age interval, based on a score analysis approach. Methods Three methods were used for the score analysis. Two methods employed the subgrouping of one or more (similar) variables (methods A and B), with each subgroup assigned a score value weighting equal to one. For comparison, the third approach (method 0) involved no grouping of the 14 variables. Thus, each variable held a score value of one. Results Results of the score analysis (Method 0) for the environmental exposures during 3 age intervals (0–5, 6–10 and 11–18 years) revealed no significant difference between the case and control groups. By contrast, results from Method A and Method B revealed a significant difference during all 3 age intervals between the case and control groups, with cases having significantly lower exposure scores (approximately 30% and 40% lower, respectively). Conclusion Results from the score analysis provide insight into the ‘compound’ effects from multiple environmental exposures in the aetiology of CD.

Introduction Environmental risk factors in childhood are believed to play a role in the subsequent development of the inflammatory bowel disease (IBD) subtype, Crohn's disease (CD) [1][2][3][4]. Numerous studies have evaluated the different environmental exposures during childhood, albeit findings for many have been inconsistent [4][5][6][7][8][9][10][11][12][13][14]. These inconsistencies may be attributed to differences in both the timing and the extent of the various environmental exposures during childhood, as well as the heterogeneity in CD susceptibility mutations both between and within individual population groups [15,16]. Alternatively, these findings may be a result of methodological issues such as study sample size, participant characteristics (i.e. demographics, socioeconomic factors), identification of inappropriate control subjects, or the failure to account for potential confounding environmental variables.
Several theories have emerged in a bid to understand the underlying pathogenesis and aetiology of IBD [17][18][19][20]. The most widely accepted is the 'hygiene hypothesis', which has attracted wide speculation and academic attention [19]. The hygiene hypothesis holds that an 'overly hygienic' childhood environment will impair the microbial competence of the gastrointestinal immune system and its ability to appropriately recognize new antigens, predisposing children to immunologic disorders later in life [19,21]. While the effect is deemed to be most profound during early childhood, the optimal timing and magnitude of exposure required are still unclear. In addition, it is entirely possible that the impacts of different exposures are not mutually exclusive; that disease development depends on the dose-response interactions between exposures, and this reflects how strongly one exposure may 'protect' or increase the 'risk' conferred by CD susceptibility mutations. The first of many susceptibility genes identified was CARD15 (caspase-activation recruitment domain), also referred to as the NOD2 (nucleotide oligomerization domain) gene, within the IBD1 (inflammatory bowel disease) locus. The gene is responsible for the recognition and clearance of intestinal antigens via the induction of autophagy-related protein complexes. Polymorphisms in the CARD15/NOD2/IBD1 locus have been associated with the highest risk for CD development [22][23][24], and recent evidence shows that reduced functionality affects the immune system and disease phenotype in the paediatric CD population [25]. Indeed, the role of epigenetics and alterations in DNA methylation in IBD cannot be ignored [26]. Microbiota-induced or environmentally induced permanent epigenetic change, particularly within developing areas, is a distinct possibility [27].
Numerous studies have been conducted on the paradigm of the hygiene hypothesis, including the recent case-control investigation of childhood environmental exposures (0-5, 6-10 and 11-18 years) conducted in the 2 largest tertiary referral IBD centres in Cape Town, South Africa by Basson et al [4]. It must be highlighted that more recent literature denotes distinct differences between the adult and paediatric IBD patient populations, particularly with regard to matters surrounding the hygiene hypothesis [28]. The distinction between these two groups has recently been debated, and it appears that children with IBD are a distinctive population with specific peculiarities requiring a highly skilled and specialised approach to diagnosis and treatment [28].
The present study uses a subset of data from the previously mentioned study by Basson et al [4]. The aim of the present study was to investigate the association between childhood environmental exposures and the risk of CD development in the Western Cape, South Africa, based on a score analysis approach.

Design and setting
This was a post hoc analysis of previously collected data from a large case-control study, performed between September 2011 and January 2013, of all consecutive CD patients seen during their normally scheduled appointments at the 2 largest public sector hospitals in Cape Town: Groote Schuur Hospital (GSH) and Tygerberg Hospital (TBH). The methodological approaches of the study have been described elsewhere [4]. Disease characteristics of the CD patients have also been described in detail elsewhere [29]. Briefly, only patients with complete data at diagnosis were included. Patients were excluded if their disease duration was less than 5 years, or if they had a prior diagnosis of intestinal tuberculosis, determined via the algorithm suggested by Epstein et al [30]. Control subjects were identified from the same populations giving rise to the CD cases. Controls were excluded if they had a prior diagnosis of tuberculosis, IBD or other immune-mediated diseases, any gastrointestinal disorder (e.g., irritable bowel syndrome), or any family history of IBD.

Study data
For reference, the results from Basson et al of the multiple logistic regression analysis evaluating environmental risk factor exposure in 3 age groups (0-5 years, 6-10 years and 11-18 years) are presented in Table 1. The authors investigated 14 environmental exposures, including: primary source of drinking water, hot piped tap water, community type, total number of people in household, number of people sharing a bathroom, number of bedrooms in home, type of toilet facility, donkey/horse/cow/sheep living permanently on the property, second-hand cigarette smoke exposure, unpasteurized milk consumption, raw beef consumption, helminth infection and treatment for helminths.

Methodology
Environmental exposures for the age intervals (0-5 years, 6-10 years and 11-18 years)
To evaluate the environmental exposures during the 3 age intervals based on a score analysis of the 14 environmental exposure variables (Table 1), three different methodological approaches were undertaken. In the first approach ('Method 0'), no grouping was applied to the variables, thus all 14 variables had a weighting score value equal to one. In the second and third analytical approaches, termed 'Method A' and 'Method B', environmental variables of a similar nature were combined into 'subgroups'. All 14 variables were included in both approaches, and only the number of subgroups differed between Method A and Method B. Each subgroup had a score value of one, so the weighting of an environmental variable was inversely related to the number of variables contained within its subgroup. For instance, within the 'Method A' approach, the environmental exposure variables 'water source' and 'hot water availability' were grouped together to form one subgroup (i.e. subgroup A1), whereas the exposure variables 'number of people in the household', 'number of bedrooms in the household' and 'pets in the home' were combined to form another subgroup (i.e. subgroup A2). Subgroup A1 and subgroup A2 each had a score value of one. However, each of the two variables within subgroup A1 ('water source' and 'hot water') carried a weighting of 1/2 (1/2 + 1/2 = 1), whereas each of the three variables within subgroup A2 ('number of people in the household', 'number of bedrooms in the household' and 'pets in the home') carried a weighting of 1/3 (1/3 + 1/3 + 1/3 = 1). The subgrouping of variables and the weighted score of the variable(s) within each subgroup for Method A (outcome 1) and Method B (outcome 2) are depicted in Tables 2 and 3, respectively. It must be noted that for both Method A and Method B, the exposure variables 'never having consumed unpasteurized milk' and 'passive cigarette smoke exposure' were not combined with other exposure variables, and their individual scores remained equal to one. As this was a post hoc analysis, this weighting was decided on the basis of the following: 1) a significant risk association was identified for unpasteurized milk during all 3 age intervals, including an independent risk association for 6-10 and 11-18 years; and 2) a significant association was identified for passive cigarette smoke exposure during 2 age intervals, including an independent risk association for 11-18 years [4]. Within Method A and Method B, the lowest possible variable weighting within any subgroup was 1/5; thus a subgroup consisted of no more than 5 exposure variables.
Since the 'score value' for every subgroup was equal to one, the total possible score for Method A and for Method B was equal to the number of subgroups contained within that method. For Method A, the total possible score was 8, and for Method B, the total possible score was 5. A score analysis was performed based on these premises. The subgroups within Method A and Method B, respectively, were statistically analysed based on their single (i.e. unpasteurized milk) or pooled weighted score.
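The weighting scheme described above can be illustrated with a minimal Python sketch. It assumes each exposure is coded as a simple yes/no indicator per participant, which the text does not state explicitly. The variable names and the four example subgroups are hypothetical; the subgroups actually used are those listed in Tables 2 and 3.

```python
from fractions import Fraction

# Hypothetical subgrouping for illustration only; the study's actual
# subgroups are those given in its Tables 2 and 3.
EXAMPLE_SUBGROUPS = {
    "A1": ["water_source", "hot_water"],
    "A2": ["people_in_household", "bedrooms_in_household", "pets_in_home"],
    "milk": ["unpasteurized_milk"],   # kept ungrouped, weight 1
    "smoke": ["passive_smoke"],       # kept ungrouped, weight 1
}


def ungrouped(variables):
    """Method 0 style layout: every variable is its own subgroup (weight 1)."""
    return {name: [name] for name in variables}


def variable_weights(subgroups):
    """Each subgroup is worth 1, split equally among its member variables."""
    return {
        name: Fraction(1, len(members))
        for members in subgroups.values()
        for name in members
    }


def exposure_score(exposed, subgroups):
    """Sum the weights of the variables a participant was exposed to."""
    weights = variable_weights(subgroups)
    return sum((weights[name] for name in exposed), Fraction(0))


def max_score(subgroups):
    """The total possible score equals the number of subgroups."""
    return len(subgroups)


if __name__ == "__main__":
    exposed = {"water_source", "pets_in_home", "passive_smoke"}
    # 1/2 (water source) + 1/3 (pets) + 1 (passive smoke) = 11/6
    print(exposure_score(exposed, EXAMPLE_SUBGROUPS), "of", max_score(EXAMPLE_SUBGROUPS))
```

Under this sketch, an ungrouped layout of 14 variables gives a maximum of 14, while a layout with 8 or 5 subgroups gives a maximum of 8 or 5, matching the totals stated for Method 0, Method A and Method B.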
Comparisons were performed between the cases and the controls using these predefined approaches (Method A and Method B), and the score analysis for environmental risk factors was performed over the 3 childhood age intervals (0-5, 6-10 and 11-18 years). For each analysis (0-5, 6-10 and 11-18 years), score results are presented as minimum, maximum, mean and median values for the case and control groups. The odds ratios (ORs) and 95% confidence intervals (CIs) represent the significance of the difference between the two groups, as well as the difference in proportion with regard to the number of exposures based on the score.

Statistical analysis
Demographic data for the cases and controls are presented as frequencies (percentages) for the childhood infection, and as medians (interquartile range (IQR)) and means (standard deviation (SD)) for the environmental exposure variables (environmental risk factors). The score analysis was conducted for the environmental exposures over the three age intervals using logistic regression.
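As an illustration of how an odds ratio and 95% CI can be obtained from a logistic regression of case status on an exposure score, the following sketch fits a one-predictor model by Newton-Raphson using only the Python standard library. It assumes case status is the outcome and the score is the only predictor; the text does not state which covariates or software were used, and the function names and example data here are hypothetical.

```python
import math


def _gradient_and_information(b0, b1, scores, is_case):
    """Score vector and Fisher information for logit P(case) = b0 + b1 * score."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in zip(scores, is_case):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return g0, g1, h00, h01, h11


def odds_ratio_per_unit(scores, is_case, max_iter=50, tol=1e-10):
    """Return (OR, CI lower, CI upper, SE of log OR) per one-unit score increase."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0, g1, h00, h01, h11 = _gradient_and_information(b0, b1, scores, is_case)
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g for the 2x2 information matrix
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) < tol and abs(step1) < tol:
            break
    # Wald standard error of b1 from the inverse information at the estimate
    _, _, h00, h01, h11 = _gradient_and_information(b0, b1, scores, is_case)
    se = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    z = 1.959964  # 97.5th percentile of the standard normal distribution
    return math.exp(b1), math.exp(b1 - z * se), math.exp(b1 + z * se), se


if __name__ == "__main__":
    # Made-up data: score 0 -> 10 cases, 20 controls; score 1 -> 30 cases, 15 controls
    scores = [0] * 30 + [1] * 45
    is_case = [1] * 10 + [0] * 20 + [1] * 30 + [0] * 15
    print(odds_ratio_per_unit(scores, is_case))
```

With a binary predictor the fitted OR coincides with the cross-product ratio of the 2x2 table, which gives a convenient check on the sketch; with a continuous score the OR is interpreted per one-unit increase in the score.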

Score analysis for 'Method 0': Environmental risk factors during the 3 age intervals
The results of the score analysis for Method 0, evaluating environmental risk factor exposure in 3 age groups (0-5 years, 6-10 years and 11-18 years) are shown in Table 4. In this model, all exposures were equal to the value of one, thus the total possible score was 14 (Table 1). 0-5 years. During the age interval 0-5 years, the mean and median scores for the case and control group were [4.39 (SD ± 1.93) vs 4.71 (SD ± 1.98); and 4.0 (IQR 3.0-5.0) vs 4.0 (IQR 3.0-6.0)], respectively. Both groups had a minimum score of zero, whereas cases had a maximum score of 10 and controls had a maximum score of 12. There was no significant difference in the exposure scores between the case and control groups (OR = 0.92; 95% CI, 0.83-1.02).
6-10 years. During the age interval 6-10 years, the mean and median scores for the case and control group were [4.39 (SD ± 1.80) vs 4.60 (SD ± 1.92); and 4.0 (IQR 3.0-6.0) vs 4.0 (IQR 3.0-6.0)], respectively. Both groups had a minimum score of zero, whereas cases had a maximum score of 9 and controls had a maximum score of 11. There was no significant difference in the exposure scores between the case and control groups (OR = 0.94; 95% CI, 0.80-1.02).

11-18 years.
During the age interval 11-18 years, the mean and median scores for the case and control group were [3.77 (SD ± 1.54) vs 4.02 (SD ± 1.65); and 4.0 (IQR 3.0-5.0) vs 4.0 (IQR 3.0-5.0)], respectively. The minimum and maximum scores for cases were 0 and 9, respectively, whereas the minimum and maximum scores for controls were 1 and 9, respectively. There was no significant difference in the exposure scores between the case and control groups (OR = 0.90; 95% CI, 0.80-1.02).
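The descriptive statistics reported for each age interval (minimum, maximum, mean with SD, and median with IQR) could be computed along the following lines. This is a sketch only: the quartile method used in the study is not stated, so the 'inclusive' method chosen here is an assumption, and the example scores are made up.

```python
import statistics


def summarise_scores(scores):
    """Minimum, maximum, mean (SD) and median (IQR) of a list of exposure scores."""
    # Quartile definitions vary between packages; 'inclusive' is an assumption here.
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "min": min(scores),
        "max": max(scores),
        "mean": statistics.mean(scores),
        "sd": statistics.stdev(scores),  # sample standard deviation
        "median": statistics.median(scores),
        "iqr": (q1, q3),
    }


if __name__ == "__main__":
    example_scores = [3, 4, 4, 5, 6]  # hypothetical Method 0 scores
    print(summarise_scores(example_scores))
```

Running the same summary separately on the case and control scores yields the paired values presented in the tables.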

Score analysis for environmental risk factors during the three age intervals
Method A. The results of the score analysis evaluating environmental risk factor exposure in 3 age groups (0-5 years, 6-10 years and 11-18 years) for Method A are shown in Table 5. The maximum possible score for Method A was 8.
0-5 years. During the age interval 0-5 years, the mean and median scores for the case and control group were [2.08 (SD ± 0.98) vs 3.58 (SD ± 1.12); and 1.99 (IQR 1.33-2.66) vs 2.16 (IQR 1.66-3.16)], respectively. Both groups had a minimum score of zero, whereas cases had a maximum score of 5.16 and controls had a maximum score of 6.46. There was a significant difference in exposure scores between the case and control groups (OR = 0.74; 95% CI, 0.62-0.92), thus indicating that cases had 26% less exposure during this age interval when compared to the controls.
6-10 years. During the age interval 6-10 years, the mean and median scores for the case and control group were [1.30 (SD ± 0.79) vs 1.64 (SD ± 1.03); and 1.50 (IQR 0.5-2.0) vs 1.50 (IQR 1.0-2.5)], respectively. Both groups had a minimum score of zero, whereas cases had a maximum score of 3.5 and controls had a maximum score of 6. There was a significant difference in exposure scores between the case and control groups (OR = 0.67; 95% CI, 0.53-0.83), thus indicating that cases had 33% less exposure during this age interval when compared to the controls.
11-18 years. During the age interval 11-18 years, the mean and median scores for the case and control group were [1.81 (SD ± 0.85) vs 2.08 (SD ± 0.98); and 1.66 (IQR 1.16-2.33) vs 1.83 (IQR 1.33-2.08)], respectively. The minimum and maximum scores for cases were 0 and 4.66, respectively, whereas the minimum and maximum scores for controls were 0.33 and 5.16, respectively.
There was a significant difference in exposure scores between the case and control groups (OR = 0.72; 95% CI, 0.58-0.90), thus indicating that cases had 28% less exposure during this age interval when compared to the controls.
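The percentages quoted for Method A appear to equal (1 − OR) × 100, and a 95% CI that excludes 1 corresponds to significance at the 5% level. The short sketch below applies that reading to the Method A odds ratios reported above. The function names are illustrative, and the "percent less exposure" interpretation is taken from the text rather than derived independently.

```python
# Odds ratios and 95% CIs as reported in the text for Method A.
METHOD_A = {
    "0-5 years": (0.74, 0.62, 0.92),
    "6-10 years": (0.67, 0.53, 0.83),
    "11-18 years": (0.72, 0.58, 0.90),
}


def percent_lower(odds_ratio):
    """Percentage reduction implied by an odds ratio below one: (1 - OR) x 100."""
    return round((1.0 - odds_ratio) * 100)


def ci_excludes_one(lower, upper):
    """True when the confidence interval lies entirely below or entirely above 1."""
    return upper < 1.0 or lower > 1.0


if __name__ == "__main__":
    for interval, (odds_ratio, lower, upper) in METHOD_A.items():
        print(interval, percent_lower(odds_ratio), ci_excludes_one(lower, upper))
```

Applied to the Method 0 intervals, whose upper limits are all 1.02, the same check returns False, consistent with the non-significant results reported for that method.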
Method B. The results of the score analysis evaluating environmental risk factor exposure in 3 age groups (0-5 years, 6-10 years and 11-18 years) for Method B are shown in Table 6. The maximum possible score for Method B was 5.
0-5 years. During the age interval 0-5 years, the mean and median scores for the case and control group were [1.20 (SD ± 0.63) vs 1.47 (SD ± 0.77); and 1.10 (IQR 0.73-1.60) vs 1.38 (IQR 0.85-1.98)], respectively. Both groups had a minimum score of zero, whereas cases had a maximum score of 3.13 and controls had a maximum score of 3.93. There was a significant difference in exposure scores between the case and control groups (OR = 0.72; 95% CI, 0.58-0.90), thus indicating that cases had 42% less exposure during this age interval when compared to the controls.
6-10 years. During the age interval 6-10 years, the mean and median scores for the case and control group were [1.22 (SD ± 0.60) vs 1.33 (SD ± 0.78); and 1.19 (IQR 0.73-1.60) vs 1.33 (IQR 0.81-1.98)], respectively. Both groups had a minimum score of zero, whereas cases had a maximum score of 3.06 and controls had a maximum score of 4.26. There was a significant difference in exposure scores between the case and control groups (OR = 0.58; 95% CI, 0.43-0.78), thus indicating that cases had 42% less exposure during this age interval when compared to the controls.
11-18 years. During the age interval 11-18 years, the mean and median scores for the case and control group were [1.08 (SD ± 0.60) vs 1.31 (SD ± 0.71); and 0.93 (IQR 0.64-1.41) vs 1.19 (IQR 0.73-1.80)], respectively. The minimum and maximum scores for cases were 0 and 3.13, respectively, whereas the minimum and maximum scores for controls were 0.2 and 3.63, respectively. There was a significant difference in exposure scores between the case and control groups (OR = 0.59; 95% CI, 0.43-0.79), thus indicating that cases had 41% less exposure during this age interval when compared to the controls.

Discussion
The enteric flora is comprised of numerous microorganisms [31] and the history of CD is paved with publications hypothesizing various environmental agents in the aetiology of CD [20,32,33]. However, studies evaluating exposures during childhood have largely focused on the effect of an individual environmental exposure without accounting for potential confounding interactions both between variables and over time, likely limiting the true validity of what is being measured. Using a subset of previously collected data, the present study conducted a post-hoc score analysis to investigate this relationship in an attempt to further delineate the paradigm of the hygiene hypothesis. Results of the score analysis (Method 0) for the environmental exposures during 3 age intervals (0-5, 6-10 and 11-18 years) revealed no significant difference between the case and control groups (Table 4). All exposure variables were equal to a value of one and in turn, frequency of exposure was based on the individual effect of each variable, thus implying that in most cases, the capacity of a single variable to induce dysbiosis within the gut microbiome is limited. By contrast, a significant case-control difference was observed in all age intervals evaluated for both Method A and Method B, with cases having significantly lower exposure scores (approximately 30% and 40% lower, respectively), when compared with that of controls. In support of the hygiene hypothesis, these findings implicate the complex role in which multiple microbial-based environmental exposures function in the development of the intestinal immune system. The latter is further portrayed by the change in both the mean values and CIs observed between Method A and Method B in that, the difference in mean values between the cases and controls in Method A is consistently larger compared to the difference in mean  [20,34]. 
While it is recognized that the gut microbiome is profoundly shaped by the microbial environment, there is now evidence that the converse is also true [35][36][37]. For instance, it has been shown that the intricate array of pattern recognition receptors (PRRs), including the vitamin D receptor (VDR), which equip the immune system to recognize microbial molecular patterns, has a substantial effect on both the diversity and functionality of the gut microbiota [38]. Therefore, genetic alterations in these PRRs may influence this bidirectional interaction, in which immune activity can either suppress or promote pathogenic microbial blooms, in turn affecting homeostasis of host intestinal immunity and the convergence of disease susceptibility. The nucleotide oligomerization domain (NOD)-like receptor encoded by the NOD2 gene is a PRR that recognizes intracellular bacterial products, and variants in the CARD15/NOD2/IBD1 locus are associated with the development and phenotypic patterns of CD [39]. The likelihood of a NOD2-dysregulating microbial community in immune-mediated disorders such as CD is strengthened by recent data from NOD2-/- mice [40], in which unregulated inflammation of the gut results in taxonomic shifts in bacterial phyla characteristic of the change in the microbiome during inflammation, and similar to those described in IBD patients compared with healthy controls.
Findings from the score analysis performed in the present study provide insight into the 'compound' effects of environmental risk factors in the pathogenesis of CD. This has important implications for future IBD-related studies, as it demonstrates the importance of accounting for the environment as a 'whole' during epidemiological research, as opposed to the impact of individual factors. Indeed, it is likely that certain environmental risk factors hold a greater 'weight' with regard to their effect on the gut microbiota in disease pathogenesis, particularly in the context of the timing of exposure during childhood.

Conclusion
The findings from this present study are in line with the hygiene hypothesis, and demonstrate the complex role in which numerous microbial-based environmental exposures together with CD susceptibility genes function in the development of the gastro-intestinal immune system. Furthermore, the score analysis provides insight into the 'compound' effect from environmental risk factors in the pathogenesis of CD in which certain environmental exposures during childhood have greater impact with regard to their effect on the gut microbiota in disease pathogenesis, together with timing of exposure.