The Genetic and Environmental Bases of Complex Human-Disease: Extending the Utility of Twin-Studies

Making only the assumption that twins are representative of the population from which they are drawn, we here develop a simple mathematical model (using widely available epidemiological information) that sheds considerable light on the pathogenesis of complex human diseases. Specifically, for the case of multiple sclerosis (MS), we demonstrate that the vast majority of patients (≥94%), possibly all, require genetic susceptibility in order to get MS. Nevertheless, only a tiny fraction of the population (≤2.2%) is actually susceptible to getting this disease, a finding that is highly consistent across all of the studied populations in both North America and Europe. Men are more likely to be susceptible than women, although susceptible women are more than twice as likely to actually develop MS compared to susceptible men (i.e., they have a greater disease penetrance). This is because women are more responsive to the environmental factors involved in MS pathogenesis than men. These differences account for the current gender-ratio (3∶1, favoring women) and also for the increasing incidence of MS in women around the world. By contrast, the most important genetic marker for MS susceptibility (DRB1*1501) influences the likelihood of susceptibility but not the penetrance of the disease. Nevertheless, even for this major susceptibility allele, only a very small fraction of DRB1*1501 carriers (<5%) are susceptible to getting MS, and for only a minority of MS patients (∼41%) does this allele contribute to their susceptibility. Moreover, each copy of this allele seems to make an independent contribution to susceptibility. Finally, at least three environmental events are necessary for MS pathogenesis and, during the course of their lives, the large majority of the population (≥69%) experiences an environmental exposure that is sufficient to produce MS in at least some susceptible genotypes.
Also, susceptible men (compared to susceptible women) have a lower threshold, a greater hazard-rate, or both in response to the environmental factors involved in MS pathogenesis.

g_01, g_02 = P(G│Gx+) = g_01 ; and: P(G│Gx-) = g_02
Table S3. For Section F, we assume that the hazard for developing MS in susceptible men and women is proportional.

Breakdown of P(MS) based on a Simple Susceptibility Scheme
Appendix S1; Section B

The Nature of Genetic Susceptibility
We define disease-penetrance for any specific genotype (or, equivalently, any specific individual) as the conditional life-time probability of disease given that particular genotype. For MS, it has been established that the DRB1*1501 allele, located on the short arm of chromosome 6, is an MS susceptibility-allele. [21-26] The set of carriers of this allele (HLA+) and the set of non-carriers (HLA-) form a partition of the general population.
Within the populations of North America and northern Europe, it has been consistently observed that:
    P(HLA+│MS) > P(HLA+)
and, thus, also:
    P(HLA-│MS) < P(HLA-)    # (HLA+) and (HLA-) form a partition
Rewriting, rearranging, and combining these two equations yields:
    P(MS│HLA+) > P(MS) > P(MS│HLA-)    (1)
Therefore, a direct consequence of these observations [21-26] is that some genotypes must have a higher penetrance (for MS) than others and, therefore, there must be at least one genotype, within the population, that has the smallest penetrance of any. The set of all genotypes that share this same smallest penetrance is defined as (G-) and its penetrance is P(MS│G-). Members of this set are referred to as being "genetically non-susceptible." Conversely, the set of all genotypes that have a penetrance greater than this minimum is defined as (G), and its members are referred to as being "genetically susceptible." Based on these observations and considerations, therefore, both (G) and (G-) must contain at least one member, they are mutually exclusive, and they partition the population (see Table S3; Section A). Also, the penetrance P(MS│G-) may (or may not) be zero, depending upon the prevalence of purely environmental MS (see Main Text).
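The step from the two observed inequalities to Eq. (1) is an application of Bayes' rule, and it can be checked numerically. The following is a minimal sketch in Python. The carrier frequency P(HLA+) = 0.24 is the value used elsewhere in this appendix; the values for P(MS) and P(HLA+│MS) are hypothetical placeholders chosen only for illustration, and the function name is an assumption of this sketch.

```python
def penetrance_by_carrier_status(p_ms, p_hla, p_hla_given_ms):
    """Return (P(MS|HLA+), P(MS|HLA-)) computed with Bayes' rule."""
    p_ms_given_pos = p_hla_given_ms * p_ms / p_hla
    p_ms_given_neg = (1.0 - p_hla_given_ms) * p_ms / (1.0 - p_hla)
    return p_ms_given_pos, p_ms_given_neg


P_HLA = 0.24           # carrier frequency used elsewhere in the appendix
P_MS = 0.0015          # hypothetical life-time risk, for illustration only
P_HLA_GIVEN_MS = 0.55  # hypothetical carrier frequency among patients

pos, neg = penetrance_by_carrier_status(P_MS, P_HLA, P_HLA_GIVEN_MS)

# Whenever P(HLA+|MS) > P(HLA+), the ordering of Eq. (1) must follow.
print(f"P(MS|HLA+) = {pos:.5f}")
print(f"P(MS)      = {P_MS:.5f}")
print(f"P(MS|HLA-) = {neg:.5f}")
```

Because (HLA+) and (HLA-) partition the population, the two conditional penetrances average back to P(MS) when weighted by P(HLA+) and P(HLA-), which is why one must lie above P(MS) and the other below it.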
In the Model, [27] it is supposed that there are some number of susceptibility loci (n) that harbor susceptibility alleles (i.e., specific DNA sequences, in either "coding" or "non-coding" genomic regions, which, alone or in certain combinations with other such alleles, increase the likelihood of MS compared to individuals who possess only non-susceptibility alleles at each locus). Each locus is assumed to be a chromosomal region that is independent of other loci, although a particular locus may harbor more than one susceptibility allele at a particular region or more than one (linked) susceptibility region. [27] By definition, the set (G) includes all genetic combinations at these n susceptibility loci that lead to genetic-susceptibility. The term P(G) represents the probability that an individual in the general population is a member of this set. We can partition (G) into disjoint subsets (G_h), where every genetic-combination in the subset (G_h) has, within its collection of genotypes at the different susceptibility-loci, at least one group of (h) loci, which are in a "susceptible state" and that, by themselves (i.e., as a combination), would result in susceptibility to MS. [27] In addition, no member of the subset (G_h) can have a group of fewer than (h) loci in a "susceptible state" that, by themselves, would lead to MS-susceptibility. The term "by themselves" indicates that a person having this particular combination of "susceptible states" at the (h) loci is susceptible to getting MS, regardless of the "allelic state" at any other genetic locus. [27] Each subset (G_h) can be further divided into two sub-subsets (S_h+) and (S_h-) based on whether the particular combination that defines membership in the (G_h) subset either does (S_h+) or does not (S_h-) include the DRB1*1501 allele.
Thus, we can also define two subsets of G, (S+) and (S-), such that:
    (S+) = ∪_h (S_h+) ; and: (S-) = ∪_h (S_h-)
In this conceptualization, genetic-susceptibility is understood to be (qualitatively) binary: an individual is either genetically susceptible or they are not. Nevertheless, within (G), there may be a wide variation in the likelihood that MS will develop (i.e., in the penetrance of the different genotypes). As can be appreciated from Eq. (1), such a binary structure is a direct consequence of the fact that DRB1*1501 is an undisputed MS susceptibility-allele. [21-26] Also, as a consequence of this, both the number of susceptibility alleles and the number of loci that harbor such alleles must be at least one. Presumably, there are many others but, in any case, the total number must also have an upper-bound (i.e., not every allele or locus in the genome can be a susceptibility allele or locus). As noted earlier, the combination of allelic states (genotype) at the different susceptibility loci that has the least likelihood of resulting in MS, together with all other combinations that share this minimum likelihood, constitutes the set (G-). Any combination (genotype) that increases this likelihood (even by a minuscule amount) belongs, by definition, to the set (G). Thus, the sets (G) and (G-) are mutually exclusive and both are non-empty. Nevertheless, the set of susceptible individuals (G) could, at least theoretically, encompass virtually the entire population (i.e., all but one genotype) and the penetrance of different susceptible-genotypes could range from nearly zero to one.
The set (G) can also be partitioned into those genotypes that are sufficient to produce disease but do so more often, or exclusively, in "susceptible" environmental circumstances (G0), and those that are sufficient to produce the disease but do so independently of an individual's environmental experiences (G3). The subset (MS, G3) will be referred to as "purely genetic" MS (see Table S3; Section A).
Clearly, using these approximate probabilities (together with these conditional probabilities), if they have been assigned correctly for the population under consideration, then our Assumption (A1) that: will yield an estimate for P(MS), which is too low. The estimate will be better if only the (P2) and (P3) subsets are included in the denominator and will be better still if only P(MS│P3) is considered.
However, it is also the case that, in any MS cohort, individuals will experience an excess mortality (due to MS) compared to an unaffected control population. [29] Therefore, an even better estimate would be derived from the prevalence in the cohort of the population restricted to ages 45-55 years, in which new incident cases are unlikely to occur [3] and where substantial early mortality from MS is unlikely to have yet happened. [28] To get a sense for the possible magnitude of the underestimation, using the approximate probabilities above, then, from Eq. (4), we can calculate that:
If: 1 > a_3 ≥ 0.5 ; as seems likely with an average onset-age [3] for MS of: ~30 yrs
Then:
Clearly, a similar underestimation will occur for the quantities P(MS│P, MZ_MS) and P(MS│P, DZ_MS), the estimates for which, again, rely on cross-sectional probabilities being substituted for longitudinal probabilities. In these cases, however, because the affected proband in the twin-ship is known to have MS, he or she (and, thus, also their twin) will already be in either the (P2) or (P3) age-band. Therefore, for all ascertained pairs (with at least 1 affected) the degree of underestimation for P(MS│DZ_MS) and P(MS│MZ_MS) will be less than it is for P(MS). Nevertheless, from Prop. (4.2); Section C, the estimate of P(G) is derived from the ratio of these two quantities such that:
Therefore, the under-estimate of P(G) from using P(MS│P) (i.e., by using Assumption (A1)) will be mitigated, to some extent, in the ratio.

The Number and Uniqueness of Susceptible Genotypes
It seems that individual MS patients are unlikely to share specific susceptibility genotypes with other MS patients. Thus, both from recent genome-wide screens [26] and from theoretical considerations alone, [27] it seems that there are approximately 100-200 susceptibility loci in the human genome. In addition, it seems that, on average, between 11 and 18 of these loci need to be in a susceptible state in order to confer susceptibility. [27] Under these circumstances, the number of different susceptible combinations (N) will be huge. Thus, regardless of the exact distribution of the number of susceptible loci necessary for each susceptible genotype, with only 7 billion people on earth (of whom fewer than 5% are susceptible), it is unlikely that any more than a tiny fraction of MS patients actually share the exact same combination of susceptibility genes with another MS patient. Nevertheless, even granting this conclusion, this does not exclude the possibility that patients might still be classifiable into "clusters" of genetic associations. In this view, it may be possible to subdivide the universe of susceptible genotypes (i.e., combinations of "susceptible genes") into a more manageable number of different, but possibly overlapping, groups. Thus, perhaps, each group would share certain properties (e.g., expected penetrance, involvement of specific pathways, and so forth) although no member of the group would share an identical collection of susceptibility genes with any other member. Nor would they, necessarily, share any specific subset of susceptibility genes. Rather, for example, each member of the group might possess some number of a cluster of genes in addition to whatever else is necessary to make their particular genotype susceptible.
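To give a sense of scale for the statement that the number of susceptible combinations (N) will be huge, the sketch below (in Python) counts only the number of ways of choosing which loci are in a susceptible state. This is a deliberate simplification and an assumption of the sketch, not the Model's definition of N: it ignores which alleles are present at each locus and treats every choice of h loci as a distinct candidate combination. The locus counts (100-200), the required number of loci (11-18), the world population, and the 5% bound are taken from the text; the function name is hypothetical.

```python
from math import comb

WORLD_POPULATION = 7_000_000_000  # figure quoted in the text
MAX_SUSCEPTIBLE_FRACTION = 0.05   # upper bound quoted in the text


def locus_subsets(n_loci: int, h_required: int) -> int:
    """Number of distinct ways to pick which h of the n loci are susceptible."""
    return comb(n_loci, h_required)


max_susceptible_people = int(WORLD_POPULATION * MAX_SUSCEPTIBLE_FRACTION)
low_count = locus_subsets(100, 11)   # fewest loci, fewest required
high_count = locus_subsets(200, 18)  # most loci, most required

for label, count in (("100 loci, 11 required", low_count),
                     ("200 loci, 18 required", high_count)):
    per_person = count / max_susceptible_people
    print(f"{label}: {count:.3e} subsets, about {per_person:.1e} per susceptible person")
```

Even at the low end of both ranges, the count of candidate combinations exceeds the number of susceptible people by several orders of magnitude, which is the sense in which exact sharing of a combination between two patients is expected to be rare.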
Consequently, in order to identify these "clusters" (if they exist) using a GWAS approach in large datasets, [26] it is important to test as many different combinations of as many different associated genes as possible to explore these "group-associations" with MS. In addition, because gender and HLA-status impact MS-susceptibility (see Main Text & Section D), it is important to use this "cluster" approach, not only for the population as a whole, but also for the different subgroups broken down by gender (men or women) and/or by HLA-status (carriers of 0, 1, or 2 copies of the DRB1*1501 allele).
Moreover, as discussed in the text and in Section D, the prevalence of women in the susceptible subset (G) is low (28-48%). There are (at least) two possible explanations for this circumstance. First, it is possible that the genes that are associated with MS are different between men and women. Second, susceptible women may, on average, require more susceptibility alleles for MS than susceptible men. [27] Therefore, it would be interesting (and important) to perform the GWAS analyses, separately by gender, to determine both whether the same set of genes is associated in men and women and, also, whether MS-women possess more of the ~100 identified susceptibility-genes [26] than MS-men.
Finally, as noted earlier, [27] part of the DRB1*1501 effect seems to be due to a reduction in the number of susceptibility genes needed to produce susceptibility. If this reduction is of greater magnitude in women than men, it might help to explain the gender-difference in MAF between MS-men and MS-women (see Main Text & Prop. 6.4; Section D). Therefore, in the large datasets now becoming available, [26] it would be important to confirm that the MAF difference between men and women actually exists, to confirm the observation that each DRB1*1501 allele and each "(HLA-) allele" has an independent impact on susceptibility, and to compare the number of susceptibility genes present for the different subgroups broken down both by gender and by HLA-status.
Appendix S1; Section C
Canada, the impact of a shared CH environment on disease occurrence seems to be minimal. [4-7, 9, 10, 19, 20] Thus, we assume that:
    P(MS│G-, S_MS) = P(MS│G-, CH) = P(MS│G-)    # Assumption (A2)
The term (IG_MS) is defined, specifically, to exclude the impact of the CH and IU environments beyond any possible impact of CH in siblings (see Prop. 1.4a). Therefore, also:
    P(MS│G-, IG_MS) = P(MS│G-)
The set (G-) has the lowest penetrance of any genotype (see Section B). Consequently, over 95% of concordant MS in siblings is due to genetic susceptibility. This percentage increases to much more than 95% when a more realistic estimate for P(MS, G-│S_MS) is used.

Proof for Proposition 1.3:
Because:
    P(MS│DZ_MS) = 0.054 > 0.029 = P(MS│S_MS)    # Data: Table (2)
and:
    P(G│DZ_MS) = P(G│S_MS)    # DZ-twins are genetically siblings
Therefore, the shared intrauterine (IU) environment, the more similar childhood (CH) environment of DZ-twins (compared to non-twin siblings), or both, increase the risk of MS. However, the fact that all siblings share similar CH environments, together with the actual evidence, [4-7, 9, 10, 19, 20] suggests that this increased MS-risk in twins is due, almost entirely, to an IU environmental effect.

Thus, the independence of (G) and (Gx+) implies the independence of (G) and (Gx-), of (G-) and (Gx+), and of (G-) and (Gx-). Conversely, when Eq. (3) doesn't hold (i.e., where the Gx+ state is associated with MS), then it must also be the case that at least one of these conditions, Prop. (1.7a) or (1.7b), does not hold; and also that Eq. (4) does not hold.
Consequently, if the (Gx+) state is associated with MS, then, so too, is the (Gx-) state.

New Definitions for Proposition 2:
(See also Table 1; Main Paper)
1. P(MS│FT), P(MS│ST) = probability that the first (FT) or second (ST) twin of an MZ twin-pair will get MS, independent of whatever has happened or will happen to their co-twin
(G1) = High-penetrance subset of (G), such that: (G_i ∈ G1)│{z_i > z}
(G2) = Low-penetrance subset of (G), such that: (G_i ∈ G2)│{z_i < z}
If: z_i = z ; then these genotypes are assigned to (G1) and (G2) evenly and randomly so that the sets (G1) and (G2) are mutually exclusive and form a partition of (G).
Among the population of susceptible individuals, the probability of the (i-th) genotype, P(i), is:
By definition, the penetrance of any specific genotype is expected to be the same under equivalent environmental circumstances. The quantity P(MS│IG_MS) has been specifically adjusted (Prop. 1.4a) to remove the impact of the similar environment that twins experience. Therefore, by definition:
With these definitions and assumptions, by the definition of mathematical expectation for the discrete random variable (z_i), and from the definition of the variance (σ_zi^2) of such a variable, therefore:
And, using Assumptions (A5 & A6) together with Eqs. (1 & 2) yields:
Therefore, from Eq. (3), it follows that:
and:
    s' = ∑_k (z_k^2)*P(k) / z_s = E(z_k^2) / z_s = z_s + (σ_zk^2) / z_s ≥ z_s = P(MS│G, Gx-)
Thus, the penetrance for susceptible individuals from the MZ_MS population is increased compared to the penetrance for susceptible individuals in the general population (NB: similar logic applies equally to its subsets).
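The identity s' = z_s + (σ_zk^2) / z_s ≥ z_s used above is the standard result that sampling genotypes in proportion to their penetrance raises the mean penetrance by the variance divided by the mean. A minimal numerical sketch in Python follows; the three penetrance values and their probabilities are hypothetical and chosen only for illustration, and the function names are assumptions of this sketch.

```python
def mean_penetrance(z, p):
    """E(z): average penetrance over genotypes with probabilities p."""
    return sum(zi * pi for zi, pi in zip(z, p))


def penetrance_variance(z, p):
    """Var(z) for the discrete penetrance distribution."""
    mean = mean_penetrance(z, p)
    return sum(pi * (zi - mean) ** 2 for zi, pi in zip(z, p))


def enriched_penetrance(z, p):
    """s' = E(z^2) / E(z): mean penetrance after weighting each genotype by z."""
    second_moment = sum(zi * zi * pi for zi, pi in zip(z, p))
    return second_moment / mean_penetrance(z, p)


z = [0.05, 0.10, 0.40]  # hypothetical genotype penetrances
p = [0.60, 0.30, 0.10]  # hypothetical genotype probabilities (sum to 1)

z_s = mean_penetrance(z, p)
s_prime = enriched_penetrance(z, p)
variance = penetrance_variance(z, p)

print(f"z_s = {z_s:.4f}, s' = {s_prime:.4f}, z_s + var/z_s = {z_s + variance / z_s:.4f}")
```

Equality (s' = z_s) holds only when every genotype has the same penetrance, i.e., when the variance is zero.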
Thus:
    P(G1│G, IG_MS) / P(G1│G) ≥ P(G2│G, IG_MS) / P(G2│G)
Moreover, defining: b_i' = P(MS│IG_MS, G_i) ; then, as in Eq. (4), it follows that:
; for: z_1 ≥ z_2 ; into the above equations, and using the same logic as above for both (G1) and (G2), leads to the conclusion that:
Consequently, more penetrant genotypes are enriched to a greater extent than less penetrant genotypes in both the (MS) and the (MS, IG_MS) populations. Also, because (G1) and (G2) partition (G), therefore:
and, also:

Proof of Proposition 2.3:
We can define the discrete random variable (a_j) as the set of coefficients that randomly pair each of the (j) genotypes in (G, Gx+) with a genotype in a subset (kj) of the genotypes in (G, Gx-).
The penetrance of the (kj) subset will be defined as: P(MS│G_kj) = z_kj
We can then choose the subset (kj) such that: E(z_kj) = z_s ; and: Var(z_kj) = Var(z_k)
If (j > k), then some of the genotypes in (G, Gx-) will be used more than once to make up the (kj) subset.
The (a_j) coefficients will be chosen such that:
Because the sets (G, Gx+) and (G, Gx-) are mutually exclusive, the random variables (z_j) and (z_k) are expected to be independent. In this case, (a_j) and (z_kj) will also be independent, as will (a_j) and (z_k).
By definition: Simple rearrangement leads to: Therefore, if: q < p ; then: However, because, by definition: (1) and, thus: and, because: Simple rearrangement leads to: However, because, by definition: It follows from the definitions and from Props.
However, because, by definition: x ≥ y ; therefore, we conclude that: x ≥ b' ≥ y

Proof of Proposition 3.4:

3.4a
We define: Throughout the domain of: 0 < p < 1; {or , equivalently: In this way (a) and (a') mirror each other such that: If it is true that: Then it also must be true that: a' ≥ b' ; throughout the domain of (1 -p) Therefore: a' ≥ b' (2) Therefore: (3) and: Because: (0 ≤ p ≤ 1) ; one of the following three statements must be true: Consequently, in MS, the maximum possible value for P(G) in the general population is 2.2%. A very similar range-estimate for P(G) is derived from epidemiological data obtained from different populations throughout North America and Europe (Table 7).

4.2b.
In addition, rearrangement of Eq. (5), together with Eq. (5) of Prop. (2.1), yields: Again using the Prop. 5.2b estimate that: However, because the circumstance in which: z = b' ; implies a zero variance, this estimate is almost certainly too high. Therefore, the most useful form for Eq. (12) to take is:

a.
For gender and HLA:

c.
For gender and HLA:

New Definitions for Proposition 5:
The set G can be partitioned into two disjoint subsets (Gx+ and Gx-) based upon whether or not the susceptible person carries a specific genetic characteristic (Gx). Moreover, as in Prop.
By contrast, if the genetic characteristic (Gx) that is chosen to partition (G) is associated with MS (Prop. 1.7), then the same estimate of (g) will be given by any such partition, in which case: when: B ≠ 1; then: g 1 = g 2 ; if and only if: g = 1 # Eqs. our estimated (t/s) will be artificially high and, consequently, the estimate of (g ≥ 0.94) will be too low.

5.2b.
Because both partitions must estimate the same parameter (g), therefore, the only solution for (g) that is consistent with both estimates is: 0.94 ≤ g ≤ 1 ; The first of these enrichments (OR 1 ) is due to Mechanism (1) whereas the second and third (OR 2 and OR 3 ) are due to Mechanism (2). Because, from Prop. (5.2b): g ≈ 1 ; therefore: A ≈ R ; and: In this case, both (OR 3 ) and the combination of the first two enrichment stages (OR 1/2 ) can be directly observed. For the Gender partition, using the Table 6  So that, at least some of the Female enrichment in MS must be due to Mechanism (2).

Proof of Proposition 6.2:
Gender-Status 6.2a. The development of Prop. (4.2) would be unaltered if men and women were to be considered separately.
Therefore, from Table (6)  Because there is no overlap between these two ranges, we conclude that, for this partition, it must be the case that: Tables (2)  Consequently, from these analyses, we conclude that there is a large penetrance-imbalance for gender, in both the 2nd and 3rd enrichment-stages.  separately, therefore, we can define (see Table 5 See also Eq. (10); Prop. (7.1a) for an adjustment to these range estimates.
Again, the lack of any overlap between these predicted ranges indicates that HLA+ individuals are more likely than HLA- individuals to be genetically susceptible to MS.
6.3b. The observations from Tables (2)  where: A ≈ R ; and: A 1 ≈ R 1 ; so that, for this partition: Also from Tables (2 & 5  Compared to individuals who lack the DRB1*1501 allele (HLA-), there is an enrichment of individuals who are homozygous for this allele (2HB+) in an MS population, and this enrichment is much greater than it is for individuals who carry one copy of this allele (1HB+) and one copy of a "non-DRB1*1501" allele (1HB-).
6.4b. This suggests a method for further exploring the impact of a genetic trait (Gx+) on the development of MS.
Thus, by analogy to HWE (Prop. 6.4a), we can consider the development of MS as a selection process with a different "fitness" for each genotype. In the circumstances of DRB1*1501, the three genotypes are:
    Homozygous DRB1*1501: (2HB+)
    Heterozygous: (1HB+) or: (1HB+, 1HB-)
    Homozygous "non-DRB1*1501": (HLA-) or: (1HB-, 1HB-)
In the analogy, for a general population (at HWE) where: P(HLA+) = 0.24 ; therefore:
    p^2 = P(2HB+) = 0.016 ; 2pq = P(1HB+) = 0.224 ; and: q^2 = P(HLA-) = 0.76    (32)
In the general population, these genotypes are presumed to be in HWE and, in fact, for the UCSF #2 control population, this presumption is supported by the data (Table 3). In addition:
Based on the data in Table 3, each of the MS populations studied is either at or very near to HWE with respect to DRB1*1501 status, even though this HWE is (in all cases) a very different one from that of the control populations. Therefore, based on Eqs. (30-32), this yields the relationship that:
    (w_p)p^2 + (w_pq)2pq + (w_q)q^2 = p'^2 + 2p'q' + q'^2 = 1    (33)
and, thus:
Thus, for a population at HWE, the quantity (w_p)^(1/2) estimates the relative MAF of the risk allele in the susceptible MS population compared to its MAF in the general population. Accepting the conclusion that:
Then, this relative MAF, in turn, represents the entire enrichment (OR_1 and OR_2) that occurs when moving, first, from the general population to the (G) population and then, second, from the (G) population to the (MS, G) population. In addition, the ratios of these "fitness" levels represent the relative enrichment of the different genotypes when moving from the general population to the (MS) population.
For example, comparing the relative enrichment of (2HB+) compared to (HLA-), yields:
and, therefore, that:
    (w_p) / (w_q) ≈ P(G│2HB+) / P(G│HLA-)
We will assume that this approximate equality is a true equality and refine our nomenclature such that:
    P(HB+│G) = p' = MAF of the DRB1*1501 allele in the susceptible population
and:
    P(HB-│G) = q' = combined MAF of "non-DRB1*1501 alleles" in the susceptible population
In the context of DRB1*1501 (Table 3), we take the independent selection of these alleles to imply that:
    (w_pq) = (w_p)^(1/2) * (w_q)^(1/2)
Applying these weights to Eq. (31) yields:
    (w_p)p^2 + (w_pq)2pq + (w_q)q^2 = (w_p)p^2 + (w_p)^(1/2)(w_q)^(1/2) 2pq + (w_q)q^2 = {p(w_p)^(1/2) + q(w_q)^(1/2)}^2 = 1
Defining (w) by the relationship: w = (w_p / w_q)^(1/2) > 1 ; we can transform Eq. (36) to yield:
    q^2 + (w)*2pq + (wp)^2 = (w^0)*q^2 + (w^1)*2pq + (w^2)*p^2 = (q + wp)^2 = 1 / w_q
For convenience, we can then define "apparent" initial probabilities for the different genotypes as:
It is in this sense that the two DRB1*1501 alleles are said to be independently selected; that is, the relative normalized selection pressure for two alleles (w^2) is equal to the square of that for one allele (w).
Thus, the weighting scheme implied here is geometric (1, w, w^2) for the homozygous-lack, and for the heterozygous- and homozygous-presence, of the risk allele. This is analogous to the joint probability of two events being the product of the individual probabilities; and it contrasts with the weighting scheme for recessive and dominant traits, assuming a non-zero risk for non-carriers and a suitable definition of (w > 1), which would be (1, 1, w) and (1, w, w), respectively. Moreover, because the arguments made above are fully reversible, the initial and final populations will be in HWE if, and only if, the selection pressure is geometric.
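The claim that a geometric weighting (1, w, w^2) carries one HWE population into another, whereas dominant (1, w, w) or recessive (1, 1, w) weightings do not, can be checked numerically. The sketch below (Python) starts from the general-population value P(HLA-) = q^2 = 0.76 given in the text; the selection weight w = 3 is a hypothetical value chosen only for illustration, and the function names are assumptions of this sketch.

```python
from math import sqrt


def hwe_genotypes(p):
    """HWE genotype frequencies (HLA-, 1HB+, 2HB+) for risk-allele frequency p."""
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)


def apply_selection(genotypes, weights):
    """Weight each genotype by its relative 'fitness' and renormalize."""
    weighted = [g * w for g, w in zip(genotypes, weights)]
    total = sum(weighted)
    return tuple(x / total for x in weighted)


def hwe_deviation(genotypes):
    """het^2 - 4*hom0*hom2, which is zero exactly when the population is at HWE."""
    hom0, het, hom2 = genotypes
    return het * het - 4.0 * hom0 * hom2


q = sqrt(0.76)  # from q^2 = P(HLA-) = 0.76 in the text
p = 1.0 - q
general = hwe_genotypes(p)

w = 3.0  # hypothetical relative selection pressure per risk allele
geometric = apply_selection(general, (1.0, w, w * w))
dominant = apply_selection(general, (1.0, w, w))
recessive = apply_selection(general, (1.0, 1.0, w))

# Allele frequency after geometric selection: p' = wp / (q + wp)
p_prime = geometric[1] / 2.0 + geometric[2]

for name, pop in (("geometric", geometric), ("dominant", dominant), ("recessive", recessive)):
    print(f"{name:9s} deviation from HWE: {hwe_deviation(pop):+.4f}")
```

Under the geometric scheme the selected genotype frequencies are (q^2, 2wpq, w^2 p^2) / (q + wp)^2, which is again a perfect square, so the selected population is at HWE with the new allele frequency p' = wp / (q + wp).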
Consequently, if both initial and resulting populations (following strong selection) are at HWE (Tables 2-4), this implies that, for some (w_p) and (w_q), Eq. (36) Averaging these two experiences yields:
Therefore, despite a very strong selection pressure, the large majority of DRB1*1501 genotype selection seems to occur when moving from the general population to the susceptible (G) population (the OR_1 step) and very little selection seems to occur when moving from the set (G) to the set (MS, G), i.e., during the (OR_2) step. Moreover, the fact that the initial set (general population) and final set (MS, G) are at HWE, almost certainly, means that the intermediate set (G) is also at HWE.
6.4d. In addition, for each of these samples, for men and women (considered separately), the observed proportions of cases in the different HLA-categories are very near to those expected at HWE (Tables 3 & 4). Despite this, however, men consistently have lower odds ratios for MS in all HLA+ categories, a smaller proportion of (2HB+, MS) patients, and a lower probability for P(HLA+│MS) compared to women (Tables 2 & 5). Thus, the observed differences between men and women with MS indicate that men (compared to women) have a smaller MAF for the DRB1*1501 allele in an MS population, which is reflected in the consistent observation from Table 2 that:
    P(HLA+│MS, F, G) > P(HLA+│MS, M, G)    (38)
To evaluate the possible bases for this observation we will consider the following relationships:
This also leads to the lower-boundary condition that:

Because "purely genetic" MS is defined to be independent of the environment (see also Section B), its penetrance is expected to be very high (i.e., near unity) and, thus, we anticipate both that:
    P(MS│G3) ≈ 1    (17)
and also that:
If these Eq. (17) conditions were not to be met, it would raise the question of what factors determined the lower penetrance in (G3). If these factors were potentially identifiable and non-hereditary, then they would constitute environmental events and, thus, these genotypes would be in (G0), not in (G3). Although a stochastic mechanism might lower the penetrance somewhat, such a mechanism seems unlikely to reduce the penetrance of "purely genetic" MS markedly. Using these Eq. (17) conditions, we will first consider the most "extreme" circumstance, in which we assume that:
where the variances of the (x_i) and (y_i) terms (σ_xi^2 and σ_yi^2, respectively) are assumed to be zero. It is noteworthy, however, that these extreme conditions are clearly contrary to observed epidemiological facts.
Thus, under these particular extreme conditions we would also expect that: Therefore, in this circumstance, we would further anticipate that: Indicates that the Eq. (18) conditions (even at: x > 0.8) are very far removed from the actual data.
7.2b. Next we will consider an alternative set of "more plausible" extreme conditions. By Props. (2.1 & 2.2), any variance in the penetrance value within the (G3) or (G2) subset, will lead to the enrichment of more penetrant genotypes when moving either from (G) to the set (G, MS) or from (G, IG MS ) to the set (G, MS, IG MS ).
Therefore, in the new "extreme" condition, we will assume that all of the enrichment that takes place is due to the difference in penetrance between the (G3) and (G2) subsets and, thus, where the variances of the (x_i) and (y_i) terms (σ_xi^2 and σ_yi^2) are still assumed to be zero. Thus, using these definitions, these modified "extreme" conditions then become:
Moreover, because the conditions that: P(G3│G) = P(G1│G) = p ; and: σ_yi^2 = 0 seem too extreme for the actual distribution, and because less extreme assumptions lead to smaller estimates, these derived upper limits for the ranges of P(G3) are, almost certainly, too large.
Therefore, it must be the case that: P(G3│G) ≈ 0
And, consequently, for all practical purposes, "purely genetic" MS does not exist.

make P(HLA-│S-) larger and the presence of protective alleles or genes will make P(HLA-│S-) smaller.
Nevertheless, these other alleles/genes are low in frequency and small in contribution compared to the DRB1*1501 allele. 26 In addition, with approximately 50-200 susceptibility loci and only 11-18 necessary for susceptibility, 27 it seems likely that most of (S-) will consist of combinations not including the DRB1 locus.
Therefore, we will assume that: Consequently, in any case:

3. u, x = Actual (u) and transformed (x) exposure-levels (all necessary factors) of the susceptible population
   x_2 - x_1 = 1 ; Exposure-difference between the 2nd (x_2) and 1st (x_1) time-period is defined as "1 unit"
4. h(u), g(u) = hazard-functions for developing MS in susceptible men {h(u)} and women {g(u)}
   r = the proportionality constant for hazard, such that: g(u) = (r)·h(u)
5. λ_m, λ_w = Exposure-threshold necessary to produce disease in susceptible men (λ_m) and women (λ_w)
   λ = λ_w - λ_m = the difference in exposure-threshold between susceptible women and men
6. c, d = the maximum probability of MS in genetically susceptible men (c) and women (d).

Environmental Considerations
From Prop. (6.2), it is apparent that the greater prevalence of MS in women is due to:
    P(MS, E│F, G) > P(MS, E│M, G)
This could be due to women being more likely to experience a sufficient environmental exposure than men, to women having a different physiological response to a similar exposure compared to men, to women having a greater probability of developing MS once the necessary environmental and genetic events have come together, or it could be due to some combination of these factors. Regardless of the reason, however, women and men require separate consideration so that: (1) for women:
Thus:
    P(MS_1)*P(F│MS_1) = (C)*P(MS_2)*P(F│MS_1) = (C)*P(MS_2)*P(F_1)    # Definition (7)
From Eq. (1):
    P(MS_1, E│G, F) = P(MS_1)*P(F│MS_1) / P(G, F) = P(F_1)(C)*P(MS_2) / P(G, F)
In Canada, the sex-ratio in MS patients {i.e., P(F│MS) / P(M│MS)} has increased from 2.2 in Time-Period-1 (i.e., 1941-1945) to become 3.2 in Time-Period-2 (i.e., 1976-1980). Consequently:
    P(MS, E│G, F)_2 = Zw_2 = P(F_2)*P(MS_2) / P(G, F)
Finally, the level of environmental exposure at which the development of MS becomes possible (i.e., the threshold) does not need to occur at zero and the threshold does not need to be the same for men and women. Consequently, Eqs. (8 & 9) need to be written differently such that:
both 1.0, then, as for true survival, everyone ultimately fails. If the threshold in women (λ_w) is greater than that in men (λ_m), then the difference in threshold (λ) will be positive. Because Assumption (A11) leads to exponential response curves, any two points determine each curve uniquely.
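The Canadian sex-ratios quoted above (2.2 in Time-Period-1 and 3.2 in Time-Period-2) convert directly into the proportion of women among MS patients in each period. The short Python sketch below performs that conversion; it assumes that P(F_1) and P(F_2) denote P(F│MS_1) and P(F│MS_2), as the definition above suggests, and the function name is hypothetical.

```python
def female_fraction(sex_ratio):
    """Convert a sex-ratio P(F|MS) / P(M|MS) into the proportion P(F|MS)."""
    return sex_ratio / (1.0 + sex_ratio)


SEX_RATIO_PERIOD_1 = 2.2  # Time-Period-1 (1941-1945), from the text
SEX_RATIO_PERIOD_2 = 3.2  # Time-Period-2 (1976-1980), from the text

p_f1 = female_fraction(SEX_RATIO_PERIOD_1)
p_f2 = female_fraction(SEX_RATIO_PERIOD_2)

print(f"P(F|MS) in Time-Period-1: {p_f1:.4f}")
print(f"P(F|MS) in Time-Period-2: {p_f2:.4f}")
print(f"P(M|MS) in Time-Period-1: {1.0 - p_f1:.4f}")
print(f"P(M|MS) in Time-Period-2: {1.0 - p_f2:.4f}")
```

The conversion relies only on P(F│MS) + P(M│MS) = 1 within each time-period.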

(31)
If environmental experience is independent of susceptibility, then: 0.690 ≤ P(E) ≤ 1
Consequently, at present, both genders seem to experience, very commonly, each of the necessary environmental events involved in MS pathogenesis (i.e., these are population-wide events).
Of course, from Eq. (24), for both men and women: P(MS│G, E) << 1
Thus, it must be that certain genetic backgrounds are only (or more) responsive to certain environmental experiences. For example, if all genotypes required the (E_A) environmental event (e.g., vitamin D deficiency) [4] but some genotypes required a longer duration or greater intensity of exposure to produce MS than others, then this might help to explain the low penetrance ranges for the parameters (c) and (d) indicated in Eq. (24).