MetMap Enables Genome-Scale Methyltyping for Determining Methylation States in Populations

Figure 2

The methylation state of restriction site B cannot be determined by its read count alone.

Suppose that, because of the size selection step, only fragments of length 50–300 bp are sequenced. The four adjacent restriction sites (denoted by circles) may have different methylation states, resulting in epialleles with different “neighborhood methylation structures” of B. Site B is sequenced only from fragments of type B–C–D, which are the product of alleles in which sites B and D are unmethylated (and cut) and site C is methylated (and not cut).

(a) B is unmethylated in both case 1 and case 2, yet its read count differs between them. In case 1, sites A, B and D are unmethylated and therefore digested by HpaII, giving fragments A–B of length 10 bp and B–C–D of length 100 bp. Fragment A–B is too short to be sequenced, but B–C–D has its ends sequenced. In case 2, all four HpaII sites are digested, giving fragments A–B, B–C and C–D. A–B and B–C are too short and are not sequenced, and so site B is not sequenced. In case 3, site B is methylated, is not cut by HpaII, and is not sequenced. Note that the read count at site B alone cannot distinguish case 2 from case 3.

(b) Analysis is complicated by heterogeneous methylation within a population of cells. The extent to which site B is methylated in the cell population cannot be determined given only the read count at site B. In case 4, although site B is cut in 90% of the cells, it is sequenced only infrequently, because site C is unmethylated and cut in 90% of the cells, resulting in a B–C fragment that is too short for sequencing. In contrast, in case 5 site B is cut in only 10% of the cells, but site C is methylated in 90% of cells, so the majority of the fragments in which site B has been cut will yield a B–C–D fragment and will be sequenced. Thus the methylation structure of neighboring restriction sites strongly influences the frequency with which a site will be sequenced.
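The reasoning in this caption can be checked with a small simulation. The following is a minimal sketch, not the MetMap model itself. The site coordinates are hypothetical, chosen only so that A–B is 10 bp and B–C–D is 100 bp as stated above; the B–C (40 bp) and C–D (60 bp) lengths, the independence of sites across cells, and the assumption that site D is always cut in panel (b) are illustrative assumptions that the caption does not specify.

```python
# Hypothetical site coordinates in bp (assumption): A-B = 10 and B-C-D = 100
# match the caption; B-C = 40 and C-D = 60 are chosen for illustration.
SITES = {"A": 0, "B": 10, "C": 50, "D": 110}
MIN_LEN, MAX_LEN = 50, 300  # size-selection window from the caption


def sequenced_fragments(methylated):
    """Return (left, right) site pairs of fragments that pass size selection.

    `methylated` is the set of sites that HpaII does not cut.
    """
    cut = [s for s in SITES if s not in methylated]  # dict keeps positional order
    return [
        (left, right)
        for left, right in zip(cut, cut[1:])
        if MIN_LEN <= SITES[right] - SITES[left] <= MAX_LEN
    ]


def site_is_sequenced(site, methylated):
    """A site yields reads only if it is an end of a size-selected fragment."""
    return any(site in frag for frag in sequenced_fragments(methylated))


def p_b_sequenced(p_cut_b, p_cut_c, p_cut_d=1.0):
    """Fraction of cells producing a B-C-D fragment.

    Assumes sites are cut independently (an assumption, not from the source):
    B cut, C methylated (uncut), D cut.
    """
    return p_cut_b * (1.0 - p_cut_c) * p_cut_d


# Panel (a): cases 1-3. Case 3 assumes that only B is methylated.
case1 = site_is_sequenced("B", methylated={"C"})   # B read via B-C-D
case2 = site_is_sequenced("B", methylated=set())   # A-B and B-C too short
case3 = site_is_sequenced("B", methylated={"B"})   # B is never cut
print(case1, case2, case3)

# Panel (b): cases 4 and 5, in the style of the caption's percentages.
case4 = p_b_sequenced(p_cut_b=0.9, p_cut_c=0.9)
case5 = p_b_sequenced(p_cut_b=0.1, p_cut_c=0.1)
print(f"{case4:.2f} {case5:.2f}")
```

Under these assumptions, cases 2 and 3 both leave site B unsequenced, and cases 4 and 5 give the same expected fraction of cells contributing reads at B even though B is cut in 90% of cells in one and 10% in the other. This is the ambiguity the figure describes: the read count at B has to be interpreted together with the states of its neighbors.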

doi: https://doi.org/10.1371/journal.pcbi.1000888.g002