Table 1.

Features of different methyltyping techniques.


Figure 1.

Determination of genome-scale methylation states with MethylSeq/MetMap.

Genomic DNA is digested with the methylation-sensitive restriction enzyme HpaII. Unmethylated HpaII sites (open circles) are digested and are therefore found at the ends of restriction fragments, while methylated HpaII sites (black circles) are not digested. Restriction fragments are size-selected according to the Illumina protocol: fragments that are either too long or too short are removed. Fragments that pass size selection are used to construct sequencing libraries. After sequencing, the raw reads are aligned against the reference genome and processed with MetMap to derive genome-scale methylation maps.


Figure 2.

The methylation state of restriction site B cannot be determined by its read count alone.

Suppose that, because of the size-selection step, only fragments of length 50–300 bp are sequenced. The four adjacent restriction sites (denoted by circles) may have different methylation states, resulting in epialleles with different “neighborhood methylation structures” around B. Site B is sequenced only from fragments of type B–C–D, which are produced by alleles in which sites B and D are unmethylated (and cut) and site C is methylated (and not cut). (a) B is unmethylated in both case 1 and case 2, but it receives different read counts. In case 1, sites A, B and D are unmethylated and therefore digested by HpaII, giving fragments A–B of length 10 bp and B–C–D of length 100 bp. Fragment A–B is too short to be sequenced, but B–C–D has its ends sequenced. In case 2, all four HpaII sites are digested, giving fragments A–B, B–C and C–D. A–B and B–C are too short and are not sequenced, so site B is not sequenced. In case 3, site B is methylated, is not cut by HpaII, and is not sequenced. Note that the read count at site B alone cannot distinguish case 2 from case 3. (b) Analysis is complicated by heterogeneous methylation within a population of cells. The extent to which site B is methylated in the cell population cannot be determined from the read count at site B alone. In case 4, although site B is cut in 90% of the cells, it is sequenced only infrequently, because site C is unmethylated and cut in 90% of the cells, resulting in a B–C fragment that is too short for sequencing. In contrast, in case 5 site B is cut in only 10% of the cells, but site C is methylated in 90% of the cells, so the majority of the fragments in which site B has been cut will be B–C–D fragments and will be sequenced. Thus the methylation structure of neighboring restriction sites strongly influences the frequency with which a site is sequenced.
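The cases in this caption can be reproduced with a small simulation. The sketch below is illustrative only and is not part of MetMap: the function names are hypothetical, and the site coordinates are assumptions chosen so that A–B is 10 bp and B–C–D is 100 bp as in the caption (the position of C is not given in the caption and is set arbitrarily). The population-level estimate further assumes that sites are cut independently and that site D is always unmethylated.

```python
def sequenced_sites(positions, unmethylated, min_len=50, max_len=300):
    """Return the sites lying at the ends of fragments that pass size selection.

    positions    -- dict mapping site name to coordinate (hypothetical values)
    unmethylated -- set of site names that are cut by the enzyme
    """
    cuts = sorted(positions[name] for name in unmethylated)
    name_at = {pos: name for name, pos in positions.items()}
    seen = set()
    # Each pair of adjacent cut sites delimits one restriction fragment.
    for left, right in zip(cuts, cuts[1:]):
        if min_len <= right - left <= max_len:
            seen.update((name_at[left], name_at[right]))
    return seen


def expected_b_fraction(p_b_cut, p_c_methylated, p_d_cut=1.0):
    """Fraction of cells yielding a B-C-D fragment, assuming independent sites."""
    return p_b_cut * p_c_methylated * p_d_cut


# Assumed layout: A-B = 10 bp, B-C = 30 bp, C-D = 70 bp, so B-C-D = 100 bp.
positions = {"A": 0, "B": 10, "C": 40, "D": 110}

case1 = sequenced_sites(positions, {"A", "B", "D"})       # C methylated
case2 = sequenced_sites(positions, {"A", "B", "C", "D"})  # all sites cut
case3 = sequenced_sites(positions, {"A", "C", "D"})       # B methylated

case4 = expected_b_fraction(p_b_cut=0.9, p_c_methylated=0.1)
case5 = expected_b_fraction(p_b_cut=0.1, p_c_methylated=0.9)
```

With this layout, site B appears in `case1` but in neither `case2` nor `case3`. Under the independence assumption, `case4` and `case5` give the same expected frequency for B even though B is cut in 90% versus 10% of cells, which is one way to see why the read count at B alone is ambiguous.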


Figure 3.

Inference of site-specific probabilities of unmethylation and annotation of strongly unmethylated islands from MethylSeq read counts.

MetMap constructs a directed graphical model (b) from the genome and read counts (a). The methylation state of each CCGG site is represented by a random variable that also encodes whether the site lies in an unmethylated island. CpG sites are also represented in the model, with the distance between sites affecting the parameters. The read counts are used to set the state of the observed random variables corresponding to the possible sequenced fragments (for simplicity of representation, only a sample of these variables is outlined in the figure). The numbers in the blue circles represent normalized read counts. Dark edges correspond to fragment boundaries. MetMap inferences of the extent of unmethylation (c) are shown alongside the values obtained from bisulfite sequencing validation. The raw read counts are scaled by the value chosen for sample 4 (Methods). Strongly Unmethylated Islands (SUMIs) are annotated from the posterior distributions inferred at sites and the total read counts. The example shows part of an inferred SUMI on chromosome 19 from sample 4.


Figure 4.

The average probability of methylation at SUMIs is highly stable across individuals of the same sex.

All pairings among the four individuals tested are shown. On the left side of each pairing, the correlations between the site-specific MetMap scores are presented for sites within SUMIs. On the right side of each pairing, the correlations of the SUMI scores are presented. The distribution of the sites that are highly unmethylated in one sample but methylated to different extents in the other sample is discussed in Text S3.


Figure 5.

Strongly Unmethylated Islands (SUMIs) in the neutrophil methylome.

Genome-wide SUMI predictions (a) reveal strongly unmethylated islands that are proximal to genes and that do not always correspond to the sequence-based annotations of CpG islands shown in the tracks ‘BF islands’ and ‘CpG islands’ (e.g., in the promoter of LRG1 and in an intron of SH3GL1). (b) The SUMI and BF island length distributions differ in shape from the CpG island length distribution, suggesting numerous short false positives in the latter. (c) Some SUMIs appear 5′ of alternative promoter sites.


Table 2.

Counts of the SUMIs annotated in the four human neutrophil samples.


Table 3.

Percentages of neutrophil SUMIs, UCSC CpG islands and BF-islands that overlap regions associated with functionality.


Figure 6.

Transcription start sites and their close surroundings are enriched with novel SUMIs.

The number of SUMIs that overlap each position within 5 kbp of RefSeq transcription start sites is shown for (a) all neutrophil SUMIs; (b) novel SUMIs (SUMIs that do not overlap UCSC CpG islands or BF islands); (c) SUMIs that do not overlap BF islands; and (d) SUMIs that do not overlap UCSC CpG islands.
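A per-position overlap profile of this kind can be computed as sketched below. This is a minimal illustration, not the authors' procedure: the function name is hypothetical, all features are assumed to lie on a single chromosome with inclusive coordinates, and strand orientation is ignored. The nested loops are kept for clarity and would be too slow for genome-scale input without an interval index.

```python
def tss_overlap_profile(tss_positions, islands, flank=5000):
    """Count, for each offset in [-flank, +flank] around a TSS, how many
    (TSS, island) pairs have the island covering that offset.

    tss_positions -- list of TSS coordinates (same chromosome, assumed)
    islands       -- list of (start, end) tuples, inclusive coordinates
    """
    profile = [0] * (2 * flank + 1)
    for tss in tss_positions:
        for start, end in islands:
            # Clip the island to the window around this TSS.
            lo = max(start, tss - flank)
            hi = min(end, tss + flank)
            for pos in range(lo, hi + 1):
                profile[pos - tss + flank] += 1
    return profile


# Toy input: one TSS and one island spanning 10 bp on either side of it.
profile = tss_overlap_profile([10000], [(9990, 10010)])
```

Index `flank` of the returned list corresponds to the TSS itself; indices below and above it correspond to lower and higher coordinates, respectively.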
