Trisomy 21 Alters DNA Methylation in Parent-of-Origin-Dependent and -Independent Manners

The supernumerary chromosome 21 in Down syndrome differentially affects the methylation statuses at CpG dinucleotide sites and creates genome-wide transcriptional dysregulation of parental alleles, ultimately causing diverse pathologies. At present, it is unknown whether those effects are dependent or independent of the parental origin of the nondisjoined chromosome 21. Linkage analysis is a standard method for the determination of the parental origin of this aneuploidy, although it is inadequate in cases with deficiency of samples from the progenitors. Here, we assessed the reliability of the epigenetic 5mCpG imprints resulting in the maternally (oocyte)-derived allele methylation at a differentially methylated region (DMR) of the candidate imprinted WRB gene for asserting the parental origin of chromosome 21. We developed a methylation-sensitive restriction enzyme-specific PCR assay, based on the WRB DMR, across single nucleotide polymorphisms (SNPs) to examine the methylation statuses in the parental alleles. In genomic DNA from blood cells of either disomic or trisomic subjects, the maternal alleles were consistently methylated, while the paternal alleles were unmethylated. However, the supernumerary chromosome 21 did alter the methylation patterns at the RUNX1 (chromosome 21) and TMEM131 (chromosome 2) CpG sites in a parent-of-origin-independent manner. To evaluate the 5mCpG imprints, we conducted a computational comparative epigenomic analysis of transcriptome RNA sequencing (RNA-Seq) and histone modification expression patterns. We found allele fractions consistent with the transcriptional biallelic expression of WRB and ten neighboring genes, despite the similarities in the confluence of both a 17-histone modification activation backbone module and a 5-histone modification repressive module between the WRB DMR and the DMRs of six imprinted genes. 
We concluded that the maternally inherited 5mCpG imprints at the WRB DMR are uncoupled from the parental allele expression of WRB and ten neighboring genes in several tissues and that trisomy 21 alters DNA methylation in parent-of-origin-dependent and -independent manners.


Introduction
Trisomy 21 (Down syndrome) is the most common autosomal aneuploidy that is compatible with life (average rate of 1/400-800 live births; average life expectancy of 55 years) [1]. The supernumerary chromosome 21 results from meiotic nondisjunction errors in approximately 90-95% of cases during oogenesis [2,3]. Thus, most individuals with Down syndrome inherit two maternal complete and free copies of chromosome 21. Advanced maternal age increases the risk of pregnancy with trisomy 21 [4], while the evidence for an association with paternal age is inconsistent [5][6][7][8]. Therefore, the extremely skewed disparity observed between the maternal and paternal meiotic errors at the origin of chromosome 21 nondisjunction is mainly explained by the effect of advanced maternal age.
Although individuals with Down syndrome share phenotypically distinctive traits including clinical manifestations of atypical and segmental accelerated aging [9], the syndrome exhibits a large variety of physical stigmata that are unevenly represented among probands. Some of the defects may constitute severe pathologies (i.e., cognitive dysfunction, acute lymphoblastic leukemia, congenital heart disease, premature aging and Alzheimer disease-like neuropathology). Despite the considerable number of clinical studies on the intellectual and physical disabilities associated with trisomy 21, and owing to the small proportion of paternally inherited cases, there is still no conclusive information regarding whether parent-of-origin (maternal versus paternal) genetic factors contribute differentially to the observed phenotypic variation. For example, in one study [10], no significant difference was found in the distribution of phenotypic clinical findings between Down syndrome patients with maternal (n = 150) and paternal (n = 8) origin of the nondisjunction errors. In another study [11], congenital heart defects, high arched palate, and short fingers occurred less frequently in cases with a paternally-derived extra chromosome 21 (n = 8) than in cases with a maternally-derived extra chromosome 21 (n = 28). The main caveats in those reports are the use of a limited number of polymorphic markers in the first study mentioned and the use of nucleolar organizer region heteromorphisms in the second study mentioned for the conclusive determination of the paternal origin of the nondisjunction errors.
Considering the scenario of genomic imprinting on chromosome 21 [12], the parental origin of the supernumerary chromosome 21 would contribute differentially to the development of some defects in a parent-dependent manner in Down syndrome. Notably, the uncommon condition of a normal phenotype with maternal [13][14][15] or paternal [16,17] uniparental disomy (UPD) of chromosome 21 does not rule out the possibility of genomic imprinting of nonessential genes. In fact, UPD involving chromosomes containing imprinted genes does not necessarily reveal an imprinted pathological disorder with clinical consequences [18]. Importantly, the extra copy of chromosome 21 causes a genome-wide [19,20] domain pattern [20] of dysregulation of gene expression, and it affects the DNA methylation levels differentially at distinct CpG dinucleotide sites [21,22]. At present, it is also unknown whether those effects are dependent or independent of the parental origin of the nondisjoined chromosome 21.
The parental origin of the nondisjoined chromosome 21 can be established experimentally in nuclear trios (mother, father, and proband) by linkage analysis using highly polymorphic DNA markers (i.e., genotyping with short tandem repeats, STRs) [23]. Such expert analysis is difficult when DNA samples from the progenitors are unavailable. Here, we assessed the dependability of reversible epigenetic molecular imprints resulting in germline-specific 5mCpG for the discrimination of the parental origin of chromosome 21 nondisjunction. For that purpose, we developed a PCR assay based on maternally derived allele methylation at a differentially methylated region (DMR) located in the tryptophan-rich basic protein WRB gene. Two research groups recently identified the target maternal-of-origin 5mCpG imprints at the WRB DMR using genome-wide methylation chip technology [24,25]. The occurrence of a DMR in the WRB gene warranted proposing it as the first candidate maternally imprinted gene (i.e., paternally expressed) on human chromosome 21. In contrast to the uniparental inheritance pattern of allele expression determined by imprinting, one study reported alternate (i.e., opposing) monoallelic expression of the maternal or the paternal WRB alleles in different human fetal tissues from the same embryo [25], a pattern consistent with random monoallelic expression rather than with imprinting. Thus, we also reappraised the candidate imprinting status of the WRB gene. We performed single-nucleotide polymorphism primer extension (SNuPE) at informative 3´-UTR SNP variants in DNA from blood cells and in human embryonic stem cell lines (hESCs). Furthermore, we conducted an integrative and comparative epigenomic computational analysis using transcriptome RNA sequencing (RNA-Seq) public repositories, essentially adopting the frameworks recently described to explore enrichment-based sequencing data [26,27].
Finally, we also evaluated the impact of the parental origin of the supernumerary chromosome 21 on the methylation statuses at CpG sites in the RUNX1 (chromosome 21) and TMEM131 (chromosome 2) genes, the methylation levels of which are altered with trisomy 21 [21,22,28].

Ethics Statement
Peripheral blood samples from participating control and Down syndrome nuclear families were collected with written informed consent. For infants and children with Down syndrome, a surrogate consent procedure was used, whereby the next of kin or a legally authorized representative consented in writing on the behalf of the participants. The subjects were included from projects approved by the Ethics Committee of the Faculdade de Medicina de Campos, Brazil (approval code FR-278769), the Faculdade de Medicina de São José do Rio Preto, Brazil (HCRP 5810/2009) and the Universidade do Estado do Rio de Janeiro, Brazil (040/2005). The main objectives of those projects were to develop molecular genetic tests for the determination of the parental origin of the nondisjunction of human chromosome 21 and to screen for parent-of-origin genetic risk factors for trisomy 21. This study was conducted according to the principles expressed in the Declaration of Helsinki.

Subjects
We included 70 nuclear families (mother, father, and child) of either male or female index cases with the full free trisomy 21 status established by conventional karyotyping (G band) in at least 20 cells. We also included 20 families of children with no trisomy (controls). The average maternal age of index cases was 31.6 years (ranging from 16 to 44 years). After determining the parental origin of the nondisjunction error for at least three informative STRs and selecting for the informative SNPs in the target loci, 15 trios with a maternal origin (MT21), six with a paternal origin (PT21) and 13 control trios (N21) met the criteria for the investigation of parental-of-origin allele methylation and allele expression.

Determination of the Parental Origin of Chromosome 21 Nondisjunction through Linkage Analysis
To select cases of trisomy 21 of either maternal or paternal origin of the nondisjunction error, we performed linkage analysis by quantitative fluorescence PCR genotyping nuclear trios at highly polymorphic STRs using fluorochrome-labeled primers and separating the amplimers by high-resolution capillary electrophoresis. The physical locations and primer sequences are shown in S1 Table. We determined the parental origin of the extra copy of chromosome 21 by comparing the genotype of the index case with the parental genotypes at the informative STR loci. We identified the parental alleles according to the following possible scenarios of allele segregation: index case presenting with a triallelic pattern exhibiting allele ratios of approximately 1:1:1 or a biallelic pattern exhibiting a consistent allele ratio of 2:1. The observed genotypes are shown in S2 Table.
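The allele-segregation logic described above can be illustrated with a minimal Python sketch. It is not the authors' software: the function names, the 25% tolerance on peak-area ratios, and the example genotypes are all hypothetical, and the sketch assumes that the proband carries two alleles from one parent and one from the other at each informative STR.

```python
from collections import Counter
from itertools import combinations_with_replacement


def classify_str_pattern(peak_areas, tolerance=0.25):
    """Classify a trisomic proband's STR profile from allele peak areas.

    peak_areas: dict mapping allele label -> area under the peak.
    tolerance: hypothetical relative tolerance on the expected ratios.
    """
    areas = sorted(peak_areas.values(), reverse=True)
    if len(areas) == 3:
        mean = sum(areas) / 3
        if all(abs(a - mean) / mean <= tolerance for a in areas):
            return "triallelic 1:1:1"
    elif len(areas) == 2:
        ratio = areas[0] / areas[1]
        if abs(ratio - 2.0) / 2.0 <= tolerance:
            return "biallelic 2:1"
    return "uninformative"


def _explains(proband, two_from, one_from):
    """True if proband = two alleles from one parent + one from the other."""
    target = Counter(proband)
    for pair in combinations_with_replacement(sorted(set(two_from)), 2):
        for single in set(one_from):
            if Counter(pair) + Counter([single]) == target:
                return True
    return False


def infer_parental_origin(proband, mother, father):
    """Infer which parent contributed the extra allele at one STR locus.

    proband: list of three alleles (dosage expanded, e.g. [10, 10, 12]).
    mother, father: tuples with the two parental alleles.
    """
    maternal = _explains(proband, mother, father)
    paternal = _explains(proband, father, mother)
    if maternal and not paternal:
        return "maternal"
    if paternal and not maternal:
        return "paternal"
    return "uninformative"
```

For example, with a mother genotyped (10, 12) and a father genotyped (14, 16), a proband carrying alleles 10, 12 and 14 is assigned a maternal origin, whereas a locus at which both parents share the same genotype is reported as uninformative.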

Assessment of Heritable Epigenetic Methylation Imprints
To infer the methylation status at a given CpG island (CGI), we developed CGI-specific methylation-sensitive restriction enzyme-based PCR triplex assays (MSRE-PCR). In one tube reaction, each assay amplifies the target CGI and two other genomic regions as a control for the efficiency of the restriction enzyme digestion and to normalize the estimate ratios of the restriction enzyme-resistant 5mCpG sites. Allele-specific methylation was determined by interrogating informative SNPs neighboring the target CGIs. Information regarding the physical location of the selected target CGIs, 5mCpG-sensitive restriction enzymes, SNPs and primer sequences used in the assays is included in S1 Table. Because of the potential diagnostic value of the WRB DMR (CGI-2) for the discrimination of the parental origin of nondisjunction of the supernumerary chromosome 21, we describe its specific MSRE-PCR assay here. On untreated gDNA, the assay generates three FAM-labeled amplimers of different lengths. On HhaI-treated gDNA, the combined pattern of the amplimers is used to determine one of three possible statuses of methylation: hypomethylated, hemimethylated or hypermethylated. The genomic amplimers are: (1) The WRB DMR target region (observed amplicon size: 254 bp), encompassing the recognition sites for HhaI, and the rs2299739 and rs2244352 SNPs (S1 Table). (2) The known unmethylated ESCO2 core promoter CGI (observed amplicon size ranging 248 to 250 bp), encompassing many HhaI recognition sites. The ESCO2 core promoter CGI was originally featured as a query target in the SALSA MS-MLPA ME030 probemix commercial kit (MRC Holland). This genomic amplimer provides information regarding the efficiency of the restriction enzyme reaction. (3) A region (observed amplicon size ranging 269 to 271 bp) genetically linked to the WRB DMR but lacking HhaI recognition sites. This amplimer serves as a normalization reference.
Parental allele methylation statuses at the WRB DMR were assessed by interrogating neighboring SNPs using a non-fluorescent uniplex version of the MSRE-PCR. Next, we carried out SNuPE assays to genotype the methylated alleles, refractory to digestion with the restriction enzyme, using SNaPshot technology (Thermo Fisher Scientific, Waltham, MA, USA). To validate the parental allele-specific methylation statuses, we used the McrBC restriction endonuclease that acts upon DNA containing methylcytosine on one or both strands [31]. We used the same experimental design to profile the methylation statuses at 5mCpG-restriction enzyme sensitive sites located at the WRB CGI-1 and CGI-3. To evaluate the impact of the parental origin of the extra copy of chromosome 21 in the methylation of CpG sites at the RUNX1 and TMEM131 genes, we developed gene-specific methylation-sensitive restriction enzyme MSRE-PCR assays (S1 Table).
We calculated the ratio of restriction enzyme-resistant 5mCpG sites at the queried CGIs from the following quantities: C is the value of the area under the peak (AUP) of the queried CGI amplimer from the restriction enzyme-digested DNA sample; D is the AUP of the negative amplimer from the restriction enzyme-digested DNA sample; and f_AB is the correction factor that accounts for the lack of symmetry and imbalance observed between the AUP for the queried CGI amplimer (A) and the negative amplimer (B) obtained from the undigested DNA sample.
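The equation itself is not reproduced in this text, so the following Python sketch shows only one plausible way to combine the quantities defined above. It assumes that the correction factor is the undigested ratio f_AB = A / B and that the ratio of resistant sites is C / (D × f_AB), expressed as a percentage. Both the functional form and the function name are assumptions, not the authors' stated formula.

```python
def methylation_ratio(a_undigested, b_undigested, c_digested, d_digested):
    """Percentage of restriction enzyme-resistant 5mCpG sites (assumed form).

    a_undigested: AUP of the queried CGI amplimer, undigested DNA (A).
    b_undigested: AUP of the negative amplimer, undigested DNA (B).
    c_digested:   AUP of the queried CGI amplimer, digested DNA (C).
    d_digested:   AUP of the negative amplimer, digested DNA (D).
    """
    # Assumption: f_AB corrects for unequal amplification of the two
    # amplimers, estimated from the undigested sample.
    f_ab = a_undigested / b_undigested
    # Assumption: the digested target signal is normalized to the
    # digestion-refractory negative amplimer and then corrected by f_AB.
    return 100.0 * c_digested / (d_digested * f_ab)
```

Under these assumptions, a target peak that drops to half of its expected area after digestion (for example A = 1000, B = 1200, C = 500, D = 1200) gives a ratio of about 50%, the value expected for a hemimethylated locus.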

An Androgenetic Reference Sample of Methylation Statuses in Male Germline Derivatives
The homogenous androgenetic nature of the hydatidiform mole was determined by comparing the genotypes (S3 Table) of the two samples of the same specimen with that of a peripheral blood sample from the donor by quantitative fluorescence PCR using the AmpFlSTR Identifiler system (Thermo Fisher Scientific, Waltham, MA, USA).

Assessment of Transcriptional Allele Expression
To discriminate between the possible biallelic and the monoallelic patterns of transcriptional expression of the WRB gene, we interrogated 3´-UTR SNPs in cDNAs obtained from heterozygous blood donors and hESCs using SNaPshot technology. For comparison, we genotyped SNPs at two paternally imprinted reference genes: H19 [32] and ATP10A [33]. Primer sequences are shown in S1 Table.

High-Resolution Capillary Electrophoresis for Separation of Fluorochrome-Labeled Amplimers
Amplimers were analyzed on an automated laser fluorescent ABI PRISM 310 Genetic Analyzer (Thermo Fisher Scientific, Waltham, MA, USA). The electropherograms were generated using the dedicated GeneScan Analysis and Genotyper software version 3.7 packages and GeneMapper ID version 3.2 (Applied Biosystems, from Thermo Fisher Scientific, Waltham, MA, USA).

The Methylation Statuses at the WRB CpG Islands in Methylome Public Datasets
To examine the methylation patterns at the CGI mapped to the reference WRB locus, we performed an integrative and comparative epigenomic computational analysis by viewing the Smith Lab Public DNA methylation track hub [34] and the 111 distinct epigenomes from the Roadmap Consortium [35] at the UCSC Genome Browser [36,37]. The Smith Lab Public DNA methylation track hub comprises a pre-loaded set of 183 analyzed human methylomes from bisulfite sequencing experiments from brain [38][39][40], hESCs [41], induced pluripotent stem cells (iPSCs) [42] and blood cells [43][44][45]. The rates of 5mCpG along chromosome 21 in human oocytes [46] were displayed as a custom track in the UCSC Genome Browser. We graphically displayed the predicted promoters and CTCF and POL2 binding sites at the UCSC interface using the Ensembl Regulatory Build and Transcription Factor Binding track resource [47].

Identification of DNA Motifs Mapping at the WRB DMR
To search for cis-acting motifs that may be implicated in the differential epigenetic statuses of the WRB DMR, we queried published libraries of DNA-binding proteins with the FASTA reference sequence for the WRB DMR using online search DNA motif programs [48,49]. We also used Tandem Repeat Finder [50] to scan the WRB DMR DNA reference sequence for the presence of ungapped (regular, fixed-length patterns) DNA repeat elements. The downstream analysis for the identified array is based on the rationale that cis elements constitute platforms on which the interactions with site-specific DNA-binding factors are built to establish and maintain epigenomic modifications [51]. To investigate whether the identified DNA motifs correspond to transcription factor binding sites, we compared the DNA motifs against a database of known motifs using the TOMTOM tool [52], which ranks each suitable match to the query and displays motif web logos. To address whether the motifs found in the WRB DMR are present in other loci, we searched the entire reference genome sequence using the FIMO tool [53] from the MEME program suite [48] available online at http://meme-suite.org/. The conservation of the array of DNA motifs was investigated by lifting over the coordinates in the reference genomes of vertebrates using the LiftOver tool from the UCSC Genome Browser [36] and submitting the lifted FASTA sequences to the motifs analysis. The coordinates of the tandem repeat motif array were then graphically displayed in the UCSC Genome Browser [36,37] using the custom track tool.

Combinatorial Histone Modification Expression Signatures
We compared the common combinatorial histone modification expression patterns across the DMRs in 29 known maternally and two paternally imprinted genes [46] with that in the candidate imprinted WRB gene by displaying the sequence features and the activating or repressive histone marks in the UCSC genome browser using annotation and track hubs. As testable predictions of the epigenetic status, we used the following two modification modules defined by Wang and collaborators in human CD4+ T cells [54]. The 17-histone modification activation backbone module: H2A.Z, H2BK5ac, H2BK12ac, H2BK20ac, H2BK120ac, H3K4ac, H3K4me1, H3K4me2, H3K4me3, H3K9ac, H3K9me1, H3K18ac, H3K27ac, H3K36ac, H4K5ac, H4K8ac, and H4K91ac. The 5-histone modification repressive module: H3K27me3, H3K27me2, H3K9me3, H3K9me2, and H4K20me3.

Reappraisal of the Candidate Maternally Imprinting Status of the WRB Gene Using Secondary Analysis of Massive Parallel RNA Sequences
We measured allelic imbalance at likely heterozygous loci that map within a 4-Mb chromosomal region centered at the WRB gene by querying RNA sequence read archive (SRA) public data repositories. We included exon, 5´-UTR, 3´-UTR and ncRNA SNPs being 159 SNPs with MAF > 0.1 (i.e., expected global heterozygosity rate of 0.18) as testable predictions and 4 SNPs with MAF <0.05 as a control for monoallelic expression (S4 Table), mapping to within 35 genes and ncRNAs. As a reference, we used SNPs at the SNURF and H19 genes (S1 Table), which are both imprinted in every human tissue tested so far [55]. We used IUPAC genomic reference to correct for reference allele preference during alignment [58]. We restricted the analysis to two series of experiments. The first included 1,012 human RNA-Seq sample runs (S5 Table) selected from the 4,978 unsorted accessions recently analyzed by Deelen et al. (2015) [59]. The second included 212 SRA accessions from public repositories and sorted into 15 primary tissue sources (S6 Table) by their reported Biosample and Bioproject unicity (i.e., one sample, not a mixture, from a single donor). We quantified the allele fractions only in runs that yielded a depth of at least 80 reads, with a quality of base calls > Q30, increasing the variant confidence detection to a probability of correct SNP call of 0.999 provided a read depth (coverage) of 40 for a theoretical heterozygous position [60]. We categorized the patterns of allele expression as monoallelic, biallelic or biallelic imbalance. The criteria were: monoallelic if the allele fractions were < 0.15 or > 0.85; biallelic if the allele fractions ranged from 0.35 to 0.65; and biallelic imbalance if the allele fractions ranged from > 0.15 to <0.35 or > 0.65 to <0.85. We used a chi-square test to evaluate whether the allele-specific read counts deviated from the expected proportions (50/50) [58]. 
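The classification criteria and the chi-square test described above can be sketched in Python as follows. This is a minimal illustration rather than the pipeline used in the study: the function names are hypothetical, the test has one degree of freedom against an expected 50/50 split, and fractions falling exactly on the 0.15 or 0.85 boundaries (which the stated criteria leave unassigned) are placed in the imbalance category by assumption.

```python
from math import erfc, sqrt


def classify_allele_fraction(ref_reads, alt_reads, min_depth=80):
    """Classify allele expression at one SNP from allele-specific read counts.

    Returns None when the read depth is below the inclusion threshold.
    """
    depth = ref_reads + alt_reads
    if depth < min_depth:
        return None
    fraction = ref_reads / depth
    if fraction < 0.15 or fraction > 0.85:
        return "monoallelic"
    if 0.35 <= fraction <= 0.65:
        return "biallelic"
    # Assumption: the boundary values 0.15 and 0.85 are treated as imbalance.
    return "biallelic imbalance"


def chi_square_5050(ref_reads, alt_reads):
    """Chi-square goodness-of-fit test (1 df) against a 50/50 expectation.

    Returns (statistic, p_value). For one degree of freedom the survival
    function of the chi-square distribution equals erfc(sqrt(x / 2)).
    """
    depth = ref_reads + alt_reads
    expected = depth / 2
    statistic = ((ref_reads - expected) ** 2 + (alt_reads - expected) ** 2) / expected
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value
```

As a usage note, a SNP covered by 60 reference and 40 alternative reads is classified as biallelic by the fraction criteria, yet the chi-square statistic is 4.0, so the two measures can disagree at moderate depth and are best read together.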
To map FASTQ-formatted, filtered spot RNA raw sequence data to the hg19 reference sequence, we used a flexible workflow created with the web-based Galaxy tool suite (https://usegalaxy.org/) [61]. The workflow incorporates the use of the following software tools: FastQC (read quality control check) [62], FASTQ Groomer, Filter by quality (Phred score quality cut-off: 25; minimum percentage: 90), removal of sequencing artifacts, Bowtie2, BWA, BAM-to-SAM, Filter SAM or BAM, output SAM or BAM (reads with a maximum of one variant or reads with no variant), and FreeBayes (Bayesian genetic variant detector). We visualized the filtered and aligned reads using the UCSC graphical interface.
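The read-level quality filter in this workflow can be illustrated with a short Python sketch. It assumes that "minimum percentage: 90" means that at least 90% of the bases in a read must reach the Phred cut-off of 25, and that quality strings use the Phred+33 (Sanger) encoding. The function name is hypothetical and the sketch does not reproduce the Galaxy tool itself.

```python
def passes_quality_filter(quality_string, min_phred=25, min_percent=90, offset=33):
    """Return True if enough bases in a FASTQ quality string reach min_phred.

    quality_string: the fourth line of a FASTQ record.
    offset: ASCII offset of the encoding (33 assumed, Phred+33).
    """
    scores = [ord(ch) - offset for ch in quality_string]
    if not scores:
        return False  # an empty read cannot satisfy the filter
    good = sum(score >= min_phred for score in scores)
    return 100.0 * good / len(scores) >= min_percent
```

Under these assumptions, a ten-base read with nine bases at Phred 40 and one base at Phred 2 is retained (exactly 90% of the bases pass), whereas a read with two low-quality bases out of ten is discarded.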

Results
The Methylation Statuses at the WRB CpG Islands
The WRB reference sequence locus contains five annotated CpG islands, CGI-1, CGI-2, CGI-3, CGI-4, and CGI-5 comprising 64, 27, 19, 17 and 18 CpG dinucleotide sites, respectively (Fig 1). CGI-1 encompasses the 5´-UTR and exon 1 of the ENST00000333781.8 reference WRB transcript variant 1 (long variant), and maps to a predicted promoter region. In public methylome databases, CGI-1 is essentially unmethylated in gametes and in somatic and embryonic cells and tissues (Fig 1). CGI-2 maps to a second predicted promoter region, located upstream of the TSS of the ENST00000380708.4 reference WRB transcript variant 2 (short variant) (Fig 1). CGI-2 is located within the differentially methylated region (DMR) reported by Court and collaborators [24] and Docherty and collaborators [25] for which WRB was classified as a candidate maternally imprinted gene (i.e., paternal-origin allele expressed). CGI-2 is differentially methylated in oocytes versus sperm [46], partially methylated in adult somatic tissues, and hypermethylated in embryonic cells and tissues (Fig 1). CGI-3 is differentially methylated in female and male gametes, essentially unmethylated in adult cells and tissues, and hypomethylated in embryonic stem cells (Fig 1). CGI-4 is hypermethylated in gametes, in all somatic cells, and in tissues (Fig 1). CGI-5 is differentially methylated in the gametes and is ubiquitously hypermethylated in somatic cells and tissues (Fig 1).
Using target-specific MSRE-PCR assays, we experimentally replicated the above methylation statuses at the WRB CGI-2 (Fig 2A), CGI-1 (Fig 2B), and CGI-3 (Fig 2C) in genomic DNA from disomic individuals. The overall average ratio of restriction enzyme-resistant 5mCpG sites at the WRB CGI-2 was 47.4% in disomic blood cells from adult donors (S7 Table). In hESCs, the overall average ratio of 5mCpG sites was 82.1% (S7 Table). In contrast, the assay revealed a consistently unmethylated pattern of CpG sites at the WRB CGI-1 and CGI-3 (Fig 2B and 2C).

Fig 1 (caption; opening text missing). [...] [36], the species-conserved principal transcript (ENST00000333781, in pink) (APPRIS [75]), the Ensembl Regulatory Build CTCF and POL2 activity and predicted promoters [76], CpG islands, CpG sites, and regulation and methylome studies indicated in the Materials and Methods section. The CGI-2 is located in the differentially methylated region (DMR) reported by Court and collaborators [24] and Docherty and collaborators [25] (depicted in red in the custom track named Court/Docherty DMR), from which WRB was classified originally as a novel candidate, maternally imprinted gene (i.e., paternal-origin allele expressed). The custom tracks containing the sperm and oocyte DNA methylation signals correspond to the supplementary data reported by Okae et al. [46]. Screenshot generated using the UCSC Genome Browser hg19 (http://genome.ucsc.edu).
The allele-specific methylation at the WRB CGI-2 DMR was assessed experimentally across the neighboring rs2244352 SNP in genomic DNA from peripheral blood cells of disomic heterozygous subjects. The maternal alleles were consistently methylated (i.e., refractory to digestion with HhaI and susceptible to digestion with McrBC) while the paternal alleles were unmethylated (i.e., susceptible to digestion with HhaI and refractory to digestion with McrBC) (Fig 3). Thus, the underlying genetic basis for the hemimethylated profiles observed at the WRB CGI-2 DMR is unequivocally due to the inheritance of specific maternal-allele 5mCpG imprints.

Discrimination of the Parental Origin of Chromosome 21 Nondisjunction Using Heritable 5mCpG Imprints at the WRB CGI-2 DMR
We assessed the dependability of the known heritable epigenetic methylation [24,25] resulting in maternal allele methylation imprints at the WRB CGI-2 DMR to ascertain the parental origin of chromosome 21 nondisjunctional events in trisomy 21 probands. For these experiments, the parental origin of the nondisjunctional events was unambiguously established by linkage analysis in nuclear families using highly polymorphic STRs. The genotypes and the segregating alleles are shown in S2 Table. The ratios of restriction enzyme-resistant 5mCpG sites at the WRB CGI-2 DMR were significantly different between the MT21 probands (average ratio of 68.4%) and the PT21 probands (average ratio of 33.5%) (Fig 4 and S7 Table). As follows from the inheritance of maternal allele-derived 5mCpG imprints, the WRB CGI-2 was unmethylated in DNA from a complete androgenetic mole (Figs 4 and 5A). In hESCs (Figs 4 and 5B, and S7 Table), the WRB CGI-2 was hypermethylated, indicating the loss of the imprinted pattern of 5mCpG marks at this developmental stage.

Parent-of-Origin-Independent Altered Methylation Effects Due to the Extra Copy of Chromosome 21
Because the supernumerary copy of chromosome 21 differentially affects the methylation statuses at distinct loci in the genome [21,22], we tested whether the levels of methylation at loci in either cis- or trans-configuration with regard to the WRB CGI-2 DMR change in a parent-of-origin-dependent fashion. Using the MSRE-PCR approach, we assessed the methylation statuses at CpG sites within the RUNX1 (cis) and TMEM131 (trans) genes that are differentially epigenetically perturbed in trisomic samples versus disomic samples [21,22,28]. Compared with disomic individuals, in trisomic subjects there is a gain of methylation at RUNX1 and a loss of methylation at TMEM131. We replicated the reported perturbation effects of the extra copy of chromosome 21 on the methylation patterns at these loci. Importantly, and in contrast to the situation at the WRB CGI-2 DMR, the epigenetic perturbations observed at the RUNX1 and TMEM131 CpG sites were independent of the parental origin of the supernumerary chromosome 21.

Fig 2 (caption; opening text missing). [...] representative control disomic DNA sample (blood) using the HhaI methylation-sensitive restriction enzyme-based PCR triplex assay developed in this study. Electropherograms of the amplimers generated from either undigested genomic DNA (upper panel) or DNA digested with HhaI (lower panel) genotyped via quantitative fluorescent PCR. The positive amplimer refers to a locus in the ESCO2 gene with constitutively hypomethylated CpG dinucleotides at the target restriction enzyme sites (100% susceptible to HhaI digestion). The negative amplimer refers to a WRB region that lacks HhaI sites, and is, therefore, refractory to enzymatic digestion. The numbers in the upper boxes correspond to the amplimer lengths in base pairs while those in the lower boxes refer to the areas under the peak of the amplimer. In this representative DNA sample, the ratio of 5mCpG sites at the WRB CGI-2 was 50.6%. In contrast, the assay revealed a consistent unmethylated pattern of CpG sites at the WRB CGI-1 (B) and WRB CGI-3 (C).

The Maternal 5mCpG Imprints at the WRB CGI-2 DMR Do Not Dictate a Paternal Monoallelic Expression
To assess whether the maternal allele-specific methylation pattern observed at the intragenic WRB CGI-2 DMR is indicative of functional gene regulation differences in the form of genomic imprinting, we qualitatively tested for allele-specific gene RNA expression by interrogating 3´-UTR SNP variants in cDNAs of RNA from blood samples and hESCs. We noted that both the maternal and paternal alleles were expressed in heterozygous samples independently of the methylation statuses observed at the WRB CGI-2 DMR (Fig 7). In hESCs, in which the WRB CGI-2 DMR is hypermethylated, the WRB transcriptional expression profile was biallelic (Fig 8).

Fig 3. Maternal-of-origin-dependent imprinted methylation marks at the WRB CGI-2 DMR. 5mCpG-sensitive restriction endonuclease sites at the WRB CGI-2 DMR are differentially methylated on the maternal allele versus the paternal allele in a manner consistent with imprinting. Electropherograms of the genotype profiles in a control nuclear family, informative for the rs2244352 (C>A) SNP neighboring the WRB CGI-2 DMR. In the child, the maternal-derived C allele is 100% resistant to HhaI digestion (fully methylated), whereas the paternal-derived A allele is 100% susceptible to HhaI digestion (i.e., unmethylated). The parental allele-specific methylation statuses were validated using the McrBC restriction endonuclease that cleaves methylated DNA and, therefore, the unmethylated paternal allele, but not the maternally methylated allele, remains undigested.
Genomic imprinting can either completely silence one parental allele or significantly reduce its expression. We queried RNA-Seq public archives using sequence substrings for each exon 1 of the two major WRB reference transcripts, variants 1 (ENST00000333781.8) and 2 (ENST00000380708.4). We found evidence of expression (> 80 reads) of the WRB transcript variant 1 in RNA-Seq experiments in the brain, fallopian tube, liver, muscle, ovary, skin, and testis samples (S8 Table). However, using a sequence substring specific for the exon 1 of the WRB transcript variant 2, the number of reads filtered in each SRA accession from primary tissues was consistently 16, thus impairing further analysis of this short transcript variant (S8 Table) according to our stringent criteria.
Because relevant genotypes are lacking in the entire RNA-Seq datasets analyzed (i.e., samples are not DNA/RNA exome sequencing pairs), we next aimed to provide RNA-Seq evidence for either monoallelic or biallelic expression queried with the rs1060180 and rs13230 WRB 3´-UTR SNPs. These SNPs map to both the long and short WRB transcript variants. In the 1,012 unsorted RNA-Seq experiments, we found allele fractions consistent with a biallelic pattern of expression (mean allele fraction ranging from 0.49 to 0.51) for both SNPs in the brain, epidermal keratinocytes, fetal large and small intestine, large airway epithelial cells, ovary, skin and testis samples (S9 Table). In the analysis of the RNA-Seq public databases sorted by tissue, we also observed allele fractions consistent with biallelic expression across the WRB 3´-UTR SNPs in brain, fallopian tube, thyroid, muscle, ovary, skin, and testis samples (S10 Table). Altogether, we observed no suppression or expression bias effects on either allele (i.e., evidence against a maternally suppressed imprinting effect) in ten biosamples (brain, fallopian tube, fetal large and small intestine, large airway epithelial cells, thyroid, muscle, epidermal keratinocytes, ovary, skin, and testis). In contrast, for the known imprinted SNURF gene (tested across the rs705 SNP in the brain, fallopian tube, liver, muscle, ovary, skin, and testis) both alleles were expressed monoallelically (i.e., passed the "flip test" required for genomic imprinting) (S10 Table). Similarly, for the H19 rs217727 and H19 rs10840159 SNPs we observed monoallelic expression of both alleles in the fallopian tube, ovary, and testis (S10 Table).
Because some known DMRs lie up to 2 Mb from the target imprinted gene [24,25], and therefore function as imprinting control regions (ICRs), we extended the query to 162 SNPs (S4 Table) mapping within a 4-Mb chromosomal region centered at the WRB gene. Most SNP substrings yielded < 20 reads and were therefore unsuitable for further analysis according to the inclusion criteria. Importantly, in addition to WRB, we found SNPs with > 80 reads yielding allele fractions that were consistent with the biallelic expression in 15

Common Combinatorial Patterns of Histone Acetylation and Methylation Marks at the Predicted WRB Promoter Regions
We next sought to identify histone marks around the WRB CGI-2 DMR that combine into traceable patterns and might regulate the expression of the WRB transcript variants 1 and 2. We conducted a comparative analysis of DMRs classified as germline DMRs (gDMRs) [46] and associated with patterns of imprinted methylation or expression. We found similarities in the confluence of both the 17-histone modification activation backbone module and the 5-histone modification repressive module at the CGIs mapping to the long (ENST00000333781.8) and short (ENST00000380708.4) transcript variants of the WRB gene and at the gDMRs of the BLCAP, GNAS, IGF2R, GRB10, and RB1 genes. The similarities were most striking with the MEST gDMR (Fig 10 and S1 Fig).

DNA Motifs at the WRB CGI-2 DMR
Based on a computational comparative analysis of genomic reference sequences, we observed that the nucleotide sequence encompassing the CGI-2 is conserved among primates (Fig 11A). We analyzed the region for the occurrence of arrays of cis elements (other than the CpG dinucleotide sites) that may be specific to this DMR or common to known DMRs by searching for sequence motifs as testable predictions of the differential epigenetic status. We found the DNA repeat motif [AGGYGBYSYAGGACT] (Fig 11B). In humans, the motif occurs in a cluster of seven tandem units. The motif repeat units are spaced at regular chromosomal intervals through the WRB DMR (CGI-2), encompassing 388 bp (Fig 11C). We then searched the whole genome reference sequence and were unable to find other loci containing the array that occurs at the WRB CGI-2 DMR. The [AGGYGBYSYAGGACT] motif repeat unit has sequence similarity (i.e., match overlap of thirteen nucleotides in the optimal alignment) to the putative multimer SPDEF_DBD_2 DNA-binding specificity consensus site ([gtggTCCCGGATYAT]) of the transcriptional factor SPDEF (HumanTF 1.0 DNA binding motif library) [49,63]. The number of [AGGYGBYSYAGGACT] DNA motif units has varied since its evolutionary occurrence in marmosets (Fig 11C).
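A search for a degenerate motif such as [AGGYGBYSYAGGACT] can be sketched by expanding the IUPAC ambiguity codes into a regular expression. The snippet below is an illustration only: the function names and the toy sequence are hypothetical, only the forward strand is scanned, and it is not the motif-discovery tool used in this study.

```python
import re
from typing import List

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC motif into an equivalent regex pattern string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif_units(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions of every motif match on the forward strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=" + iupac_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


MOTIF = "AGGYGBYSYAGGACT"

# Toy sequence (not the WRB CGI-2 sequence) holding two motif instances.
toy_sequence = "TTTT" + "AGGCGTCGCAGGACT" + "ACGTACGTAC" + "AGGTGCTCTAGGACT" + "GG"
starts = find_motif_units(toy_sequence, MOTIF)
# Distances between consecutive unit starts, to inspect the regularity of the spacing.
spacings = [b - a for a, b in zip(starts, starts[1:])]
```

Applied to a genomic interval, the list of start positions gives the number of repeat units, and the spacings show whether the units recur at regular intervals.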

Discussion
In this multidisciplinary study, we used complementary experimental and computational approaches to address a challenging biological question. Although prior studies had shown that the extra copy of chromosome 21 differentially affects the DNA methylation levels at distinct CpG dinucleotide sites [21,22], it was unknown whether the effects are dependent (i.e., genomic imprinting) and/or independent of the parental origin of the nondisjoined chromosome 21. We showed these epigenetic effects on three genes, two located on chromosome 21 (WRB and RUNX1) and one located on chromosome 2 (TMEM131).
We established that the supernumerary chromosome 21 altered the methylation patterns at distinct CpG dinucleotide sites in the RUNX1 and TMEM131 genes in a parent-of-origin-independent manner. From a genome-wide perspective, this finding suggests that the epigenetic effect of the extra copy of chromosome 21 does not vary greatly with the parental origin of the supernumerary copy of chromosome 21; however, the extra copy does affect the methylation statuses of genes located on the same and other chromosomes.
We explored the differentially heritable epigenetic methylation imprints at the WRB CGI-2 DMR to develop a simple PCR assay based on the maternally imprinted 5mCpG marks to ascertain the parental origin of chromosome 21 nondisjunctional events in Down syndrome probands. The assay does not require bisulfite conversion and improves on the current linkage analysis approach by not requiring genomic DNA from the progenitors. This epigenetic feature will be an asset when parental samples are unavailable, as in the situation of cryopreserved banked specimens. In addition to the notorious scarcity (incidence of approximately 5%) of the nondisjunctional events of paternal origin, the dependence of their identification on DNA available from the progenitors has precluded studies on phenotype-(epi)genotype correlations for Down syndrome of patrilineal descent. The WRB CGI-2 DMR-based assay will greatly facilitate the identification of Down syndrome cases of paternal origin and the establishment of representative cohorts for studies on the variation in phenotypic outcomes. The assay will also be useful in the prenatal molecular diagnosis of the parental origin, without collecting parental samples, in triploid pregnancies where only the conceptuses with two paternal sets have the potential to cause maternal complications [64].
The initial evidence suggesting a functional link between the maternal-of-origin-specific imprinted methylation at the WRB CGI-2 DMR and the WRB monoallelic expression came from the study by Docherty and collaborators [25]. By sequencing cDNA across the rs1060180 SNP, they observed alternate (i.e., maternal or paternal) monoallelic transcript expression in skeletal muscle and aorta tissues and biallelic transcript expression in the spinal cord from the same embryo. Unfortunately, that study did not conclusively establish whether the paternal allele is unmethylated. Moreover, a distinctive feature of imprinted genes is the preference for the full expression of one (and always the same) of the two parental alleles [65]. Importantly, alternate monoallelic transcript expression occurs only between different genes regulated by the same imprinting control region. Examples include the paternally expressed Peg13 noncoding RNA and the maternally expressed Kcnk9 gene in the mouse brain [66], and the maternally expressed H19 and the paternally expressed Igf2 genes in mouse hematopoietic stem and progenitor cells [67].
We evaluated the candidate imprinting status of the WRB gene. We showed that the maternal heritable epigenetic 5mCpG imprints at the CGI-2 DMR were uncoupled from the predicted monoallelic expression of the paternal WRB allele. We replicated this observation in twelve biosamples (brain, blood, fallopian tube, fetal large and small intestine, hESCs, large airway epithelial cells, thyroid, muscle, epidermal keratinocytes, ovary, skin, and testis). These results are in agreement with recently reported findings from three genome-wide scale analyses of public RNA-Seq data [26,59,68]. These studies reported mean allele fractions showing either non-significant deviation (mean value = 0.504) [26,59] or significant deviation (average value ranging from 0.553 to 0.803) [26,68] for the WRB 3´-UTR rs1060180 and rs13230 SNPs. Furthermore, we provide evidence from 15 tissue repositories of allele fractions that are consistent with the biallelic transcript expression of 10 genes that map within 2 Mb of the WRB gene. Thus, the allele transcript expression of 11 genes in at least one primary human tissue tested is uncoupled from the control of maternally inherited imprints at the WRB DMR.
Steyaert and collaborators [26] used a SNP-guided analytical framework to identify monoallelic DNA methylation events from enrichment-based sequencing data. In the WRB gene, they functionally annotated the locus comprising the SNP rs2299739 with significant monoallelic DNA methylation. Using the rs2244352 SNP, here we demonstrated that the locus undergoes maternal-of-origin-specific differential methylation. We showed that in hESCs the WRB CGI-2 is hypermethylated, in contrast to the differential maternally inherited allele methylation profile observed in blood cells. A hypermethylated state for the WRB CGI-2 DMR occurs in several methylome studies of hESCs [35,41,42]. Significantly, we observed a biallelic transcriptional expression pattern for WRB, despite the hypermethylated state in hESCs.
The hemimethylated status characteristic of the known imprinted DMRs is not a sufficient epigenetic signature to determine the uniparental monoallelic expression of genes. For example, the gDMR in the GRB10 gene, which is isoform-specifically imprinted only in the brain, is hemimethylated in all tissues analyzed [55,69]. We observed a comparable confluence of both the 17-histone modification activation backbone module and the 5-histone modification repressive module between six maternally imprinted genes (MEST, BLCAP, GNAS, IGF2R, GRB10, and RB1) and the WRB gene. Therefore, there is still the possibility of the WRB gene being an isoform- and tissue-specific imprinted gene. Unfortunately, because isoform-specific SNPs are unavailable and the WRB transcript variant 2 is expressed at low levels in RNA-Seq experiments, we could not test that hypothesis with our current approach.
In a recent methylome study in human oocytes, Okae and collaborators [46] used the criteria of DNA methylation levels at the DMRs of known imprinted genes to classify the WRB CGI-2 as a secondary DMR (i.e., rather than a gDMR). At the WRB CGI-2, they observed an average methylation of 17.5% in oocytes (hg19 coordinates chr21:40757510-40758276; 11.5% in oocytes pool 1 PCR and 49.9% in oocyte pool 2) and of 0.45% in sperm, but with 35-65% methylation levels in blood cells. Importantly, a genome-wide scan of the methylomes of oocytes and blastocysts from that study revealed that the methylation profiles are highly comparable, with the methylation levels of maternal germline DMRs in blastocysts being on average half the levels found in oocytes (data not shown). Because the reported level of methylation at the WRB CGI-2 in blastocyst was 37.72% [46], one would expect a methylation level of approximately 75% in oocytes. In fact, the average methylation level at a core region with hg19 coordinates chr21: 40757603-40757721 in oocyte pool 2 is 70.3% (reanalysis of data from S1 Table of ref. [46]). Therefore, we consider the WRB CGI-2 to be a maternal gDMR.
We demonstrated the maintenance of the DNA sequence context of the human WRB CGI-2 DMR in primates. Within the WRB CGI-2 DMR, we identified a cis DNA motif array. In primates, the array consists of 2 to 8 consensus repeat units that bear sequence similarity to the DNA-binding specificity consensus site of the transcription factor SPDEF. We speculate that the cis DNA motif array is a distinctive feature in the establishment and/or maintenance of the hemimethylated state of this particular DMR, rather than being involved in the differential expression of WRB transcript variants or alleles.
WRB is essentially a housekeeping gene. The human WRB protein corresponds to the Get1 protein in yeast. The WRB/Get1 proteins form a conserved family (IPR028945) in eukaryotes. These proteins function as transmembrane receptors for ASNA1/TRC40 (Get3 in yeast)-mediated insertion of tail-anchored (TA) proteins into the endoplasmic reticulum membrane (GO:0071816) [70]. RNA-Seq profiling revealed that WRB transcription occurs in all individual tissue categories investigated, although higher levels (i.e., FPKM/TPM) are displayed in the brain, testis, ovary and kidney [71]. Most tissues displayed moderate to strong nuclear and cytoplasmic positivity by WRB-specific antibody staining [72]. RNAi-mediated knockdown of WRB expression increases the rate of homologous recombination DNA double-strand break repair [73]. Importantly, WRB transcription levels are compensated in trisomic T21 lymphoblastoid cells [74], and the WRB gene lies outside a chromosomal domain dysregulated by the presence of the supernumerary chromosome 21 [20]. Therefore, the biological significance of the WRB gene perhaps being imprinted in an isoform, tissue and/or developmental-stage-specific manner is rather intriguing.

Supporting Information
Shown are the 17-histone modification activation backbone module, and the 5-histone modification repressive module found in human CD4+ T cells [54]. Highlighted in light blue is the DMR in each gene. Composite of screenshots of the dataset viewed at the UCSC Genome Browser hg18 (http://genome.ucsc.edu). (TIF)
S1 Table. List of target loci, primer sequences, and restriction enzymes interrogated for parent-of-origin-dependent and -independent effects on DNA methylation.
Table. List of the 1,012 RNA-Seq public datasets queried to investigate locus and allele-specific expression. The list is a manually curated compilation of tables downloaded from the NCBI Sequence Read Archive (SRA). (XLSX)
S6 Table. List of the RNA-Seq public data experiments configuring the Atlas of primary human tissues queried to replicate the imprinting-dependent and independent patterns of allele-specific expression of target genes. The list is a manually curated compilation of tables downloaded from the NCBI Sequence Read Archive (SRA), Biosample and Bioproject browsers. (XLS)
S7 Table. Ratios of the restriction enzyme-resistant 5mCpG sites in the WRB CGI-2 estimated in control disomic and trisomic study subjects and in human embryonic stem cell lines. (XLSX)
S8 Table. RNA-Seq reads filtered with sequence substrings specific for each exon 1 of either WRB transcript variants 1 (ENST00000333781.8, long) or 2 (ENST00000380708.4, short). (XLS)
S9 Table. Allele expression fractions across the WRB 3´-UTR rs1060180 and rs13230 SNPs found using unsorted 1,012 RNA-Seq public datasets. (XLSX)
S10 Table. Allele expression fractions across the 162 SNPs mapping within the 4-Mb chromosomal region centered at the candidate imprinted WRB gene and across SNPs in the SNURF and H19 known imprinted genes. Shown in the different worksheets are the number of reads for each SNP filtered in the RNA-Seq public datasets, sorted by informative tissue. Worksheet labels correspond to the series "tissue gene SNP" (i.e., Adrenal BACE2 rs11701157), grouped alphabetically, and highlighted in different colors by tissue. (XLS)
S11 Table. Summary of the RNA-seq evidence of biallelic expression of eleven genes mapping to a 4-Mb chromosomal region centered at the WRB gene in fifteen primary human tissues (Fig 9 in the main text is a graphical representation of these data). (XLS)