Interplay between DMD Point Mutations and Splicing Signals in Dystrophinopathy Phenotypes

DMD nonsense and frameshift mutations lead to severe Duchenne muscular dystrophy while in-frame mutations lead to milder Becker muscular dystrophy. Exceptions are found in 10% of cases and the production of alternatively spliced transcripts is considered a key modifier of disease severity. Several exonic mutations have been shown to induce exon-skipping, while splice site mutations result in exon-skipping or activation of cryptic splice sites. However, factors determining the splicing pathway are still unclear. Point mutations provide valuable information regarding the regulation of pre-mRNA splicing and elements defining exon identity in the DMD gene. Here we provide a comprehensive analysis of 98 point mutations related to clinical phenotype and their effect on muscle mRNA and dystrophin expression. Aberrant splicing was found in 27 mutations due to alteration of splice sites or splicing regulatory elements. Bioinformatics analysis was performed to test the ability of the available algorithms to predict consequences on mRNA and to investigate the major factors that determine the splicing pathway in mutations affecting splicing signals. Our findings suggest that the splicing pathway is highly dependent on the interplay between splice site strength and density of regulatory elements.


Introduction
Dystrophinopathies are the most frequent neuromuscular disorders. They are caused by mutations in the DMD gene, one of the largest genes found in humans [1,2]. DMD encodes dystrophin, a key player in the stabilization of the sarcolemma during muscle contraction [3]. Clinical phenotypes include severe Duchenne muscular dystrophy (DMD), milder Becker muscular dystrophy (BMD), intermediate muscular dystrophy (IMD) and pure cardiac X-linked dilated cardiomyopathy (XLCM). DMD is characterized by early-onset, rapidly progressive muscular weakness, leading to wheelchair dependency before age 13 and death during the third decade. BMD is clinically heterogeneous but presents a later onset and slower progression [4].
Clinical severity is determined by the maintenance of the open reading-frame, allowing the expression of semi-functional dystrophin with preserved N-terminal and C-terminal protein-binding domains [5]. Some parts of the central rod-domain can be truncated with minimal impact on protein function [6]. Frameshift and nonsense mutations cause absence of dystrophin expression and a DMD phenotype. In-frame mutations lead to abnormal or reduced dystrophin in muscle, causing BMD. The promising molecular therapy of antisense oligonucleotide (AON)-mediated exon-skipping is based on this feature. Targeting splicing motifs of the pre-mRNA can induce the exclusion of selected exons and restoration of an open reading-frame, theoretically allowing the conversion of the DMD phenotype to the BMD phenotype [7][8][9].
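The reading-frame rule described above can be illustrated with a minimal sketch. It assumes that only the net change in coding length matters, and it ignores premature stop codons introduced by inserted sequence and the loss of essential domains. The function name and the 150 bp example are hypothetical; the 67 bp and 107 bp values correspond to the pseudoexon and retained intron reported in this study.

```python
def preserves_reading_frame(length_change_bp: int) -> bool:
    """Return True when a net gain or loss of coding nucleotides is a multiple of 3.

    Simplified view of the reading-frame rule: a change that is a multiple of 3
    keeps the downstream codons in frame, any other change shifts the frame.
    """
    return length_change_bp % 3 == 0


# Illustrative events: negative values are losses, positive values are gains.
events = {
    "hypothetical skipping of a 150 bp exon": -150,
    "inclusion of a 67 bp pseudoexon": 67,
    "retention of a 107 bp intron": 107,
}

for label, change in events.items():
    outcome = "in-frame" if preserves_reading_frame(change) else "frameshift"
    print(f"{label}: {outcome}")
```

Under this simplification, a skipped exon restores the frame only when the combined length change of the mutation and the skipped sequence is a multiple of three.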
Until recent years, molecular diagnosis was mainly limited to detection of exonic deletions and duplications accounting for 65-70% of all disease-causing mutations [10,11]. Detection of the remaining 25-30% single point mutations or small rearrangements has historically been challenging due to the large size of the DMD gene. Development of high-throughput screening methodologies has allowed routine diagnosis of these mutations [12]. However, the impact of a mutation on pre-mRNA splicing and protein expression is often unknown.
Exceptions to the reading-frame rule are found in approximately 9% of patients and the production of alternatively spliced transcripts is considered a key modifier of the clinical severity [10]. Skipping of the mutated exon has been reported in several nonsense BMD-associated mutations, suggesting a model based on disruption/creation of splicing regulatory elements (SRE) [13][14][15][16][17][18]. However, some findings suggest that SRE alteration is not the only factor determining exon-skipping [19][20][21][22]. Recently, Flanigan and co-workers postulated that exon-skipping occurs in a subset of weakly defined DMD exons [23]. It has also been found that splice site mutations can lead to exon-skipping or activation of cryptic splice sites [24][25][26]. Nevertheless, the main factors determining the final splicing pathway are still unclear.
The precise definition of DMD point mutations and their consequences helps to improve our understanding of the molecular pathology in dystrophinopathies. Due to its particular features and size, DMD is a suitable model gene for the study of the in vivo effects of DNA variants on mRNA and the elements involved in the regulation of the splicing process. Point mutations also provide valuable information regarding critical protein domains for dystrophin function. Herein we report our results concerning the clinical phenotype, dystrophin expression and DMD molecular analysis in 105 dystrophinopathy patients, presenting 98 different point mutations. Muscle mRNA analysis, performed in most patients, identified 27 mutations causing aberrant pre-mRNA splicing. The mechanisms involved in the development of splicing defects included abrogation of natural splice sites, creation of new splice sites, alteration of SREs and pseudoexon activation. Bioinformatics analysis using splice site and SRE predictive matrices was performed to investigate the major factors determining the splicing pathway in splice site mutations and the ability of available algorithms to predict exon-skipping events in exonic mutations.

Patient selection
Dystrophinopathy patients who tested negative for intragenic deletions and duplications were screened for point mutations using genomic DNA or muscle cDNA whole gene sequencing. Male patients were grouped into four phenotypic categories: DMD, BMD, IMD, and XLCM according to clinical presentation, family history, age at onset, progression and age at loss of ambulation (DMD <13, BMD ≥16, IMD ≥13 and <16 years). Females expressing myopathic symptoms were reported as MC (manifesting carriers) while unaffected females were reported as AC (asymptomatic carriers). Patients, or their parents in the case of children, gave written individual informed consent to participate in the study. The study was performed in accordance with the ethical standards laid down in the Declaration of Helsinki and was approved by the Ethics Committee of Hospital de la Santa Creu i Sant Pau (HSCSP), Barcelona.
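The ambulation thresholds used for grouping male patients can be written as a short sketch. It assumes cut-offs of under 13 years for DMD, 13 to under 16 years for IMD, and 16 years or later for BMD, and it deliberately ignores the other criteria listed above (clinical presentation, family history, age at onset and progression). The function name is hypothetical, and XLCM and carrier categories are not covered.

```python
def classify_by_loss_of_ambulation(age_years: float) -> str:
    """Assign a phenotype label from age at loss of ambulation only.

    Assumed thresholds: DMD < 13, IMD >= 13 and < 16, BMD >= 16 years.
    This is one criterion among several and not a diagnostic rule.
    """
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years < 13:
        return "DMD"
    if age_years < 16:
        return "IMD"
    return "BMD"


for age in (10, 13, 15.5, 16, 30):
    print(age, classify_by_loss_of_ambulation(age))
```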

Muscle biopsy analysis
A muscle biopsy was taken in 89 out of 105 cases. Muscle sections were analyzed using standard histological and immunohistochemical (IHC) techniques, described elsewhere. Dystrophin IHC was performed using monoclonal antibodies against N-terminal (DYS3), rod-domain (DYS1) and C-terminal (DYS2) epitopes (Novocastra, Newcastle upon Tyne, UK). IHC analysis of other sarcolemmal proteins, such as α, β, γ and δ sarcoglycans, caveolin-3, dysferlin, utrophin and emerin, was also performed.

Mutation detection
DNA was extracted from peripheral blood samples according to standard procedures. Prior to point mutation screening, DNA was tested for intragenic deletions and duplications by MLPA (multiple ligation-dependent probe amplification) (P034 and P035 Salsa Kit, MRC-Holland). Point mutation detection was performed on genomic DNA by direct sequencing of the 79 DMD exons and their flanking intronic sequences using SCAIP (single-condition amplification/internal primer) [12]. When muscle tissue was available, mutation analysis was first performed by cDNA sequencing and further confirmed on genomic DNA. Total mRNA was extracted and purified from approximately 30 mg of muscle using the RNeasy Fibrous Tissue Mini Kit (Qiagen, Hilden, Germany) and subsequently retrotranscribed to cDNA by RT-PCR using polythymine primers (Invitrogen, Carlsbad, CA). Complete DMD cDNA was amplified and sequenced in twenty overlapping fragments using published [25] and self-designed primers. Sequencing analyses were performed using Big Dye 3.1 chemistry and ABI 3500xL equipment (Applied Biosystems, Foster City, CA). Nucleotide positions were determined according to the standard DMD reference sequence (GenBank accession number NM_004006.2), and mutation nomenclature follows the guidelines of the Human Genome Variation Society. In order to make data publicly available, mutations and associated phenotypic information were submitted to the Leiden Open Variant Database (LOVD, www.dmd.nl), Leiden, the Netherlands.

Statistical analysis
The association between alteration of SRE motifs and milder BMD phenotype in truncating mutations was analyzed using Fisher's exact test for each SRE predictive matrix (Table S1). A permutation test was applied to significant SRE matrices to adjust for multiple comparisons. Differences in relative 5′ splice site strength and density of SRE motifs between exons exhibiting cryptic site activation and exon-skipping in 5′ ss mutations were analyzed using a paired t test.

Analysis of splicing pathways
Mutations predicted to affect splice sites or SRE motifs were analyzed by semi-quantitative QF-PCR in muscle cDNA. For each mutation, specific fluorescently labelled primer pairs encompassing the mutated exon were designed (Table S2). PCR products were analyzed by capillary electrophoresis using ABI 3500xL equipment and Genemapper software (Applied Biosystems, Foster City, CA). Splicing outcomes were determined by comparing fragment length with the position of potential cryptic splice sites and exon length. Peak area was used to calculate the relative ratio of each transcript population. Samples were run in duplicate together with a normal control.
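The peak-area calculation can be sketched as follows. This is a minimal illustration that assumes the relative ratio of a transcript is its peak area divided by the summed area of all detected transcript peaks; the source does not state the exact formula, and the function name, transcript labels and peak areas are hypothetical.

```python
from typing import Dict


def relative_ratios(peak_areas: Dict[str, float]) -> Dict[str, float]:
    """Convert capillary electrophoresis peak areas into fractions of the total.

    Assumes each transcript species contributes one peak and that all areas
    are non-negative; raises ValueError when no signal is present.
    """
    if any(area < 0 for area in peak_areas.values()):
        raise ValueError("peak areas must be non-negative")
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: area / total for name, area in peak_areas.items()}


# Hypothetical peak areas for one patient sample
sample = {"normal transcript": 300.0, "exon-skipped transcript": 100.0}
for name, fraction in relative_ratios(sample).items():
    print(f"{name}: {fraction:.0%}")
```

Because shorter amplicons can amplify more efficiently, ratios obtained this way are best read as semi-quantitative, in keeping with the method described above.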

Results
We identified 98 different point mutations in 105 unrelated dystrophinopathy patients, 99 males and 6 female carriers. Table 1 shows the identified mutations and associated muscular phenotypes together with results regarding muscle dystrophin immunostaining and mRNA analysis. Representative images of dystrophin immunostaining are shown in Figure 1. We identified 54 nonsense mutations, 15 small deletions, 11 insertion/duplications, 20 splice site mutations, 4 missense mutations, and one deep intronic mutation. Aberrant splicing was found in 27 mutations through muscle cDNA and/or in silico analyses. Mechanisms involved in splicing defects included abrogation of 17 natural splice sites, two splice site creations, seven SRE alterations and one pseudoexon activation. In the male patients, 75 had DMD, 15 BMD, 8 IMD and one XLCM. In the female carriers, one was asymptomatic (AC) while five manifested myopathic symptoms (MC). The manifesting carriers reported here were included in a previous work concerning clinical outcomes and X-chromosome inactivation [42].
The frequency of each type of mutation differed substantially between clinical phenotypes. Most DMD patients presented nonsense and frameshift truncating mutations, which accounted for 84% of cases (63/75). Splice site mutations were found in 14.6% of cases (11/75) while missense and in-frame changes were found in 2.7% (2/75). BMD patients presented the same proportion of truncating mutations as splicing defects, 46.7% (7/15) each. Splicing defects in BMD include a pseudoexon activation caused by a deep intronic substitution. Only one missense mutation was detected in BMD patients (6.7%). IMD patients presented three truncating mutations (37.5%), three splice site mutations (37.5%) and two missense mutations (25%).
In silico analysis of SRE motifs showed five ESE disruptions and one ESS creation in BMD mutations. Disruption of at least one PESE octamer occurred in mutations c.883C>T (exon 9), c.3850G>T (exon 28), c.5287C>T (exon 37), c.10235del (exon 71) and c.10231_10235del (exon 71). Mutation c.10409dup (exon 74) was predicted to create an ESS according to PESS and Fas-ESS matrices. Five BMD mutations were predicted to create an intronic identity element (IIE). IMD mutation c.3982C>T (exon 29) was predicted to disrupt three SR-protein binding sites and to create an ESS according to hnRNP A1 and Sironi's matrices (Table S1). To investigate the ability of available matrices to predict critical regions for exon recognition, and to assess the association of SRE alterations with exon-skipping in BMD patients, all truncating mutations located in in-frame exons were tested against different matrices (Table S1). In male patients, 32 nonsense/frameshift mutations were identified in in-frame skippable exons but only six were associated with a milder BMD phenotype (18.75%). Statistically significant differences between DMD/IMD and BMD mutation groups were found using PESE and IIE matrices. Other matrices did not show any significant difference. Predicted disruption of PESE octamers occurred in 5 out of 6 BMD mutations (83.3%) and in 6 out of 26 DMD/IMD mutations (23%) (Fisher's exact test, P-value 0.0112). Seven mutations associated with DMD/IMD were found in exons where exon-skipping events have been previously described (exons 25, 29, 37, 38, and 40). None of them was predicted to disrupt any PESE octamer. Creation of IIE hexamers occurred in 5 of 6 BMD mutations (83.3%) and in 7 of 26 DMD/IMD mutations (26.9%) (Fisher's exact test, P-value 0.0185). A permutation test corroborated the significant result between the PESE and IIE matrices (truncated P-value product 0.00659).
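The two Fisher's exact tests reported above can be reproduced from the stated counts with a short sketch. It assumes a two-sided test that sums the probabilities of all 2x2 tables no more likely than the observed one; the source does not state which variant or software was used, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small tolerance so that ties with the observed table are included
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: BMD, DMD/IMD. Columns: motif altered, motif not altered.
p_pese = fisher_exact_two_sided(5, 1, 6, 20)  # PESE disruption: 5/6 vs 6/26
p_iie = fisher_exact_two_sided(5, 1, 7, 19)   # IIE creation: 5/6 vs 7/26
print(f"PESE: P = {p_pese:.4f}")  # 0.0112
print(f"IIE:  P = {p_iie:.4f}")   # 0.0185
```

Both values agree with the P-values given in the text.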

Splice site mutations
We identified twenty different splice site mutations in twenty-one unrelated patients. Splicing pathways were determined in fourteen mutations through muscle cDNA sequencing and QF-PCR analysis. Detected transcript species, relative ratio and splice site predictions are summarized in Table 2. Most mutations involved canonical AG/GT nucleotides disrupting natural splice sites (13/20, 65%). Four involved non-canonical nucleotides (20%), of which one disrupted the splice site while the other three reduced its efficiency. Creation of a new splice site was found in four mutations. Two of them also disrupted a natural site (c.265-1G>A and c.6913-1G>A), while the other two (c.1332-9A>G and c.5444A>G) created a strong splice site more efficient than the natural site. An intronic single-base substitution far from a natural splice site (647 bp) provoked the activation of a cryptic 5′ ss, causing the inclusion of a 67 bp pseudoexon into the mature mRNA. Mutations affecting natural 5′ ss were more frequent (11/21, 52.4%) than those affecting natural 3′ ss (6/21, 28.6%).
The splicing pathway differed from one mutation to another. In most cases (11/14) variable levels of more than one alternative transcript were detected. Splicing outcomes included exon-skipping, cryptic or new splice site activation, intron retention and pseudoexon inclusion. Six mutations induced exclusively or mainly exon-skipping, seven induced activation of alternative splice sites and one mutation induced predominantly normal splicing (c.3603+2dupT). In most cases, the clinical phenotype and expression of dystrophin correlated with the absence/presence of significant amounts of normal and/or in-frame mRNA transcripts. This correlation was not observed in two cases presenting significant amounts of in-frame transcripts and a severe phenotype (patients #2042 and #1455). In both cases, the most abundant in-frame transcripts presented truncated protein-binding domains, the actin-binding (ABD) and zinc-finger (ZZ) domains respectively. Loss of coding sequences may also have an impact on protein folding or stability. The mutations that abolished the function of natural sites presented a score reduction between 7% and 86% using HSF, and between 64% and 801% using MaxEnt (Table 2). The mutations that reduced the site efficiency presented a score reduction between 7% and 15% using HSF, and between 41% and 81% using MaxEnt. Most of the activated cryptic or new sites were predicted by HSF or MaxEnt matrices (Table 2). However, two transcripts presented activation of GC 5′ ss (#1959 and #1619) that were only predicted by the SpliceSiteFinder algorithm. Two transcripts did not correlate with any potential 5′ ss (patients #338 and #1619).
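A score reduction of this kind can be computed as a relative change between the wild-type and mutant splice site scores. The sketch below assumes that definition, which the text does not spell out, and uses hypothetical scores rather than values from Table 2. It also shows why a reduction can exceed 100% on a scale that allows negative scores, as log-odds scales such as MaxEnt do.

```python
def score_reduction_pct(wild_type: float, mutant: float) -> float:
    """Percentage drop of a splice site score relative to the wild-type score.

    Assumed definition: (wild_type - mutant) / wild_type * 100.
    Requires a positive wild-type score so the ratio is meaningful.
    """
    if wild_type <= 0:
        raise ValueError("wild-type score must be positive")
    return (wild_type - mutant) / wild_type * 100.0


# Hypothetical scores, for illustration only
print(score_reduction_pct(8.0, 2.0))   # mutant keeps a positive score: 75.0
print(score_reduction_pct(2.0, -4.0))  # mutant score turns negative: 300.0
```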
To investigate the major factors determining the main alternative splicing pathway in 5′ ss mutations (cryptic site activation versus exon-skipping), we analyzed several parameters: exon and intron length, density of SRE motifs, availability of cryptic sites and relative 3′ ss strength. The analysis was extended to eight exons with previously reported pathways in 5′ ss disrupting mutations [26]. We found that exons exhibiting mainly exon-skipping presented a weak 3′ ss, while most exons showing predominantly cryptic site activation presented a strong 3′ ss (Figure 3A, paired t test, P-value 0.0346). No statistically significant differences were found in other parameters. However, exons exhibiting cryptic site activation presented a higher mean density of ESE motifs than exons showing exon-skipping (Figure 3B).
Figure 1. [...] shown. BMD patient #1665 shows dystrophin reduction. This patient presented a 3′ ss disrupting mutation causing mainly exon 3 in-frame skipping. Patient #1973 presents the rare combination of DMD phenotype and reduction of dystrophin expression. In this patient, a missense mutation in CH1 of the ABD1 domain may cause impaired actin-binding activity. DMD patient #1775 carrying a nonsense mutation in exon 26 shows absence of dystrophin (an isolated revertant fibre can be observed in DYS2). In contrast, patient #1472 carrying a nonsense mutation in exon 28 shows reduced dystrophin expression and milder BMD phenotype. mRNA analysis in this patient revealed in-frame exon-skipping due to the disruption of an ESE motif. In the last row, BMD patient #1497 shows a very mild reduction of dystrophin expression. This patient presented a missense mutation in the ZZ domain that may compromise β-dystroglycan binding. doi:10.1371/journal.pone.0059916.g001
Figure 2. Exonic mutations associated with exon-skipping events. On the left, semi-quantification of alternative transcripts by QF-PCR on muscle biopsy cDNA. In the centre, schematic representation of the detected transcript species and their relative ratio. On the right, mutation [...]

Missense and in-frame mutations
Four missense mutations and one amino acid deletion were detected in our cohort. All were located in highly conserved residues and were predicted to be pathogenic based on Polyphen-2 and SIFT algorithms. These mutations were not found in 100 healthy controls and were not reported in the Exome Variant Server (http://evs.gs.washington.edu/EVS/). Two mutations were found in the N-terminal ABD. In the CH1 domain (calponin homology ABD domain), mutation p.Leu53Arg was found in a DMD patient with irregularly reduced expression of dystrophin (Figure 1). In the CH2 domain, mutation p.Gly166Val was found in a patient with IMD. In the central rod-domain at the spectrin-like repeat 2, a double amino acid change (p.Met450Ile_Asp451Tyr) was detected in an IMD patient presenting reduction of dystrophin with negative fibres. Two mutations were found in the C-terminal cysteine-rich region. Mutation p.Pro3320Ser in the ZZ-domain was found in a BMD patient with near normal dystrophin expression (Figure 1), and a single amino acid deletion, p.Glu3367del, was found in a DMD patient.

Discussion
We describe a comprehensive analysis of 98 DMD point mutations related to clinical phenotype and their effect on muscle mRNA and dystrophin expression. Aberrant splicing was found in 27 mutations. Mechanisms responsible for the splicing defects consisted of abrogation of natural splice sites, creation of new splice sites, disruption/creation of regulatory elements (SRE) and pseudoexon activation. Bioinformatics analysis of nonsense/frameshift mutations revealed that the PESE/PESS matrix is a powerful tool to predict critical regulatory regions for BMD-associated exon-skipping. Our findings suggest that the splicing pathway in 5′ ss disrupting mutations is highly dependent on the interplay between 3′ ss strength and density of exonic splicing enhancers.
In agreement with the reading frame rule, most nonsense and frameshift mutations in our cohort were found in patients presenting a severe DMD or IMD phenotype. However, 11% of them were detected in patients presenting milder BMD or XLCM phenotypes. Several mechanisms have been associated with the production of dystrophin in nonsense/frameshift mutations, ameliorating the clinical phenotype. These include alternative translation initiation in 5′ end mutations [43,44], escape from nonsense-mediated mRNA decay (NMD) in mutations located in or beyond exon 74 [45] and somatic mosaicism [25,46,47]. However, the most frequently reported mechanism is the skipping of the mutated exon, producing significant amounts of in-frame transcripts. Mechanisms involved in exon-skipping events include disruption and creation of SRE. Although creation of ESS has been reported [18], disruption of ESE is better documented [13][14][15][16][17]25,48]. The most widely used algorithms to predict ESE disruption are ESE Finder matrices for SR protein binding sites, and Rescue-ESE hexamers, which are differentially present in exons and introns. However, the ability of these tools to predict exon-skipping events in the DMD gene is limited. Analysis of the mutation entries in the Leiden database (LOVD) revealed that ESE disruption occurred in 50% of BMD nonsense mutations [10]. Deburgrave et al. reported similar results, since 4 out of 8 mutations with confirmed mRNA exon-skipping had consequences on ESE motifs [25].
In our subset of patients, we found a nonsense mutation in somatic mosaicism in a patient who presented DCM but no muscle weakness. Clinical, pathological and molecular studies in this patient are discussed in greater detail in a previous work [49]. Exon-skipping events were found or predicted in seven BMD patients. We found that the 8-mer putative splicing enhancers (PESE) and silencers (PESS) from Zhang and Chasin [34] are a powerful tool to predict critical SRE motifs for exon recognition in the DMD gene. PESE disruption was predicted in six patients presenting five different mutations. Disruption of ESE motifs was predicted in only one mutation when ESE finder or Rescue-ESE matrices were used (Figure 2 and Table S1). Surprisingly, disruption of PESE motifs in BMD overlapped in most cases with creation of an intronic identity element (IIE) [35], raising the possibility that the mutations had a double effect, contributing to loss of exon identity. An ESS creation was predicted by PESS and other matrices in a nonsense mutation in exon 74. However, in the absence of cDNA studies we cannot confirm an exon-skipping event, since mutations in this exon have been found to cause either exon-skipping or escape from NMD [14,25]. None of the IMD/DMD-associated mutations located in exons where exon-skipping events have previously been reported were predicted to disrupt any PESE octamer. However, one of these mutations, located in exon 29, induced exon-skipping. Nevertheless, the proportion of exon-skipping transcripts in the patient was much lower than that found in a BMD patient presenting an identical skipping pattern (Figure 2), indicating that they are insufficient to rescue the phenotype. According to LOVD this mutation has been previously found in BMD patients, suggesting differences in the exon-skipping efficiency between individuals. In line with this hypothesis, Ginjaar et al. reported a BMD family with a nonsense mutation in exon 29 who presented variable phenotype severity, ranging from severe BMD to asymptomatic elevation of CK levels [17]. The authors reported that clinical variability was related to different levels of exon 29 skipping.
In line with previous reports [19][20][21][22], our data indicate that SRE disruption/creation is not the only factor determining exon-skipping, since 6 out of 11 mutations disrupting PESE octamers in in-frame exons were found in IMD/DMD patients (Table S1). In a recent work, Flanigan et al. (2011) reported that exon-skipping occurs in a subset of exons, proposing a model in which a weak exon definition context, defined by a weak 3′ ss and low ESE density, is necessary for mutation-associated exon-skipping [23]. In our cohort, we identified BMD nonsense/frameshift mutations in exons 9, 28, 37, 71 and 74. To our knowledge, this is the first report of exon-skipping events in exons 9 and 28. According to the model of Flanigan et al., exons 37 and 71 present weak 3′ ss and a low ESE density, while exon 9 exhibits the lowest ESE density in our subset of exons (Figure 4) [50], and reinforce the idea that the overall pre-mRNA architecture might be involved in the splicing process [51].
We observed that most splice site mutations induce variable levels of multiple alternatively spliced transcripts. Probably for this reason, splice site mutations are more frequent in BMD than in DMD. The splicing pathway differs substantially from one mutation to another, with the main outcomes consisting of exon-skipping or activation of alternative sites. Intron retention was found in only one case and involved the smallest DMD intron (107 bp), in line with previous findings [52]. Several mutations induce more than one pathway at the same time. For this reason, predicting how these mutations will affect splicing patterns without mRNA studies is challenging. Algorithms such as HSF and MaxEnt are useful tools to predict abrogation or reduction of splice site function, creation of new sites and presence of cryptic sites. Changes located outside the canonical AG or GT nucleotides, which slightly reduce the site strength, are expected to induce significant amounts of normally spliced transcripts. However, this cannot be generalized to all mutations. While mutations c.3603+2dupT and c.9563+5G>C induced significant amounts of normally spliced transcripts, c.3432+3A>T and c.10086+5G>C induced mainly aberrant transcripts (Table 2). These findings and the variety of observed splicing outcomes indicate that factors other than splice sites influence the final pattern. Multiple factors have been suggested to determine the splicing pathway, including the sequence context of the affected splice site, exon and intron length, RNA secondary structures, and conservation of the reading frame [51,53,54]. The abundance of cryptic splice sites has been suggested as a main factor determining whether a mutation induces exon-skipping or cryptic splice site activation [55]. Confirming a previous work [56], our data indicate that the availability of cryptic splice sites does not determine the main splicing pathway, since numerous potential sites are found in most analyzed exons according to HSF and MaxEnt predictions. Habara et al. proposed that in +1G>A mutations a strong exon recognition, resulting from the combination of a high 3′ ss score and a long exon length, is necessary for cryptic site activation [56]. Our data indicate that the splicing pathway in 5′ ss mutations is determined by the interplay between the relative strength of the 3′ ss and the density of ESE elements (Figure 3). Cryptic site activation occurs in those exons that present a strong 3′ ss compared with the next distal exon, while weak 3′ ss lead to exon-skipping. However, two exceptions are found in our subset of exons. Exon 26, presenting a weak 3′ ss, showed cryptic site activation, while exon 69, presenting a moderately strong 3′ ss, showed exon-skipping (Figure 3C). We hypothesize that a high density of ESE motifs may compensate for a weak 3′ ss, leading to activation of alternative splice sites. On the other hand, a low ESE density in a moderately strong 3′ ss context may contribute to exon-skipping.
Precise identification of DMD mutations and their consequences on mRNA and protein expression is essential to provide accurate genetic counseling in dystrophinopathy families and to include patients in mutation suppression therapies. Our results support and extend previous findings showing that 3′ ss strength and density of regulatory elements are determinant factors of the splicing pathway in mutations affecting splicing signals. However, other factors such as the genomic context may also play a relevant role, suggesting a more complex model. Understanding the splicing code and developing computational splicing models will be of great value to predict pathological effects of DNA variants in molecular diagnosis of dystrophinopathy and other diseases, and to design more efficient molecules for splicing modulation therapies.

Supporting Information
Table S1 Nonsense and frameshift mutations located in in-frame exons were analyzed using different matrices to predict creation or disruption of splicing regulatory elements. For ESE matrices, 1 represents disruption of at least one ESE motif while 0 represents no disruption; for ESS matrices, 1 represents creation of an ESS motif while 0 represents no creation. (DOCX)
Table S2 Primer pairs used to perform semi-quantitative QF-PCR in muscle cDNA are listed. Primer name indicates its exonic location in the cDNA. For each amplicon, length of normal transcript and analyzed mutations are indicated. (DOC)