Figure 1.
Repeat expandability correlates with inter-locus polyQ toxicity.
(A) The graph shows the exponential decay regression lines fitted to the age-at-onset and inherited repeat length distributions in the polyQ disorders: Huntington disease (HD; dashed line), spinal and bulbar muscular atrophy, X-linked (SMAX1), dentatorubral-pallidoluysian atrophy (DRPLA), Machado-Joseph disease (MJD), and spinocerebellar ataxia 1 (SCA1), 2 (SCA2) and 7 (SCA7). The inter-locus polyQ toxicities were derived from the parameters of the regression line of each disorder for the modal age-at-onset of 32 years (dashed lines). (B) Plot of ranked expandability against ranked inter-locus polyQ toxicity at the modal age-at-onset (32 years) with the regression line (one-tailed Spearman's rank; rho = 0.75; P = 0.03; N = 7).
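For readers unfamiliar with the statistic reported in panel (B), the following is a minimal sketch of Spearman's rank correlation, computed as the Pearson correlation of the ranks with tied values given their average rank. The function names (`average_ranks`, `pearson_r`, `spearman_rho`) are illustrative assumptions and are not taken from the original analysis; no P value is computed here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))
```

With seven loci, `x` and `y` would each hold seven values (for example, expandability and toxicity per locus); the sketch makes no assumption about how those values were obtained.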
Table 1.
Inter-locus polyQ toxicity and expandability of the dynamic DNA polyQ loci.
Figure 2.
Intergenerational instability is predictive of somatic instability.
(A) Repeat-length normalised levels of somatic mosaicism in the brains of SCA1 and MJD patients are similar to the levels of germ line instability observed in these disorders. Data were obtained from meta-analysis of a published study of somatic mosaicism in the cerebral cortex (NMJD = 11, NSCA1 = 7) and white matter (NMJD = 9, NSCA1 = 6) of SCA1 and MJD individuals (Table S1) (Maciel et al., 1997). (B) Repeat-length normalised levels of somatic mosaicism in buccal cells of HD and SCA7 patients are similar to the levels of germ line instability observed in these disorders. Data were obtained from meta-analysis of published studies of somatic mosaicism in the buccal cells of HD (N = 12) [8] and SCA7 (N = 1) [34] individuals (Table S2).
Figure 3.
Repeat expandability correlates with flanking genomic DNA sequence GC content.
(A) polyQ-encoding CAG-repeat expandability correlates with proximal, but not distal, flanking genomic DNA sequence GC content. Distance from the repeat (red vertical line) is plotted on a log scale against Spearman's coefficient of correlation (rho) with expandability [31]. The dashed line shows the threshold for statistical significance (P<0.05; two-tailed). (B) The graph shows the coefficient of correlation of flanking genomic DNA GC content of the seven dynamic DNA CAG polyQ-encoding loci with repeat expandability. Spearman's rank coefficient of correlation (rho) was calculated to a distance of 2,000 bp both 5′ and 3′ of each repeat using a sliding window of 100 bp and step size of 10 bp. The dashed line shows the threshold for statistical significance (P<0.05; two-tailed). The position of the CAG•CTG repeat is represented by the vertical red bar.
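The sliding-window sampling described in panel (B) can be sketched as follows. This is an illustrative outline only, assuming a plain nucleotide string as input; the function names `gc_fraction` and `sliding_gc` are hypothetical, and the treatment of ambiguous bases (counted in the window length but not as G or C) is an assumption rather than a detail given in the legend.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (case-insensitive)."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=100, step=10):
    """GC fraction of every full window along seq.

    Returns a list of (start_offset, gc_fraction) pairs; windows that
    would run past the end of the sequence are not emitted.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Applied to each locus with the defaults (`window=100`, `step=10`) on the 2,000 bp flanking the repeat on either side, this would give one GC value per window position per locus, which could then be correlated across the seven loci with expandability.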
Table 2.
Correlation of flanking genomic DNA GC content with repeat expandability of the polyQ loci.
Figure 4.
Inter-locus polyQ toxicity correlates with genomic DNA flanking sequence GC content.
The graph shows the regression analysis between inter-locus polyQ toxicity and the GC content of the genomic DNA flanking sequences at a distance of 100 bp (r = −0.87; P = 0.01; N = 7).
Figure 5.
Inter-locus polyQ toxicity correlates with the flanking genomic DNA sequence GC content, but the correlation does not extend beyond the repeat-containing exon in the mRNA sequence.
(A) Inter-locus polyQ toxicity correlates with the flanking genomic DNA sequence GC content. The graph shows the coefficient of correlation (r) for the relationship between inter-locus polyQ toxicity and flanking genomic DNA sequence GC content. GC content was sampled using a sliding window of 100 bp and a step size of 10 bp. The threshold for statistical significance (dashed lines) and the position of the CAG•CTG repeat (red vertical bar) are also shown. Note that the region of statistically significant correlation extends for ∼400 bp either side of the repeat tract (as indicated by the vertical dotted lines). (B) Gene structure of the seven polyQ-containing genes. All diagrams are to scale. Exons (white box), introns (grey box), intergenic regions (horizontal black bar), and repeat tract (vertical black bar) are shown. (C) Inter-locus polyQ toxicity only correlates with flanking mRNA sequence GC content to the 5′ and 3′ ends of their host exons. The graph shows the coefficient of correlation (r) for the relationship between inter-locus polyQ toxicity and flanking mRNA sequence GC content determined as in (A). Note that the region of statistically significant correlation extends for only ∼100 bp either side of the repeat tract (as indicated by the vertical dashed lines), corresponding to the length of mRNA sequence encoded by the repeat-containing exons and not extending into flanking exons.
Figure 6.
Steady-state transcript levels in human brain do not correlate with inter-locus toxicity or flanking DNA GC content.
(A) Correlation (Pearson, r) between inter-locus toxicity and polyQ gene steady-state transcript levels in whole brain (r = 0.33, P = 0.47; yellow diamond) or cerebellum (r = 0.37, P = 0.31; red diamond). (B) Correlation (Pearson, r) between 500 bp flanking DNA GC (%) content and polyQ gene steady-state transcript levels in whole brain (r = 0.07, P = 0.89) or cerebellum (r = 0.34, P = 0.46). Similarly, no significant correlation was observed between polyQ gene steady-state transcript levels and 100 bp flanking DNA GC (%) content (brain, r = −0.07, P = 0.89; cerebellum, r = −0.09, P = 0.85) or 2000 bp flanking DNA GC (%) content (brain, r = 0.37, P = 0.41; cerebellum, r = 0.34, P = 0.46). Steady-state transcript level values are averages of values from multiple independent samples of normal human whole brain (N = 2, yellow diamond) and cerebellum (N = 6, red diamond). The least squares linear regression lines are shown for whole brain (solid) and cerebellum (dashed). Steady-state transcript levels were calculated as ‘reads per kilobase of exon model per million mapped reads’ (RPKM) [59]. RPKM values are shown in log10 scale for.
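As a reminder of how the RPKM unit is defined, the sketch below computes reads per kilobase of exon model per million mapped reads and its log10 value. The function and parameter names are illustrative assumptions, and the numbers in the usage note are made up for the arithmetic only; they are not data from this study.

```python
import math


def rpkm(gene_reads, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads.

    gene_reads:          reads mapped to the gene's exon model
    exon_length_bp:      summed exon length of the model, in bp
    total_mapped_reads:  all mapped reads in the sample
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("lengths and read totals must be positive")
    kilobases = exon_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return gene_reads / (kilobases * millions)


def log10_rpkm(value):
    """log10 of an RPKM value; undefined for zero or negative input."""
    if value <= 0:
        raise ValueError("log10 is undefined for non-positive RPKM")
    return math.log10(value)
```

For instance, a hypothetical gene with 1,000 reads over a 2,000 bp exon model in a sample of 10 million mapped reads would have an RPKM of 1000 / (2 × 10) = 50.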