Alternative Splicing at NAGNAG Acceptors: Simply Noise or Noise and More?

Alternative splicing at pairs of acceptors in close proximity is one frequent cause of transcriptome complexity. In particular, acceptors with the pattern NAGNAG are widespread in several genomes [1–3]. When affecting the coding regions, alternative splicing at NAGNAGs mainly results in the insertion/deletion of one amino acid. While such subtle events are undoubtedly frequent, an important question arises: do they have functional consequences or are they simply noise tolerated by cells?
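To make the motif concrete, the following minimal sketch scans a nucleotide string for NAGNAG tandems. The function name `find_nagnag` and the example sequence are illustrative assumptions, not part of the cited analyses; real acceptor annotation would also need intron/exon context, which this sketch ignores.

```python
import re

# A lookahead is used so that overlapping tandems (e.g. CAGCAGCAG) are all reported.
NAGNAG = re.compile(r"(?=([ACGT]AG[ACGT]AG))")

def find_nagnag(seq):
    """Return a list of (0-based start, motif) for every NAGNAG in seq.

    Hypothetical helper for illustration only: it matches the bare
    sequence pattern and knows nothing about splice-site annotation.
    """
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in NAGNAG.finditer(seq)]

# Toy sequence (assumed, not taken from any genome).
hits = find_nagnag("tttcagcaggt")
```

Because the two AG dinucleotides lie 3 nt apart, the two resulting transcripts differ by 3 nt, which is why the reading frame is preserved and the typical outcome in coding regions is a single amino acid insertion/deletion.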
 
Zavolan and colleagues [3,4] suggest that these variations are the result of stochastic binding of the spliceosome at neighboring splice sites and do not discuss known functional implications. We previously found indications against a general noise assumption for NAGNAG splice events [1]: biases towards intron phase 1 and single amino acid insertions/deletions, correlation of amino acid variation and the peptide environment, enrichment of polar residues at NAGNAG exon–exon junctions, preference for protein–protein interactions and particular Pfam domains, human–mouse conservation of the intronic AG, and tissue-specific splicing at several NAGNAG acceptors. These findings indicate negative selection against NAGNAG-derived variability deleterious for certain protein regions, which agrees with the underrepresentation of NAGNAGs in coding regions detected by Zavolan and colleagues [4]. This does not rule out that variability may be advantageous for other proteins, but signs of positive selection are much harder to detect and remain to be shown. 
 
Zavolan's finding that confirmed NAGNAGs (current mRNAs/expressed sequence tags do show alternative splicing) are not better conserved between human and mouse than unconfirmed ones may argue against functional implications. However, this result is probably biased by the unconfirmed dataset, which consists of ~60% NAGGAG whose GAG is part of the conserved exon. To avoid such a bias, we split confirmed NAGNAGs into those in which the “extra” AG is either intronic or exonic, according to the transcript annotation [1]. Interestingly, intronic but not exonic extra AGs are significantly conserved. Meanwhile, Akerman and Mandel-Gutfreund found a high conservation of the intronic flanking regions [5], typical for biologically meaningful alternative splicing [6].
 
The finding of Zavolan and colleagues that relative acceptor strength is predictive for confirmed and unconfirmed NAGNAGs refers to an accepted fact of splicing (for example, alternative exons have weaker splice sites than constitutive ones [7]). In tandems, the splice-site strength often determines the preferred acceptor, consistent with our earlier results (see Supplementary Notes in [1]). Thus, we agree that thermodynamic fluctuation plays an essential role during splice-site recognition at NAGNAG acceptors. This is in line with the finding that a single mutation is sufficient to convert a normal acceptor into a NAGNAG tandem, enabling alternative splicing [8]. However, this useful model is not valid for all NAGNAGs. In particular, tissue-specific regulation of alternative NAGNAG splicing challenges this model [1,9]. Overrepresented sequence motifs found in the vicinity of confirmed NAGNAGs are likely to contribute to this regulation [5]. 
 
Moreover, some protein isoforms derived by alternative splicing at NAGNAG acceptors are known to be functionally different: IGF1R, signaling [10]; DRPLA, cellular localization [9]; mouse Pax3, DNA binding [11]; and Arabidopsis thaliana U11-35K, protein binding [12]. Alternative NAGNAG splicing in the untranslated region of mouse Ggt1 affects the translational efficiency [13]. Furthermore, a NAGNAG mutation in ABCA4 is relevant for Stargardt disease 1 [14]. For clarity, we did not claim that all alternative splice events at NAGNAGs serve as a protein “fine-tuning” mechanism [1,8] (as misinterpreted by [4]). In our opinion, like genetic variants, splice variants may be neutral or result in phenotypic differences. Thus, they represent just another playground of molecular evolution [15,16]. The few currently evident cases of biologically different NAGNAG-derived isoforms may represent just the tip of an iceberg.
 
Finally, in the context of the problem discussed here, it has to be considered that noise is important for many biological processes [17], leading to the model of “cultivated noise” [18]. For example, splicing noise at the Drosophila Dscam gene is used for cell individualization [19]. Although it has yet to be proven, it is tempting to speculate that noise arising by splicing at NAGNAG acceptors provides another “cultivated” stochastic mechanism. 
 
In conclusion, it remains unknown what fraction of the more than 1,900 currently confirmed human NAGNAGs play a role in biological functions. To facilitate further experimental and bioinformatics analyses, we developed a database, TassDB (http://helios.informatik.uni-freiburg.de/TassDB), that provides information and large collections of NAGNAG acceptors.


Michael Hiller, Rolf Backofen
Albert-Ludwigs-University Freiburg
Competing Interests: The authors have declared that no competing interests exist.

Authors' Reply
That splice variation at tandem acceptor sites is frequent has been reported by several groups, including Zavolan et al. [1], Sugnet et al. [2], and Hiller et al. [3], and is uncontroversial. It is to be expected that at least some of these variations will affect protein function, and this is also beyond dispute, in spite of suggestions to the contrary in the letter of Hiller et al. [4]. The questions that are under discussion concern the mechanism that brings about these splice variations and their “functional consequences” or “role in biological functions.” The rather vague formulation of these questions has, in our opinion, given rise to much misunderstanding. Therefore, to be concrete, we list what we believe are the main relevant questions. (1) Why are these splice variations so common? By what mechanism are they brought about? (2) To what extent is the introduction of these variations controlled and regulated by the cell? (3) What fraction of these variations is neutral and what fraction affects protein function? (4) To what extent are the nonneutral variations deleterious and to what extent are they beneficial?
With respect to the first question, we have shown [5] that one need not invoke a complicated mechanism for introducing these variations, but that a simple model of stochastic binding of the spliceosome to competing splice sites, in combination with nonsense-mediated decay, can fully explain the abundance of these variations. Moreover, this model accurately predicts the relative frequencies of all small length variations, not only at acceptor but also at donor splice sites. As Hiller et al. also stress in their letter, there is little doubt that thermodynamic fluctuations, i.e., noise, play a role in splice-site selection. The combination of these facts suggests to us that thermodynamic noise is responsible for introducing a large fraction of the observed alternative splicing events at tandem acceptors.
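The idea of competing splice sites can be illustrated with a minimal sketch. It is not the model of [5], whose details are not given here, and it omits nonsense-mediated decay entirely. It only assumes that each candidate site has a hypothetical log-scale strength score and that usage is proportional to the exponential of that score (a generic Boltzmann-type competition). The function name and the scores are assumptions made for illustration.

```python
import math

def usage_probabilities(scores):
    """Relative usage of competing splice sites under a Boltzmann-type assumption.

    scores: hypothetical log-scale site strengths (higher = stronger).
    Returns probabilities proportional to exp(score), summing to 1.
    """
    top = max(scores)
    # Subtracting the maximum leaves the ratios unchanged but avoids overflow.
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Two tandem acceptors with assumed scores: the stronger one dominates,
# but the weaker one is still chosen in a fraction of transcripts.
p_strong, p_weak = usage_probabilities([3.0, 1.0])
```

In such a picture no dedicated regulatory mechanism is needed for a minor isoform to appear; whenever two sites have comparable strength, both are used at appreciable frequency.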
With respect to the second question, if the introduction of splice variation at NAGNAG acceptors were highly controlled by the cell, then one would not expect that they could be predicted from the local sequence at the splice site only. The fact that our same simple model successfully predicts which NAGNAG acceptors show splice variation and which do not suggests that at least a substantial fraction of all such splice variations are not tightly controlled by the cell. We agree with Hiller et al. that our model cannot explain the observed cases of variation in the relative proportion of the alternative splice forms across different tissues. We disagree, however, that this invalidates our model for these NAGNAGs. Just as different point mutations occur at different rates in different cellular states and sequence contexts, so may the relative probabilities with which the spliceosome binds to competing splice sites depend on details of the kinetics that may vary between tissues. It remains to be determined if the cells are able to actively regulate kinetic details so as to specifically regulate alternative splicing at tandem acceptor sites. In fact, we feel that one of the main uses of our model is to provide a baseline expectation under simple thermodynamic noise, allowing one to more effectively identify interesting cases that deviate significantly from this behavior.
With respect to questions 3 and 4, it is of course to be expected that some of the variations affect protein function. Indeed, Hiller et al. [3] have provided several lines of evidence that indicate a bias toward alternative NAGNAG acceptors that minimize the effect on the proteins. We agree with Hiller et al. that this strongly suggests that, at least in some cases, the effects of NAGNAG variations are deleterious and that selection acts to avoid them. We strongly disagree, however, that this argues against noise being responsible for introducing these variations. By the same reasoning one could argue that point mutations are not introduced by noise because one observes negative selection against certain single point mutants. Rather, the observed selection against NAGNAG motifs in locations where splice variation would deleteriously affect protein function suggests that the splice variation at NAGNAG acceptors is not under tight control of the cell, and supports the idea that these variations are mostly the result of uncontrollable noise. Finally, the fact that some variations deleteriously affect protein function does not imply that all these variations play a “role in biological function.” In many cases some amount of deleterious variation might just be tolerated.
How frequent are cases in which variations are beneficial for the cell, i.e., in which the cell uses both functionally different forms? We agree with Hiller et al. that such cases remain to be identified, but do not agree that the problem lies with the general difficulty of showing signs of positive selection. Positive selection is typically used to refer to cases where selection has favored change at particular positions. In contrast, to show that NAGNAG variations are beneficial, one would need to show only that there is clear selection for conserving the tandem acceptor property of variant NAGNAGs. This was in fact precisely the purpose of our test that compared the conservation of variant NAGNAG acceptors with that of invariant NAGNAG acceptors. Hiller et al. call this test “probably biased” due to a substantial fraction of NAGGAG tandem acceptors in which the GAG is part of the “conserved exon.” The point that we may not have stressed enough [5], and that is apparently not appreciated by Hiller et al., is that if there is selection for maintaining a NAGNAG acceptor that supports splice variation, then both AG dinucleotides need necessarily to remain conserved. This selection pressure is stronger even than the selection pressure on NAGs that are part of the exon, where selection will chiefly operate at the level of their coding potential, often allowing for neutral mutation of the AG dinucleotide. Thus, NAGNAGs at invariant acceptors must necessarily be under less selection to conserve both AG dinucleotides than beneficial variant NAGNAGs. If a substantial proportion of the variant NAGNAGs were under selection for their tandem acceptor property, then we would expect to see their NAGNAG property more often conserved than for invariant NAGNAGs. Since we do not observe this, we conclude that the fraction of NAGNAGs under selection for retaining their tandem acceptor function cannot be very large. Finally, Hiller et al. discuss the conservation test that they performed [3] and mention the conservation statistics obtained more recently by