Frequent Gain and Loss of Introns in Fungal Cytochrome b Genes

In this study, all available cytochrome b (Cyt b) genes from the GOBASE database were compiled and the evolutionary dynamics of the Cyt b gene introns were assessed. Cyt b gene introns were frequently present in the fungal kingdom and some lower plants, but generally absent or rare in Chromista, Protozoa, and Animalia. Fungal Cyt b introns were found at 35 positions in Cyt b genes, and the number of introns at individual positions varied from a single representative to 32 different introns at position 131, showing a wide and patchy distribution. Many homologous introns were present at the same position in distantly related species but absent in closely related species, suggesting that introns of the Cyt b genes were frequently lost. On the other hand, highly similar intron sequences were observed in some distantly related species rather than in closely related species, suggesting that these introns were gained independently, likely through lateral transfers. The intron loss-and-gain events could be mediated by transpositions that might have occurred between the nuclear and mitochondrial genomes. Southern hybridization analysis confirmed that some introns contained repetitive sequences and might be transposable elements. An intron gain in Botryotinia fuckeliana prevented the development of QoI fungicide resistance, suggesting that intron loss-and-gain events were not necessarily beneficial to their host organisms.


Introduction
Introns are widely distributed in numerous genes of viruses, prokaryotes and eukaryotes [1][2][3][4]. They are frequently found in mitochondrial genes of plant and fungal kingdoms, but are scarcely found in mitochondrial genes of Animalia, Protozoa and Chromista, although Animalia and Fungi are grouped together as Opisthokonts, and Chromista is phylogenetically close to Plantae [5]. Mitochondrial introns are classified into two main groups, group I and group II, based on their distinct RNA structures that facilitate their self-splicing activity [6][7]. Mitochondrial introns are usually group I introns and contain internal open reading frames (ORFs) which facilitate intron removal from RNA transcripts and intron propagation to new sites in genes [8][9][10].
Intron density in eukaryote genomes varies by more than three orders of magnitude, suggesting that there must have been extensive intron gain and/or loss events during evolution [11]. The evolutionary history of introns remains unresolved in many aspects, and several models have been proposed to explain the evolution and spread of introns. The Introns Late theory argues that introns could have expanded in a fashion similar to transposable elements (TEs), could have resulted from tandem duplications within exons [12], and could have been generated through a reverse-splicing mechanism catalyzed by the splicing machinery itself [13]. The presence of protein-encoding introns strongly supports the Introns Late theory, in which introns might be able to spread horizontally between phylogenetically distant species of different kingdoms [14][15].
In contrast, Introns Early theory suggests that introns were abundant in ancestral genes and mainly evolved through intron loss [16][17]. The prevailing theory for intron loss is that a processed mRNA is reverse transcribed to cDNA, which then recombines with the genomic copy of the gene, thereby precisely deleting the unmatched intronic sequence. This mechanism has been demonstrated experimentally in yeast [18] and many other studies also supported this mechanistic model [19][20][21][22][23]. Although the original concept of Introns Early vs Late was concerned with the origin of nuclear spliceosomal introns, the details of evolution of mitochondrial group I or group II introns could provide insights.
Most group I introns harbor genes encoding so-called homing endonucleases, which initiate intron mobility via a double-strand break (DSB) repair process [24]. Many group I introns in organellar genomes encode maturases that help promote intron splicing by a variety of mechanisms [8,25]. Some maturases also function in trans to promote splicing of other group I introns in the same genome [26][27]. The group II introns move via an RNA-based mechanism known as 'retrohoming'. The proteins encoded by group II introns are multifunctional, containing maturase, reverse transcriptase and DNA-binding functions as well as DNA endonuclease activity. The maturase activity facilitates intron splicing by stabilizing the catalytically active RNA conformation, while the other functions aid in retrohoming to allelic target sites [28][29] as well as non-allelic sites [24,30].
The number of mitochondrial introns or intron sequences may vary dramatically in different species. In recent studies on the peach brown rot fungi Monilinia spp. (anamorphs in Monilia), Cyt b genes, the targets of the Quinone outside inhibitor fungicides (QoIs), were isolated and analyzed [31][32][33]. The Cyt b genes in M. fructicola, M. laxa, M. fructigena and M. yunnanensis each have seven large (>1 kb) introns, whereas the Cyt b gene in M. mumecola has six. The intron/exon organization varies considerably among these species. In contrast to the large introns, the exons of Cyt b are usually small, ranging from 11 to 397 bp in length. The combined intron sequences of the Cyt b gene in M. fructicola, M. fructigena, M. laxa, M. yunnanensis and M. mumecola are 10,751, 12,380, 12,329, 11,456 and 13,027 bp, respectively. The large intron number and size make the Cyt b genes in Monilinia spp. the longest reported to date in all kingdoms of organisms. In contrast, the Cyt b gene in Blumeria graminis f.sp. tritici, the causal agent of wheat powdery mildew, is intronless, even though Monilinia and Blumeria species belong to the same fungal class, Leotiomycetes. It would be interesting to know what caused the Cyt b genes in Monilinia species to become the largest in eukaryotic organisms.
The Cyt b genes have been sequenced and annotated from thousands of species, including distantly related species from different kingdoms, and are widely used in systematic and evolutionary studies [34][35][36][37][38][39][40]. Furthermore, intron number and position in the Cyt b genes vary dramatically among different species. However, the dynamics of Cyt b gene introns have not been systematically studied. In this study, we compiled all available Cyt b genes from the GOBASE database and assessed the evolutionary dynamics of Cyt b gene introns; some introns in Monilinia and Botryotinia spp. were confirmed by Southern blot analyses.

Ustilaginoidea virens
The Cyt b genes were previously isolated from the peach brown rot fungi M. fructicola, M. fructigena, M. laxa, M. yunnanensis and M. mumecola [31][32][33]. In the present study, the Cyt b gene was isolated from the rice false smut fungus Ustilaginoidea virens. Briefly, degenerate primers were designed based on Cyt b gene sequences from phylogenetically close species, and PCR was carried out to amplify the conserved region from U. virens isolate UV-8a. Once a conserved fragment was isolated, the flanking sequences were obtained using TAIL-PCR or inverse PCR amplifications and sequenced. For cDNA amplification, total RNA was isolated using the RNeasy Plant Mini Kit (Qiagen, Valencia, CA) according to the manufacturer's instructions. First-strand cDNA was synthesized using an oligo-dT primer and Superscript III reverse transcriptase (Invitrogen, Carlsbad, CA) according to the manufacturer's recommendations. Amplification of Cyt b-specific cDNA was performed using primers designed to amplify the Cyt b sequence from the translational start codon to the translational stop codon. Intron numbers and precise locations were identified by comparing genomic DNA (gDNA) and cDNA sequences.

Data Mining of Cyt b Gene Sequences
Cyt b gene and intron sequences were obtained directly from the GOBASE database (http://gobase.bcm.umontreal.ca/) using the gene search and intron search options with Cyt b gene (cob) as gene name [41]. Unfortunately, the GOBASE database has been

Table 1. Distribution of introns in Cyt b genes of eukaryotic organisms.
1 In the GOBASE database, taxon divisions Fungi and Metazoa were equal to kingdoms Fungi and Animalia, respectively. Plant division represented the Plantae kingdom except for a few lower plants (i.e. some green algae). Protista included kingdoms Chromista and Protozoa, and some lower plants (green algae) which should have been in the kingdom Plantae based on Cavalier-Smith's classification system [5].
2 Data were obtained by gene search and intron search options with Cyt b gene (cob) as gene name in the GOBASE database [41], and some of our own sequences were added, i.e. seven Cyt b introns from M. fructicola, M. fructigena, M. laxa and M. yunnanensis, six Cyt b introns from M. mumecola and U. virens.
3 The precise intron numbers were not known because these Cyt b genes were partial sequences.
Spp. indicates biological species.
doi:10.1371/journal.pone.0049096.t001

Table fragment: intron positions corresponding to the Cyt b amino acid sequence of Ustilaginoidea virens. Surviving column headers: 163, 164, 168, 169, 171, 188, 200, 206, 208, 229, 253, 260, 263, 270, 275, 289, 386. Surviving entries: Tm (1), Mfg5 (1), Tj (1), Uv6 (1), Ml4 (1), To (1), Mm4 (1).

Determination of Intron Positions and Phases
Information about intron location is the basis for elucidating the dynamics of introns. In the present study, the relative location of an intron in the Cyt b gene is indicated by its "position" and "phase". Intron position refers to the codon at which the intron is inserted. Intron phase refers to the nucleotide position within a codon at which the intron is inserted. An intron is considered phase "0" if it is inserted between two consecutive codons, phase "1" if inserted between the first and second nucleotides of a codon, and phase "2" if inserted between the second and third nucleotides of a codon. The number, position and phase of introns in Monilinia spp. and U. virens were identified by comparing the gDNA and cDNA sequences. For Cyt b genes retrieved from the GOBASE database, the intron number, position and phase were determined as described below. To determine the locations of introns of the Cyt b genes, the deduced amino acid sequences of all Cyt b genes were aligned using the MegAlign program in the software package DNASTAR (DNASTAR Inc., Nevada City, CA). To make intron locations comparable among different Cyt b genes, the location of each intron was expressed relative to the Cyt b gene of U. virens. If the locations of two introns vary by fewer than 6 nucleotides, they are called "sliding introns". The annotation of all sliding introns was checked manually as described in Figure S1. If possible, sliding introns were modified so that they could be adjusted to the same location. In most cases, the nucleotide at the 3′ end of a group I intron should be 'G' and the exonic base immediately upstream of a group I intron should be 'T' [42][43][44]. Introns with identical positions are called common introns; an intron is referred to as a unique intron if it is the only intron found at a certain position.
Each intron is named with the abbreviation of the species name followed by a number, with "1" denoting the most 5′ intron and the largest number denoting the most 3′ intron. For example, intron Mm3 refers to the 3rd intron of the Cyt b gene from M. mumecola.
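The position and phase definitions above can be illustrated with a minimal Python sketch. It is not the procedure used in this study, which relied on MegAlign alignments and manual inspection. The function names `intron_location` and `is_sliding`, the use of a coding-sequence nucleotide offset as input, the convention of assigning a phase 0 intron to the codon immediately downstream of the insertion site, and the exclusion of identical locations from "sliding" are all assumptions made for illustration.

```python
from typing import Tuple


def intron_location(upstream_nt: int) -> Tuple[int, int]:
    """Return (codon position, phase) for an intron insertion site.

    upstream_nt is the number of coding nucleotides that precede the
    intron in the spliced coding sequence (hypothetical input format).
    Phase 0: between two codons; phase 1: after the first nucleotide
    of a codon; phase 2: after the second nucleotide of a codon.
    Assumption: a phase 0 intron is assigned to the downstream codon.
    """
    if upstream_nt < 0:
        raise ValueError("upstream_nt must be non-negative")
    phase = upstream_nt % 3
    codon = upstream_nt // 3 + 1  # 1-based codon number
    return codon, phase


def is_sliding(upstream_nt_a: int, upstream_nt_b: int) -> bool:
    """Two introns 'slide' if their sites differ by fewer than 6 nt.

    Assumption: introns at exactly the same site are not counted as
    sliding, because they already share one location.
    """
    return 0 < abs(upstream_nt_a - upstream_nt_b) < 6


if __name__ == "__main__":
    # An intron preceded by 490 coding nucleotides: 490 = 3 * 163 + 1
    print(intron_location(490))   # codon 164, phase 1
    print(is_sliding(490, 492))   # sites 2 nt apart
```

Under these assumptions, two introns annotated 2 nt apart would be flagged for the manual check described in Figure S1 rather than being treated automatically as the same location.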

Phylogenetic Analysis
The maximum parsimony (MP) method was used to construct phylogenetic trees. Cyt b gene cDNA sequences or intron sequences at position 164 were aligned with the program Clustal W in the software MEGA 4.0. The following settings were used to construct the phylogenetic trees: heuristic search using close neighbor interchange (CNI; level = 1) with initial trees generated by random addition (100 reps).

Identification of Homology in Cyt b Introns
The homology of Cyt b introns was identified by using a reciprocal BLAST to calculate the pairwise nucleotide identity of introns. The BLASTN algorithm was used for the identity search through BioPerl with default parameters and an E-value cutoff of 1e-05. To obtain an accurate estimate of global hit statistics, a tiling of high-scoring segment pairs (HSPs) onto either the subject or the query sequence was performed. Two introns were considered homologues if coverage (the sum of HSP lengths divided by the length of the longer intron in the pair) was >30% and the average pairwise identity was >80%.

Southern hybridization analysis was performed mainly as described previously [45]. In brief, DNAs were digested with EcoRI or HindIII, separated in 1.0% agarose gels in 0.5× TBE buffer and transferred to Hybond N+ membranes (Amersham Pharmacia Biotech UK Limited, Buckinghamshire, UK). Intronic fragments were amplified by PCR with the intron-specific primer pairs listed in Table S1. Thermal parameters for amplifying each fragment were as follows: 4 min at 94 °C; 35 cycles of 40 s at 94 °C, 40 s at 54 °C and 1.5 min at 72 °C; and a final elongation step at 72 °C for 5 min. All PCR products were gel-purified using the EasyPure Quick Gel Extraction Kit (TransGen Biotech, Beijing, China). Gel-purified PCR products were labeled with alkaline phosphatase using AlkPhos Direct Labeling Reagents (Amersham Biosciences UK Limited). Hybridization was performed overnight at 55 °C in Gene Images AlkPhos Direct hybridization buffer including 0.5 M NaCl and 4% blocking reagent (Amersham Biosciences UK Limited). Membranes were washed twice in primary buffer solution at 55 °C for 10 min and then washed twice in secondary buffer solution at room temperature for 5 min. Target DNA was detected using the CDP-Star reagent (Amersham International plc, Buckinghamshire, UK) and the chemiluminescent signal was captured using a ChemiDoc XRS+ Imager (Bio-Rad Laboratories Inc., California, USA).

(Table 1).
In general, Cyt b gene introns were frequently present in organisms from the fungal kingdom and some lower plants such as green algae, liverworts, hornworts, true quillwort and moss, but generally absent or rare in the Cyt b genes from Chromista, Protozoa and Animalia. Therefore, the analysis of dynamics of Cyt b gene introns was focused on the fungal kingdom.

Exon-intron Structure of Cyt b Gene in Ustilaginoidea virens
A total of 172 fungal introns from 69 Cyt b genes were identified and characterized extensively. Of the 172 introns investigated, the locations of 171 were determined as described in Materials and Methods; the location of intron En could not be defined because the Cyt b gene sequence containing this intron was unavailable in the database. The introns of Cyt b genes varied dramatically in location and length, sometimes even between closely related species.
As a whole, fungal introns were found at 35

Frequent Intron Losses in Fungal Cyt b Genes
To understand the relationship between Cyt b introns, the homology of the 171 fungal Cyt b gene introns with known positions was analyzed. Theoretically, any pair of introns producing a hit with an E-value below the threshold (1e-05) has a certain chance of being homologous. However, for many hits the sum of HSPs covered only a limited part of either intron. Such introns are unlikely to be real homologues and probably just share certain conserved regions. Thus, in the present study, only introns meeting additional criteria (coverage >30% and average pairwise identity >80%) were considered homologues.
Altogether 19 homologues were identified (Table 3). As expected, most homologous introns were found at the same locations. Many homologous introns were present at the same position in distantly related species but absent in closely related species, indicating that frequent intron loss had occurred in Cyt b genes. For example, at position 169, homologous introns were detected in at least three classes of the phylum Ascomycota, namely Sordariomycetes (Gz4), Eurotiomycetes (Ab, Nf) and Leotiomycetes (Mfc5, Mfg6, Mm5, Ml5 and My5), indicating that these introns were vertically inherited from a common ancestral intron that predated the divergence of these classes. These highly homologous introns were detected in all of the investigated species of the genus Monilinia but not in B. fuckeliana, which belongs to the same family, Sclerotiniaceae, indicating that the intron at position 169 was recently lost in B. fuckeliana (after the divergence of the genera Monilinia and Botryotinia). Intron loss events were also observed at position 67: homologous introns were detected in species from the Leotiomycetes (Bf1, Mfc1, Mfg1, Mm1, Ml1 and My1) and Sordariomycetes (Uv1, Pa1), but not in G. zeae, which belongs to the same order (Hypocreales, class Sordariomycetes) as U. virens. Besides positions 67 and 169, similar intron loss events were observed at least at positions 131, 143, 164 and 260 (Table 3). Such frequent intron loss events may help explain why more than 97% of investigated species contained intron-free Cyt b genes and more than half of the Cyt b genes containing introns had only one or two of them (Table 1).
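As an illustration of these two thresholds, the following Python sketch tiles HSP intervals onto one sequence and applies the coverage and identity cutoffs. It is a hypothetical reconstruction, not the BioPerl code used in this study: the HSP tuple layout, the function names, and the choice of a length-weighted mean as the "average pairwise identity" are assumptions.

```python
from typing import List, Tuple

# Hypothetical HSP record: (start, end, percent identity), with 1-based
# inclusive coordinates on the sequence onto which HSPs are tiled.
HSP = Tuple[int, int, float]


def tiled_length(hsps: List[HSP]) -> int:
    """Number of bases covered by at least one HSP (overlaps merged)."""
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted((s, e) for s, e, _ in hsps):
        if cur_end is None or start > cur_end:
            # Close the previous merged interval and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered


def mean_identity(hsps: List[HSP]) -> float:
    """Length-weighted mean percent identity (an assumed definition)."""
    total = sum(e - s + 1 for s, e, _ in hsps)
    return sum((e - s + 1) * ident for s, e, ident in hsps) / total


def is_homologous(hsps: List[HSP], len_a: int, len_b: int,
                  min_coverage: float = 0.30,
                  min_identity: float = 80.0) -> bool:
    """Apply the coverage (>30%) and identity (>80%) criteria."""
    if not hsps:
        return False
    coverage = tiled_length(hsps) / max(len_a, len_b)
    return coverage > min_coverage and mean_identity(hsps) > min_identity


if __name__ == "__main__":
    hits = [(1, 400, 95.0), (300, 600, 90.0)]  # invented example HSPs
    print(tiled_length(hits), is_homologous(hits, 1000, 1200))
```

Dividing by the longer intron of the pair is what makes a short conserved region shared by two large introns fall below the coverage cutoff even when its local identity is high.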

Frequent Intron Gains in Fungal Cyt b Genes
Analysis of the homology of fungal Cyt b introns indicated that some introns located at the same common position did not show similarity even though the corresponding species are closely related. For instance, at position 164, introns Ml4 and Mfg5 were not homologous with introns My4 and Mm4 (Table 3), although they are from species of the same genus. These unrelated introns might have been inserted recently and independently at the same gene position in different organisms. The observation of highly divergent intron sequences at the same position in closely related species strongly indicates the high mobility of these introns.
Some introns at the same common position were more similar in distantly related species than in closely related species. As shown in Table 3 and Figure 1A, at position 164 intron Mm4 of M. mumecola did not show similarity with intron Ml4 in M. laxa although the two species are closely related. Similarly, intron My4 in M. yunnanensis was not homologous with intron Mfg5 in M. fructigena although their corresponding Cyt b coding sequences had 99.1% nucleotide identity. On the other hand, introns Mm4 and My4 in two distantly related Monilinia spp. were highly homologous and had 99.4% nucleotide identity (Figure 1B). Such highly similar intron sequences in distantly related species suggest that the introns were gained independently, likely through lateral transfers after speciation within the genus Monilinia (Figure 1C), though how the transfers occurred is still unclear. Intron Mfc4 was also not homologous with other introns, suggesting that it was gained in a similar fashion (Table 3, Figure 1C). By contrast, introns Ml4 and Mfg5 were homologous with introns Bf4, Gz3, Nc2 and Uv3, indicating that those introns were inherited vertically from a common ancestral intron.
Most mitochondrial group I introns can encode LAGLIDADG or GIY-YIG DNA endonucleases, which play a role in the transfer and site-specific integration ("homing") of the intron (Lazowska, Jacq et al. 1980; Lambowitz and Perlman 1990; Pellenz, Harington et al. 2002). In the present study, most introns at common positions contained the LAGLIDADG endonuclease coding sequence (e.g. Mm4, Ml4, Mfc4) or the GIY-YIG endonuclease coding sequence (e.g. Bf2, Pa2, My2) and thus should be able to move by lateral transfer. Homing endonuclease genes (HEGs) are disproportionately common in mitochondria and chloroplasts of eukaryotes. The first HEG discovered was ω (I-SceI) in the large subunit rRNA gene in the mitochondria of the yeast Saccharomyces cerevisiae [48][49]. Studies of the evolutionary dynamics of HEGs showed that they persist by regularly moving to new species through horizontal transmission, which starts the process of invasion, spread and fixation all over again, and that the long-term persistence of HEGs in a single population relies on low transmission rates and a positive correlation between transmission efficiency and toxicity [16,46,50]. The introns in Cyt b genes of Monilinia spp. contained HEGs, which should help the introns move to specific recognition sites by horizontal transmission. Therefore, the high number of Cyt b introns in Monilinia spp. appeared to be a result of recent intron gain events, and this directly caused Monilinia Cyt b genes to be the largest in eukaryotes.

Intron Loss-and-gain Events could be Mediated by Transpositions
As shown in Table 3 Hybridization experiments were performed to investigate the transposability of Cyt b intron sequences in the genomes of Monilinia species. Intron My3 from M. yunnanensis is located at position 143. Southern hybridization using My3 as a probe showed the expected bands (those with arrows in lanes 1, 2 and 11, 12 in Figure 2A These introns were repetitive elements, a characteristic of TEs. Mfg2 was also confirmed to be a repetitive intron (Figure S2A). The multiple copies of these introns could be either mitochondrial or nuclear, since total genomic DNA was used in the experiments. By contrast, My4 and Ml4 were single-copy elements (Figure 2B, Figure S2B) and thus should not be considered TEs.

Intron Gains can have a Profound Impact on the Phenotype
Single amino acid changes in the Cyt b gene can confer QoI fungicide resistance in plant pathogenic fungi. For instance, in the plant pathogens B. graminis, Sphaerotheca fuliginea, Mycosphaerella fijiensis and Venturia inaequalis, fungicide resistance is conferred by a single point mutation in the Cyt b gene, leading to an amino acid substitution at position 143 from glycine to alanine (G143A) (Gisi, Sierotzki et al. 2002; Sierotzki, Schlenzig et al. 2002). The presence of an intron at position 143 has been shown to prevent the formation of the G143A mutation in some pathogens, because mutations in this codon would interfere with intron splicing and consequently prevent the formation of an active Cyt b gene [31][51][52]. A similar mechanism was documented in S. cerevisiae, where an exon mutation at the second nucleotide upstream of the splice site of intron 4 in the Cyt b gene (COB4) did not permit correct splicing of the pre-mRNA [53][54].
Two different Cyt b alleles were discovered in B. fuckeliana, one with and one without an intron at position 143. Southern hybridization using intron Bf3 as a probe showed that this intron was absent from the entire genome of isolates lacking the intron at position 143 (Figure S2C). Homology analysis showed that Bf3 was not homologous with any other intron, including those at the same position in closely related Monilinia spp. (Table 3). Therefore, Bf3 was most likely gained laterally at position 143 in the isolates that carry it, rather than lost in the isolates without the intron. Since only B. fuckeliana isolates lacking intron Bf3 have the G143A mutation and are highly resistant to QoI fungicides (Banno, Yamashita et al. 2009; Ishii, Fountaine et al. 2009), the Bf3 gain event directly influenced the development of QoI fungicide resistance. Once isolates gained Bf3, they lost the potential to develop fungicide resistance based on the G143A mutation. The pathogen did not obtain any visible benefit from this intron gain event, which instead hindered the development of fungicide resistance. This suggests that it was just one of many intron gain-and-loss events occurring neutrally and that such events were not necessarily beneficial to their host organisms.

Supporting Information
Table S1 Primers used for intron-specific PCR amplification. (DOC)

Figure S1 Modification of the sliding intron position or phase. (A) The phase of intron Sc5 was modified based on one nucleotide shifting of the exon/intron sequence. (B) The phase of intron Gi3 was modified based on two nucleotide shifts that altered one codon. (C) Sliding introns that cannot be modified to the same position or phase. Lowercase letters indicate intron sequences, uppercase letters represent the flanking exon sequences, and the larger uppercase letters represent the codon at which individual introns are located. If introns have different positions or phases, they are considered to be at different locations. The locations of introns were modified slightly from the original information for some sliding introns. For example, the intron Sc5 at position 270 was a phase 0 intron based on the database information, but shifting the exon/intron sequence by one nucleotide gave it the same phase as other introns at this position (such as Cbs2 (2), Cc (2) etc. in Table 2) as well as the common group I intron features (A). The intron Gi3 at position 164 was a phase 0 intron, but two nucleotide shifts of the exon/intron sequence would fulfill the above-mentioned criteria, though one exonic codon was altered (B). In contrast, the same criteria did not allow sliding the introns in (C). (TIF)