Analysis of small and large subunit rDNA introns from several ectomycorrhizal fungi species

Introns in the small (18S) and large (28S) subunit nuclear ribosomal DNA (rDNA) were sequenced and analyzed in a variety of ectomycorrhizal fungal taxa. Both 18S and 28S rDNA can contain introns, and within the same fungal taxon (Meliniomyces) these introns show some variation in size, nucleotide sequence and insertion position. Among the tested isolates, 18S rDNA has four intron insertion sites and 28S rDNA has two. All 18S and 28S rDNA introns among the tested isolates are group I introns with a set of secondary structure elements designated P1-P10 (helices and loops). We found a 12 nt sequence, TACCACAGGGAT, at site 2 near the 3'-end of 28S rDNA; site 2 introns insert immediately upstream or downstream of this 12 nt sequence. Sequence analysis of all 18S and 28S rDNA introns from the tested isolates revealed three highly conserved regions of about 30 nt each (conserved 1, conserved 2 and conserved 3) that contain identical nucleotides. The three conserved regions have a high GC content, generally above 60%. Our results suggest that the more readily a host site, an intron sequence and secondary structure, or an isolate accommodates 18S and 28S rDNA intron insertion and deletion, the more common it is. For both the 18S and the 28S rDNA introns of the tested isolates, complementary base pairing at the splice sites in the P1-IGS-P10 tertiary helix, between the 5'-end of the intron and the exons, was weak.


Introduction
Mycorrhizal symbiosis is a common phenomenon in all terrestrial plant communities. One of the major types of mycorrhiza is the ectomycorrhiza, typically formed by almost all tree species in temperate forests [1]. In the ectomycorrhizal symbiosis, in which the fungus forms a mantle external to the plant root, the numbers of plant and fungal species involved are currently estimated to be ca. 6,000 and 20,000-25,000, respectively [2,3]. The ecologically and economically most important forest trees (Pinaceae, Fagaceae, Betulaceae, Nothofagaceae, ...
... of 28S rDNA was amplified using primers Vdahl4 (5'-CGGGCTTGGCAGAATCAG-3') / Vdahl2 (5'-GCGACGTCGCTATGAACG-3') [18]. An initial denaturation at 94 °C for 1 min was followed by 30 cycles of denaturation at 94 °C for 30 s, annealing at 47 °C for 30 s, and extension at 72 °C for 90 s, with a final extension step at 72 °C for 10 min. The 18S rDNA-ITS-28S rDNA region was amplified using primers ITS1 (5'-TCCGTAGGTGAACCTGCGG-3') / ITS4 (5'-TCCTCCGCTTATTGATATGC-3') [19]. An initial denaturation at 94 °C for 1 min was followed by 30 cycles of denaturation at 94 °C for 30 s, annealing at 50 °C for 30 s, and extension at 72 °C for 120 s, with a final extension step at 72 °C for 10 min. The products were electrophoresed in a 1% (w/v) agarose gel to check the efficiency of amplification. The purified amplicons were sequenced by Shanghai Sangon Biotechnology Co., Ltd., Shanghai Invitrogen Biotechnology Co., Ltd., and Beijing Tsingke Biotechnology Co., Ltd., China. The sequences were aligned with the sequence analysis software DNAMAN (Lynnon Corporation).
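The two cycling programs described above can be written down as data, which makes the differences between them (annealing temperature and extension time) explicit. The following is a minimal sketch, not the authors' protocol file: the class name `PcrProgram` and its field names are hypothetical, and the computed duration counts programmed hold times only, ignoring ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    """Hypothetical container for one thermocycler program (times in seconds)."""
    primers: str
    initial_denaturation_s: int
    cycles: int
    denaturation_s: int
    annealing_temp_c: int
    annealing_s: int
    extension_s: int
    final_extension_s: int

    def hold_time_s(self) -> int:
        """Sum of programmed hold times; ramp times are not included."""
        per_cycle = self.denaturation_s + self.annealing_s + self.extension_s
        return (self.initial_denaturation_s
                + self.cycles * per_cycle
                + self.final_extension_s)


# Values taken from the text: 94 C/1 min, 30 cycles, 72 C/10 min final step.
LSU_PROGRAM = PcrProgram("Vdahl4/Vdahl2", 60, 30, 30, 47, 30, 90, 600)
ITS_PROGRAM = PcrProgram("ITS1/ITS4", 60, 30, 30, 50, 30, 120, 600)

for program in (LSU_PROGRAM, ITS_PROGRAM):
    seconds = program.hold_time_s()
    print(f"{program.primers}: {seconds} s ({seconds // 60} min)")
```

Under these assumptions the Vdahl4/Vdahl2 program holds for 5160 s (86 min) and the ITS1/ITS4 program for 6060 s (101 min); the longer extension step accounts for the whole difference.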

Intron secondary structure modeling
Secondary structure models were predicted following the conventions for group I introns defined by Burke et al. and according to the models proposed by Cech and by Michel and Westhof [12][13][14]. The P1-P9 stem-loop elements were individually identified by comparison with available group I intron sequences from the Comparative RNA Web site (CRW, http://www.rna.icmb.utexas.edu/) and then folded using the mfold web server at http://www.bioinfo.rpi.edu/applications/mfold/old/rna/form1.cgi [20,21]. The RNA secondary structures were calculated and drawn using RNAstructure version 4.6 [22].

Positions and structure analysis of 18S rDNA introns
The ... -I, Picea-I1, Picea-I2, 1-1-I) from the tested isolates had the features known to be conserved among group I introns: the last exon base U and the last intron base G; the pairing regions P1-P10; the consensus elements P, Q, R and S within the core region; and the internal guide sequence (IGS) proposed to help align the exons for splicing [23][24][25][26][27][28][29]. Besides these common group I intron structures, the 18S rDNA introns Picea-I1, Picea-I2, Pop7-I, SB6-I, 1-1-I and Spicea-I have an extended P5 region (P5, P5a, P5b, P5c and P5d), and the 18S rDNA introns Picea-I1, Picea-I2, Pop-I, AM51-I, 1-1-I and Spicea-I have two extra stems on the 3' side of P9 (P9.1 and P9.2), as found in this study and as we reported previously [30,31]. The 18S rDNA introns Picea-I1, Picea-I2, Pop7-I, AM51-I, 1-1-I and Spicea-I possess an A-rich bulge; however, we did not find a typical A-rich bulge around the P5 pairing region in the secondary structure of the 18S rDNA intron SB6-I. The sequences of Picea-I2, Yang2-I, Baihua-I and Shanbai-I share 94.7% identity and have the same secondary structure. The sequences of SB6-I and SO2-I ...
... (Fig 2). The intron distribution in the 28S rDNA of the tested isolates is shown in Fig 1, and the exon sequences flanking the introns are shown in Fig 2. A comparison of intron distribution between 18S rDNA and 28S rDNA is given in Table 2. Some isolates have both 18S and 28S rDNA introns, some have only 18S or only 28S rDNA introns, and some have neither. Among the tested isolates, AM51, Picea, Yang2, Shanbai and Baihua belong to Meliniomyces species, and both their 18S and 28S rDNA introns show some variation in size, nucleotide sequence and insertion position. In contrast, in all tested Cenococcum geophilum isolates the 18S introns insert at site 3 and the 28S introns insert at site 2, and the sequences within each set are highly similar.
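An identity figure such as the 94.7% reported above for Picea-I2, Yang2-I, Baihua-I and Shanbai-I depends on how alignment gaps are counted. The sketch below shows one common convention as an assumption, not as the calculation DNAMAN performs: columns where both sequences carry a gap are skipped, and a gap opposite a base counts as a mismatch. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: gap/gap columns are ignored; base/gap columns
    count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignment (not real intron data): 8 matching columns out of 10.
toy_a = "ACGTACGT-A"
toy_b = "ACGTTCGTGA"
print(percent_identity(toy_a, toy_b))
```

For the toy pair this gives 80.0: one substitution and one base/gap column among ten compared columns.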
... Cenococcum geophilum introns) from the tested isolates, three highly conserved regions of about 30 nt each (conserved 1, conserved 2 and conserved 3) were found, and identical nucleotides occur in all three (Fig 5). The three conserved regions have a high GC content, generally above 60%, which implies that they take part in complementary base pairing that may be more stable. Combining the sequence analysis of the three conserved regions with the deduced intron RNA secondary structures suggests that they may participate in forming the P3, P4 and P7 helices of the core region (which contains the consensus elements P, Q, R and S), or may be important for maintaining the core structure or for splicing function. The conserved 1 region lies around the P3 and P4 helices and can pull them together. The conserved 2 region lies around the P4, P6 and P7 helices; it may stabilize the P and Q consensus elements in the P4 helix (the conserved 2 region can pair with the conserved 1 region in many introns, for example AM51-I2, Shanbai-I, 2-15-I and all tested Cenococcum geophilum introns) or pull the P6 and P7 helices together (the conserved 2 region lies around the P6 and P7 helices in introns Pop7-I and Pop2-I). In intron 2-16-I, the conserved 2 region is located in the unpaired region of the P9 helix, which contains a small ORF. The conserved 2 region was not found in intron SO5-I. The conserved 3 region lies around P7, P8 and P9 and may be important for strengthening the secondary structure of the core region or for forming loops L8, L9, L9.1, L9.2 and L9.3 (Fig 4). The conserved 1 region is more conserved than the conserved 2 and conserved 3 regions, and the conserved 3 region appears more conserved than the conserved 2 region. The conserved 1 region therefore seems the most important for maintaining the structure of the intron core.
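The GC criterion used above (regions of about 30 nt with GC content generally above 60%) can be checked with a simple sliding window. This is a minimal sketch under stated assumptions: the function names, the 30 nt window and the strict "above 60%" threshold are illustrative choices, and the toy sequence is not intron data. The only real sequence used is the 12 nt site 2 motif from this paper.

```python
def gc_percent(seq: str) -> float:
    """GC content (%) of a DNA or RNA sequence, ignoring non-ACGT symbols."""
    bases = [b for b in seq.upper().replace("U", "T") if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc_count = sum(b in "GC" for b in bases)
    return 100.0 * gc_count / len(bases)


def gc_rich_windows(seq: str, window: int = 30, threshold: float = 60.0):
    """Return (start, gc_percent) for every window whose GC exceeds threshold."""
    hits = []
    for start in range(len(seq) - window + 1):
        pct = gc_percent(seq[start:start + window])
        if pct > threshold:
            hits.append((start, pct))
    return hits


# The 12 nt site 2 motif reported in this study.
print(gc_percent("TACCACAGGGAT"))

# Toy sequence (hypothetical): an AT-rich flank around a 30 nt GC block.
toy = "AT" * 10 + "GC" * 15 + "AT" * 10
print(gc_rich_windows(toy)[:3])
```

The motif itself has a GC content of 50.0%, so by this measure it is less GC-rich than the conserved regions described above.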
The conserved 1, 2 and 3 regions of introns 2-16-I and SO5-I, which contain long unpaired sequences with small HEG ORFs, are overall less conserved than those of introns without HEG ORFs. Introns containing HEGs can be spliced by homing endonucleases, and endonuclease-mediated intron homing is an efficient process. Homing is initiated by an intron-encoded homing endonuclease that recognizes and cleaves DNA, generating a double-stranded break close to the site of intron insertion [32][33][34][35][36][37][38][39][40]. Because HEG-containing introns can encode their own endonucleases to splice the introns, they probably depend less, or not completely, on conserved sequences. This may be why the sequences of HEG-containing introns are less conserved than those of introns without HEGs. In the 28S rDNA site 1 introns (AM51-I1 and Yang2-I1) from isolates AM51 and Yang2, the conserved 1 and conserved 3 regions can still be found. Combining the sequence analysis of these regions with the intron secondary structures shows that the conserved 1 region lies around the P3 and P4 helices and can pull them together, while the conserved 3 region lies around P7, P8 and P9 and may be important for strengthening the secondary structure of the core region or for forming loop L9 (Fig 5). The conserved 2 region was not found in introns AM51-I1 and Yang2-I1.
We then examined whether the conserved 1, 2 and 3 regions of the 28S introns also exist in 18S rDNA introns. Interestingly, traces of them can be found there (Figs 5 and 6). Conserved 1, conserved 2 and conserved 3 are present in all Cenococcum geophilum 18S rDNA introns listed in Table 1 (site 3); the only difference is that conserved 2 is located upstream of conserved 1, yet it can still pair with conserved 1 (Fig 6). Cenococcum geophilum is an ecologically important ectomycorrhizal fungus with a global distribution and a broad host range [41]; could one reason be that its 18S and 28S rDNA intron sequences and secondary structures favor insertion and deletion? Conserved 1, conserved 2 and conserved 3 are found in the 18S rDNA introns Picea-I1 and Pop7-I (site 1). Conserved 1 and conserved 2 are found in the 18S rDNA introns Picea-I2, Yang2-I, Baihua-I and Shanbai-I (site 3). Conserved 1 and conserved 3 are found in the 18S rDNA intron AM51-I (site 2). Only conserved 3 is found in the 18S rDNA intron SB6-I (site 4), where it is divided into two parts: the 5' part is located in the P2.1 helix and the 3' part in helix P9 and loop L9 (Fig 3).

Discussion
Introns 2-16-I and SO5-I have, besides the pairing regions P1-P10, long unpaired regions. A search for open reading frames suggests that they contain small ORFs, so they may belong to the HEG-associated group I introns (Fig 4). Goddard and Burt (1999) published a model of the intron life cycle and homing that involves cyclical intron gain and loss. A full-length HEG may be needed for invasion; once the intron becomes fixed, the HEG is no longer needed, so it accumulates mutations and becomes non-functional or is lost [42]. From this evolutionary point of view, introns without HEGs may be more advanced and introns containing HEGs may be older. We found that the conserved 1, 2 and 3 regions of the HEG-containing introns 2-16-I and SO5-I are less conserved than those of introns without HEGs. Introns containing HEGs are very rare in 18S and 28S rDNA: among all the 18S and 28S rDNA sequences we tested, we found only three (SB5-I from 18S rDNA, SO5-I and 2-16-I from 28S rDNA). An HEG that is no longer needed will be gradually deleted, and 2-16-I and SO5-I appear to retain residual HEG nucleotides (non-functional sequences). These residual HEG sequences probably remain in the introns because some of their nucleotides take part in maintaining intron secondary structure or function. We did not find introns containing full-length HEGs; all three HEG-containing introns (SB5-I from 18S rDNA, SO5-I and 2-16-I from 28S rDNA) carry residual HEGs of about 100-200 nucleotides. All of our ectomycorrhizal fungal samples were collected in China.
The 12 nt sequence TACCACAGGGAT at site 2 near the 3'-end of 28S rDNA, which lies immediately upstream or downstream of the intron insertion position, together with the highly conserved regions and identical nucleotides in the site 2 introns, may make it much easier for introns to insert or be deleted. Introns interrupt the exon sequence, so they could possibly control expression of the exon genes. Intron-containing and intron-lacking copies of 18S rDNA and 28S rDNA can be found in the same isolate; for example, isolate CG5 has 18S rDNA both with and without introns, and other isolates show the same pattern. Genomic DNA contains many 18S-5.8S-28S rDNA repeat units. If the rRNA product of 28S rDNA is expressed beyond what cell metabolism needs, it will accumulate in the cell. The 28S rDNA product is larger than the 18S rDNA product, so over-expression of 28S rDNA probably burdens the cell more than over-expression of 18S rDNA. The mechanism controlling 28S rDNA expression may therefore be more convenient than that controlling 18S rDNA expression, and introns may be one of these expression controls. Our previous population genetic structure analysis showed that the majority of isolates contain 18S and 28S rDNA introns, which means that such isolates are more common than isolates without them and, furthermore, implies that intron-containing isolates cope with selection pressure better than intron-lacking isolates. The population genetic structure, with its presence and absence of 18S and 28S rDNA introns, is probably in a balance between intron gain and loss. According to published reports and our work, the presence rate of Cenococcum geophilum 18S rDNA introns differs significantly among China, America and Europe; the presence rate may reflect the selection pressure associated with geographical origin. Temperatures in Europe are overall lower than in China; are the presence rate of introns and the evolutionary rate of the plant host and fungus affected by temperature?
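The relationship between the 12 nt site 2 motif and the intron insertion point discussed at the start of this section can be checked on an intron-free 28S rDNA sequence with a short script. This is a minimal sketch: the function names, the coordinate convention and the toy exon sequence are hypothetical; only the motif TACCACAGGGAT comes from this study.

```python
SITE2_MOTIF = "TACCACAGGGAT"  # 12 nt sequence reported at 28S rDNA site 2


def find_motif(seq: str, motif: str = SITE2_MOTIF) -> list:
    """Return 0-based start positions of every (overlapping) motif occurrence."""
    seq = seq.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def insertion_relative_to_motif(insertion_index: int, motif_start: int,
                                motif_len: int = len(SITE2_MOTIF)) -> str:
    """Classify an insertion point relative to the motif.

    Assumed convention: the intron is inserted between exon bases
    insertion_index - 1 and insertion_index (0-based, intron-free sequence).
    """
    if insertion_index <= motif_start:
        return "upstream"
    if insertion_index >= motif_start + motif_len:
        return "downstream"
    return "within"


# Toy exon sequence (hypothetical flanks around the real motif).
toy_exon = "GGCTA" + SITE2_MOTIF + "CCGAT"
motif_start = find_motif(toy_exon)[0]
for insertion in (5, 17, 10):
    print(insertion, insertion_relative_to_motif(insertion, motif_start))
```

With this convention, an insertion at index 5 falls immediately upstream of the motif and one at index 17 immediately downstream, matching the two placements described for site 2 introns; index 10 would interrupt the motif.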
Weeks and Cech reported that the yeast mitochondrial group I intron bI5 undergoes self-splicing at high Mg2+ concentrations but requires the splicing factor CBP2 for reaction under physiological conditions. The CBP2 protein helps assemble the catalytic core, which involves association of two domains with each other and with other peripheral structures, and helps the 5' domain containing the 5' splice site associate properly with the catalytic core [43]. The Tetrahymena pre-ribosomal RNA intron can undergo self-splicing in the absence of any protein [44,45]. Analyzing the P1-IGS-P10 tertiary helix between the 5'-end of the introns and the exons in 18S and 28S rDNA in this study, we found that complementary base pairing around the splice sites was weak. In the P1-IGS-P10 tertiary helix around the splice sites there are many U-G base pairs and unpaired bases. One of the conserved features of group I introns is that the last exon base is U. U-A and U-G pairs are weaker than C-G pairs, and the presence of unpaired bases could also destabilize the helix to some degree. The 5' and 3' exons both base-pair with the intron's IGS, forming the P1 and P10 helices, respectively [45]. The U-G pairs and unpaired bases in the P1-IGS-P10 tertiary helix may make the introns easier to excise and the 5' and 3' exons easier to ligate. Other papers indicate that a U-G pair at the 5' splice site in the P1-IGS-P10 tertiary helix is very common and is present in almost all introns [24][46][47][48][49].
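The observation that pairing around the splice sites is weak rests on counting C-G, U-A and U-G pairs and unpaired positions in the P1 and P10 helices. The sketch below illustrates such a tally; it is not the procedure used in this study. The function names and the toy strands are hypothetical, and the two strands are assumed to be pre-aligned position by position, with the partner strand written 3' to 5'.

```python
from collections import Counter


def classify_pair(base_a: str, base_b: str) -> str:
    """Classify two facing RNA bases as 'GC', 'AU', 'GU' (wobble) or 'unpaired'."""
    pair = frozenset((base_a.upper().replace("T", "U"),
                      base_b.upper().replace("T", "U")))
    if pair == frozenset("GC"):
        return "GC"
    if pair == frozenset("AU"):
        return "AU"
    if pair == frozenset("GU"):
        return "GU"
    return "unpaired"


def helix_summary(strand_5to3: str, partner_3to5: str) -> Counter:
    """Tally pair types along a helix; position i of one strand faces
    position i of the other (partner given 3'->5')."""
    if len(strand_5to3) != len(partner_3to5):
        raise ValueError("strands must be aligned to the same length")
    return Counter(classify_pair(a, b)
                   for a, b in zip(strand_5to3, partner_3to5))


# Toy P1-like helix (hypothetical): exon strand ending in U, facing the IGS.
exon_strand = "CUCUAU"   # 5'->3'
igs_strand = "GGGAGG"    # 3'->5'
print(helix_summary(exon_strand, igs_strand))
```

In the toy helix, only two of six positions are C-G pairs; the rest are one U-A pair, two U-G wobble pairs and one unpaired position, which is the kind of composition the text describes as weak pairing.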
From the results above, the conserved 1, 2 and 3 regions are much easier to find in 28S rDNA introns than in 18S rDNA introns. Site 3 in 18S rDNA and site 2 in 28S rDNA are hot spots for intron insertion, and the conserved regions are much easier to find in introns at these sites than in introns at sites 1, 2 and 4 of 18S rDNA or site 1 of 28S rDNA. Cenococcum geophilum is one of the most common ectomycorrhizal fungi, and the conserved 1, 2 and 3 regions are much easier to find in its 18S and 28S rDNA introns than in those of other fungal species. It seems that the more readily a host site, an intron sequence and secondary structure, or an isolate accommodates 18S and 28S rDNA intron insertion and deletion, the more common it is.