Genetic Analyses of the Internal Transcribed Spacer Sequences Suggest Introgression and Duplication in the Medicinal Mushroom Agaricus subrufescens

The internal transcribed spacer (ITS) region of the nuclear ribosomal RNA gene cluster is widely used in fungal taxonomy and phylogeographic studies. The medicinal and edible mushroom Agaricus subrufescens has a worldwide distribution with a high level of polymorphism in the ITS region. A previous analysis suggested notable ITS sequence heterogeneity within the wild French isolate CA487. The objective of this study was to investigate the pattern and potential mechanism of ITS sequence heterogeneity within this strain. Using PCR, cloning, and sequencing, we identified three types of ITS sequences, A, B, and C with a balanced distribution, which differed from each other at 13 polymorphic positions. The phylogenetic comparisons with samples from different continents revealed that the type C sequence was similar to those found in Oceanian and Asian specimens of A. subrufescens while types A and B sequences were close to those found in the Americas or in Europe. We further investigated the inheritance of these three ITS sequence types by analyzing their distribution among single-spore isolates from CA487. In this analysis, three co-dominant markers were used firstly to distinguish the homokaryotic offspring from the heterokaryotic offspring. The homokaryotic offspring were then analyzed for their ITS types. Our genetic analyses revealed that types A and B were two alleles segregating at one locus ITSI, while type C was not allelic with types A and B but was located at another unlinked locus ITSII. Furthermore, type C was present in only one of the two constitutive haploid nuclei (n) of the heterokaryotic (n+n) parent CA487. These data suggest that there was a relatively recent introduction of the type C sequence and a duplication of the ITS locus in this strain. Whether other genes were also transferred and duplicated and their impacts on genome structure and stability remain to be investigated.

The internal transcribed spacer (ITS) region of the nuclear ribosomal RNA gene cluster is widely used in fungal taxonomy and phylogeographic studies. The medicinal and edible mushroom Agaricus subrufescens has a worldwide distribution with a high level of polymorphism in the ITS region. A previous analysis suggested notable ITS sequence heterogeneity within the wild French isolate CA487. The objective of this study was to investigate the pattern and potential mechanism of ITS sequence heterogeneity within this strain. Using PCR, cloning, and sequencing, we identified three types of ITS sequences, A, B, and C with a balanced distribution, which differed from each other at 13 polymorphic positions. The phylogenetic comparisons with samples from different continents revealed that the type C sequence was similar to those found in Oceanian and Asian specimens of A. subrufescens while types A and B sequences were close to those found in the Americas or in Europe. We further investigated the inheritance of these three ITS sequence types by analyzing their distribution among single-spore isolates from CA487. In this analysis, three co-dominant markers were used firstly to distinguish the homokaryotic offspring from the heterokaryotic offspring. The homokaryotic offspring were then analyzed for their ITS types. Our genetic analyses revealed that types A and B were two alleles segregating at one locus ITSI, while type C was not allelic with types A and B but was located at another unlinked locus ITSII. Furthermore, type C was present in only one of the two constitutive haploid nuclei (n) of the heterokaryotic (n+n) parent CA487. These data suggest that there was a relatively recent introduction of the type C sequence and a duplication of the ITS locus in this strain. Whether other genes were also transferred and duplicated and their impacts on genome structure and stability remain to be investigated.

Introduction
The internal transcribed spacer (ITS) region of nuclear ribosomal DNA (rDNA) has been proposed as a universal DNA barcode marker for fungi [1]. It is widely used in fungal taxonomy and phylogeny due to a high success rate in PCR amplification and a high degree of interspecific variation that can distinguish between most of the closely related fungal species. It is generally believed that multicopy genes such as the nuclear ribosomal RNA gene cluster undergo frequent concerted evolutionary process, making the copies identical or highly similar to each other [2][3][4][5]. The two most commonly proposed mechanisms for concerted evolution are gene conversion [6] and unequal crossing-over [7]. However, numerous studies revealed that the multicopy genes, such as ITS, do not perfectly follow the concerted evolution, and exhibit intra-strain and intra-species variation in a wide variety of organisms [3,5,8]. For example, intra-strain or intra-species variation in ITS sequences have been reported in several fungal genera and species such as Fusarium [9], Scutellospora [10], Ganoderma [11], Xanthophyllomyces [12], Laetiporus [4] and more recently in Ophiocordyceps sinensis [5].
Agaricus subrufescens Peck, the "almond mushroom", is a species commercially cultivated for its nutritional and medicinal values [13][14][15]. This species has been recorded from different continents: the Americas, Asia, Europe, Oceania [13,16], and recently in Africa [17].Thongklang et al. [16] showed that isolates of A. subrufescens from Asia, Europe and South America were inter-fertile and had an amphithallic life cycle (i.e. with both the heterothallic and pseudohomothallic life cycles) where both homokaryotic and heterokaryotic spores can be produced in the same sporocarp. Agaricus subrufescens is a morphologically and genetically highly variable species with a high level polymorphism in the ITS regions [13]. Direct PCR and sequencing of strains with a high level of intragenomic ITS polymorphism often yield chromatographs with many heterozygous sites [13]. Chromatographs with many heterozygous sites have also been reported in other species of the genus Agaricus [18] and other Basidiomycota such as Hebeloma aminophilum [19].
The wild isolate of A. subrufescens CA487 collected at Saint-Leon, France, is one of the specimens previously studied by Thongklang et al. [16]. As in many specimens of this species the ITS sequence of CA487 was highly heterozygous with the sequences of two homokaryotic single spore isolates CA487-S100 and CA487-S42 (GenBank accession numbers KJ541799 and KJ541800 respectively) differing at ten nucleotide sites. However, after reexamining the chromatogram of the ITS region of the parental isolate CA487, we noted that differences between the two progeny could not explain several sites with double peaks and triple peaks (Fig 1).
Here we tested the hypothesis that the ITS chromatogram of isolate CA487 consists of three different types of ITS. If the hypothesis was correct, we further investigated the putative origins of the divergent ITS sequences by comparing our ITS sequences with those from other sources. In addition, we examined the inheritance patterns of the divergent ITS sequences. Our results indicate that there are indeed three equally abundant ITS sequences within this strain, with two types of ITS (A and B) segregating as two alleles at a first locus 'ITSI', while a third nonallelic type of ITS (C) was located at a second unlinked locus 'ITSII'. Interestingly, the types A and B sequences were closely related to those from Europe and the Americas and the type 'C' was similar to ITS sequences found in Asian and Oceanian specimens of the species.

Field collections and single spore isolates
The strain CA487 was isolated by tissue culture from a field specimen collected by J.C. Blanchard in his own garden at Saint-Leon (20 km from Bordeaux), France, on September 28 th , 2006 (identified by J. Guinberteau). A second specimen CA516 was collected by P. Callac and J. Guinberteau at the same site on October 20 th , 2006.
For sporocarp production in an environmentally controlled culture room, the mycelium of strain CA487 was cultivated following the method described in Llarena-Hernández et al. [20]. Single spore isolates (SSIs) were obtained from spore prints directly from cultivated sporocarps following the method described by Thongklang et al. [16]. After spore germination, 120 monospore cultures were isolated and subcultured on compost extract medium (aqueous extract of pasteurized commercial mushroom compost plus 1% glucose and 2% agar).
In addition to the sequences obtained from strain CA487 and its SSIs, sequences from 18 specimens were used for sequence comparison. Among them, 13 sequences were retrieved from GenBank and these sequences have been previously used by Kerrigan [13], Zhao et al. [21], Thongklang et al. [16] and Gui et al. [22]. Four of the 18 sequences from strains ZRL2036, ZRL2134, NT001 and CA935 were generated in previous studies [21,23] but were not submitted until now through this work. The remaining strain CA864 was also added to our analyses for comparison; it was isolated from a wild specimen collected on distillation residue of essential oils of Thuja, at about 300 km away from where CA487 was collected. CA864 was collected by J.C. Chasles (with morphological identification by J. Guinberteau) on November 5 th , 2010 at Saint Lézin, Chemillé-en-Anjou, France. The samples from UK and Martinique have been initially identified as Agaricus rufotegulis (syn. A. subrufescens) and Agaricus fiardii, respectively. DNA isolation and PCR amplification DNA extraction was performed for 94 SSIs following the protocol described by Zhao et al. [21]. DNA concentration was adjusted to 25 ng/μl. PCR amplification was performed in 30 μl of reaction mixture containing 50 ng genomic DNA, 0.5 μM of each primer, 0.2 mM dNTPs, 2 μg BSA, 1 U Taq DNA Polymerase and 1x incubation buffer as described by Zhao et al. [21]. Sequences of the primers are indicated in Table 1.

Cloning and sequencing
PCR products of parental isolate CA487 were purified either directly or after agarose gel electrophoresis and band excision using the QIAquick PCR purification kit (Qiagen, Hilden, Germany) or the Gel Extraction kit (Qiagen, Hilden, Germany). The purified PCR products were then ligated into a pMD19-T vector (Takara, Tokyo, Japan) at 16°C overnight. Recombined plasmids were transformed and replicated into competent cells of Escherichia coli strain Top10 (TianGen Biotech., Co., Ltd., Beijing, China). Plasmids containing the cloned PCR products were extracted using QIAprep Miniprep Kit (Qiagen, Hilden, Germany), then was used as template for sequencing as described in Zhao et al. [21].

Phylogenetic analysis
The sequences produced from this study plus those retrieved from GenBank were aligned using BioEdit v. 7.0.4 [24]. The maximum parsimony (MP) analysis was performed with PAUP Ã 4.0b10 [25] by heuristic searches with unordered characters, random addition of sequences, gaps included as coded characters, and used tree bisection-reconnection (TBR) branch swapping for optimization. The MP analysis included 1000 bootstrap replicates.

Map position of markers
Co-dominant markers were developed firstly to distinguish between the homokaryotic and heterokaryotic single spore isolates, and secondly to distinguish the different types of ITS present in the single spore isolates. Thongklang et al. [16] showed that in strains of A. subrufescens such as CA487, the basidia produce not only classical homokaryotic spores but also a variable percentage of heterokaryotic spores. Because at least 99% of the basidia were tetrasporic, it was hypothesized that ability to produce both types of spores was likely due to one additional round of post-meiotic mitosis occurring in a variable proportion of basidia and generating eight haploid nuclei [16,26]. In such basidia each basidiospore receive two nuclei would be heterokaryotic. The rate of heterokaryotic spores depends on the parent strain but is also sensitive to the environmental conditions [13] and can vary from 40% to 75% [16].
In a theoretical random tetrasporic model proposed by Rocha de Brito et al. [26], the nuclei are paired at random and the expected loss rate of parental heterozygosity among the heterokaryotic offspring is 33%. This applies to any locus including the centromeres; subsequently, two-third receive non-sister nuclei (i.e. each one per second meiotic division), while the remaining one-third receive sister nuclei. The heterokaryons receiving non-sister nuclei are, therefore, highly heterozygous in the centromeric region and a co-dominant marker tightly linked to a centromere would be able to reveal heterokaryosis [26]. However, such a marker is not appropriate to identify heterokaryons containing sister nuclei, which are highly homozygous in the centromeric region. It can be noted that, in different studied offspring, Thonhklang et al. [16] and Rocha de Brito et al. [26] generally observed a loss rate of parental heterozygozity of about 50% at markers unlinked to the centromere due to crossovers, i.e. higher than the 33% expected in the theoretical model. Therefore, both centromere-linked and -unlinked markers are needed to reliably identify heterokaryotic offspring.
EST sequences of A. subrufescens identified by Foulongne-Oriol et al. [27] with putative homologs in the genome of Agaricus bisporus were selected. Information about their physical positions on the A. bisporus genome was used to develop informative molecular markers to distinguish between the different expected types of spores among the CA487 SSIs. First, we selected an A. subrufescens sequence (ci isotig00274, GenBank: GBEJ01008187) for which A. bisporus homolog was close to MAT and the centromere (PRS088 marker), and its tight linkage with the centromere was recently confirmed in A. subrufescens [26] and more generally a highly conserved macro-synteny between the two species [28]. Then, two other A. subrufescens sequences, for which A. bisporus homologs were found in distal positions on chromosomes X and III were used to develop markers PRS016 and PRS049, respectively (Table 1). PRS016 and PRS049 were derived from cl isotig01118 (GenBank: GBEJ01009029) and cl isotig1243 (Gen-Bank: GBEJ01009154) transcribed RNA sequences, respectively.

Molecular markers genotyping
Cleaved Amplified Polymorphic Sequences (CAPS) markers were previously used for mapping in A. bisporus [29][30] and to study the life cycle of A. subrufescens [16,26]. They exploit heteromorphic positions detected in the sequences. In this study, loci showing restriction fragment length polymorphism between alleles within strain CA487 were included for analyzing the SSIs. After digestion with the appropriate restriction endonuclease, DNA fragments were separated by electrophoresis in 2% agarose gel. Co-dominant alleles and their constituent DNA fragment sizes are given in Tables 1 and 2.

Multilocus genotype testing
A multilocus genotype test previously used in A. bisporus by Kerrigan et al. [31][32][33] and in A. subrufescens by Thongklang et al. [16] and Rocha de Brito et al. [26] to determine the homozygote/heterozygote status of the SSIs (n vs. n+n) was also used here. Specifically, co-dominant single-locus markers that are heteroallelic in the parental strain are used to determine whether the individual SSIs were homozygotes or heterozygotes. SSIs heteroallelic at least at one of the loci are unambiguously heterokaryons. SSIs homoallelic at all loci are regarded as putative homokaryons. This multilocus genotype test requires markers that must be genetically independent (genetically unlinked loci) and is more stringent if one of the markers is tightly linked to a centromere [16]. Contingency Chi-square tests (df = 1, p = 0.05) were used to determine the linkage relationships among the markers.

Types of ITS inferred from 284 clones
Since double or triple peaks were observed in chromatograms by direct sequencing, PCR products of the heterokaryotic parental strain CA487 were cloned to reveal the ITS sequence diversity within the strain. The sequences obtained from 284 clones did not exhibit any double or triple peaks. In the ITS alignment of the 284 sequences, 35 of the 662 aligned nucleotide sites were variable.
Among the 284 clones, those having identical sequences were grouped in the same type; 164 clones were grouped into three major types that we named ITS types A, B, and C. Type A (59 clones) was characterized by having unique nucleotides at six positions (39, 122, 130, 145-146, and 200); type B (47 clones) by having unique nucleotides at four positions (114, 334, 559, and 649); and type C (58 clones) by having unique nucleotides at three positions (269, 466, 475). Together, alignment of the 164 cloned ITS sequences showed polymorphism at 13 positions (8 in ITS1, 1 in 5.8S, and 4 in ITS2) as indicated in Table 3. The remaining 120 sequences differed from the main ITS types A, B, or C either by nucleotide substitutions at single or rarely two unrelated positions or were putative chimeras due to recombination between the main ITS  types. The A, B, and C type sequences are the "dominant" sequences and the focus of our study while others may represent true minor variants or PCR/sequencing artifacts. A more intense sequencing is needed to confirm whether the minor variants represent authentic variants or PCR artifacts. The ratio between the observed numbers of clones of types A, B, and C among the 164 sequenced clones (59, 47, and 58, respectively) did not differ significantly from a 1:1:1 ratio (Chi-square value = 1.62, df = 2, p = 0.44). Three representative sequences of each of ITS types A, B, and C sequences are deposited in GenBank ( Table 3).

Confirmation of the stable presence of three different ITS sequences within isolate CA487
In the original ITS sequence chromatograph of isolate CA487, we observed some heteromorphisms and triple peaks that could not be explained by superposing the ITS sequences of two homokaryotic single spore isolates CA487-S100 and CA487-S42 obtained by Thongklang et al. [16]. We hypothesized that a third ITS type might be present in isolate CA487. As explained in Fig 1, based on the original chromatograph of strain CA487 and the chromatographs of its two SSIs, it has been possible to reconstitute the three types of ITS with the exception of the character G at position 334 located in the 5.8S region. This G was masked by the flanking G of the sequences of types A or C in both the forward and reverse strand, respectively (Fig 1). This polymorphism was detected in another sequence obtained by using a forward primer located between the length heteromorphism of the position 39 and the position 334 (data not shown). It was also detected in the sequence of CA487-S42. These three sequences were confirmed by results from sequencing of cloned ITS PCR products. The ITS sequences of the single spore isolates CA487-S100 and CA487-S42 [16] were found to be identical to sequences of types A and B, respectively. To further verify our results and eliminate the possibility of contamination, we also extracted DNA from additional tissues of the originally collected fruiting body (herbarium specimen CA487) as well as fruiting bodies produced by growing the mycelia of this specimen (isolate CA487). Our results showed that the presence of the three types of ITS was stable. In addition, we also sequenced the specimen CA516 collected in the same site about one month later. Chromatographs with similar double and triple peaks were obtained in all cases.

ITS sequences comparison of selected samples from different continents
Representative clones (ITS of types A, B, and C) of isolate CA487 were compared with sequences of 18 wild specimens from Americas, Asia, Europe, and Oceania. Among three samples that were highly heterozygous for the ITS sequences, two isolates had clear haploid a In this alignment, absent nucleotides are represented by a dash and characters specific to each type of ITS appear in bold type. sequences of the constitutive nuclei and thus they were included in the comparison, while the remaining highly heterozygous sequence was excluded. From the total alignment, 29 positions were found variable; polymorphisms at these positions are shown in S1 Table. At nine of the 29 positions the polymorphisms are parsimony informative (i.e. with each nucleotide shared by at least two groups of samples) and allowed us to distinguish three groups of ITS, each being characterized by its own unique set of nucleotides (haplotype) at these 9 positions (Table 4). Specifically, two of the three groups of sequences are characterized by exactly the same six and three nucleotides which characterized respectively the types A and C of strain CA487 (S1 Table). The sequences of the third group have the same set of nucleotides (haplotype) than as the type B of CA487 at the nine polymorphic positions. However, the four nucleotides which characterized the type B sequence within isolate CA487 were not shared by any other sequences of our sample and Samples or sequences with one type of ITS

Samples possessing two or three different types of ITS
a Absent nucleotides are indicated as deletion by a dash; ITS type A or type C-specific characters are in bold type.
b He = heterokaryotic genotype (n+n); Ho = homokaryotic genotype (n) were obtained from clone (C), from spore (S), or deduced (D) from the electropherogram of a heterokaryon by interpreting the double peaks. c Heterokaryotic specimens possessing several types of ITS; their haploid constituents A, B, or C, are listed above in the thus appeared as non-parsimony informative. Other sequences of type A or C have also one or several polymorphisms at other position not shared by all sequences within each clade. Several samples from Europe and Americas such as the French strain CA864 and the Brazilian strain WC837 that were highly heterozygous possessed in fact both haploid sequences of type A and B (Table 4). This is also likely the case for the Californian sample SBRF, but this was not formally shown because haploid sequences were not available. For this reason, the sequence of SBRF was not included in the phylogenetic analysis.
The relationships among the 22 ITS sequences are shown in Fig 2. The results indicated the types A and C ITS sequences belonged to two distinct lineages with strong bootstrap support, while the sequences of type B form a paraphyletic group. Geographically, specimens with ITS types A, B or both A and B were found in the Americas and in Europe, while sequences of type C were found in specimens from Hawaii and Southeast Asia. The only exception was the type C sequence from the French isolate CA487 which also possessed the types A and B sequences.

Classification of the single spore isolates: homokaryons and heterokaryons
The heterozygosity level of SSIs (n vs. n+n) was determined on the basis of the homo-or hetero-allelic genotypes at three genetically independent loci, obtained by using three CAPS markers (Table 1 and S2 Table). Among the 94 SSIs, 26.60% (25/94) were heteroallelic at least at one of the three loci and were therefore considered as heterokaryons. The remaining SSIs were putative homokaryons and further subjected to ITS genotyping with PCR-RFLP. From Chi-square contingency tests, Mendelian segregation 1:1 was not rejected at each locus and no evidence of genetic linkage was found among the three loci (df = 1, p > 0.05 in all tests, see S3 Table). We also noted that, among the 25 heterokayons, the rates of homozygosity were 60% (15/25), 56% (14/25), and 40% (10/25) at three loci PRS49, PRS16, and PRS88 respectively, with an overall mean of 52%. As the locus PRS88 is tightly linked to the MAT locus, the rate of homozygosity at this locus of 40% is also an estimation of the rate of sister nuclei heterokaryons that are homozygous at the MAT locus and behavior as homokaryons [26]. This observed rate is close to the 33% expected in a theoretical model of random migration of the post meiotic nuclei in the spores [26].

ITS phenotypes and genotypes of the SSIs
The three ITS sequence types of A, B, and C can theoretically generate seven different combinations in offspring: A, B, C, A+B, A+C, B+C, and A+B+C. Among these seven possible combinations, only the pure type C was not found among the 94 SSI progeny. The restriction enzyme MboII has one recognition site present in both types A and C and a second recognition site present in only ITS type C (Table 1). However similar electrophoretic patterns are expected in two cases: for [C] and [A+C] and also for [B+C] and [A+B+C]. To ensure further distinction of the types, digestion using an additional enzyme FokI was employed and this enzyme has a recognition site present in both types B and C. Together, digestions using MboII and FokI distinguished all the phenotypes. As shown in Table 2, except the pure type C, all six other phenotypes were detected among the 94 single spore isolates. The number of each ITS phenotype and their putative genotypes are reported in Table 2 (20 + 11 = 31) bearing alleles ITSI-a and ITSI-b, respectively, segregate at locus ITSI. Homokaryons possessing the ITS type C (19 + 11 = 30) bear an allele ITSII-c segregating at the second locus ITSII with a corresponding null allele (ITSII-n as null in 39 homokaryons). Alleles at both loci ITSI and ITSII showed a balanced Mendelian segregation (Chi-square values were of 0.72 and 1.18, respectively, which means 1:1 segregation ratio is accepted; df = 1, p > 0.05). For the Chi-square contingency tests of genetic linkage between the two loci ITSI and ITSII, the parental: recombinant ratio was 30:39 (Chi-square value = 1.43, df = 1, p = 0.23), not significantly different from independent re-assortment. Similar results were obtained for ITSI, ITSII and the three markers used in this study (S3 Table) showing that alleles at all five loci were segregating normally and re-assorting independently.
Among the genotypes of the 25 heterokaryons at the ITSI locus, the observed pairing of the alleles in the spores is 4 a/a, 9 b/b, and 12 a/b, giving a homozygosity of 52% (13/25) which is similar to the rate observed on average at the three other loci PRS49, PRS16, PRS88 in the same offspring. This is also similar to the rate around 50% observed at various loci in different offspring by Thongklang et al. [16] or Rocha de Brito et al. [26]. At the ITSII locus genotypes ITSIIc/c and ITSII-c/n have the same phenotype [C] and thus cannot be distinguished from each other. At this locus, the observed homozygosity at ITSII for the null allele was 24% (6/25), which is also in agreement with a rate around 50% of homozygosity at this locus for the two alleles.
In conclusion, allelic segregation among both homokaryotic and heterokaryotic offspring agree with the presence of two unlinked ITS loci. ITS of types A and B are two alleles segregating at the ITSI locus. ITS of type C is a non-allelic homolog (a paralog) located at the second locus ITSII that is not linked to ITSI.

Discussion
Evidence for the presence of three different types of ITS The presence of three distinct types of ITS within one strain was supported by data from different methods. First they were inferred by interpreting the shifted sequences in the chromatograms of the isolate CA487 due to length differences among the three types. Similar chromatograms were obtained with CA516, a collection from the same site, indicating that a mycelium possessing these three ITS sequences was persisting at this location. Among the hundreds of ITS sequences of specimens in the genus Agaricus that we have obtained since over a decade ago, this was the first time we observed such stable and reproducible chromatograms with triple peaks. Then the presence of three distinct ITS sequence types was confirmed by the more reliable method of PCR cloning and sequencing. The observed number of clones bearing three types of ITS (A, B, and C) was not significantly different from the 1:1:1 ratio in our PCR clone library. The three types of ITS were also found in the offspring but none of 94 single spore isolates possessed only the type C.

Genetic hypotheses inferred from the analyses of the offspring
Among the 94 single spore isolates, 25 (27%) showed evidence of heterozygosity and were considered heterokaryotic. This is lower than the 44% estimated previously among a different set of offspring of the same strain by Thongklang et al. [16]. However, the previously estimated rate was likely an overestimate because their ITS marker could not distinguish ITS types B and C and homokaryons [AC] were interpreted as heterokaryons. In addition, variations in environmental conditions might influence the rate of heterokarotic offspring in both amphithallic species A. subrufescens and A. bisporus [31].
The presence of the different types of ITS and their segregation were separately analyzed in the homokaryotic and the heterokaryotic offspring. Based on the fact that the three types of ITS were equally prevalent in the parent strain, one simple hypothesis was that CA487 was a trikaryon, i.e. a heterokaryon composed of three types of genetically different nuclei [34] with each of them carrying the ITS of types A, B and C, respectively. The presence of homokaryons AC and BC, and the absence of homokaryons C in the homokaryotic offspring allow us to definitely reject this hypothesis. Instead, we observed allelic segregations among both homokaryotic and heterokaryotic offspring which agree with the presence of two unlinked loci. ITS of types A and B are two alleles at the ITSI locus. ITS of type C segregated with a null allele at the ITSII locus unlinked to ITSI. At ITSI, the rate of homozygosity was close to 50%, similar to the other loci. ITSI and ITSII are not linked to any other marker used in this study.
The locus ITSI may contain hundreds of copies of the ITS and here we assume the simplest explanation that all the copies of the allele ITSI-a were of type A and all the copies of the allele ITS-b were of type B. The hypothesis that a proportion of the copies of the allele ITS-a could be of type B sequence is possible but unlikely for two reasons. Firstly, the chromatograms of type A and type B sequences in types A and B homokaryons were very clean with no obvious minor peaks resulting from the differences between the two sequences. Secondly, wild strains with only type A or type B are found in Europe, North and South Americas and the Caribbean. Therefore, alleles A and B seemed to have diverged a long time ago. If types A and B sequences could be mixed in the same cluster within the same homokaryon, we should expect to find all the intermediate proportions between 100% A 0% B and 0% A 100% B. However, in the different offspring we have studied (present and previous studies), we have never found any homokaryon with both the type A and type B sequences.
The heterokaryon CA487 has two constituent nuclei bearing alleles ITSI-a and ITSI-b, respectively, but it remains unknown in which of the two nuclei the ITSII locus can be found. Unfortunately, attempts to de-heterokaryotize CA487 to obtain the two constitutive haploid nuclei have failed. The methods we used were the same protocols that have allowed us to successfully obtain homokaryons from heterokaryons in A. bisporus through protoplasting.
Our data indicated that ITSI is not linked to the centromere. Specifically, the heterokaryons that are homoallelic at the centromere-linked locus PRS088 are putative sister progeny heterokaryons and thus should be highly homozygous at all centromere-linked loci [26]. This is not the case at the ITSI locus since four of these ten heterokaryons are heterozygous ITSI-a/ b. As for the second ITS locus, it is possible that the null allele represents undetected ITS that could not be amplified with the current universal primers. However, we believe this is highly unlikely as we have not failed amplification of any samples from the genus Agaricus using the universal primers. The simplest hypothesis is that ITSII is a duplicate locus of the ITS introgressed from a homokaryon closely related to those from an Asian strain and that it is present in only one of the two constitutive haploid nuclei of the parent CA487. ITSII is not linked to ITSI and this genomic composition is persisting in the natural environment.
Geographic distribution ranges of the types of ITS and evidence for hybridization ITS sequences similar to or identical with the types A, B, and C sequences of isolate CA487 have been found in field specimens from different continents. The extensive differences among the three sequences suggest that these sequences have likely diverged from each other for a long time. However, the exact reason for their divergence is not known. These divergences may have occurred several millions years ago. Indeed, an approximate date of 2-3 Ma is indicated in the multigene Maximum Clade Credibility tree of Zhao et al. [35] for the divergence between a wild Chinese sample (zrl2012722) and a cultivar (CA276). The latter likely originates from Brazil, like most cultivars having a ITS sequence of type A/B. In addition, A. subrufescens is the earliest species arising in the A. sect. Arvenses approximately 12 Ma ago. The geographical distribution of the different types of sequences suggests that geographical isolation might have contributed to their divergence.
Kerrigan [13] analyzed the diversity of A. subrufescens and noted extensive heterozygosity in American samples. He suggested that hybridization might have occurred between populations. He also noted that the Hawaiian samples were highly diverged from samples in the Americas. Since 2005, many specimens have been collected in Europe and in Asia. Some European samples possessed similar heteromorphisms as the American samples while ITS sequences of the Chinese and Thai samples were similar or even identical to those of the Hawaiian samples [13,16,21,22]. Our analysis showed that American and European samples had ITS of type A, B or both A and B, while the Asian and Oceanian samples had ITS of type C. Thus, we were surprised to find an ITS sequence of type C in the French specimen CA487.
Our results suggest two possibilities. In the first, the type C sequence might have a distribution much broader than is currently known to cover Eurasia and even Africa. Incomplete sequences of old herbarium samples of A. subrufescens from DR Congo in Africa [17] showed a high sequence identity to type C sequences, suggesting that the type C might be present on this continent. Additional samples from Europe and Africa might help shed light on its distributions. In the second, the ITS of type C was recently introduced and introgressed from Asia/Oceania to Europe. The commercial cultivation of A. subrufescens remains sporadic in Europe and to our knowledge all or almost all the cultivated strains originated from America and most of them bear both types of ITS A and B sequences. Alternatively, human assisted migration of the three types of ITS sequences into the same ecological niche could have brought the different types together and resulted in their hybridization and subsequent generation of a strain such as CA487. We would like to note that CA487 was isolated in 2006, before A. subrufescens were formally described based on field specimens in Asia. In addition, to our knowledge, A. subrufescens was not commercially cultivated in Europe, and there was only few imports from Brazil [14].
Interestingly, since 2012, the Thai isolate CA918 possessing an ITS of type C has been cultivated and crossed with the Brazilian strain WC837 and the French strain CA487. These crosses generated strains with different combinations of ITS types such as AC or BC and these strains have been cultivated at INRA [16]. These observations showed that a Asian sample possessing the ITS of type C was inter-fertile with those bearing types A or B from Europe or Americas, giving viable and sometimes vigorous hybrids. Thus, it is conceivable a hybrid with all three ITS sequence types could be generated.
Are ITS loci I and II within functional rDNA clusters?
In strain WC837 of A. subrufescens, Rocha de Brito et al. [25] found that the ITS locus was linked to a locus PRS160 located in a sequence having a homolog in a distal position on chromosome VII of A. bisporus. This result indicated that the position of PR160 and/or of the ITS locus likely differed between these two species, despite their high level of macro-synteny [28]. In strain CA487, ITSI is most likely functional because offspring having only the ITSI-a or the ITSI-b alleles are viable. In contrast, the ITSII-c allele was never found alone in the single spore isolates, thus whether it is functional or not remains to be determined. Based on the chromatograms and the relative proportions of clones with the types A, B, and C sequences from strain CA487, the copy number of the type C ITS sequence within strain CA487 is likely very similar to those of types A and B. The long-term effects of such introgression and ITS duplication on the genome structure and function of A. subrufescens remain to be examined.
Supporting Information S1  Table. Evidence for Mendelian segregation between alleles within each of the five loci and independent re-assortment between all pairs of loci. (DOCX)