Discovering cryptic species in the Aspiciliella intermutans complex (Megasporaceae, Ascomycota) – First results using gene concatenation and coalescent-based species tree approaches

Taxonomic identifications in some groups of lichen-forming fungi have been challenge largely due to the scarcity of taxonomically relevant features and limitations of morphological and chemical characters traditionally used to distinguish closely related taxa. Delineating species boundaries in closely related species or species complexes often requires a range of multisource data sets and comprehensive analytical methods. Here we aim to examine species boundaries in a group of saxicolous lichen forming fungi, the Aspiciliella intermutans complex (Megasporaceae), widespread mainly in the Mediterranean. We gathered DNA sequences of the nuclear ribosomal internal transcribed spacer (nuITS), the nuclear large subunit (nuLSU), the mitochondrial small subunit (mtSSU) ribosomal DNA, and the DNA replication licensing factor MCM7 from 80 samples mostly from Iran, Caucasia, Greece and eastern Europe. We used a combination of phylogenetic strategies and a variety of empirical, sequence-based species delimitation approaches to infer species boundaries in this group. The latter included: the automatic barcode gap discovery (ABGD), the multispecies coalescent approach *BEAST and Bayesian Phylogenetics and Phylogeography (BPP) program. Different species delimitation scenarios were compared using Bayes factors species delimitation analysis. Furthermore, morphological, chemical, ecological and geographical features of the sampled specimens were examined. Our study uncovered cryptic species diversity in A. intermutans and showed that morphology-based taxonomy may be unreliable, underestimating species diversity in this group of lichens. We identified a total of six species-level lineages in the A. intermutans complex using inferences from multiple empirical operational criteria. We found little corroboration between morphological and ecological features with our proposed candidate species, while secondary metabolite data do not corroborate tree topology. The present study on the A. intermutans species-complex indicates that the genus Aspiciliella, as currently circumscribed, is more diverse in Eurasia than previously expected.

Introduction species delimitation methods have otherwise not been used until now in Megasporaceae species, particularly not in studies of the A. intermutans complex. We used a combination of phylogenetic strategies to delimit species in the Aspiciliella intermutans complex applying coalescent-based approaches and other recently developed DNA-based methods.
The objective of this study is to delimit species boundaries within the Aspiciliella intermutans complex by implementing single-and multi-locus phylogenetic analyses, and coalescentbased species delimitation methods.

Specimen sampling and phenotypic study
We gathered 90 samples from Asia and Europe; they were mostly collected by the authors in Iran, Armenia and the Czech Republic. Specimens were also obtained from the following herbaria: B, GZU, PRA, PRC, PL, CR and GLM.
Locations of sampels: Molecular data of specimens tentatively referred as Aspiciliella intermutans were analyzed using 66 ITS, 51 MCM7, 61 nuLSU and 73 mtSSU sequences as they have been shown to be useful to resolve phylogenetic relations in this group of lichenized fungi ([e.g. [1,5,6,27]). Five samples of Aspiciliella cupreoglauca were selected as outgroup, since this species is closely related to Aspiciliella intermutans [5].
Morphology and anatomy of the specimens were observed, using a Leica M165 C stereomicroscope, and a Leica DM 2500 P compound light microscope connected to a Leica MC 190 HD digital camera. Detailed observations of thallus anatomy, asci, ascospores and conidia were made on hand-cut sections of thallus, apothecia and pycnidia mounted in tap water.
Chemistry of the specimens was investigated by thin layer chromatography (TLC), using solvent systems B and C, and by spot tests, using KOH, C and I, applied directly on the lichen thalli [28].

DNA extractions, PCR amplifications and sequencing
Total DNA was extracted from freshly collected material according to Park et al. [29] but with some modifications as described in Zakeri et al. [5,6].

Marker
Primer parameters were initial denaturation for 5 min at 95˚C, followed by 30 cycles of 30 secs at 95˚C, 30 secs at 54˚C for amplifying ITS, 64˚C for nuLSU, 59˚C for mtSSU, 58˚C for MCM7, and 1 min at 72˚C, a final extension step of 3 min at 72˚C was added, after which the samples were kept at 4˚C. The amplification products were visualized by electrophoresis on 1% agarose gels and stained with ethidium bromide, and were purified by adding 2 μL ExoSAP (Exonuclease 1-shrimp alkaline phosphatase, USB) to 5 μL of the PCR products, followed by a heat treatment of 15 min at 37˚C and 15 min at 80˚C. Both DNA strands of the PCR product were sequenced on an ABI 3730 by the Bik-F Laboratory Center in Frankfurt am Main.

Sequence alignment
Sequence fragments generated for this study were assembled and edited using BioEdit v. 7

Phylogenetic analyses
Maximum likelihood (ML) analyses were performed for the single loci separately and for the concatenated dataset using the online version of IQ-TREE [38][39][40] with simultaneous inference of the optimal partitioning scheme and substitution models for each data partition. For concatenated dataset, a partition of seven subsets has been proposed (Group I intron, ITS1, 5.8S rDNA, ITS2, mtSSU, nuLSU and MCM7). Conflicts among the single gene trees were assessed using the tree topology and following the criterion of nodes bootstrap support equal or above 95% in one but below this value in the other tree for no conflict. No conflicts were found and thus all the datasets were concatenated. Branch support was estimated with the ultrafast bootstrap algorithm [41] based on 1000 bootstrap replicates and using a maximum of 1000 iterations and a minimum correlation coefficient of 0.99 as a stopping rule. The ITS PCR product obtained was around 800 bp. The larger sizes of ITS were due to the presence of insertions of about 200 bp identified as Group I intron [42] at the 3' end of the SSU rDNA. Group I intron were present in all the samples of Aspiciliella intermutans complex.
We used the Markov Chain Monte Carlo approach implemented in MrBayes v.3.2.2 [43] to infer phylogenetic trees applying the partitioning scheme inferred with IQTREE and slightly simplified substitution models, because most of the models inferred by IQ-TREE are not implemented in MrBayes. See Table 2 for details on locus partitions and substitution models. Two independent runs, each with four heated Metropolis-Coupled MCMC chains ("temperature" 0.2) were initiated and run for 5,000,000 generations, with tree and parameter sampling Study on the Aspiciliella intermutans complex every 100 th generations. Convergence among the parameters of the both runs was assessed in Tracer v.1.6 following the criterion of effective sample size (ESS) values above 200. The consensus tree was generated after discarding the initial 25% trees.

Species delimitation analyses
In order to infer the most likely species numbers in our Aspiciliella intermutans dataset, we chose a method for single-locus DNA-based species delimitation, the automatic barcode gap discovery [44], and two coalescent-based models � BEAST [45] and BPP v.3.2 [46] to analyze the four loci concatenated dataset.
Our exploratory analyses showed that the ITS region is very variable in the A. intermutans group and the nuLSU is very conserved (results not shown) and thus they were not used for the final single-locus species delimitation analyses.
ABGD is an automatic procedure that sorts the sequences into hypothetical species based on the barcode gap. This method automatically finds the distance where the barcode gap is located [44]. The ABGD method was carried out for the mtSSU and MCM7 datasets using the Web interface at http://wwwabi.snv.jussieu.fr/public/abgd/abgdweb.html. Default parameters were chosen using Kimura 2-parameter (K2P) distances that correct for transition rate bias (relative to transversions) in the substitution process. The default for the minimum relative gap width was set to different values between 0.1 and 0.15.
We used the coalescent-based hierarchical Bayesian model � BEAST implemented in BEAST v.1.8.0 to estimate species tree following our proposed species delimitation scenarios. � BEAST incorporates the coalescent process and the uncertainty associated with gene trees and nucleotide substitution model parameters and estimates the species tree directly from the sequence data [47].
Species assignments are required a priori for � BEAST analyses. To designate species in � BEAST, we used four different scenarios (4, 5, 6 and 9 species) for assigning individuals to a 'candidate species', based on well supported groups in the phylogenetic tree obtained from the concatenated dataset. We used lognormal distributions for the relaxed uncorrelated rates for all loci, we selected a Yule process and gamma-distributed population sizes for the species-tree prior and a piecewise linear population size model with a constant root. Default values were used for remaining priors. Two independent Markov chain Monte Carlo (MCMC) analyses were run for a total of 100 million generations, sampling every 1000 th generations and excluding the 12500 trees as burn-in. We assessed the MCMC convergence and determined burn-ins by examining ESS values and likelihood plots in the program Tracer version 1.6. The posterior probabilities of nodes were computed from the sampled trees (excluding burn-in samples) using TreeAnnotator 1.8.0 [47].
We used the multispecies coalescent model implemented in the program BP&P v.3.2 [46] to estimate the posterior probability for each species scenario (4, 5, 6 and 9 species). BP&P incorporates coalescent theory and phylogenetic uncertainty into parameter estimation; and the posterior distribution for species delimitation models is sampled using a reversible-jump Markov Chain Monte Carlo (rjMCMC) method. We used the unguided species delimitation algorithm [48]. This algorithm explores different species delimitations and different species phylogenies, with fixed specimen assignments to populations. The program attempts to merge populations into one species and uses the nearest neighbour interchange (NNI) or subtree pruning and regrafting (SPR) algorithms to change the species tree topology [46]. We performed analyses using algorithms 0 and 1 using three prior theta (θ: is a parameter that is proportional to the product of the population size and the mutation rate) assigned to gamma distributions of G (2, 10), G (2, 100) and G (2, 1000), combined with root age (τ0) assigned G (2,2000). Rates were allowed to vary among loci (locus rate = 1), and the analyses were set for automatic fine-tune adjustments. Each reversible-jump Monte Carlo (rjMCMC) analysis was run for 100,000 generations, sampling each generation and specified a burn-in of the first 8 000 generations. Each analysis was run twice to confirm consistency between runs. Speciation probabilities greater than 0.95 were considered as supported species delimitations.
Since different species delimitation analyses supported different scenarios for the Aspiciliella intermutans complex, the most likely hypothesis of species boundaries was assessed using Bayes factor delimitation (BFD) test [49]. Bayes factor uses a Bayesian coalescent-based reconstruction of species trees to compare the different species delimitation models, calculating the marginal likelihood and does not require a guide tree. We calculated marginal likelihood estimates (MLEs) for four species delimitation scenarios (4, 5, 6 and 9) species. For each scenario, we reconstructed a species tree using � BEAST v.1.8.0. The analyses were run with the same substitution models used in MrBayes, a birth-death model for the species tree prior; population size model set to piecewise linear and constant root. The MCMC chain was run for 20,000,000 generations, sampled every 1000 th , and the first 25% of each run was discarded as burn-in. MLEs were estimated using path sampling (PS) and stepping-stone sampling (SS) in � BEAST, with 100 path steps, a chain length of 100,000 generations and likelihoods saved every 100 th generations. The Bayes factor was calculated as 2 × (marginal likelihood of the best model (model1)-marginal likelihood of alternative model (model2). Positive values of Bayes factor support the best model against alternative models.

Phylogenetic analyses
We generated a seven-locus data matrix consisting of 3205 aligned nucleotide positions. The matrix of the data sets had 183, 231, 139, 172, 1073, 507 and 900 unambiguously aligned nucleotide positions for Group I intron, ITS1, 5.8s, ITS2, nuLSU, MCM7 and mtSSU respectively. Table 2 summarized the best-fitting models of evolution for each locus. The multilocus data set was based on 79 individuals. A total of 218 new DNA sequences (ITS, nuLSU, mtSSU and MCM7) were generated for this study and were aligned with sequences obtained from Gen-Bank (Table 3). The partitioned ML analysis of the concatenated data matrix yielded the optimal tree with Ln likelihood value = -12042.961 The mean LnL value of the two parallel runs of the Bayesian analysis for the seven combined loci was -12512.24 with a standard deviation of ±2.17.
Since the topologies of the trees estimated from Bayesian methods and ML did not have any well-supported conflict, only ML topologies from IQ tree are shown with bootstrap and posterior probability values indicated on the nodes (Fig 1). In the concatenated tree topology,  specimens of A. intermutans were recovered in several distinct clades (named hereafter A, B, C and D), in some cases with low support especially in Bayesian analyses (Fig 1). In all the groups, subclades with a strong support were also found.

Species delimitation analyses.
Our exploratory analyses showed that the single gene tree were not congruent. Further, ITS region was resulted too variable and the nuLSU too conserved in the A. intermutans group (results not shown). The monophyletic groups B, C, D1, D2 and D3 obtained in the concatenated tree (Fig 1) were not reciprocally monophyletic in the single gene trees. The clades A1, A2, A3 and A4 were recovered in mtSSU gene tree and A3 in MCM7 gene tree (see S1 Fig and S2 Fig).
ABGD analyses applied to MCM7 and mtSSU dataset detected 12 and 9 candidate species respectively. The results of these analyses are shown in supporting information.
The results of the multispecies coalescent species delimitation method BP&P with different algorithms 0 and 1 are summarized in Tables 4 and 5. The two algorithms in BP&P Study on the Aspiciliella intermutans complex consistently distinguished 4-6 independent evolutionary lineages. While the 4, 5 and 6 species topologies had a speciation probability of >0.96 threshold in the runs with algorithm 1 and 0 (values ranged between 0.96-1.0), the nine species topology was only supported (0.95) with algorithm 1 and a prior distribution on ancestral population size (theta) with a gamma distribution of G (2, 1000). The Bayes factor delimitation results are provided in Table 6. BFD test supports the 6-species delimitation scenario model as the best scenario over other models tested, i.e. four, five and nine species delimitation scenarios. The 6-species scenario corresponds to the well supported clades: A1+A2, A3, A4, B, C and D. Clades C and D were not strongly supported in the concatenated gene tree. However, these were supported by the � BEAST species tree (Fig 2).

Phenotypical and chemical examination of Aspiciliella intermutans samples
Chemistry. Two chemotypes were detected by TLC: 1) with norstictic acid and traces of connorstictic acid; 2) with norstictic, stictic and traces of cryptostictic acids. Similar results of a chemical analysis of A. intermutans from Greece were published by Sipman & Raus [50]. The second chemotype was only detected in some samples of the clade D, and the first chemotype was found in all clades. The two chemotypes detected inside the A. intermutans complex do not discriminate putative lineages. Prior distributions on ancestral population sizes (theta) assigned to different gamma distributions (G) combined with root age (tau) assigned to G (2,2000).

Morphology.
No distinct morphological features were observed to segregate candidate species, except for the areoles color and form. Clades A1+A2 and B are characterized by a thick thallus and larger areoles with a brown to red brown color, in contrast to the thinner thallus and smaller areoles with a gray to light brown color of the samples in clades A3 and A4. However, in samples from different parts of Europe, grouped in clades C and D, all these characters were totally intermixed. Although, the variability in the thallus form and color of the A. intermutans complex was considerable; these appeared more or less homogeneous in the samples from Iran and Armenia. Consequently, we tend to not consider them taxonomically relevant in this group.

Substrate preferences and distribution
The species of the Aspiciliella intermutans complex occur on siliceous or volcanic rocks in open habitats. The species grow on substrates with mesic (e.g. basalt) to low pH (e.g. andesite, granite and quartzite rocks). They often grow on exposed, more or less horizontal faces of outcrops. The clades C and D appear to have broad distributions, collected from geographically different regions; clades A1, A2, A3, A4 and B consisted only of specimens collected in western north Iran, Armenia and Azerbaijan.
Our study uncovers cryptic species diversity in the widespread Mediterranean microlichen species, Aspiciliella intermutans, and adds another example of species complexes to the microlichens. We show that A. intermutans as currently described is monophyletic but include several cryptic lineages. Our multispecies coalescent species delimitation approaches and BFD test supported the occurrence of at least six candidate species in the A. intermutans complex. Our results are in the accordance with a previous study showing different genetic clusters within the A. intermutans group [5].
The results of phylogenetic analyses of single locus datasets were incongruent in A. intermutans group, suggesting incomplete lineage sorting in the loci or ongoing gene flow or long generation times. These dramatically increase the amount of time required for a species to be recognized under a strict monophyly criterion [56]. The phylogenetic species delimitation criteria such as reciprocal monophyly has been widely used to recognize species however, this criterion may fail to accurately delimit species boundaries in recently diverged species [57]. Multispecies coalescent-based approaches offer a statistic framework for testing species boundary in group with recent diversification histories and does not require species to be reciprocally monophyletic in single gene trees [58,59].
Since the results of single locus datasets were inconclusive, a multilocus data set from A. intermutans samples was analyzed in multispecies coalescent-based approaches. Mitochondrial SSU tree topology recovered most lineages identified in the combined analyses, suggesting that a strong phylogenetic signal from mtSSU may have been dominating the analysis of the concatenated genes. The distinctiveness of the different monophyletic clades in the concatenated phylogenetic tree is clear but phylogenetic analyses alone are insufficient (lack of a strong support in some clades) to draw conclusions on species boundaries in this complex. Therefore, we used a combination of methods to address the species delimitation problems in this complex.
Although the species delimitation analyses indicated the presence of different species lineages in the A. intermutans complex, we followed a conservative approach as suggested by � BEAST and BP&P, which is sensitive enough to detect evolutionarily distinct lineages at very shallow timescales [60]. Based on all of the available evidence, including the multispecies coalescent-based species delimitation inferences � BEAST, BP&P, and Bayes Factor test, we could objectively circumscribe six candidate species within the A. intermutans complex.
We also examined chemistry, morphology, substrate preferences and geographic distribution of the samples, in order to correlate with tree topology. Generally, we found little corroboration between morphological and ecological characters with our candidate species, and also secondary metabolite data provided only a very limited support for the putative species. However, that is a case as in other cryptic lineages of lichenized-fungi. The presence of cryptic species in lichen-forming fungi without any recognizable phenotypical characterization of the clades has been demonstrated in some studies [61].
We did not try to extract DNA of the A. intermutans type material from St. Laon [10], because the material was too old to obtain a sequence. However, we obtained DNA from an A. intermutans sample (France, Bretagne, C. Roux 26741) collected 160 km northwest of St. Laon, and the sample is grouped in the clade D together with samples from other parts of Europe. We suggest the candidate species D (clade D) as the putative species for A. intermutans s. str. and all other clades as putative different cryptic species for A. intermutans complex.
We know that all published taxa with the characteristics of Aspiciliella should eventually be compared with the species distinguishable in this group in order to recognize any names with priority or synonyms. However, this should be the last step after the taxa have been convincingly identified and circumscribed in a manner that those older type specimens can be doubtlessly affiliated to them. Currently, our study suggests that it is not possible to delimit such taxa convincingly, and therefore it is not possible for us on the moment to decide about valid taxonomic names.

Characters of the putative species in the Aspiciliella intermutans complex
The phenotypic and geographic features corresponding to the different clades are summarized in Table 7.
Clades A1+A2 were recovered as a well-supported lineage (BS = 94/PP = 0.1), sister to the clade A3 with strong nodal support (BS = 94/PP = 0.1) they show some differences in ecological and morphological patterns. The samples of clade A1 developed a gray-brown to lightbrown thallus with pale marginal lines and large (0.3-2 mm) and irregular areoles; some areoles come over others and show a superimposed structure. They are commonly found at medium to higher elevation (1800-3044 m) in the North and North-West of Iran and in Armenia.
Clade A3 was recovered as a well-supported lineage (BS = 100/PP = 0.1). The samples develop a gray thallus, which is light brown with pale marginal lines at the edge of the thallus and in the middle totally gray and pruinose; areoles are smaller (0.3-1 mm) than samples in A1+A2 clades and they do not show an irregular structure. They are commonly found at lower to medium elevation (1390-1850 m) in Armenia. Clade A4 was recovered with a high nodal support (BS = 100/PP = 0.1) and includes samples morphologically similar to clade A3, but prefers substrates with lower pH. The samples were collected from Azerbaijan, Armenia and Northwest of Iran, at medium elevation (1300-1900 m).
Clade B was recovered with a low nodal support (BS = 74/PP = 0.89) and includes samples from Azerbaijan, Armenia, the Northwest and Northeast of Iran, at medium to high elevation (1885-1992 m). The samples are morphologically similar to those in clade A1, but prefer substrates with higher pH.
Clade C was recovered with a low nodal support (BS = 39) and it occurs mostly on basaltic rocks at low to medium elevations, 0-1030 m. It has been collected from the Czech Republic, Slovakia, Greece, France and Spain. It has been recorded also on andesite rocks, in medium to high elevations (1305-1898 m) in northwestern of Iran.
Clade D was recovered with a medium support (PP = 75) and it is sister to clade A-C with the remaining Aspiciliella intermutans samples (BS = 83/PP = 0.89). The samples have a wide Study on the Aspiciliella intermutans complex morphological variation and they are from Romania, Ukraine, Bulgaria, Italy, France and mostly from Greece (North Aegean Region), at low elevations (2-760 m), mostly on granite rocks. Some samples in this clade show a chemotype with norstictic acid, stictic acid and a trace of cryptostictic acid, observed exclusively inside this clade

Conclusions
Our study of the Aspiciliella intermutans species complex indicates that the genus Aspiciliella and the A. intermutans complex are more diverse in Eurasia than previously expected. Further, this study highlights the occurrence of cryptic species in microlichens. Our sampling was more intensive in Armenia, Iran, Greece and the eastern part of Europe however the samples from other distribution ranges of this species complex were either scarce or absent. While four of the six species-level lineages within the A. intermutans complex were restricted to Armenia and Iran, only two broadly distributed lineages were found in samples from Greece and also other parts of Europe (Fig 3). Based on our results we suggest that the Caucasian region could be the main biogeographical region for speciation in this complex.
Ultimately, rigorous sampling in other regions, such as Western and Central Europe, the Caucasian region, NE Iran, Turkmenistan Anatoly, Central Asia mountains (and also North America), will be needed to more accurately assess the distribution patterns of lineages within this group. Understanding the general geographic distribution of the candidate species identified in this study requires robust data from a broader geographic sampling, which may or may not correlate to additional lineages within the A. intermutans species-complex.
The authors plan a detailed revision of the A. intermutans species-complex in the near future, with more taxonomic, morphological and geographical sampling to characterize boundaries between the candidate species.