Variability in the Insect and Plant Adhesins, Mad1 and Mad2, within the Fungal Genus Metarhizium Suggest Plant Adaptation as an Evolutionary Force

Several species of the insect pathogenic fungus Metarhizium are associated with certain plant types, and genome analyses suggest a bifunctional lifestyle: as an insect pathogen and as a plant symbiont. Here we explored whether there was more variation in genes devoted to plant association (Mad2) or to insect association (Mad1) across the genus Metarhizium. Greater divergence in one of these genes may indicate whether host insect or plant is a driving force in adaptation and evolution in the genus. We compared variation in the insect adhesin gene, Mad1, which enables attachment to insect cuticle, and the plant adhesin gene, Mad2, which enables attachment to plants. Overall variation for the Mad1 promoter region (7.1%), Mad1 open reading frame (6.7%), and Mad2 open reading frame (7.4%) was similar, while it was higher in the Mad2 promoter region (9.9%). Analysis of the transcriptional elements within the Mad2 promoter region revealed variable STRE, PDS, degenerative TATA box, and TATA box-like regions, while this level of variation was not found for Mad1. Sequences were also phylogenetically compared to EF-1α, which is used for species identification, in 14 isolates representing 7 different species in the genus Metarhizium. Phylogenetic analysis demonstrated that the Mad2 phylogeny is more congruent with 5′ EF-1α than is the Mad1 phylogeny. This would suggest that Mad2 has diverged among Metarhizium lineages, contributing to clade- and species-specific variation, while it appears that Mad1 has been largely conserved. While other abiotic and biotic factors cannot be excluded in contributing to divergence, these results suggest that plant relationships, rather than insect host, have been a major driving factor in the divergence of the genus Metarhizium.


Introduction
Species within the genus Metarhizium are insect pathogenic fungi with a broad range of insect hosts. The genus was recently divided into several separate species based on a multilocus phylogeny [1]. The EF-1α sequence was found to be diagnostic for species identification. The population biology (and now species association) of Metarhizium had been assumed to be influenced primarily by host insect taxa [2-8]. That is, different species of Metarhizium were associated with different insect species. However, an association between Metarhizium species and habitat and/or plant types has been observed [9,10]. This represents a significant paradigm shift, in that it demonstrated that habitat/plant selection, not host insect selection, influenced the population structure of Metarhizium. In addition, M. robertsii has been shown to be rhizosphere competent [11-13], further supported by research demonstrating M. robertsii is an endophyte [14].
Metarhizium is phylogenetically related to the fungal grass endosymbionts Claviceps and Epichloë [15]. Genomic analyses also indicated that Metarhizium spp. are more closely related to endophytes and plant pathogens than to animal pathogens, suggesting that Metarhizium evolved from fungi that are plant associates [16].
Two adhesin genes have been identified that are specifically involved with insect pathogenesis and plant association, Metarhizium adhesin-like protein 1 (Mad1) and Metarhizium adhesin-like protein 2 (Mad2), respectively [17]. The MAD1 adhesin allows Metarhizium to adhere to insect cuticle, while the MAD2 adhesin enables attachment to plants, and the two genes were expressed differentially on their respective hosts [17]. Both proteins contain a middle region (domain B) that contains Thr-rich tandem repeats.
We propose three possible models of evolution within the genus Metarhizium: (1) insect host has caused divergence among species; (2) plant host has caused divergence among species; (3) other abiotic or biotic factors caused the divergence and evolution among Metarhizium species. In this study, we explored the genetic differences in 14 Metarhizium isolates, representing 7 different species, through sequence analysis (open reading frames and promoter regions) of the Mad1 insect adhesin and Mad2 plant adhesin genes. Sequences were also compared to the EF-1α gene, which allows for species identification [1], in order to infer evolutionary relationships.

Mad1 variability
Inter-isolate, interspecies, and intraspecies variation were calculated for the open reading frame and promoter regions through pairwise nucleotide comparisons. The greatest inter-isolate divergence within the open reading frame of Mad1 was 14.2%, found between isolates ARSEF 7486 (M. acridum) and ARSEF 6238 (M. guizhouense). However, when considering the average inter-isolate variation between species, the greatest interspecies divergence was 12.3% between M. acridum and M. majus, with the least divergence between M. robertsii and M. brunneum (2.9%). The overall average interspecies variation for the Mad1 open reading frame for all Metarhizium species examined was 6.7%. The average interspecies variation for the promoter region was 7.1%. For the open reading frame, the intraspecies variation was low in M. robertsii (0.2%) and M. brunneum (0.3%), while it was relatively higher for M. guizhouense (3.9%). Similarly, in the promoter region, intraspecies variation was low in M. robertsii (0.1%) and M. brunneum (0.1%), and higher for M. guizhouense (3.9%). The average estimated nonsynonymous/synonymous substitution rate ratio (dN/dS) for Mad1 was 0.20.
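The pairwise comparisons described above can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length and uses a simple percent-difference (p-distance) that skips gap and ambiguous sites; the software and distance measure actually used in the study are not stated here, and the isolate names, species labels, and sequences below are hypothetical toy data.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Percent of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an 'N' are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * diffs / compared


def mean_divergence(alignment, species_of):
    """Return (mean intraspecies, mean interspecies) percent divergence."""
    intra, inter = [], []
    for a, b in combinations(sorted(alignment), 2):
        d = p_distance(alignment[a], alignment[b])
        (intra if species_of[a] == species_of[b] else inter).append(d)

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(intra), mean(inter)


# Hypothetical toy alignment: two isolates of one species, one of another.
toy_alignment = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",
    "iso3": "ACGAACGTTT",
}
toy_species = {"iso1": "sp_A", "iso2": "sp_A", "iso3": "sp_B"}

intra, inter = mean_divergence(toy_alignment, toy_species)
print(f"intraspecies {intra:.1f}%, interspecies {inter:.1f}%")
# prints "intraspecies 10.0%, interspecies 25.0%"
```

Here every between-species isolate pair is pooled into a single mean; averaging first within each species pair and then across pairs is an equally plausible reading of "average interspecies variation" and would give different values for unbalanced sampling.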
Initial analysis of the MAD1 proteins showed that M. robertsii isolates had a conserved protein length at 717 amino acids. The MAD1 protein for ARSEF 6238 (M. guizhouense) was also 717 a.a., while the Ontario isolates of M. guizhouense had proteins that contained 706 a.a. The MAD1 protein for M. brunneum isolates was 711 a.a., while the M. pingshaense MAD1 was 704 a.a. Overall, M. acridum had the longest MAD1 protein at 723 a.a., including an insertion of 11 amino acids within domain B, which contained Thr-rich tandem repeats. These 11 extra amino acids provided M. acridum with eight tandem repeats, while all other species contained six. M. acridum also possessed a variable region in the N-terminal ligand binding region of the protein, while this region was mostly conserved among other species.

Mad2 variability
The greatest inter-isolate divergence within the open reading frame for Mad2 was 15.9%, found between isolates ARSEF 7486 (M. acridum) and HKB1-1b (M. robertsii). Similarly, the greatest interspecies divergence was 15.7% between M. acridum and M. robertsii, with the least divergence between M. guizhouense and M. majus (2.5%). The overall average interspecies variation for the Mad2 open reading frame and promoter region for all Metarhizium species examined was 7.4% and 9.9%, respectively. The intraspecies variation within the open reading frame was low in M. robertsii (0.2%) and M. brunneum (0.0%), and moderately higher in M. guizhouense (2.2%). In the promoter region, intraspecies variation was also low in M. robertsii (0.04%) and M. brunneum

Phylogenetic analysis of 5′ EF-1α, Mad1, and Mad2
The 5′ EF-1α phylogenetic tree for all fourteen isolates segregated according to species, including the division of the PARB clade, which includes isolates of M. pingshaense, M. robertsii, and M. brunneum, and the MGT clade, which includes isolates of M. majus and M. guizhouense (Fig. 1).
The phylogenetic trees for the Mad1 and Mad2 full gene sequences also formed divisions that were consistent with the PARB and MGT clades (Fig. 2 and 3). The PARB clade isolates all formed species-specific nodes consistent with their 5′ EF-1α identification; however, in the Mad1 tree M. pingshaense and M. brunneum were grouped together (Fig. 2), while M. pingshaense and M. robertsii grouped together in the 5′ EF-1α and Mad2 trees (Fig. 1 and 3). In the Mad1 and Mad2 phylogenetic trees, ARSEF 6238 (M. guizhouense) and ARSEF 1914 (M. majus) grouped together to form a separate node from the Ontario isolates of M. guizhouense within the MGT clade. Overall, the Mad2 tree had the best resolution, with the highest bootstrap values for each node.
The congruency indices (Icong) calculated for both trees derived from the promoter regions of Mad1 and Mad2 in comparison to 5′ EF-1α were each 2.03 (p = 1.66 × 10⁻⁶) (Table 1). That is, the phylogenetic trees of the Mad1 and Mad2 promoter regions were equally congruent to the phylogenetic tree for 5′ EF-1α. The maximum agreement subtree (MAST) for 5′ EF-1α and the phylogenetic trees of the promoter regions each contained 11 terminal nodes in order for perfect congruence to occur.

Discussion
Here, we amplified and cloned the full Mad1 and Mad2 genes in fourteen isolates of seven different species of Metarhizium in order to assess the gene variability. M. acridum, the acridid-specific pathogen [16,18], was found to have relatively more insertions and deletions within the open reading frames of Mad1 and Mad2, respectively, specifically within the Thr-rich tandem repeat region in domain B of both proteins. Mad2 variability between species was identified within putative transcriptional elements, including STRE, PDS, the degenerative TATA box, and TATA box-like regions. Additionally, phylogenetic analysis of 5′ EF-1α, Mad1, and Mad2 revealed that the evolution of the Mad2 gene was more congruent with the phylogeny of 5′ EF-1α than Mad1, suggesting plant host, rather than insect host, was a probable influence in the divergence among Metarhizium species.
In general, it was found that Mad1 and Mad2 were largely conserved within a species. However, intraspecies variation for M. guizhouense was relatively higher. This may be due to geographic divergence within M. guizhouense, since ARSEF 1914 and ARSEF 6238 were isolated in the Philippines and China, respectively [1]. Interestingly, Bischoff et al. [1] accepted M. majus and M. guizhouense at the species rank due to congruence between conidial size and the 5′ EF-1α phylogeny, although these species did not meet the molecular genealogical concordance criteria. However, Japanese isolates demonstrated that the conidial sizes of M. majus and M. guizhouense were incongruent with the 5′ EF-1α phylogeny [19]. This incongruence within the MGT clade warrants further investigation in order to fully resolve species ranks, which may be obfuscated by population genetic differences within a Metarhizium species.
M. acridum, which is a species that displays insect host specificity, being particularly pathogenic to acridids (grasshoppers and locusts) [16,18], had the longest MAD1 protein. This includes an 11 amino acid insertion that gave the M. acridum MAD1 protein eight tandem repeats of GKETTPAQQTTP within domain B, as opposed to the six repeats in all other isolates. This is putatively a functional difference, as it is presumed that a higher number of repeats could increase the distance between the cell wall and the N-terminal ligand binding region [17]. Additionally, M. acridum possessed a variable region in the N-terminal ligand binding region, which could putatively cause a difference in adherence. However, when the Mad1 gene from M. acridum was inserted into M. robertsii, there was no difference in cuticle adhesion or virulence (St. Leger, pers. comm.). Conversely, M. acridum had the shortest MAD2 protein, including a 12 amino acid deletion directly after the Thr-rich repeats in domain B. This may also have a functional implication that may limit its ability to associate with plants. Phylogenetic analysis of 5′ EF-1α, Mad1, and Mad2 also shows this species is highly divergent from other Metarhizium species.
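Counting tandem repeats in domain B can be sketched in a few lines of Python. This is only an illustration: it counts exact, back-to-back copies of the GKETTPAQQTTP unit named above, whereas repeats in real MAD1 sequences may be imperfect and would need a tolerant matcher. The two toy proteins are hypothetical constructs, not actual MAD1 sequences.

```python
import re


def longest_tandem_run(protein, unit):
    """Largest number of back-to-back exact copies of `unit` in `protein`."""
    pattern = re.compile("(?:%s)+" % re.escape(unit))
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(protein)]
    return max(runs, default=0)


UNIT = "GKETTPAQQTTP"  # repeat unit reported for MAD1 domain B

# Hypothetical toy proteins: arbitrary flanks around 6 or 8 exact repeats.
toy_six = "MKL" + UNIT * 6 + "AAV"
toy_eight = "MKL" + UNIT * 8 + "AAV"

print(longest_tandem_run(toy_six, UNIT))    # prints 6
print(longest_tandem_run(toy_eight, UNIT))  # prints 8
```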
Within the MAD2 protein sequence, a variable region was readily identified within the N-terminal ligand binding region. Interestingly, the variability was conserved within a species. This amino-terminal domain has been implicated in adhesive interactions in the ALS proteins of C. albicans [20], which are similar to the MAD1 and MAD2 proteins [17]. It may be possible that this variability causes differences in adhesion to various plants among species of Metarhizium.
Overall, genetic variation was slightly greater in the Mad2 open reading frame (7.4%) in comparison to Mad1 (6.7%), but noticeably higher in the Mad2 promoter region (9.9%) in comparison to the Mad1 promoter (7.1%). Analysis of the Mad1 promoter did not identify any variable transcriptional elements. Future research could focus on the expression of Mad2 between species, since there was variation present within the promoter region. Several putative transcriptional elements have been identified within the Mad2 promoter [21]; however, the analyses presented here focused on the variable STRE, PDS, degenerative TATA box, and TATA box-like regions. The stress response element (STRE) activates genes under various stress conditions, including glucose starvation [22,23]. Similarly, the post diauxic shift (PDS) element mediates transcriptional activation in response to nutritional limitation [24,25]. The presence of these transcriptional elements is consistent with the finding that Mad2 is upregulated under nutrient deprivation [21].
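As a rough illustration of how such elements can be located, the Python sketch below scans both strands of a promoter for STRE and PDS motifs. The consensus patterns used (AGGGG for STRE and T(T/A)AGGGAT for PDS) are assumptions taken from the yeast literature, not from this study, and the criteria used to annotate the Mad2 promoter may differ. The example promoter string is hypothetical.

```python
import re

# Assumed yeast-derived consensus motifs (not taken from this study).
MOTIFS = {
    "STRE": "AGGGG",
    "PDS": "T[AT]AGGGAT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan(promoter, motifs=MOTIFS):
    """Return (name, strand, start) hits, sorted by position.

    Starts are 0-based coordinates on the forward strand; for '-' hits the
    start is the leftmost forward-strand base covered by the motif.
    """
    promoter = promoter.upper()
    n = len(promoter)
    rc = reverse_complement(promoter)
    hits = []
    for name, pat in motifs.items():
        regex = re.compile("(?=(%s))" % pat)  # lookahead keeps overlapping hits
        for m in regex.finditer(promoter):
            hits.append((name, "+", m.start()))
        for m in regex.finditer(rc):
            hits.append((name, "-", n - m.start() - len(m.group(1))))
    return sorted(hits, key=lambda h: (h[2], h[0], h[1]))


# Hypothetical promoter fragment: CCCCT (STRE, minus strand) and TAAGGGAT (PDS).
toy_promoter = "GGCCCCTAATTAAGGGATCC"
print(scan(toy_promoter))
# prints [('STRE', '-', 2), ('PDS', '+', 10)]
```

Running the same scan over aligned promoters from each isolate and comparing the hit lists is one simple way to flag elements that are present in some species but absent or shifted in others.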
Interestingly, it has been found that the expression of cell wall and stress response genes evolved at an accelerated rate following the transfer of M. robertsii from a semitropical to a temperate soil community [26]. It was also found that cell wall genes with significantly altered expression were enriched for TATA boxes. Conversely, virulence determinants were unaltered [26]. M. robertsii, which has demonstrated a more generalist ability to colonize plant rhizosphere when compared to M. brunneum and M. guizhouense [10], contained the most TATG repeats within the degenerative TATA box region. It also contains a TATG repeat prior to the TATA box-like sequence, which the other species lack. Whether this contributes to the generalist nature of the plant association is unknown. Also, the length of the MAD2 protein is conserved within the PARB clade, including the Ontario isolates of M. guizhouense. This is notable, since Ontario isolates of M. robertsii, M. brunneum, and M. guizhouense have shown plant rhizosphere associations [10].
Overall, variation within the DNA and protein sequences of the Mad1 and Mad2 genes was largely species-specific. This is expected, as these genes would have diverged during speciation. However, the higher amount of variation, especially in the promoter region, suggests Mad2 has diverged more than Mad1, and phylogenetic analysis indicated that Mad2 is more congruent with the 5′ EF-1α phylogeny, which is used for species identification [1]. Also, variation within the TATA box-like region of the Mad2 promoter was conserved within a clade. This would suggest that in evolutionary terms, Mad2, the plant adhesin, has diverged among Metarhizium lineages, contributing to clade- and species-specific variation. Conversely, it appears that Mad1 has been largely conserved. This is reflected in the average estimated dN/dS ratio, which is higher in Mad2 (0.31) than in Mad1 (0.20), suggesting that there is more stabilizing selection for Mad1, as it has a lower relative abundance of nonsynonymous mutations. One explanation for the results observed is that the stabilizing selection for Mad1 has reduced variation and caused incongruency with 5′ EF-1α. The promoter regions are both equally congruent to 5′ EF-1α. While EF-1α is highly conserved [27], the 5′ region used in these analyses contains a large portion of intronic nucleotides (>60% when aligned with GenBank Accession AAR16425). As such, the promoter regions of the Mad genes and the intronic regions of 5′ EF-1α would both accumulate random substitutions during evolution. Mad2, which has demonstrated a degree of stabilizing selection, would have fewer accumulated random mutations. Lastly, Mad1, which has shown more stabilizing selection and less variation than Mad2, would have accumulated even fewer random mutations, resulting in more incongruency with the 5′ EF-1α phylogenetic tree. Previous studies on insect infection related genes (i.e. Pr1 and Ntl) have also demonstrated a high degree of stabilizing selection [6,28].
There is evidence that plant host association may play an important role in the evolutionary divergence within the genus Metarhizium with the exception of the acridid-specific M. acridum and possibly M. majus, which has demonstrated specificity for Coleopteran insects, particularly scarabs [18,29,30]. Ontario species of Metarhizium have shown plant rhizosphere specificity [10] and M. robertsii is an endophyte [14]. Whole genome analyses have also suggested that the genus Metarhizium evolved from endophytes or plant pathogens [16].
While the Mad2 plant adhesin gene showed a higher amount of variability than Mad1, and was more congruent with 5′ EF-1α, it is difficult to ascertain whether this is due to plant relationships alone. Over the course of time, a number of factors may have contributed differentially to the evolution of Metarhizium species. The genetic differences found may be the residual effects derived from an ancestral plant-associated relative. While phylogenetic evidence suggests that plant interactions have had the greater role in shaping the evolution of this fungal genus, it is possible that insect associations may have been influential on the more recent evolution of Metarhizium. While other abiotic and biotic factors cannot be excluded in contributing to species divergences, it appears that plant relationships have been a driving factor in the evolution of Metarhizium species.

DNA extraction
Conidia were inoculated into 50 mL 0.2% (w/v) yeast extract, 1% peptone, 2% dextrose (YPD) broth in flasks. The flasks were incubated at 27°C and shaken at 200 rpm for 3 to 4 days, until sufficient mycelia had accumulated. The mycelia were removed by vacuum filtration onto Fisherbrand® P8 filter paper, washed with distilled water, and crushed in liquid nitrogen using a mortar and pestle. DNA was extracted using the DNeasy Plant Mini Kit (QIAGEN). Extracted DNA was quantified using a NanoVue spectrophotometer (GE).
PCR amplifications were performed in a total volume of 50 µL, which included 5 µL 10X Standard PCR Buffer (NEB), 1 µL dNTPs (10 mM each dATP, dCTP, dGTP, dTTP) (QIAGEN), 10 pmol each of the opposing amplification primers (Sigma), 0.5 µL Taq polymerase (NEB), and 500 ng genomic DNA. The following PCR conditions were used for Mad1 amplification: initial denaturation, 1 minute at 94°C, then 30 cycles of denaturation, 1 minute at 94°C; annealing, 1 minute at 60°C; extension, 4.5 minutes at 72°C; and final extension, 10 minutes at 72°C. The same PCR conditions were used for Mad2 amplification, with an annealing temperature of 56°C and an extension time of 3 minutes. Table 2 lists the primers used to amplify all Mad1 and Mad2 sequences.
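For planning purposes, the cycling conditions above can be turned into a quick estimate of programmed run time and final dNTP concentration. The Python sketch below is a back-of-the-envelope calculation only: it ignores thermocycler ramp times, and the function and variable names are illustrative.

```python
def program_minutes(initial, cycles, denature, anneal, extend, final):
    """Total programmed hold time in minutes, excluding ramping between steps."""
    return initial + cycles * (denature + anneal + extend) + final


# Step durations (minutes) as stated in the protocol.
mad1_minutes = program_minutes(1, 30, 1, 1, 4.5, 10)
mad2_minutes = program_minutes(1, 30, 1, 1, 3, 10)

# Final dNTP concentration: 1 part of a 10 mM (each) stock in 50 parts total.
dntp_final_mM = 10 * 1 / 50

print(mad1_minutes, mad2_minutes, dntp_final_mM)
# prints "206.0 161 0.2"
```

The Mad1 program therefore holds for roughly 3.4 hours and the Mad2 program for about 2.7 hours, with each dNTP at 0.2 mM in the final reaction.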
Full DNA sequences for Mad1 and Mad2 were obtained for all isolates by primer walking (DNA Walking Speedup kit; Seegene). Amplified PCR products were separated by gel electrophoresis, excised, and purified with a QIAquick gel extraction kit (QIAGEN). Purified PCR products were cloned using pGEM-T Easy, as per manufacturer's instructions (Promega). Plasmid DNA was extracted using a GenElute Plasmid Miniprep Kit (Sigma), and inserts were sequenced using vector sequencing primers (SP6 and T7) by the Core Molecular Biology Facility at York University (Toronto, Canada).