The dense single nucleotide polymorphism (SNP) panels needed for genome-wide association (GWA) studies have hitherto been expensive to establish and use on non-model organisms. To overcome this, we used a next generation sequencing approach both to establish SNPs and to determine genotypes. We conducted a GWA study on a fungal species, analysing the virulence of Heterobasidion annosum s.s., a necrotrophic pathogen, on its hosts Picea abies and Pinus sylvestris. From a set of 33,018 SNPs in 23 haploid isolates, twelve SNP markers distributed on seven contigs were associated with virulence (P<0.0001). Four of the contigs harbour known virulence genes from other fungal pathogens and the remaining three harbour novel candidate genes. Two contigs link closely to virulence regions recognized previously by QTL mapping in the congeneric hybrid H. irregulare × H. occidentale. Our study demonstrates the efficiency of GWA studies for dissecting important complex traits of small populations of non-model haploid organisms with small genomes.
Citation: Dalman K, Himmelstrand K, Olson Å, Lind M, Brandström-Durling M, Stenlid J (2013) A Genome-Wide Association Study Identifies Genomic Regions for Virulence in the Non-Model Organism Heterobasidion annosum s.s. PLoS ONE 8(1): e53525. doi:10.1371/journal.pone.0053525
Editor: Yong-Hwan Lee, Seoul National University, Republic of Korea
Received: September 3, 2012; Accepted: December 3, 2012; Published: January 16, 2013
Copyright: © 2013 Dalman et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: Financial support from the Swedish Research Council of Environmental, Agricultural and Spatial planning (FORMAS) grant ID 229-2007-1097 (http://www.formas.se/en/) and the Swedish Foundation for Strategic Research (SSF) number RBb08-0011 (http://www.stratresearch.se/) is gratefully acknowledged. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Genome-wide association (GWA) studies have been used to elucidate the genetic basis for disease traits [1]. The feasibility of GWA studies in plants was demonstrated by Atwell and colleagues [2], who tested 107 phenotypes in Arabidopsis thaliana inbred lines, and the approach has also been used to investigate the genetic basis of resistance in different crops or various traits in wild plants and trees [3]. So far, very few GWA studies have been applied to fungi, even though the tools required are available and fungi have the advantages of small genomes, a homokaryotic (haploid) stage, a sexual life cycle and the ability to multiply genotypes for reproducibility. In Saccharomyces cerevisiae, GWA mapping of mtDNA copy number identified one significant SNP [4], and population-based GWA analysis of clinical vs. nonclinical yeast identified several genetic loci associated with the clinical background [5].
However, several different genetic traits have been studied in fungi using quantitative trait loci (QTL) analysis. In Pleurotus ostreatus, Santoyo et al. [6] located QTLs linked to lignin-degrading enzymatic activities on six different chromosomes. A QTL associated with pathogenicity on wheat has been identified in the Fusarium head blight fungus Gibberella zeae [7]. In Nectria haematococca MPI, several QTLs for virulence on pumpkin have been found [8]. The genetic components controlling virulence in the necrotrophic forest pathogen Heterobasidion annosum (Fr.) Bref. sensu lato (s.l.) have been studied by QTL mapping of fungal growth within sapwood and induced lesion length in phloem in Norway spruce (Picea abies) and Scots pine (Pinus sylvestris) [9]. Four QTLs, for both lesion length and growth in sapwood on spruce and pine, were located within the same linkage group, while two other pine-specific QTLs associated with growth or lesion length were located in two other groups [9]. In addition to nuclear genetic factors, virulence has been shown to be influenced by the mitochondrial genome, using hybrids of H. irregulare × H. occidentale [10]. The advantage of GWA studies over QTL mapping is that no mapping population needs to be generated. Furthermore, because many more recombination events have occurred in natural populations than in a single cross, the linkage blocks are expected to be smaller than in QTL mapping, which implies a higher map resolution .
Population genomics and genome-wide mapping have revealed recent hybridization and genetic introgression between the pathogenic fungi Coccidioides posadasii and C. immitis [11], while a survey of the domesticated and wild yeasts Saccharomyces cerevisiae and S. paradoxus showed that phenotypic variation correlated broadly with global genome-wide phylogenetic relationships [12]. By SNP identification and whole-transcriptome sequencing, Ellison et al. [13] discovered two recently diverged populations of Neurospora crassa in which they identified genomic islands of high divergence that may be the result of local adaptation to a temperature difference between the two populations.
The necrotrophic pathogen H. annosum s.l. causes severe damage in coniferous forests throughout the Northern Hemisphere . H. annosum s.l. is a complex consisting of five species with different but partially overlapping host preferences , . H. annosum sensu stricto (s.s.) mainly infects Pinus spp. but is also able to attack Picea spp. and other softwood and hardwood tree species throughout Europe . The economic losses due to devaluation of saw timber because of stem rot, reduced tree growth and tree mortality reach 1 billion Euros yearly for European forest owners. In addition, H. annosum s.l. challenges the coniferous forests' potential to act as a carbon sink by releasing CO2 from decayed wood.
In this study, we identify novel and well-known virulence genes in the conifer pathogen Heterobasidion annosum s.s. through GWA studies based on large scale single nucleotide polymorphism (SNP) identification and typing through next generation sequencing. Our results demonstrate the efficiency of this technique in a haploid non-model organism.
Generation of a Genomic Data Set
We sequenced the genomes of 23 haploid H. annosum s.s. isolates with an Illumina Genome Analyzer. We obtained paired end reads from a 400-base pair (bp) insert library from three of the isolates (90211/2, Sä_16-4 and W_15), while 36-bp single end reads were generated from the remaining 20. The number of reads ranged between 3.2 and 8.9 million for the single end read isolates, and between 12.3 and 15.3 million for the paired end read isolates (Table 1). A combined reference genome was de novo assembled from the paired end sequenced isolates using the Velvet assembler , . The assembly resulted in a reference genome of 56,195 contigs with an N50 of 41,157 bp and a median coverage of 38.7×. The number of contigs of at least 1000 bp was 2330, which comprise in total 30,569,260 bp (internal gaps excluded). These contigs were used as reference sequences when sequence reads of each isolate were assembled including the three used in the de novo assembly. The assembly of the individual isolates was achieved through mapping towards the combined reference genome using MOSAIK (http://bioinformatics.bc.edu/marthlab/Mosaik). For each isolate, 48% to 75% of the sequence reads were aligned to the reference and the coverage ranged between 80% and 94% with gap regions included (Table 1). The mean read coverage was between 2.6× and 12.6× of the individual isolates. There were 4,398,305 positions where all 23 individuals had sequence coverage of at least 2×. In these positions, a total of 64,055 SNPs were found and 33,018 of these were not singletons. We tested our SNP panel for population structure with the program Structure , , , using one SNP per contig. No signs of population structure were detected.
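The N50 reported for the reference assembly is a summary of the contig length distribution. As a minimal sketch of how such a value is usually computed (the function name and the toy contig lengths are illustrative assumptions, not taken from the study's pipeline), the contigs are sorted from longest to shortest and the length at which the running total reaches half of the assembled bases is returned:

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    The N50 is the contig length L such that contigs of length >= L
    together contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")


# Toy contig lengths in bp (hypothetical, not the study's contigs).
toy_lengths = [100, 80, 60, 40, 20]
print(n50(toy_lengths))  # prints 80
```

In the toy example the total is 300 bp, so the running sum first reaches 150 bp at the second-longest contig.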
Fungal Virulence on Pine and Spruce Differs Significantly between Isolates
Virulence of H. annosum s.s. on Scots pine and Norway spruce was measured both as lesion length formed under the bark and fungal growth in the sapwood of the plants. The average infection success was 90% for each plant species and ranged between 30% and 100% for individual isolates (Table 2). There was a significant difference in virulence between isolates on pine and on spruce, measured both as lesion length and fungal growth in sapwood (P<0.05, ANOVA). In spruce, fungal growth (SFG) was normally distributed. However, the measurements for growth in pine (PFG) and lesion length in both spruce (SLL) and pine (PLL) had to be transformed to generate a normal distribution. A correlation between lesion length and fungal growth for spruce (R2 = 0.6684) was found, but not for pine. A weak but still significant correlation (R2 = 0.2363, P = 0.0187) was found between PFG and SFG.
Twelve SNPs are Significantly Associated with Fungal Virulence
The general linear model revealed an association between 12 SNP markers and virulence with an adjusted P-value lower than 0.0001 (10 000 permutations) (Table 3). Out of these twelve, six SNP markers were found on the same contig (contig 41480); the remaining six SNP markers were distributed on six different contigs (Fig. 1). Four SNP markers were associated with SFG and the remaining eight with PFG. No markers were associated with SLL or PLL.
The p-values are plotted against the site number from the SNP extraction. The contigs with significant SNPs are indicated with orange lines. Abbreviations: PFG, fungal growth in pine sapwood; PLL, lesion length in pine; SFG, fungal growth in spruce sapwood; SLL, lesion length in spruce.
The length of the contigs varied between 2.5 kilobase pairs (kb) and 76.5 kb (Table 4) and the SNP density varied in the contigs from 0.4 to 1.7 SNPs per kb (minor allele frequency ≥2/23 individuals; coverage in all individuals). The linkage disequilibrium (LD) heat maps show clear blocks of LD in the two contigs 4128 and 41480 associated with the markers for virulence traits (Fig. 2). In the remaining five contigs the SNP marker associated with virulence was not associated with any LD block (Fig. 2 and Fig. S1). The six SNP markers located in contig 41480 were found in two distinct LD blocks with the same statistical support (P<0.0001). The less virulent genotypes were present at low frequencies in the population (Fig. 3).
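For haploid isolates, the r2 measure of LD between two biallelic SNPs can be computed directly from allele and haplotype frequencies. The following is a minimal sketch under the assumption that alleles are coded 0/1 per isolate; it is not the TASSEL implementation, and the function name and toy data are hypothetical:

```python
def ld_r2(site_a, site_b):
    """r^2 between two biallelic sites scored in the same haploid isolates.

    Each site is a sequence of 0/1 allele codes, one entry per isolate.
    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB.
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(site_a)
    p_a = sum(site_a) / n                      # frequency of allele 1 at site A
    p_b = sum(site_b) / n                      # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b                       # coefficient of disequilibrium
    return d * d / denom


# Toy example: two sites carried on identical haplotypes are in absolute LD.
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))  # prints 1.0
```

An r2 of 1 corresponds to the absolute LD described for the significant markers; unrelated sites give values near 0.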
The upper part of each figure plots the p-values (−log10 scale) for the four traits (up- and downstem combined) to the genomic position (in bp). Abbreviations: PFG, fungal growth in pine sapwood; PLL, lesion length in pine; SFG, fungal growth in spruce sapwood; SLL, lesion length in spruce. The lower part displays linkage disequilibrium (LD) heat maps. The heat map illustrates the LD value r2 from white to red where red indicates high r2 -values. Significant SNP markers are in red. (A) Contig 4128; (B) Contig 41480; (C) Contig 16590; (D) Contig 9600.
Genotypes for SNPs significantly associated with: (A) fungal growth in spruce sapwood; contig 4128, 15627, 45322, 50191 and (B) fungal growth in pine sapwood; contig 9600, 16590 and 41480. All associations were adjusted by a permutation test, the positions shown have a p adj <0.0001. Non-synonymous substitutions are labelled with an asterisk.
Surveying the Genomic Regions Associated with Fungal Virulence
The seven H. annosum s.s. contigs containing SNPs associated with virulence were highly homologous to the H. irregulare genome (http://genome.jgi-psf.org/Hetan2/Hetan2.home.html)  and the gene order within each contig was well conserved. The contigs were not linked to each other and were distributed over different scaffolds. Contig 4128 had sequence homology to positions 49591–113806 on scaffold 2 in the genome of H. irregulare. The LD block harbouring the SNP associated with SFG was located between positions 40886 and 66145 in the contig (corresponding to positions 89756–113692 on scaffold 2 in H. irregulare). This region contained nine genes that encode a serine protease, a transcriptional co-repressor, a quinone oxidoreductase (ToxD), an inner centromere protein, a urea transporter, an enzyme similar to a DNA-dependent ATPase, a sorbitol dehydrogenase and two flavin-containing monooxygenases (Fig. 2A). Marker 2541, associated with SFG, was located between two genes, those encoding the serine protease and the transcriptional co-repressor (Fig. 2A). Linkage was found between this marker and SNPs in the genes that encode a cytochrome P450, a urea transporter, a DNA-dependent ATPase and a sorbitol dehydrogenase.
Contig 41480 had homology to positions 499810–584466 on scaffold 13 in the genome of H. irregulare and they were syntenic, although there was a small break in homology between positions 532643 and 533812, where H. irregulare has a transposon inserted. The LD block between 2.2 and 7.0 kb in the contig spanned a putative calcineurin, similar to S. cerevisiae CNA-1, and the end of an acetylglutamate kinase/synthase gene (Fig. 2B). Two markers at positions 11461 and 12127 were found in a smaller LD block that was located in a gene of unknown function. An absolute LD (r2 = 1) between the significant markers and the rest of the positions in the LD blocks was observed. This linkage is localized to the genes that encode calcineurin, acetylglutamate kinase and the longer unknown gene. Two of the markers associated with virulence in the calcineurin gene were synonymous whereas the third was located in an intron (Table 3). In addition, the small contig 15627 (2.5 kb) also had homology to scaffold 13 (51802–56242) in the genome of H. irregulare. This region contains the last 200 bp of a polyprenylsynthetase gene and three unknown genes, out of which two encode secreted proteins.
Contig 16590 had homology to positions 1029321-1073048 on scaffold 10 of the H. irregulare genome (Fig. 2C). The significant SNP marker 7164 at position 17829 in the contig was located in a non-synonymous position in an exopolyphosphatase gene (Table 3). It had LD (r2 = 1) to one unknown gene at position 14325 and to three intergenic positions.
Contig 9600 had homology to positions 533823-565834 on scaffold 1 in the genome of H. irregulare. Only a few, quite small LD blocks were found, the largest being 1.7 kb (Fig. 2D). The significant marker at position 30323 was located to a synonymous position in the transcription factor SWI5 with an absolute LD (r2 = 1) restricted to the same gene at positions 30227, 30247, 30352 and 30602 (Table 3).
Contig 45322 and 50191 had homology to scaffold 3 and 5 of the genome of H. irregulare, respectively. The SNP in contig 45322 was localized to a gene of unknown function and the one in contig 50191 was found in an intergenic region (Fig. S1).
Applying GWA analysis to an organism in a haploid stage that can be clonally reproduced in high numbers and phenotyped repeatedly increases the accuracy of the phenotypic measurements and the power of the association analysis by several orders of magnitude compared to diploids. We demonstrated that as few as 23 haploid individuals could successfully be used to identify convincing associations in small fungal genomes, whereas several hundred individuals are needed in diploid organisms with large genomes such as humans and plants , .
Lind et al.  used a mapping population generated from a cross between H. irregulare and H. occidentale to show that several H. annosum s.l. QTLs were associated with fungal virulence on spruce and pine. An improved linkage map  located two QTLs on scaffolds 1 and 2 close to or overlapping the H. annosum s.s. contigs 9600 and 4128, suggesting that these regions harbour general virulence factors important in several Heterobasidion species, e.g. H. irregulare, H. occidentale and H. annosum s.s. Such general virulence regions have previously been suggested by Lind et al.  who found one linkage group with QTLs for four virulence traits. The overlapping results between these different studies and mapping methods strengthen the assumption that the regions target important virulence genes. These combined data suggest that virulence in H. annosum s.l. is controlled by regions of virulence factors that occur at different positions in the genome, some of which confer general virulence conserved between the different Heterobasidion species while others confer species-specific virulence that has evolved separately. Some of the species-specific genes found in H. annosum s.s. may encode virulence factors that are specialized for Scots pine infection. Interestingly, several Norway spruce-specific virulence factors were also found with a significant association for SFG, confirming the ability of H. annosum s.s. to infect spruce as an alternative host. In addition, this may indicate that H. annosum s.s. uses different mechanisms to infect spruce than H. occidentale.
The success and power of an association study is dependent on the number of SNP markers and on the LD decay. We analysed LD in six contigs using an extended SNP data set and the results indicate a variable LD block size present in our population (300 bp to 31 kb). The very limited LD for the associated SNPs in four of the contigs indicates a high-resolution mapping of these regions. The fact that only part of the genome could be included in the analysis makes it difficult to predict the number of SNPs that are needed for complete coverage in H. annosum s.s. Maize has been shown to have an LD decay between 1 and 10 kb that increased with the increase of minor allelic frequencies and with smaller sample sizes . For the diploid maize, small sample size was defined as 25 individuals and a sample size of more than 50 individuals generated no significant differences in the mean r2. The long LD regions found in H. annosum s.s. might therefore partly be an effect of the relatively low number of individuals used.
Surveying the Genomic Regions Associated with Fungal Virulence
Gene models associated with fungal growth in spruce.
Markers associated with fungal growth in spruce (SFG) were found to be located in gene models with varying functions (contig 4128, Fig. 2A) as well as to unknown/novel virulence genes (contig 45322, 50191, 15627, Fig. S1). Contig 4128 is located close to a previously known QTL for fungal growth in pine sapwood, which is why the genes found in this region are candidates as conserved, general, virulence factors. The gene encoding a quinone oxidoreductase (contig 4128) is similar to a gene that encodes zinc-binding oxidoreductase ToxD, a host-selective toxin produced by Pyrenophora tritici-repentis Pt-1C-BFP , with 27% sequence identity and 36% similarity. Ptr ToxD is a zinc enzyme, specific for NADPH that catalyses the one-electron reduction of certain quinones . Pyrenophora tritici-repentis, causal agent of tan spot of wheat, is known to produce several additional host-selective toxins, including Ptr ToxA, Ptr ToxB and Ptr ToxC .
Serine carboxypeptidases, found in contig 4128, belong to the serine-type catalytic family and use a serine residue as the nucleophile to form an acyl intermediate. They can also act as transferases and are common in viruses, bacteria and eukaryotes . In Schizosaccharomyces pombe it was shown that deletion of a serine carboxypeptidase gene, sxa2, causes hypersensitivity to the P-factor mating pheromone and a reduction in mating efficiency in M cells . Moreover, in Cryptococcus neoformans a deletion mutant of KIN1, a serine/threonine protein kinase, was shown to have attenuated virulence . The position of a significant marker between two genes that encode a serine protease and a transcriptional co-repressor indicates that the transcriptional regulation of either of the genes could be affected.
Flavin-containing monooxygenases (FMOs), found in contig 4128, are xenobiotic-metabolizing enzymes that catalyse the oxygenation of nucleophilic nitrogen, sulphur, phosphorus and selenium atoms using NADPH as a cofactor and FAD as a prosthetic group . They have been implicated in the metabolism of several pharmaceuticals, pesticides and other toxicants. Arabidopsis mutants overexpressing FMO1 had an increased basal resistance against Pseudomonas syringae pv. tomato and Hyaloperonospora parasitica, which suggests that the FMO was involved in detoxifying the virulence factors produced by the pathogens . Another FMO was previously found within a QTL for lesion lengths in pine bark close to contig 9600, suggesting a general and conserved role in pathogenicity  (Lind M, Dalman K, Olson Å, Brandström-Durling M, Stenlid J in prep.).
Gene models associated with fungal growth in pine.
Three contigs were found to harbour markers associated with fungal growth in pine (PFG) (contig 41480, 16590, 9600). The general picture of LD for the significant markers in contig 41480 is that it is limited to three genes: calcineurin, acetylglutamate kinase and a longer unknown gene. Therefore these genes, along with exopolyphosphatase (contig 16590) and SWI5 (contig 9600), are the strongest candidates for the PFG virulence trait. The putative gene for calcineurin found in this study represents a strong candidate for virulence signalling in H. annosum s.s. Calcineurin is a phosphatase regulated by Ca2+ and calmodulin, and is heavily involved in the calcium-dependent signal transduction pathways of many processes in eukaryotes, such as T cell activation, muscle hypertrophy, memory development, glucan synthesis, ion homeostasis and cell cycle control . Furthermore, the protein has a conserved function in virulence in several fungi, e.g. Candida albicans , Ustilago maydis  and Sclerotinia sclerotiorum . In the plant pathogen Botrytis cinerea, the calcineurin-responsive transcription factor CRZ1 was found to be required for penetration of plant surfaces . In the rice blast fungus Magnaporthe oryzae, deletion mutants for the genes encoding a calcineurin-responsive transcription factor had a reduced virulence due to a defect in host penetration , .
A gene encoding an N-acetylglutamate kinase/synthase was found in contig 41480. N-acetylglutamate is involved in the biosynthesis of arginine in prokaryotes, lower eukaryotes and plants . Insertional mutation or deletion of genes encoding N-acetylglutamate-metabolising enzymes led to reduced virulence in Gibberella zeae (anamorph, Fusarium graminearum)  and in Colletotrichum higginsianum .
Exopolyphosphatases (PPXs), found in contig 16590, hydrolyse and release the terminal phosphate from linear polyphosphate containing three or more phosphoanhydride bonds . Together with polyphosphate kinases (PPKs), responsible for polyphosphate synthesis, PPXs maintain the dynamic balance of the polyphosphate level. Polyphosphate is essential for growth of cells, responses to stresses and virulence in several bacteria . Mutants lacking PPX in the human pathogen Neisseria meningitidis had an increased resistance to complement-mediated killing .
In contig 9600 the significant marker was located to a SWI5 transcription factor. The homolog in C. albicans, CaACE2, was shown to affect virulence and deletion mutants of CaACE2 were avirulent in a mouse model . Strong QTLs for lesions in pine and spruce bark, and growth in spruce sapwood, are overlapping this contig (Lind M, Dalman K, Olson Å, Brandström-Durling M, Stenlid J in prep.). The corresponding peaks of these QTLs were at approximately 400 kb (pine) and 600 kb (spruce) respectively on scaffold 1 in the genome of H. irregulare. Candidate genes from this region are likely to play an important part in general Heterobasidion pathogenicity.
In this study we show that GWA studies are useful for dissecting important complex traits of non-model organisms, in particular those with small genomes and a haploid life style. We characterized seven genomic regions associated with fungal growth in the sapwood of spruce and pine and present eight candidate virulence genes that encode quinone oxidoreductase (ToxD), serine carboxypeptidase, two flavin-containing monooxygenases, calcineurin, acetylglutamate kinase, exopolyphosphatase and the SWI5 transcription factor. Of these, all except calcineurin, acetylglutamate kinase and exopolyphosphatase were found very close to or directly overlapping with previously known virulence QTLs.
Plant and Fungal Material
Two-year-old Norway spruce (Picea abies) and Scots pine (Pinus sylvestris) plants, originating from Latvia and Gotthardsberg in Sweden, respectively, were washed and planted in 2 L pots with fertilized peat. The plants were grown for one month in the greenhouse at 20°C before inoculation. The 23 haploid H. annosum s.s. isolates used in this study originated from field studies from different geographic locations in Europe (Table 5). They were all single spore isolates isolated from basidiospores or conidiospores. All isolates were grown on Hagem medium  at 21°C in darkness for one week whereafter autoclaved pine wood blocks (5×5×5 mm) were placed on the mycelia. The cultures were incubated for a further four weeks to allow thorough colonization of the wood blocks.
The spruce and pine plants were inoculated with an H. annosum s.s. isolate by cutting a small window (5×10 mm) in the cambium of the plant, halfway between two nodes, inserting a colonised wood block in the wound and then wrapping it with Parafilm®. The experiment was executed in blocks: two spruce and two pine plants were inoculated with each of the 23 fungal isolates each day for five consecutive days, so that in total ten spruce and ten pine plants were inoculated with each isolate. As a control, ten spruce and ten pine plants were inoculated with sterile wood blocks. After four weeks the plants were harvested and the lesion lengths upstem and downstem from the wound were measured. The stems were then cut into 5-mm pieces and placed on wet filter paper in Petri dishes for 7 days. Growth within sapwood upstem and downstem from the wound was scored by the presence of conidiophores of H. annosum s.s. Inoculation was considered to have been unsuccessful in plants that showed no visible lesion, no fungal growth in sapwood and no conidia in the wound; these stems were discarded.
Statistical Tests for the Virulence Assay
A one-way ANOVA, performed using Minitab® 18.104.22.168, was used to test the difference in virulence between isolates. The measured values for virulence were normally distributed for SFG, but not for PFG, SLL or PLL. To obtain a normal distribution for the latter three traits, a constant of 2 was added to every measurement, which was then log-transformed. The resulting mean values for all four traits were used in the regression analysis and in the association mapping.
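The transformation and the between-isolate test can be illustrated with a short sketch. It assumes a natural logarithm (the base is not stated in the text) and computes only the one-way ANOVA F statistic; the function names and toy measurements are hypothetical and this is not the Minitab procedure itself:

```python
import math
from statistics import mean


def log_transform(values, offset=2.0):
    """Add a constant (2 in the text) to each measurement, then take the log.

    The natural logarithm is an assumption; the text does not state the base.
    """
    return [math.log(v + offset) for v in values]


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA; each group holds one isolate's values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Toy lesion lengths in mm for three isolates (hypothetical values).
raw = [[0.0, 3.0, 5.0], [10.0, 12.0, 15.0], [30.0, 42.0, 38.0]]
transformed = [log_transform(g) for g in raw]
f_value = one_way_anova_f(transformed)
```

The F statistic would then be compared against an F distribution with k-1 and N-k degrees of freedom to obtain the P-value.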
Heterobasidion annosum s.s. mycelia were grown and DNA extracted according to Lind et al. . All library preparations and sequencing were performed at the SNP&SEQ Technology Platform of Uppsala University Hospital on the Illumina Genome Analyzer according to Illumina’s standard protocol. For three of the isolates (90211/2, Sä_16-4 and W_15), libraries with an insert length of 400 bp were sequenced from both ends (paired end).
Reads from the three paired-end-sequenced isolates were de novo assembled using Velvet version 0.7.60 (hash length 21 bp) , . Contigs larger than 1000 bp were used as reference sequences. Reads from each isolate (including those isolates that were used for the de novo assembly) were aligned to the reference contigs using MOSAIK version 1.0.1384 (http://bioinformatics.bc.edu/marthlab/Mosaik). Two mismatches per read were allowed using a hash size of 10; only the uniquely aligned reads were accepted. The SAMtools program (pileup -c) ,  and an in-house Python script were used to find SNPs. Each SNP had to have a SNP quality of at least 10 according to Li et al. . Only sites where all individuals had coverage of two reads or more were used. The data was submitted to the Sequence Read Archive, SRA (SRA050098).
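The site-level filters described above can be sketched as follows. This is not the in-house script; the per-site data layout (a list of base and depth pairs, one per isolate, plus a single SNP quality value) is a simplifying assumption made for illustration:

```python
MIN_SNP_QUALITY = 10  # SNP quality threshold stated in the text
MIN_DEPTH = 2         # reads required at the site in every isolate


def passes_filters(calls, snp_quality):
    """Decide whether a candidate site is kept as a SNP.

    calls: list of (base, depth) tuples, one per isolate (hypothetical layout).
    snp_quality: SNP quality score for the site (simplified to one value).
    """
    if snp_quality < MIN_SNP_QUALITY:
        return False
    if any(depth < MIN_DEPTH for _base, depth in calls):
        return False
    # A SNP requires at least two different bases among the isolates.
    alleles = {base for base, _depth in calls}
    return len(alleles) > 1


# Toy site with three isolates (hypothetical values).
site = [("A", 3), ("A", 2), ("G", 5)]
print(passes_filters(site, snp_quality=20))  # prints True
```

Requiring coverage in every isolate trades genome coverage for a complete genotype matrix, which is why only part of the genome enters the association analysis.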
Population Structure Analysis
Structure within the sample population was explored using the program STRUCTURE , , . To minimise the impact of linkage between SNPs in the structure analysis, we selected a single SNP representing each contig. These were analysed with different assumed population subdivisions in the range from one to four subpopulations in a model assuming the SNPs were not linked and the organism haploid.
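Thinning the panel to one SNP per contig can be sketched as below. The text does not say which SNP was chosen within each contig, so keeping the SNP with the lowest position is purely an assumption for illustration, as are the function name and toy coordinates:

```python
def one_snp_per_contig(snps):
    """Keep a single SNP per contig to reduce linkage between markers.

    snps: iterable of (contig_id, position) pairs.
    Returns a sorted list of (contig_id, position), one entry per contig.
    Choosing the lowest position is an assumption, not the study's rule.
    """
    chosen = {}
    for contig, position in snps:
        if contig not in chosen or position < chosen[contig]:
            chosen[contig] = position
    return sorted(chosen.items())


# Toy SNP coordinates (hypothetical).
toy_snps = [(4128, 500), (4128, 120), (9600, 77), (41480, 900), (9600, 300)]
print(one_snp_per_contig(toy_snps))
# prints [(4128, 120), (9600, 77), (41480, 900)]
```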
Association Mapping and Linkage Disequilibrium Analysis
The association between SNP markers and virulence estimates was analysed using TASSEL version 2.1 . A general linear model (GLM) approach assuming a completely random mating population was used. The imported phenotypic data consisted of transformed and non-transformed mean values of the four traits. The association analysis was performed in two rounds; first using the SNP dataset from the whole genome and then using an extended dataset from each selected genomic region found to be significantly associated with virulence in the first round. The SNP dataset for the first round consisted of SNPs with a minor allelic frequency of 9% and a minimum coverage of 100% and in the second round a minor allelic frequency of 9% and a minimum coverage of 65%. A permutation test with 10 000 replicates, implemented in TASSEL, was used to correct the p-values for each site. Linkage disequilibrium (LD) heat maps, based on r2 calculations implemented in TASSEL, were constructed for the selected genomic areas using the second SNP dataset.
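For a haploid, biallelic marker, a single-marker linear model reduces to comparing the phenotype means of the two allele groups. The sketch below illustrates that test together with a per-marker permutation P-value. It is not TASSEL's implementation, which also adjusts across markers; the function names and toy data are hypothetical, and the minor-allele filter assumes that the 9% threshold corresponds to at least 2 of the 23 isolates, as the Results suggest:

```python
import random
from statistics import mean


def minor_allele_count(genotypes):
    """Number of isolates carrying the rarer allele (alleles coded 0/1)."""
    ones = sum(genotypes)
    return min(ones, len(genotypes) - ones)


def f_statistic(genotypes, phenotypes):
    """F statistic comparing phenotype means of the two allele groups.

    Both alleles must be present (minor allele count >= 1).
    """
    groups = [[y for g, y in zip(genotypes, phenotypes) if g == allele]
              for allele in (0, 1)]
    grand_mean = mean(phenotypes)
    ss_between = 0.0
    ss_within = 0.0
    for grp in groups:
        m = mean(grp)
        ss_between += len(grp) * (m - grand_mean) ** 2
        ss_within += sum((y - m) ** 2 for y in grp)
    if ss_within == 0:
        return float("inf")
    return ss_between / (ss_within / (len(phenotypes) - 2))


def permutation_p(genotypes, phenotypes, n_perm=10_000, seed=0):
    """Permutation P-value: shuffle phenotypes and recompute the statistic."""
    rng = random.Random(seed)
    observed = f_statistic(genotypes, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if f_statistic(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: ten isolates, allele 1 linked to higher growth (hypothetical).
toy_genotypes = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
toy_growth = [1.0, 1.1, 0.9, 1.2, 0.8, 3.0, 3.1, 2.9, 3.2, 2.8]
if minor_allele_count(toy_genotypes) >= 2:
    p_value = permutation_p(toy_genotypes, toy_growth, n_perm=2000, seed=1)
```

Shuffling phenotypes rather than genotypes keeps the marker's allele frequency fixed, so the null distribution reflects the actual group sizes in a small sample.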
Alignment and Annotation
The SAMtools pileup -c output  and an in-house Python script were used to retrieve the consensus sequences from the selected genomic regions of all isolates. Corresponding genomic regions were identified using BLAST searches of the complete sequenced genome of H. irregulare, isolate TC32-1 (http://genome.jgi-psf.org/Hetan2/Hetan2.home.html). The gene annotations found in TC32-1 were transferred to the reference sequence using PROT_MAP, FGENESH-2 (SoftBerry, Mount Kisco, NY) and Artemis . The alignments were further analysed for synonymous and non-synonymous substitutions using MEGA version 4 .
Overview of two genomic regions significantly associated with Heterobasidion virulence in spruce and pine. The upper part of each figure plots the p-values (−log10 scale) for the four traits (up- and downstem combined) to the genomic position (in bp). Abbreviations: PFG, fungal growth in pine sapwood; PLL, lesion length in pine; SFG, fungal growth in spruce sapwood; SLL, lesion length in spruce. The lower part displays linkage disequilibrium (LD) heat maps. The heat map illustrates the LD value r2 from white to red where red indicates high r2 -values. Significant SNP markers are in red. (A) Contig 45322; (B) Contig 50191.
Maria Jonsson is acknowledged for her technical laboratory assistance. The linguistic review was kindly provided by Caroline Woods and Heriberto Vélëz. Many thanks go to Ann Christin Rönnberg-Wästljung for her helpful advice on association mapping. Per Normann is warmly acknowledged for rendering scripts to handle large data sets.
Sequencing, assembly and SNP-extraction: KH MB-D. Virulence assay: KD ML ÅO. Association mapping and LD-analysis: KD. Annotation of gene models: KH ÅO KD. Population structure analysis: MB-D. Overall supervision of project: ÅO JS. Conceived and designed the experiments: JS ÅO. Performed the experiments: KD ÅO ML. Analyzed the data: KD KH ÅO MB-D JS ML. Contributed reagents/materials/analysis tools: ÅO KD MB-D ML KH. Wrote the paper: KD KH ÅO ML MB-D JS.
- 1. Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, et al. (2009) Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A 106: 9362–9367. doi: 10.1073/pnas.0903103106
- 2. Atwell S, Huang YS, Vilhjalmsson BJ, Willems G, Horton M, et al. (2010) Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature 465: 627–631. doi: 10.1038/nature08800
- 3. Hall D, Tegstrom C, Ingvarsson PK (2010) Using association mapping to dissect the genetic basis of complex traits in plants. Brief Funct Genomics 9: 157–165. doi: 10.1093/bfgp/elp048
- 4. Connelly CF, Akey JM (2012) On the prospects of whole-genome association mapping in Saccharomyces cerevisiae. Genetics.
- 5. Muller LAH, Lucas JE, Georgianna DR, McCusker JH (2011) Genome-wide association analysis of clinical vs. nonclinical origin provides insights into Saccharomyces cerevisiae pathogenesis. Mol Ecol 20: 4085–4097. doi: 10.1111/j.1365-294x.2011.05225.x
- 6. Santoyo F, Gonzalez AE, Terron MC, Ramirez L, Pisabarro AG (2008) Quantitative linkage mapping of lignin-degrading enzymatic activities in Pleurotus ostreatus. Enzyme Microb Technol 43: 137–143. doi: 10.1016/j.enzmictec.2007.11.007
- 7. Cumagun CJR, Bowden RL, Jurgenson JE, Leslie JF, Miedaner T (2004) Genetic mapping of pathogenicity and aggressiveness of Gibberella zeae (Fusarium graminearum) toward wheat. Phytopathology 94: 520–526. doi: 10.1094/phyto.2004.94.5.520
- 8. Hawthorne B, Rees-George J, Bowen J, Ball R (1997) A single locus with a large effect on virulence in Nectria haematococca MPI. Fungal Genet Newsl 44: 24–26.
- 9. Lind M, Dalman K, Stenlid J, Karlsson B, Olson A (2007) Identification of quantitative trait loci affecting virulence in the basidiomycete Heterobasidion annosum s.l. Curr Genet 52: 35–44. doi: 10.1007/s00294-007-0137-y
- 10. Olson A, Stenlid J (2001) Plant pathogens - Mitochondrial control of fungal hybrid virulence. Nature 411: 438. doi: 10.1038/35078147
- 11. Neafsey DE, Barker BM, Sharpton TJ, Stajich JE, Park DJ, et al. (2010) Population genomic sequencing of Coccidioides fungi reveals recent hybridization and transposon control. Genome Res 20: 938–946. doi: 10.1101/gr.103911.109
- 12. Liti G, Carter DM, Moses AM, Warringer J, Parts L, et al. (2009) Population genomics of domestic and wild yeasts. Nature 458: 337–341. doi: 10.1038/nature07743
- 13. Ellison CE, Hall C, Kowbel D, Welch J, Brem RB, et al. (2011) Population genomics and local adaptation in wild isolates of a model microbial eukaryote. Proc Natl Acad Sci U S A 108: 2831–2836. doi: 10.1073/pnas.1014971108
- 14. Woodward S, Stenlid J, Karjalainen R, Hüttermann A, editors (1998) Heterobasidion annosum: Biology, Ecology, Impact and Control. Wallingford, UK: CAB International.
- 15. Otrosina WJ, Garbelotto M (2010) Heterobasidion occidentale sp. nov. and Heterobasidion irregulare nom. nov.: A disposition of North American Heterobasidion biological species. Fungal Biol 114: 16–25. doi: 10.1016/j.mycres.2009.09.001
- 16. Niemelä T, Korhonen K (1998) Taxonomy of the genus Heterobasidion. In: Woodward S, Stenlid J, Karjalainen R, Hüttermann A, editors. Heterobasidion annosum: Biology, Ecology, Impact and Control. Wallingford, UK: CAB International. 1–25.
- 17. Korhonen K, Capretti P, Karjalainen R, Stenlid J (1998) Distribution of Heterobasidion annosum intersterility groups in Europe. In: Woodward S, Stenlid J, Karjalainen R, Hüttermann A, editors. Heterobasidion annosum: Biology, Ecology, Impact and Control. Wallingford, UK: CAB International. 93–104.
- 18. Zerbino DR, Birney E (2008) Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18: 821–829. doi: 10.1101/gr.074492.107
- 19. Zerbino DR, McEwen GK, Margulies EH, Birney E (2009) Pebble and Rock Band: Heuristic Resolution of Repeats and Scaffolding in the Velvet Short-Read de Novo Assembler. PLoS One 4.
- 20. Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155: 945–959.
- 21. Falush D, Stephens M, Pritchard JK (2003) Inference of population structure using multilocus genotype data: Linked loci and correlated allele frequencies. Genetics 164: 1567–1587.
- 22. Falush D, Stephens M, Pritchard JK (2007) Inference of population structure using multilocus genotype data: dominant markers and null alleles. Mol Ecol Notes 7: 574–578. doi: 10.1111/j.1471-8286.2007.01758.x
- 23. Olson A, Aerts A, Asiegbu F, Belbahri L, Bouzid O, et al. (2012) Insight into trade-off between wood decay and parasitism from the genome of a fungal forest pathogen. New Phytol 194: 1001–1013. doi: 10.1111/j.1469-8137.2012.04128.x
- 24. Hirschhorn JN, Daly MJ (2005) Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 6: 95–108. doi: 10.1038/nrg1521
- 25. Aranzana MJ, Kim S, Zhao KY, Bakker E, Horton M, et al. (2005) Genome-wide association mapping in Arabidopsis identifies previously known flowering time and pathogen resistance genes. PLoS Genet 1: 531–539. doi: 10.1371/journal.pgen.0010060.eor
- 26. Lind M, van der Nest M, Olson Å, Brandström-Durling M, Stenlid J (2012) A 2nd generation linkage map of Heterobasidion annosum s.l. based on in silico anchoring of AFLP markers. PLoS One 7: e48347. doi: 10.1371/journal.pone.0048347
- 27. Yan J, Shah T, Warburton ML, Buckler ES, McMullen MD, et al. (2009) Genetic characterization and linkage disequilibrium estimation of a global maize collection using SNP markers. PLoS One 4.
- 28. Pandelova I, Ciuffetti LM (2005) A proteomics-based approach for identification of the ToxD gene. Fungal Genet Newsl 52.
- 29. Rao PV, Krishna CM, Zigler JS (1992) Identification and characterization of the enzymatic activity of zeta-crystallin from guinea pig lens. A novel NADPH:quinone oxidoreductase. J Biol Chem 267: 96–102.
- 30. Andrie RM, Pandelova I, Ciuffetti LM (2007) A combination of phenotypic and genotypic characterization strengthens Pyrenophora tritici-repentis race identification. Phytopathology 97: 694–701. doi: 10.1094/phyto-97-6-0694
- 31. Rawlings ND, Barrett AJ (1994) Families of serine peptidases. In: Barrett AJ, editor. Methods Enzymol. New York: Academic Press. 19–61.
- 32. Ladds G, Rasmussen EM, Young T, Nielsen O, Davey J (1996) The sxa2-dependent inactivation of the P-factor mating pheromone in the fission yeast Schizosaccharomyces pombe. Mol Microbiol 20: 35–42. doi: 10.1111/j.1365-2958.1996.tb02486.x
- 33. Mylonakis E, Idnurm A, Moreno R, El Khoury J, Rottman JB, et al. (2004) Cryptococcus neoformans Kin1 protein kinase homologue, identified through a Caenorhabditis elegans screen, promotes virulence in mammals. Mol Microbiol 54: 407–419. doi: 10.1111/j.1365-2958.2004.04310.x
- 34. Lawton MP, Cashman JR, Cresteil T, Dolphin CT, Elfarra AA, et al. (1994) A nomenclature for the mammalian flavin-containing monooxygenase gene family based on amino acid sequence identities. Arch Biochem Biophys 308: 254–257. doi: 10.1006/abbi.1994.1035
- 35. Koch M, Vorwerk S, Masur C, Sharifi-Sirchi G, Olivieri N, et al. (2006) A role for a flavin-containing mono-oxygenase in resistance against microbial pathogens in Arabidopsis. Plant J 47: 629–639. doi: 10.1111/j.1365-313x.2006.02813.x
- 36. Sugiura R, Sio SO, Shuntoh H, Kuno T (2001) Molecular genetic analysis of the calcineurin signaling pathways. Cell Mol Life Sci 58: 278–288. doi: 10.1007/pl00000855
- 37. Karababa M, Valentino E, Pardini G, Coste AT, Bille J, et al. (2006) CRZ1, a target of the calcineurin pathway in Candida albicans. Mol Microbiol 59: 1429–1451. doi: 10.1111/j.1365-2958.2005.05037.x
- 38. Egan JD, Garcia-Pedrajas MD, Andrews DL, Gold SE (2009) Calcineurin is an antagonist to PKA protein phosphorylation required for postmating filamentation and virulence, while PP2A is required for viability in Ustilago maydis. Mol Plant Microbe Interact 22: 1293–1301. doi: 10.1094/mpmi-22-10-1293
- 39. Harel A, Bercovich S, Yarden O (2006) Calcineurin is required for sclerotial development and pathogenicity of Sclerotinia sclerotiorum in an oxalic acid-independent manner. Mol Plant Microbe Interact 19: 682–693. doi: 10.1094/mpmi-19-0682
- 40. Schumacher J, de Larrinoa IF, Tudzynski B (2008) Calcineurin-responsive zinc finger transcription factor CRZ1 of Botrytis cinerea is required for growth, development, and full virulence on bean plants. Eukaryotic Cell 7: 584–601. doi: 10.1128/ec.00426-07
- 41. Choi J, Kim Y, Kim S, Park J, Lee Y-H (2009) MoCRZ1, a gene encoding a calcineurin-responsive transcription factor, regulates fungal growth and pathogenicity of Magnaporthe oryzae. Fungal Genet Biol 46: 243–254. doi: 10.1016/j.fgb.2013.12.002
- 42. Zhang H, Zhao Q, Liu K, Zhang Z, Wang Y, et al. (2009) MgCRZ1, a transcription factor of Magnaporthe grisea, controls growth, development and is involved in full virulence. FEMS Microbiol Lett 293: 160–169. doi: 10.1111/j.1574-6968.2009.01524.x
- 43. Kim J-E, Myong K, Shim W-B, Yun S-H, Lee Y-W (2007) Functional characterization of acetylglutamate synthase and phosphoribosylamine-glycine ligase genes in Gibberella zeae. Curr Genet 51: 99–108. doi: 10.1007/s00294-006-0110-1
- 44. Huser A, Takahara H, Schmalenbach W, O'Connell R (2009) Discovery of pathogenicity genes in the crucifer anthracnose fungus Colletotrichum higginsianum, using random insertional mutagenesis. Mol Plant Microbe Interact 22: 143–156. doi: 10.1094/mpmi-22-2-0143
- 45. Rao NN, Gomez-Garcia MR, Kornberg A (2009) Inorganic polyphosphate: Essential for growth and survival. Annu Rev Biochem. 605–647.
- 46. Zhang Q, Li Y, Tang CM (2010) The role of the exopolyphosphatase PPX in avoidance by Neisseria meningitidis of complement-mediated killing. J Biol Chem 285: 34259–34268. doi: 10.1074/jbc.m110.154393
- 47. Kelly MT, MacCallum DM, Clancy SD, Odds FC, Brown AJP, et al. (2004) The Candida albicans CaACE2 gene affects morphogenesis, adherence and virulence. Mol Microbiol 53: 969–983. doi: 10.1111/j.1365-2958.2004.04185.x
- 48. Stenlid J (1985) Population structure of Heterobasidion annosum as determined by somatic incompatibility, sexual incompatibility, and isoenzyme patterns. Can J Bot 63: 2268–2273. doi: 10.1139/b85-322
- 49. Lind M, Olson Å, Stenlid J (2005) An AFLP-markers based genetic linkage map of Heterobasidion annosum locating intersterility genes. Fungal Genet Biol 42: 519–527. doi: 10.1016/j.fgb.2005.03.005
- 50. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, et al. (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics 25: 2078–2079. doi: 10.1093/bioinformatics/btp352
- 51. Li H, Ruan J, Durbin R (2008) Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res 18: 1851–1858. doi: 10.1101/gr.078212.108
- 52. Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, et al. (2007) TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23: 2633–2635. doi: 10.1093/bioinformatics/btm308
- 53. Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, et al. (2000) Artemis: sequence visualization and annotation. Bioinformatics 16: 944–945. doi: 10.1093/bioinformatics/16.10.944
- 54. Tamura K, Dudley J, Nei M, Kumar S (2007) MEGA4: Molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol Biol Evol 24: 1596–1599. doi: 10.1093/molbev/msm092