Complete mitochondrial genome of the Verticillium-wilt causing plant pathogen Verticillium nonalfalfae

Verticillium nonalfalfae is a fungal plant pathogen that causes wilt disease by colonizing the vascular tissues of host plants. The disease induced by hop isolates of V. nonalfalfae manifests in two different forms, ranging from mild symptoms to complete plant dieback, caused by mild and lethal pathotypes, respectively. Pathogenicity variations between the causal strains have been attributed to differences in genomic sequences and perhaps also to differences in their mitochondrial genomes. We used data from our recent Illumina NGS-based project of genome sequencing V. nonalfalfae to study the mitochondrial genomes of its different strains. The aim of the research was to prepare a V. nonalfalfae reference mitochondrial genome and to determine its phylogenetic placement in the fungal kingdom. The resulting 26,139 bp circular DNA molecule contains a full complement of the 14 "standard" fungal mitochondrial protein-coding genes of the electron transport chain and ATP synthase subunits, together with a small rRNA subunit, a large rRNA subunit, which contains ribosomal protein S3 encoded within a type IA-intron and 26 tRNAs. Phylogenetic analysis of this mitochondrial genome placed it in the Verticillium spp. lineage in the Glomerellales group, which is also supported by previous phylogenetic studies based on nuclear markers. The clustering with the closely related Verticillium dahliae mitochondrial genome showed a very conserved synteny and a high sequence similarity. Two distinguishing mitochondrial genome features were also found—a potential long non-coding RNA (orf414) contained only in the Verticillium spp. of the fungal kingdom, and a specific fragment length polymorphism observed only in V. dahliae and V. nubilum of all the Verticillium spp., thus showing potential as a species specific biomarker.


Introduction
The Verticillium genus encompasses several invasive, soil-borne species of fungal plant pathogens, which infect over 400 plant species, including herbaceous annuals, perennials and even variety of sizes, ranging from a minimum of 12,055 bp (Genbank NC_021611-Rozella allomycis) to a maximum of 235,849 bp (Genbank NC_021436-Rhizoctonia solani), with an overall average size of 50,512 bp. These size differences are mainly attributed to the numbers and lengths of intronic regions, as well as intergenic regions. For example, [25] reports that intronic regions in mitochondrial genomes can be present in different numbers and lengths, from a few bp up to 5 kb. Despite these size differences, mitochondrial genomes still predominantly contain the conserved set of protein-coding genes [26]. Other sources of fungal mitochondrial genome variation are variability in the mitochondrial gene order, presence of repeats and variable numbers of ORFs with an unknown function. This is similar in plant mitochondrial genomes, which can also have large intergenic, intronic or repetitive regions [27]. Size variability of fungal mitochondrial genomes may also be a consequence of the presence of AT-rich intergenic spacers, comprising significant portions of the mitochondrial DNA [28].
Data from our NGS genome sequencing project enabled the assembly of mitochondrial genomes of V. nonalfalfae strains from hop fields in different geographic locations (Slovenia, Germany and England) and of different pathotypes. Here we describe the completely analyzed V. nonalfalfae mitochondrial genome, which was further used for taxonomic characterization of the species. This work expanded the pool of available genomic resources for fungal mitochondrial genomes, particularly for the Verticillium spp. lineage and confirmed the uniformity of mitochondrial sequences between the different strains of V. nonalfalfae.

Data description and mitochondrial genome assembly
The NGS genomic and RNA-seq data used in this study originate from V. nonalfalfae comparative genome sequencing by our lab of six V. nonalfalfae strains isolated from hop plants, from three geographic locations (Slovenia-T2 and Rec, Germany-P15 and P55, United Kingdom-1953 and1985) and of two virulence levels (mild-Rec, P55 and 1953; lethal-T2, P15 and 1985). The raw genome and transcriptome sequencing data produced by the Illumina Genome Analyzer IIx (150 bp paired end) are available from the NCBI SRA archive under Bioproject PRJNA283258. The following SRA experiments for genomic data were used in the analysis: T2 (SRX1020587, SRX1020589, SRX1020590), Rec (SRX1020612, SRX1020613), P15 (SRX1021611), P55 (SRX1021624), 1953 (SRX1021626), and 1985 (SRX1021650). RNA-seq data for Slovenian strains T2 and Rec are available under SRA experiments accession numbers SRX1020629 and SRX1020679, respectively.
High molecular weight DNA of six fungal strains was obtained from germinating conidia following the protoplasts isolation procedure [29] and CTAB isolation [30], including RNAse treatment. DNA was quantified by NanoVue spectrophotometer and quality checked by agarose gel electrophoresis. Three biological replicates of Slovenian isolates T2 and Rec were grown in liquid xylem simulating media for 72 hours. RNA isolation was performed by TRI-ZOL reagent, isolated RNA was quantified by NanoVue spectrophotometer and integrity checked by formaldehyde gel electrophoresis. Illumina sequencing was performed at IGA Technology Services Srl (Udine, Italy) in a paired-end setup on GAIIx or HiSeq2500 sequencer.
A5 [31], Velvet [32] and CLC Genomics Workbench were used for the initial genome assemblies of the 6 V. nonalfalfae strains. Their mitochondrial genomes were identified from the assembled contigs by using the Blast+ suite [33] to align the contigs to the V. dahliae genome [10]. Only 1 contig per strain showed similarity on the nucleotide level to the published V. dahliae mitochondrial genome (NC_008248.1). Similarity between strains was investigated by remapping the sequencing reads to the assembled reference mitochondrial sequence using the 'Map Reads to Reference' tool and the 'Basic Variant Detection Tool' of the CLC Genomics package. However, all of the identified six mitochondrial contigs were non-polymorphic (100% identity), so further downstream analyses were performed on this one common mitochondrial sequence. Nucmer software [34] was used to perform global alignment between the mitochondrial genomes of V. nonalfalfae and V. dahliae.

Annotation of the mitochondrial genome
The reference gene models were inferred with the Maker2 [35] pipeline, combining ab-initio gene predictor GeneMark [36], homology-based alignments of 871 Verticillium spp. fungal mitochondrial proteins (NCBI Entrez query: "fungi[Organism] AND verticillium AND mitochondrial", September 2014) from the 'nr' collection of Genbank with Exonerate [37] and assembled transcripts from RNA-Seq data. The Verticillium transcripts were prepared by mapping RNA-Seq data to the mitochondrial genome with Tophat [38] and assembling transcripts with Cufflinks [39]. Assembled transcripts were screened for candidate coding regions and finalized with TransDecoder [40].
tRNAs were annotated with tRNAscan-SE (v1.21) [41] and their secondary structures were predicted with ARWEN [42]. Secondary structures were visualized with the VARNA [43] visualization java applet. The mitochondrial genome was additionally examined for tRNAs, rRNAs and introns with RNAweasel [44] and MFannot [45]. The codon usage of the mitochondrial genome was calculated with the 'cusp' program of the EMBOSS suite [46]. The programs were run with default parameters, the only exception being setting the genetic code to mitochondrial where applicable.
Initial structural annotations and BAM file tracks of RNA-Seq mapping were loaded onto a local installation of the WebApollo [47] server for visualization and manual curation. After curation, the final dataset was functionally annotated with the Blast2GO suite [48].
The graphic presentation of the annotated mitochondrial genome was prepared with Circos [49].
The complete sequence of the mitochondrial genome of V. nonalfalfae has been deposited in Genbank under accession no. KR704425.

Phylogenetic analysis
We selected a set of publicly available fully annotated fungal mitochondrial genomes from a broad taxonomic spectrum of fungal species, including 15 pathogenic fungi from the Pezizomycotina group and 3 yeasts, both of which represented the group of ascomycetes, and a single basidiomycete pathogen (Table 1) as an outgroup, to conduct the phylogenetic analysis and to determine the placement of V. nonalfalfae among them. Amino acid sequences of the 14 conserved mitochondrial proteins [9] from these species, including four Cytochrome c oxidase subunits (cox1, cox2, cox3 and cob), three ATP synthase subunits (atp6, atp8 and atp9) and seven NADH dehydrogenase subunits (nad1, nad2, nad3, nad4, nad4L, nad5 and nad6) were aligned using Muscle [50] and trimmed with trimAl [51] to remove badly aligned amino acid sequences. The alignments were then concatenated and used to construct a phylogenetic tree.
RAxML's [52] automated model selection was used to determine the best protein model in terms of likelihood on a fixed, reasonable tree, along with the GAMMA model of rate heterogeneity. The final maximum likelihood phylogenetic tree was made by running RAxML's rapid bootstrap algorithm with 100 approximations and a subsequent maximum likelihood search. ETE [53] was used to visualize the resulting trees.
RT-qPCR of long non-coding RNA (orf414) Fungal strains T2 and Rec were maintained and grown on 1/2 Czapek Dox agar plates at room temperature. Fungal mycelium was harvested from cultured plates and RNA of three biological replicates (50 mg) for each strain was isolated using a spin column procedure employing a Spectrum Plant Total RNA Extraction Kit (Sigma-Aldrich). Isolation followed the manufacturer's protocol, including on column DNAse digestion (Sigma-Aldrich). The RNA concentrations and A260/A280 ratios were measured by a NanoVue Spectrophotometer (GE Healthcare). Quantified RNA samples were stored at -80°C.
One μg of each RNA sample was reverse-transcribed to cDNA using a High Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Foster City, USA), employing random hexamer primers. Reverse-transcribed samples were stored at −20°C. Primers ORF414-FOR 5'-ATCCGAGGGAGATGAGACTTCA-3' and ORF414-REV 5'-TCACCTGATCTTTCTT CAACTTCAA-3' for orf414 were constructed using Primer Express version 3.0 (Applied Biosystems) with default parameter settings. As a housekeeping reference, the ubiquitin gene (UBQ) was amplified with primers UBQ-FOR 5'-GACTCGACCTCAAGGGTGAT-3' and UBQ-REV 5'-GTCTTCGTGGTGGTATGCAG-3. The housekeeping reference was selected on the basis of a previous study [54], in which the ubiquitin gene was selected as the most stably expressed in the analysis of V. dahliae. A qPCR reaction mixture was prepared in a 10 μl reaction volume containing 5 μl of FAST SYBR Green PCR Master Mix (Applied Biosystems), 2 μl of cDNA template and 0.6 μM of each specific primer. FAST Amplification was performed in the ABI PRISM 7500 Fast Sequence Detection System (Applied Biosystems, Foster City, USA) with the following cycling program: 95°C for 20 s, 40 cycles at 95°C for 3 s, followed by 60°C for 30 s, followed by melting curve analysis to confirm the amplification of a single PCR product. All samples were amplified in technical triplicates and the mean cycle threshold (Ct) value was calculated to investigate the expression levels of the analyzed orf414. A t-test for differential expression analysis between the mild and lethal strains was conducted with R v3.0.2.

Verticillium mitochondrion specific length polymorphisms
The assembled V. nonalfalfae mitochondrial genome was aligned with the V. dahliae mitochondrial genome to analyze their similarities using Nucmer [55] pairwise alignment. Furthermore, Illumina sequencing reads of 6 V. nonalfalfae strains were mapped to the reference V. dahliae mitochondrial genome using the 'Map reads to reference' tool of the CLC Genomics Workbench. A V. dahliae lineage specific region was found, which was further PCR amplified in different isolates of the Verticillium spp. A conserved primer pair mth-for 5'-CCTCA CGCTTTTGTAAGTTTACCT-3' and mth-rev 5'-AATTCAAACTCGTTAATACATAGCA-3' was constructed using PRIMER3 online software. Primers were used for amplification of a Verticillium species collection (96 samples), consisting of V. albo-atrum (6 samples Table). PCR reactions were performed in a volume of 20 μl containing fungal DNA (1:100 diluted CTAB isolation), 1x supplied PCR buffer, 2 mM MgCl 2 , 0.8 mM dNTP solution and 0.5 U of Taq DNA polymerase (Kapa Biosystems). Amplification cycling conditions were as follows: 95°C for 5 min, followed by 5 cycles of 95°C 30 sec, 65°C 30 sec (at each cycle the temperature was lowered by 1°C) and 72°C 1 min 30 sec, followed by 25 cycles with same time and temperature profiles, except that the annealing temperature was 55°C. Amplification products were resolved on 1.2% agarose gel.

Results
The mitochondrial genome of V. nonalfalfae presented in this work is a circular-mapping DNA molecule of 26,139 bp in size, with a mean GC content of 26.92%. The three assemblers produced the same mitochondrial sequence for the 6 paired-end NGS data sets. Moreover, remapping the NGS reads back to the mitochondrial genome and subsequent SNP analysis did not indicate any polymorphic regions, despite high coverage (average between 499-10,688x) achieved for all 6 data sets ( Table 2). It is comparable in size to the related V. dahliae mitochondrial genome [10], published on Genbank (NC_008248.1) with a length of 27,184 bp and mean %GC content of 27,32%. A sequence-to-sequence Nucmer alignment of their mitochondrial genomes shows a 98.15% sequence identity with 99.35% aligned bases of V. nonalfalfae mitochondria to V. dahliae.

Annotation results and codon usage
The mitochondrial genome of V. nonalfalfae harbors the whole complement of 14 conserved mitochondrial protein-coding genes, namely the subunits of the electron transport chain of complex I (nad1, nad2, nad3, nad4, nad4L, nad5 and nad6), complex III (cox1, cox2, cox3), complex IV (cob) and the ATP synthase subunits (atp6, atp8 and atp9). Other features also present in the assembled mitochondrial genome are: a small rRNA subunit, a large rRNA subunit containing ribosomal protein S3 encoded within a type IA-intron, 26 tRNAs and a potential long non-coding RNA, designated orf414 (Fig 1).
All mitochondrial genes are predicted to be encoded on the same strand, with the exception of the long rRNA subunit, which is encoded on the opposite strand (Fig 1, Tables 3 and 4).
Putative secondary structures of the tRNA molecules were also determined. It was shown that all of them can fold into a common cloverleaf structure made out of the acceptor stem, D-  Table) and the genes were analyzed with respect to their start and stop codons (Table 3). Among the predicted genes, the 'AUG' initiation codon was mostly prevalent, with only the genes cox3 and rps3 using the alternative 'AUA' initiation codon. The most frequent stop codon was 'UAA', followed by two alternative stop codons exhibiting similar frequency. The 26 tRNA genes ( Table 5) coded for 20 of the 22 common amino acids (excluding Ornithine and Selenocysteine), with the anti-codons ACG (Arginine), GCU (Serine), CAU (Methionine) and UAG/ UAA (Leucine) being present in more than 1 copy. In view of the described annotation information, only 16.52% of the mitochondrial genome is represented by intergenic regions.
Characterization and expression profile of the long non-coding RNA-orf414 BLASTn and BLASTx algorithms were used for characterizing the orf414 sequence (2850..3638, length 789 bp) against 10 NCBI databases. Nucleotide comparisons revealed that this particular region is specific to the Verticillium lineage, since significant hits (e>10 −5 , >50% of query) were only found in V. dahliae sequences. Moreover, the majority of them were of V. dahliae mitochondrial origin, with 96% identity and 100% query coverage, with the exception of the chr2 genomic sequence in the nt database of JR2 V. dahliae assembly [58] which had 93% identity and 88% of query coverage. Only one significant similarity to other species was found in the EST division, with cDNA clone of Phytophtora infestans, which showed 95% identity over 56% of the orf414 sequence (S1 File). A comparison with the protein database (BLASTx) did not reveal any significant hits.
We performed a RT-qPCR analysis to confirm the in vivo expression of orf414 in two Slovenian fungal strains of different pathotypes (Rec and T2) using three biological replicates. The expression of orf414 in the mild strain was 1.2 fold higher than that in the lethal strain (Fig 3).
The expression values were normalized with ubiquitin (ubq) as a housekeeping reference and were compared between strains with an unpaired t-test. The mean difference in expression levels of orf414 between the mild strain (mean = 0.078, sd = 0.061, se = 0.043) and the lethal strain (mean = -0.193, sd = 0.146, se = 0.073) was evaluated with a p-value of 0.07373 and a 95% confidence interval of [-0.041, 0.583], from which we conclude that the expression of orf414 between strains is of similar levels.   [57]. Because of rRNA counts overwhelming the remaining expression profiles, the count number on the graph was capped below 17.5% of the top count profiles (a noncapped graph would show only rRNA expressed), in order to enable visualization of low-expressed regions. Following this track are GC-skew and GCcontent tracks, respectively. GC-skew [(G-C)/(G+C)] reflects the relative number of cytosine to guanine and is often used to describe the strand-specific bias of a nucleotide composition. The GC-skew track is shown as a histogram of 250bp sliding windows with calculated gc-skew coefficients. Green regions represent windows for which the coefficient is larger than 0 and red regions windows for which the coefficient is smaller than 0. The neighbouring grayscale heatmap of the GC-content track represents 100bp sliding windows with calculated gc-contents. Regions in the GC content heatmap are shaded in gray, where darker gray represents higher gc-content and lighter gray represents lower gc-content. The two tracks show a similar pattern to other Sordariomycetes and were also used to scan for anomalies in GC content, which could indicate the introduction of heterologous DNA. No anomalies indicating such an event were detected. The cumulative GC skew analysis was also used to try and find the origins of replication and termination of replication loci (data not shown) but we could not determine them with this analysis. doi:10.1371/journal.pone.0148525.g001

Phylogeny results
In order to infer phylogenetic relationships between the fungal species, a maximum likelihood approach was used. Concatenated amino acid sequences of conserved proteins from the 20 fungal species (Table 1) were used to perform the phylogenetic analysis. The resulting phylogenetic tree (Fig 4) shows high bootstrap support for the individual groups. Since mitochondria within these species are considered to be circular molecules, the alignment start position was arbitrarily decided as the cox1 gene, which all species in this study possess.
The conserved set of 14 mitochondrial protein-encoding genes is present in most of the species used in this analysis. The exceptions are S. cerevisiae and S. pombe yeasts, which lack the NADH dehydrogenase family of genes, C. lindemuthianum, which lacks the nad2 gene, N. crassa, which has two copies of the nad2 gene and R. orthosporum and P. anserina, which lack the atp9 gene.
The analysis yielded 3 clades, which correspond to the Glomerellales, Hypocreales and Sordariales groups of fungi within the Sordariomycetes family. Another clade, consisting of S. borealis and R. orthosporum, corresponds to the Helotiales group. The yeasts (S. cerevisiae, S. pombe, C. albicans) clustered together and U. maydis is shown as an outgroup. V. nonalfalfae was positioned in the Glomerellales group, together with C. lindemuthianum and the closely related V. dahliae.

Analysis of the Verticillium dahliae specific length polymorphism
When aligning the V. nonalfalfae and V. dahliae mitochondrial genomes, a V. dahliae-specific region (1,221 bp long) not present in V. nonalfalfae was revealed between the gene coxI and a tRNA coding for proline. Mapping the mitochondrial sequencing reads of our V. nonalfalfae strains to the V. dahliae mitochondrial genome also supported this finding (S2 Fig). We further analyzed the region with PCR amplification in 96 different Verticillium spp. isolates (S2 Table). The primers were designed in the conserved regions of the V. dahliae and V. nonalfalfae region with expected amplicon sizes of 1400 bp (V. dahliae) and 400 bp (V. nonalfalfae). Amplification of the region was successful in 22 out of 26 V. dahliae isolates showing a longer amplicon and in 44 out of 49 analyzed V. nonalfalfae species with a shorter amplicon. The longer amplicon was also characteristic of V. nubilum species, while the 400 bp amplicon was also amplified in V. alfalfae and V. longisporum. Amplification in species V. albo-atrum, V. isaacii, V. nigrescens and V. tricorpus did not yield any fragment.

Discussion
The assembled mitochondrial contigs of six unique V. nonalfalfae strains were compared to reveal a possible pattern of variation. A similar approach was used by [59] to distinguish among 18 strains of Lachancea kluyveri, which had been isolated from various geographical locations and ecological niches. They found great diversity in the intergenic regions, with variants and indels, and also highly conserved coding regions in the mitochondrial genomes. However, in our case, the mitochondrial genomes did not reveal any variation. Assemblies of 3 different assemblers showed that their nucleotide sequences are the same. Re-mapping analysis further confirmed that no variants exist in the sets of NGS sequencing data for the six strains analyzed. Another more frequent way of comparing mitochondrial genomes is inter-species, rather than an intra-species approach [13,26,27,56,60]. An analysis of two related soybean rust pathogens based on their mitochondrial genomes is such an example [14]. Mitochondrial genomes of Phakopsora pachyrhizi and Phakopsora meibomiae were sequenced, had their genes annotated and were phylogenetically compared to other members of Basidiomycota for taxonomical characterization. The results showed that the order of protein-coding genes and tRNA is identical in the two Phakopsora species and all genes are transcribed from the same DNA strand clockwise, while the sizes of their mitochondrial genomes are also quite similar (31,825bp-P. pachyrhizi, 32,520bp-P. meibomiae). In our case, a comparison of V. nonalfalfae (26,139bp) with the available mitochondrial genome of the closely related V. dahliae (27,184bp), showed comparable sizes, a conserved gene order and species specific nucleotide changes, contributing to a 98.15% nucleotide-level identity between mitochondrial genomes.
Our reported V. nonalfalfae mitochondrial genome is among the smallest published genomes, yielding in total 26,139 bp (max-S. borealis: 203,051 bp, min-L. muscarium: 24,499 bp) and has an average GC content of 26.92%, which is quite similar to the GC content of the mitochondrial genome of V. dahliae (27.32%), although it is still fairly low in terms of the average GC contents of other fungal species considered in this work (29.16%, Table 1). The number of tRNAs seems to be on a par with other members of the Glomerellales and Hypocreales groups (Table 1). Most of the mitochondrial protein-coding genes implicated in oxidative-phosphorylation and ATP synthesis are highly conserved within fungal mitochondrial genomes [10,14,17,21,22]. This also proved to be the case in the V. nonalfalfae mitochondrial genome, since it harbors the whole complement of 14 "standard" mitochondrial protein-coding genes. Most of the predicted features were placed on the same strand of the mitochondrial genome, with the exception of the long rRNA subunit, which was predicted to be on the opposite strand. Features being mostly encoded on the same strand is very common among Ascomycete mitochondrial genomes [10,14,19,61]. Another interesting feature indicating the taxonomic placement of the V. nonalfalfae mitochondrial genome is the rps3 gene, encoded inside an intron, which is reported to be a common feature of the Sordariomycete family of fungal mitochondrial genomes [62]. This gene encodes a ribosomal protein that is a component of the 40S subunit, where it forms part of the domain in which translation is initiated. All of the species analyzed in this study that are members of the Sordariomycetes family contain this intronencoded rps3 gene in their mitochondrial genome according to their Genbank entries (F. graminearum, F. oxysporum, F. solani, L. muscarium, M. chlamydosporia, M. anisopliae, N. crassa, P. anserina, R. orthosporum, S. borealis, V. dahliae). It seemed at first that A. chrysogenum lacked an rps3 gene and C. lindemuthianum lacked the long rRNA subunit, judging from their Genbank entries, but an in-house analysis revealed both genes in the respective genomes (data not shown).
The gene order for the V. nonalfalfae mitochondrial genome, starting with cox2, is cox2--atp9-nad6-cox3-atp6-atp8-nad4-nad1-nad3-nad2-cox1-cob-nad5-nad4L, which is the same gene order as in the V. dahliae mitochondrial genome. Beyond the conserved gene order, RNA features also show conserved placement, such that tRNAs and rRNAs have almost the same placement on both of these mitochondrial genomes, with the exception of V. dahliae missing the tRNA-Cys(GCA) and thus having only 25 predicted tRNAs.
These indications of V. nonalfalfae taxonomic placement were further supported by our sequence-based phylogenetic analysis, which showed how the V. nonalfalfae mitochondrial genome was positioned in the Verticillium spp. lineage of the Glomerellales group, with V. dahliae being the closest relative. Our mitochondrial phylogenetic tree (Fig 4) is concordant with the nuclear genome-based phylogenetic tree, in relation to the Sordariomycetes group [63], even though mitochondria are prone to an accelerated rate of mutation compared to the nuclear genome, together with a higher diversity in gene order and in non-coding regions [13]. An example nuclear-genome based phylogenetic tree [63] shows similar groups of species as those that can be seen in the mitochondrial-based phylogenetic tree, such as the Sordariomycetes comprising the Verticillium spp. in the Glomerellales group, the Fusarium spp. in the Hypocreales group and N. crassa with P. anserina in the Sordariales group. Other groups do not show immediate resemblance with the genome phylogenetic tree but that may be a consequence of low resolution, due to a lack of species of that part of the taxonomy in the phylogenetic analysis.
Another special feature of the V. nonalfalfae mitochondrial genome is the presence of the potential long non-coding RNA (orf414, Fig 1). orf414 was annotated as a long non-coding RNA because no significant similarity hits were found within the nr database with the BLASTx algorithm. However, it was supported by RNA-Seq data and a further qPCR profiling experiment showed that it was expressed in both Slovenian pathotypes. Running a BLASTn alignment Maximum likelihood phylogenetic tree of 20 mitochondrial genomes, based on a conserved set of proteins. The tree was inferred from an alignment of amino-acid sequences of conserved mitochondrial proteins with 3093 distinct alignment positions and 100 rapid bootstrap inferences. The gamma model of rate heterogeneity and a maximum likelihood estimate of the alpha-parameter were used to prepare the final tree. Numbers above tree nodes represent the bootstrap support values. Next to the tree is a graphical presentation of an alignment of mitochondrial protein-coding genes and their order in the represented species. The conserved set of 14 protein-encoding genes is present in most of the species, with the exception of S. cerevisiae and S. pombe yeasts, which lack the NADH dehydrogenase family of genes, C. lindemuthianum, which lacks the nad2 gene, N. crassa, which has two copies of the nad2 gene and R. orthosporum and P. anserina, which lack the atp9 gene. The phylogenetic tree shows V. nonalfalfae joined with V. dahliae and C. lindemuthianum in a cluster corresponding to the established group of fungi called Glomerellales. C. lindemuthianum can be seen to have a very different gene order compared to the other two members of its group. The Hypocreales and Sordariales groups of the Sordariomycetes can also be seen in the tree. A. chrysogenum of the Hypocreales group contains a translocation of the cox2 gene, which distinguishes it from the rest of the members of its group. The Glomerellales and Hypocreales groups show a high degree of synteny within their respective groups and differ only by a translocation of the nad2-nad3 gene cluster. of orf414 on the fungal genomes of the JGI MycoCosm and NCBI nt databases shows the sequence being unique only to the mitochondrion of the Verticillium lineage (V. dahliae). Aligning the sequence to these mitochondrial genomes showed that its location in V. dahliae is at the locus between the cox3 and nad6 genes, which is the same locus as in V. nonalfalfae, with a 95% sequence identity. However, an alignment of orf414 to the V. alfalfae mitochondrial genome available in the WGS section of the NCBI databases showed only a partial hit, with 40% similarity. Interestingly, in a previous work [64], the same region was used as a tool for differentiating among isolates of V. dahliae. In that work, the region between the genes cox3 and nadh6 (our orf414) was found to be highly polymorphic and useful as a marker to separate V. dahliae isolates into subgroups.
Our comparison of V. nonalfalfae and V. dahliae mitochondrial genomes revealed another interesting feature: an additional region (1,221 bp) between the cox1 gene and a proline-coding tRNA in V. dahliae. The PCR amplification experiment showed almost all of our V. dahliae isolates containing that region, together with one isolate of V. nubilum, which was evident from the amplification of a 1400 bp fragment (S2 Table). In view of these results, this additional region could potentially be used as a discriminator between V. dahliae and other Verticillium species such as V. nonalfalfae. Current molecular methods of distinguishing among these species rely on sequencing the ITS region of the ribosomal RNA with specific PCR primers [65]. We believe our proposed molecular marker also has the advantage of a multi-copy mitochondrial molecular marker, which might entail faster and more specific pathogen detection directly in soil samples. A molecular marker with similar traits has already been reported for V. dahliae, which is based on conserved PCR primers in the mitochondrial small rRNA gene region [66].

Conclusions
Assembly of the V. nonalfalfae mitochondrial genome resulted in a 26,139 bp DNA molecule with genomic features reminiscent of many other ascomycete fungal mitochondrial genomes, especially to that of V. dahliae. It harbors a set of 14 oxidative-phosphorylation and ATP synthesis protein-coding genes, small and large rRNA subunits, a ribosomal protein S3 encoded within a type-IA intron and 26 tRNAs, which are common features of Sordariomycete mitochondrial genomes. One interesting additional feature is the presence of a potential long noncoding RNA (orf414) in a region that has already been used in a study for differentiating among isolates of V. dahliae [65]. At this point, the region of orf414 could only be found in V. dahliae and V. nonalfalfae, so we regard it as a unique feature of these 2 species. Using the annotated protein sequences for phylogenetic tree construction, we confirmed this mitochondrial genome to cluster along with that from the closely related V. dahliae in the Glomerellales group of fungi, with a very conserved synteny. This grouping within the Glomerellales group is also supported by nuclear genome-data based clustering [63].
The other interesting feature is the additional sequence present in V. dahliae but absent in V. nonalfalfae, which might be used as a biomarker to distinguish between these two species.