
De novo Genome Assembly of the Fungal Plant Pathogen Pyrenophora semeniperda

  • Marcus M. Soliai,

    Affiliation Plant and Wildlife Sciences, Brigham Young University, Provo, Utah, United States of America

  • Susan E. Meyer,

    Affiliation USDA Forest Service, Rocky Mountain Research Station, Shrub Sciences Laboratory, Provo, Utah, United States of America

  • Joshua A. Udall,

    Affiliation Plant and Wildlife Sciences, Brigham Young University, Provo, Utah, United States of America

  • David E. Elzinga,

    Affiliation Plant and Wildlife Sciences, Brigham Young University, Provo, Utah, United States of America

  • Russell A. Hermansen,

    Affiliation Plant and Wildlife Sciences, Brigham Young University, Provo, Utah, United States of America

  • Paul M. Bodily,

    Affiliation Plant and Wildlife Sciences, Brigham Young University, Provo, Utah, United States of America

  • Aaron A. Hart,

    Affiliation Plant and Wildlife Sciences, Brigham Young University, Provo, Utah, United States of America

  • Craig E. Coleman

    craig_coleman@byu.edu

    Affiliation Plant and Wildlife Sciences, Brigham Young University, Provo, Utah, United States of America


Abstract

Pyrenophora semeniperda (anamorph Drechslera campulata) is a necrotrophic fungal seed pathogen that has a wide host range within the Poaceae. One of its hosts is cheatgrass (Bromus tectorum), a species exotic to the United States that has invaded natural ecosystems of the Intermountain West. As a natural pathogen of cheatgrass, P. semeniperda has potential as a biocontrol agent due to its effectiveness at killing seeds within the seed bank; however, few genetic resources exist for the fungus. Here, the genome of a P. semeniperda isolate, assembled from 454 pyrosequencing reads, is presented. The total assembly is 32.5 Mb and includes 11,453 gene models encoding putative proteins larger than 24 amino acids. The models represent a variety of putative genes that are involved in pathogenic pathways typically found in necrotrophic fungi. In addition, extensive rearrangements, including inter- and intrachromosomal rearrangements, were found when the P. semeniperda genome was compared to that of P. tritici-repentis, a related fungal species.

Introduction

The ascomycete fungal genus Pyrenophora (anamorph Drechslera) comprises graminicolous species often associated with leaf-spotting disease in crops and turf grasses [1]. The genus includes the agronomically important species P. teres, P. graminea, and P. tritici-repentis, which are responsible for barley net blotch, barley stripe and wheat tan spot diseases, respectively [2]–[4]. These Pyrenophora species are necrotrophic pathogens and the diseases they cause result in substantial crop losses each year. In contrast to these foliar pathogens, P. semeniperda is primarily a seed pathogen, although leaf spotting has also been reported in plants infected by this fungus [5].

One of the better-characterized Pyrenophora species is P. tritici-repentis, an economically important pathogen of wheat [6]. This ascomycete (anamorph Drechslera tritici-repentis) causes tan spot and chlorosis in its host and is responsible for grain losses averaging 5 to 15% but reaching up to 50% in conditions favoring disease development [7]–[9]. Pyrenophora tritici-repentis infects the leaves of its host using exotoxins that induce necrotic spotting surrounded by chlorotic zones. Manning et al. [10] recently reported the genome sequence of three isolates of P. tritici-repentis using whole-genome Sanger sequencing. The genome annotation yielded over 11,000 genes, which serves as a useful model and reference for the sequencing and annotation of other Pyrenophora genomes. The haploid nuclear genome of the sequenced P. tritici-repentis isolate contains eleven chromosomes with an estimated size of 37 Mb.

Pyrenophora semeniperda (anamorph Drechslera campulata) is also a generalist pathogen on a wide range of grass genera. The host range of P. semeniperda was first described by Wallace in 1959 [11]. Currently, it is believed to infect over 36 genera of annual and perennial grasses [12]. It has been reported to infect developing seeds under experimental conditions. This infection does not have any effect on seed maturation but effectively reduces subsequent seed germination and emergence of its hosts [13]–[15]. Under natural conditions in the field, the pathogen primarily attacks mature seeds in the seed bank [14]. Black stromata protruding out of dead seeds are characteristic of infection by the fungus.

Interest has been expressed in using P. semeniperda as a biocontrol agent against cheatgrass (Bromus tectorum) [16]–[18], an invasive weed in the Intermountain West (IMW) of the United States. Cheatgrass is a threat to many ecosystems of the IMW, invading sensitive habitats of native plants and animals, and providing fuel for disastrous wildfires. As a natural pathogen of cheatgrass, P. semeniperda is effective at killing seeds after conidial inoculation [16] and its use as a biocontrol agent may offer a superior alternative to expensive and dangerous conventional methods of control such as herbicides or early season burning. Despite recent interest in the fungus and its potential as a biocontrol agent for cheatgrass, there are very few genetic and genomic resources available to facilitate studies of P. semeniperda biology.

Here, the de novo assembly of the P. semeniperda genome from 454 pyrosequencing reads and its annotation are presented. The small genome size, haploid state, and modest level of repetitive elements within many fungal genomes make the job of de novo assembly relatively simple compared to other larger and more complex eukaryotic genomes [19]. The P. semeniperda sequencing project has four main objectives: 1) Obtain a high-quality draft of the P. semeniperda genome using next-generation sequencing technology, 2) annotate the genome using P. semeniperda ESTs and sequencing data from P. tritici-repentis and other fungal genomes to validate gene models, 3) identify genes involved in pathogenicity, and 4) establish sequence co-linearity and orthology between P. semeniperda and P. tritici-repentis by identifying genomic structural variations. These objectives will help to elucidate factors involved in virulence and other molecular mechanisms that may be used to exploit the fungus to control expansion of cheatgrass populations. Moreover, the work presented here may add to the general knowledge of fungal biology and contribute to the discovery of novel mechanisms of pathogenicity and infection by other fungi.

Materials and Methods

DNA and RNA Isolation

Fungal cultures and tissue were prepared as described by Boose et al. [20]. A single P. semeniperda isolate (CCB06) was prepared from a B. tectorum seed bank sample collected at Cinder Cone Butte, Idaho, USA. The seed bank sample was obtained as part of a cooperative study with the Idaho Army National Guard, which has administrative responsibility for the Orchard Training Area where Cinder Cone Butte is located. DNA was isolated from mycelium using the ZR Fungal/Bacterial DNA MiniPrep™ kit (Zymo Research Corporation, Orange, CA) following the manufacturer's protocol. DNA was quantified using the NanoDrop ND-1000 spectrophotometer (NanoDrop products, Wilmington, DE).

RNA was isolated from two P. semeniperda isolates using the ZR Fungal/Bacterial RNA MiniPrep™ (Zymo Research Corporation, Orange, CA) and stored at −80 °C. RNA was collected from multiple tissue types, including mycelium, fruiting structures, and conidia, of the Cinder Cone Butte isolate used for genome sequencing and of an isolate collected from Skull Valley, Utah, USA. RNA quality and integrity were assessed for each extraction using the RNA Nano 6000 kit and the 2100 Bioanalyzer Expert software (Agilent Technologies, Santa Clara, CA). RNA quantity was measured with the TBS-380 Mini-Fluorometer (Turner Biosystems, Sunnyvale, CA) in combination with the RiboGreen RNA quantitation reagent (Molecular Probes, Eugene, OR).

cDNA Library Construction and Normalization

RNA samples meeting sufficient quality and quantity criteria were pooled together for 1st strand synthesis and cDNA optimization. cDNA was synthesized from pooled RNA using the SMARTer™ PCR cDNA Synthesis Kit (ClonTech, Mountain View, CA) followed by PCR cycling of cDNA with the Advantage HF 2 PCR kit (ClonTech, Mountain View, CA). Modified oligo primers were used to allow for MmeI digestion for 5′ and 3′ adaptor excision: 5′ Smart Oligo [5′-AAG CAG TGG TAA CAA CGC ATC CGA CGC rGrGrG-3′]; 3′ Oligo dT SmartIIA [5′-AAG CAG TGG TAA CAA CGC ATC CGA CTT TTT TTT TTT TTT TTT TTT TTV N-3′]; New SmartIIA [5′-AAG CAG TGG TAA CAA CGC ATC CGA C-3′] (Sandra Clifton, Personal Communication, Washington University).
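The adaptor design above can be checked with a few lines of code. The sketch below assumes the MmeI recognition sequence TCCRAC (R = A or G), which is not stated in the text, and confirms that each modified primer carries one such site in its shared 5′ portion. The helper names are illustrative only.

```python
import re

# Primer sequences as printed in the text (spacing and ribo-G "r" markers kept).
PRIMERS = {
    "5' Smart Oligo": "AAG CAG TGG TAA CAA CGC ATC CGA CGC rGrGrG",
    "3' Oligo dT SmartIIA": "AAG CAG TGG TAA CAA CGC ATC CGA CTT TTT TTT TTT TTT TTT TTT TTV N",
    "New SmartIIA": "AAG CAG TGG TAA CAA CGC ATC CGA C",
}

# Assumed MmeI recognition sequence TCCRAC, written as a regular expression.
MMEI_SITE = re.compile(r"TCC[AG]AC")


def normalize(primer):
    """Strip spaces and the 'r' ribonucleotide prefix, leaving plain base letters."""
    return primer.replace(" ", "").replace("r", "").upper()


def mmei_site_positions(primer):
    """Return 0-based start positions of MmeI sites on the printed strand."""
    return [m.start() for m in MMEI_SITE.finditer(normalize(primer))]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, mmei_site_positions(seq))
```

Only the printed strand is scanned; a fuller check would also scan the reverse complement.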

Normalization of cDNA was accomplished with the Axxora Trimmer cDNA Normalization kit (AxxOra, San Diego, CA). MmeI (New England BioLabs, Ipswich, MA) was used for 5′ and 3′ modified SMART adaptor excision, followed by removal of the excised 5′ and 3′ adaptor ends with AMPure beads (Agencourt Bioscience Corporation, Beverly, MA) according to the manufacturer's recommended protocols.

454 Sequencing

In total, a half plate of a whole genome library, a full plate of a 3-kb paired-end library, and a half plate of normalized cDNA were sequenced on the 454 Life Sciences Genome Sequencer with FLX Titanium series reagents (454 Life Sciences, Branford, CT). Titanium emPCR, library preparation, and sequencing were completed at the Brigham Young University DNA Sequencing Center (Provo, Utah, USA).

454 Reads Assembly and Genome Annotation

De novo genome assembly was accomplished using all of the whole-genome shotgun and 3-kb paired-end reads with the Newbler software package (454 Life Sciences, Branford, CT). Default settings were chosen for the assembly in Newbler. cDNA reads were assembled separately from genomic reads and default settings were chosen for transcript assembly.

The genome annotation pipeline MAKER [21] was used to predict gene models within the de novo assembly of P. semeniperda. Expressed sequence tags (ESTs), derived from the cDNA library, were used to provide evidence for predicted genes within the P. semeniperda genome for the annotation pipeline. An in-house Perl script was created to expedite the naming process, as an automated naming scheme did not exist within the MAKER pipeline.

P. semeniperda gene models were imported into the Blast2GO suite [22], [23] for functional annotation analysis. GO annotations were made in accordance with the recommended protocol in the Blast2GO tutorial. Default settings were chosen along with an e-value threshold of ≤1×10⁻⁶ for each step of the GO annotation process.

Repeated and low complexity sequences within the P. semeniperda genome were identified using RepeatMasker [24] with a fungal repeat library. A slow search was performed for increased accuracy; this setting increases the sensitivity of the search by 0–5% relative to the default.

Genome Assembly Validation

To assess the accuracy of the genome assembly, an automated validation program called amosvalidate was used to highlight regions of the genome that are suspected to be misassembled [25]. The amosvalidate pipeline returns features for each contig and scaffold that are likely to be errors in the assembly such as expansions or contractions of the reads that make up the assembly. Contig assemblies were imported into amosvalidate along with coordinates of the paired-end sequences for analysis. Hawkeye was used to visualize data from amosvalidate [26]. The amosvalidate output data was imported into Hawkeye where each scaffold was visually inspected for assembly errors.

SyMap

SyMap v3.3 (Synteny Mapping and Analysis Program) [27] was used to generate dotplot displays of syntenic relationships between P. semeniperda and P. tritici-repentis. By default, SyMap uses NUCmer [28] for multiple genome alignments via a modified Smith-Waterman algorithm [29]. Gene descriptive information and other features associated with the P. semeniperda genome were imported into SyMap as GFFs (General Feature Files) after the alignment was completed.

Results and Discussion

Sequencing and assembly of the P. semeniperda genome

For sequencing, DNA was extracted from an isolate of P. semeniperda collected at Cinder Cone Butte, ID, USA. Pyrenophora semeniperda is haploid with an unknown chromosome number, although electrophoretic and cytological karyotyping in related species reveals 9 chromosomes in P. teres [30] and from 8 to 11 chromosomes in P. tritici-repentis, depending on the isolate [31]. A shotgun strategy was used with the 454 Life Sciences Genome Sequencer FLX platform, including whole-genome and 3-kb paired-end sequencing libraries. 454 sequencing of the whole-genome shotgun (WGS) library on a half plate produced approximately 257 Mb of sequence with reads averaging 371 bp. The 3-kb paired-end library was sequenced on a full plate and produced over 469.9 Mb of sequence with an average read length of 362.06 bp. In total, 726.9 Mb of sequence was produced from 2,759,755 reads with an average read length of 366.7 bp; 28.11% (775,958) of the total were recognized as paired-end reads by the assembler and consequently used in the genome assembly (Table 1).

The DNA sequence reads were assembled using the Newbler software package developed by 454 Life Sciences for de novo DNA sequence assembly. An incremental assembly approach was used where WGS reads were first used to create contigs based on an overlap, layout and consensus algorithm. The WGS reads were assembled into 7,890 contigs (N50 = 6,499); 98.38% of the total bases were successfully assembled into contigs (252 Mb). The initial assembly yielded a 6-fold coverage of the genome with approximately 31 Mb of aligned sequence. Next, reads from the 3-kb paired-end library were added to the assembly to provide read linkages that would span most repeats in the genome and increase the number of bases available for additional overlap and consensus. The inclusion of paired-end reads reduced the number of gaps from the assembly of the whole-genome reads alone (Table 2). The completed assembly project included 98.62% (672 Mb) of the total bases (681 Mb) and 95.00% of the raw reads (2,621,753 reads) assembled in 1,001 contigs (N50 = 104,587). The final coverage is 17X with the 1,001 contigs arranged into 54 scaffolds (N50 = 1.47 Mb); the 19 largest scaffolds represent 86.5% of the assembled P. semeniperda genome. The average distance between paired ends is 2.6 kb with a standard deviation of 665.5 bp. The estimated genome size of 40.1 Mb for P. semeniperda is similar to the reported size of 37.8 Mb for P. tritici-repentis [10] and 41.9 Mb for P. teres f. teres [30]. The 454 sequencing reads used in the assembly reported here are deposited in the NCBI sequence read archive (Accession: SRP007005). The whole genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession ATLS00000000. The version described in this paper is version ATLS01000000.
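For readers unfamiliar with the summary statistics quoted above, the following minimal sketch shows how an N50 value and a fold-coverage figure are conventionally computed. It assumes coverage is defined as assembled read bases divided by the estimated genome size; the text does not state how Newbler derived its figures, and the contig lengths in the toy example are invented for illustration.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all bases in the set."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def fold_coverage(assembled_read_bases, genome_size):
    """Average depth, assuming depth = read bases placed / genome size."""
    return assembled_read_bases / genome_size


if __name__ == "__main__":
    # Invented contig lengths: total 300, so the N50 is reached at 80 + 70 = 150.
    print(n50([80, 70, 50, 40, 30, 20, 10]))
    # Figures from the text: 252 Mb (WGS only) and 672 Mb (all reads), 40.1 Mb genome.
    print(round(fold_coverage(252e6, 40.1e6)), round(fold_coverage(672e6, 40.1e6)))
```

Under this assumed definition, the quoted base counts and genome size estimate reproduce the reported 6-fold and 17X coverage values after rounding.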

Genome Assembly Validation

The genome assembly was validated with amosvalidate and visualized with Hawkeye [25], [26]. Scaffolds and contigs were sorted from highest to lowest feature density and analyzed for major mis-assemblies in Hawkeye. With the exception of scaffold 30, the Compression-Expansion (CE) statistic for the majority of scaffolds remains close to 0 and within the default interval of −3 to +3, which indicates that reads and mate pairs are likely to be assembled correctly. Clustering of compressed or expanded mate pairs is an indication of mis-assembly and was not prevalent in the assembly. Validation results indicate 9 possible inversions in 9 different scaffolds and 1 possible insertion.
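The CE statistic is not defined in the text. A commonly cited definition, assumed here, compares the mean insert size of the mate pairs spanning a position with the library mean, scaled by the standard error. The sketch uses the library mean (2.6 kb) and standard deviation (665.5 bp) reported above; the local insert sizes are invented for illustration.

```python
import math


def ce_statistic(local_inserts, library_mean, library_sd):
    """Assumed definition: (local mean - library mean) / (library sd / sqrt(n))."""
    n = len(local_inserts)
    if n == 0:
        raise ValueError("at least one spanning mate pair is required")
    local_mean = sum(local_inserts) / n
    return (local_mean - library_mean) / (library_sd / math.sqrt(n))


if __name__ == "__main__":
    LIB_MEAN, LIB_SD = 2600.0, 665.5  # values reported for the 3-kb library
    # Invented examples: inserts matching the library, then stretched inserts.
    print(ce_statistic([2600.0] * 16, LIB_MEAN, LIB_SD))
    print(ce_statistic([3265.5] * 16, LIB_MEAN, LIB_SD))
```

Values well above +3 point to expansion (a possible missing insertion in the assembly) and values well below −3 to compression, which is why the text treats the −3 to +3 interval as unremarkable.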

Clustering of expanded mate-pairs is observed throughout scaffold 30; expanded mate-pair clustering is normally an indication of insertion mis-assembly. Further investigation of the scaffold sequence shows large amounts of homopolymers (mononucleotide repeats). Ambiguities and sequencing errors associated with long stretches of homopolymers are known to occur in 454 sequencing and may be the reason for expanded mate-pair clustering [32]. Scaffold 30 is relatively small (186 kb) and does not contain any annotated genes.

Gene identification and annotation

The MAKER pipeline [21] was used to annotate the P. semeniperda genome and to create a publicly accessible genome database. The pipeline was used to make gene predictions, align ESTs to the genome, and integrate the ESTs into protein-coding gene annotations. To provide evidence of gene identity and ease of detection, a normalized P. semeniperda cDNA library was prepared for 454 sequencing. The library produced over 110.8 Mb of raw sequence data with read lengths averaging approximately 331 bp. The Newbler assembly of the cDNA sequence library generated 7,963 isogroups and over 7 Mb of total sequence length. In addition to EST evidence from P. semeniperda, 12,171 P. tritici-repentis gene models were used as a reference to reinforce confidence of ab initio gene predictions [10]. Each gene model was used as a query in a BLAST search against all protein sequences in the GenBank database, using an e-value cutoff of ≤1×10⁻²⁰.

The MAKER pipeline predicted a total of 11,453 ab initio gene models, of which 9,578 yielded BLASTx hits to genes from other fungal species. Most of the top BLASTx hits were either to genes from P. teres (4,793) or P. tritici-repentis (3,995). Other fungal species for which there were top BLASTx hits included Phaeosphaeria nodorum (214) and Leptosphaeria maculans (173). No other fungal species provided more than 20 top BLASTx hits. The average coding sequence (CDS) length is 1,312 bp, ranging in size from 72 bp to 25,382 bp (Figure 1). The longest gene model is for a hypothetical protein that matched a gene model from P. teres in the BLASTx search.
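The tally of top hits by species can be illustrated with a short sketch. It assumes BLAST results in the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score); the text does not say which output format was used. All identifiers and the species lookup table below are hypothetical.

```python
from collections import Counter


def best_hits(tabular_lines, max_evalue=1e-20):
    """Keep, per query, the subject with the highest bit score among hits
    whose e-value does not exceed max_evalue."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def tally_by_species(hits, species_of):
    """Count top hits per species using a subject-to-species lookup."""
    return Counter(species_of[subject] for subject in hits.values())


# Hypothetical rows and lookup table, for illustration only.
ROWS = [
    "gene1\tPTT_0001\t95.0\t300\t15\t0\t1\t300\t1\t300\t1e-150\t540",
    "gene1\tPTRG_0001\t92.0\t300\t24\t0\t1\t300\t1\t300\t1e-140\t510",
    "gene2\tPTRG_0002\t88.0\t200\t24\t0\t1\t200\t1\t200\t1e-60\t230",
    "gene3\tSNOG_0003\t40.0\t90\t54\t0\t1\t90\t1\t90\t3e-05\t45",
]
SPECIES = {
    "PTT_0001": "P. teres",
    "PTRG_0001": "P. tritici-repentis",
    "PTRG_0002": "P. tritici-repentis",
    "SNOG_0003": "P. nodorum",
}

if __name__ == "__main__":
    hits = best_hits(ROWS)
    print(hits)
    print(tally_by_species(hits, SPECIES))
```

In this toy input, gene3 is dropped because its only hit is weaker than the cutoff, mirroring how gene models without a qualifying hit would fall outside the 9,578 counted above.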

Figure 1. Length distribution of putative genes from P. semeniperda.

Each of the 11,453 gene models is included on the X-axis from smallest to largest (left to right) with its length plotted on the Y-axis.

https://doi.org/10.1371/journal.pone.0087045.g001

General genomic features

The overall GC content of the assembled P. semeniperda genome (32.029 Mb excluding gaps of unknown nucleotide sequence) is 49.98%. The GC content increases to 52.53% in gene coding sequences, which represent 46.47% of the genome. Analysis of the assembly using tRNAscan-SE [33] detected 91 tRNA genes located on 22 scaffolds (Table 3). These putative tRNA genes do not group together as seen in other fungi such as Schizosaccharomyces pombe [34]. Over 3,979 orthologous groups were identified between P. semeniperda and P. tritici-repentis with the Inparanoid v4.0 program [35], which describes genes derived from a common ancestor of the two fungal species; such genes are likely to share molecular function [36]. Also, 4,184 genes in P. semeniperda have been identified as in-paralogs (a result of gene duplication after a speciation event).
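As a brief illustration of the GC figure quoted above, the sketch below computes GC content while leaving gap characters (N) out of the denominator, in line with the statement that gaps of unknown sequence were excluded. The function name and the example sequence are illustrative only.

```python
def gc_content(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T);
    gap or ambiguity characters such as N are ignored."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    called = gc + at
    if called == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / called


if __name__ == "__main__":
    # Invented sequence: 4 G/C and 4 A/T bases plus two gap characters.
    print(gc_content("GGCCAATTNN"))
```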

Gene ontology

The set of 11,453 P. semeniperda gene models was analyzed using Blast2GO [22], [23] to identify gene function. From the query set, 6,419 genes were successfully annotated, yielding 25,595 GO terms. GO terms for the annotated genes were placed into three broad categories: biological process (BP), molecular function (MF), and cellular components (CC). Figures 2A, 2B and 2C are pie charts showing the distribution of GO terms at Level 2 for the three categories. The most abundant GO terms in the BP category (11,686 total terms) were metabolic process (34%), cellular process (28%), single-organism process (13%) and localization (9%) (Figure 2A). Abundant GO terms in the MF category (8,449 total terms) include catalytic activity (46%), binding (39%), and transporter activity (6%) (Figure 2B). Finally, most of the CC terms (5,460 total terms) are categorized as cell (37%), organelle (24%), membrane (24%) or macromolecular complex (11%) (Figure 2C). These GO terms only describe putative genes within the P. semeniperda genome and do not document gene expression.

Figure 2. Distribution of Blast2GO annotations of putative genes from P. semeniperda.

The charts show level 2 annotations for (A) Biological Process, (B) Molecular Function and (C) Cellular Components.

https://doi.org/10.1371/journal.pone.0087045.g002

Repeat sequences and transposons

Transposable elements and repeated sequences are some of the most abundant sequences in eukaryotic genomes; for example, over 44% of the human genome [37] and more than 75% of the maize genome [38] are composed of transposable and other repetitive elements. Fungal genomes, however, contain relatively small amounts of these elements when compared to other eukaryotes, rarely exceeding 5% of the genome [39]. Low levels of transposable and repetitive elements in fungal genomes may be due to defense mechanisms known as repeat-induced point mutations (RIP) [40] that protect fungal genomes against highly repeated sequences. The P. semeniperda genome was analyzed for repetitive sequences and retro-elements using RepeatMasker 3.2.7 [24], which screens DNA sequences for interspersed repeats and low complexity elements. Interspersed repeats (retroelements and DNA transposons) were the most abundant elements identified by RepeatMasker, totaling 610.7 kb or 1.89% of the genome. There were 447 class I retroelements identified, 388 of which were Gypsy/DIRS1 LTRs. Also found were 296 class II DNA transposable elements, 293 of which are Tc1-IS630-Pogo DNA transposons. In total, 859,266 bp or 2.66% of the genome was identified as containing interspersed repeats or low complexity elements (Table 4). This percentage is consistent with the frequency of repeat elements observed in other ascomycete fungi, which rarely exceeds 5% of the genome [39].

Genome rearrangements

Questions have been raised concerning the impact transposable and repetitive elements have on the genomic architecture and evolution of fungi [39]. The presence of transposable elements in a genome can impact the regulation of neighboring genes and may provide sites for homologous and ectopic recombination [41]–[43]. Recombination sites may play an important role in observed local or wide-scale chromosomal rearrangements in fungi as well as in other organisms [44]–[48]. To investigate the role transposable and repeat elements may have played in the genomic architecture of P. semeniperda, its genome assembly was aligned with that of P. tritici-repentis. The P. tritici-repentis genome was ideal for a whole genome alignment as many of its sequences are arranged into full chromosomes, thereby allowing easier identification of genomic rearrangements within contigs or scaffolds of the P. semeniperda genome sequence.

NUCmer [49] and Circos [50] were used to align and visualize genomic synteny between P. semeniperda and P. tritici-repentis (Figure 3). The 19 largest P. semeniperda scaffolds, representing 86.5% of the sequenced genome of P. semeniperda, were aligned to the 11 P. tritici-repentis chromosomes. The alignment produced 88% and 80% of syntenic coverage in P. semeniperda and P. tritici-repentis, respectively, and 6,376 gene hits on P. semeniperda scaffolds. Over 8,070 genes within the P. semeniperda assembly were aligned to the P. tritici-repentis genome, despite only including 19 of the 54 scaffolds in the genome alignment.

Figure 3. Circular view of the genome alignment between 19 P. semeniperda (Ps) scaffolds and 11 P. tritici-repentis (Ptr) chromosomes.

The numbers marked on each scaffold or chromosome indicate length in megabases.

https://doi.org/10.1371/journal.pone.0087045.g003

A dot-plot of the genome alignment was created using SyMap [27], revealing regions of synteny and colinearity between the two genome sequences (Figure 4). A total of 101 syntenic blocks with an identity range of 95% have been identified. Many of the observed rearrangements within the P. semeniperda scaffolds are localized within the corresponding P. tritici-repentis chromosome and include inversions, deletions, and transpositions (intrachromosomal rearrangements, Figure 5). These types of large-scale rearrangements are also observed when comparing the genomes of Podospora anserina and Neurospora crassa, where most rearrangements were intrachromosomal [51]–[53]. The distribution of intrachromosomal rearrangements is consistent across the P. semeniperda scaffolds with the exception of scaffold 1, which shows patterns of interchromosomal rearrangements, transposing to four different P. tritici-repentis chromosomes (Figure 6).
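The distinction between intra- and interchromosomal rearrangements drawn above can be expressed as a simple rule over syntenic blocks. The sketch below is a rough heuristic, not the procedure used by SyMap: a scaffold whose blocks map to more than one chromosome is flagged as interchromosomal, and a scaffold whose blocks map to one chromosome in mixed orientations is flagged as an inversion candidate. The scaffold and chromosome pairings follow the figure captions, but the number of blocks and their strands are invented for illustration.

```python
from collections import defaultdict


def classify_scaffolds(blocks):
    """blocks: iterable of (scaffold, chromosome, strand), strand '+' or '-'.
    Returns a dict mapping each scaffold to a coarse rearrangement label."""
    chromosomes = defaultdict(set)
    strands = defaultdict(set)
    for scaffold, chromosome, strand in blocks:
        chromosomes[scaffold].add(chromosome)
        strands[scaffold].add(strand)
    labels = {}
    for scaffold, chroms in chromosomes.items():
        if len(chroms) > 1:
            labels[scaffold] = "interchromosomal"
        elif len(strands[scaffold]) > 1:
            labels[scaffold] = "intrachromosomal (inversion candidate)"
        else:
            labels[scaffold] = "collinear"
    return labels


# Pairings from the figure captions; block counts and strands are invented.
EXAMPLE_BLOCKS = [
    ("scaffold_1", "chr2", "+"),
    ("scaffold_1", "chr3", "+"),
    ("scaffold_1", "chr6", "-"),
    ("scaffold_1", "chr10", "+"),
    ("scaffold_6", "chr7", "+"),
    ("scaffold_6", "chr7", "-"),
    ("scaffold_6", "chr7", "+"),
]

if __name__ == "__main__":
    for scaffold, label in classify_scaffolds(EXAMPLE_BLOCKS).items():
        print(scaffold, label)
```

The rule ignores block order and coordinates, so it cannot detect deletions or transpositions within a chromosome; those require comparing block positions along both sequences, as the dot plots do visually.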

Figure 4. SyMap dot plot of genome alignment between P. semeniperda scaffolds and P. tritici-repentis chromosomes.

The P. semeniperda scaffolds are numbered along the x-axis across the top and the P. tritici-repentis chromosomes are numbered on the y-axis along the left of the figure. Boxes highlight regions of homology between the two genomes.

https://doi.org/10.1371/journal.pone.0087045.g004

Figure 5. SyMap dot plot of P. semeniperda scaffold 6 and P. tritici-repentis chromosome 7.

The dot plot alignment displays P. tritici-repentis chromosome 7 on the x-axis and P. semeniperda scaffold 6 on the y-axis. Boxes highlight regions of homology between the two genome regions.

https://doi.org/10.1371/journal.pone.0087045.g005

Figure 6. SyMap dot plot of alignment between P. semeniperda scaffold 1 and P. tritici-repentis 2, 3, 6 and 10.

The P. semeniperda scaffold is displayed along the x-axis and the P. tritici-repentis chromosomes along the y-axis. Boxes highlight regions of homology between the two genomes.

https://doi.org/10.1371/journal.pone.0087045.g006

Large-scale genomic rearrangements such as those observed in P. semeniperda have been extensively studied in S. cerevisiae, where many are attributed to recombination events between retrotransposons and other repetitive elements [42]. Further investigation of the P. semeniperda genome revealed transposable and repetitive elements flanking syntenic blocks, suggesting that such elements may play a role in chromosomal rearrangements. Homology searches of these areas in P. semeniperda scaffolds reveal the presence of retroelements including Ty1 and Ty3-type elements (copia and gypsy LTR elements), as well as Gag, Env, and Pol genes. Ty elements in S. cerevisiae have been shown to be sources of chromosomal crossovers which cause deletions, duplications, inversions, and translocations, though by what mechanisms and under what conditions this occurs is unknown [42]. Additional molecular evidence is needed to draw conclusions concerning the role retrotransposons have in the genomic architecture and evolution of P. semeniperda.

Pathogenicity and infection-related genes

Because fungi use a variety of pathogenic strategies, it is not clear what mechanism is used by P. semeniperda to infect host seeds. To help identify putative infection mechanisms, the PHI-base fungal pathogenicity database was searched against the set of P. semeniperda gene models (tblastx, with an alignment threshold of ≤1×10⁻²⁰). The PHI-base database contains 924 genes and their products from bacteria, fungi and oomycetes that have been demonstrated experimentally to be involved in pathogenesis [54]. The search identified 663 genes from P. semeniperda that matched 552 PHI-base entries (Table S1). Among the matches were putative genes that code for hydrolases, protease inhibitors, secondary metabolite biosynthesis enzymes, ABC transporters, and effector proteins, all factors related to virulence in necrotrophic plant pathogens. For instance, there are 19 genes in the P. semeniperda genome that encode proteins with homology to type I polyketide synthases. Fungal polyketides are important pharmacological compounds and are known virulence factors in several fungal species [55]. Other examples are 9 genes in the P. semeniperda genome that encode proteins with homology to cyclic peptide synthetases from Alternaria alternata and Cochliobolus carbonum. Cyclic peptides such as AM-toxin from A. alternata and HC-toxin from C. carbonum are important virulence factors whose synthesis is catalyzed, in part, by nonribosomal peptide synthetases [56]–[58].

Secreted proteins

The expansion of secreted protein gene families has been observed in the genomes of the ascomycete phytopathogens Stagonospora nodorum and Magnaporthe grisea when compared with the saprophyte Neurospora crassa [59], [60], consistent with their role as plant pathogenic fungi. There is a relatively large number of putative genes encoding secreted proteins (996) in the P. semeniperda genome, as predicted by WoLF PSORT [61], ranging in length from 180 to 5,845 bp. A significant portion of the P. semeniperda secretome (81%) is homologous to P. tritici-repentis proteins. This level of homology is consistent with a similar analysis of the P. teres f. teres secretome [30], which shows that 85% of its predicted secreted proteins share homology with secreted proteins from P. tritici-repentis.

Nearly 55% (546 sequences) of the genes encoding secreted proteins were annotated with GO terms using Blast2GO [22], [23]. Although the existing annotation databases have drawbacks and limitations due to their incompleteness [62], these GO terms provide a short synopsis of the types of secreted proteins that are found in P. semeniperda. Consistent with its role as a necrotrophic plant pathogen, many of the secreted proteins are putative enzymes that target various polysaccharides (Table 5). As observed in the previous assessment of pathogenic-related sequences described above, putative secreted proteins with hydrolase activity are homologous to proteins containing cellulose binding domains, carboxypeptidase, as well as cell wall glucanase and glycosyl hydrolase activity. Many of these sequences were also annotated with GO terms for oxidation reduction and oxidoreductase activity, suggesting that these gene products have key roles in the process of cellulose and lignin degradation [63].

Cytochalasin genes

Cytochalasins are a diverse group of fungal metabolites well-known for their ability to bind to actin filaments and block polymerization and elongation, thus inhibiting cytokinesis without affecting mitosis. Due to the ability of cytochalasins to block normal function of the cytoskeleton, many of them have been identified as antibiotic, antiviral, anti-inflammatory, or antitumoral agents [64]. Various cytochalasin forms have been identified in phytopathogenic fungi, including three previously unknown cytochalasins (Z1, Z2, and Z3) from P. semeniperda [65]. The exact role of cytochalasins in fungal virulence pathways is unknown, although Beckstead et al. [16] suggest that P. semeniperda may use these compounds to inhibit germination of nondormant cheatgrass seeds and increase their vulnerability to attack from the fungus.

It is understood that the tricyclic ring system of cytochalasins is generated by a Diels-Alder-type reaction [66], [67]. Recently, the genes encoding the enzymes responsible for the early stages of cytochalasin biosynthesis in Penicillium expansum were identified by Schümann & Hertweck [64]. They identified 7 genes grouped together in what is now called the chaetoglobosin (Che) gene cluster. RNA silencing methods suggested that the CheA gene (encoding a PKS-NRPS hybrid protein) is essential to cytochalasin biosynthesis [64]. Using Che amino acid sequences from P. expansum, homologs of the genes in the Che cluster were found in the P. semeniperda genome sequence. Homologs were found for all seven genes including two PKS-NRPS protein genes. Like the P. expansum CheA protein, the putative P. semeniperda CheA proteins have PKS-NRPS hybrid domains as well as other protein features, including monooxygenase, transcription factor, and enoyl reductase domains. Putative P. semeniperda Che genes are not found in clusters as they are in P. expansum but are interspersed across multiple scaffolds.

Tox Genes

Host-selective toxins (HSTs) have been identified in Pyrenophora species, specifically the ToxA and ToxB HSTs in Pyrenophora tritici-repentis. These HSTs are proteinaceous effectors that are structurally unrelated and, though they seem to evoke different host responses, both confer the ability to cause disease in the host organism [68]. P. tritici-repentis races can be differentiated by their expression of one or any combination of Tox genes, all of which have been shown to contribute to pathogenicity. A single copy of the ToxA gene in P. tritici-repentis is sufficient to induce necrosis on ToxA-sensitive wheat cultivars. In contrast, ToxB-containing isolates become more virulent as ToxB gene copy number increases [69]–[71]. A correlation between ToxB transcript number and virulence/pathogenicity has also been identified in P. tritici-repentis: the greater the ToxB transcript number, the more efficiently the isolate causes disease in its host [70], [72].

A BLAST search of the P. semeniperda genome using the P. tritici-repentis ToxA gene sequence as the query did not yield any hits; however, a search using a ToxB query yielded a single copy in the P. semeniperda genome with 81% sequence similarity to the P. tritici-repentis sequence. Because ToxB and its homologs are primarily described as chlorosis-inducing toxins, the role of this gene in seed pathogenicity of P. semeniperda is currently unknown. Further study of ToxB copy number in other P. semeniperda isolates may produce a clearer understanding of its role in P. semeniperda virulence.
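One simple way to turn a set of BLAST hits into a copy-number estimate is to merge overlapping hit intervals on each scaffold and count the resulting loci. The sketch below illustrates that idea only; it is not the procedure reported in the paper, and the function name, the `min_gap` parameter, and the example coordinates are hypothetical.

```python
def count_loci(hits, min_gap=0):
    """Count distinct loci among (scaffold, start, end) hits.

    Hits on the same scaffold that overlap, or whose separation does not
    exceed min_gap bp, are merged into a single locus. Coordinates may be
    given in either orientation (start > end for minus-strand hits).
    """
    by_scaffold = {}
    for scaffold, start, end in hits:
        lo, hi = sorted((start, end))
        by_scaffold.setdefault(scaffold, []).append((lo, hi))

    loci = 0
    for intervals in by_scaffold.values():
        intervals.sort()
        current_end = None
        for lo, hi in intervals:
            if current_end is None or lo > current_end + min_gap:
                loci += 1            # start of a new locus
                current_end = hi
            else:
                current_end = max(current_end, hi)  # extend current locus
    return loci


# Hypothetical coordinates, invented for illustration.
example_hits = [
    ("scaffold_7", 1200, 1460),
    ("scaffold_7", 1460, 1300),   # reversed coordinates, overlaps the first hit
    ("scaffold_21", 500, 760),
]
print(count_loci(example_hits))
```

A count obtained this way from a draft assembly should be read as a lower bound, because near-identical gene copies can collapse into a single assembled locus.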

Conclusions

The genome sequence, assembly and annotation of a single isolate of P. semeniperda are reported here. The assembly spans over 32 Mb, and the genome size is estimated at 40.1 Mb based on the metrics generated by the Newbler assembly. The size of the P. semeniperda genome is similar to the reported sizes of the P. tritici-repentis and P. teres genomes [10], [30], consistent with other related fungi. Genome comparisons between P. semeniperda and P. tritici-repentis allow visualization of large-scale rearrangements between these related species and provide clues to the evolutionary mechanisms at work in this fungus. The P. semeniperda genome contains a rich diversity of putative genes common to other plant pathogens, notably hydrolases, ABC transporters, cytochrome P450s, and secreted gene products similar to those of other necrotrophs. In addition, the genome sequence can provide information for the development of molecular markers, which may be used in population or evolutionary studies of this organism. This assembly also provides researchers with genomic and genetic resources to advance P. semeniperda research and the means to further our understanding of other phytopathogenic fungi.

The P. semeniperda genome is of immediate interest because of the information it provides on putative genes that may play an important role in the infection of cheatgrass seeds by the fungus. This information is critical because it may inform efforts to develop more effective fungal isolates to control the expansion of cheatgrass populations in the Intermountain West (IMW). Future studies may include gene expression analyses that identify genes that are upregulated during the infection process. The genome sequence will also make it possible to test the hypothesis that expression of P. semeniperda cytochalasin genes facilitates infection of nondormant seeds by inhibiting seed germination.

Supporting Information

Table S1.

Pyrenophora semeniperda gene models with homology to PHI-base protein entries.

https://doi.org/10.1371/journal.pone.0087045.s001

(DOCX)

Author Contributions

Conceived and designed the experiments: MMS SEM CEC. Performed the experiments: MMS. Analyzed the data: MMS JAU DEE RAH PMB AAH CEC. Contributed reagents/materials/analysis tools: SEM. Wrote the paper: MMS CEC.

References

  1. 1. Nelson EB (1994) Dreschlera and Pyrenophora: leaf-spotting diseases. Turf Grass Trends 3: 1–6, 11.
  2. 2. Ciuffetti LM, Tuori RP (1999) Advances in the characterization of the Pyrenophora tritici-repentis—wheat interaction. Phytopathology 89: 444–449.
  3. 3. Liu Z, Ellwood SR, Oliver RP, Friesen TL (2011) Pyrenophora teres: profile of an increasingly damaging barley pathogen. Molecular Plant Pathology 12: 1–19.
  4. 4. Tekauz A, Chiko AW (1980) Leaf stripe of barley caused by Pyrenophora graminea: Occurrence in Canada and comparisons with barley stripe mosaic. Canadian Journal of Plant Pathology 2: 152–158.
  5. 5. Medd R, Murray G, Pickering D (2003) Review of the epidemiology and economic importance of Pyrenophora semeniperda. Australasian Plant Pathology 32: 539–550.
  6. 6. Lamari L, Bernier CC (1991) Genetics of tan necrosis and extensive chlorosis in tan spot of wheat caused by Pyrenophora-tritici-repentis. Phytopathology 81: 1092–1095.
  7. 7. De Wolf ED, Effertz RJ, Ali S, Francl LJ (1998) Vistas of tan spot research. Canadian Journal of Plant Pathology 20: 349–370.
  8. 8. Shabeer A, Bockus WW (1988) Tan spot effects on yield and yield components relative to growth stage in winter-wheat. Plant Disease 72: 599–602.
  9. 9. Singh PK, Mergoum M, Ali S, Adhikari TB, Hughes GR (2008) Genetic analysis of resistance to Pyrenophora tritici-repentis races 1 and 5 in tetraploid and hexaploid wheat. Phytopathology 98: 702–708.
  10. 10. Manning VA, Pandelova I, Dhillon B, Wilhelm LJ, Goodwin SB, et al. (2013) Comparative genomics of a plant-pathogenic fungus, Pyrenophora tritici-repentis, reveals transduplication and the impact of repeat elements on pathogenicity and population divergence. G3: Genes|Genomes|Genetics 3: 41–63.
  11. 11. Wallace HAH (1959) A rare seed-borne disease of wheat caused by Podosporiella verticillata. Canadian Journal of Botany 37: 509–515.
  12. 12. Yonow T, Kriticos DJ, Medd RW (2004) The potential geographic range of Pyrenophora semeniperda. Phytopathology 94: 805–812.
  13. 13. Brittlebank CC, Adam DB (1924) A new disease of the gramineae: Pleosphaeria semeniperda nov. sp. Transactions of the British Mycological Society 10: 123–127, IN128–IN129.
  14. 14. Kreitlow K, Bleak A (1964) Podosporiella verticillata, a soil-borne pathogen of some western Gramineae. Phytopathology 54: 353–357.
  15. 15. O'Gara P (1915) A Podosporiella disease of germinating wheat. Phytopathology 5: 323–325.
  16. 16. Beckstead J, Meyer SE, Molder CJ, Smith C (2007) A race for survival: Can Bromus tectorum seeds escape Pyrenophora semeniperda-caused mortality by germinating quickly? Annals of Botany 99: 907–914.
  17. 17. Medd RW, Campbell MA (2005) Grass seed infection following inundation with Pyrenophora semeniperda. Biocontrol Science and Technology 15: 21–36.
  18. 18. Meyer SE, Quinney D, Nelson DL, Weaver J (2007) Impact of the pathogen Pyrenophora semeniperda on Bromus tectorum seedbank dynamics in North American cold deserts. Weed Research 47: 54–62.
  19. 19. Galagan JE, Henn MR, Ma LJ, Cuomo CA, Birren B (2005) Genomics of the fungal kingdom: insights into eukaryotic biology. Genome Research 15: 1620–1631.
  20. 20. Boose D, Harrison S, Clement S, Meyer S (2011) Population genetic structure of the seed pathogen Pyrenophora semeniperda on Bromus tectorum in western North America. Mycologia 103: 85–93.
  21. 21. Cantarel BL, Korf I, Robb SM, Parra G, Ross E, et al. (2008) MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Research 18: 188–196.
  22. 22. Conesa A, Götz S, García-Gomez JM, Terol J, Talon M, et al. (2005) Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21: 3674–3676.
  23. 23. Götz S, García-Gomez JM, Terol J, Williams TD, Nagaraj SH, et al. (2008) High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Research 36: 3420–3435.
  24. 24. Smit AFA, Hubley R (1996) RepeatMasker Open-3.0. Institute for Systems Biology.
  25. 25. Phillippy A, Schatz M, Pop M (2008) Genome assembly forensics: finding the elusive mis-assembly. Genome Biology 9: R55.
  26. 26. Schatz M, Phillippy A, Shneiderman B, Salzberg S (2007) Hawkeye: an interactive visual analytics tool for genome assemblies. Genome Biology 8: R34.
  27. 27. Soderlund C, Nelson W, Shoemaker A, Paterson A (2006) SyMAP: A system for discovering and viewing syntenic regions of FPC maps. Genome Research 16: 1159–1168.
  28. 28. Delcher AL, Phillippy A, Carlton J, Salzberg SL (2002) Fast algorithms for large-scale genome alignment and comparison. Nucleic Acids Research 30: 2478–2483.
  29. 29. Smith TF, Waterman MS (1981) Identification of common molecular subsequences. Journal of Molecular Biology 147: 195–197.
  30. 30. Ellwood S, Liu Z, Syme R, Lai Z, Hane J, et al. (2010) A first genome assembly of the barley fungal pathogen Pyrenophora teres f. teres. Genome Biology 11: R109.
  31. 31. Aboukhaddour R, Cloutier S, Ballance GM, Lamari L (2009) Genome characterization of Pyrenophora tritici-repentis isolates reveals high plasticity and independent chromosomal location of ToxA and ToxB. Molecular Plant Pathology 10: 201–212.
  32. 32. Huse S, Huber J, Morrison H, Sogin M, Welch D (2007) Accuracy and quality of massively parallel DNA pyrosequencing. Genome Biology 8: R143.
  33. 33. Lowe TM, Eddy SR (1997) tRNAscan-SE: A program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Research 25: 955–964.
  34. 34. Wanchanthuek P, Hallin PF, Gouveia-Oliveira R, Ussery D (2006) Structural features of fungal genomes. In: Sunnerhagen P, Piskur J, editors. Comparative Genomics: Using Fungi as Models (Topics in Current Genetics). Berlin: Springer.
  35. 35. Remm M, Storm CE, Sonnhammer EL (2001) Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. Journal of Molecular Biology 314: 1041–1052.
  36. 36. Berglund AC, Sjolund E, Ostlund G, Sonnhammer EL (2008) InParanoid 6: eukaryotic ortholog clusters with inparalogs. Nucleic Acids Research 36: D263–D266.
  37. 37. Mills RE, Bennett EA, Iskow RC, Devine SE (2007) Which transposable elements are active in the human genome? Trends in Genetics 23: 183–191.
  38. 38. Wolfgruber TK, Sharma A, Schneider KL, Albert PS, Koo DH, et al. (2009) Maize centromere structure and evolution: sequence analysis of centromeres 2 and 5 reveals dynamic Loci shaped primarily by retrotransposons. PLoS Genetics 5: e1000743.
  39. 39. Wöstemeyer J, Kreibich A (2002) Repetitive DNA elements in fungi (Mycota): impact on genomic architecture and evolution. Current Genetics 41: 189–198.
  40. 40. Hood ME, Katawczik M, Giraud T (2005) Repeat-induced point mutation and the population structure of transposable elements in Microbotryum violaceum. Genetics 170: 1081–1089.
  41. 41. Maksakova IA, Mager DL (2005) Transcriptional regulation of early transposon elements, an active family of mouse long terminal repeat retrotransposons. Journal of Virology 79: 13865–13874.
  42. 42. Mieczkowski PA, Lemoine FJ, Petes TD (2006) Recombination between retrotransposons as a source of chromosome rearrangements in the yeast Saccharomyces cerevisiae. DNA Repair 5: 1010–1020.
  43. 43. Thornburg BG, Gotea V, Makalowski W (2006) Transposable elements as a significant source of transcription regulating signals. Gene 365: 104–110.
  44. 44. Daboussi MJ, Capy P (2003) Transposable elements in filamentous fungi. Annual Review of Microbiology 57: 275–299.
  45. 45. Kupiec M, Petes TD (1988) Allelic and ectopic recombination between Ty elements in yeast. Genetics 119: 549–559.
  46. 46. Ladevèze V, Aulard S, Chaminade N, Périquet G, Lemeunier F (1998) Hobo transposons causing chromosomal breakpoints. Proceedings of the Royal Society of London Series B-Biological Sciences 265: 1157–1159.
  47. 47. Lahn BT, Page DC (1999) Four evolutionary strata on the human X chromosome. Science 286: 964–967.
  48. 48. Lim JK, Simmons MJ (1994) Gross chromosome rearrangements mediated by transposable elements in Drosophila melanogaster. Bioessays 16: 269–275.
  49. 49. Kurtz S, Phillippy A, Delcher A, Smoot M, Shumway M, et al. (2004) Versatile and open software for comparing large genomes. Genome Biology 5: R12.
  50. 50. Krzywinski MI, Schein JE, Birol I, Connors J, Gascoyne R, et al. (2009) Circos: An information aesthetic for comparative genomics. Genome Research.
  51. 51. Espagne E, Lespinet O, Malagnac F, Da Silva C, Jaillon O, et al. (2008) The genome sequence of the model ascomycete fungus Podospora anserina. Genome Biology 9: R77.
  52. 52. Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, et al. (2003) The genome sequence of the filamentous fungus Neurospora crassa. Nature 422: 859–868.
  53. 53. Pain A, Hertz-Fowler C (2008) Genomic adaptation: a fungal perspective. Nature Reviews Microbiology 6: 572–573.
  54. 54. Winnenburg R, Urban M, Beacham A, Baldwin TK, Holland S, et al. (2008) PHI-base update: additions to the pathogen–host interaction database. Nucleic Acids Research 36: D572–D576.
  55. 55. Kroken S, Glass NL, Taylor JW, Yoder OC, Turgeon BG (2003) Phylogenomic analysis of type I polyketide synthase genes in pathogenic and saprobic ascomycetes. Proceedings of the National Academy of Sciences 100: 15670–15675.
  56. 56. Johnson RD, Johnson L, Itoh Y, Kodama M, Otani H, et al. (2000) Cloning and characterization of a cyclic peptide synthetase gene from Alternaria alternata apple pathotype whose product is involved in AM-toxin synthesis and pathogenicity. Molecular Plant-Microbe Interactions 13: 742–753.
  57. 57. Oide S, Moeder W, Krasnoff S, Gibson D, Haas H, et al. (2006) NPS6, Encoding a Nonribosomal Peptide Synthetase Involved in Siderophore-Mediated Iron Metabolism, Is a Conserved Virulence Determinant of Plant Pathogenic Ascomycetes. The Plant Cell Online 18: 2836–2853.
  58. 58. Scott-Craig JS, Panaccione DG, Pocard JA, Walton JD (1992) The cyclic peptide synthetase catalyzing HC-toxin production in the filamentous fungus Cochliobolus carbonum is encoded by a 15.7-kilobase open reading frame. Journal of Biological Chemistry 267: 26044–26049.
  59. 59. Dean RA, Talbot NJ, Ebbole DJ, Farman ML, Mitchell TK, et al. (2005) The genome sequence of the rice blast fungus Magnaporthe grisea. Nature 434: 980–986.
  60. 60. Hane JK, Lowe RG, Solomon PS, Tan KC, Schoch CL, et al. (2007) Dothideomycete plant interactions illuminated by genome sequencing and EST analysis of the wheat pathogen Stagonospora nodorum. Plant Cell 19: 3347–3368.
  61. 61. Horton P, Park KJ, Obayashi T, Fujita N, Harada H, et al. (2007) WoLF PSORT: protein localization predictor. Nucleic Acids Research 35: W585–W587.
  62. 62. Khatri P, Draghici S (2005) Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics 21: 3587–3595.
  63. 63. Raíces M, Montesino R, Cremata J, García B, Perdomo W, et al. (2002) Cellobiose quinone oxidoreductase from the white rot fungus Phanerochaete chrysosporium is produced by intracellular proteolysis of cellobiose dehydrogenase. Biochimica et Biophysica Acta 1576: 15–22.
  64. 64. Schümann J, Hertweck C (2007) Molecular basis of cytochalasan biosynthesis in fungi: gene cluster analysis and evidence for the involvement of a PKS-NRPS hybrid synthase by RNA silencing. Journal of the American Chemical Society 129: 9564–9565.
  65. 65. Evidente A, Andolfi A, Vurro M, Zonno MC, Motta A (2002) Cytochalasins Z1, Z2 and Z3, three 24-oxa[14]cytochalasans produced by Pyrenophora semeniperda. Phytochemistry 60: 45–53.
  66. 66. Oikawa H, Murakami Y, Ichihara A (1992) Biosynthetic study of chaetoglobosin A: Origins of the oxygen and hydrogen atoms, and indirect evidence for biological Diels-Alder reaction. Journal of the Chemical Society-Perkin Transactions 1: 2955–2959.
  67. 67. Oikawa H, Tokiwano T (2004) Enzymatic catalysis of the Diels-Alder reaction in the biosynthesis of natural products. Natural Product Reports 21: 321–352.
  68. 68. Ciuffetti LM, Manning VA, Pandelova I, Betts MF, Martinez JP (2010) Host-selective toxins, Ptr ToxA and Ptr ToxB, as necrotrophic effectors in the Pyrenophora tritici-repentis–wheat interaction. New Phytologist 187: 911–919.
  69. 69. Martinez JP, Oesch NW, Ciuffetti LM (2004) Characterization of the multiple-copy host-selective toxin gene, ToxB, in pathogenic and nonpathogenic isolates of Pyrenophora tritici-repentis. Molecular Plant-Microbe Interactions 17: 467–474.
  70. 70. Strelkov SE, Kowatsch RF, Ballance GM, Lamari L (2005) Characterization of the ToxB gene from North African and Canadian isolates of Pyrenophora tritici-repentis. Physiological and Molecular Plant Pathology 67: 164–170.
  71. 71. Strelkov SE, Lamari L, Sayoud R, Smith RB (2002) Comparative virulence of chlorosis-inducing races of Pyrenophora tritici-repentis. Canadian Journal of Plant Pathology 24: 29–35.
  72. 72. Amaike S, Ozga JA, Basu U, Strelkov SE (2008) Quantification of ToxB gene expression and formation of appressoria by isolates of Pyrenophora tritici-repentis differing in pathogenicity. Plant Pathology 57: 623–633.