Abstract
Alternative splicing (AS) is a key element of eukaryotic gene expression that increases transcript and proteome diversity in cells, thereby altering their responses to external stimuli and stresses. While AS has been intensively researched in plants and animals, its frequency, conservation, and putative impact on virulence remain relatively understudied in plant pathogenic fungi. Here, we profiled the AS events occurring in genes of Cladosporium fulvum isolates Race 5 and Race 4 during a nearly complete compatible infection cycle on their tomato host. Our studies revealed extensive heterogeneity in the transcript isoforms assembled from different isolates, infections, and infection timepoints, as over 80% of the transcript isoforms were singletons that were detected in only a single sample. Despite that, nearly 40% of the protein-coding genes in each isolate were predicted to be recurrently AS across the disparate infection timepoints, infections, and the two isolates. Of these, 37.5% were common to both isolates and 59% resulted in multiple protein isoforms, thereby putatively increasing proteome diversity in the pathogen by 31% during infections. An enrichment analysis showed that AS mostly affected genes likely to be involved in the transport of nutrients, regulation of gene expression, and monooxygenase activity, suggesting a role for AS in fine-tuning adaptation of C. fulvum on its tomato host during infections. Tracing the location of the AS genes on the fungal chromosomes showed that they were mostly located in repeat-rich regions of the core chromosomes, indicating a causal connection between gene location on the genome and propensity to AS. Finally, multiple cases of differential isoform usage in AS genes of C. fulvum were identified, suggesting that modulation of AS at different infection stages may be another way by which pathogens refine infections on their hosts.
Author summary
Alternative splicing (AS) is a major source of transcriptome plasticity, proteome diversity, and phenotypic complexity in eukaryotes. Here, we analyzed the AS events happening in two isolates of the tomato pathogen Cladosporium fulvum, when infecting their host. We reveal an extensive infection-to-infection and isolate-to-isolate variation in the transcript isoforms assembled from transcribed pathogen genes, indicating that species-level inferences on AS cannot be reliably made based on single infections or isolates. Nonetheless, we found that AS is prevalent in pathogen genes and likely has a multifactorial effect on host infections, as a core set of genes that mostly encode transporters, transcription factors, and cytochrome P450 enzymes are recurrently AS in C. fulvum during infections. We finally show that genes in repeat-rich regions of the genome are more frequently affected by AS and that the pathogen may prime infections by the selective up- or downregulation of specific isoforms at different infection stages.
Citation: Zaccaron AZ, Chen L-H, Stergiopoulos I (2024) Transcriptome analysis of two isolates of the tomato pathogen Cladosporium fulvum, uncovers genome-wide patterns of alternative splicing during a host infection cycle. PLoS Pathog 20(12): e1012791. https://doi.org/10.1371/journal.ppat.1012791
Editor: Richard A. Wilson, University of Nebraska-Lincoln, UNITED STATES OF AMERICA
Received: July 8, 2024; Accepted: November 25, 2024; Published: December 18, 2024
Copyright: © 2024 Zaccaron et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The raw RNAseq reads were deposited in the NCBI’s Sequence Read Archive (SRA) under accessions SRR29437234-SRR29437254 for isolate Race 5, and SRR29424125-SRR29424145 for isolate Race 4. The assembled transcripts and their expression values were deposited in a public repository, available at https://zenodo.org/records/11176736.
Funding: This work was supported by the National Science Foundation Division of Integrative Organismal Systems (NSF-IOS) Award number 1557995 to IS. The funding agency had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Alternative splicing (AS) is a molecular process by which diverse mature mRNA molecules are produced by single genes, as a result of aberrant intron splicing during transcription [1]. There are five main types of AS events, including exon skipping (ES), mutually exclusive exons (MX), intron retention (IR), alternative 5’ or 3’ splice sites (A5/A3), and alternative first or last exons (AF/AL) [2]. A key outcome of AS is that it increases mRNA and protein diversity in cells, but it may further affect several other aspects of mRNA metabolism, thereby modulating gene expression at the post-transcriptional level [1]. Aberrant mRNA splicing, for instance, often results in isoforms that are the targets for nonsense-mediated mRNA decay (NMD), an mRNA surveillance system that detects and degrades prematurely terminated transcripts generated by errors in mRNA processing. AS variants can also regulate the abundance of functional transcripts via a mechanism known as regulated unproductive splicing and translation (RUST). It is thus not surprising that a strong correlation exists between AS and functional complexity in cells that affects many of their cellular, metabolic, and physiological processes, including responses to environmental stressors and other stimuli [3–5]. In humans, for instance, it is estimated that 95% of all genes undergo AS at different developmental stages of tissues [6,7], thereby often leading to hereditary diseases and cancers. Likewise, in plants, AS has been shown to have a significant functional impact in processes such as photosynthesis, flowering, abiotic stress responses, and defense against microbial attacks [8–17].
While the biological significance of AS in plants and mammals is well documented, AS in fungi has, in contrast, been less investigated. Studies have shown that the frequency of AS varies considerably among fungal species and is typically low compared to other eukaryotes, ranging from less than 1% in the baker’s yeast Saccharomyces cerevisiae to 48.9% of the genes in the biocontrol fungus Trichoderma longibrachiatum [18–21]. Despite such variations, evidence suggests that AS significantly impacts a plethora of cellular and physiological processes in fungi, including growth and development [22], response to environmental stressors [23–25], histone deubiquitination [26], gene expression [27,28], secretion of enzymes [29], subcellular localization of proteins [30,31], and others [18,19]. An increasing body of evidence also indicates that AS may affect virulence of pathogenic fungi and resistance to antifungal compounds [18–20,32–34]. For instance, in Sclerotinia sclerotiorum, which causes white mold disease on an array of plant species, the landscape of AS events changes according to the host plant that the pathogen is infecting [35], suggesting that it is important to host adaptation. In the rice blast fungus Magnaporthe oryzae, the frequency of AS appears to increase during host infections, with many AS events putatively being upregulated [33]. A genome-wide analysis of AS in human pathogenic fungi also showed that AS is more frequent during stress and likely to play important regulatory roles during infection [32]. Collectively, such studies highlight that AS could be promoting virulence and adaptation of fungi to their hosts and environment.
Cladosporium fulvum (syn. Passalora fulva, Fulvia fulva) is a hemibiotrophic fungal pathogen that causes tomato leaf mold [36]. Over the last 40 years, this fungus has been a valuable model for the study of plant-microbe interactions [37,38]. Recently, high-quality chromosome-level genome assemblies were obtained for C. fulvum isolates Race 5 and Race 4 [39,40], thereby laying the ground for more in-depth genomic and transcriptomic studies in this pathogen. Ensuing analyses revealed that the genome of C. fulvum is organized into 13 core and 2 dispensable chromosomes, is ~39% repetitive, and is arranged in a ‘checkerboard’ pattern of gene-rich/repeat-poor regions, interspersed with gene-poor/repeat-rich regions, in accordance with the ‘two-speed genome’ model of evolution [41]. Although significant progress has been made in understanding the genomic architecture of C. fulvum [39,40], its transcriptome, and particularly the profile of AS during infections, remain largely unexplored.
In this study, we sequenced the transcriptomes of isolates Race 5 and Race 4, and analyzed the landscape of AS events occurring in pathogen genes throughout nearly the entire infection process and in three independent tomato infections. This was done in order to assess the reproducibility of AS events and, based on their conservation across different isolates, infections, and infection stages, identify the ones that are most likely to be functionally relevant for virulence rather than being splicing noise. Our studies revealed a high frequency and dynamic spectrum of AS events taking place in the two isolates during host infections that particularly affects genes encoding major facilitator superfamily transporters, sugar transporters, transcription factors, and cytochrome P450 enzymes, but less so candidate effectors, thereby suggesting a role for AS in adaptation of C. fulvum on its tomato host during infections.
Results
Transcriptome profiling of C. fulvum during compatible host infections reveals extensive transcript isoform heterogeneity among isolates and infections
The transcriptomes of C. fulvum isolates Race 5 and Race 4 were sequenced at high depth during their interaction with Solanum lycopersicum cv. Moneymaker at seven timepoints (2, 4, 6, 8, 10, 12, and 14 dpi) and from three independent infections (i.e. biological replicates) (Tables 1 and S1–S3 and S1 Text). The RNAseq reads obtained were first used to confirm that the previously predicted exon-intron structures in the intron-containing genes of isolates Race 5 and Race 4 [40] were well-supported by the data (S2 Text). An ensuing investigation into whether orthologous genes in isolates Race 5 and Race 4 had similar number and size of introns that further shared similar start and end coordinates, showed that from the 14,747 ortholog gene pairs, 14,739 (99.9%) pairs had the same number of introns in both orthologous genes (Fig 1A), 14,648 (99.3%) pairs had the same total size of intronic sequences (Fig 1B), and 14,729 (99.9%) had the same number of introns with the same start and end coordinates. This indicated a high conservation of gene structures (S2 Text), thereby enabling a comparative analysis of AS events in the two isolates.
(A) Number of introns predicted in the orthologous genes of isolates Race 5 and Race 4. (B) Total size of introns in base pairs (bp) between orthologous genes. Both scatter plots were generated by comparing 14,747 one-to-one orthologous gene pairs from the genome annotations of isolates Race 5 [39] and Race 4 [40].
The table shows numbers and percentages of RNAseq reads obtained from three independent infections (i.e. biological replicates; Rep.) and seven timepoints during the interaction between the C. fulvum and tomato that mapped to the genomes of C. fulvum isolates Race 5 and Race 4.
The RNAseq reads were next used to perform reference-based transcriptome assemblies and to further obtain a set of unique transcript isoforms from isolates Race 5 and Race 4 (S1 Fig and S4 Table and S3 Text). A total of 57,148 unique transcript isoforms could be assembled for the two isolates, 43,824 of which originated from genes of isolate Race 5 and 41,173 from genes of isolate Race 4. However, only 27,849 (49%) of them were common to both isolates, whereas 15,975 (28%) and 13,324 (23%) were unique to isolate Race 5 and Race 4, respectively (S2A Fig). This indicates that although similar numbers of unique transcript isoforms were generated from orthologous genes during infections, these varied substantially between the two isolates. A similar trend was also seen when examining the extent to which individual genes generated the same pool of transcript isoforms across different infections (i.e. biological replicates) (S2B and S2C Fig) or timepoints of the infection process (S3 Fig). Collectively, these results indicate that extensive heterogeneity in transcript isoforms exists among biological samples, which could be biologically driven by stochastic fluctuations in gene transcription and splicing events and/or be due to technical noise in the data.
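The shared and isolate-specific counts above follow from inclusion-exclusion on the three reported totals. The short check below is only an arithmetic sketch using the numbers quoted in the text; the variable names are illustrative and not part of the authors' pipeline.

```python
# Isoform counts as reported in the text above.
race5_isoforms = 43_824
race4_isoforms = 41_173
union_isoforms = 57_148

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
shared = race5_isoforms + race4_isoforms - union_isoforms
race5_only = race5_isoforms - shared
race4_only = race4_isoforms - shared


def percent_of_union(count):
    """Share of the combined isoform set, rounded to a whole percent."""
    return round(100 * count / union_isoforms)


print(shared, race5_only, race4_only)  # 27849 15975 13324
print(percent_of_union(shared),
      percent_of_union(race5_only),
      percent_of_union(race4_only))    # 49 28 23
```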
Further analysis of the 57,148 transcript isoforms that were assembled from the two isolates showed that most of the isolate-specific isoforms were assembled in only one sample, i.e. in one infection replicate of one timepoint of the interaction of one isolate with the host. Such singleton transcript isoforms accounted for 12,688 (79.4%) of the 15,975 transcript isoforms unique to isolate Race 5, and 11,325 (85.0%) of the 13,324 transcript isoforms unique to isolate Race 4. They likely represented spurious or random transcriptional events and were therefore removed from the analyses. As a result, the total number of unique transcript isoforms was reduced to 33,135, which included 31,136 and 29,848 transcript isoforms from isolates Race 5 and Race 4, respectively (S4A Fig). As expected, the removal of the singleton transcript isoforms from the dataset increased the percentage of transcript isoforms shared by the two isolates from 49% to 84%, and had a similar effect on the percentages of transcript isoforms supported by all three infection replicates in each isolate (S4B Fig) and at each timepoint (S5 Fig). Collectively, these results indicate that most isoforms generated by pathogen genes during infections are singletons that likely represent splicing or transcriptional noise of potentially minimal functional significance.
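The singleton filter can be pictured as counting, for each isoform, the number of distinct samples in which it was assembled and discarding isoforms seen only once. The sketch below is a minimal illustration under that assumption; the function name, the sample identifiers, and the toy data are hypothetical and do not come from the study's pipeline.

```python
from collections import defaultdict


def drop_singletons(observations):
    """Return isoform ids assembled in at least two distinct samples.

    observations: iterable of (isoform_id, sample_id) pairs, where a sample
    is one replicate of one timepoint of one isolate.
    """
    samples_per_isoform = defaultdict(set)
    for isoform_id, sample_id in observations:
        samples_per_isoform[isoform_id].add(sample_id)
    return {iso for iso, samples in samples_per_isoform.items()
            if len(samples) >= 2}


# Hypothetical toy data: sample ids encode isolate, timepoint, and replicate.
toy_observations = [
    ("iso_a", "R5_2dpi_rep1"),
    ("iso_a", "R4_2dpi_rep1"),
    ("iso_b", "R5_4dpi_rep2"),  # seen in one sample only: a singleton
    ("iso_c", "R4_6dpi_rep1"),
    ("iso_c", "R4_6dpi_rep3"),
]
kept = drop_singletons(toy_observations)
print(sorted(kept))  # ['iso_a', 'iso_c']

# Consistency check on the counts reported in the text.
remaining = 57_148 - (12_688 + 11_325)
print(remaining)  # 33135
```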
A core set of genes in C. fulvum are recurrently AS across different infections and isolates
To further increase the accuracy of our ensuing AS analysis, the 33,135 non-singleton transcript isoforms were filtered such that isoforms that shared all their intron coordinates (or lack of them) were considered as duplicates, and thus only the longest of them were kept. Following this filtering step, the total number of unique transcript isoforms was reduced to 33,559, including 26,818 transcript isoforms from isolate Race 5 and 26,397 from isolate Race 4. The transcript isoforms were then mapped to the genes of each isolate, and genes associated with more than one transcript isoform were considered to be AS. This included 6,034 genes from isolate Race 5, or 40.3% of its total protein-coding genes, and 6,069 genes from isolate Race 4, or 40.5% of its total protein-coding genes (S5 Table). Among the AS genes, 5,611 genes, or 92.3% of the AS genes in isolate Race 5 and 92.5% of the AS genes in isolate Race 4, were common to both isolates, with 4,111 (73.3%) of them further generating the same number of transcript isoforms in each isolate (S6 Table). The percentage of AS genes shared between the two isolates is high, but it is likely to be somewhat inflated by the removal of the singleton transcript isoforms from the analysis. Even so, 37.4% (i.e. 5,611 of the 14,993 genes) and 37.7% (i.e. 5,611 of the 14,895 genes) of the protein-coding genes in isolates Race 5 and Race 4, respectively, were AS and common to both isolates, suggesting that within the noise of the transcript heterogeneity that is created during infections, a sizeable core set of genes exists that are recurrently AS.
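The duplicate filter and the AS call described above can be sketched as follows: isoforms of a gene that share an identical intron chain (including the empty chain of intronless isoforms) are collapsed to the longest one, and a gene is called AS if more than one isoform remains. This is an illustrative sketch only; the record layout, the grouping by gene, and all names are assumptions rather than the authors' implementation.

```python
def collapse_by_intron_chain(isoforms):
    """Keep the longest isoform for each (gene, intron chain) combination.

    isoforms: list of dicts with keys 'id', 'gene', 'introns' (a list of
    (start, end) tuples, empty for intronless isoforms) and 'length'.
    """
    best = {}
    for iso in isoforms:
        key = (iso["gene"], tuple(sorted(iso["introns"])))
        if key not in best or iso["length"] > best[key]["length"]:
            best[key] = iso
    return list(best.values())


def alternatively_spliced_genes(isoforms):
    """Genes that retain more than one isoform after collapsing."""
    counts = {}
    for iso in isoforms:
        counts[iso["gene"]] = counts.get(iso["gene"], 0) + 1
    return {gene for gene, n in counts.items() if n > 1}


# Hypothetical toy isoforms.
toy_isoforms = [
    {"id": "A.1", "gene": "geneA", "introns": [(100, 200)], "length": 900},
    {"id": "A.2", "gene": "geneA", "introns": [(100, 200)], "length": 1000},
    {"id": "A.3", "gene": "geneA", "introns": [], "length": 1100},  # intron retained
    {"id": "B.1", "gene": "geneB", "introns": [], "length": 500},
    {"id": "B.2", "gene": "geneB", "introns": [], "length": 600},
]
unique_isoforms = collapse_by_intron_chain(toy_isoforms)
as_genes = alternatively_spliced_genes(unique_isoforms)
print(sorted(iso["id"] for iso in unique_isoforms))  # ['A.2', 'A.3', 'B.2']
print(as_genes)  # {'geneA'}
```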
Inspection of the transcript diversity generated by the AS genes showed that most of the AS genes in isolates Race 5 (n = 4,590; 76.1%) and Race 4 (n = 4,669; 76.9%) produced two or three transcript isoforms, and only a small number of AS genes from isolates Race 5 (n = 55; 0.9%) and Race 4 (n = 45; 0.7%) produced larger numbers of 10 or more transcript isoforms (Fig 2A). An analysis of the AS events revealed similar patterns for isolates Race 5 and Race 4, with IR being the most frequent type of AS, and accounting for 71.1% and 70.7% of all classified AS events in isolates Race 5 and Race 4, respectively (Fig 2B and 2C and S7 Table). Collectively these results indicate similar patterns of AS events in the two isolates and that although many genes are predicted to be AS, they overall generate a low number of transcript isoforms.
(A) Bar chart showing the number of AS genes from isolates Race 5 and Race 4 (y-axis), producing two or more transcript isoforms (x-axis). (B) Bar chart showing the number of AS events in isolates Race 5 and Race 4 (y-axis), classified in one of the major types of AS (x-axis). (C) Bar chart showing the number of AS genes in isolates Race 5 and Race 4 (y-axis), classified in one of the major types of AS (x-axis). In panels B and C, types of AS events shown are alternative 5’ or 3’ splice sites (A5/A3), alternative first or last exons (AF/AL), mutually exclusive exons (MX), intron retention (IR), and exon skipping (ES).
We next investigated whether AS in pathogen genes during host infection differentially affected different gene categories, by performing a functional enrichment analysis based on conserved PFAM domains, gene ontology (GO) terms, and functional gene categories. Among the 6,034 and 6,069 AS genes in isolates Race 5 and Race 4, respectively, significant overrepresentation (adjusted p-value < 0.01) of PFAM domains was observed for major facilitator superfamily (MFS) and sugar transporters, TFs of the fungal Zn(2)-Cys(6) family, and cytochrome P450 enzymes (Fig 3A and 3B). Accordingly, among all the AS genes in both isolates, the most significantly enriched GO terms were transmembrane transport (GO:0055085), regulation of transcription by RNA polymerase II (GO:0006357), carbohydrate transport (GO:0008643), and oxidoreductase activity (GO:0016491) (Fig 3C and S8 Table). Finally, based on hypergeometric tests, AS genes predicted to encode transporters (p-value < 1e-30), secreted proteins (p-value < 1e-8), cytochrome P450 enzymes (p-value < 1e-6), carbohydrate-active enzymes (CAZymes; p-value < 1e-6), and proteases (p-value < 0.01) were significantly overrepresented among all AS genes in both isolates (Table 2). In contrast, no significant enrichment or depletion of genes encoding candidate effectors was detected among the pool of AS genes in isolate Race 5 and/or Race 4. Collectively, these results indicate that AS in C. fulvum during host infections occurs more frequently in genes likely to be involved in the transport of sugars or other carbohydrates, regulation of genes, and monooxygenase activity, but less frequently in genes encoding proteins that are directly involved in modulation of host immunity, such as effectors.
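For readers unfamiliar with the hypergeometric enrichment test mentioned above, the sketch below computes the upper-tail probability of observing at least k category members among n AS genes, given K category members in a genome of N genes. The genome and AS gene counts are taken from the text for isolate Race 5, but the category size and overlap are invented for illustration, and the function is a generic textbook formulation rather than the authors' code.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, successes K, draws n)."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return tail / total


# N and n are from the text (Race 5); K and k are hypothetical placeholders.
genome_genes = 14_993      # protein-coding genes in isolate Race 5
as_genes_total = 6_034     # AS genes in isolate Race 5
category_genes = 500       # hypothetical size of a functional category
category_as_genes = 280    # hypothetical number of AS genes in that category

p_value = hypergeom_upper_tail(category_as_genes, genome_genes,
                               category_genes, as_genes_total)
print(f"enrichment p-value: {p_value:.3g}")
```

Exact integer binomials are used here so that the example needs no third-party library; for many categories a multiple-testing correction such as the Benjamini–Hochberg method named in the figure legend would still be required.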
(A-B) Dot plots showing the conserved PFAM domains that are significantly enriched among the AS genes in isolates Race 5 and Race 4. The size of the dots corresponds to the number of AS genes containing the respective PFAM domain. The x-axis shows what proportion of AS genes containing the respective PFAM domain contributes to the total of AS genes containing a conserved PFAM domain. Dots are color-coded based on enrichment p-values adjusted using the Benjamini–Hochberg method. (C) Bar charts showing p-values of gene ontology (GO) terms from the classes of biological process and molecular function that are enriched among the genes predicted to undergo AS in isolates Race 5 and Race 4. The x-axis indicates the enrichment p-values in negative log scale.
The table shows the total number of genes and of AS genes in C. fulvum isolates Race 5 and Race 4 that are classified in the different gene functional categories. Enrichment p-values were obtained with hypergeometric tests.
AS genes are more abundantly present in repeat-rich chromosomes and exhibit longer upstream intergenic regions
We have previously determined that the genome of C. fulvum shows a compartmentalized architecture composed of gene-dense/repeat-poor regions interspersed with gene-sparse/repeat-rich regions [39,40]. We therefore examined whether the frequency and type of AS events was affected by the presence of the genes in one or the other region type. An inspection of the distribution of AS genes on the different chromosomes showed that the number of AS genes varied among the core chromosomes, ranging from 33.9% and 33.6% of the total genes present in Chr4 of isolates Race 5 and Race 4, respectively, to 45.7% and 45.8% of the total genes present in Chr1 of isolates Race 5 and Race 4, respectively (Figs 4A and S6 and S9 Table). Interestingly, in the dispensable Chr14, which is present in isolate Race 5 but absent in isolate Race 4, only 25.0% of the genes were AS. Overall, a positive correlation (Pearson correlation coefficient r = 0.63) was observed between abundance of AS genes and repetitive DNA content in the 13 core chromosomes (Fig 4B). For instance, in Chr3, which has the highest repetitive DNA content among core chromosomes (62%), 43.8% and 44.6% of the genes in isolates Race 5 and Race 4, respectively, were AS during host infections. In contrast, in Chr13, which has the lowest repetitive DNA content (36%), only 36.9% of the genes in both isolates were AS. These results indicate that AS is more prevalent in genes located in repeat-rich regions of the genome and that dispensable chromosomes carry fewer AS genes.
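The Pearson correlation coefficient reported above relates, per chromosome, the repeat content to the percentage of AS genes. The sketch below shows the calculation on a small toy table: only the Chr3 and Chr13 pairs for isolate Race 5 are taken from the text, and the remaining values are invented placeholders, so the printed coefficient is not expected to reproduce the reported r.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# (repeat content %, AS genes %) per chromosome.
# Chr3 and Chr13 come from the text; the other rows are hypothetical.
chromosomes = {
    "Chr3": (62.0, 43.8),
    "Chr13": (36.0, 36.9),
    "ChrX_hypothetical": (50.0, 40.0),
    "ChrY_hypothetical": (45.0, 41.5),
    "ChrZ_hypothetical": (55.0, 39.0),
}
repeat_pct = [v[0] for v in chromosomes.values()]
as_pct = [v[1] for v in chromosomes.values()]
r = pearson_r(repeat_pct, as_pct)
print(f"r = {r:.2f}")
```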
(A) Bar chart showing the percentage of AS genes (y-axis) present in each of the chromosomes (x-axis) of the reference genome of isolate Race 5. (B) Scatter plot showing that chromosomes with an overall higher repeat content typically contain higher percentages of AS genes, except for the dispensable chromosome Chr14 that is absent in isolate Race 4. Each point represents a chromosome, and points are color-coded to distinguish isolate Race 5 from isolate Race 4. (C) Violin plots showing the size distribution of the intergenic regions from AS and non-AS genes. The violin plots show that the upstream intergenic regions of AS genes are significantly longer compared to non-AS genes. P-values were obtained with the Wilcoxon rank sum test.
Next, we investigated whether AS also more frequently affects genes present in gene-poor regions that are typically characterized by long intergenic regions and high amounts of repetitive DNA. To do so, we compared the distribution of the sizes of the intergenic regions for AS and non-AS genes. For both isolates Race 5 and Race 4, the upstream intergenic regions of AS genes (Race 5: mean = 3,911 bp, median = 759 bp; Race 4: mean = 3,873 bp, median = 762 bp) were significantly longer (Race 5: p-value < 2E-12; Race 4: p-value < 3E-13) compared to the upstream intergenic regions of non-AS genes (Race 5: mean = 3,290 bp, median = 684 bp; Race 4: mean = 3,313 bp, median = 682 bp) (Fig 4C). In contrast, no significant differences were observed when comparing the size of downstream intergenic regions of AS genes and non-AS genes in both isolates (Fig 4C). These results indicate that the upstream intergenic regions of AS genes, which include promoter and other cis-regulatory gene regions, were significantly longer compared to genes with no evidence of AS. Because long intergenic regions in C. fulvum are almost always associated with high repetitive DNA content [39], the amount of repetitive DNA was investigated. Indeed, the average repetitive DNA content of the upstream intergenic regions of AS genes was significantly (p-value < 0.01) larger in both isolates compared to non-AS genes (S7 Fig), but no significant difference in repetitive DNA content was observed for the downstream intergenic regions (S7 Fig). Finally, significant differences were observed between AS genes and non-AS genes with respect to their physical characteristics. Specifically, AS genes had an overall lower GC content, were longer, had more conserved PFAM domains, and had shorter exons compared to non-AS genes (S8 Fig and S10 Table). These observations suggest that gene structure influences AS frequency, potentially by facilitating secondary RNA structures and providing more opportunities for splicing events. The presence of more conserved domains in AS genes may also indicate their functional importance and evolutionary conservation.
AS may putatively increase protein diversity in C. fulvum during tomato infections
AS has the potential to increase protein diversity in cells when it affects gene coding sequences, as opposed to AS events in 3’ or 5’ untranslated regions (UTRs). To investigate the extent to which AS theoretically increased protein diversity in C. fulvum during infections, ORFs in the assembled transcript isoforms of isolates Race 5 and Race 4 were predicted. A total of 26,632 and 26,108 ORFs could be predicted from the 26,818 and 26,397 unique transcript isoforms that were assembled from isolates Race 5 and Race 4, respectively. The sequences of the translated ORFs were subsequently organized into clusters with cd-hit such that identical or fully contained sequences were grouped together and each cluster represented a unique protein isoform. By doing so, a total of 19,757 and 19,551 protein isoforms were identified in isolates Race 5 and Race 4, respectively. These numbers are 31% higher than the predicted number of protein-encoding genes in the genomes of isolates Race 5 (n = 14,993) and Race 4 (n = 14,895) [40], suggesting that AS could theoretically increase protein diversity in C. fulvum during host infections.
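The grouping of identical or fully contained protein sequences can be illustrated with a simple greedy pass: sequences are visited from longest to shortest, and a sequence becomes a new representative only if it is not a substring of an existing one. This is a simplified stand-in for the idea described above, not the cd-hit algorithm itself, and the toy sequences are invented. The final lines recompute the proteome increase from the counts reported in the text.

```python
def cluster_by_containment(proteins):
    """Greedy clustering: a sequence joins a cluster if it is identical to,
    or fully contained in, that cluster's (longer) representative."""
    representatives = []
    for seq in sorted(set(proteins), key=len, reverse=True):
        if not any(seq in rep for rep in representatives):
            representatives.append(seq)
    return representatives


# Hypothetical toy translations of four transcript isoforms.
toy_proteins = ["MKTAYIAK", "KTAYIA", "MKTAYIAK", "MKTGGIAK"]
unique_proteins = cluster_by_containment(toy_proteins)
print(len(unique_proteins))  # 2

# Relative increase over the gene count, using numbers from the text.
increase_race5 = 100 * (19_757 / 14_993 - 1)
increase_race4 = 100 * (19_551 / 14_895 - 1)
print(f"{increase_race5:.1f}% {increase_race4:.1f}%")  # 31.8% 31.3%
```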
From the 6,034 and 6,069 AS genes in isolates Race 5 and Race 4, respectively, 3,545 (58.7%) and 3,554 (58.5%) genes were predicted to produce distinct protein isoforms (S11–S13 Tables), indicating that the remaining ~41% of the AS genes had splicing events in non-coding sequences. When considering these data in view of the entire genome, only 23.6% and 23.9% of the total protein-coding genes in isolates Race 5 and Race 4, respectively, contributed through AS to proteome diversity. A functional enrichment analysis indicated that genes encoding transporters (p-value < 4E-13), secreted proteins (p-value < 8E-9), and to a lesser extent CAZymes (p-value < 8E-4) and cytochrome P450 enzymes (p-value < 3E-3), were overrepresented in the pool of AS genes producing multiple protein isoforms in isolates Race 5 or Race 4 (S14 Table). For instance, 432 (12.2%) and 424 (11.9%) of such genes in isolates Race 5 and Race 4, respectively, encoded secreted proteins (S11–S13 Tables). Interestingly, there were 134 effector-encoding genes among the AS genes yielding multiple protein isoforms, although on a genome-wide level no enrichment for genes encoding candidate effectors was observed (S14 Table). Included among the effector genes were the previously described Ecp1 (S9 Fig) [42], Ecp5 (S10 Fig) [43], Ecp6 (S11 Fig) [44], and Ecp12 effectors (S12 Fig) [45]. In particular, AS events modified the mature proteins of Ecp1, Ecp6, and Ecp12, while only the signal peptide was affected by an AS event in the gene encoding Ecp5. Analysis of isoform expression during interaction with tomato revealed that, in both Race 5 and Race 4 isolates, the longest isoform from Ecp5, carrying 17 additional amino acids in the signal peptide, was the one that was preferentially expressed. Similarly, the shortest isoform from Ecp6, which is missing six amino acids between the signal peptide and the first LysM conserved domain, was preferentially expressed. In contrast, none of the isoforms from Ecp1 and Ecp12 were clearly preferentially expressed during infection.
The vast majority of the AS genes producing multiple protein isoforms (i.e. 2,498 or 70.5% of the AS genes in isolate Race 5 and 2,540 or 71.5% of the AS genes in isolate Race 4) were predicted to yield just two isoforms (Fig 5A), and only 117 (3.3%) of the AS genes in isolate Race 5 and 134 (3.8%) of the AS genes in isolate Race 4 were predicted to encode five or more distinct protein isoforms (Fig 5A and S11–S13 Tables). Likewise, of the 2,950 AS genes shared by both isolates, 2,207 (74.8%) were predicted to yield a similar number of distinct protein isoforms, indicating a similar contribution to proteome diversity (Fig 5B and S11 Table). Two notable examples of such AS genes are CLAFUR5_09979 and CLAFUR5_09583, which were predicted to yield 13 and 10 distinct protein isoforms in both isolates Race 5 and Race 4 (S13 Fig), and which encode putative TFs with homology to the CON7 TF required for appressorium formation and pathogenicity of M. oryzae [46,47], and to the ascospore maturation 1 protein (Asm-1) TF that regulates sexual and asexual reproduction in Neurospora crassa [48], respectively.
(A) Bar chart showing the number of AS genes in isolates Race 5 and Race 4 producing 0, 1, or more distinct protein isoforms (x-axis). (B) Scatter plot showing that a positive correlation exists among pairs of orthologous AS genes between isolates Race 5 and Race 4 that produce similar numbers of distinct protein isoforms. Each point represents a gene, and the plot shows a total of 6,493 AS genes in either isolate Race 5 or Race 4. (C-D) Scatter plots showing the diversity of protein isoforms produced by AS genes in isolate Race 5 (panel C) and Race 4 (panel D). Each point represents a pairwise alignment between the different protein isoforms that are produced by a single AS gene. Only AS genes predicted to yield two or more distinct protein isoforms are shown in the scatter plots. The y-axis shows the percent amino acid identity among the protein isoforms, and the x-axis shows the differences in size between the aligned protein isoforms as the percentage of the longest aligned isoform. Alignments were generated based on the local-global strategy, which is based on aligning the longest sequences locally and the shortest sequences globally.
We finally examined the extent to which AS led to the gain or loss of conserved domains or of a signal peptide (SP) in the yielded protein isoforms (S11–S13 Tables). A total of 1,664 AS genes in isolate Race 5 and 1,841 AS genes in isolate Race 4 produced multiple protein isoforms that showed presence/absence variation in their PFAM domains (S15 Table). PFAM domains that varied the most among the protein isoforms were the major facilitator superfamily domain (PF07690; 67 AS genes in Race 5 and 81 AS genes in Race 4), the fungal Zn(2)-Cys(6) binuclear cluster domain (PF00172; 78 AS genes in Race 5 and 74 AS genes in Race 4), the short chain dehydrogenase domain (PF00106; 41 AS genes in Race 5 and 48 AS genes in Race 4), and the fungal specific TF domain (PF04082; 33 AS genes in Race 5 and 36 AS genes in Race 4). Along the same lines, 355 AS genes in Race 5 and 362 AS genes in Race 4 (intersection size = 270) yielded isoforms with presence/absence variation of SP (S15 Table). Included among these were 94 CAZyme and 76 effector encoding genes, including the previously described Ecp30, Ecp42, Ecp53-1, Ecp33, and Ecp10-3 effectors [45].
Differential isoform usage and isoform switching across different infection timepoints are mostly isolate-specific
A differential isoform usage (DIU) analysis was performed in order to detect statistically significant changes in the usage of the different transcript isoforms produced by the AS genes in isolates Race 5 and Race 4 during the course of tomato infection. Such changes could be functionally relevant, as they might signify the preferential production of isoforms with varying functional potential at different stages of the infection. For this purpose, the expression values of the 26,818 and 26,397 transcript isoforms assembled from isolates Race 5 and Race 4 were first estimated for each of the seven timepoints of the infection process that were sampled. Next, DIU analysis was carried out between all possible pairwise comparisons among the seven timepoints.
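Two quantities underlie the DIU comparisons described above: the usage of an isoform, i.e. its share of the summed expression of all isoforms of the same gene, and the set of pairwise timepoint contrasts. The sketch below illustrates both with invented TPM values; it does not perform the statistical test itself, and the function and variable names are assumptions made for the example.

```python
from itertools import combinations

# The seven sampled timepoints (days post inoculation) give 21 contrasts.
timepoints_dpi = [2, 4, 6, 8, 10, 12, 14]
pairwise_contrasts = list(combinations(timepoints_dpi, 2))
print(len(pairwise_contrasts))  # 21


def isoform_usage(tpm_by_isoform):
    """Fraction of a gene's total expression contributed by each isoform."""
    total = sum(tpm_by_isoform.values())
    if total == 0:
        return {iso: 0.0 for iso in tpm_by_isoform}
    return {iso: tpm / total for iso, tpm in tpm_by_isoform.items()}


# Hypothetical replicate-averaged TPM values for one gene with two isoforms.
early = isoform_usage({"isoform_1": 30.0, "isoform_2": 10.0})
late = isoform_usage({"isoform_1": 10.0, "isoform_2": 30.0})
print(early)  # {'isoform_1': 0.75, 'isoform_2': 0.25}
print(late)   # {'isoform_1': 0.25, 'isoform_2': 0.75}
```

In this toy case the dominant isoform switches between the two timepoints even though the total expression of the gene stays constant, which is the kind of change a DIU analysis is designed to detect.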
A total of 401 transcripts from 246 genes of isolate Race 5 and 166 transcripts from 103 genes of isolate Race 4 showed significant (p-value < 0.01) changes in their abundance across different timepoints of the infection (Fig 6A and 6B). Of the AS genes with significant DIU, 111 genes in isolate Race 5 and 42 genes in isolate Race 4 had AS events affecting their ORF, and thereby putatively yielded protein isoforms with altered levels of abundance during disease progression (S16 Table). Most of these genes (i.e. 76 and 34 genes in isolates Race 5 and Race 4, respectively) produced just two protein isoforms, but 35 genes in isolate Race 5 and seven genes in isolate Race 4 produced more than three isoforms. The majority of the AS genes with DIU across the infection process encoded hypothetical proteins (i.e. 55 in isolate Race 5 and 26 in isolate Race 4), but 17 and 11 genes in isolates Race 5 and Race 4, respectively, encoded secreted proteins, of which two in each isolate were candidate effectors (Fig 6C and 6D). Finally, only 17 genes with DIU at the transcript level during disease progression were common to both isolates (S14 and S15 Figs and S16 Table) and only five had splicing events in their coding sequences and could thereby yield multiple protein isoforms. These five genes encoded a tRNA (uracil-O(2)-)-methyltransferase (CLAFUR5_01082), a lactose permease (CLAFUR5_11255), and three hypothetical proteins (CLAFUR5_08245, CLAFUR5_09805, CLAFUR5_20329), and resulted in the production of two protein isoforms each in both isolates Race 5 and Race 4 (CLAFUR5_01082, CLAFUR5_08245, CLAFUR5_20329), or in a different number of protein isoforms in the two isolates (CLAFUR5_11255, CLAFUR5_09805).
(A-B) Transcript usage and relative expression of the 401 transcripts from 246 genes of isolate Race 5 (panel A) and 166 transcripts from 103 genes of isolate Race 4 (panel B) with changes in abundance, when originating from the same gene, during different timepoints of the infection. In both panels A and B, the left hand-side heat map shows transcript usage as the fraction of the sum of the expression of all transcripts from the gene and considering the average expression values of three replicates. The heat map on the right hand-side shows the expression in transcripts per million (TPM) for the transcripts. (C-D) Examples of AS genes with DIU during the infection, encoding candidate effectors from C. fulvum isolates Race 5 and Race 4. The line graphs show two AS genes (i.e. CLAFUR5_11054, CLAFUR5_14663) from isolate Race 5 (panel C) and two AS genes (i.e. CLAFUR5_11499, CLAFUR5_12536) from isolate Race 4 (panel D). In the line graphs, the points represent the expression values in TPM (transcripts per million) of the individual transcripts at different timepoints of the infection. Standard deviation in the TPM values from three infections (i.e. biological replicates) is shown as vertical lines. The trends of transcript expression across time are shown as thick lines connecting the average TPM values for each individual transcript. The exon/intron structures of the transcripts are shown at the bottom of each line graph, with the predicted coding sequences represented as thicker boxes.
In addition to the DIU analysis, an isoform switching (IS) analysis was also carried out to identify pairs of transcript isoforms switching in their relative abundance during host infection. Such switches, if present, would appear as an intersection in the transcript abundance of the different isoforms during disease progression. The analysis revealed a total of 193 and 123 pairs of transcript isoforms from 166 and 105 genes in isolates Race 5 and Race 4, respectively, that significantly (p-value < 0.05) switched their expression values during host infection (S17 and S18 Tables). Of these, 13 (7.8%) genes in isolate Race 5 encoded secreted proteins and 2 (1.2%) encoded candidate effectors, while 17 (16.2%) genes in isolate Race 4 encoded secreted proteins and 7 (6.6%) candidate effectors. Overall, only 10 genes exhibited cases of IS in both isolates, which included the meiosis protein mei2 (CLAFUR5_07241), a polyubiquitin (CLAFUR5_08768), two hypothetical secreted proteins (CLAFUR5_06894, CLAFUR5_14590), an alcohol dehydrogenase (CLAFUR5_01545), an (R,R)-butanediol dehydrogenase (CLAFUR5_06495), a protoporphyrin uptake protein (CLAFUR5_11018), and three other hypothetical proteins (CLAFUR5_08576, CLAFUR5_10847, CLAFUR5_12639). Other genes with unusual isoform expression patterns include CLAFUR5_10677, which encodes a predicted 2-oxoglutarate-dependent ethylene/succinate-forming enzyme. The two AS isoforms of this gene exhibited contrasting expression trends between isolates Race 5 and Race 4, as in Race 5 their expression decreased over time, while in Race 4 it increased (S16 Fig). Another gene with contrasting isoform expression patterns is CLAFUR5_03540, which encodes a putative fructose-bisphosphate aldolase. This gene produces three AS isoforms and, while expression of one of its isoforms showed an S-shaped pattern in isolate Race 5, it exhibited a V-shaped pattern in isolate Race 4 (S16 Fig).
It is currently unknown whether IS in these genes or in genes with DIU has any functional consequences for infections or whether it represents transcriptional noise. However, given the low number of genes with DIU or IS common to the two isolates, it can be assumed that their potential impact on host infections may be isolate-specific rather than at the species level.
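The idea of an isoform switch as an intersection of two abundance curves can be illustrated with a minimal sketch. It only locates the intervals in which the ordering of two isoforms flips; it does not reproduce the statistical testing or probability estimates that the IS analysis applies. The function name and the TPM values are hypothetical and serve only as an illustration.

```python
from typing import List, Tuple


def find_switch_points(iso_a: List[float], iso_b: List[float]) -> List[Tuple[int, int]]:
    """Return pairs of consecutive timepoint indices between which the
    abundance ordering of two isoforms flips (their curves intersect)."""
    switches = []
    for i in range(len(iso_a) - 1):
        before = iso_a[i] - iso_b[i]
        after = iso_a[i + 1] - iso_b[i + 1]
        if before * after < 0:  # a sign change means the two curves cross
            switches.append((i, i + 1))
    return switches


# Hypothetical mean TPM values at 2, 4, 6, 8, 10, 12, and 14 dpi
isoform_1 = [30.0, 25.0, 18.0, 12.0, 8.0, 5.0, 4.0]
isoform_2 = [5.0, 9.0, 14.0, 20.0, 26.0, 30.0, 33.0]

# The two isoforms cross between the third and fourth timepoints (6 and 8 dpi)
print(find_switch_points(isoform_1, isoform_2))
```

A crossing found this way is only a candidate; whether it is significant depends on replicate variation, which this sketch ignores.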
Discussion
In this study, we systematically analyzed the landscape of AS events occurring in genes of two C. fulvum isolates, i.e. Race 5 and Race 4, during a complete fungal infection cycle on the tomato host. Our transcriptomic analyses revealed a significant degree of heterogeneity in the transcript isoforms assembled from different isolates, infections, and infection timepoints, suggesting that the majority are the result of stochastic noise in the transcriptional and splicing machinery. However, given the dynamic nature of AS in cells and the complexity of the splicing machinery, stochastic fluctuations in the splicing output are to an extent expected [49]. Other factors that are difficult to control, such as inaccuracies in sequencing and transcriptome assembly, cell-to-cell variability in splicing and transcription kinetics, and others [50–52], could contribute to the transcriptional noise and the discrepancies observed in the splicing outputs as well. Despite the high sample-to-sample heterogeneity in the assembled transcript isoforms, our studies showed that ~40% of the protein-coding genes in each of the two isolates of C. fulvum were AS in more than one sample, with ~37.5% of them being AS in both isolates Race 5 and Race 4. This indicates that a sizeable fraction of C. fulvum genes are recurrently AS across different infections and/or isolates, suggesting that splicing in these cases may be functionally relevant. The percentage of AS genes in C. fulvum is also very high compared to reports in other fungal pathogens, which typically have less than 30% of their genes undergoing AS [18,19,21,33,49,53]. Yet, most studies, including this one, are in agreement that IR is the most frequent type of AS in fungi [21,29,33,35,49,54] and in contrast to mammals, in which ES is typically the prevalent type of AS [55].
It was previously shown that the genome of C. fulvum is rich in TEs and exhibits a bipartite architecture that resembles the ‘two-speed genome’ model of evolution, with candidate effector genes enriched in TE-rich regions [39]. Here, we found that AS genes were more frequently present in repeat-rich chromosomes and have significantly longer upstream intergenic regions with higher repetitive DNA content. An exception is the dispensable Chr14, as the majority of the genes on this chromosome are transcriptionally inactive during interaction with tomato [40]. The insertion of TEs in intergenic or intronic regions has been previously associated with changes in AS patterns in plants and humans [56–59], and it is thus plausible that a connection exists between TEs and the induction of AS in genes of C. fulvum as well. It would be interesting to examine whether this applies more generally to other fungal species and the extent to which AS associates with the organization of their genomic content.
Recently, a comparative analysis of AS in seven human fungal pathogens showed that genes subjected to AS during host infections were mostly associated with the functionality of the cell membrane, whereas AS under environmental stress conditions mainly affected genes with diverse regulatory functions [32]. Likewise, in the rice-blast fungus M. oryzae, it was shown that genes that were AS during infections were mostly enriched for TFs and phospho-transferases [35]. These and various other studies have highlighted that different gene categories are differentially affected by AS in response to similar stresses, and that the complement of genes undergoing AS in fungi varies significantly among species [19,20,32]. In our studies, we found that in C. fulvum, AS during host infections frequently affects genes encoding TFs, suggesting that it may have a predominantly regulatory effect by reprogramming gene expression. The genes CLAFUR5_09583 and CLAFUR5_09979 were notable examples, as they encode orthologs of the ASM-1 [48] and CON7 [46] TFs, respectively, and each is predicted to yield ten or more distinct protein isoforms. Recently, CON7 was shown to be a key transcriptional regulator in Fusarium graminearum, affecting genes involved in conidiation, sexual development, virulence, and vegetative growth [60], whereas ASM-1 has been shown to affect morphogenesis (e.g. conidiation) and development in several fungal species [61]. The contribution of CLAFUR5_09583 and CLAFUR5_09979 to the virulence of C. fulvum is unknown, but since both genes were AS in both isolates during infection and were predicted to yield the same number of protein isoforms, it is possible that some of the produced isoforms are biologically meaningful for infections. Overall, the prevalence of splicing in TF-encoding genes supports a role for AS in C. fulvum in modulating infections on its tomato host.
Other functional gene categories enriched among AS genes in C. fulvum were MFS transporters, sugar transporters, and cytochrome P450 enzymes. The functional significance of AS in these gene categories is perhaps less clear but given their general involvement in nutrition, metabolism, and cellular detoxification processes, it may imply that AS in these genes promotes adaptation to the host environment. MFS transporters form the largest superfamily of secondary active transporters that collectively transport a broad spectrum of substrates, thereby participating in diverse physiological processes and stress responses, including nutrient acquisition, resistance against oxidative stress and xenobiotic compounds, secretion of endogenously produced toxins, and others [62–65]. Likewise, fungal sugar transporters are involved in the uptake of small plant-derived sugar molecules [66] and they may further have functions in sugar sensing, carbon catabolite repression, and utilization of the most favorable carbon source in the environment [67–70]. Finally, cytochrome P450 monooxygenases are a diverse superfamily of proteins known to be involved in cellular metabolism, xenobiotic detoxification, synthesis of toxins [71,72], and other metabolic processes of relevance to infections [73–75]. Collectively, the above suggest that in C. fulvum, AS in MFS transporters, sugar-like transporters, and cytochrome P450 enzymes during tomato infections may offer a means to augment and fine-tune virulence on the host at multiple levels, including metabolic adaptation to the host’s carbon and nutrient environment, protection against oxidative stress and plant defense compounds, production of toxins during pathogenesis, and others. However, despite the possible contribution of these gene families to infections, in contrast, genes encoding proteins that are directly involved in modulation of host-immunity, such as effectors, were less frequently affected by AS. 
Caution is warranted here, however, as many effectors are typically expressed at very early stages of the infection process, when fungal biomass is still very limited and the detection of fungal transcripts in the sequenced samples is challenging. Therefore, it is perhaps not surprising that genes involved in various metabolic processes were more readily detected in the pool of AS genes, as compared to genes encoding effectors.
By comparing the expression of pathogen-derived transcripts at seven timepoints during host infections, we identified several cases of DIU in AS genes of C. fulvum. Most of these genes encoded hypothetical proteins, but a few encoded effectors, suggesting that infection stage-dependent modulation of AS in effector-encoding genes could possibly prime infections of the host by the selective production or downregulation of specific functional isoforms of the effectors. Finally, ~41% of the AS genes in C. fulvum had splicing events in their 5’ or 3’ UTRs, thereby producing just a single protein isoform. Such splicing events, although they do not increase protein diversity, increase the functional diversity in the 5’ or 3’ UTRs, which could lead to significant alterations in gene expression, protein translation, and localization [76–80]. Collectively, the high frequency of AS events in the 5’ or 3’ UTRs of C. fulvum genes during infections may suggest an additional level of post-transcriptional control of the infection process that so far remains largely unexplored.
Materials and methods
Inoculations of tomato plants with C. fulvum isolates
Cladosporium fulvum isolates Race 5 [81] and Race 4 [82] were grown in half-strength potato dextrose agar (PDA) for two weeks at 23°C. Tomato plant (Solanum lycopersicum cv. Moneymaker) inoculations were performed as previously described [83,84]. Briefly, C. fulvum spores were collected from two-week-old PDA plates growing at 25°C in the dark. Moneymaker tomato plants were grown for six weeks in a growth chamber with 16 hr light, 70% humidity at 25°C, and 8 hr dark, 90% humidity at 23°C. Spores were suspended at 10^6/ml in 10% potato dextrose broth (PDB) and sprayed on the lower leaf side of ten six-week-old tomato plants. The inoculated plants stayed in the dark, with 98% humidity for the first two days. After that, they were returned to the 16/8 hr light/dark conditions as indicated above. Two infected leaves from each inoculated plant were harvested at 2, 4, 6, 8, 10, 12, and 14 dpi. Collected samples were immediately frozen in liquid nitrogen and stored at -80°C until RNA extraction. The inoculation was repeated three times under the same conditions.
RNA extraction and sequencing
Samples were ground to a powder in liquid nitrogen, and total RNA was extracted using Trizol (Invitrogen, Carlsbad, CA, USA). RNA quality was measured using a Qubit fluorometer (Life Technologies, New York, NY, USA) and the Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA). Preparation and sequencing of the polyA-selected RNAseq libraries were outsourced to the DNA Technologies and Expression Analysis Core Laboratory at the UC Davis Genome Center (https://dnatech.genomecenter.ucdavis.edu/). Libraries were sequenced on an Illumina NovaSeq 6000 instrument (PE150 format).
Comparison of introns and splicing sites
The number of introns supported by RNAseq reads in C. fulvum Race 4 was obtained by analyzing the splicing junction table generated by STAR v2.7.9a, after mapping the RNAseq reads to the genome of C. fulvum Race 4 [40]. To estimate conservation of number and size of introns, one-to-one pairs of orthologous genes from C. fulvum Race 5 and Race 4 were obtained with OrthoFinder v.2.5.4 [85]. Number and size of introns of orthologous genes were then investigated with support of the script agat_sp_add_introns.pl from AGAT v1.2 package [86] to add introns to the gene annotation files. To investigate whether the ortholog genes share the same intron start and end coordinates, the gene annotation of isolate Race 4 was mapped to the genome of isolate Race 5 using Liftoff v1.6.3 [87]. This resulted in a new annotation with coordinates of genes, exons, and introns from isolate Race 4 in the genome of isolate Race 5. The new annotation generated by this procedure allowed direct comparison of the intron coordinates based on the reference genome of isolate Race 5.
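Once both annotations share one coordinate system, comparing intron boundaries reduces to a set comparison. The sketch below assumes 1-based inclusive exon coordinates and uses made-up exon intervals; the function names are hypothetical and do not correspond to any AGAT or Liftoff interface.

```python
def introns_from_exons(exons):
    """Derive intron intervals (1-based, inclusive) from the exon
    intervals of a single transcript."""
    ordered = sorted(exons)
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def shared_introns(introns_a, introns_b):
    """Introns with identical start and end coordinates in both annotations."""
    return sorted(set(introns_a) & set(introns_b))


# Hypothetical exons of one ortholog pair, both on Race 5 genome coordinates
race5_exons = [(100, 250), (310, 480), (540, 900)]
race4_lifted_exons = [(100, 250), (310, 480), (545, 900)]

race5_introns = introns_from_exons(race5_exons)
race4_introns = introns_from_exons(race4_lifted_exons)
shared = shared_introns(race5_introns, race4_introns)
print(race5_introns, race4_introns, shared)
```

In this made-up example the first intron is identical in both isolates, whereas the second differs at its 3' boundary and is therefore not counted as shared.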
Estimation of percentages of RNAseq reads from C. fulvum and tomato genomes
Prior to read mapping, the RNAseq reads were processed with the script bbduk.sh from BBMap package v38.90 [88] to remove remaining adapter sequences (parameters: ktrim = r, k = 23, mink = 11, hdist = 1, minlength = 40, tpe, and tbo). To quantify and remove reads that originated from tomato, the trimmed reads were processed with seal.sh from BBMap package v38.90 (parameters: ambiguous = toss, and k = 27), using as reference the genomes of C. fulvum Race 5 [39] (GenBank accession GCA_020509005.2) and Solanum lycopersicum version SL4.0 [89]. Briefly, RNAseq reads that had a better match to the S. lycopersicum genome than to the C. fulvum genome, and reads that matched both genomes equally well, were filtered out. RNAseq reads that had a better match to the C. fulvum genome and reads considered unmatched to any genome were used for downstream analysis.
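The retention rule described above can be written as a small decision function. This is only an illustration of the logic: the per-genome scores are assumed here to be counts of matching k-mers, which is a simplification and not a description of how seal.sh scores reads internally.

```python
def keep_read(fungal_score: int, tomato_score: int) -> bool:
    """Decide whether a read is kept for downstream analysis.

    Scores are hypothetical counts of k-mers matching each genome;
    a score of 0 means that the read did not match that genome.
    """
    if fungal_score == 0 and tomato_score == 0:
        return True  # unmatched to any genome: retained
    if fungal_score > tomato_score:
        return True  # better match to C. fulvum: retained
    return False  # better match to tomato, or a tie: filtered out


# Hypothetical reads as (fungal_score, tomato_score) pairs
reads = {"r1": (12, 0), "r2": (3, 9), "r3": (5, 5), "r4": (0, 0)}
kept = sorted(name for name, (f, t) in reads.items() if keep_read(f, t))
print(kept)
```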
Read mapping and transcriptome assembly
The filtered RNAseq reads were mapped to the C. fulvum Race 5 genome with STAR v2.7.9a [90] in 2-pass mode (parameters: --twopassMode Basic, --alignIntronMin 20, --alignIntronMax 2000, --outSAMtype BAM SortedByCoordinate). The mapped reads were assembled into full length transcripts with StringTie v2.1.7 [91], using the gene annotation of C. fulvum Race 5 as reference (parameter: -G). Because the genome of C. fulvum Race 5 has many genes physically close to each other, with a median intergenic region size of only 646 bp [39], overlapping of untranslated regions (UTRs) between neighboring genes is common, which can result in chimeric transcripts during assembly. To minimize this issue, a gene-by-gene transcriptome assembly strategy was utilized. This strategy consisted of extracting, for each individual sample, the RNAseq reads mapped to the annotated gene space using SAMtools v1.9 [92], and then assembling them into full length transcripts using StringTie, which is considered one of the best reference-based transcriptome assemblers, despite showing overall low precision levels of between 29% and 59% at the transcript level [93,94]. By doing so, transcripts for each gene were obtained for each sample (7 timepoints x 3 replicates x 2 isolates = 42). To facilitate downstream analyses, assembled transcripts were assigned IDs that contained the name of the sample from which they originated.
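As a small illustration of the sample layout and of sample-tagged transcript IDs, the sketch below enumerates the 42 samples. The naming scheme (for example `Race5_2dpi_rep1`) and the ID format are assumptions made for this example; the actual identifiers used in the study are not specified in the text.

```python
from itertools import product

ISOLATES = ["Race5", "Race4"]
TIMEPOINTS_DPI = [2, 4, 6, 8, 10, 12, 14]
REPLICATES = [1, 2, 3]


def sample_name(isolate: str, dpi: int, replicate: int) -> str:
    """Build a sample label (hypothetical naming scheme)."""
    return f"{isolate}_{dpi}dpi_rep{replicate}"


# One label per isolate x timepoint x replicate combination
SAMPLES = [
    sample_name(isolate, dpi, rep)
    for isolate, dpi, rep in product(ISOLATES, TIMEPOINTS_DPI, REPLICATES)
]


def tag_transcript(sample: str, gene_id: str, index: int) -> str:
    """Prefix a transcript ID with its sample of origin (hypothetical format)."""
    return f"{sample}|{gene_id}.t{index}"


print(len(SAMPLES))
print(tag_transcript(SAMPLES[0], "CLAFUR5_01082", 1))
```

Carrying the sample name in the transcript ID means that, after clustering, the samples supporting a cluster can be recovered from the member IDs alone.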
Detection of alternative splicing genes
Using the GTF files of the assembled transcripts and the reference genome of C. fulvum Race 5, transcript sequences were extracted with gffread v0.12.7 [95]. The sequences were then clustered with cd-hit v4.8.1 (parameters: -T 8 -M 2048 -c 1 -d 0) [96], such that identical or fully contained transcripts were organized into the same cluster. Thus, each cluster represents a unique transcript. A table with the clusters was obtained with the script cluster2txt that comes with cd-hit. The organization of transcripts into clusters made it possible to identify whether a cluster included transcripts assembled using RNAseq reads from a specific biological replicate of a timepoint. Specifically, the representative transcript t of the cluster c was considered present in a sample s if there was at least one transcript u assembled using reads from s such that u was present in c. This strategy allowed the identification of transcripts supported by multiple replicates, and of whether they were shared by or unique to isolates Race 5 and Race 4. Because transcripts supported by only one sample are likely random transcriptional events, only transcripts supported by at least two samples, i.e., transcripts that could be replicated, were considered for downstream analyses. After that, one additional filtering step was applied. First, the script agat_sp_add_introns.pl from AGAT v1.2 package [86] was used to add intron coordinates to the GTF files containing the assembled transcripts. Then, the intron coordinates were compared among isoforms using a custom bash script. Isoforms containing the same number of introns with the same coordinates were considered duplicates, and only the longest one was kept. Finally, genes were considered to undergo AS if they encoded at least two distinct transcripts that remained after the filtering steps. The steps to assemble the transcripts and identify AS genes are summarized in S17 Fig.
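The filtering logic described above (support by at least two samples, removal of isoforms with identical intron chains, and a minimum of two remaining transcripts per gene) can be sketched as follows. This is a simplified stand-in written under stated assumptions: the input records, field names, and function name are hypothetical, and the original analysis used cd-hit output and a custom bash script rather than this code.

```python
from collections import defaultdict


def call_as_genes(clusters, min_samples=2):
    """Return genes with at least two distinct replicated isoforms.

    Each cluster is a dict with keys 'gene', 'introns' (tuple of
    (start, end) pairs), 'length' (transcript length in bp), and
    'samples' (set of sample names that contributed a member).
    """
    best = {}
    for cluster in clusters:
        if len(cluster["samples"]) < min_samples:
            continue  # singleton transcripts are discarded
        key = (cluster["gene"], tuple(cluster["introns"]))
        # Same gene and same intron chain: keep only the longest isoform
        if key not in best or cluster["length"] > best[key]["length"]:
            best[key] = cluster
    per_gene = defaultdict(list)
    for (gene, _), cluster in best.items():
        per_gene[gene].append(cluster)
    return {g: iso for g, iso in per_gene.items() if len(iso) >= 2}


# Hypothetical clusters for two genes
clusters = [
    {"gene": "geneA", "introns": ((251, 309),), "length": 900, "samples": {"s1", "s2", "s3"}},
    {"gene": "geneA", "introns": ((251, 309),), "length": 850, "samples": {"s1", "s4"}},
    {"gene": "geneA", "introns": (), "length": 960, "samples": {"s2", "s5"}},
    {"gene": "geneA", "introns": ((251, 320),), "length": 880, "samples": {"s6"}},
    {"gene": "geneB", "introns": ((100, 160),), "length": 700, "samples": {"s1", "s2"}},
]
as_genes = call_as_genes(clusters)
print(sorted(as_genes))
```

In this example geneA retains two isoforms (the spliced form and an intron-retaining form), whereas geneB has a single replicated transcript and is not called AS.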
Classification of AS types and gene enrichment
AS events were classified into Skipping Exon (SE), alternative 5’/3’ Splice Sites (A5/A3), mutually exclusive exons (MX), intron retained (IR), and alternative first/last exons (AF/AL) with the command generateEvents from SUPPA v2.3 [97] with default settings, except for IR events, which were identified with the "variable boundary" parameter (-b V) set to 50 to relax the restrictive default behavior of SUPPA2 to identify IR events. Gene enrichments were performed for conserved PFAM domains, GO terms, and functional gene categories. Enrichment for PFAM domains was conducted with clusterProfiler v.4.6.2 [98] with a Benjamini-Hochberg adjusted p-value threshold of 0.05, based on PFAM domains identified with InterProScan v5.59–91.0 [99]. Enrichment for GO terms was performed with topGO v2.52 [100] with a p-value threshold of 0.01, based on GO terms identified with PANNZER2 [101] using a minimum Positive Predictive Value (PPV) of 0.4. Enrichment of functional gene categories was performed with hypergeometric tests using the R function phyper, based on the lists of gene categories reported previously [40]. All three types of enrichment were conducted within R v4.3.1.
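For readers less familiar with the hypergeometric enrichment test, a minimal sketch of the upper-tail probability is given below. It uses only the Python standard library and toy numbers; in R the same quantity corresponds to `phyper(k - 1, K, N - K, n, lower.tail = FALSE)`. The function and variable names are chosen for this illustration only.

```python
from math import comb


def hypergeom_upper_tail(k: int, K: int, N: int, n: int) -> float:
    """P(X >= k), where X is the number of category genes among n
    selected genes (e.g. AS genes), given K category genes in a
    background of N genes."""
    upper = min(K, n)
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# Toy example: 10 background genes, 4 in the category, 3 selected genes,
# of which 2 belong to the category
p_value = hypergeom_upper_tail(k=2, K=4, N=10, n=3)
print(round(p_value, 4))
```

A small p-value indicates that the category is observed among the selected genes more often than expected from random sampling of the background.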
Prediction of ORFs and functional impact of AS
Before predicting ORFs within the transcripts, the transcripts in GFF format obtained from Race 4 using the Race 5 genome as reference were mapped to the genome of Race 4 using liftoff v1.6.3 [87] with default settings. By doing so, a new GFF file with transcript coordinates in the genome of Race 4 was obtained. The nucleotide sequences of the transcripts from Race 5 and Race 4 were extracted using gffread v0.12.7 [95]. ORFs were predicted with ORFanage v1.2.0 [102] using parameters adjusted to finding ORFs of at least 120 bp that best matched the gene annotation (parameters: --best and --minlen 120). Transcripts with no predicted ORF were further processed with ORFfinder v0.4.3 with parameters adjusted to predict ORFs of at least 120 bp and starting only with ATG (parameters: -s 0, -ml 120, -strand plus). Only the longest ORFs predicted by ORFfinder were retained. For each of isolates Race 5 and Race 4, the predicted protein sequences were clustered with cd-hit v4.8.1 [96] such that identical or fully contained protein sequences were present in the same group (parameter -c 1). Genes encoding distinct proteins were identified based on the cd-hit results. Specifically, if the protein sequences encoded by the isoforms of a gene grouped in distinct cd-hit clusters, then the gene was considered to encode distinct proteins. To investigate to what extent protein sequences encoded by isoforms of the same gene differ in amino acid sequence, the protein sequences encoded by isoforms were aligned in a pairwise manner using the pairwiseAlignment function within the R package Biostrings v2.68.1 [103] using the “local-global” alignment strategy, such that the gaps at the end of the alignment do not penalize the alignment score. To investigate gain or loss of conserved motifs among isoforms, conserved PFAM domains within the protein sequences were identified with InterProScan v5.59–91.0 [99]. 
Presence/absence variation of PFAM domains among protein isoforms was obtained by analyzing the output of InterProScan using a custom R script within R v4.3.1. Proteins with a signal peptide were predicted using SignalP v6 [104]. AS genes encoding at least one protein with a predicted signal peptide and at least one protein without a signal peptide were considered to exhibit gain or loss of signal peptide. Functional annotations of protein isoforms were obtained by querying the protein sequences against the SwissProt database [105] with BLASTp (parameters: -evalue 1e-10 -outfmt 6 -num_alignments 1 -max_hsps 1). Protein isoforms encoding CAZymes were identified with dbCAN3 [106]. Candidate effectors were identified as described in Zaccaron and Stergiopoulos, 2024 [40]. The 3D structures of encoded mature proteins of the isoforms of the effectors Ecp1, Ecp5, Ecp6, and Ecp12 were predicted with ColabFold v1.5.5 [107] using AlphaFold2 and MMseqs2.
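The presence/absence criteria for PFAM domains and signal peptides can be expressed compactly. The sketch below is an illustration only and is not the custom R script used in the study; the isoform names, domain assignments, and function names are hypothetical.

```python
def varying_domains(domains_by_isoform):
    """Domains present in at least one, but not all, protein isoforms of a gene.

    domains_by_isoform maps an isoform ID to its set of PFAM accessions.
    """
    domain_sets = list(domains_by_isoform.values())
    if len(domain_sets) < 2:
        return set()
    return set.union(*domain_sets) - set.intersection(*domain_sets)


def has_sp_variation(sp_by_isoform):
    """True if at least one isoform has a predicted signal peptide and
    at least one isoform lacks it."""
    return set(sp_by_isoform.values()) == {True, False}


# Hypothetical gene with two protein isoforms
domains = {"iso1": {"PF07690", "PF00083"}, "iso2": {"PF07690"}}
signal_peptides = {"iso1": True, "iso2": False}

print(varying_domains(domains))
print(has_sp_variation(signal_peptides))
```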
Transcript quantification and differential isoform usage (DIU)
To estimate expression levels of the assembled transcripts, first their nucleotide sequences were extracted with BEDtools v2.29.0 [108] and used to build an index with Salmon v1.10.0 [109] (parameter: --keepDuplicates). After that, Salmon v1.10.0 was used in mapping-based mode (parameters: -l IU, --gcBias, --seqBias) to calculate the expression levels of the transcripts using the reads after trimming with bbduk.sh and selecting those with a k-mer matching the C. fulvum genome. By doing so, Salmon generated expression values in Transcripts Per Million (TPM), which were used in SUPPA2 v2.3 [97] to predict differential transcript usage among all possible pairs of timepoints following SUPPA2’s specification. More precisely, isoform inclusion levels were quantified with the command psiPerIsoform using the TPM values, followed by the command diffSplice to calculate differential splicing between conditions with replicates (parameters: --area 1000, --lower-bound 0.05, --combination, --tpm-threshold 2, --gene-correction). A filtering step was carried out to keep only events with p-value < 0.01 and differential splicing value (dPSI) > 0.2. Isoform switch events were detected with the Time-Series Isoform Switch (TSIS) program [110] using the mean method to identify intersection points, p-value < 0.05, and isoform switch probability cutoff < 0.5.
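To make the quantities in this step concrete, the sketch below computes relative isoform usage from TPM values, the difference in mean usage between two timepoints (dPSI), and the final filter. It is an illustration under stated assumptions: the p-value is supplied as a plain input because the empirical significance procedure of SUPPA2 is not reproduced, the filter is applied to the absolute dPSI (an assumption about how the threshold is interpreted), and all TPM values are made up.

```python
from statistics import mean


def isoform_usage(tpm_by_isoform):
    """Relative usage of each isoform: its TPM divided by the summed
    TPM of all isoforms of the gene (NaN if the gene is not expressed)."""
    total = sum(tpm_by_isoform.values())
    if total == 0:
        return {iso: float("nan") for iso in tpm_by_isoform}
    return {iso: tpm / total for iso, tpm in tpm_by_isoform.items()}


def delta_psi(replicates_a, replicates_b, isoform):
    """Difference in mean isoform usage between two conditions (b minus a)."""
    usage_a = mean(isoform_usage(rep)[isoform] for rep in replicates_a)
    usage_b = mean(isoform_usage(rep)[isoform] for rep in replicates_b)
    return usage_b - usage_a


def passes_filter(dpsi, p_value, min_dpsi=0.2, max_p=0.01):
    """Keep events with a sufficiently large usage change and a small p-value."""
    return abs(dpsi) > min_dpsi and p_value < max_p


# Hypothetical TPM values of two isoforms in three replicates per timepoint
early = [{"t1": 8.0, "t2": 2.0}, {"t1": 7.0, "t2": 3.0}, {"t1": 9.0, "t2": 1.0}]
late = [{"t1": 3.0, "t2": 7.0}, {"t1": 4.0, "t2": 6.0}, {"t1": 2.0, "t2": 8.0}]

dpsi_t1 = delta_psi(early, late, "t1")
print(round(dpsi_t1, 2), passes_filter(dpsi_t1, p_value=0.004))
```

Here the usage of isoform t1 drops from a mean of 0.8 to 0.3 between the two timepoints, so the event would pass the filter provided its p-value is below 0.01.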
Supporting information
S1 Text. RNA sequencing of C. fulvum isolates Race 5 and Race 4 during interaction with tomato.
https://doi.org/10.1371/journal.ppat.1012791.s001
(PDF)
S2 Text. Isolates Race 5 and Race 4 of C. fulvum share nearly all their intron splice sites.
https://doi.org/10.1371/journal.ppat.1012791.s002
(PDF)
S3 Text. Transcriptome profiling of C. fulvum during host infections reveals extensive transcript isoform heterogeneity among isolates and infections.
https://doi.org/10.1371/journal.ppat.1012791.s003
(PDF)
S1 Fig. Preliminary transcriptome assembly generated chimeric transcripts spanning genes physically close in the genome.
https://doi.org/10.1371/journal.ppat.1012791.s004
(PDF)
S2 Fig. A high heterogeneity in transcripts produced by Cladosporium fulvum isolates Race 5 and Race 4 during tomato infections is seen between the two isolates and the three different infections that were performed with each isolate.
https://doi.org/10.1371/journal.ppat.1012791.s005
(PDF)
S3 Fig. An overall low number of transcripts are constitutively present in samples from all three tomato infections (i.e. biological replicates) performed either with Cladosporium fulvum isolate Race 5 or isolate Race 4, and in each of the seven infection timepoints that were sampled per infection.
https://doi.org/10.1371/journal.ppat.1012791.s006
(PDF)
S4 Fig. The number of transcripts shared by Cladosporium fulvum isolates Race 5 and Race 4 increases when singleton transcripts were filtered out.
https://doi.org/10.1371/journal.ppat.1012791.s007
(PDF)
S5 Fig. The number of transcripts present in samples from all three tomato infections performed either with Cladosporium fulvum isolate Race 5 or Race 4, and in each of the seven infection timepoints that were sampled per infection, after filtering out singleton transcripts that were present in only one sample.
https://doi.org/10.1371/journal.ppat.1012791.s008
(PDF)
S6 Fig. The genomic distribution of genes predicted to be recurrently alternative spliced (AS) in Cladosporium fulvum isolates Race 5 and Race 4 during tomato infections.
https://doi.org/10.1371/journal.ppat.1012791.s009
(PDF)
S7 Fig. The upstream genomic regions of alternative spliced (AS) genes in Cladosporium fulvum isolates Race 5 and Race 4, have higher amounts of repetitive DNA compared to the upstream genomic regions of non-AS genes.
https://doi.org/10.1371/journal.ppat.1012791.s010
(PDF)
S8 Fig. Genes that are alternatively spliced (AS) in Cladosporium fulvum isolates Race 5 and Race 4 during tomato infections exhibit distinct physical characteristics as compared to non-AS genes.
https://doi.org/10.1371/journal.ppat.1012791.s011
(PDF)
S9 Fig. Alternative splicing (AS) in the effector gene Ecp1 of Cladosporium fulvum isolates Race 5 and Race 4.
https://doi.org/10.1371/journal.ppat.1012791.s012
(PDF)
S10 Fig. Alternative splicing (AS) in the effector gene Ecp5 of Cladosporium fulvum isolates Race 5 and Race 4.
https://doi.org/10.1371/journal.ppat.1012791.s013
(PDF)
S11 Fig. Alternative splicing (AS) in the effector gene Ecp6 of Cladosporium fulvum isolates Race 5 and Race 4.
https://doi.org/10.1371/journal.ppat.1012791.s014
(PDF)
S12 Fig. Alternative splicing (AS) in the effector gene Ecp12 of Cladosporium fulvum isolates Race 5 and Race 4.
https://doi.org/10.1371/journal.ppat.1012791.s015
(PDF)
S13 Fig. Two genes encoding putative transcription factors in Cladosporium fulvum isolates Race 5 and Race 4, and their predicted protein isoforms produced via alternative splicing (AS) events.
https://doi.org/10.1371/journal.ppat.1012791.s016
(PDF)
S14 Fig. Genes from Cladosporium fulvum isolate Race 5 with significant evidence of differential isoform usage at the transcript level during disease progression, which are common to both isolates.
https://doi.org/10.1371/journal.ppat.1012791.s017
(PDF)
S15 Fig. Genes from Cladosporium fulvum isolate Race 4 with significant evidence of differential isoform usage at the transcript level during disease progression, which are common to both isolates.
https://doi.org/10.1371/journal.ppat.1012791.s018
(PDF)
S16 Fig. Examples of genes from Cladosporium fulvum producing isoforms with unusual expression patterns during the infection process.
https://doi.org/10.1371/journal.ppat.1012791.s019
(PDF)
S17 Fig. Flowchart summarizing the steps performed to assemble transcripts of Cladosporium fulvum isolates Race 5 and Race 4, and identify genes predicted to undergo alternative splicing.
https://doi.org/10.1371/journal.ppat.1012791.s020
(PDF)
S1 Table. Number (No.) of raw paired-end reads obtained for Cladosporium fulvum isolates Race 5 and Race 4, during interaction with tomato (Solanum lycopersicum) cv. Moneymaker.
https://doi.org/10.1371/journal.ppat.1012791.s021
(XLSX)
S2 Table. Number and percentage of paired-end reads obtained for Cladosporium fulvum isolates Race 5 and Race 4 that had a k-mer matching to the genomes of C. fulvum or tomato (Solanum lycopersicum) cv. Moneymaker.
https://doi.org/10.1371/journal.ppat.1012791.s022
(XLSX)
S3 Table. The number (No.) of protein-coding genes in the genomes of Cladosporium fulvum isolates Race 5 and Race 4 with the specified number of introns.
https://doi.org/10.1371/journal.ppat.1012791.s023
(XLSX)
S4 Table. Number (No.) of transcript isoforms assembled for Cladosporium fulvum isolates Race 5 and Race 4, at each of the seven timepoints during interaction with tomato.
https://doi.org/10.1371/journal.ppat.1012791.s024
(XLSX)
S5 Table. Number of unique transcript isoforms assembled from alternatively spliced (AS) genes in Cladosporium fulvum isolates Race 5 or Race 4.
https://doi.org/10.1371/journal.ppat.1012791.s025
(XLSX)
S6 Table. The subset of alternatively spliced (AS) genes that are common between Cladosporium fulvum isolates Race 5 and Race 4.
https://doi.org/10.1371/journal.ppat.1012791.s026
(XLSX)
S7 Table. Number (No.) and percentages of alternative splicing (AS) events in Cladosporium fulvum isolates Race 5 and Race 4.
https://doi.org/10.1371/journal.ppat.1012791.s027
(XLSX)
S8 Table. Enrichment analysis of alternatively spliced (AS) genes in Cladosporium fulvum isolates Race 5 and Race 4.
https://doi.org/10.1371/journal.ppat.1012791.s028
(XLSX)
S9 Table. Number and percentages of genes from Cladosporium fulvum isolates Race 5 and Race 4 harboring different types of alternative splicing (AS) events.
https://doi.org/10.1371/journal.ppat.1012791.s029
(XLSX)
S10 Table. Mean and median values of the distributions of gene size, GC content, number of PFAM domains, and size of exons of alternatively spliced (AS) genes as compared to non-AS genes in Cladosporium fulvum isolates Race 5 and Race 4 during tomato infections.
https://doi.org/10.1371/journal.ppat.1012791.s030
(XLSX)
S11 Table. Alternatively spliced (AS) genes in Cladosporium fulvum isolates Race 5 and Race 4, producing multiple distinct protein isoforms.
https://doi.org/10.1371/journal.ppat.1012791.s031
(XLSX)
S12 Table. Functional annotation of the protein isoforms that are produced by alternatively spliced (AS) genes of Cladosporium fulvum isolate Race 5.
https://doi.org/10.1371/journal.ppat.1012791.s032
(XLSX)
S13 Table. Functional annotation of the protein isoforms that are produced by alternatively spliced (AS) genes of Cladosporium fulvum isolate Race 4.
https://doi.org/10.1371/journal.ppat.1012791.s033
(XLSX)
S14 Table. Enrichment analysis of alternatively spliced (AS) genes in Cladosporium fulvum isolates Race 5 and Race 4 producing multiple distinct protein isoforms.
https://doi.org/10.1371/journal.ppat.1012791.s034
(XLSX)
S15 Table. Alternatively spliced (AS) genes in Cladosporium fulvum isolates Race 5 and Race 4, producing multiple distinct protein isoforms with presence/absence variation in PFAM domains and signal peptides (SP).
https://doi.org/10.1371/journal.ppat.1012791.s035
(XLSX)
S16 Table. Alternatively spliced (AS) genes in Cladosporium fulvum isolates Race 5 and Race 4, producing transcript isoforms with (Yes) or without (No) evidence of differential isoform usage across the seven sampled timepoints of the infection process.
https://doi.org/10.1371/journal.ppat.1012791.s036
(XLSX)
S17 Table. Isoform switch events in alternatively spliced (AS) genes in Cladosporium fulvum isolate Race 5 during the infection process.
https://doi.org/10.1371/journal.ppat.1012791.s037
(XLSX)
S18 Table. Isoform switch events in alternatively spliced (AS) genes in Cladosporium fulvum isolate Race 4 during the infection process.
https://doi.org/10.1371/journal.ppat.1012791.s038
(XLSX)
Acknowledgments
We are thankful to Jonathan Niño-Sánchez and Anthony Salvucci for aiding with the plant inoculations.
References
- 1. Kornblihtt AR, Schor IE, Alló M, Dujardin G, Petrillo E, Muñoz MJ. Alternative splicing: a pivotal step between eukaryotic transcription and translation. Nat Rev Mol Cell Biol. 2013;14: 153–165. pmid:23385723
- 2. Xing Y, Lee C. Alternative splicing and RNA selection pressure—evolutionary consequences for eukaryotic genomes. Nat Rev Genet. 2006;7: 499–509. pmid:16770337
- 3. Chen L, Bush SJ, Tovar-Corona JM, Castillo-Morales A, Urrutia AO. Correcting for Differential Transcript Coverage Reveals a Strong Relationship between Alternative Splicing and Organism Complexity. Mol Biol Evol. 2014;31: 1402–1413. pmid:24682283
- 4. Barbosa-Morais NL, Irimia M, Pan Q, Xiong HY, Gueroussov S, Lee LJ, et al. The Evolutionary Landscape of Alternative Splicing in Vertebrate Species. Science. 2012;338: 1587–1593. pmid:23258890
- 5. Merkin J, Russell C, Chen P, Burge CB. Evolutionary Dynamics of Gene and Isoform Regulation in Mammalian Tissues. Science. 2012;338: 1593–1599. pmid:23258891
- 6. Nilsen TW, Graveley BR. Expansion of the eukaryotic proteome by alternative splicing. Nature. 2010;463: 457–463. pmid:20110989
- 7. Wang Y, Liu J, Huang B, Xu Y-M, Li J, Huang L-F, et al. Mechanism of alternative splicing and its regulation. Biomed Rep. 2015;3: 152–158. pmid:25798239
- 8. Lin F, Zhang Y, Jiang M-Y. Alternative Splicing and Differential Expression of Two Transcripts of Nicotine Adenine Dinucleotide Phosphate Oxidase B Gene from Zea mays. J Integr Plant Biol. 2009;51: 287–298. pmid:19261072
- 9. Calixto CPG, Guo W, James AB, Tzioutziou NA, Entizne JC, Panter PE, et al. Rapid and Dynamic Alternative Splicing Impacts the Arabidopsis Cold Response Transcriptome. Plant Cell. 2018;30: 1424–1444. pmid:29764987
- 10. Laloum T, Martín G, Duque P. Alternative splicing control of abiotic stress responses. Trends Plant Sci. 2018;23: 140–150. pmid:29074233
- 11. Mastrangelo AM, Marone D, Laidò G, De Leonardis AM, De Vita P. Alternative splicing: Enhancing ability to cope with stress via transcriptome plasticity. Plant Sci. 2012;185–186: 40–49. pmid:22325865
- 12. Leviatan N, Alkan N, Leshkowitz D, Fluhr R. Genome-Wide Survey of Cold Stress Regulated Alternative Splicing in Arabidopsis thaliana with Tiling Microarray. PLOS ONE. 2013;8: e66511. pmid:23776682
- 13. Qin F, Kakimoto M, Sakuma Y, Maruyama K, Osakabe Y, Tran L-SP, et al. Regulation and functional analysis of ZmDREB2A in response to drought and heat stresses in Zea mays L. Plant J. 2007;50: 54–69. pmid:17346263
- 14. Staal J, Dixelius C. RLM3, a potential adaptor between specific TIR-NB-LRR receptors and DZC proteins. Commun Integr Biol. 2008;1: 59–61. pmid:19513199
- 15. Michael Weaver L, Swiderski MR, Li Y, Jones JDG. The Arabidopsis thaliana TIR-NB-LRR R-protein, RPP1A; protein localization and constitutive activation of defence by truncated alleles in tobacco and Arabidopsis. Plant J. 2006;47: 829–840. pmid:16889647
- 16. Costanzo S, Jia Y. Alternatively spliced transcripts of Pi-ta blast resistance gene in Oryza sativa. Plant Sci. 2009;177: 468–478.
- 17. Dinesh-Kumar S, Baker BJ. Alternatively spliced N resistance gene transcripts: their possible role in tobacco mosaic virus resistance. Proc Natl Acad Sci. 2000;97: 1908–1913.
- 18. Muzafar S, Sharma RD, Chauhan N, Prasad R. Intron distribution and emerging role of alternative splicing in fungi. FEMS Microbiol Lett. 2021;368: fnab135. pmid:34718529
- 19. Fang S, Hou X, Qiu K, He R, Feng X, Liang X. The occurrence and function of alternative splicing in fungi. Fungal Biol Rev. 2020;34: 178–188.
- 20. Grützmann K, Szafranski K, Pohl M, Voigt K, Petzold A, Schuster S. Fungal Alternative Splicing is Associated with Multicellular Complexity and Virulence: A Genome-Wide Multi-Species Study. DNA Res. 2014;21: 27–39. pmid:24122896
- 21. Xie B-B, Li D, Shi W-L, Qin Q-L, Wang X-W, Rong J-C, et al. Deep RNA sequencing reveals a high frequency of alternative splicing events in the fungus Trichoderma longibrachiatum. BMC Genomics. 2015;16: 54. pmid:25652134
- 22. Hoppins SC, Go NE, Klein A, Schmitt S, Neupert W, Rapaport D, et al. Alternative Splicing Gives Rise to Different Isoforms of the Neurospora crassa Tob55 Protein That Vary in Their Ability to Insert β-Barrel Proteins Into the Outer Mitochondrial Membrane. Genetics. 2007;177: 137–149. pmid:17660559
- 23. Leal J, Squina FM, Freitas JS, Silva EM, Ono CJ, Martinez-Rossi NM, et al. A splice variant of the Neurospora crassa hex-1 transcript, which encodes the major protein of the Woronin body, is modulated by extracellular phosphate and pH changes. FEBS Lett. 2009;583: 180–184. pmid:19071122
- 24. Trevisan GL, Oliveira EHD, Peres NTA, Cruz AHS, Martinez-Rossi NM, Rossi A. Transcription of Aspergillus nidulans pacC is modulated by alternative RNA splicing of palB. FEBS Lett. 2011;585: 3442–3445. pmid:21985967
- 25. Zhang M-Y, Miyake T. Development and Media Regulate Alternative Splicing of a Methyltransferase Pre-mRNA in Monascus pilosus. J Agric Food Chem. 2009;57: 4162–4167. pmid:19368389
- 26. Hossain MA, Rodriguez CM, Johnson TL. Key features of the two-intron Saccharomyces cerevisiae gene SUS1 contribute to its alternative splicing. Nucleic Acids Res. 2011;39: 8612–8627. pmid:21749978
- 27. Shaul O. How introns enhance gene expression. Splicing. 2017;91: 145–155. pmid:28673892
- 28. Preker PJ, Kim KS, Guthrie C. Expression of the essential mRNA export factor Yra1p is autoregulated by a splicing-dependent mechanism. RNA. 2002;8: 969–980. pmid:12212852
- 29. Gehrmann T, Pelkmans JF, Lugones LG, Wösten HAB, Abeel T, Reinders MJT. Schizophyllum commune has an extensive and functional alternative splicing repertoire. Sci Rep. 2016;6: 33640. pmid:27659065
- 30. Freitag J, Ast J, Bölker M. Cryptic peroxisomal targeting via alternative splicing and stop codon read-through in fungi. Nature. 2012;485: 522–525. pmid:22622582
- 31. Strijbis K, van den Burg J, Visser WF, van den Berg M, Distel B. Alternative splicing directs dual localization of Candida albicans 6-phosphogluconate dehydrogenase to cytosol and peroxisomes. FEMS Yeast Res. 2012;12: 61–68. pmid:22094058
- 32. Sieber P, Voigt K, Kämmer P, Brunke S, Schuster S, Linde J. Comparative Study on Alternative Splicing in Human Fungal Pathogens Suggests Its Involvement During Host Invasion. Front Microbiol. 2018;9. pmid:30333805
- 33. Jeon J, Kim K-T, Choi J, Cheong K, Ko J, Choi G, et al. Alternative splicing diversifies the transcriptome and proteome of the rice blast fungus during host infection. RNA Biol. 2022;19: 373–386. pmid:35311472
- 34. Lopes MER, Bitencourt TA, Sanches PR, Martins MP, Oliveira VM, Rossi A, et al. Alternative Splicing in Trichophyton rubrum Occurs in Efflux Pump Transcripts in Response to Antifungal Drugs. J Fungi. 2022;8. pmid:36012866
- 35. Ibrahim HMM, Kusch S, Didelon M, Raffaele S. Genome-wide alternative splicing profiling in the fungal plant pathogen Sclerotinia sclerotiorum during the colonization of diverse host families. Mol Plant Pathol. 2021;22: 31–47. pmid:33111422
- 36. Thomma BP, Van Esse HP, Crous PW, de Wit PJ. Cladosporium fulvum (syn. Passalora fulva), a highly specialized plant pathogen as a model for functional studies on plant pathogenic Mycosphaerellaceae. Mol Plant Pathol. 2005;6: 379–393. pmid:20565665
- 37. De Wit PJ, Joosten MH, Thomma BH, Stergiopoulos I. Gene for gene models and beyond: the Cladosporium fulvum-Tomato pathosystem. Plant relationships. Springer; 2009. pp. 135–156.
- 38. de Wit PJ. Cladosporium fulvum effectors: weapons in the arms race with tomato. Annu Rev Phytopathol. 2016;54: 1–23.
- 39. Zaccaron AZ, Chen L-H, Samaras A, Stergiopoulos I. A chromosome-scale genome assembly of the tomato pathogen Cladosporium fulvum reveals a compartmentalized genome architecture and the presence of a dispensable chromosome. Microb Genomics. 2022;8: 000819. pmid:35471194
- 40. Zaccaron AZ, Stergiopoulos I. Analysis of five near-complete genome assemblies of the tomato pathogen Cladosporium fulvum uncovers additional accessory chromosomes and structural variations induced by transposable elements effecting the loss of avirulence genes. BMC Biol. 2024;22: 25. pmid:38281938
- 41. Dong S, Raffaele S, Kamoun S. The two-speed genomes of filamentous pathogens: waltz with plants. Curr Opin Genet Dev. 2015;35: 57–65. pmid:26451981
- 42. Van den Ackerveken G, Van Kan JA, Joosten M, Muisers JM, Verbakel HM, De Wit P. Characterization of two putative pathogenicity genes of the fungal tomato pathogen Cladosporium fulvum. Mol Plant-Microbe Interact. 1993;6: 210–215. pmid:8471794
- 43. Stergiopoulos I, De Kock MJ, Lindhout P, De Wit PJ. Allelic variation in the effector genes of the tomato pathogen Cladosporium fulvum reveals different modes of adaptive evolution. Mol Plant Microbe Interact. 2007;20: 1271–1283.
- 44. Bolton MD, Van Esse HP, Vossen JH, De Jonge R, Stergiopoulos I, Stulemeijer IJ, et al. The novel Cladosporium fulvum lysin motif effector Ecp6 is a virulence factor with orthologues in other fungal species. Mol Microbiol. 2008;69: 119–136.
- 45. Mesarich CH, Ökmen B, Rovenich H, Griffiths SA, Wang C, Karimi Jashni M, et al. Specific hypersensitive response–associated recognition of new apoplastic effectors from Cladosporium fulvum in wild tomato. Mol Plant Microbe Interact. 2018;31: 145–162.
- 46. Odenbach D, Breth B, Thines E, Weber RWS, Anke H, Foster AJ. The transcription factor Con7p is a central regulator of infection-related morphogenesis in the rice blast fungus Magnaporthe grisea. Mol Microbiol. 2007;64: 293–307. pmid:17378924
- 47. Shi Z, Christian D, Leung H. Interactions Between Spore Morphogenetic Mutations Affect Cell Types, Sporulation, and Pathogenesis in Magnaporthe grisea. Mol Plant Microbe Interact. 1998;11: 199–207. pmid:9487695
- 48. Aramayo R, Peleg Y, Addison R, Metzenberg R. Asm-1+, a Neurospora crassa Gene Related to Transcriptional Regulators of Fungal Development. Genetics. 1996;144: 991–1003. pmid:8913744
- 49. Dong W-X, Ding J-L, Gao Y, Peng Y-J, Feng M-G, Ying S-H. Transcriptomic insights into the alternative splicing-mediated adaptation of the entomopathogenic fungus Beauveria bassiana to host niches: autophagy-related gene 8 as an example. Environ Microbiol. 2017;19: 4126–4139. pmid:28730600
- 50. Hsieh P-H, Oyang Y-J, Chen C-Y. Effect of de novo transcriptome assembly on transcript quantification. Sci Rep. 2019;9: 8304. pmid:31165774
- 51. Wan Y, Larson DR. Splicing heterogeneity: separating signal from noise. Genome Biol. 2018;19: 86. pmid:29986741
- 52. Skinner SO, Xu H, Nagarkar-Jaiswal S, Freire PR, Zwaka TP, Golding I. Single-cell analysis of transcription kinetics across the cell cycle. Singer RH, editor. eLife. 2016;5: e12175. pmid:26824388
- 53. Lu P, Chen D, Qi Z, Wang H, Chen Y, Wang Q, et al. Landscape and regulation of alternative splicing and alternative polyadenylation in a plant pathogenic fungus. New Phytol. 2022;235: 674–689. pmid:35451076
- 54. Jin L, Li G, Yu D, Huang W, Cheng C, Liao S, et al. Transcriptome analysis reveals the complexity of alternative splicing regulation in the fungus Verticillium dahliae. BMC Genomics. 2017;18: 130. pmid:28166730
- 55. Black DL. Mechanisms of alternative pre-messenger RNA splicing. Annu Rev Biochem. 2003;72: 291–336. pmid:12626338
- 56. Zhang L, Qian J, Han Y, Jia Y, Kuang H, Chen J. Alternative splicing triggered by the insertion of a CACTA transposon attenuates LsGLK and leads to the development of pale-green leaves in lettuce. Plant J. 2022;109: 182–195. pmid:34724596
- 57. Clayton EA, Rishishwar L, Huang T-C, Gulati S, Ban D, McDonald JF, et al. An atlas of transposable element-derived alternative splicing in cancer. Philos Trans R Soc B Biol Sci. 2020;375: 20190342. pmid:32075558
- 58. Varagona MJ, Purugganan M, Wessler SR. Alternative splicing induced by insertion of retrotransposons into the maize waxy gene. Plant Cell. 1992;4: 811–820. pmid:1327340
- 59. Gebrie A. Transposable elements as essential elements in the control of gene expression. Mob DNA. 2023;14: 9. pmid:37596675
- 60. Shin S, Park J, Yang L, Kim H, Choi GJ, Lee Y-W, et al. Con7 is a key transcription regulator for conidiogenesis in the plant pathogenic fungus Fusarium graminearum. mSphere. 2024;0: e00818-23. pmid:38591889
- 61. Chung D, Upadhyay S, Bomer B, Wilkinson HH, Ebbole DJ, Shaw BD. Neurospora crassa ASM-1 complements the conidiation defect in a stuA mutant of Aspergillus nidulans. Mycologia. 2015;107: 298–306. pmid:25550299
- 62. Xu X, Chen J, Xu H, Li D. Role of a major facilitator superfamily transporter in adaptation capacity of Penicillium funiculosum under extreme acidic stress. Fungal Genet Biol. 2014;69: 75–83. pmid:24959657
- 63. Hayashi K, Schoonbeek H, De Waard MA. Bcmfs1, a novel major facilitator superfamily transporter from Botrytis cinerea, provides tolerance towards the natural toxic compounds camptothecin and cercosporin and towards fungicides. Appl Environ Microbiol. 2002;68: 4996–5004.
- 64. Alexander NJ, McCormick SP, Hohn TM. TRI12, a trichothecene efflux pump from Fusarium sporotrichioides: gene isolation and expression in yeast. Mol Gen Genet MGG. 1999;261: 977–984. pmid:10485289
- 65. Perlin MH, Andrews J, San Toh S. Chapter Four - Essential Letters in the Fungal Alphabet: ABC and MFS Transporters and Their Roles in Survival and Pathogenicity. In: Friedmann T, Dunlap JC, Goodwin SF, editors. Advances in Genetics. Academic Press; 2014. pp. 201–253. https://doi.org/10.1016/B978-0-12-800271-1.00004-4
- 66. Carbó R, Rodríguez E. Relevance of Sugar Transport across the Cell Membrane. Int J Mol Sci. 2023;24. pmid:37047055
- 67. Mattam AJ, Chaudhari YB, Velankar HR. Factors regulating cellulolytic gene expression in filamentous fungi: an overview. Microb Cell Factories. 2022;21: 44. pmid:35317826
- 68. Wu VW, Thieme N, Huberman LB, Dietschmann A, Kowbel DJ, Lee J, et al. The regulatory and transcriptional landscape associated with carbon utilization in a filamentous fungus. Proc Natl Acad Sci. 2020;117: 6003–6013. pmid:32111691
- 69. Adnan M, Zheng W, Islam W, Arif M, Abubakar YS, Wang Z, et al. Carbon Catabolite Repression in Filamentous Fungi. Int J Mol Sci. 2018;19. pmid:29295552
- 70. Kim J-H, Roy A, Jouandot D, Cho KH. The glucose signaling network in yeast. Biochim Biophys Acta BBA—Gen Subj. 2013;1830: 5204–5210. pmid:23911748
- 71. Shin J, Kim J-E, Lee Y-W, Son H. Fungal Cytochrome P450s and the P450 Complement (CYPome) of Fusarium graminearum. Toxins. 2018;10. pmid:29518888
- 72. Chen W, Lee M-K, Jefcoate C, Kim S-C, Chen F, Yu J-H. Fungal Cytochrome P450 Monooxygenases: Their Distribution, Structure, Functions, Family Expansion, and Evolutionary Origin. Genome Biol Evol. 2014;6: 1620–1634. pmid:24966179
- 73. Zhang J, Jin X, Wang Y, Zhang B, Liu T. A Cytochrome P450 Monooxygenase in Nondefoliating Strain of Verticillium dahliae Manipulates Virulence via Scavenging Reactive Oxygen Species. Phytopathology. 2022;112: 1723–1729. pmid:35224980
- 74. George HL, Hirschi KD, VanEtten HD. Biochemical properties of the products of cytochrome P450 genes (PDA) encoding pisatin demethylase activity in Nectria haematococca. Arch Microbiol. 1998;170: 147–154. pmid:9683653
- 75. Shin JY, Bui D-C, Lee Y, Nam H, Jung S, Fang M, et al. Functional characterization of cytochrome P450 monooxygenases in the cereal head blight fungus Fusarium graminearum. Environ Microbiol. 2017;19: 2053–2067. pmid:28296081
- 76. Chan JJ, Zhang B, Chew XH, Salhi A, Kwok ZH, Lim CY, et al. Pan-cancer pervasive upregulation of 3′ UTR splicing drives tumourigenesis. Nat Cell Biol. 2022;24: 928–939. pmid:35618746
- 77. Mayr C. What are 3′ UTRs doing? Cold Spring Harb Perspect Biol. 2019;11: a034728. pmid:30181377
- 78. Hong D, Jeong S. 3’UTR Diversity: Expanding Repertoire of RNA Alterations in Human mRNAs. Mol Cells. 2023;46: 48–56. pmid:36697237
- 79. Wieder N, D’Souza EN, Martin-Geary AC, Lassen FH, Talbot-Martin J, Fernandes M, et al. Differences in 5’untranslated regions highlight the importance of translational regulation of dosage sensitive genes. Genome Biol. 2024;25: 111. pmid:38685090
- 80. Ryczek N, Łyś A, Makałowska I. The Functional Meaning of 5′UTR in Protein-Coding Genes. Int J Mol Sci. 2023;24. pmid:36769304
- 81. Stergiopoulos I, Groenewald M, Staats M, Lindhout P, Crous PW, De Wit PJ. Mating-type genes and the genetic structure of a world-wide collection of the tomato pathogen Cladosporium fulvum. Fungal Genet Biol. 2007;44: 415–429.
- 82. Boukema I. Races of Cladosporium fulvum Cke. (Fulvia fulva) and genes for resistance in the tomato (Lycopersicon Mill.). Genetics and breeding of tomato: proceedings of the meeting of the Eucarpia Tomato Working Group, Avignon-France, May 18–21, 1981. Versailles, France: Institut national de la recherche agronomique; 1981. pp. 287–292.
- 83. De Wit PJGM. A light and scanning-electron microscopic study of infection of tomato plants by virulent and avirulent races of Cladosporium fulvum. Neth J Plant Pathol. 1977;83: 109–122.
- 84. van Esse HP, Bolton MD, Stergiopoulos I, de Wit PJGM, Thomma BPHJ. The Chitin-Binding Cladosporium fulvum Effector Protein Avr4 Is a Virulence Factor. Mol Plant Microbe Interact. 2007;20: 1092–1101. pmid:17849712
- 85. Emms DM, Kelly S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 2015;16: 1–14.
- 86. Dainat J, Hereñú D, Murray DKD, Davis E, Crouch K, LucileSol, et al. NBISweden/AGAT: AGAT-v1.2.0. Zenodo; 2023.
- 87. Shumate A, Salzberg SL. Liftoff: accurate mapping of gene annotations. Bioinformatics. 2020. https://doi.org/10.1093/bioinformatics/btaa1016
- 88. Bushnell B. BBMap: a fast, accurate, splice-aware aligner. Berkeley, CA: Lawrence Berkeley National Lab (LBNL); 2014.
- 89. Hosmani PS, Flores-Gonzalez M, van de Geest H, Maumus F, Bakker LV, Schijlen E, et al. An improved de novo assembly and annotation of the tomato reference genome using single-molecule sequencing, Hi-C proximity ligation and optical maps. bioRxiv. 2019; 767764.
- 90. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29: 15–21. pmid:23104886
- 91. Pertea M, Pertea GM, Antonescu CM, Chang T-C, Mendell JT, Salzberg SL. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 2015;33: 290–295. pmid:25690850
- 92. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25: 2078–2079. pmid:19505943
- 93. Yu T, Mu Z, Fang Z, Liu X, Gao X, Liu J. TransBorrow: genome-guided transcriptome assembly by borrowing assemblies from different assemblers. Genome Res. 2020;30: 1181–1190. pmid:32817072
- 94. Voshall A, Behera S, Li X, Yu X-H, Kapil K, Deogun JS, et al. A consensus-based ensemble approach to improve transcriptome assembly. BMC Bioinformatics. 2021;22: 513. pmid:34674629
- 95. Pertea G, Pertea M. GFF Utilities: GffRead and GffCompare [version 1; peer review: 3 approved]. F1000Research. 2020;9. pmid:32489650
- 96. Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22: 1658–1659. pmid:16731699
- 97. Trincado JL, Entizne JC, Hysenaj G, Singh B, Skalic M, Elliott DJ, et al. SUPPA2: fast, accurate, and uncertainty-aware differential splicing analysis across multiple conditions. Genome Biol. 2018;19: 1–11.
- 98. Yu G, Wang L-G, Han Y, He Q-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics J Integr Biol. 2012;16: 284–287.
- 99. Jones P, Binns D, Chang H-Y, Fraser M, Li W, McAnulla C, et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 2014;30: 1236–1240. pmid:24451626
- 100. Alexa A, Rahnenführer J. Gene set enrichment analysis with topGO. Bioconductor Improv. 2009;27: 1–26.
- 101. Törönen P, Medlar A, Holm L. PANNZER2: a rapid functional annotation web server. Nucleic Acids Res. 2018;46: W84–W88. pmid:29741643
- 102. Varabyou A, Erdogdu B, Salzberg SL, Pertea M. Investigating open reading frames in known and novel transcripts using ORFanage. Nat Comput Sci. 2023;3: 700–708. pmid:38098813
- 103. Pagès H, Aboyoun P, Gentleman R, DebRoy S. Biostrings: Efficient manipulation of biological strings. R package version 2.64.0. 2022.
- 104. Teufel F, Almagro Armenteros JJ, Johansen AR, Gíslason MH, Pihl SI, Tsirigos KD, et al. SignalP 6.0 predicts all five types of signal peptides using protein language models. Nat Biotechnol. 2022;40: 1023–1025. pmid:34980915
- 105. The UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019;47: D506–D515. pmid:30395287
- 106. Zheng J, Ge Q, Yan Y, Zhang X, Huang L, Yin Y. dbCAN3: automated carbohydrate-active enzyme and substrate annotation. Nucleic Acids Res. 2023;51: W115–W121. pmid:37125649
- 107. Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold: making protein folding accessible to all. Nat Methods. 2022;19: 679–682. pmid:35637307
- 108. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26: 841–842. pmid:20110278
- 109. Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 2017;14: 417–419. pmid:28263959
- 110. Guo W, Calixto CPG, Brown JWS, Zhang R. TSIS: an R package to infer alternative splicing isoform switches for time-series data. Bioinformatics. 2017;33: 3308–3310. pmid:29028262