Dicer-Dependent Biogenesis of Small RNAs and Evidence for MicroRNA-Like RNAs in the Penicillin Producing Fungus Penicillium chrysogenum

MicroRNAs (miRNAs) are non-coding small RNAs (sRNAs) that regulate gene expression in a wide range of eukaryotes. In this study, we analyzed regulatory sRNAs in Penicillium chrysogenum, the industrial producer of the β-lactam antibiotic penicillin. To identify sRNAs and microRNA-like RNAs (milRNAs) on a global approach, two sRNA sequencing libraries were constructed. One library was created with pooled total RNA, obtained from twelve differently grown cultures (RNA Mix), and the other with total RNA from a single submerged cultivation (∆ku70FRT2). Illumina sequencing of both RNA libraries produced 84,322,825 mapped reads. To distinguish between Dicer-dependent and independent sRNA formation, we further constructed two single dicer gene mutants (∆dcl2 and ∆dcl1) and a dicer double mutant (∆dcl2∆dcl1) and analyzed an sRNA library from the Dicer-deficient double-mutant. We identified 661 Dicer-dependent loci and in silico prediction revealed 34 milRNAs. Northern blot hybridization of two milRNAs provided evidence for mature milRNAs that are processed either in a complete or partial Dicer-dependent manner from an RNA precursor. Identified milRNAs share typical characteristics of previously discovered fungal milRNAs, like a strong preference for a 5' uracil and the typical length distribution. The detection of potential milRNA target sites in the genome suggests that milRNAs might play a role in posttranscriptional gene regulation. Our data will further increase our knowledge of sRNA dependent gene regulation processes, which is an important prerequisite to develop more effective strategies for improving industrial fermentations with P. chrysogenum.


Introduction
Small regulatory RNAs are short non-coding RNAs ranging in size from 17 to 29 nt that mediate RNA interference (RNAi). Small RNAs (sRNAs) have been discovered in most eukaryotic organisms, and can be divided into three major regulatory classes [1][2][3]: piwi-interacting

Analysis of small RNA libraries
For a comprehensive survey of sRNAs from P. chrysogenum, low molecular weight RNA (18-50 nt) was isolated and used for small RNA sequencing. For RNA isolation, the industrial laboratory strain P2niaD18 was grown on four different media and harvested at three different time points (see Material and Methods). From these 12 differently grown cultures, RNA was isolated and pooled to generate a sample designated "RNA-Mix". Two further small RNA libraries were generated from submerged cultures of Δku70FRT2 and Dicer-double mutant Δku70FRT2Δdcl2Δdcl1, hereafter referred to as Δdcl2Δdcl1. After removing low quality reads and trimming of 3'-adaptor sequences, 46,240,082,52,144,415, and 65,660,252 trimmed reads were obtained as "total" read datasets for the RNA-Mix, Δku70FRT2 and Δdcl2Δdcl1. For subsequent analysis, the number of identical reads within each total dataset was calculated and all total reads were collapsed and transferred to a "unique" dataset. To avoid accumulation of rare reads, such as reads derived from degradation products, only unique reads with a read count of at least 10 were used for further analyses.
Total and filtered unique reads were mapped to the reference genome of P2niaD18 [25]. For the RNA-Mix, we found that 84.8% of all total reads (39,188,594) map at least once to the reference genome. Identical reads were combined to form a set of 1,117,770 unique reads and of these 122,018 were represented with a minimal read count of 10. The corresponding values of total reads for Δku70FRT2 and Δdcl2Δdcl1 that mapped at least once are 86.6% (45,134,231) and 87.9% (57,721,023) respectively. Within these samples we found 1,457,875 and 1,290,420 unique reads, respectively, including 113,603 and 104,922 unique reads with read numbers equal or higher than 10 (Table 1).
To determine the origin of the sRNAs, we mapped the sequence data acquired from the RNA-Mix, Δku70FRT2, and Δdcl2Δdcl1 to intergenic, exonic, and intronic regions and to rRNA and tRNA sequences. The proportion of reads that mapped to intragenic and intergenic regions, as well as tRNA and rRNA genes looked highly similar for all three libraries (S1 Fig) and is exemplarily displayed for Δku70FRT2 in Fig 1A and 1B. Within the total dataset, degraded rRNAs represented with 37.3% a rather high proportion. 33.9% of total reads mapped to intergenic regions and the amount of reads that mapped to introns (12.2%), exons (8.1%), and tRNAs (8.4%) was relatively low (Fig 1A). Interestingly, the number of total reads mapping forward to exonic sequences was lower than the number originating from the antisense sequences of these coding sequences (Table 1). Within the unique dataset of Δku70FRT2, the read distribution changed towards a higher fraction of reads mapping to exonic sequences (15.4%), while the percentage of reads from rRNA and tRNA regions declined to 30.2% and 3.6% (Fig 1B). To identify sRNA-producing loci, overlapping and adjacent unique reads were grouped. 1,693 loci were calculated for Δku70FRT2 and 783 for Δdcl2Δdcl1. Loci were mapped to the P2niaD18 genome and allocation of the loci to annotated genomic regions was performed. We calculated 1319 sRNA-producing loci in Δku70FRT2 and 374 of these generated sRNAs in forward and reverse orientation from the same genomic region. In Δdcl2Δdcl1, 783 sRNA-producing loci were identified but only 6 loci yielded sRNAs of both orientations and were located in intergenic regions or rRNA genes ( Table 2). Pie graphs illustrate that the proportion of loci that map to different features has significantly been altered between Δku70FRT (Fig 1C) and the Dicer-deficient strain (Fig 1D). The percentage of sRNA-producing loci in exonic regions in sense orientation and the number of loci mapping to tRNAs and rRNAs genes have strongly increased in Δdcl2Δdcl1 compared to Δku70FRT. In particular, the significant depletion of sRNA-producing loci with sRNAs that map in both orientations is noteworthy and points to the fact that Dicer-dependent sRNAs in P. chrysogenum are mostly produced from coding sequences and intergenic regions.

Identification and characterization of Dicer-dependent small RNAs
Previously, we showed that a mutant lacking the gene coding for Dcl2 had a wild-type morphology, although siRNAs are not processed [24]. To identify Dicer-dependent sRNAs, read counts obtained from the Δku70FRT and Δdcl2Δdcl1 were normalized and compared. Furthermore, we characterized the identified sRNAs for specific traits of sRNAs, such as origin, length distribution, and starting nucleotide preference.
Length distributions of mapped reads ranging from 15 to 36 nt within the total and unique datasets are displayed in Fig 2A and 2B. Although the total reads were more or less irregularly distributed for all three samples, unique reads showed a significant peak at 21 nt for RNA-Mix and Δku70FRT2, but not for Δdcl2Δdcl1. In addition, we determined the first nucleotide of all unique reads. Recent reports have indicated a preference for regulatory sRNAs to start with uracil [12,27]. As expected for Dicer-processed sRNAs, unique reads obtained from RNA-Mix and Δku70FRT2 had an increased ratio of reads starting with uracil (58.0% and 49.2%). In Δdcl2Δdcl1 (20.9%), this phenomenon is not observed, obviously due to the absence of both Dicer-like proteins. To further demonstrate that the accumulation of reads starting with uracil depends on Dicer processing, all unique reads were calculated separately according to their read lengths. In Δku70FRT2, reads starting with uracil were predominantly found in the range between 18 to 23 nt, which is a common length of Dicer-processed sRNAs (Fig 2C) [12]. In contrary, no accumulation of reads with a 5'-uracil was present in Δdcl2Δdcl1 ( Fig 2D). To identify Dicer-dependent sRNAs, all unique sRNAs in Δku70FRT2 were compared to those found in Δdcl2Δdcl1. After normalization of the datasets we identified 16,090 unique reads from Δku70FRT2 that were absent in the total read dataset of Δdcl2Δdcl1. Further, we found 3,695 unique reads that are strongly (log 2 -2) and 4,734 that are slightly (log 2 between -2 and -1) underrepresented in Δdcl2Δdcl1 compared to Δku70FRT2. On the contrary, the same analysis of the unique reads of Δdcl2Δdcl1 resulted in only 440 different reads that were not present in Δku70FRT2 (Fig 3A). The overwhelming majority of Dicer-dependent unique reads had a length between 17 and 23 nt and 64.8% start with uracil ( Fig 3B), while Dicer- Chromosomal distribution of small RNAs and sRNA-producing loci. Pie graphs for total reads (A) and unique reads (B) are showing the relative abundance of sRNAs located in tRNAs, rRNAs, intergenic, exonic and intronic regions in Δku70FRT2. Alignments of sRNA-producing loci of Δku70FRT2 (C) and Δdcl2Δdcl1 (D) show that the number of sRNAs that map to both DNA strands of one feature have strongly increased and that the fraction of sRNA loci that align to exonic regions in sense orientation has decreased substantially in Δdcl2Δdcl1 compared to Δku70FRT2. independent sRNAs (Δdcl2Δdcl1) showed no significant preference for any nucleotide (S2 Fig). Mapping results of the 24,519 reads that were underrepresented by at least log 2 -1 in Δdcl2Δdcl1 compared to Δku70FRT2, were regarded as Dicer-dependent unique reads and revealed a different composition compared to the complete unique read dataset (Fig 1B and Fig  3C). The proportion of reads that mapped to exonic regions had increased (+13.7%), while the amount of rRNAs fragments had strongly decreased (-22.6%). The high percentage of reads originate from intergenic region did not show any significant change (-1.6%). These results led us conclude that the majority of Dicer-dependent sRNAs are processed from RNA molecules that are transcribed from intergenic, and exonic regions, especially from those in antisense orientation. Comparative analyses of the sRNA-producing loci from Δku70FRT2 and Δdcl2Δdcl1 showed that the accumulation of sRNAs in 661 sRNA-producing loci were affected by the Dicer-double deletion. In total, 368 loci that produced sRNAs in both orientations and 293 loci that produced sRNA in either sense or antisense orientation were found in Δku70FRT2, but were no longer present in Δdcl2Δdcl1. We further analyzed the distribution of these 661 Dicerdependent sRNA loci. Within the 368 Dicer-dependent sRNA-producing loci that generated sense and antisense sRNAs, 246 were assigned to coding sequences and 121 to intergenic regions. The origin of the loci that showed a strand-specific accumulation of Dicer-specific sRNAs was determined, too. Of these 293 loci, 123 were assigned to exonic regions in sense and 35 in antisense orientation. Further, 131 were located in intergenic regions and 4 in tRNA genes, respectively ( Fig 3D).
For representative genes, the distribution of sRNAs from Dicer-dependent and Dicer-independent loci on both strands are presented in Fig 4. Normalized read counts of the Dicer-dependent loci of the Copia13-like transposable element Pc17g00440 and the putative Helixloop-helix DNA-binding protein Pc12g14660 are shown in Fig 4A. Besides Pc17g00440, multiple copies of the transposable element PCcopia13 (Pc17g00590, Pc21g00460, Pc22g26000, Pc24g01930, and Pc24g02680) and transposable element PCretro14 (Pc24g01940) were identified as Dicer-dependent loci with sRNAs in sense and antisense orientation. In Δdcl2Δdcl1, both strands showed a decreased level of sRNAs from transposable elements, while sRNAs from other exonic regions were only significantly decreased on one strand. A representative example for this is the gene Pc12g14660, coding a putative DNA-binding protein, that show a significantly decreased sRNA level only on the antisense strand ( Fig 4A). To illustrate sRNA distribution within Dicer-independent sRNA loci, putative cell-wall protein Pc20g06530 and a histidine-tRNA gene cluster are shown in Fig 4B. For these Dicer-independent loci, only sRNAs in sense orientation were detected. For Pc20g06530, sRNAs and mRNAs occur in similar amounts in both the recipient and double deletion strain Δdcl2Δdcl1. For all of the above mentioned examples, qRT-PCRs were performed to validate that a depleted amount of sRNA is not dependent from decreased transcription levels of mRNAs.

Identification of milRNAs in P. chrysogenum
The processing of microRNAs from an endogenous RNA hairpin molecule by Dicer-like enzymes is a major feature that differentiates milRNAs from other sRNAs. This feature allowed the in silico prediction of milRNAs by mapping sRNAs to afore predicted RNA-hairpin precursors. The position and frequency of sRNAs within the set of precursors was used to calculate milRNAs with miRDeep2 [28,29]. The RNA samples obtained from RNA-Mix and Δku70FRT2 contain Dicer-dependent sRNAs and were used for milRNA prediction, while sRNAs from Δdcl2Δdcl1 were used as a negative control. For each dataset, milRNAs were predicted independently. 20 putative milRNAs were found for the RNA-Mix and 22 in Δku70FRT2. Eight milRNAs were found in both datasets, providing a total of 34 different milRNAs, designated milR-1 to -34. As expected for the Dicer-dependent formation of milRNAs, the number of corresponding reads were significantly reduced in Δdcl2Δdcl1. On the contrary, all predicted milRNAs found in Δku70FRT2 were also present in the RNA-Mix sample, and vice versa ( Table 3). 31 of the 34 predicted milRNAs had a 5' uracil, which is another indication for Dicer-processing of the predicted milRNAs, and the milRNA precursor sequences differed in length from 43 to 89 nt. The Δdcl2Δdcl1 dataset helped to exclude predictions of milRNAs that were based on random patterns. Here, only two false-positives were found that did not show the typical features of predicted milRNAs, such as a 5'-uracil or the presence of a milRNA-star sequence.
All 34 milRNA precursors are located outside of tRNA and rRNA genes. The precursor sequence of most milRNAs occurred only once with a perfect match within the genome. Only the precursor sequence of milR-7 was found at three different loci (Table 3). To explore the location of milRNAs within the nuclear genome further, the mature sequence of milRNAs was also mapped to the genome. Twenty-four mature milRNAs mapped only once to the reference genome, and 10 (milR-6, milR-7, milR-8, milR-11, milR-12, milR-13, milR-14, milR-26, milR-27, milR-30) mapped at least twice. The most frequent milRNA sequences are milR-13, and milR-11, with 13 and 8 loci. While milR-13 was found in exons, introns, 5'-UTRs, and 3'-UTRs, all loci of milR-11 were located within the 5' end of the coding sequence of multiple copies of a putative RNA-dependent DNA-polymerase (RdDP). This RdDP was previously identified as the Copia13-like transposable element PCcopia13 [30]. All copies of PCcopia13 have highly complementary loci, including the mature milR-11 and milR-11-star sequence. Table 3. Predicted milRNAs and the appearance of their reads within the three datasets.

MilRNA target gene prediction
So far, no algorithms are available that have been adapted to fungal milRNAs. Therefore, target predictions were performed using three different programs, written for either animals or plants. The program miRanda was written for accurately detecting microRNA binding sites in animals, and the programs TAPIR and psRNATarget were originally written for predictions in plants [31][32][33]. Since no 3'-UTR database is available for P. chrysogenum, we extracted the 1 kb downstream sequences of all protein coding genes in the P2niaD18 genome. These 1 kb sequences were used as templates for target prediction. We succeeded in identifying mRNA targets for the predicted milRNAs described in this report. For 30 milRNAs, at least one target binding site was predicted with at least one program (S1 Table), and for 9 milRNAs (milR-6, milR-7, milR-8, milR-11, milR-14, milR-29, milR-30, milR31, milR-34) the same target sites were confirmed with programs written for the prediction of microRNAs in animals (miRanda) and plants (TAPIR and/or psRNATarget). Because some milRNAs are derived from 3'-UTRs, we compared the target sequence with the corresponding milRNA-star sequence, and only the predicted target sites of milR-8 and milR-14 are limited to their own milRNA-star sequences ( Table 4). The largest number of putative target sites was predicted for milR-21; 118 binding sites of milR-21 were found within the 3'-UTRs of P. chrysogenum. The most interesting mRNA targets were predicted for milR-1 and milR-21. For the most common milRNA, milR-1, the dcl2 (Pc12g13700) gene was predicted as target. For milR-21 at least three genes that might be involved in the DNA-repair mechanisms were predicted. These interesting findings led us to select milR-1 and milR-21 for further analyses.

Expression analysis and experimental validation of milRNAs
To confirm the presence of milRNAs in P. chrysogenum, we performed northern blot analysis to detect milR-1 and milR-21. For northern blotting, total RNA from mycelia grown under the same conditions as for the library construction of Δku70FRT2 and Δdcl2Δdcl1 was used to detect predicted milRNAs. Both processed milRNAs as well as their precursor RNAs were detected by radioactive labeled probes (Fig 5). Interestingly, additional signals detected between the precursor and mature milR-21 may represent intermediates of the processed precursors.
To test Dicer-dependent processing of milRNAs, further expression analyses were performed with total RNA from strains lacking one or both Dicer-like proteins. As expected for the dicer-double deletion strain, milR-21 and its precursors were not detected. Furthermore, milR-21 and its precursors were also absent in the dcl2-deletion, but not in the dcl1-deletion strain, indicating that processing of milR-21 is Dcl2-dependent. Interestingly, weak signals for milR-1 and its precursors were detected in the Dicer-deficient strain Δdcl2Δdcl1 and the Δdcl2 single deletion strain. These signals were significantly weaker than the signals observed from strains with an intact dcl2 gene. This result is consistent with the sRNA sequencing data presented in Table 3.

Discussion
Evidence for small RNAs in P. chrysogenum We previously reported that in P. chrysogenum silencing of endogenous genes by artificial siR-NAs is Dcl2 dependent [24]. Here we extend our analysis using small RNA deep sequencing of three different samples to increase our knowledge about endogenous fungal sRNAs, and specifically to identify microRNA-like RNAs. Indeed, we identified 34 milRNAs in P. chrysogenum and demonstrate that the majority of sRNAs, including the novel milRNAs are processed from RNA molecules transcribed from intergenic and exonic regions, especially those in antisense orientation. This agrees with a previously study in the ascomycete Fusarium oxysporum, where intergenic regions provided a high proportion of sRNAs [16]. In Magnaporthe oryzae it was further demonstrated that antisense sequences of coding regions can provide numerous sRNAs [34]. Further, a class of endogenous sRNAs that map to exons, called exonic-siRNAs, regulate the expression of protein coding genes in the basal fungus Mucor circinelloides [35]. To avoid analysis of mRNA degradation products, we used a strict cut-off for unique reads and identified sRNA-producing loci. We identified 661 Dicer-dependent sRNA-producing loci and demonstrate that 368 of these loci generate sRNAs in sense and antisense orientation from the same genomic source. The number of loci however that generate sRNA in both orientations is significantly decreased in the Dicer-deficient double-mutant. Further Dicer-dependent sRNAs are obtained mostly originating from intergenic and exonic regions in sense and antisense orientation. This observation is in accordance with the Dicer-dependent cleavage of dsRNAs and studies of exonic-sRNAs in M. circinelloides [35].
Although no obvious phenotype has so far been detected in Δdcl2Δdcl1, we clearly demonstrated a lack of Dicer-dependent sRNAs in the Dicer-double mutant. Further, we showed that the amount of sRNAs on both strands of the selfish genetic element PCcopia13 (Pc17g04400) Evidence for MicroRNA-Like RNAs in P. chrysogenum was significantly reduced in the Dicer-deficient double-mutant. This observation is consistent to results obtained for transposons from Neurospora crassa and Magnaporthe oryzae and for the antiviral defense mechanism in the chestnut blight fungus Cryphonectria parasitica [36][37][38]. Genetic stability is an essential feature for industrial applications using P. chrysogenum [39]. Therefore, a well-functioning defense mechanism against selfish transposable elements is an important requirement of industrial strains to guarantee stable production yields. Further, we observed typically characteristics for sRNAs known from other filamentous fungi, like significant preference for a 5'-uracil, their length distribution, and their origin from intergenic and exonic regions [40]. However, the percentage of uracil-starting Dicer-dependent sRNAs was only drastically higher for sRNAs ranging from 17 to 23 nt. Nevertheless, Dicer-dependent sRNAs were the predominant class of sRNAs in the overall sRNA population. We have demonstrated that formation of milR-1 is only partially Dicer-dependent. This result infers that at least one Dicer-independent pathway may be present to generate milR-1 in the Dicer-deficient mutant. This idea is strengthened by the presence of various Dicer-independent pathways of interfering sRNA in N. crassa [12]. Therefore, further analyses are necessary to reveal Dicer-independent sRNA-biogenesis pathways in P. chrysogenum. Evidence for MicroRNA-Like RNAs in P. chrysogenum

Evidence for milRNAs
Our bioinformatics analysis identified 34 Dicer-dependent sRNAs that carry the typical features of conventional miRNAs as similarly characterized in other eukaryotes. Typical features are RNA stem-loop formation, a strong preference for uracil at the 5' end, and an average size of about 21 nt [5,41]. We predicted 8 milRNAs in the small RNA sequencing datasets of both RNA-Mix and Δku70FRT2, and repeatedly deriving these 8 milRNAs demonstrates the reliability of our analysis. Due to different growing conditions, some milRNAs were only predicted in the RNA-Mix or Δku70FRT2 dataset. Nevertheless, all mature milRNAs were represented by reads within both datasets. This indicates different expression rates of milRNA under different growth conditions. Further, these reads were either totally absent or reduced to single reads in the Δdcl2Δdcl1 double mutant.
Since knowledge about fungal milRNAs is still limited, improving parameters to predict fungal milRNAs is a challenging task for the future. However, we were able to identify Dcl2-dependent milRNAs ranging in size between 19 and 29 nt, which is consistent with previous findings that some mature fungal milRNAs are longer than regular animal and plant miRNAs [12,16]. Nevertheless, the majority of fungal Dicer-dependent sRNAs and milRNAs range in size between 19 and 22 nt. All predicted mature milRNA sequences of P. chrysogenum were compared to those recently found in other fungi, but no conserved milRNAs were found.
The predicted precursor sequences range in size between 43 to 89 nt. These results are consistent with milRNAs and milRNA-precursors found for other ascomycetes [12,[14][15][16]. Northern blot analyses confirmed the existence of precursor and mature milRNAs milR-1 and milR-21 in P. chrysogenum. So far, we were unable to detect pri-milRNAs precursors, most probably due to their low abundance. When analyzed in detail, both milRNAs proved to be complete or partial Dcl2 dependent; moreover Dcl1 and Dcl2 did not show functional redundancy. These observations are also supported by our previous analyses demonstrating that only Dcl2 is essential for processing artificial dsRNAs to generate functional siRNAs that silence genes containing a perfect complementary target site [24]. These results are consistent with findings of milRNA biogenesis in P. marneffei but in contrast to findings in N. crassa, where both Dicerlike proteins are functionally redundant [12,17,42].
Remarkably, milR-1 was detected in the Dicer double-deletion strain. This demonstrates that milR-1 can form Dicer-independently, and that an alternative Dicer-independent pathway may exist to generate milRNAs. In N. crassa at least four different milRNA-generating pathways have been discovered involving components of Dicer-like proteins, Argonaute proteins, exonucleases (QIP) and an RNase III domain-containing protein (MRPL3). It was also demonstrated that single milRNAs can be processed independently by two different pathways [12].
Here, we clearly demonstrate that in P. chrysogenum the majority of sRNAs are Dicer-dependent and that Dcl2 is essential for the maturation of milR-21 and necessary for the accumulation of milR-1. Since the majority of predicted milRNAs are not present in Dicer doubledeletion strain Δdcl2Δdcl1, their generation is Dicer dependent.
Among the predicted 34 milRNAs, 24 map only once to the reference genome, and 10 at least twice. Surprisingly, one of the most abundant milRNA sequences was milR-11, found 8 times in the 5' end of the coding sequence for the Copia13-like transposable element PCco-pia13. We demonstrated that the level of sRNAs from this transposable element was reduced on both strands in Δdcl2Δdcl1. This indicates strongly to an sRNA-mediated RNAi pathway of selfish genetic elements in P. chrysogenum. The location of milRNA genes inside transposable elements is consistent with recent studies in Cryptococcus neoformans, where multiple loci of two milRNAs were discovered in the sequences of transposable elements and pseudogenes. It was demonstrated that milRNAs in C. neoformans induce transgene silencing via the canonical RNAi pathway [13]. Our results prompt us to believe that milRNAs in P. chrysogenum are likewise involved in the negative regulation of transposon activity.

Various mRNA targeting by milRNA?
Using diverse programs, we were able to predicted potential milRNA target sites for all 34 milRNAs on putative 3'-UTRs of P. chrysogenum. Eight milRNA binding sites, calculated by miRanda, were validated by the plant miRNA-prediction programs TAPIR and/or psRNATarget. While some milRNAs had only a single predicted target, others, like milR-21, have the option of interacting with multiple mRNA targets. Interestingly the second best predicted target for milR-1 is the dcl2 gene (Pc12g13700) und milR-21 was predicted to bind to at least three mRNA targets that are involved in DNA-repair mechanisms. As yet, we were not able to detect milRNA-mediated mRNA-degradation in P. chrysogenum. Previously, reporter gene constructs with artificial target sites were used to demonstrate milRNA-mediated silencing in fungi [12,13]. In these studies, target-mRNA degradation has only been demonstrated with reporter genes that carried a fully complementary target site. In contrast to siRNAs, miRNAs do not require full complementarity to bind and mediate silencing of target mRNAs. Thus, it is suggested that milRNA-mediated gene silencing in fungi is mostly caused by translational repression like it is known for animals and plants [1,12,43]. Therefore, further analyses of target-mRNA protein expression are necessary provide evidence for milRNA-target interactions and thus, milRNA mediated gene silencing in P. chrysogenum. Nevertheless, our small RNA sequencing approach identified novel milRNAs that are predicted to bind multiple target mRNAs, possibly as factors involved in a variety of different biological processes, like DNA-repair. Finally, results from sRNA identification and milRNA target predictions will serve as valuable clues for understanding sRNAs as regulators of gene expression in P. chrysogenum. A deeper understanding of sRNA-mediated gene regulation is an important prerequisite for developing more effective strategies to genetically manipulate this industrially important fungus to further improve production processes.

Strains and growth conditions
For the RNA-Mix, P. chrysogenum strain P2niaD18 was grown as submerged cultures in 500 ml shaking flasks in 100 ml CCM and MM at 27°C and 120 r.p.m., and as surface cultures on PAL Science Supor 200 Disc Filters on solid Oatmeal and M322 media. All cultures were inoculated with 1.0 x 10 7 conidiospores as previously described [44,45]. Liquid cultures were grown for 72, 96, and 120 hours, and surface cultures for 48, 72, and 96 hours. The mycelia of these 12 different cultures were frozen in liquid nitrogen and total RNA was isolated separately as described in the "Nucleic acid isolation" section. After purification, total RNA of the 12 samples were pooled in equal amounts to the so-called RNA-Mix sample. For the samples Δku70FRT2 and Δdcl2Δdcl1 the recipient strain Δku70FRT2 and the Dicer-deficient double-mutant Δdcl2Δdcl1 were grown for three days in shaking flasks containing CCM under the same conditions as mentioned above (S2 Table and S3 Table).

Identification of RISC core components
The recently published genome of P. chrysogenum strain P2niaD18 [25] and the previously published annotation of strain Wisconsin 54-1255 [46] served as source for identifying genes involved in RNAi. To identify core components, BLAST analyses of RISC proteins from N. crassa were performed (S3 Fig and S4 Table) [47]. Protein sequence data from N. crassa were obtained from the National Center for Biotechnology Information (NCBI) (http://www.ncbi. nlm.nih.gov/), and sequences were aligned using MAFFT [48]. For domain identification, the predicted P. chrysogenum protein sequences were used to perform an NCBI conserved domain search (http://www.ncbi.nlm.nih.gov/Structure/) [49].

Generation of Dicer single-and double-knockout strains
Plasmids were constructed to substitute genes encoding Dicer-like proteins in P. chrysogenum recipient strain Δku70FRT2 (S5 Table). For homologous recombination, fragments of about 1 kb flanking the 5' and 3' ends of the coding sequence were amplified by PCR and ligated to corresponding 5' and 3' flanks of nouseothricin or phleomycin knockout cassettes, expressing the nat or ble gene, under control of the trpC promoter. For deletion of dcl2 (Pc12g13700), the gene was replaced with a phleomycin resistance cassette. The nat-knockout cassette, used for deletion of dcl1 (Pc21g06890), carries additional components of the FLP/FRT recycling system that are under the control of the inducible promoter xylP [50]. Transformation of P. chrysogenum strains with foreign DNA was performed using conventional transformation procedures with modifications as previously described [45,51]. Homologous integration of the knockout construct and substitution of the target genes were confirmed by Southern blot and PCR analysis covering regions within and flanking the knockout cassette (S4 Fig, S5 Fig and S6  Fig). With this procedure, we generated dcl1-and dcl2-deletion strains. Transformation of plasmid pKOdcl1 in dcl2-deletion strain resulted finally in the construction of the Δdcl2Δdcl1double-deletion mutant.

Nucleic acid isolation
As described in the "Strains and growth conditions" sections, total RNA used for small RNA sequencing was obtained from different cultivations. For the "RNA-Mix" sample, total RNA was isolated independently from twelve different cultures of strain P2niaD18. For samples Δku70FRT2 and Δdcl2Δdcl1 a single culture of the corresponding strains was grown in liquid CCM for 72 h. For all samples, mycelia were collected for guanidinium-thiocyanate-phenolchloroform extraction of total RNA [52]. RNA was measured with NanoDrop 1000 Spectrophotometer and Bioanalyzer 2100, using the Agilent RNA 6000 Nano Kit. All total RNAs that were pooled to equal amounts for the RNA-Mix show RNA integrity numbers (RIN) 9.0. For the analysis of Δku70FRT2 and Δdcl2Δdcl1 only total RNA with RIN values 9.8 were used. The 28S:18S ratio was calculated with 1.7 for all samples. DNA preparation was carried out as described recently [53].

Small RNA deep sequencing
Total RNAs from the three different samples described above in the "Strains and growth conditions" section were used for small RNA library preparation and small RNA sequencing. Library preparation and sequencing was performed at GATC Biotech, Konstanz, Germany. Total RNA samples were separated by a 10% TBE-urea denaturing polyacrylamide gel electrophoresis (PAGE) and sRNAs ranging between 18 and 50 nt were used for library preparation using the Illumina TruSeq Small RNA Sample Preparation Kit. After reverse transcription and amplification, cDNA products were checked and measured with Bioanalyzer 2100. Sequencing was performed on an Illumina HiSeq 2000 platform. Trimming of TruSeq adapter sequences was performed with the program Cutadapt and only reads trimmed at their 3' end were used for further studies [54]. Raw sequencing data from sRNA sequencing have been deposited in the NCBI sequence read archive (SRA) (http://www.ncbi.nlm.nih.gov/sra) under the accession numbers SRR1705825 (RNA-Mix), SRR1706009 (Δku70FRT2), and SRR1706010 (Δdcl2Δdcl1).

Small RNA data analysis and milRNA prediction
To examine the origin of the cleaned RNA reads, alignments against the reference genome of P. chrysogenum P2niaD18 and diverse datasets, containing all intronic, exonic, and intergenic regions, as well as tRNA and rRNA sequences were performed with bowtie v1.1.0 and only perfect matches were considered [25,55]. This procedure was performed for total reads and unique reads with a read count of at least 10 that were collapsed using the perl script mappler.pl, a part of miRDeep2 [29]. For the calculation of read length and starting nucleotide distribution custom perl scripts were written. To remove spurious sRNAs, derived by RNA degradation, we adapted a previously published method [56]. All unique overlapping or closely mapped sRNAs were grouped. Only loci with at least four unique reads within a sliding interval of 300 bp were taken into account for further analyses. To identify Dicer-dependent reads and loci, unique reads and loci obtained from Δku70FRT2 and Δdcl2Δdcl1 were compared to each other. Read counts were normalized (transcripts per ten million, TPTM) using all reads that map to intragenic and intergenic regions. Reads mapping to structural RNAs, like tRNAs or rRNAs, were not considered for normalization. After normalization, log 2 ratios were calculated using the count number of all unique reads obtained from the Dicer-deficient mutant and Δku70FRT2.
To distinguish between Dicer-dependent and-independent loci, all loci in Δku70FRT2 and Δdcl2Δdcl1 were compared in accordance to their strand bias. A genomic region that showed sRNAs either on one strand or in forward and reverse orientation was rated as a sRNA-producing locus. Loci that appeared in Δku70FRT2 but not in the Dicer-deficient double-mutant Δdcl2Δdcl1 were assigned as Dicer-dependent.
MilRNAs were predicted with the program miRDeep2 with a score cutoff of 20 and without using additional miRNA information. For all predicted milRNAs, the read counts inside all three unique datasets were determined. For visualization of predicted milRNA precursors, RNAfold, part of the Vienna RNA Package 2.0, was used to predict secondary structures [57].

MilRNA target prediction
Little is known about milRNA-target interactions in fungi. To include both animal and plant surveys into our analysis, the program miRanda, written to predict miRNA targets in animals, and the programs TAPIR and psRNATarget, to predict potential plant-like target interactions, were used with default settings [31][32][33]. Since no 3'-UTR database is available for P. chrysogenum the 1 kb downstream sequences were extracted from all protein coding genes. These hypothetical 3'-UTRs were used for milRNA target prediction.

Northern blot analyses
For northern blot analysis, conducted to confirm the existence of predicted milRNAs, 30 μg of total RNAs were separated on a 15% denaturing polyacrylamide gel, containing 7 M urea. Separated RNAs were transferred onto a Hybond-N + hybridization membrane (GE Healthcare Life Sciences) and crosslinked by UV-radiation (Stratalinker). Synthesized DNA StarFire probes (DNA Integrated Technology), which were perfectly complementary to the mature milRNA were labeled with α-[ 32 P] dATP, following the manufacturer's instructions. Hybridization in modified Church-Gilbert-hybridization solution (0.25 M KH 2 PO 4 , 0.25 M K 2 HPO 4 , 7% SDS, and 1 mM EDTA) was started at 65°C for 1 h, followed by an overnight stepwise cooling phase ending at 35°C [58,59]. For detection of bound radioisotopes Fujifilm BAS-III Phos-phorImager plates and FLA-3000 detection system were used.

Quantitative real-time PCR
Quantitative real-time PCR (qRT-PCR) was carried out in a StepOnePlus real-time PCR system (Applied Biosystems) with GoTaq qPCR MasterMix for SybrGreen (Promega) as described previously [60,61]. Oligonucleotide primers are given in S6 Table. Supporting Information GeneRuler DNA Ladder (Thermo Scientific) was used as size standard. Homokaryotic transformants that show the expected fragments according to a correct genomic integration of the resistance cassette are marked with an asterisk. (TIF) S5 Fig. Construction and validation of Δdcl1 mutants. Replacement of the gene coding for Dcl1 (Pc21g06890), with the inducible FLP/FRT cassette containing an N-acetyltransferase coding gene (nat) that mediates resistance to the antibiotic nourseothricin. Validation of homologous integration within the recipient strain Δku70FRT2 was performed by Southern blotting of 20 μg BglI digested genomic DNA with 32 P-labled complementary DNA probes of the 5'-flank (dark bar) of dcl1. Furthermore, PCR analyses, with primers (indicated with grey arrows) surrounding the sequences used for homologous integration, validate the homologous integration of the resistance cassette. In contrast to the tested transformants, the no-template controls (-Control) and PCRs with the recipient DNA (WT) show no PCR product. GeneRuler DNA Ladder (Thermo Scientific) was used as size standard. Homokaryotic transformants that show correct genomic integrations are marked with an asterisk. (TIF) S6 Fig. Construction of Δdcl2Δdcl1 double mutants based on the strain Δdcl2. Homologous integration of the dcl1 knockout construct resulted in the replacement of the dlc1 coding gene (Pc21g06890) with a nourseothricin resistance cassette. Validation of homologous integration within the recipient strain Δdcl2 T1 was performed by Southern blotting and PCR of 20 μg BamHI digested genomic DNA with 32 P-labled complementary DNA probes of the 5'-flank (dark bar) of dcl1. Furthermore, PCR analyses, with a primer pair (indicated with grey arrows) surrounding the 3'-flank, which was used for homologous integration, validate the homologous integration of the resistance cassette. In contrast to the tested genomic DNA of Δdcl1 T1 (+-Control), the no-template controls (-Control) and PCRs with the recipient DNA (WT) show no PCR product. GeneRuler DNA Ladder (Thermo Scientific) was used as size standard. Homokaryotic transformants that show correct genomic integrations are marked with an asterisk. (TIF) S1