House dust mites are common pests with an unusual evolutionary history, being descendants of a parasitic ancestor. Transition to parasitism is frequently accompanied by genome rearrangements, possibly to accommodate the genetic change needed to access new ecology. Transposable element (TE) activity is a source of genomic instability that can trigger large-scale genomic alterations. Eukaryotes have multiple transposon control mechanisms, one of which is RNA interference (RNAi). Investigation of the dust mite genome failed to identify a major RNAi pathway: the Piwi-associated RNA (piRNA) pathway, which has been replaced by a novel small-interfering RNA (siRNA)-like pathway. Co-opting of piRNA function by dust mite siRNAs is extensive, including establishment of TE control master loci that produce siRNAs. Interestingly, other members of the Acari have piRNAs indicating loss of this mechanism in dust mites is a recent event. Flux of RNAi-mediated control of TEs highlights the unusual arc of dust mite evolution.
Investigation of small RNA populations in dust mites revealed absence of the Piwi-associated RNA (piRNA) pathway. Apart from several nematode and platyhelminth lineages, piRNAs are an essential component of animal genome surveillance, actively targeting and silencing transposable elements. In dust mites, expansion of Dicer-produced small-interfering RNA (siRNA) biology compensates for loss of piRNAs. The dramatic difference we find in dust mites is likely a consequence of their evolutionary history, which is marked by descent from a parasite to the current free-living form. Our study highlights a correlation between perturbation of transposon surveillance and shifts in ecology.
Citation: Mondal M, Klimov P, Flynt AS (2018) Rewired RNAi-mediated genome surveillance in house dust mites. PLoS Genet 14(1): e1007183. https://doi.org/10.1371/journal.pgen.1007183
Editor: Cédric Feschotte, Cornell University, UNITED STATES
Received: July 19, 2017; Accepted: January 3, 2018; Published: January 29, 2018
Copyright: © 2018 Mondal et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Assembled genome was submitted under GenBank ID: NBAF01000000. Small RNA bioSample accession number is: SAMN05441789. Datasets of Bi-sulfite sequencing are deposited under the BioSample accession number: SAMN06891248.
Funding: The work was supported by NSF-MCB (Award ID: 1616725) and Mississippi INBRE program (P204M103476) from the National Institute of General Medical Science. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
House dust mites are ubiquitous inhabitants of human dwellings, and are the primary cause of indoor allergy. Dust mites have an unusual evolutionary history, descending from a parasitic ancestor. Parasite genomes are typically highly modified, possibly to accommodate genetic novelty needed to productively interact with a host [3, 4]. The sequence of events leading to adoption of a parasitic lifestyle may require a period of genomic crisis to yield the rewired parasite genome. Dust mites represent an extreme case, potentially experiencing a second round of genomic change to reacquire a free-living ecology.
Transposable element (TE) activity is a major source of genome instability [5, 6]. Silencing of TE activity in multicellular organisms is commonly achieved by RNA interference (RNAi)-based mechanisms, which employ small RNAs associated with Argonaute/Piwi (Ago/Piwi) proteins to target TE transcripts. In many animals, the Piwi-associated RNA (piRNA) pathway is the primary RNAi-based defense [8, 9]. In arthropods and vertebrates, piRNAs are recognized as being roughly 23–32 nucleotides (nt) long, and unlike other small RNAs, such as microRNAs (miRNAs) and small-interfering RNAs (siRNAs), they are not excised from double-stranded RNA (dsRNA) precursors by the RNase III enzyme Dicer. piRNAs in Drosophila are generated in two collaborative pathways: phased cleavage of transcripts by the RNase Zucchini (Zuc) and a “ping-pong” mechanism involving direct cleavage by Piwi proteins. Ago/Piwi proteins may possess “slicer” activity, which cuts transcripts base-paired with a bound small RNA 10 nt from the 5’ end of the small RNA. Zuc-dependent piRNA biogenesis is initiated by Piwi-mediated slicing of targeted transcripts, which propagates in a 5’-3’ direction from the site of scission [13, 14]. These piRNAs feed into the ping-pong cycle, where Piwi proteins collaborate to capture fragments of TEs and convert them to new piRNAs [15, 16]. This leads to further production of Zuc-dependent piRNAs in an amplifying system. As TE transcripts processed through the ping-pong pathway are products of slicing, they exhibit 10 nt 5’ overlaps with cognate, antisense piRNAs. In contrast, Zuc-dependent piRNAs are derived from single-stranded RNA precursors, and while this process has been found to be dependent on initial slicing, there are Drosophila cell types in which the ping-pong system is absent that initiate Zuc processing through the factor Yb.
Another feature of piRNA-mediated genome surveillance is the involvement of piRNA cluster master loci as sites of Zuc-dependent piRNA production. These loci are composed of TE fragments, serving as catalogs of restricted sequences. Loss of master loci integrity compromises TE repression and causes sterility. Nematode piRNAs, while possessing a related role in controlling TEs, differ in that they are short (21 nt) cleavage products of small discrete transcripts. Despite these differences in biogenesis, piRNAs in both Drosophila and nematodes typically exhibit a “U” residue at the 5’ terminus. The exception is some ping-pong piRNAs, which instead have an “A” at the tenth position.
While piRNA regulation of TEs is common in animals, it has been lost in several nematodes and platyhelminths [21, 22]. In the nematode species, alternative mechanisms involving Rdrp (RNA-dependent RNA polymerase) and Dicer restrict TE mobilization. Conversion of TE transcripts by Rdrp into dsRNA substrates of Dicer results in siRNA generation. Ago proteins then associate with nascent TE transcripts, recruiting chromatin modulators including DNA methyltransferase. This process, RNA-induced transcriptional silencing (RITS), is common in plants and fungi [23–25]. RITS-like mechanisms are found in animals, as nuclear localized Ago and Piwi proteins can influence chromatin biology [26, 27]. However, outside nematode clades, amplifying RITS mechanisms involving siRNAs have not been observed in vertebrates or other ecdysozoans, potentially due to absence of Rdrp. One possible exception is chelicerate arthropods, the lineage to which dust mites belong, which possess Rdrp proteins. RNAi pathways in chelicerates appear complex as they have both Rdrp and Piwi-class Argonaute proteins, both of which appear to have roles in controlling TEs [29–31]. Here we investigate the status of small RNA pathways in the dust mite to understand how RNAi biology might be structured in this highly derived organism.
Absence of Piwi proteins in dust mite genome
We obtained a genome sequence for the American house dust mite Dermatophagoides farinae using Illumina and PacBio platforms. The HGAP pipeline was used through the PacBio SMRT analysis portal to filter and assemble PacBio reads, which resulted in 1,828 contigs with a total length of 93,777,723 bp. Then, Illumina reads were used to connect and extend the PacBio contigs using SSPACE scaffolding, which produced a 93,804,520 bp assembly in 1,728 scaffolds. After removal of bacterial contamination, the final contig number was reduced to 1,706 with an N50 length of 19,371. The assembled and filtered final genome was ~92 Mb, compared to a 53 Mb genome that was previously reported. Using mRNA-seq datasets, we annotated ~18,500 transcripts through the Cufflinks program. 47% of the genic transcripts exhibited similarity to S. scabiei and/or D. melanogaster protein coding genes or to the NCBI conserved domain collection (Materials and Methods) (S1 Table) [34, 36].
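As a point of reference for the N50 statistic quoted above, the following is a minimal sketch of how N50 is conventionally computed from a list of sequence lengths. The function name `n50` and the toy lengths are illustrative assumptions, not values or code from this study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    together account for at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical, not taken from the D. farinae assembly).
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
```

Because N50 weights long sequences heavily, it can change little even when many short contigs are removed, which is one reason it is usually reported together with contig count and total length.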
Ago/Piwi proteins were identified in the D. farinae genome using RNA-seq annotations and amino acid sequences of seven Ago and six Piwi proteins from Tetranychus urticae, the closest relative of D. farinae with a high-quality genome and experimentally supported annotations. Eight confident Ago homologs were found (Ago1-GenBank ID: KY794591, Ago2-GenBank ID: KY794592, Ago3-GenBank ID: KY794593, Ago4-GenBank ID: KY794594, Ago5-GenBank ID: KY794595, Ago6-GenBank ID: KY794596, Ago7-GenBank ID: KY794597, Ago8-GenBank ID: KY794598). Ago proteins from T. urticae, D. melanogaster, C. elegans, and Ascaris suum were compared to D. farinae Agos using amino acid sequences of Paz, Mid, and Piwi domains (Fig 1A). Our phylogenetic analysis recovered two Ago family members likely involved in miRNA (DfaAgo1) and siRNA (DfaAgo2) pathways. The remainder belong to a divergent clade specific to dust mites (DfaAgo3-8). Surprisingly, none of the Agos from D. farinae belong to the Piwi clade. We examined D. farinae Agos for the presence of slicer motifs. The DEDH slicer motif, which is common in metazoan Ago and Piwi proteins, was found in DfaAgo1 (miRNA) and DfaAgo2 (siRNA). The divergent Agos have an uncommon DEDD catalytic motif (S2 Fig). Orthologs containing a DEDD motif can be found in scabies (S. scabiei), social spiders (Stegodyphus mimosarum), and in C. elegans Ago family members of unknown function, which emphasizes the unusual nature of this Ago clade [36, 38, 39].
A. Relationship of Ago/Piwi proteins from D. farinae, Drosophila, C. elegans, and A. suum using conserved Paz, Mid and Piwi domains. Dust mite proteins indicated in red. Only two Wago proteins included for simplicity. Bootstrap values for major nodes indicated. B-D. Heatmaps showing Z-scores for Overlap probabilities for 18-30nt small RNAs from dust mites (B), spider mites (C), and Drosophila female bodies (D). Overlaps are shown for each read length as well as all lengths together. Read lengths listed horizontally, Overlaps vertically. Blue arrow labeled Dcr indicates expected 2nt register suggestive of dicer cleavage. Red arrow labeled pp shows expected overlap for ping-pong processing.
Loss of piRNAs in dust mite
A dust mite small RNA library of nearly 400 million reads was generated to investigate whether piRNA-class small RNAs could be identified (S1 Text) (S1 Fig). To accommodate the repetitive nature of piRNA targets, all mapping events were captured for reads that mapped fewer than 100 times. An overall rate of ~80% mapping was observed with ~0.69% discarded due to mapping >100 times (S1 Fig). Next, an algorithm that determines small RNA read overlap probabilities in mapping data was used to characterize biogenesis of dust mite small RNAs. When applied to either all mapping or mapping in discrete size ranges, no clear bias for 10 nt overlapping reads was uncovered, showing an absence of ping-pong processing (Fig 1B). Instead, a strong signal was seen in a register 2 nt shorter than the read length. This is congruent with 2 nt overhangs left by Dicer cleavage. Overlaps seen in dust mites starkly contrast with those seen in spider mites and Drosophila. In spider mites, ping-pong signatures could be seen in longer reads (23-28 nt) and the Dicer-associated 2 nt register in shorter (20-22 nt) reads (Fig 1C). Likewise, in RNAs sequenced from Drosophila female bodies a prominent ping-pong signature is evident (Fig 1D). siRNA processing was not evident when considering whole genome mapping, but could be seen in a group of Drosophila IDEFIX retroelements that had biased mapping of 21 nt RNAs (S3 Fig) (S3 Table). Drosophila endo-siRNAs are a relatively small proportion of total small RNAs, and are frequently produced from inverted repeat loci, which are not captured by the overlap probability calculation used here. Moreover, this highlights a correlation between the presence of Rdrp in spider mites and an expanded population of Dicer products. Together, this shows a dramatic departure in the composition of dust mite small RNA populations relative to those in the Piwi protein-possessing spider mite.
The difference is even more stark when comparing dust mites to the more distantly related fruit fly. The configuration of RNAi in spider mites is likely ancestral due to the similarities to Drosophila, which is supported by clear orthology of spider mite Piwis to distinct Drosophila ping-pong partners: dmeAgo3 and Piwi/Aub (Fig 1A). Thus, RNAi pathways have diverged in dust mites and appear to be dominated by siRNA-like Dicer products, and lack the signature of amplifying ping-pong piRNAs.
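The overlap signatures discussed above can be illustrated with a simplified sketch. It is not the published overlap-probability algorithm used in this study; it only counts, for every plus/minus read pair, how many nucleotides separate their 5’ ends, and converts the counts to z-scores. The read representation `(start, length, strand)` with `start` as the leftmost mapped coordinate, the function names, and the toy reads are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def overlap_counts(reads, max_overlap=30):
    """Count 5' overlaps (in nt) between plus- and minus-strand reads.

    reads: iterable of (start, length, strand); start is the leftmost
    mapped coordinate and strand is "+" or "-".
    """
    # Index minus-strand reads by the coordinate of their 5' end.
    minus_by_5p = defaultdict(list)
    for start, length, strand in reads:
        if strand == "-":
            minus_by_5p[start + length - 1].append(length)

    counts = {k: 0 for k in range(1, max_overlap + 1)}
    for start, length, strand in reads:
        if strand != "+":
            continue
        # An overlap of k nt places the minus 5' end at start + k - 1.
        for k in range(1, min(length, max_overlap) + 1):
            for minus_len in minus_by_5p.get(start + k - 1, ()):
                if k <= minus_len:
                    counts[k] += 1
    return counts


def overlap_z_scores(counts):
    """Convert overlap counts to z-scores across all overlap lengths."""
    values = list(counts.values())
    mu, sd = mean(values), pstdev(values)
    return {k: ((v - mu) / sd if sd else 0.0) for k, v in counts.items()}


# Toy data: three 22 nt duplexes with 2 nt 3' overhangs (Dicer-like),
# and one 26 nt pair whose 5' ends overlap by 10 nt (ping-pong-like).
dicer_like = [(100, 22, "+"), (98, 22, "-"),
              (200, 22, "+"), (198, 22, "-"),
              (300, 22, "+"), (298, 22, "-")]
ping_pong_like = [(500, 26, "+"), (484, 26, "-")]

z = overlap_z_scores(overlap_counts(dicer_like))
print(max(z, key=z.get))                    # overlap with the highest z-score
print(overlap_counts(ping_pong_like)[10])   # pairs overlapping by 10 nt
```

In this toy case the Dicer-like duplexes peak at an overlap of 20 nt, two less than the 22 nt read length, while the ping-pong-like pair contributes to the 10 nt bin, mirroring the two registers marked by the arrows in Fig 1B-1D.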
To functionally characterize dust mite small RNAs, we sought to identify genomic loci that generate and/or are targeted by these transcripts (S1 Fig). To achieve this, annotation of the dust mite genome was extended to find ncRNAs and repetitive elements using RepeatMasker. Additionally, non-miRNA, small RNA producing loci were annotated that exhibited >1000 read density and were longer than 200 nt. The identities of regions were determined using blast2go. Nearly a third of the loci were rRNA or mRNA. The remainder showed homology to either TEs or lacked similarity to known sequences. Together this permitted segmentation of the dust mite genome into mRNA, TE, rRNA, tRNA, snRNA, and unknown small RNA-mapping loci (S2 Table). Small RNA reads were then mapped to these regions using multiple mapping conditions described above, as well as unique mapping. To ensure multi-mapping events were specific to loci groups, datasets were cleaned before mapping by removing reads that mapped to non-target genomic features (S1 Fig).
Multi-mapping alignments showed considerable enrichment at TEs relative to other classes, consistent with their repetitive nature (Fig 2A). Both multi- and uniquely mapping TE reads also exhibited lower strand bias, with only a single locus showing 100% bias after unique mapping (S4 Fig). This is consistent with processing from dsRNA. Higher bias was seen at other loci, suggesting that some mapping events may be due to capture of RNA degradation fragments and not functional small RNAs. This was supported when overlap probabilities were calculated, which, with the exception of TEs and mRNAs, did not show consistent processing signatures (S5 Fig). This includes the unknown loci, suggesting these transcripts may be degradation products of uncharacterized ncRNAs and are not generally siRNA or piRNA class small RNAs. Closer inspection of per locus overlaps did show Dicer processing at a minority of loci (S6 Fig). There was no clear ping-pong processing at unknown loci. Small RNA mapping coverage was calculated per locus to understand siRNA production from TEs and mRNAs (Fig 2B). On average, small RNA coverage was even across TE loci, while mRNAs had greater coverage at transcript 3’ ends. This pattern at mRNA loci is suggestive of cis-NAT siRNAs. Depth of coverage at TE loci varies, showing that active targeting is occurring at a subset of loci.
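To make the notion of per locus strand bias concrete, here is a minimal sketch under one assumed definition: the fraction of reads at a locus that map to its dominant strand, so 0.5 means perfectly balanced and 1.0 corresponds to the 100% bias mentioned above. The exact formula used in the study is not stated here, and the locus names and counts below are hypothetical.

```python
def strand_bias(plus_reads, minus_reads):
    """Fraction of reads on the dominant strand (0.5 = balanced, 1.0 = one strand only)."""
    total = plus_reads + minus_reads
    if total == 0:
        raise ValueError("locus has no mapped reads")
    return max(plus_reads, minus_reads) / total


# Hypothetical per locus read counts: (plus strand, minus strand).
toy_loci = {
    "TE_example": (480, 520),      # near-balanced, as expected for dsRNA-derived siRNAs
    "mRNA_example": (900, 100),    # skewed toward the sense strand
    "tRNA_example": (250, 0),      # single-stranded, typical of degradation fragments
}

for name, (plus, minus) in toy_loci.items():
    print(f"{name}\t{strand_bias(plus, minus):.2f}")
```

Under this definition, loci processed from dsRNA should cluster near 0.5, whereas loci that only accumulate fragments of a single-stranded transcript approach 1.0.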
A. An RDI plot of per locus strand bias seen after multi-mapping and uniquely-mapping protocols in dust mite genome feature classes: mRNA, rRNA, TE, tRNA, U6 snRNA, and unknown genomic loci. Mean indicated by black bar, white transparent box shows standard deviation. Under the graph millions of reads and number of loci in each category is shown. B. Coverage of small RNA mapping in TEs (left) and mRNAs (right). Line plots show average coverage across loci. Heatmaps below show length-normalized per locus coverage of small RNA reads.
The absence of purely single-stranded small RNA producing loci that have homology to TE sequences suggests that dust mites also lack a Zuc-dependent piRNA-like pathway that is involved in genome surveillance. This does not rule out the existence of dual strand piRNA clusters; however, piRNAs produced from these loci are found to participate in the ping-pong cycle, which we did not observe. These data suggest that the piRNA pathway has been lost in dust mites and control of TEs is likely under the purview of an siRNA-like pathway.
siRNAs facilitate genome surveillance in dust mite
To investigate the role of dust mite small RNAs in genome surveillance, we compared the biogenesis of TE-associated small RNAs to those found in spider mites. The size distribution of genome-aligned dust mite small RNAs is unimodal with a peak at 24 nt, versus a bimodal distribution in spider mites (Fig 3A). When TE-mapping reads are examined, the 24 nt sized RNAs in dust mite were enriched by 10%, while in spider mites only larger size range RNAs were found (Fig 3B). In other locus groups, less read size bias was observed, consistent with the heterogeneity seen in strand bias and read overlap probabilities, further reinforcing that non-TE loci generally do not produce small regulatory RNAs (S7 Fig). Next, we looked at the 5’ nucleotide bias and found that dust mite TE siRNA reads have an equal prevalence of T and A residues, versus spider mites where there was a striking overrepresentation of T (Fig 3C). Then we examined per locus read size distribution and overlap probabilities to assess whether Dicer-processed ~24 nt small RNAs are common across dust mite TE loci (Fig 3D). All loci exhibited mapping of predominantly 24 nt reads, and in the most prevalent size ranges (23-26 nt) a clear pattern of overlaps could be seen that is consistent with Dicer processing (Fig 3D). This contrasts with a similar analysis in spider mite where a ping-pong signature was seen across all TEs. Together, this suggests siRNAs are the main RNAi-based mode of controlling TEs in dust mites, accommodating the apparent loss of piRNAs. This is a clear departure from spider mites, where stereotypical piRNAs target TEs.
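The two summary statistics used in this comparison, read size distribution and 5’ nucleotide composition, can be sketched as follows. This is an illustrative outline only; the function names and the toy read sequences are assumptions, and reads are treated as plain DNA-alphabet strings (a leading U is counted as T).

```python
from collections import Counter


def length_distribution(reads):
    """Fraction of reads at each length, keyed by read length in nt."""
    counts = Counter(len(read) for read in reads)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return {length: n / total for length, n in sorted(counts.items())}


def five_prime_composition(reads):
    """Fraction of reads starting with each nucleotide."""
    counts = Counter(read[0].upper().replace("U", "T") for read in reads if read)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no non-empty reads supplied")
    return {base: n / total for base, n in sorted(counts.items())}


# Hypothetical reads: three 24-mers and one 22-mer, half starting with T.
toy_reads = [
    "T" + "A" * 23,
    "A" * 24,
    "T" * 22,
    "A" * 24,
]

print(length_distribution(toy_reads))
print(five_prime_composition(toy_reads))
```

A unimodal length profile with balanced A and T at the first position, as in the toy data, is the kind of pattern described for dust mite TE-mapping reads, in contrast to the strong 5’ T preference typical of piRNAs.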
A. Size distribution of genome mapped small RNAs in dust mites (solid line) and spider mites (dashed line). B. Size distribution of TE mapped small RNAs in dust mites (solid line) and spider mites (dashed line). C. Seqlogo showing 5’ nucleotide bias in TE mapped small RNA in spider mites (top) and dust mites (bottom). D. Per locus biogenesis of dust mite TE associated small RNAs. Left shows Log2 read accumulation per read size. Overhang probabilities (positive z-scores only) of small RNA pairs at specific or all sizes. The size(s) of reads are shown above the heat map. A similar analysis from spider mite small RNAs (18-30 nt) is shown on the right. Red arrow indicates overlap for ping-pong process. Blue arrow shows overlap expected for Dicer processing.
In the D. farinae genome we found three Dicers (DfaDcr1-GenBank ID: KY794588, DfaDcr2-GenBank ID: KY794589, DfaDcr3-GenBank ID: KY794590). DfaDcr1 is a close ortholog of arthropod miRNA-producing Dicer (S8 Fig). The other two Dicer proteins are related to family members in other mites and lophotrochozoans, and are unrelated to arthropod Dicer2 or nematode Dicer (S8 Fig). Unexpectedly, DfaDcr1 possesses an ATP binding helicase domain, which is implicated in processing of long dsRNA (S9 Fig). The more divergent Dicers, DfaDcr2 and DfaDcr3, lack both DUF283 and dsRNA binding domains, and have divergent PAZ domains (S9 Fig) [46–48]. Together, this suggests that mites, and possibly other chelicerates, possess ancient Dicer biology present in basal protostomes that was lost in both Nematoda and Pancrustacea (insects and crustaceans).
To verify whether TEs are controlled by Dicer-produced siRNAs we sought to inhibit the activity of dust mite Dicer proteins. To generate loss of Dicer function we elicited RNAi against each Dicer by feeding mites cognate dsRNA (Fig 4). Dust mites tolerate being soaked for several hours in aqueous solution, which they can be observed to ingest after 30 mins (Fig 4A). Small RNAs (20-27nt) derived from dsRNA can be recovered from soaked mites (Fig 4B). Knockdown of target genes can also be observed (Fig 4C–4K). Depletion by RNAi of each DfaDcr protein resulted in derepression of multiple TEs (Fig 4L) (S10 Fig). A strong effect was seen with loss of DfaDcr1 and DfaDcr2 function. The presence of processive helicase activity in DfaDcr1 suggests that long dsRNAs could be substrates. This combined with the lack of dsRNA binding motifs in DfaDcr2/3 suggests DfaDcr1 has a unique capacity to process dsRNA (S9 Fig), and therefore it is unsurprising that it has a significant role in the control of TEs (Fig 4L). Loss of DfaDcr2 showed a greater effect on TE expression compared to DfaDcr3. How these atypical Dicer proteins function is unclear; however, residues in the DfaDcr3 PAZ differ significantly from those in DfaDcr2 PAZ suggesting non-overlapping roles in the metabolism of dust mite small RNAs (S9 Fig). These results are consistent with reports that psoroptid mites are sensitive to dsRNA soaking, resulting in gene knockdown [49, 50].
A. Dust mite soaking. Mites were soaked separately in orange and green food color for 30 min. B. Radiolabeled RNAs recovered from mites fed either single-stranded (ssRNA) or double-stranded RNA (dsRNA). RNAs were treated with DNase and CIP prior to separation via denaturing PAGE. C. Western blot of Derf1 allergen after soaking animals with derf1 dsRNA (upper panel) and coomassie staining of the membrane (lower panel). Animals were soaked for 30 min and after 4 days lysates were prepared. D-K. qPCR for dust mite transcripts, all experiments were performed at least three times. Values represent four technical replicates. Reverse transcription was carried out with either oligo dT (D, F, H, J) or with random hexamers (E, G, I, K). Target transcripts were derf1 (D,E), dcr1 (F,G), dcr2 (H, I), and dcr3 (J, K). Cntrl represents no treatment, and KD soaking in the indicated dsRNA. L. Increased expression of numerous TE’s (S2 Table) following RNAi against three dust mite Dicers relative to untreated control. Error bars represent SEM.
Investigation of RNAi in dust mites revealed loss of the piRNA pathway and replacement by siRNAs. This is similar to observations in nematodes and flatworms [21, 22]. The loss of piRNA activity in dust mites, nematodes, and possibly in flatworms may be tolerated due to compensation by amplifying siRNAs produced by Rdrp [21, 51]. The collective function of dust mite Rdrps, however, appears to be distinct from nematodes, as only processive versions are present, suggesting the de novo siRNA pathway may not be present in mites (S11 Fig). Substantial Rdrp activity does appear to be present in dust mites; dsRNA soaking results in elevation of target mRNA when reverse transcription is carried out with random hexamers (Fig 4E, 4G, 4I and 4K) but not oligo dT (Fig 4D, 4F, 4H and 4J). Increase of transcript abundance was not due to the presence of ingested dsRNA as the region cloned to generate dsRNA was distinct from the qPCR amplicon (S10 Fig). Random priming will capture Rdrp products, while oligo dT will only hybridize to the initial transcript. For all the genes tested an elevation of cognate transcripts could be observed after random priming that were poorly recovered from Oligo dT primed cDNA.
Cataloging restricted sequences in siRNA producing master loci
Dust mites differ from nematodes that lost piRNAs in the organization of siRNA producing loci. A key feature of piRNA biology is the cataloging of restricted sequences into master loci. In nematode lineages lacking piRNAs, master loci also appear to be absent. This is not the case in dust mites (Fig 5A). Three loci were discovered that span 62 kb, contain sequences from multiple varieties of TEs, and exhibit homology to 70% of TE mapped small RNAs (Fig 5B). Two of the loci, ML-283 and ML-95, appear to be generated by duplication; however, some sequence divergence indicates they are distinct loci. Similar regions could not be found in the S. scabiei genome, though poor conservation is a characteristic of piRNA master loci. The dust mite loci appear to be generated from a dsRNA precursor, as both strands of the loci show similar rates of read mapping (Fig 5A). We found a tendency for 2 nt overhangs along with little evidence for nucleotide bias (S12 Fig). The loci were inspected for common motifs using the MEME suite. Motifs recovered were primarily simple repeats with none being shared between loci, suggesting dust mite master loci do not possess elements like the Ruby motif, which is central to directing piRNA transcription in C. elegans. Following knockdown of each of the individual dust mite Dicers, a significant (>80%) reduction in siRNAs exhibiting homology to these regions was observed, indicating a dependence on the activity of all dust mite Dicers for biogenesis (Fig 5C). Detection of the siRNAs was accomplished with a combination of oligonucleotide probes complementary to sites of highest small RNA density in the three master loci (S1 Text). They also have homology to other regions of the genome, specifically TEs. Thus, the Dicer-sensitive siRNAs include master loci derived primary siRNAs and potentially secondary siRNAs generated from processed TE transcripts. This is consistent with loss of TE control after knockdown of each Dicer (Fig 4L).
However, there is a clear difference in the magnitude of TE expression, which may point to roles for dust mite Dicer proteins outside the production of siRNAs and to involvement in targeting of TE transcripts. This could be similar to limiting of latent viral infection by Drosophila Dcr2.
A. siRNA producing TE-control master loci (ML). Read density of all mapping events to the positive strand in red, negative strand in blue. Density of uniquely mapping reads in yellow for positive strand and green for negative strand. B. Catalog of TE homology sequences in master loci. Multiple sequence alignment of TEs against master loci to show homologous sequences. C. Northern blots against ML-associated siRNAs (ML-A siRNA) after eliciting RNAi against dust mite Dicers. D. Northern blots against ML-A siRNAs after β-elimination test. E. Accumulation of ML-A siRNAs following incubation with the monophosphate specific terminator ribonuclease (term) and Calf intestinal phosphatase (CIP). Relative accumulation of ML-A siRNAs was determined by densitometry and normalization to U6 signal. Experiments were performed at least three times, representative results shown.
Next, we sought to characterize terminal moieties of master loci associated siRNAs through biochemical tests to gain greater insight into their biogenesis (Fig 5D and 5E). The primary goal was to determine if the siRNAs had characteristics of Dicer cleavage: 5’-monophosphates and 3’-OH groups. β-elimination showed a shift to a lower molecular weight, indicating an unmodified 2’-OH; therefore, unlike Drosophila Ago2 endo-siRNAs or C. elegans Prg-1 associated small RNAs, dust mite siRNAs are not 2’-O-methylated (2’-OMe) (Fig 5D) [57, 58]. Next, we identified groups on 5’ ends of small RNAs using the 5’ monophosphate specific terminator ribonuclease. After treatment, a 50% reduction in siRNAs could be observed (Fig 5E). Degradation by terminator could be abrogated by prior treatment with calf intestinal phosphatase (CIP). There is a noticeable lag in siRNA gel migration following CIP treatment, which is consistent with removal of 5’ phosphate groups and loss of charge. These results also reinforce the absence of a de novo siRNA pathway. Small RNAs produced by non-processive Rdrps in C. elegans have 5’ triphosphate groups. While treatment with terminator did not completely eliminate siRNAs, there was no observable change in migration. If the remaining small RNAs were spared due to the presence of triphosphate groups, there would be a shift towards a smaller molecular weight relative to untreated. Together, dust mite master loci associated siRNAs appear to be Dicer products arising from a dsRNA precursor and possess the expected 5’-monophosphate, but differ from insect endo-siRNAs due to the absence of 2’-OMe groups. We were able to identify a dust mite gene with similarity to Hen1 methyltransferase proteins; however, inspection of potential open-reading frames revealed the absence of a common motif involved in recognition of 2 nt 3’ overhangs characteristic of Dicer products (S12 Fig). This likely explains the lack of 2’-OMe groups on dust mite siRNAs.
DNA methylation is not involved in dust mite TE control
The extent of DNA methylation at CG sites varies widely across insect clades and can be as high as 40% in roaches, while other groups, like flies, show little evidence for this modification. Here we investigated whether this epigenetic control mechanism is a component of TE control in dust mites, as the genomes of nematodes and platyhelminths that lack the piRNA pathway are frequently modified by cytosine methylation [21, 60]. Dust mites differ from these organisms, as evidence for this modification seems minimal and it is not enriched at TE loci (Fig 6A). Indeed, bisulfite sequencing showed potential CG and CHG methylation is underrepresented in TE sequences, despite these sites occurring at the same rate as other genomic loci. Furthermore, the overall rate of DNA methylation (0.5%) was very low, suggesting this base modification is not a major feature of dust mite chromatin regulation. Moreover, we found a single DNA methyltransferase in the D. farinae genome, a Dnmt1 homolog (Fig 6B and 6C). It is likely a pseudogene, as it appears to be truncated and shows little evidence of expression. This further highlights the distinct, derived nature of small RNA-mediated genome surveillance in dust mites.
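For readers unfamiliar with how methylation rates and sequence contexts are typically summarized from bisulfite data, the following is a minimal sketch. It assumes per-site counts of reads supporting a methylated (unconverted) versus unmethylated (converted) cytosine are already available; the function names and toy numbers are hypothetical and do not reproduce the pipeline used in this study.

```python
def cytosine_context(seq, i):
    """Classify the cytosine at index i as CG, CHG or CHH (H = A, C or T).

    Returns None when fewer than two downstream bases are available
    and the context cannot be resolved.
    """
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position is not a cytosine")
    downstream = seq[i + 1:i + 3]
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) < 2:
        return None
    return "CHG" if downstream[1] == "G" else "CHH"


def methylation_rate(site_counts):
    """Pooled methylation rate from (methylated_reads, unmethylated_reads) pairs."""
    methylated = sum(m for m, _ in site_counts)
    total = sum(m + u for m, u in site_counts)
    if total == 0:
        raise ValueError("no coverage at any site")
    return methylated / total


# Hypothetical counts at two cytosines, each covered by 200 reads.
toy_sites = [(1, 199), (0, 200)]
print(cytosine_context("ACGT", 1), cytosine_context("ACAGT", 1), cytosine_context("ACATT", 1))
print(f"{methylation_rate(toy_sites):.4f}")
```

Pooling read counts across sites, as done here, weights well-covered cytosines more heavily than averaging per-site rates; which of the two conventions a given study uses should be checked against its methods.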
A. Distribution of methylated bases assessed by bisulfite sequencing across the entire genome, mRNAs, and TEs. Percentage of methylated Cs (mC) identified in all sequence contexts are compared with the number of bases identified in each category. B. Dust Mite DNMT1 homolog. Expression of dust mite DNA (cytosine-5)-methyltransferase 1 (Dnmt1) in mixed stage RNA-Seq data. Blue bar represents dust mite Dnmt1 locus in the scaffold. Read density in region shown as grey plot. Reads mapping below; plus strand mapping in red, minus strand mapping in blue. C. Domain structure of truncated D. farinae Dnmt1 and an intact ortholog from Limulus polyphemus.
This work provides insight into the elaborate nature of RNAi in chelicerates, many of which appear to have both Piwi proteins and Rdrps [29, 30, 39]. Loss of the piRNA pathway in dust mites probably occurred in the parasitic ancestor. Inspection of the scabies mite genome similarly failed to uncover Piwi proteins (S13 Fig). Members of the divergent dust mite Ago family, however, were found. Indeed, a deeper inspection of scabies mite RNAi factors uncovered further similarities to dust mites (Table 1). Thus, absence of the piRNA pathway in dust mites is likely a consequence of descending from an ancestor that underwent dramatic genome changes, potentially during the acquisition of a parasitic lifestyle. This highlights plasticity of RNAi pathways and how clade-specific biology might impact evolution of RNAi technologies.
Dust mites exhibit a highly distinct RNAi biology, possessing both novel and ancient effectors that have not been studied in popular ecdysozoan model organisms. Indeed, there seem to be wholesale changes to the small RNAome of these organisms. Dicer-produced siRNAs are an unusually common feature of the dust mite small RNA populations, comprising approximately three-fourths of all small RNA species. This contrasts with many other organisms where microRNA-class small RNAs are the archetype. Dust mite siRNAs are, at least in part, involved in genome surveillance. They target TEs, and depletion of Dicer proteins causes derepression of these elements. Control of TEs is typically carried out by piRNAs in flies, from which dust mite siRNAs are distinct. A common feature of nearly all piRNAs is a “U” residue at the first position. We do not observe this in any subset of dust mite siRNAs. Furthermore, well-described modes of piRNA biogenesis found in Drosophila and C. elegans are absent in dust mites. Loss of piRNAs seems specific to psoroptidian mites, as they are clearly present in other Acari, like spider mites. The divergent nature of dust mite siRNAs is particularly apparent in the absence of 2’-O-methylation of siRNAs, a common feature of siRNAs and piRNAs in other organisms. Interestingly, scabies mites also lack the requisite Hen-1 protein. Inspection of syntenic regions of the dust mite and scabies mite genomes showed rearrangements at this locus, potentially linking the loss of this activity to the evolution of Psoroptidia-specific Ago proteins (S13 Fig) (Table 1). The highly divergent RNAi pathways of dust mites provide an evolutionary perspective not only on the capacity of small RNAs to acquire roles in genome surveillance, but also on the possibility that the precise mechanism may not be that important. This is supported by relatively similar composition of classes of TEs in spider mites, dust mites, and scabies mites (S15 Fig).
While similar classes were observed their locations and specific identities are distinct. Furthermore, this indicates that the collection of dust mite TEs analyzed in this study accurately represent the overall TE population.
Flux of small RNA pathways correlates with evolutionary innovation; for example, higher arthropods lost Rdrp in favor of piRNA control of TE . This also occurred when vertebrates diverged from basal chordates . In both cases, loss of Rdrp accompanied innovation in body plan and sensory organs. In vertebrates, whole genome duplication occurred twice following descent from a Rdrp expressing chordate ancestor, affirming a period of genome instability . TE mobilization may be fortuitous for adaptation, and dramatic evolutionary changes may require extreme events such as perturbation of surveillance mechanisms.
Materials and methods
Genome assembly pipeline
The dust mite genome was assembled using reads produced by PacBio and Illumina platforms. The initial assembly was generated by PacBio HGAP. Illumina reads were preprocessed in three steps before being used to extend PacBio contigs: a) using Trimmomatic, nucleotides with base quality lower than 15 were removed from both ends of reads; b) using FastUniq, duplicate pairs were removed from the PE library; and c) SOAPec was used to correct read errors [64, 65]. Any initial genome sequence contains bacterial contamination due, at least, to the presence of gut microbiota in DNA isolates. To remove bacterial DNA sequences from the D. farinae genome sequence, 4,864,367 bacterial genome sequences were downloaded from the RefSeq database at ftp://ftp.ncbi.nih.gov/refseq/release/bacteria, and a BLAST database was created from these sequences [66, 67]. All contigs were searched with BLAST against this bacterial genome database, and the matched percentage was calculated for each contig. If the matched portion exceeded 10% of an individual contig’s length, the contig was considered contaminated by bacterial DNA and was discarded. After this process, the final assembly comprised 1,706 contigs with an N50 length of 19,371 and a total length of 91,947,272 bp. Finally, a published dust mite genome was compared to our assembled contigs using QUAST [34, 68]; 79.3% of the bases of the reference genome could be aligned to the new assembly.
Using available mRNA-seq datasets, transcripts were identified with the Tuxedo suite. Initial mapping with TopHat was followed by transcript annotation with Cufflinks. Transcript similarity was estimated using Blast2GO.
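The contamination filter described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: it assumes BLAST hits have already been parsed into hypothetical `(contig_id, qstart, qend)` tuples, and it assumes overlapping hits on a contig are merged so that no base is counted twice (the text does not state how the matched percentage was computed). The function names are illustrative only.

```python
from collections import defaultdict

def merged_length(intervals):
    """Bases covered by 1-based inclusive intervals, counting overlaps once."""
    covered = 0
    current_start = current_end = None
    # Normalise orientation so that start <= end, then sweep left to right.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if current_end is None or start > current_end + 1:
            # Disjoint from the running interval: close it and start a new one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered

def filter_contaminated(contig_lengths, hits, max_fraction=0.10):
    """Keep contigs whose bacterial-matched fraction does not exceed max_fraction.

    contig_lengths: {contig_id: length in bp}
    hits: iterable of (contig_id, qstart, qend) from a search against bacteria
    """
    by_contig = defaultdict(list)
    for contig_id, qstart, qend in hits:
        by_contig[contig_id].append((qstart, qend))
    kept = []
    for contig_id, length in contig_lengths.items():
        fraction = merged_length(by_contig[contig_id]) / length
        if fraction <= max_fraction:  # discard only when higher than 10%
            kept.append(contig_id)
    return kept
```

Merging intervals before dividing by contig length avoids inflating the matched fraction when several bacterial sequences hit the same region of a contig.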
Small RNA analysis
Total RNA was isolated via the TRIzol method from bulk-collected dust mites to capture the life stages of D. farinae. Small RNAs were cloned from total RNA with an Illumina TruSeq small RNA kit and sequenced on the Illumina NextSeq platform. The dataset comprised nearly 400 million reads. Quality of the sequenced library was assessed with the FastQC tool, and the small RNA reads were analyzed using a custom pipeline (S1 Fig).
dsRNA soaking of mites and northern blotting
Mites collected with the salt bath method were suspended in a solution of dsRNA dissolved in nuclease-free water (S1 Text). After 6 hours, animals were washed in water and dried on filter paper. The animals were then kept at 23°C with a relative humidity of 80%. After two days, total RNA was extracted using the TRIzol method and resolved on a 12.5% urea-polyacrylamide gel. When animals were fed unlabeled dsRNAs, RNAs were transferred to nylon membranes and subjected to northern blotting as previously described (S1 Text). If radiolabeled RNAs were fed, gels were directly exposed to phosphorimager screens.
20 μg of total RNA was oxidized at room temperature in borax/boric acid buffer (60 mM borax and 60 mM boric acid, pH 8.6) containing 80 mM NaIO4 for 30 min. The β-elimination reaction was carried out for 90 min using 200 mM NaOH at 45°C. Following precipitation, RNA was resolved on a 12.5% urea-polyacrylamide gel and subjected to northern blotting as previously described.
CIP and Terminator exonuclease treatment
20 μg of total RNA was used for each reaction. In the first condition, Terminator exonuclease (Epicentre) was added and the tube was incubated at 30°C for 60 minutes, after which the RNA was purified by an organic extraction protocol. In the second condition, 1 μl CIP (calf intestinal phosphatase, NEB) was added and incubated at 37°C for 30 min; Terminator exonuclease was then added, followed by a second incubation at 30°C for 60 minutes. Precipitated RNAs were resuspended in loading buffer, resolved on a 12.5% urea-polyacrylamide gel, and subjected to northern blotting as previously described.
A methyl-seq DNA library was created with the Illumina Methyl-seq TruSeq Kit from dust mite DNA recovered by organic extraction followed by precipitation. Using the Bismark algorithm, base-converted dust mite genome indexes were used to determine the rate of cytosine methylation. Using coordinates from Cufflinks (mRNA) and RepeatMasker (TE) annotations, rates of methylation were determined for different genomic features. Reads were mapped uniquely and duplicated reads were discarded, resulting in an average coverage depth of 6X. Using bedtools, genomic regions with >4 mapped reads were identified and the base conversion rate was measured.
The assembled genome was submitted under GenBank ID: NBAF01000000. The small RNA BioSample accession number is SAMN05441789. Bisulfite sequencing datasets are deposited under BioSample accession number SAMN06891248. Spider mite small RNA datasets used in the study can be accessed at GEO GSE32005. The Drosophila small RNA dataset used in the study can be accessed at GEO GSE83698.
S1 Text. Supplementary methods and materials (provided in a separate file: S1_Text).
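The coverage filter and per-feature rate calculation can be illustrated with a short sketch. It rests on several assumptions not stated in the text: that methylated and unmethylated call counts are available per cytosine, that feature coordinates are 0-based and half-open as in BED files, and that the ">4 reads" threshold is applied per cytosine position. The function and argument names are hypothetical.

```python
def methylation_rate(sites, features, min_reads=5):
    """Pooled cytosine methylation rate for each annotated feature.

    sites: iterable of (chrom, pos, methylated_calls, unmethylated_calls),
           with pos 0-based.
    features: iterable of (name, chrom, start, end), 0-based half-open.
    min_reads: minimum calls at a site; 5 corresponds to ">4 reads".
    Returns {name: rate}, or None where no site passes the coverage filter.
    """
    sites = list(sites)  # allow a generator to be scanned once per feature
    rates = {}
    for name, chrom, start, end in features:
        methylated = total = 0
        for site_chrom, pos, m, u in sites:
            inside = site_chrom == chrom and start <= pos < end
            if inside and (m + u) >= min_reads:
                methylated += m
                total += m + u
        rates[name] = methylated / total if total else None
    return rates
```

Pooling calls across a feature, rather than averaging per-site rates, weights well-covered cytosines more heavily; either choice is defensible at low coverage, and the source does not say which was used.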
S1 Fig. Pipelines used to analyze small RNAs from high throughput sequence data.
S2 Fig. Alignment of dust mite Ago “slicer” DEDH/D motif.
Multiple sequence alignment was carried out using Clustal Omega. Active site residues are highlighted in red or green.
S3 Fig. Heatmap of overlap probability z-scores for D. melanogaster siRNAs derived from a subset of IDEFIX TEs sequenced from female bodies.
The top bar graph represents the number of reads of each size. Probability z-scores were calculated for each length separately (18, 19, etc.) and together (18–30). The R heatmap2 package was used to draw the heatmap. The 2 nt Dicer processing register is shown by the blue arrow “D”. The red arrow labeled “pp” shows the 10 nt ping-pong overlap signature. Blank areas in the heatmap are due to the absence of overlapping pairs.
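To make the overlap signature concrete, the following is a minimal sketch of how overlap counts and z-scores can be computed for reads on opposite strands. It is an illustration under stated assumptions and not the script used in this study: reads are given as hypothetical `(start, end, strand)` tuples with 1-based inclusive coordinates, and z-scores are taken across the tested overlap lengths. For a pair of 21 nt reads, a Dicer-type duplex with 2 nt 3′ overhangs appears as a 19 nt overlap, whereas ping-pong cleavage appears as a 10 nt overlap.

```python
from collections import Counter
from statistics import mean, pstdev

def overlap_counts(reads, min_overlap=1, max_overlap=30):
    """Count sense/antisense read pairs for each overlap length.

    reads: iterable of (start, end, strand), 1-based inclusive, strand '+' or '-'.
    The 5' end of a '+' read is its start; the 5' end of a '-' read is its end,
    so a pair overlaps by (minus_end - plus_start + 1) bases.
    """
    reads = list(reads)
    plus_starts = Counter(start for start, end, strand in reads if strand == "+")
    minus_ends = Counter(end for start, end, strand in reads if strand == "-")
    counts = {}
    for overlap in range(min_overlap, max_overlap + 1):
        counts[overlap] = sum(
            n_plus * minus_ends[plus_start + overlap - 1]
            for plus_start, n_plus in plus_starts.items()
        )
    return counts

def overlap_zscores(counts):
    """Z-score of each overlap count relative to all tested overlap lengths."""
    values = list(counts.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {overlap: 0.0 for overlap in counts}
    return {overlap: (n - mu) / sigma for overlap, n in counts.items()}
```

Restricting `reads` to a single length before calling `overlap_counts` gives the per-size signatures; passing all 18–30 nt reads gives the combined signature.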
S4 Fig. Strand bias and expression for TE, mRNA, and unknown loci.
For each locus, the number of reads mapped to either the sense or antisense strand was determined using bedtools multicov. Strand bias was calculated by dividing the absolute difference between strand-specific coverage by total coverage (y-axis). Each locus is plotted by bias and log2(number of mapping reads) (x-axis). The red line indicates mean values and the dotted lines the standard deviation. A green regression line is also plotted. Box plots on the left and below show the distribution of values: y-axis bias, x-axis expression.
S5 Fig. Heatmap of overlap probability z-scores for loci groups (TE, mRNA, unknown, and ncRNA: rRNA, tRNA, U6).
Probability z-scores, shown on top of the maps, were calculated for each size separately (18, 19, …, 30) and together (18–30). Overlaps are shown on the right of the maps. Heatmaps were drawn with the R heatmap2 package. The blue arrow labeled “D” shows the 2 nt Dicer processing register. The red arrow labeled “pp” shows the 10 nt overlap where ping-pong cleavage would be seen.
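The strand bias measure defined above is |sense − antisense| / (sense + antisense), which ranges from 0 (equal coverage on both strands) to 1 (all reads on one strand). A minimal sketch follows; the function names are hypothetical, and the per-strand read counts are assumed to come from a tool such as bedtools multicov.

```python
import math

def strand_bias(sense_reads, antisense_reads):
    """Absolute difference in strand-specific coverage divided by total coverage."""
    total = sense_reads + antisense_reads
    if total == 0:
        raise ValueError("locus has no mapped reads")
    return abs(sense_reads - antisense_reads) / total

def locus_point(sense_reads, antisense_reads):
    """Plot coordinates for one locus: (log2 of total mapped reads, strand bias)."""
    bias = strand_bias(sense_reads, antisense_reads)
    return math.log2(sense_reads + antisense_reads), bias
```

For example, a locus with 96 sense and 32 antisense reads has a bias of 64/128 = 0.5 and is plotted at x = log2(128) = 7.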
S6 Fig. Overlap probabilities by locus for unknown loci.
Size of read pairs indicated above the heatmaps. Blue arrows denote the expected overlap for dicer processing. Red arrows indicate expected overlap for ping pong cleavage.
S7 Fig. Size distribution of reads mapped to different types of loci.
S8 Fig. Dicer family tree comparing relationships among Dicers.
Dust mite Dicers indicated in red. Full name of the gene abbreviations can be found in S1 Text.
S9 Fig. Structural annotations of dust mite Dicer proteins.
Dfa_Dcr1 (NCBI accession KY794588), Dfa_Dcr2 (NCBI accession KY794589), and Dfa_Dcr3 (NCBI accession KY794590) compared to D. melanogaster orthologs. A. Protein domain prediction of dust mite Dicer proteins compared to Drosophila Dicers. Dicer protein domains were predicted using ScanProsite. Helicase_ATP_Bind (helicase ATP-binding domain), Helicase_Cterm (helicase C-terminal domain), DUF 283 (dsRNA annealing domain), PAZ (Piwi-Argonaute-Zwille domain), RNase IIIa/b (RNase III domains), DS_RBD (double-stranded RNA-binding domain). B. Crucial amino acids for Dicer activity in the helicase, RNase III, and PAZ domains. Multiple sequence alignment was carried out using Clustal Omega and the alignment was visualized with Jalview. Amino acid positions indicated for the PAZ domain correspond to Drosophila Dicer1.
S10 Fig. Positions of dsRNA and qPCR sites.
Regions used for creation of dsRNA and qPCR are shown in red and green respectively for the Derf1 and DfaDcr1-3 genes.
S11 Fig. D. farinae’s Rdrps are processive enzymes due to absence of a proline/tryptophan-rich loop.
Insertion of a proline/tryptophan-rich loop in the RRF1/EGO1 group of Rdrps is responsible for de novo initiation of RNA synthesis, which is a property of non-processive Rdrps. This group of Rdrps makes short RNAs such as the 22G RNAs of C. elegans, while processive Rdrps (RRF3 group), which lack this loop, elongate nascent RNA to greater lengths. None of the D. farinae Rdrps have this loop; thus they are processive (RRF3 type) and synthesize longer RNAs.
S12 Fig. Characteristics of ML-siRNAs.
A. Reads uniquely mapping to ML-siRNA loci show a 2 nt overhang, which is characteristic of Dicer processing. The overlap probability z-score was calculated using the Python script for each size pair (18/18, 19/19, …, 28/28) and averaged. Overlap probability was then converted to overhang probability by subtracting each overlap length from the read length (for example, the probability of a 19 nt overlap is the same as the probability of a 2 nt overhang for a 21/21 pair). B. Seqlogo analysis showing nucleotide bias in ML-siRNAs. These small RNAs tend to be AT rich.
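The overlap-to-overhang conversion in panel A can be sketched as below. The caption does not fully specify the order of operations, so this sketch assumes that scores are first re-indexed by overhang within each same-size pair and then averaged across sizes. The input layout, a hypothetical `{read_length: {overlap: score}}` mapping, and the function names are illustrative.

```python
def overhang_from_overlap(read_length, overlap):
    """3' overhang implied by an overlap between two reads of equal length."""
    return read_length - overlap

def average_overhang_scores(overlap_scores_by_size):
    """Average overlap scores by implied overhang across same-size read pairs.

    overlap_scores_by_size: {read_length: {overlap: score}}
    Returns {overhang: mean score over the sizes that report that overhang}.
    """
    sums, counts = {}, {}
    for read_length, scores in overlap_scores_by_size.items():
        for overlap, score in scores.items():
            overhang = overhang_from_overlap(read_length, overlap)
            sums[overhang] = sums.get(overhang, 0.0) + score
            counts[overhang] = counts.get(overhang, 0) + 1
    return {overhang: sums[overhang] / counts[overhang] for overhang in sums}
```

With this indexing, a 19 nt overlap in a 21/21 pair and a 20 nt overlap in a 22/22 pair both contribute to the 2 nt overhang score.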
S13 Fig. Dust mite Hen1 protein.
A. Sequences from Drosophila and Arabidopsis were blasted against the dust mite genome. A single Hen1 homolog was found that lacks a conserved domain involved in recognition of the 2 nt 3′ overhangs found in Dicer products. B. Expression from RNA-seq at the Hen1 locus and annotations of neighboring genes. The potential syntenic region from the scabies mite genome is shown below, indicating loss of the Hen1 gene in this mite.
S14 Fig. Comparison of dust mite and scabies Ago proteins.
The clade containing the dust mite-specific Ago proteins described in Fig 1 is highlighted in yellow. microRNA-binding Agos are indicated in blue. Drosophila Piwi is included to demonstrate lack of clustering with this group of Ago proteins.
S15 Fig. Distribution of TE classes in spider mites, dust mites, and scabies mite.
S1 Table. Dust mite transcriptome annotations (provided in a separate file: DustMite_mRNA.bed).
S2 Table. Annotation of all TEs in the dust mite genome (provided in a separate file: DustMite_TE_ ncRNA_And_HighExpressingLoci.bed).
S3 Table. IDEFIX TE coordinates of D. melanogaster (provided in a separate file: IDEX_Fly.bed).
- 1. Arlian LG. House-dust-mite allergens: a review. Exp Appl Acarol. 1991;10(3–4):167–86. pmid:2044430.
- 2. Klimov PB, OConnor B. Is permanent parasitism reversible?—critical evidence from early evolution of house dust mites. Syst Biol. 2013;62(3):411–23. pmid:23417682.
- 3. Brookfield JF. Host-parasite relationships in the genome. BMC Biol. 2011;9:67. pmid:21985691; PubMed Central PMCID: PMCPMC3189907.
- 4. Poulin R, Randhawa HS. Evolution of parasitism along convergent lines: from ecology to genomics. Parasitology. 2015;142 Suppl 1:S6–S15. pmid:24229807; PubMed Central PMCID: PMCPMC4413784.
- 5. Fedoroff NV. Presidential address. Transposable elements, epigenetics, and genome evolution. Science. 2012;338(6108):758–67. pmid:23145453.
- 6. Hedges DJ, Deininger PL. Inviting instability: Transposable elements, double-strand breaks, and the maintenance of genome integrity. Mutat Res. 2007;616(1–2):46–59. pmid:17157332; PubMed Central PMCID: PMCPMC1850990.
- 7. Buchon N, Vaury C. RNAi: a defensive RNA-silencing against viruses and transposable elements. Heredity (Edinb). 2006;96(2):195–202. pmid:16369574.
- 8. Crichton JH, Dunican DS, Maclennan M, Meehan RR, Adams IR. Defending the genome from the enemy within: mechanisms of retrotransposon suppression in the mouse germline. Cell Mol Life Sci. 2014;71(9):1581–605. pmid:24045705; PubMed Central PMCID: PMCPMC3983883.
- 9. Senti KA, Jurczak D, Sachidanandam R, Brennecke J. piRNA-guided slicing of transposon transcripts enforces their transcriptional silencing via specifying the nuclear piRNA repertoire. Genes Dev. 2015;29(16):1747–62. pmid:26302790; PubMed Central PMCID: PMCPMC4561483.
- 10. Weick EM, Miska EA. piRNAs: from biogenesis to function. Development. 2014;141(18):3458–71. pmid:25183868.
- 11. Czech B, Hannon GJ. One Loop to Rule Them All: The Ping-Pong Cycle and piRNA-Guided Silencing. Trends Biochem Sci. 2016;41(4):324–37. pmid:26810602; PubMed Central PMCID: PMCPMC4819955.
- 12. Huang H, Li Y, Szulwach KE, Zhang G, Jin P, Chen D. AGO3 Slicer activity regulates mitochondria-nuage localization of Armitage and piRNA amplification. J Cell Biol. 2014;206(2):217–30. pmid:25049272; PubMed Central PMCID: PMCPMC4107788.
- 13. Han BW, Wang W, Li C, Weng Z, Zamore PD. Noncoding RNA. piRNA-guided transposon cleavage initiates Zucchini-dependent, phased piRNA production. Science. 2015;348(6236):817–21. pmid:25977554; PubMed Central PMCID: PMCPMC4545291.
- 14. Mohn F, Handler D, Brennecke J. Noncoding RNA. piRNA-guided slicing specifies transcripts for Zucchini-dependent, phased piRNA biogenesis. Science. 2015;348(6236):812–7. pmid:25977553.
- 15. Iwasaki YW, Siomi MC, Siomi H. PIWI-Interacting RNA: Its Biogenesis and Functions. Annu Rev Biochem. 2015;84:405–33. pmid:25747396.
- 16. Siomi MC, Sato K, Pezic D, Aravin AA. PIWI-interacting small RNAs: the vanguard of genome defence. Nat Rev Mol Cell Biol. 2011;12(4):246–58. pmid:21427766.
- 17. Wang W, Han BW, Tipping C, Ge DT, Zhang Z, Weng Z, et al. Slicing and Binding by Ago3 or Aub Trigger Piwi-Bound piRNA Production by Distinct Mechanisms. Mol Cell. 2015;59(5):819–30. pmid:26340424; PubMed Central PMCID: PMCPMC4560842.
- 18. Ishizu H, Iwasaki YW, Hirakata S, Ozaki H, Iwasaki W, Siomi H, et al. Somatic Primary piRNA Biogenesis Driven by cis-Acting RNA Elements and trans-Acting Yb. Cell Rep. 2015;12(3):429–40. pmid:26166564.
- 19. Malone CD, Brennecke J, Dus M, Stark A, McCombie WR, Sachidanandam R, et al. Specialized piRNA pathways act in germline and somatic tissues of the Drosophila ovary. Cell. 2009;137(3):522–35. pmid:19395010; PubMed Central PMCID: PMCPMC2882632.
- 20. de Albuquerque BF, Placentino M, Ketting RF. Maternal piRNAs Are Essential for Germline Development following De Novo Establishment of Endo-siRNAs in Caenorhabditis elegans. Dev Cell. 2015;34(4):448–56. pmid:26279485.
- 21. Sarkies P, Selkirk ME, Jones JT, Blok V, Boothby T, Goldstein B, et al. Ancient and novel small RNA pathways compensate for the loss of piRNAs in multiple independent nematode lineages. PLoS Biol. 2015;13(2):e1002061. pmid:25668728; PubMed Central PMCID: PMCPMC4323106.
- 22. Tsai IJ, Zarowiecki M, Holroyd N, Garciarrubio A, Sanchez-Flores A, Brooks KL, et al. The genomes of four tapeworm species reveal adaptations to parasitism. Nature. 2013;496(7443):57–63. pmid:23485966; PubMed Central PMCID: PMCPMC3964345.
- 23. Klenov MS, Gvozdev VA. Heterochromatin formation: role of short RNAs and DNA methylation. Biochemistry (Mosc). 2005;70(11):1187–98. pmid:16336177.
- 24. Pikaard CS. Cell biology of the Arabidopsis nuclear siRNA pathway for RNA-directed chromatin modification. Cold Spring Harb Symp Quant Biol. 2006;71:473–80. pmid:17381329.
- 25. Verdel A, Vavasseur A, Le Gorrec M, Touat-Todeschini L. Common themes in siRNA-mediated epigenetic silencing pathways. Int J Dev Biol. 2009;53(2–3):245–57. pmid:19412884.
- 26. Fagegaltier D, Bouge AL, Berry B, Poisot E, Sismeiro O, Coppee JY, et al. The endogenous siRNA pathway is involved in heterochromatin formation in Drosophila. Proc Natl Acad Sci U S A. 2009;106(50):21258–63. pmid:19948966; PubMed Central PMCID: PMCPMC2795490.
- 27. Sienski G, Batki J, Senti KA, Donertas D, Tirian L, Meixner K, et al. Silencio/CG9754 connects the Piwi-piRNA complex to the cellular heterochromatin machinery. Genes Dev. 2015;29(21):2258–71. pmid:26494711; PubMed Central PMCID: PMCPMC4647559.
- 28. Tomoyasu Y, Miller SC, Tomita S, Schoppmeier M, Grossmann D, Bucher G. Exploring systemic RNA interference in insects: a genome-wide survey for RNAi genes in Tribolium. Genome Biol. 2008;9(1):R10. pmid:18201385; PubMed Central PMCID: PMCPMC2395250.
- 29. Grbic M, Van Leeuwen T, Clark RM, Rombauts S, Rouze P, Grbic V, et al. The genome of Tetranychus urticae reveals herbivorous pest adaptations. Nature. 2011;479(7374):487–92. pmid:22113690.
- 30. Kurscheid S, Lew-Tabor AE, Rodriguez Valle M, Bruyeres AG, Doogan VJ, Munderloh UG, et al. Evidence of a tick RNAi pathway by comparative genomics and reverse genetics screen of targets with known loss-of-function phenotypes in Drosophila. BMC Mol Biol. 2009;10:26. pmid:19323841; PubMed Central PMCID: PMCPMC2676286.
- 31. Lewis SH, Quarles KA, Yang Y, Tanguy M, Frezal L, Smith SA, et al. Pan-arthropod analysis reveals somatic piRNAs as an ancestral defence against transposable elements. Nat Ecol Evol. 2018;2(1):174–81. pmid:29203920; PubMed Central PMCID: PMCPMC5732027.
- 32. Chin CS, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods. 2013;10(6):563–9. pmid:23644548.
- 33. Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics. 2011;27(4):578–9. pmid:21149342.
- 34. Chan TF, Ji KM, Yim AK, Liu XY, Zhou JW, Li RQ, et al. The draft genome, transcriptome, and microbiome of Dermatophagoides farinae reveal a broad spectrum of dust mite allergens. J Allergy Clin Immunol. 2015;135(2):539–48. pmid:25445830.
- 35. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7(3):562–78. pmid:22383036; PubMed Central PMCID: PMCPMC3334321.
- 36. Rider SD Jr., Morgan MS, Arlian LG. Draft genome of the scabies mite. Parasit Vectors. 2015;8:585. pmid:26555130; PubMed Central PMCID: PMCPMC4641413.
- 37. Meister G. Argonaute proteins: functional insights and emerging roles. Nat Rev Genet. 2013;14(7):447–59. pmid:23732335.
- 38. Faehnle CR, Joshua-Tor L. Argonautes confront new small RNAs. Curr Opin Chem Biol. 2007;11(5):569–77. pmid:17928262; PubMed Central PMCID: PMCPMC2077831.
- 39. Sanggaard KW, Bechsgaard JS, Fang X, Duan J, Dyrlund TF, Gupta V, et al. Spider genomes provide insight into composition and evolution of venom and silk. Nat Commun. 2014;5:3765. pmid:24801114; PubMed Central PMCID: PMCPMC4273655.
- 40. Antoniewski C. Computing siRNA and piRNA overlap signatures. Methods Mol Biol. 2014;1173:135–46. pmid:24920366.
- 41. Czech B, Malone CD, Zhou R, Stark A, Schlingeheyde C, Dus M, et al. An endogenous small interfering RNA pathway in Drosophila. Nature. 2008;453(7196):798–802. pmid:18463631; PubMed Central PMCID: PMCPMC2895258.
- 42. Wen J, Mohammed J, Bortolamiol-Becet D, Tsai H, Robine N, Westholm JO, et al. Diversity of miRNAs, siRNAs, and piRNAs across 25 Drosophila cell lines. Genome Res. 2014;24(7):1236–50. pmid:24985917; PubMed Central PMCID: PMCPMC4079977.
- 43. Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21(18):3674–6. pmid:16081474.
- 44. Okamura K, Balla S, Martin R, Liu N, Lai EC. Two distinct mechanisms generate endogenous siRNAs from bidirectional transcription in Drosophila melanogaster. Nat Struct Mol Biol. 2008;15(6):581–90. pmid:18500351; PubMed Central PMCID: PMCPMC2713754.
- 45. Mohn F, Sienski G, Handler D, Brennecke J. The rhino-deadlock-cutoff complex licenses noncanonical transcription of dual-strand piRNA clusters in Drosophila. 2014;(1097–4172 (Electronic)).
- 46. Fukunaga R, Colpan C, Han BW, Zamore PD. Inorganic phosphate blocks binding of pre-miRNA to Dicer-2 via its PAZ domain. EMBO J. 2014;33(4):371–84. pmid:24488111; PubMed Central PMCID: PMCPMC3989643.
- 47. Gao Z, Wang M, Blair D, Zheng Y, Dou Y. Phylogenetic analysis of the endoribonuclease Dicer family. PLoS One. 2014;9(4):e95350. pmid:24748168; PubMed Central PMCID: PMCPMC3991619.
- 48. Park JE, Heo I, Tian Y, Simanshu DK, Chang H, Jee D, et al. Dicer recognizes the 5' end of RNA for efficient and accurate processing. Nature. 2011;475(7355):201–5. pmid:21753850; PubMed Central PMCID: PMCPMC4693635.
- 49. Marr EJ, Sargison ND, Nisbet AJ, Burgess ST. Gene silencing by RNA interference in the house dust mite, Dermatophagoides pteronyssinus. Mol Cell Probes. 2015;29(6):522–6. pmid:26212476.
- 50. Fernando DD, Marr EJ, Zakrzewski M, Reynolds SL, Burgess STG, Fischer K. Gene silencing by RNA interference in Sarcoptes scabiei: a molecular tool to identify novel therapeutic targets. Parasit Vectors. 2017;10(1):289. pmid:28601087; PubMed Central PMCID: PMCPMC5466799.
- 51. McVeigh P, McCammick EM, McCusker P, Morphew RM, Mousley A, Abidi A, et al. RNAi dynamics in Juvenile Fasciola spp. Liver flukes reveals the persistence of gene silencing in vitro. PLoS Negl Trop Dis. 2014;8(9):e3185. pmid:25254508; PubMed Central PMCID: PMCPMC4177864.
- 52. Rider SD, Morgan MS, Arlian LG. Draft genome of the scabies mite. Parasites & Vectors. 2015;8(1):1–14. pmid:26555130
- 53. Shi Z, Montgomery TA, Qi Y, Ruvkun G. High-throughput sequencing reveals extraordinary fluidity of miRNA, piRNA, and siRNA pathways in nematodes. Genome Res. 2013;23(3):497–508. pmid:23363624; PubMed Central PMCID: PMCPMC3589538.
- 54. Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 2009;37(Web Server issue):W202–8. pmid:19458158; PubMed Central PMCID: PMCPMC2703892.
- 55. Billi AC, Freeberg MA, Day AM, Chun SY, Khivansara V, Kim JK. A Conserved Upstream Motif Orchestrates Autonomous, Germline-Enriched Expression of Caenorhabditis elegans piRNAs. PLoS Genet. 2013;9(3):e1003392. pmid:23516384
- 56. Flynt A, Liu N, Martin R, Lai EC. Dicing of viral replication intermediates during silencing of latent Drosophila viruses. Proc Natl Acad Sci U S A. 2009;106(13):5270–5. pmid:19251644; PubMed Central PMCID: PMCPMC2663985.
- 57. Montgomery TA, Rim Y-S, Zhang C, Dowen RH, Phillips CM, Fischer SEJ, et al. PIWI Associated siRNAs and piRNAs Specifically Require the Caenorhabditis elegans HEN1 Ortholog henn-1. PLoS Genet. 2012;8(4):e1002616. pmid:22536158
- 58. Saito K, Sakaguchi Y, Suzuki T, Suzuki T, Siomi H, Siomi MC. Pimet, the Drosophila homolog of HEN1, mediates 2′-O-methylation of Piwi- interacting RNAs at their 3′ ends. Genes & Development. 2007;21(13):1603–8.
- 59. Bewick AJ, Vogel KJ, Moore AJ, Schmitz RJ. Evolution of DNA Methylation across Insects. Mol Biol Evol. 2017;34(3):654–65. pmid:28025279; PubMed Central PMCID: PMCPMC5400375.
- 60. Geyer KK, Chalmers IW, MacKintosh N, Hirst JE, Geoghegan R, Badets M, et al. Cytosine methylation is a conserved epigenetic feature found throughout the phylum Platyhelminthes. BMC genomics. 2013;14(1):462. pmid:23837670
- 61. Maida Y, Masutomi K. RNA-dependent RNA polymerases in RNA silencing. Biol Chem. 2011;392(4):299–304. pmid:21294682.
- 62. Putnam NH, Butts T, Ferrier DE, Furlong RF, Hellsten U, Kawashima T, et al. The amphioxus genome and the evolution of the chordate karyotype. Nature. 2008;453(7198):1064–71. pmid:18563158.
- 63. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20. pmid:24695404; PubMed Central PMCID: PMCPMC4103590.
- 64. Xu H, Luo X, Qian J, Pang X, Song J, Qian G, et al. FastUniq: a fast de novo duplicates removal tool for paired short reads. PLoS One. 2012;7(12):e52249. pmid:23284954; PubMed Central PMCID: PMCPMC3527383.
- 65. Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience. 2012;1(1):18. pmid:23587118; PubMed Central PMCID: PMCPMC3626529.
- 66. Tatusova T, Ciufo S, Federhen S, Fedorov B, McVeigh R, O'Neill K, et al. Update on RefSeq microbial genomes resources. Nucleic Acids Res. 2015;43(Database issue):D599–605. pmid:25510495; PubMed Central PMCID: PMCPMC4383903.
- 67. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421. pmid:20003500; PubMed Central PMCID: PMCPMC2803857.
- 68. Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013;29(8):1072–5. pmid:23422339; PubMed Central PMCID: PMCPMC3624806.
- 69. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotech. 2010;28(5):511–5. http://www.nature.com/nbt/journal/v28/n5/abs/nbt.1621.html—supplementary-information.
- 70. Patel RK, Jain M. NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One. 2012;7(2):e30619. pmid:22312429; PubMed Central PMCID: PMCPMC3270013.
- 71. Goubau D, Schlee M, Deddouche S, Pruijssers AJ, Zillinger T, Goldeck M, et al. Antiviral immunity via RIG-I-mediated recognition of RNA bearing 5'-diphosphates. Nature. 2014;514(7522):372–5. pmid:25119032; PubMed Central PMCID: PMCPMC4201573.
- 72. Krueger F, Andrews SR. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics. 2011;27(11):1571–2. pmid:21493656; PubMed Central PMCID: PMCPMC3102221.
- 73. de Castro E, Sigrist CJ, Gattiker A, Bulliard V, Langendijk-Genevaux PS, Gasteiger E, et al. ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins. Nucleic Acids Res. 2006;34(Web Server issue):W362–5. pmid:16845026; PubMed Central PMCID: PMCPMC1538847.