Diversity and Expression of MicroRNAs in the Filarial Parasite, Brugia malayi

Human filarial parasites infect an estimated 120 million people in 80 countries worldwide causing blindness and the gross disfigurement of limbs and genitals. An understanding of RNA-mediated regulatory pathways in these parasites may open new avenues for treatment. Toward this goal, small RNAs from Brugia malayi adult females, males and microfilariae were cloned for deep-sequencing. From ∼30 million sequencing reads, 145 miRNAs were identified in the B. malayi genome. Some microRNAs were validated using the p19 RNA binding protein and qPCR. B. malayi miRNAs segregate into 99 families each defined by a unique seed sequence. Sixty-one of the miRNA families are highly conserved with homologues in arthropods, vertebrates and helminths. Of those miRNAs not highly conserved, homologues of 20 B. malayi miRNA families were found in vertebrates. Nine B. malayi miRNA families appear to be filarial-specific as orthologues were not found in other organisms. The miR-2 family is the largest in B. malayi with 11 members. Analysis of the sequences shows that six members result from a recent expansion of the family. Library comparisons found that 1/3 of the B. malayi miRNAs are differentially expressed. For example, miR-71 is 5–7X more highly expressed in microfilariae than adults. Studies suggest that in C.elegans, miR-71 may enhance longevity by targeting the DAF-2 pathway. Characterization of B. malayi miRNAs and their targets will enhance our understanding of their regulatory pathways in filariads and aid in the search for novel therapeutics.


Introduction
The lymphatic filarial parasites Brugia malayi, Brugia timori and Wuchereria bancrofti infect an estimated 120 million people in 80 countries worldwide [1]. They are transmitted by mosquitos harboring infective third stage larvae (L3s) that upon entering the vertebrate host, molt to L4s which mature to adulthood over the course of 6-12 months [2]. Adult parasites settle in the lymphatic vessels and mate producing microfilariae (mf). The mf can survive for up to a year migrating throughout the peripheral circulation waiting to be ingested by a mosquito during a blood meal [3].
Lymphatic filarial infections are characterized by recurrent fevers, painful adenolymphangitis and elephantiasis [4]. Although not considered fatal, the morbidity caused by filarial infections greatly impedes socio-economic development in affected communities [5]. Diethylcarbamazine (DEC), ivermectin and albendazole are the drugs commonly used to treat lymphatic filarial infections. All three kill microfilariae but only DEC exhibits limited efficacy against adult parasites [6]. The recent appearance of drug resistance against ivermectin [7] and the lack of good macrofilariacides necessitate the development of new approaches for combating this debilitating disease. The complex filarial life cycle and the inability to genetically manipulate the parasite make biological studies difficult. Recently, molecular approaches including EST and genome sequencing of B. malayi [8,9] and related filarids [10,11] have greatly expanded our knowledge of the expression and evolution of filarial genes. We have extended this approach by cloning and high-throughput sequencing of B. malayi small RNAs. An understanding of RNA-mediated regulatory pathways in filarial parasites may open new avenues for treatment. For example, identification of filarial-specific components of small RNA pathways or miRNAs may be leveraged for the development of novel anti-filarial agents.
Caenorhabditis elegans lin-4 was the first gene discovered to encode a small RNA and demonstrated to post-transcriptionally regulate LIN-14 protein levels by binding to complementary sequences in the 39UTR of its mRNA [12,13]. MicroRNAs function through ARGONAUTE proteins, a component of the RNA induced silencing complex (RISC). In general, microRNAs guide RISC to sequences in the 39 UTR of mRNAs complementary to nucleotides 2-7 of the miRNA known as the ''seed'' sequence [14,15,16] however, microRNA sequence outside of the seed can compensate for weak or imperfect seed pairing [15,17,18,19,20]. Once bound, mRNA stability and translational suppression is mediated through the interaction of miRNA-RISC with members of the GW182 protein family [21,22].
It is now known that miRNAs are ancient in origin. They are found in an evolutionarily diverse assortment of organisms ranging from sponges to vertebrates [23,24]. MicroRNAs in the free-living nematode, C. elegans are well characterized [25,26,27,28,29,30,31] but little is known about them in parasitic nematodes. Our initial work to characterize small RNAs in B. malayi identified 32 miRNAs using bioinformatic and cloning approaches [32]. C. elegans (100 Mb) and B. malayi (90-95 Mb) likely encode similar numbers of miRNAs given that their genome sizes are roughly equivalent [9]. The goal of this present study is a more comprehensive identification of miRNAs in B. malayi and to compare the findings to what is known in B. pahangi [33], a closely related filarial parasite and C. elegans. This study forms the background for understanding miRNA function in light of the complex B. malayi lifecycle and can be used as the basis for designing anti-miRNA compounds that are lethal to the parasite.

Library Overview
This publication is an in depth characterization of the diversity and expression of miRNAs from different stages of the human filarial parasite, B. malayi. Our initial publication [32] describes our analysis of only a few hundred small RNA reads while in this publication, we report the analysis of ,30 million reads for miRNAs from 3 different stages of the parasite.
Five small RNA libraries were prepared from B. malayi males, females and mf using 3 different protocols ( Table 1) that distinguish between differences in the phosphorylation states of small RNAs [34,35,36] and to minimize the prevalence of degradation products. The male, female and one mf library were prepared with calf intestinal phosphatase, (CIP) and T4 polynucleotide kinase. Treatment with CIP followed by T4 polynucleotide kinase enabled all small RNA populations including RNA degradation products with 59OH groups to ligate to the 59 linker. Although ,71-74% of the reads from the CIP libraries were $17 nt long and an exact match to the B. malayi genome, ,6-11% of reads matched the 18S rRNA gene indicating significant levels of degradation in these libraries (Table 1). To address this problem, two additional libraries (DIR and TAP) were prepared from the same mf RNA sample. These libraries were constructed using microfilariae because they are abundant and easier to obtain than adult parasites. In the DIR library, RNA is directly ligated to the 59 linker without pretreatment. Using this method, only small RNA populations, including endogenous miRNAs, with a single 59 phosphate and 39 OH will ligate to the linkers. In the mf TAP library, caps and polyphosphates on small RNA populations are digested to a single 59 phosphate using Tobacco Acid Pyrophosphatase prior to ligation with the 59 linker. Both the DIR and TAP protocols minimize ligation of degradation products to the 59 linker as demonstrated by the 45X reduction in the reads matching the B. malayi 18S rRNA gene from ,11% in the CIP libraries to 0.25% in the DIR and TAP libraries ( Table 1).
The TAP and DIR libraries were constructed from the same mf RNA sample and as expected, the relative abundance of miRNA reads from these two libraries is very similar: 98% of the standardized miRNA reads are within two fold (Tables S1 &  S2). However, comparison of miRNA reads from libraries prepared from separate RNA samples using different methods such as the CIP and DIR mf libraries, resulted in only 52% of the miRNA reads being within two fold (Tables S1 & S2). The comparison was made between miRNAs that represented more then 0.01% of total miRNA reads in a library.
Only 11 miRNAs were identified in the DIR and/or TAP mf libraries that were not found in the CIP mf library (Table S1). All these miRNAs were present at very low read numbers (,10) and were likely observed because of the increased depth at which the DIR and TAP libraries were sequenced compared to the CIP libraries (Table 1).
These libraries consist of a variety of overlapping as well as unique small RNA populations. The DIR library consists primarily of small RNAs processed by Dicer such as miRNAs [37,38] whereas the TAP library consists of capped and polyphosphorylated small RNA populations in addition to Dicer products [38].

MicroRNA Discovery and Family Assignment
A total of 145 miRNAs (from 129 hairpin sequences) were identified in the B. malayi small RNA libraries. These include 96 prevalent miRNAs ($100 reads in one of the 5 libraries, Table S3) and 49 rare (,100 reads, Table S4). Thirty-two of the prevalent miRNAs were described previously [32]. The B. malayi miRNAs segregate into 99 families each defined by a unique seed sequence ( [16,39], Figure 1). Sixty-six miRNAs segregate into 20 families of paralogues based on nucleotide identity in the 59 seed region or by global identity of $70% ( [40], Tables 2 and S5). The remaining 79 families consist of a single miRNA.
For comparison with B. malayi, 195 C. elegans miRNAs were downloaded from miRBase release 15 and analyzed using the same criteria. The C. elegans miRNAs segregate into a similar number of families (101) compared to B. malayi (99) consisting of 40 families of paralogues; twice the number found in B. malayi (Tables 2 and S6). About half of these B. malayi miRNA families, 48 families (representing 75 miRNAs) are also found in C. elegans (Tables 2 and S7).
Approximately 25% fewer B. malayi miRNAs have been identified than have been found in C. elegans (195, miRBase release 15), a nematode with a similar genome size [9,41]. There are several likely reasons for this difference. MiRDeep is reported to identify 70-90% of miRNAs depending on the data set being mined [42]. Because of its requirement for genomic sequence, miRNAs were probably missed because the B. malayi genome is only 75-80% complete [9]. We were able to identify 5 highly conserved miRNAs by searching the small RNA libraries directly. For example, we have been unable to identify the highly conserved miR-1 presumably because its DNA sequence is missing from the genome [32] but it was easily identified by directly searching the small RNA libraries with the C. elegans miR-1 sequence. It is one of the most abundant miRNAs to be identified and represents 8.9-11.5% of the total miRNAs in the adult and mf libraries (Tables S1-S3). A current version of the B. malayi genome is being curated by WormBase (www.wormbase.org). As the remainder of the B. malayi genome sequence becomes available, more miRNAs will likely be identified in these libraries as well as in small RNA libraries of other developmental stages.
The B. malayi miRNAs were also compared to miRNAs identified in B. pahangi [33]. Fifty-five of the 99 B. malayi miRNA families are conserved in B. pahangi based on nucleotide identity in the 59 seed region or by global identity of $70% ( [40]; Tables 3,  S8 and S9). This low level of homology between the two closely related Brugia species is likely explained by the methodology used to identify the B. pahangi miRNAs as well as the fact that different life cycle stages were sequenced from B. pahangi then in this work [33]. The B. pahangi miRNAs were identified based on B. malayi   (Table S9)  genome and any miRNAs with a single nt mismatch to the B. malayi genome were discarded [33].

MicroRNA validation using p19 and qPCR
Two different methods, other then sequencing, were used to measure the relative abundance of four miRNAs from three different stages of the parasite. The p19 protein from the Carnation Italian ringspot virus binds 20-23 nucleotide long dsRNA molecules in a size dependent, sequence independent manner. This protein does not bind ssRNA. Using a labeled RNA probe complementary to a specific miRNA, the p19 protein can isolate miRNA:RNA probe duplexes in a million fold excess of total RNA. Using a radioactive probe, this method can detect miRNAs in the sub-picogram range [43]. The selective binding properties of p19 have also been used in conjunction with nanopore [44] and electrochemical detection [45] to measure very low levels of miRNAs.
The p19 detection and qPCR were used to measure relative abundance of four miRNAs in males, females and mf. The miRNAs were chosen to represent a range of expression and stage specificity. The very abundant miR-71 represents 27% of all miRNA reads in the CIP mf library while miR-5842* only represents 0.016% of the reads in the mf CIP library (Table S2). For stage specific expression, miR-36c* was chosen because it is preferentially expressed in females (0.469%) compared to males (0.018%) and mf (0.001%; Table 4).
Standard curves were generated using synthetic miR-71 RNA oligos to obtain a quantitative measure of this miRNA in different stages (Figures 2 and S1). Using p19, the amount of miR-71 in mf, females and males was calculated to be 178, 47 and 38 pg/mg of total RNA respectively. Using qPCR, the amounts of miR-71 in mf, females and males was calculated to be 97, 32 and 18 pg/mg of total RNA respectively. This is in agreement with the miR-71 sequencing read data which shows that mf have 3-5X more miR-71 then B. malayi adults (Figures 2, S1 and Table 4).
The autoradiographs of the p19 isolated miRNA:RNA probe duplexes show that miR-36c* is preferentially detected in RNA from female parasites, while miR-71, miR-153 and miR-5842* are more abundant in RNA from mf then adults ( Figure 2). This is in agreement with the normalized read numbers from the CIP libraries ( Figure 2 and Table S2). With the exception of miR-5842*, the qPCR data exhibited a similar pattern ( Figure 2). Using qPCR, the quantity of miR-5842* was calculated to be 9.9 pg/mg in adult female RNA verses 3.8 pg/mg in mf RNA ( Figure 2), a 2.6X difference. Additional validation using both qPCR and p19 detection methods at the same time and with the same RNA sample is needed to resolve this discrepancy.

MicroRNA Conservation in the Hosts of Filarial Parasites
The Venn diagram in Figure 1 shows how the 99 B. malayi miRNA families are conserved in arthropods and vertebrates, the hosts of filarial parasites. When used to search miRBase (release 15), 61 (2/3) of the B. malayi miRNA families were found to be conserved in a wide range of organisms including arthropods, vertebrates and other helminths ( Figure 1). Homologues were identified in vertebrates but not arthropods for 20 B. malayi miRNA families. For 7 of the 20 families, a homologue was identified in either C. elegans or A. suum [46], but for 13 B. malayi families, the only homologues identified were in B. pahangi or a vertebrate. For all except bma-miR-5844, homologues were identified in vertebrates infected by filarial parasites including humans, horses, cattle, pigs and platypus. In contrast, homologues were identified in arthropods but not vertebrates for 6 B. malayi miRNA families indicating that the miRNAs in these families may Table 2. Comparison of miRNA Families in B. malayi and C. elegans. MicroRNAs sharing a minimum of 6 contiguous nt between positions 1-7 at the 59 end or that exhibit $70% identity overall in the absence of a conserved 59 end were grouped into families of paralogues. In B. malayi, 66 miRNAs were grouped into 20 families (Table S5). In C. elegans, 134 miRNAs were grouped into 40 families (Table S6)

Putative filarial-specific miRNAs
Nine of the 12 B. malayi miRNA families identified only in helminths, appear to be specific to filarial parasites ( Figure 1). Orthologues were identified for 8 of these in at least one other filarial species but not in other worms, arthropods or vertebrates (Table 5). No orthologues were identified for miR-9535 raising the possibility that this miRNA may be specific for B. malayi.
Bma-miR-5866 is one of the most abundant filarial-specific miRNAs. It is found in all the libraries but is most prevalent in the male library where it represents ,0.7% of the total miRNAs ( Figure 3B and Table S2). Orthologues of bma-miR-5866 were identified in B. pahangi [33] as well as in the WGS data of W. bancrofti and L. loa [10]. The mature miRNA sequence is identical in the filarial species ( Figure 3B and Table 5). Mature miR-5866 appears to be closely related to the miR-57 family. The miR-5866 seed sequence (TACCAT) is similar to the nucleotide sequence (TACCCT) at the 59 end of B. malayi and C. elegans miR-57 ( Figure 3B). Based on ClustalW alignments, bma-miR-5866 and bma-miR-57 are 67% identical overall. However, the filarial miR-5866 hairpin sequences only exhibit 34-40% identity with the hairpin sequences of B. malayi and C. elegans miR-57 (data not shown) suggesting that the miR-5866 and -57 families may have evolved from a common ancestor by gene duplication.
Of the putative filarial specific microRNAs, miR-9536 (Table  S4) is one of the most interesting because of its location in an intron of Bm1_03065, a cut-1 cuticlin gene fragment (genbank acc.#: XM_001892072). Cuticlin is the highly cross-linked and insoluble protein component of the nematode cuticle. Expression of the C. elegans cuticlin genes cut-1, -3 and -5 are tightly regulated at the transcriptional level and are involved with the formation of alae and control body shape [47]. Because of its location within an intron of Bm1_03065, miR-9536 expression is likely dependent on and coordinated with the expression of this cuticlin gene and suggests that its targets may be components of cuticlin biosynthesis or more generally involved with molting and cuticle synthesis.
These putative filarial specific miRNAs may be regulating some of the 20% of the proteome predicted to be the specific for B. malayi [9] such as pathways required for interactions with its vertebrate and arthropod hosts or with its Wolbachia endosymbiont [48].

MicroRNA clusters
In B. malayi, eight clusters containing a total of 19 miRNAs have been identified that span #2 kbps (Table S10) whereas in C. elegans, 18 clusters containing a total of 47 miRNAs can be found spanning #2 kb [49,50]. Fewer miRNA clusters have been identified in B. malayi compared to C. elegans; however, new clusters may be identified upon completion of the Brugia genome.
The majority of C. elegans clusters consist of multiple paralogues in a miRNA family. For example, 7 of the 8 miR-36 family members are clustered within 800 nt on chromosome II [51,52]. In B. malayi, the only family members found clustered are miR-100a and -100d [32] as well as miR-2i and -2d ( Figure 4 and Table S10).
The B. malayi miR-2 family and clustering With 11 members (9 individual sequences +2 duplicates), the miR-2 family is the largest in B. malayi and twice the size of the 5 member C. elegans miR-2 family ( Figure 5, Tables S5 and S6). The 6 nt consensus seed sequence, ATCACA is present in all family members except miR-2i and miR-2h which both have ATCGCA as a seed sequence ( Figure 5A). ClustalW alignment of mature miR-2i and miR-2h exhibit 82 and 68% nt identity with miR-2d respectively, despite the single nt change in the seed sequence (data not shown). In addition, alignment of the B. malayi miR-2 hairpins demonstrates that a majority of the highly conserved nt blocks (highlighted purple in Figure 5A) are conserved in both the miR-2i (5/7) and miR-2h (6/7) hairpins.
The phylogenetic tree derived from the miR-2 hairpin alignment suggests that miR-2b, -2d, -2f, -2i and -2h are derived from a common ancestor that underwent an expansion event ( Figure 5B). It also demonstrates that bma-miR-2a and cel-miR-2 are the most closely related family members being 100% identical [32].
Bma-miR-2a, -2b and -2e are the most highly expressed of the B. malayi miR-2 paralogues but have different patterns of expression ( Figure 5C, Tables 4 and S2). Bma-miR-2a represents ,2% of the miRNA population in mf, about 7.5X higher than the levels of this miRNA in either the male or female libraries whereas bma-miR-2b, -2c, -2d and -2e are all expressed in the adult libraries, none of them are found to be expressed at significant levels in the mf library ( Figure 5C, Tables 4 and S2). The remaining B. malayi miR-2 paralogues (miR-2f, -2g, -2h and -2i) are only present at very low levels (,100 reads) in the libraries (Table S4).
About half of the B. malayi miR-2 family members are located in clusters; most with paralogues of the miR-36 family (Figure 4 and Table S10). The miR-2b, -36 clusters represent a possible duplication event as they are 94% identical (Figure 4, scaffolds A and B). The miR-2d, -36d cluster on scaffold C may also represent a duplication event as it is 72 and 68% identical to the MicroRNAs sharing a minimum of 6 contiguous nt between positions 1-7 at the 59 end or that exhibit $70% identity overall in the absence of a conserved 59 end were grouped into families of paralogues. In B. malayi, 66 miRNAs were grouped into 20 families (Table S5). In B. pahangi, 38 miRNAs were grouped into 14 families (Table S8) miR-2b, -36 clusters on scaffolds A and B respectively. A fourth miR-2 paralogue, miR-2i, is also found clustered on scaffold C with miR-2d and -36d ( Figure 4 and Table S10). Using an updated assembly of the B. malayi genome, miR-2 clusters A-C were found arranged in 2 groups along the length of a 23 kb scaffold with a fourth miR-2 cluster consisting of miR-2h-2, miR-5839 and miR-2f (data not shown, personal communication, Elodie Ghedin). Unlike B. malayi, C. elegans miR-2 family members are not clustered with one another but with paralogues of the miR-44 family. Only one C. elegans miR-2 paralogue, miR-43, is found clustered with a miR-36 paralogue, miR-42 [49].
Clustering may result in coordinated expression. For example, miR-2b and its cluster partners miR-36c and -36b are more highly expressed in adult parasites, particularly females compared to mf (Tables 4 and S2) confirming previous Northern blot results [32]. In addition, the expression patterns of miR-2d and -36d as well as miR-2c and -5841 are also enhanced in adults compared to mf (Tables 4 and S2). Additional studies are necessary to determine if these clustered miRNAs are processed from the same RNA transcripts.
Although, the exact function of the miR-2, -36 clusters in B. malayi must await target identification, recent work in other systems suggests that they have a role in regulating programmed cell death. In D. melanogaster, the miR-2 family suppresses apotosis during embryogenesis by targeting the 39 UTRs of pro-apototic genes [53]. In addition, RISC immunoprecipitation studies identified the BH3-like pro-apototic gene, egl-1, [54], as a target of the miR-36 family in embryos [55].

Stage enhanced expression of miRNAs in B. malayi
Deep sequencing is very useful in determining relative expression levels of the same miRNA because it avoids the bias observed when comparing different sequences [56,57]. The number of normalized reads can be used for quantitative comparisons of miRNA levels in different developmental stages. To compare miRNAs levels across libraries, the raw read numbers of each miRNA in a library was normalized to 1 million total miRNA reads (Tables S1 and S2). Using this methodology, about 2/3 (51/85) of the prevalent miRNAs, excluding duplicates (Table  S3), are enriched in one CIP library over another as defined by a 5X difference in the % of normalized reads/million miRNA reads total (Tables 4 and S2).
For example, miR-71 represents ,27% of the total miRNAs in mf and is 4.6 and 7.2X more abundant than the levels of miR-71 observed in either the male or female CIP libraries, respectively (Tables 4 and S2). The levels of miR-71 in the CIP libraries were also validated using p19 and qPCR, (see section titled, ''Micro-RNA validation using p19 and qPCR''). In C. elegans, miR-71 promotes longevity [58,59]. Knockouts of miR-71 shortened the lifespan of mutant worms from about 20 to 10 days and this effect is mediated through activation of the DAF-16/FOXO transcription factor in the intestines [59,60]. In addition, miR-71 levels are up-regulated in L1 diapause and dauer larvae [61].
Our finding that miR-71 is one of the most abundantly expressed miRNAs in B. malayi mf suggests that it may function to regulate the longevity of mf which can circulate throughout the body for up to a year waiting to be ingested by a mosquito [3,62]. Homologues of several of the proteins predicted to be targeted by miR-71 in C.elegans have been identified in B. malayi including pdk-1 (genbank acc. EDP33519) and Daf-16 (genbank acc. XP_001901487).
Of the 31 miRNAs expressed more highly in adults than mf, 11 miRNAs are more highly expressed in females than males; 11 miRNAs are more highly expressed in males while 9 miRNAs appear to be expressed at about the same levels in both sexes (Tables 4 and S2).
For example, miR-36a one of the 4 miR-36 paralogues identified in B. malayi, represents 8% of the total miRNA population in the female CIP library but only ,0.4% of miRNAs in the male library (a 20X difference) and is only found at trace levels in mf. B. malayi miR-36c and -36b are also predominately expressed in the female CIP library but at lower levels compared to miR-36a ( Figure 3A, Tables 4 and S2). Based on Clustal W alignment, bma-miR-36a is closely related to (76% nt identity) miR-36 homologues from a wide variety of helminths including C. elegans, the marine polychaete Capitella teleta, the flatworm Schmidtea mediterranea and the parasitic blood fluke Schistosoma japonicum ( Figure 3A). Unlike the other B. malayi miR-36 paralogues, miR-36a is not clustered with a miR-2 paralogue (Figure 4). In C.elegans, the miRNA-35 family (miRNAs 35-42) is required for embryonic development. Its deletion results in embryonic and L1 lethality [51]. Interestingly, most expression of this family originates in the gonad during oogenesis and to a lesser extent during spermatogenesis [52].
Bma-miR-5838 is one of the 11 miRNAs more abundant in the male than the female CIP library. It represents ,0.7% of miRNAs in the male CIP library compared to only .02% in the female library (a 35X difference; Figure 3C, Tables 4 and S2). Clustal W alignment shows that bma-miR-5838 shares 8 nt at its 59 end which includes the seed sequence, GAGTAT, with Aedes aegypti miR-12 and human miR-496 ( Figure 3C). Future work will involve sequencing small RNA libraries from germline tissue dissected from adult parasites to determine whether the enhanced expression of miRNAs in adults derives from somatic or germline origins.

MicroRNA Target Identification and Chemotheraputic Development
The developmental stages of filarial parasites that transition between insect and vertebrate hosts have traditionally been targeted for chemotherapeutic control. However, the high expression levels of some miRNAs in adult parasites (Table 4) suggests that they might be useful for the design of macrofilaricides.
Although miRNAs target hundreds of transcripts, recent work indicates that miRNAs often target functionally related mRNAs within a biochemical pathway [59,61,63] and are often in feedback loops with transcription factors which themselves may regulate transcription of functionally related genes [64,65]. For example, miR-71 was found to target multiple components of the insulin-like pathway in C. elegans including the forkhead box O transcription factor, Daf-16 [59,60]. An association between miRNAs and transcription factors likely occurs in B. malayi as well. Bma-miR-5855 is located within an intron of Bm1_31015, Figure 2. Quantitative detection of miRNAs by p19 and qPCR. (A) Total RNA from B. malayi mf, males, females and yeast (0.5-5.0 mg) was probed with a 32 P-labeled miRNA specific ribo oligonucleotide followed by p19 detection. Eluted dsRNA duplexes were electrophoresed through 20% non-denaturing TBE gels. The positions of the double-stranded duplex (ds) and single-stranded oligonucleotide probe (ss) on each gel are denoted. Single-stranded probe can elute from the porous chitinase beads but represents ,1% of the total probe added to p19 reactions [43]. The normalized read numbers of each miRNA in the mf, male and female CIP libraries (Table S2)    The WGS sequence of O.volvulus (Ovo), W. bancrofti (Wba) and L. loa (Llo) were searched for orthologues using B. malayi miRNA hairpin sequences. 2 The B. pahangi data is from Winter et. al. [33]. 3 Nucleotides that differ from the B. malayi miRNA sequence are underlined. 4 The accession number & nucleotide position defining a hairpin for each miRNA is shown. 5 ND = not determined. The DG could not be determined for wba-miR-5838* because only a partial stemloop sequence is available. 6 When 2 miRNAs are excised from opposite arms of the same hairpin, a star (*) after the name generally denotes the less frequent form [25,50]. doi:10.1371/journal.pone.0096498.t005 annotated as a putative transcription factor and homologue of C. elegans unc-3 (Table S4). The development of chemotherapeutics against filarial-specific miRNAs provides a novel way to distrupt post-transcriptional regulation in parasites without disturbing the host's miRNA regulatory networks. Filarial-specific miRNAs might be targeted directly using modified antisense RNAs such as antagomirs [66,67]. Recent developments in antagomir technology have demonstrated that intravenously injected 8 mer locked nucleic acid oligonucleotides are readily distributed throughout mouse tissue and can efficiently silence whole miRNA families (the same seed sequence) with negligible off-target effects [68].
Using miRDeep [42], 145 miRNAs were predicted from ,30 million small RNA sequencing reads cloned from B. malayi adults and mf. The miRNAs segregate into 99 families each defined by a unique seed sequence. Two ligase independent methods, quanitative PCR and p19 [43] used to determine the quantity of four miRNAs from different stages of B. malayi showed good correlations with the sequencing read numbers from the libraries. Comparisons of the miRNA expression levels from the three CIP libraries demonstrated that about a third of the B. malayi miRNAs are differentially expressed. Sixty-one of the B. malayi families are widely conserved in arthropods, vertebrates and helminths. Approximately half of the families have a homologue with a seed match in B. pahangi and C. elegans. The miR-2 family, the largest in B. malayi with 11 paralogues, has expanded compared to C. elegans with 5 paralogues, respectively. Interestingly, homologues of 20 B. malayi miRNA families are conserved in vertebrates but not in other helminths or arthropods. Nine miRNA families appear to be filarial-specific as homologues were not identified in other organisms.
Characterization of the miRNAs from B. malayi is an important first step in determining the miRNA regulatory network in filarial parasites. Unlike free-living nematodes, filarial parasites encounter pronounced environmental changes when they transition between vertebrate and insect hosts. In addition, they must contend with their host's immune responses. It is likely that miRNAs have developed critical roles in these processes that could be considered for anti-filaricidals. Future work to identify the targets of these miRNAs, particularly of the filarial-specific miRNAs, may lead to new approaches for the development of therapies against this debilitating disease.

RNA isolation
RNA was isolated from B. malayi males, females and mf (TRS Laboratories, Athens, GA) using Trizol (Invitrogen) with a steel ball bearing as described previously [32].

Small RNA cloning and nucleotide sequencing
Small RNAs (18 to 30 nt) from B. malayi males, females and microfilariae were gel purified and cloned using a 59 ligationdependent protocol [36]. Essentially, the 59 ends of the small RNAs were dephosphorylated with calf intestinal phosphatase (CIP) from New England Biolabs before ligation of their 39 ends to a preadenylated DNA oligo. The 39-ligated-RNAs were rephosphorylated with polynucleotide kinase (PNK) and ligated to a 59 linker prior to first strand cDNA synthesis and amplification with Solexa sequencing primers. In addition, a direct (DIR) and a Tobacco Acid Pyrophosphatase (TAP) small RNA library were also prepared from mf. In the direct library, small RNAs were ligated to the 39 preadenylated DNA oligo and 59 linker without pretreatment with CIP and polynucleotide kinase. In the TAP library, small RNAs were incubated with Tobacco Acid Pyrophosphatase (Epicentre) that digests capped and multiphosphorylated small RNA ends to monophosphates prior to ligation with the preadenylated DNA oligo and 59 linker [36]. The cDNA libraries were sequenced by the University of Massachusetts (Worcester, MA) Deep Sequencing Core using an Illumina Genome Analyzer II.

Sequencing data analysis
Small RNA library profiles. As a basis for comparing the sequenced libraries, a profile was generated for each library that consists of four descriptive metrics. The first is the total number of raw reads returned from the Illumina sequencer. After removing the 39 linker (identified by its first 6 nt, CTGTAG) from the raw reads, all inserts ,17 nts long are discarded. The number of remaining reads with inserts $17 nt long comprise the second metric of the profile. This set of reads was then aligned to the B. malayi genome (Genbank accession #s: DS236884 to DS264093) and to the sequence of the 18S rRNA gene (GI: 2707744) using RazerS [69] with the alignment parameters set at 100% recognition rate and 100% identity. All reads that aligned to more than 100 locations were discarded to avoid reads that match to low complexity regions of the genome. The number of reads that are an exact match to the B. malayi genome and the number of 18S rRNA matching reads comprise the third and fourth metric of the profile respectively. The processed reads from each of the five sequenced libraries are in Files S1-S5.
miRDeep Analysis. MicroRNAs were primarily identified in the filtered Illumina sequencing data (inserts $17 nt long after removal of the 39 linker) using miRDeep with default settings [42]. The miRDeep algorithm is based on the model of miRNA biogenesis [70]. MiRDeep maps sequencing reads onto a genomic sequence then extracts the DNA sequence bracketing these alignments and determines whether the DNA sequence folds into a hairpin consistent with miRNA biogenesis. The miRDeep output was manually curated to confirm miRNA identifications.
Because the B. malayi genome is only 75-80% complete [9], the filtered illumina sequencing data was also searched directly with the full length sequences of highly conserved miRNAs or with conserved 7 nt seed sequences to identify miRNAs missed by miRDeep.
Calculation of miRNA prevalence in the CIP RNA libraries. The filtered libraries containing only inserts $17 nucleotides long (see small RNA library profiles) were searched for the mature form of each miRNA as well as for shorter and longer forms (to a maximum of 64 nucleotides). For each library, the total number of reads of each form of a miRNA was reported as well as the sum of all the forms. In order to compare miRNA levels across CIP libraries, the number of reads of each miRNA was normalized to 1 million total miRNA reads. A 5X difference in the expression level of a miRNA between normalized samples was chosen as the minimum threshold for differential expression.

MicroRNA family assignments
To identify paralogues, B. malayi miRNAs were aligned using BioPython's [71] pairwise2 module using match/mismatch scores of 1, 23 and gap open/extend penalties of 25/22. MicroRNAs were grouped into families if they shared a minimum of 6 consecutive nt with no mismatches in the first 9 bases of each sequence or if the global alignment between miRNAs exhibited a minimum of 60% identity [40]. A more stringent refinement of the initial output was made before making final family assignments using the following criteria: miRNAs were grouped into families if they shared a minimum of 6 contiguous nt in common between positions 1-6 or 2-7 [16]. Without nt identity at the 59 end, miRNAs were only considered family members if they exhibited $70% nucleotide identity overall [40]. Paralogous miRNA families in C. elegans (miRBase release 15, http://www.mirbase. org/) and B. pahangi [33] were identified using this same protocol. To identify homologues in other species, B. malayi miRNAs were aligned with C. elegans, human and mosquito (Aedes aegypti, Anopheles gambiae and Culex quinquefasciatus) miRNAs downloaded from miRBase (release 15) and B. pahangi miRNAs [33] using the above protocol. When a homologue was not identified, miRBase was searched directly for a seed match to any arthropod or vertebrate miRNA.

Phylogenetic Analysis
Hairpin sequences were aligned using MAFFT v6.833b [72] run with RNA structural alignment options: mafft-xinsi -reorderkimura 1 -ep 0.0. The bma-miR-2g hairpin was excluded from the alignment because the mature miRNA is on a different arm of the hairpin compared to all the other sequences. A Baysian Inference tree was calculated from the aligned sequences using MrBayes [73] at Phylogeny.fr (http://www.phylogeny.fr/) [74]. MrBayes v3.1.2 was run for 100000 generations with trees sampled every 10 generations. The likelihood model used 6 substitution types, invariable+gamma rate variation across sites, and the default (4by4) substitution model. The first 2500 trees were discarded when calculating the consensus to allow the models to converge.
MicroRNA detection with the p19 dsRNA binding protein B. malayi miR-71, miR-36c*, mir-153 and miR-5842* identified in the small RNA libraries were confirmed using the p19 miRNA detection kit (NEB, [43]). Essentially, total RNA (0.5-5.0 mgm) from B. malayi mf, males and females was mixed with 1-2 ng of a 32 P-labeled probe in 1X p19 binding buffer and hybridized for 2 hrs at 65uC with the exception of the miR-71 samples which were hybridized at 55uC. Typically, 50 ng of probe was endlabeled with [c -32 P] ATP (10 mCi/ml, 6000 Ci/mmol; Perkin Elmer) using T4 polynucleotide kinase (NEB). After heat killing the kinase, unincorporated label was separated from the probe by passing the reaction over an illustra microspin G-25 column (GE Healthcare Following hybridization, p19 beads (10 ml) were added and the reactions were incubated at RT for 2 hrs with shaking. After washing the p19 beads and elution, the miRNA: 32 P-labeled probe duplexes were electrophoresed through 20% acrylamide, nondenaturing TBE gels (Invitrogen) along side controls. Tortula yeast RNA (0.5-5.0 mg; US Biological) was used as a negative control. For positive controls, yeast RNA was spiked with 1-2 ng of a miRNA oligonucleotide (Table S3; Integrated DNA Technologies) complementary to the probe. Control samples were processed along side the B. malayi RNA samples.
After electrophoresis, the gels were dried and exposed to a Storage Phosphor Screen GP (Kodak) for 1-6 days that was then scanned using a Typhoon 9400 variable mode imager (GE Healthcare).
MicroRNA detection using qPCR. qPCR was performed essentially as described [75]   Supporting Information Figure S1 Quantitation of miR-71 in B. malayi using p19. A standard curve was generated by hybridizing defined quantities of synthetic B. malayi miR-71 with 1 ng of 32 P-labeled miR-71 probe as described [43]. (A) Autoradiograph of a 20% non-denaturing gel showing the miR-71:probe hybrid eluted from p19 beads. The mobility of the double-stranded (ds) miRNA:RNA probe duplex and single-stranded (ss) RNA probe is shown to the right of the gel. (B) The amount of radioactivity in 10 ml (1/5 th ) of the eluted miR-71: probe hybrid was measured by scintillation counting and plotted against known amounts of synthetic miR-71 to generate a standard curve.    (Table S1) for each miRNA were normalized to a theoretical 10 6 total miR reads and are presented as a % of the total miRNA reads in a given library. Values highlighted red indicate a 5X increase in the relative number of reads in one CIP library over another. (XLSX)  (Table S1). Rare miRNAs are listed in Table S4. When two miRNAs are generated from the same RNA hairpin, a star (*) after the name denotes the less abundant miRNA. (XLSX)  (Table S1). Prevalent miRNAs are listed in Table S3. When two miRNAs are generated from the same RNA hairpin, a star (*) after the name denotes the less abundant miRNA. (XLSX) . malayi miRNAs that are related by either a 59 7 nt match at positions 1-7 or 2-8, a 59 6 nt match at positions 1-6 or 2-7 or by overall identity of $70% with no 59 match. In addition, miRNAs with a 59 nt match that also exhibit overall identity of 60-69% are denoted by a ' symbol. (XLSX) Table S6 C. elegans miRNA paralogues. For each miRNA family, the table lists C. elegans miRNAs that are related by either a 59 7 nt match at positions 1-7 or 2-8, a 59 6 nt at positions 1-6 or 2-7 or by overall identity of $70% with no 59 match. In addition, miRNAs with a 59 nt match that also exhibit overall identity of 60-69% are denoted by a ' symbol. (XLSX)

Table S7
Homologous miRNAs in B. malayi and C. elegans. For each miRNA family, the table lists the B. malayi members that have homologues in C. elegans. The relationship is indicated as either a 59 7 nt match at positions 1-7 or 2-8, a 59 6 nt at positions 1-6 or 2-7 or by overall identity of $70% with no 59 match. In addition, those miRNAs with a 59 nt match that also exhibit overall identity of 60-69% are denoted by a ' symbol. (XLSX) pahangi miRNAs that are related by either a 59 7 nt match at positions 1-7 or 2-8, a 59 6 nt at positions 1-6 or 2-7 or by overall identity of $70% with no 59 match. In addition, miRNAs with a 59 nt match that also exhibit overall identity of 60-69% are denoted by a ' symbol. (XLSX) pahangi. The relationship is indicated as either a 59 7 nt match at positions 1-7 or 2-8, a 59 6 nt at positions 1-6 or 2-7 or by overall identity of $70% with no 59 match. In addition, those miRNAs with a 59 nt match that also exhibit overall identity of 60-69% are denoted by a ' symbol. (XLSX) Table S10 B. malayi miRNA clusters. The table lists the miRNAs that are clustered in the B. malayi genome. The orientation (forward or reverse strand) and relative position of the miRNAs in a cluster is based on the annotated B. malayi genome. The bp distance between the miRNAs in a cluster is shown in the right hand column. MicroRNA-9 and miR-79 are on opposite arms of the same hairpin. (XLSX) File S1 Female CIP library. Solexa reads from the female B. malayi small RNA library. The RNA for this library was treated with CIP and then T4 PNK prior to linker ligation allowing ligation of all small RNAs regardless of their phosphorylation status. These reads are $17 nt long and an exact match to the B. malayi genome. Each sequence in the library is preceded by a line containing the following information: 1) the library from which the sequence is derived (F = female), 2) a unique alphanumeric identifier for that sequence and 3) the number of reads for that sequence in the library. (ZIP) File S2 Male CIP library. Solexa reads from the male B. malayi small RNA library. The RNA for this library was treated with CIP and then T4 PNK prior to linker ligation allowing ligation of all small RNAs regardless of their phosphorylation status. These reads are $17 nt long and an exact match to the B. malayi genome. Each sequence in the library is preceded by a line containing the following information: 1) the library from which the sequence is derived (M = male), 2) a unique alphanumeric identifier for that sequence and 3) the number of reads for that sequence in the library. (ZIP) File S3 Microfilarial CIP library. Solexa reads from the B. malayi mf small RNA library. The RNA for this library was treated with CIP and then T4 PNK prior to linker ligation allowing ligation of all small RNAs regardless of their phosphorylation status. These reads are $17 nt long and an exact match to the B. malayi genome. Each sequence in the library is preceded by a line containing the following information: 1) the library from which the sequence is derived (U = mf), 2) a unique alphanumeric identifier for that sequence and 3) the number of reads for that sequence in the library. (ZIP) File S4 Microfilarial TAP library. Solexa reads from the B. malayi mf TAP small RNA library. The RNA for this library was treated with tobacco acid pyrophosphates, TAP, prior to linker ligation allowing ligation and sequencing of all small RNAs that may have a 59 cap or 59 triphosphate. These reads are $17 nt long and an exact match to the B. malayi genome. Each sequence in the library is preceded by a line containing the following information: 1) the library from which the sequence is derived (U_TAP = mf TAP), 2) a unique alphanumeric identifier for that sequence and 3) the number of reads for that sequence in the library. (ZIP) File S5 Microfilarial DIR library. Solexa reads from the B. malayi mf DIR small RNA library. The RNA for this library was not treated enzymatically prior to linker ligation. This selects for RNAs such as miRNAs, that already have a 59 phosphate and a 39 OH. These reads are $17 nt long and an exact match to the B. malayi genome. Each sequence in the library is preceded by a line containing the following information: 1) the library from which the sequence is derived (U_DIR = mf DIR), 2) a unique alphanumeric identifier for that sequence and 3) the number of reads for that sequence in the library. (ZIP)