Mitochondrial Genome of the Eyeworm, Thelazia callipaeda (Nematoda: Spirurida), as the First Representative from the Family Thelaziidae

Human thelaziosis is an underestimated parasitic disease caused by Thelazia species (Spirurida: Thelaziidae). The oriental eyeworm, Thelazia callipaeda, infects a range of mammalian definitive hosts, including canids, felids and humans. Although this zoonotic parasite is of socio-economic significance in Asian countries, its genetics, epidemiology and biology are poorly understood. Mitochondrial (mt) DNA is known to provide useful genetic markers to underpin fundamental investigations, but no mt genome had been characterized for any members of the family Thelaziidae. In the present study, we sequenced and characterized the mt genome of T. callipaeda. This AT-rich (74.6%) mt genome (13,668 bp) is circular and contains 12 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes, but lacks an atp8 gene. All protein-coding genes are transcribed in the same direction; the gene order is the same as those of Dirofilaria immitis and Setaria digitata (Onchocercidae), but distinct from Dracunculus medinensis (Dracunculidae) and Heliconema longissimum (Physalopteridae). Phylogenetic analyses of the concatenated amino acid sequence data for all 12 protein-coding genes by Bayesian inference (BI) showed that T. callipaeda (Thelaziidae) is related to the family Onchocercidae. This is the first mt genome of any member of the family Thelaziidae and should represent a new source of genetic markers for studying the epidemiology, ecology, population genetics and systematics of this parasite of humans and other mammals.


Introduction
Thelazia callipaeda Railliet and Henry, 1910, known as the 'oriental eye-worm', because of its geographical distribution in Asian countries (including China, India, Japan, Korea and Thailand), is frequently reported as being responsible for thelaziosis of humans, carnivores (dogs, foxes and cats) and rabbits, causing mild to severe clinical signs (including lacrimation, epiphora, conjunctivitis, keratitis and/or sometimes corneal ulcers) [1]. Fortunately, thelaziosis can be treated effectively using anthelminthics, such as milbemycin oxime or macrocyclic lactones (e.g., moxidectin), and anti-inflammatory compounds [2][3][4]. Although T. callipaeda may seem to be of minor importance to some clinicians and scientists, human thelaziosis is highly endemic in some under-developed communities in Asia, particularly in China [3]. Clearly, scant attention has been paid to human thelaziosis, and there are difficulties in its clinical diagnosis and differentiation from allergic conjunctivitis, particularly when small numbers of adult or larval stages of T. callipaeda are present in the eyes of infected patients.
The transmission of human thelaziosis occurs when the intermediate host, a drosophilid fly of the genus Phortica, feeds on lacrimal secretions from humans and other animals, and ingests first-stage larvae (L1s) produced by adult females of T. callipaeda, which live together with males in the conjunctival sac. After being ingested by the fly, the T. callipaeda larvae migrate in the vector's body (i.e. testis of the male) and undergo development from the L1 to the infective, third-stage larvae (L3) within 14-21 days. Following this migration, the L3s of Thelazia emerge from the labella of the infected fly, are deposited on the eye, as the vector feeds on lacrimal secretions, and then develop into the dioecious adult stages in the ocular cavity within ~35 days [5].
In spite of the significance of human thelaziosis, little is known about the biology and epidemiology of T. callipaeda and its close relatives [1,3]. This relates mainly to a lack of reliable morphological characters for their specific identification and for comparative study. Although molecular tools, employing genetic markers in short regions of nuclear ribosomal and mitochondrial (mt) DNA, have found utility for taxonomic and epidemiological studies of some species, such as T. gulosa, T. rhodesi, T. skrjabini, T. lacrymalis and T. callipaeda [1], there is still a paucity of information on T. callipaeda in different human populations and countries around the world.
Mt genomes can provide markers for genetic and epidemiological investigations of spirurid nematodes (e.g., [6,7]), and provide the potential to discover population variants or cryptic species and investigate transmission patterns linked to particular haplotypes [8][9][10][11]. In addition, mt proteomic datasets could be used for reassessing systematic relationships of Thelazia and other spirurids. Recent studies have shown that concatenated mt proteomic datasets can be used effectively to retest hypotheses regarding the systematic relationships of different groups of nematodes (e.g., [12,13]). Such amino acid sequence datasets are relatively large and informative, usually achieving excellent phylogenetic signal and mostly attaining nodal support values of 98-100% in tree reconstructions [12,13]. Long-range PCR-coupled sequencing and bioinformatic methods [14,15] have underpinned these advances, which now allow mt proteomic barcodes to be defined rapidly for Thelazia spp. and their relatives from a range of mammalian and invertebrate hosts. Here, as a first step, we (i) determined the sequence and structure of the mt genome for T. callipaeda, (ii) assessed the phylogenetic position of this nematode in relation to other Spirurida for which whole mt sequence datasets are available, and (iii) discussed the implications of the new dataset as a new resource for future genetic studies of T. callipaeda populations.

Ethics statement
This study was approved by the Animal Ethics Committee of Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences (Approval No. LVRIAEC2010-008). The dog from which the adult specimens of T. callipaeda were collected was handled in accordance with good animal practice (GAP) required by the Animal Ethics Procedures and Guidelines of the People's Republic of China.

Parasites and total genomic DNA isolation
Adult specimens of T. callipaeda were collected from a conjunctival sac of an infected dog at a veterinary hospital in Zhanjiang, Guangdong Province, China. The worms were washed extensively in physiological saline, fixed in ethanol and then stored at −20 °C until use. Upon thawing, the anterior and posterior ends of each nematode were cut off and cleared in lactophenol for subsequent morphological identification [16]. The mid-body section of each worm was used for the isolation of total genomic DNA by small-scale sodium dodecyl-sulphate (SDS)/proteinase K digestion [17] and mini-column purification (TIANamp Genomic DNA kit). The molecular identity of each specimen was then verified by PCR-based sequencing of regions in the cox1 and rrnS genes using an established method [18,19], and both regions had 99% identity to previously published sequences for T. callipaeda from China and Italy (GenBank accession nos. AM042555 and AJ544858, respectively).

Long-PCR, sequencing and annotation
Using primers (Table S1) designed to relatively conserved regions within the cox1 and rrnS regions (see Figure 1), the complete mt genome was amplified by long-PCR as two overlapping amplicons (~5 kb and ~9 kb) from the genomic DNA from the mid-body section of a single female specimen of T. callipaeda. PCR was conducted in 25 µl using 2 mM MgCl2, 0.2 mM each of dNTPs, 2.5 µl 10× Taq buffer, 2.5 µM of each primer and 0.5 µl LA Taq DNA polymerase (5 U/µl, Takara) in a thermocycler (Biometra) under the following conditions: 92 °C for 2 min (initial denaturation), then 92 °C for 10 s (denaturation), 58 °C (~5 kb) or 42 °C (~9 kb) for 30 s (annealing), and 60 °C for 10 min (extension) for 10 cycles, followed by 92 °C for 10 s, 58 °C (~5 kb) or 42 °C (~9 kb) for 30 s (annealing), and 60 °C for 10 min for 20 cycles, with a cycle elongation of 10 s for each cycle and a final extension at 60 °C for 10 min. Genomic DNA (80 ng in 2 µl) was added to PCR, and no-template and known-positive controls were included in each run. Amplicons were column-purified (Wizard PCR Preps, Promega). Subsequently, the amount of DNA in each purified amplicon was estimated spectrophotometrically (ND-1000 UV-VIS spectrophotometer, v.3.2.1, NanoDrop Technologies). Following an electrophoretic analysis of quality, purified amplicons were sequenced using a primer walking strategy [20]. The whole mt genome sequence (GenBank accession no. JX069968) was then assembled using the ContigExpress program of the Vector NTI software package v.6.0 (Invitrogen, Carlsbad, CA).
The mt genome was annotated using an approach similar to that of Yatawara et al. [21]. In brief, each protein-encoding mt gene was identified by local alignment comparison using amino acid sequences conceptually translated from corresponding genes from the mt genome of a reference species (i.e. Setaria digitata; accession number: NC_014282) [21]. The tRNA (trn) genes were identified using the program tRNAscan-SE [22] or by visual inspection [23]; rRNA (rrn) genes were predicted by comparison with those of S. digitata [21].

Author Summary
Human thelaziosis is an underestimated parasitic disease caused by the eyeworm Thelazia callipaeda (Spirurida: Thelaziidae). Although this parasite is of significance in humans in many Asian countries, its genetics, epidemiology and biology are poorly understood. Mitochondrial (mt) DNA can provide useful genetic markers for fundamental investigations, but no mt genome had been characterized for any members of the family Thelaziidae. In this study, we sequenced and characterized the mt genome of T. callipaeda. This circular mt genome is 13,668 bp long and contains 12 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes, but lacks an atp8 gene. Phylogenetic analyses of the concatenated amino acid sequence data for all 12 protein-coding genes by Bayesian inference showed that T. callipaeda is closely related to the family Onchocercidae, consistent with a previous study. This is the first mt genome of any member of the family Thelaziidae, and represents a new source of genetic markers for studies of the epidemiology, ecology, population genetics and systematics of this parasite of human and animal health significance.

Results and Discussion
General features of the mt genome of T. callipaeda
The complete mt genome sequence of T. callipaeda (GenBank accession no. JX069968) was 13,668 bp in length (Figure 1). This genome contains 12 protein-coding genes (cox1-3, nad1-6, nad4L, atp6 and cytb), 22 trn genes, two rrn genes (rrnL and rrnS) and a non-coding (control or AT-rich) region, but lacks an atp8 gene (Table 1). The gene content and arrangement are the same as those of D. immitis and S. digitata [21,26], but distinct from those of D. medinensis (which is markedly rearranged) and H. longissimum (in which tRNA-Met and tRNA-Val differ in position) [13]. All genes are transcribed in the same direction. In addition, the mt genes of T. callipaeda overlap by a total of 102 nt in 14 locations (1 to 32 nt per location) (Table 1). The mt genome of T. callipaeda has 14 intergenic regions, which range from 1 to 62 nt in length. The longest region is between tRNA-Pro and tRNA-Asp (Table 1).
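Overlaps and intergenic regions of the kind summarized above can be derived directly from annotated gene coordinates. The following is a minimal sketch, assuming 1-based inclusive coordinates on a circular genome; the function name `junction_gaps` and the toy gene names and coordinates are hypothetical and are not taken from Table 1.

```python
from typing import List, Tuple

# (name, start, end) with 1-based, inclusive coordinates (an assumed convention)
Gene = Tuple[str, int, int]


def junction_gaps(genes: List[Gene], genome_length: int) -> List[Tuple[str, str, int]]:
    """Return (left_gene, right_gene, gap) for each pair of neighbouring genes.

    gap > 0: intergenic spacer of that many nt
    gap < 0: overlap of abs(gap) nt
    gap == 0: genes abut directly
    The last gene is paired with the first one because the genome is circular.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    result = []
    for i, (name, _start, end) in enumerate(ordered):
        next_name, next_start, _ = ordered[(i + 1) % len(ordered)]
        if i == len(ordered) - 1:
            next_start += genome_length  # wrap around the origin
        result.append((name, next_name, next_start - end - 1))
    return result


# Hypothetical toy annotation on a 100-nt circular genome
toy = [("geneA", 1, 30), ("geneB", 28, 50), ("geneC", 55, 95)]
gaps = junction_gaps(toy, 100)
overlap_total = sum(-g for _, _, g in gaps if g < 0)
spacers = [g for _, _, g in gaps if g > 0]
print(gaps)
print("total overlap (nt):", overlap_total, "| spacer lengths:", spacers)
```

Note that an annotated non-coding region would need to be included as a feature in its own right; otherwise it would be counted as a spacer in this sketch.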
The nucleotide content of the entire mt genome sequence of T. callipaeda is biased toward A+T (74.6%), in accordance with mt genomes of other nematodes of the order Spirurida (e.g., [21,26]) (Table 2). One non-coding region (AT-loop) (328 bp), located between cox3 and tRNA-Ala, has the highest A+T content of 79.6% (Table 2). AT- and GC-skews of the whole mt genome were calculated for T. callipaeda and other spirurid nematodes studied to date (see Table 3). The composition of the mt genome sequence of T. callipaeda was strongly skewed away from A, in favour of T (AT skew = −0.40), and the GC skew was 0.449 (Table 3). All spirurid nematodes reported to date and in the present study show strand asymmetry (GC skew between 0.354 and 0.521) (Table 3).
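For readers wishing to reproduce composition statistics of this kind, the sketch below assumes the conventional definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C); the text does not state the exact formulas used, so this is an assumption. The function name `base_composition` and the toy sequence are hypothetical.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Compute A+T content and AT/GC skew for one strand of a DNA sequence.

    Assumed definitions (not stated in the text):
        AT skew = (A - T) / (A + T)
        GC skew = (G - C) / (G + C)
    Ambiguous bases (e.g. N) are ignored. The sequence must contain at least
    one A or T and at least one G or C, otherwise a division by zero occurs.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c
    return {
        "at_content": (a + t) / total,
        "at_skew": (a - t) / (a + t),
        "gc_skew": (g - c) / (g + c),
    }


# Hypothetical toy sequence: A=2, T=6, G=3, C=1
stats = base_composition("AATTTTTTGGGC")
print(stats)
```

A negative AT skew indicates more T than A on the strand analysed, and a positive GC skew indicates more G than C, which is the pattern reported for T. callipaeda.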

Protein-encoding genes
The boundaries between protein-coding genes of the mt genome of T. callipaeda were determined by aligning its sequence with those of H. longissimum and S. digitata and by identifying translation initiation and termination codons [13,21]. In this mt genome, all protein-coding genes had ATT, ATA or TTG as their initiation codon, and TAA or TAG as their termination codon. Incomplete termination codons (T or TA) were not identified, in contrast to findings for some other nematodes, including Anisakis simplex (s. l.), Ascaris suum, Caenorhabditis elegans, S. digitata, Toxocara spp. and Trichinella spiralis [21][33][34][35][36]. Codons composed of A and T were more frequently used in protein-coding genes, reflecting the high A+T content in the mt genome of T. callipaeda. The most frequently used amino acid was Phe (19.3%), followed by Leu (13.4%), Val (7.6%), Gly (7.1%) and Ile (6.2%) (Table 4).
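Checks of this kind are straightforward to script. The sketch below tests whether a coding sequence begins and ends with the initiation and termination codons reported above, and tabulates amino acid usage from a conceptually translated protein sequence. The function names and toy sequences are hypothetical; translation itself (which would require the invertebrate mt genetic code) is deliberately left out.

```python
from collections import Counter

START_CODONS = {"ATT", "ATA", "TTG"}  # initiation codons reported in the text
STOP_CODONS = {"TAA", "TAG"}          # complete termination codons reported in the text


def check_cds(cds: str) -> dict:
    """Report whether a coding sequence has an expected start and a complete stop codon."""
    cds = cds.upper()
    return {
        "start_ok": cds[:3] in START_CODONS,
        # a complete stop codon requires a length divisible by three
        "stop_ok": len(cds) % 3 == 0 and cds[-3:] in STOP_CODONS,
    }


def amino_acid_usage(protein: str) -> dict:
    """Return the percentage of each amino acid, most frequent first."""
    counts = Counter(protein.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.most_common()}


# Hypothetical toy inputs
print(check_cds("ATTTTTGGTTAA"))
print(check_cds("ATGTTTTA"))
print(amino_acid_usage("FFLLFV"))
```

A sequence ending in an incomplete stop codon (T or TA) would fail the `stop_ok` check here, which is how such cases could be flagged for manual inspection.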

Other genes
In the mt genome of T. callipaeda, rrnL was located between tRNA-His and nad3, and rrnS was between nad4L and tRNA-Tyr (Table 1). The sizes of the rrnL and rrnS genes of T. callipaeda were 966 bp and 666 bp, respectively (Table 1). The 22 trn genes ranged from 52 to 66 bp in size. The secondary structures predicted for the latter genes were similar to those of S. digitata [21].

Substitution ratios
As synonymous and non-synonymous substitution rates assist in predicting evolutionary processes [37], the rate of non-synonymous substitutions (Ka), the rate of synonymous substitutions (Ks) and the Ka/Ks ratios were calculated for all 12 protein-coding genes encoded in the mt genomes of T. callipaeda and 11 other spirurid nematodes, including A. viteae, B. malayi, C. quiscali, L. loa, S. digitata and W. bancrofti (Table 3). The Ka/Ks ratio is a measure of the selective pressure acting on a gene, indicating neutral mutation (Ka/Ks = 1), negative or purifying selection (Ka/Ks < 1), or positive or diversifying selection (Ka/Ks > 1) [38,39]. Here, nad2 showed the highest ratio, followed by nad3, while cox1 appeared to have the lowest ratio (Figure 2). Notably, the Ka/Ks ratio of eight protein-coding genes was < 1 (range: 0.346 to 0.873), indicating that these genes are evolving under negative or purifying selection [40,41]. The Ka/Ks ratio of four protein-coding genes (nad2, nad3, nad5 and nad6) was > 1 (range: 1.105 to 1.331), suggesting that these genes have evolved under positive or diversifying selection [42].
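The interpretation rule described above can be written as a small helper. This is an illustrative sketch only: the function name `classify_ka_ks` and the tolerance used to treat a ratio as equal to 1 are assumptions, and the estimation of Ka and Ks themselves (which depends on the substitution model chosen) is not shown. The input values are the range endpoints quoted in the text, without any assignment to particular genes.

```python
def classify_ka_ks(ratio: float, tol: float = 1e-9) -> str:
    """Map a Ka/Ks ratio onto the three categories described in the text."""
    if ratio < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Range endpoints reported in the text (gene labels intentionally omitted)
for value in (0.346, 0.873, 1.105, 1.331):
    print(f"Ka/Ks = {value:.3f}: {classify_ka_ks(value)} selection")
```

In practice, ratios close to 1 are better judged with a statistical test than with a fixed threshold, so the categories returned here should be read as a first-pass summary.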

Sequence comparisons and phylogenetic relationships of T. callipaeda with selected members of the Spirurida
The amino acid sequences predicted from individual protein-coding mt genes of T. callipaeda were compared with those of 11 other spirurid nematodes (see Table 5). Pairwise comparisons of the concatenated amino acid sequences revealed identities of 40.3-91.8% among them. Based on identity, COX1 was the most conserved protein, whereas NAD4L and NAD3 were the least conserved (see Table 5). Phylogenetic analysis of the concatenated amino acid sequence data for all 12 mt proteins showed that T. callipaeda (Thelaziidae) was a sister taxon to a clade containing S. digitata (Setariidae) and other members of the Onchocercidae, including B. malayi and D. immitis, consistent with results of a previous study [18]. Basal to these taxa were H. longissimum (Physalopteridae) and D. medinensis (Dracunculidae) (posterior probability = 1.00) (Figure 3).
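Pairwise identities such as those reported above can be computed from aligned sequences along the following lines. This sketch assumes that the two sequences have already been aligned (gaps written as "-") and that columns containing a gap in either sequence are excluded from the denominator; the text does not specify how gaps were treated, so other conventions would give slightly different values. The function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumed convention: skip columns with a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 6 gap-free columns, 5 of them identical
identity = percent_identity("MFLV-KL", "MFIVGKL")
print(f"{identity:.1f}% identity")
```

Applying the same function to each pair of taxa, protein by protein or on the concatenated alignment, would yield an identity matrix comparable in form to Table 5.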

Fundamental and applied implications
Although much attention has been paid to soil-transmitted helminths as pathogens because of their major socioeconomic impact on human populations [43,44], parasitic nematodes that cause relatively subtle, but chronic disease, such as members of the genus Thelazia, have been seriously neglected [3,45]. The main reservoirs for human thelaziosis seem to be dogs, since they often live in areas populated by a large entomo-fauna [1,46]. T. callipaeda is usually prevalent in dogs, cats and humans in disadvantaged, rural areas of the former Soviet Union [47] and the Asian continent, including China [48], India [49], Indonesia [50], Japan [51], Korea [52], Taiwan [53] and Thailand [54,55]. More recently, T. callipaeda has also been reported in Europe, with a high prevalence (60%) in dogs being recorded in some areas of Southern Italy [56]. Autochthonous cases of canine thelaziosis have also been recorded in France [57,58], Portugal [59], Spain [60] and Switzerland [61], suggesting that the latitude range of endemicity of canine thelaziosis in Europe (between 39° and 46° N) is similar to that of Asia (between 10° and 45° N for India and Japan) [56]. Interestingly, in spite of the high prevalence of canine thelaziosis reported for southern parts of Europe [56], only a small number of human cases have been reported in this geographical region to date [45].
In the present study, the characterization of the mt genome of T. callipaeda provides a foundation for the improved diagnosis of human thelaziosis using molecular methods as well as future, detailed studies of the population genetics and epidemiology/ecology of this parasite in Asia. As adult and larval stages of T. callipaeda from the eyes of patients cannot be identified reliably by morphology to species, molecular tools, using genetic markers in the first internal transcribed spacer (ITS-1) region of nuclear rDNA and cox1, have been used to support clinical diagnosis and to assist in undertaking molecular epidemiological investigations of T. callipaeda [1,19]. Because sequence heterogeneity in ITS rDNA can be high in individual spirurid specimens (e.g., [62]), sometimes complicating sequence analyses, protein-coding mt genes appear to be better suited for such studies [19].
Having the mt genome of T. callipaeda available now sets the scene to develop combined DNA-based analytical and diagnostic tools, whereby mt genetic regions with differing levels of within- [...] [17], already effectively applied, on a small scale, to T. callipaeda [19]. A previous investigation, employing cox1 alone, showed that, despite a relatively high degree of genetic variability among specimens isolated from Asia (i.e. China and Korea), no genetic variation was detected among individual specimens from different host species (i.e. dogs, cats and foxes) and localities within Europe (i.e. France, Germany, Italy, the Netherlands and Spain) [19]. These data were supported by additional studies [58][59][60], suggesting a genetically homogeneous population in Europe, a tighter affiliation of this nematode to intermediate hosts than to the definitive hosts, and, thus, that the distribution of the parasite might be expected to resemble that of the vector [19]. Nonetheless, cox1 is a relatively conserved mt gene [63], and, to date, there is no genetic information for T. callipaeda from humans. In the future, it would be interesting to assess whether various haplotypes or genotypes of T. callipaeda might relate to different clinical symptoms of thelaziosis in humans, and whether particular subpopulations of T. callipaeda undergo arrested development (hypobiosis, hibernation or aestivation) and survive for long periods of time in their intermediate hosts, as indicated in southern Europe [5]. Ecological aspects would be interesting to study in T. callipaeda in different countries and hosts, particularly dogs and cats, given the apparent complexity of the parasite's life cycle and biology. In addition, human disease management requires improved molecular tools to track the transmission of T. callipaeda and to address the question as to whether infected domestic and wild canids and felids represent reservoir hosts for infection of humans in the same natural environment.
The characterization of the mt genome of T. callipaeda also stimulates a reassessment of the systematic relationships of nematodes within the order Spirurida using mt genomic/ [...] [64,65]. Given the demonstrated utility of mt proteomic datasets, with their high phylogenetic signal and strong statistical support in trees [12,13], there is now an opportunity to test the phylogenetic relationships of a wide range of spirurid nematodes using expanded mt datasets.