Population structure analysis of the neglected parasite Thelazia callipaeda revealed high genetic diversity in Eastern Asia isolates

Background Thelazia callipaeda is the causative agent of thelaziasis in canids, felids and humans. However, the population genetic structure regarding this parasite remains unclear. Methodology/principal findings In this study, we first explored the genetic variation of 32 T. callipaeda clinical isolates using the following multi-molecular markers: cox1, cytb, 12S rDNA, ITS1 and 18S rDNA. The isolates were collected from 13 patients from 11 geographical locations in China. Next, the population structure of T. callipaeda from Europe and other Asian countries was analyzed using the cox1 sequences collected during this study and from the GenBank database. In general, the Chinese clinical isolates of T. callipaeda expressed high genetic diversity. Based on the cox1 gene, a total of 21 haplotypes were identified. One only circulated in European countries (Hap1), while the other 20 haplotypes were dispersed in Korea, Japan and China. There were five nucleotide positions in the cox1 sequences that were confirmed as invariable among individuals from Europe and Asia, but the sequences were distinct between these two regions. Population differences between Europe and Asian countries were greater than those among China, Korea and Japan. The T. callipaeda populations from Europe and Asia should be divided into two separate sub-populations. These two groups started to diverge during the middle Pleistocene. Neutrality tests, mismatch distribution and Bayesian skyline plot (BSP) analysis all rejected possible population expansion of T. callipaeda. Conclusions The Asian population of T. callipaeda has a high level of genetic diversity, but further studies should be performed to explore the biology, ecology and epidemiology of T. callipaeda.


Introduction
The spirurid nematode Thelazia callipaeda (Spirurida: Thelaziidae) is the major etiological agent of ocular thelaziosis [1]. This worm can parasitize the conjunctival sac of domestic and wild carnivores and humans, causing conjunctivitis, lacrimation and itchiness, and even blindness [2,3]. The parasite is transmitted by a drosophilid insect of the genus Phortica (Diptera, Drosophilidae), that feeds on the lachrymal secretions of mammals [4,5]. T. callipaeda was previously known as the oriental eyeworm because of its original description in eastern Asia countries [1]. Since the first cases of canine thelaziosis reported in Italy in 1988, the nematode has spread through many southern, central, western and eastern European countries [6]. A broad spectrum of wild carnivores, such as wolves, wildcats, red foxes, badgers, beech martens and even brown hares, plays an important role in maintaining and spreading eyeworm infections to domestic animals and humans [7]. The first human case was described in Beijing, China in 1917 [8]. Although sporadic human thelaziosis has been reported in several European countries, human infections are mainly documented in people living in China, Japan, Korea and India [9]. China is most likely to have the largest number of cases of thelaziasis in the world with more than 600 cases reported to date [10]. Recently, an increase in T. callipaeda infections has been reported in animals and humans living in European countries and in China [6,11].
Consequently, T. callipaeda poses a serious threat to public health and thelaziosis has even been termed as an emerging enzootic disease [1,6,9]. Understanding the host specificity, transmission pattern and population genetic characteristics of the parasite are valuable for the prevention and control of thelaziasis in animals and humans [12]. However, our knowledge regarding these issues is still fragmented, and insufficient studies on population genetics of T. callipaeda have been carried out. One possible reason is that as a neglected pathogen, T. callipaeda does not draw enough attention from most parasitologists. This may also be attributed to the difficulty in collecting T. callipaeda isolates from different hosts and distinct geographical locations.
More than 10 years ago, Otranto and Traversa [13] performed the first genetic variance analysis among Thelazia species by using the first internal transcribed spacer (ITS1) ribosomal DNA sequence, and concluded that the ITS1 sequence was a useful genetic marker for the molecular identification of Thelazia spp. In 2005, Otranto et al. [12] investigated the genetic variability among 50 individual adult specimens of T. callipaeda from Europe and Asia based on the mitochondrial cytochrome c oxidase subunit 1 gene (cox1). Recently, although various publications of the T. callipaeda infections in animals and humans have been reported [3,[14][15][16][17][18], no subsequent related studies about its population genetics have been reported. In this study, we explored the genetic variability within human T. callipaeda isolates collected from different geographical locations in China by using three mitochondrial genes and two nuclear ribosomal DNA sequences as follows: mitochondrial cytochrome b (cytb), the small subunit of ribosomal DNA gene (12S rDNA) and cox1; ITS1 and the small subunit of nuclear ribosomal DNA (18S rDNA). These molecular markers were selected because they are suitable for inferring population differences and conducting phylogenetic analysis at different taxonomic levels [12,13,[19][20][21]. Using cox1, we also performed a genetic variability comparative analysis on clinical T. callipaeda isolates collected in China and from previous publications. Additionally, the presumed transmission pattern of T. callipaeda investigated here relied on phylogeny and molecular dating methods.

Ethics statement
This study was approved by the Life Science Ethics Committee of Zhengzhou University (No. 2017-0006). The protocol and written informed consent form were approved by the Human Ethics Committees of the Zhengzhou University. All subjects older than eighteen years old provided written informed consent; in the case of children, they provided written informed assent, and their parents/guardians provided written consent for them. All worms were collected from patients to treat their thelaziasis and not expressly for the purpose of the present study.

Parasite collection and identification
A total of 32 worms were harvested from 13 patients from 11 distinct geographical locations in China from September 2007 to July 2016 (Table 1). All nematodes were removed from the eyes of patients with intraocular forceps, while the patients were anesthetized with oxybuprocaine. The collected nematodes were transferred to Petri dishes containing physiological saline (0.9% NaCl). These eyeworms were identified as T. callipaeda according to the morphological characteristics (e.g., shape of the buccal capsule, presence of transversally striated cuticle and cloacal papillae, morphology of the spicules in males and the position of the vulva in females) described in Otranto et al. [22].

Sequencing of target genes and alignment
Total genomic DNA was extracted from individual specimens using the EasyPure Genomic DNA Kit (Transgen, China) following the manufacturer's protocol. Five molecular markers, viz. cytb, cox1, 12S rDNA, ITS1 and 18S rDNA, were amplified to explore the genetic diversity of T. callipaeda. For 12S, ITS1 and cox1, the amplifications were obtained using primer combinations described in Casiraghi et al. (2004) [23],   [13], and   [12], respectively. The forward and reverse primers used for amplifying the cytb and 18S markers were designed as follows: cobF 5 0 -TGATTGGTGGTTTTGGTAA-3 0 ; and cobR 5 0 -ATAAGTACGAGTATCAATATC-3 0 ; and 18SF 5 0 -CTCATAAAATAATTGG TGAATCTGAATAGC-3 0 and 18SR 5 0 -ATAACTTTTCAGCAATGGTTACAG-3 0 , respectively. PCR products were purified using the EasyPure PCR Purification Kit (Transgen, China) and sequenced in both directions at the Genwiz Company (Beijing, China). All sequences were deposited in the GenBank database (S1 Table). The sequenced genes were initially aligned using the default settings in the program Clustal X v.2.0 [24] and adjusted in MEGA v.6.06 [25] according to the corresponding amino acid sequences of protein-coding genes and secondary structure of ribosomal DNA sequences. The nucleotide composition, conserved sites, variable sites, parsimony-informative sites, and singleton sites were estimated using MEGA v.6.06.

Genetic diversity analysis of T. callipaeda isolates from China
The program DnaSP v5.10 [26] was employed to analyze the number of haplotypes, haplotype diversity (Hd), and nucleotide diversity (Pi) of each molecular marker. Network v5.0 [27] was used to draw a median-joining network to analyze the relationships among the detected haplotypes. Analysis of molecular variance (AMOVA) was computed in Arlequin v.3.5 [28] with non-parametric permutations of 1,000 times (p = 0.05) to detect the partitions of genetic diversity within and among populations. Pairwise F ST values between populations were performed for all datasets in Arlequin to explore levels of genetic differentiation among the populations. The significance of F ST values evaluated was based on 1000 random permutations. Demographic changes were also estimated using mismatch distributions in Arlequin with 1000 simulations, under a scenario of no recombination. The validity of the expansion model was tested by using the sum of squared deviations (SSD) and raggedness index (RI) between observed and expected mismatches. The neutrality tests using Tajima's D [29] and Fu's F S [30] were also applied through Arlequin as an assessment of possible population expansion.

Genetic variability comparative analysis of T. callipaeda isolates using cox1
To make a worldwide genetic variability comparative analysis of T. callipaeda, all available sequences of cox1 in the GenBank database were included (S1 Table). In addition, we performed a Bayesian skyline plot analysis (BSP) implemented in BEAST v1.8.2 [31] to estimate the change in population size over time, and the time to the most recent common ancestor (tMRCA) for each T. callipaeda haplotype. A piecewise-constant skyline model was selected. The molecular evolutionary rate of cox1 was fixed at 0.01 substitutions per site per million years ago (Mya) according to the substitution rate for nematode mtDNA [32]. Tracer v1.5 [33] was used to reconstruct the demographic history over time.

Phylogenetic diversity analysis
The phylogenetic pattern of all cox1 haplotypes was estimated through maximum parsimony (MP) and Bayesian inference (BI). MP analysis was performed in MEGA v.6.06. Confidence in each node was assessed by boot-strapping (1000 pseudo-replicates). Bayesian inference was performed in MrBayes v.3.2 [34], after determining the appropriate substitution model by applying the Akaike information criterion (AIC) in jModelTest 2 [35]. The analysis consisted of two runs, each with four MCMC chains running for 5, 000, 000 generations, and sampling every 100th generation. Stationarity was assessed using a convergence diagnostic. An average standard deviation of the split frequencies (ASDSF) < 0.01 was used as criteria for convergence between both runs. The consensus tree was drawn after removing the first 10 000 trees (20%) as the burn-in phase. The approximate divergence time was estimated using an uncorrelated log-normal relaxed molecular-clock model in the BEAST v1.8.2 program. The substitution model was assigned following model selection by jModelTest 2. For the earlier tree, a basic coalescent model was chosen, assuming a constant population size over the given time period considered. Two replicate MCMC runs were performed, with the tree and parameter values sampled every 1, 000 steps over a total of 1×10 8 steps.

Genetic diversity of Chinese T. callipaeda isolates
The  Table). With the exception of isolates from Pingdingshan (PDS) and Zhengzhou (ZZ) of Henan province, which shared two haplotypes, each geographic population shared only a single haplotype. Analyses of the median-joining networks (Fig 1) showed that samples from HF and LA shared a haplotype (Hap1) when using cox1, 12S and ITS1, but under the analysis with 18S, these samples identified two Haps (Hap1 and Hap2). Clinical eyeworms from Dandong (DD) and Tongchuan (TC) shared geographical specific Haps using each of the selected markers. Isolates from Huanggang (HG) and Wuhan (WH) also revealed specific Haps when using cytb, cox1, 12S and ITS1; however, these samples shared a single haplotype using 18S. Using cytb, 12S and ITS1, a single geographical specific haplotype was identified in T. callipaeda from Shangluo (SL). Interestingly, using cox1 (Hap3) and 18S (Hap5), these isolates from SL shared the same haplotype with all samples from Luoyang (LY), 4 samples from Pingdingshan (PDS) and one from Zhengzhou (ZZ). The remaining isolates from PDS and ZZ shared the same haplotype using 12S, ITS1 and 18S. However, when using cytb and cox1, the remaining isolates possessed two distinct Haps. Using cytb, 12S, ITS1 and 18S, worms from Jiaozuo (JZ) shared the same haplotype with the second sample from ZZ. Only with cox1, did the JZ isolates identify as a distinct haplotype. The results from the analysis of molecular variance (AMOVA) showed that much more genetic variance lay among the populations than within the populations for all datasets, more specifically, cytb: 69.77% vs. 30   geographical regions were estimated for all molecular markers used to measure the population differentiation (S3 Table).  (Table 3). Using all markers, mismatch distribution analyses revealed multi-modal frequency distributions for the total population, rejecting possible population expansion (S1 Fig). In addition, low values were found for the sum of squared deviation (SSD) and raggedness index (RI) under the demographic expansion model (Table 3).

Compare the genetic variability between Chinese isolates and isolates from European and other Eastern Asia countries
A total 21 haplotypes were identified within T. callipaeda isolates from 25 localities in 12 countries (Table 1) Table).

Phylogenetic diversity and demographic estimation
The likelihood models identified by the jModelTest (AIC) suggested that the HKY+G model was most suitable for cox1 haplotypes. Both maximum parsimony and Bayesian analyses generated consistent tree topologies (Fig 2A and S3 Fig). Among the tested haplotypes, Hap1 and Hap4 composed a single clade (clade I), and the remaining haplotypes made up another clade (clade II). Within clade II, the earliest diversifications gave rise to Hap13, then to Hap9, Hap11 and Hap12 (Japanese haplotypes without Hap10). The next diversification event separated the remaining haplotypes. The molecular dating analysis suggested that the two clades began to diverge during the middle Pleistocene (Fig 2A).  Table). Mismatch distribution analyses revealed multi-modal frequency distributions (Fig 2B). In addition, low values for the sum of squared deviation and raggedness index under the demographic expansion model were found (S5 Table). The result of Bayesian skyline plot analysis of the T. callipaeda population revealed a gradual expansion trend, but also rejected sudden population expansion (Fig 2C).

Discussion
Although more than 10 species of the spirurid genus Thelazia can cause veterinary or medical problems in many parts of the world [13], only T. callipaeda affects humans in China, causing mild to severe clinical symptom [36]. It is still unknown whether there are other human  infective species or genotypes of T. callipaeda in China. In this study, we collected 32 eye worms from 13 patients from 11 distinct geographical locations in China over a period of 10 years, and performed a genetic variation analysis of these isolates using a multi-gene approach to investigate the population structure of T. callipaeda.
Haplotypes of the Chinese T. callipaeda population identified by each molecular marker were generally consistent. Of all the geographic populations, only PDS and ZZ isolates revealed Population structure analysis of the neglected parasite Thelazia callipaeda two different haplotypes. Worms from ZZ were harvested in two different patients. However, interestingly, all PDS isolates that were collected from a single child also had two haplotypes, indicating the patient probably experienced two or more infections of drosophilid flies. Each geographic population and the total population demonstrated high haplotype diversity (Hd) values (nearly 1.0) by all markers, however, the nucleotide diversity (Pi) values of each gene were below 0.01, indicating that multiple haplotypes were differentiated by few nucleotide mutations. Correspondingly, the AMOVA results also showed that much more genetic variance exists among the populations than within the populations [19]. The pairwise fixation index (F ST ) was used to measure the population differentiation. Most of the F ST values between specified geographical regions were far more than 0.25, indicating very high genetic differentiation and a long-term interruption of gene flow among the geographical populations [37].
Based on the partial cox1 sequence, Otranto et al. [12] performed the first genetic variability investigation of the T. callipaeda population in 2005. Since then, sequences of T. callipaeda from different geographical locations and different hosts have been published [3,[16][17][18]. In this study, based on previous research by Otranto et al. [12], we added new data from Chinese clinical isolates and other published data to generate a comprehensive genetic variation analysis of the T. callipaeda population. Some interesting discoveries were made: (1) Thirteen novel genotypes of T. callipaeda were identified; four from Japan (Haps9-12), and another nine from Chinese clinical isolates (Haps13-21); within the 4 Japanese genotypes, one was from a human, and the other 3 were from Canis familiaris. (2) In cox1, Otranto et al. [12] found six nucleotide positions (positions 89, 149, 206, 257, 539 and 608) that were invariable among individuals from Europe and Asia, but they were distinct between Europe and Asia. We confirmed five of them (positions 89, 149, 206, 257, 539 and 608) using an enlarged dataset in this study. (3) Population differences between Europe and Asian countries were larger than those among China, Korea and Japan. Within Asian countries, the genetic difference between China and Korea was the smallest, while the genetic difference between Korea and Japan was the largest. Only within human isolates were the genetic differences among Chinese samples smaller than those between China and Japan. This phenomenon may be attributed to the geographical distance's influence on gene flow among species [38], and possibly to the geographical origin, which needs to be analyzed further in the future. (4) The T. callipaeda population from Europe and Asia should be divided into the following two subgroups: one group (clade I) comprising worms from European countries (Hap1) and a sample isolated from a dog in Anhui of China (Hap4); the other group (clade II) comprising T. callipaeda from Korea, Japan and China. Individuals in clade II diverged earlier than those in clade I. Within the Asian haplotypes, Hap13 (clinical isolates of Anhui) was the ancestral haplotype; then, it was transmitted to Japan, Korea and other regions of China. (5) All of the neutrality tests, mismatch distribution and BSP analyses rejected possible population expansion of T. callipaeda, indicating this nematode population was in the stable phase [37].
Generally, the population structures of parasites are associated with large genetic differentiation among populations from different geographical regions and/or hosts and low intrapopulation genetic variability [39,40]. In this study, there was a relatively large genetic differentiation (above 0.82) between the European and Asian populations; however, the genetic variability among Asian countries was small (below 0.44). In addition, phylogenetic analyses supported the conclusion that isolates from Europe and Asia (excluding the Hap4) are separate populations. However, the genetic differentiation of T. callipaeda from distinct hosts (dogs, foxes, cats and human) was inconspicuous, suggesting a lack of connection between eyeworms and host species [12]. That being said, an important factor to consider is the abundance of genetic diversity in the Asian isolates. A possible explanation for this genetic diversity is that T. callipaeda populations are tightly linked to the intermediate host (vector). The drosophilid fly Phortica variegata was the vector of T. callipaeda in Europe, whereas the T. callipaeda in Anhui of China was transmitted by P. okadai [9,12]. Although it is still unknown whether other vector species exist in Asian countries, considering the high genetic diversity of T. callipaeda in Korea, Japan and China, it is very likely that some novel species or genotypes of Phortica flies exist in these areas. Hence, further studies should be launched to explore the biology, population genetics, ecology and epidemiology of T. callipaeda in the future.
In summary, the Chinese clinical isolates of T. callipaeda expressed a high level of genetic diversity. Using the cox1 gene, thirteen new genotypes were identified, four from Japan (Haps9-12), and nine from China (Haps13-21). There were five positions in the cox1 sequences (89, 149, 206, 257, 539 and 608) that were confirmed invariable among individuals from Europe and Asia, but the sequences were distinct between these two regions. Population differences between Europe and Asian countries were greater than those among China, Korea and Japan, and the T. callipaeda populations from Europe and Asia should be divided into two separate sub-populations. These two groups started to diverge during the middle Pleistocene.
Supporting information S1