Advertisement
  • Loading metrics

Population genomic evidence that human and animal infections in Africa come from the same populations of Dracunculus medinensis

  • Caroline Durrant †,

    † Deceased.

    Roles Formal analysis, Writing – original draft, Writing – review & editing

    Affiliation Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom

  • Elizabeth A. Thiele,

    Roles Formal analysis, Writing – original draft, Writing – review & editing

    Affiliation Department of Biology, Vassar College, Poughkeepsie, New York, United States of America

  • Nancy Holroyd,

    Roles Project administration, Writing – review & editing

    Affiliation Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom

  • Stephen R. Doyle,

    Roles Formal analysis, Writing – review & editing

    Affiliation Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom

  • Guillaume Sallé,

    Roles Formal analysis, Writing – review & editing

    Affiliations Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom, INRA—U. Tours, UMR 1282 ISP Infectiologie et Santé Publique, Nouzilly, France

  • Alan Tracey,

    Roles Formal analysis, Writing – review & editing

    Affiliation Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom

  • Geetha Sankaranarayanan,

    Roles Investigation, Methodology

    Affiliation Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom

  • Magda E. Lotkowska,

    Roles Investigation, Methodology

    Affiliation Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom

  • Hayley M. Bennett,

    Roles Investigation, Methodology

    Affiliations Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom, Present Address: Berkeley Lights Inc., Emeryville, California, United States of America

  • Thomas Huckvale,

    Roles Investigation, Methodology

    Affiliation Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom

  • Zahra Abdellah,

    Roles Investigation, Methodology

    Affiliation Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom

  • Ouakou Tchindebet ,

    Roles Resources

    ‡ Unavailable

    Affiliation Guinea Worm Eradication Program, The Carter Center, Atlanta, Georgia, United States of America

  • Mesfin Wossen,

    Roles Resources

    Affiliation Guinea Worm Eradication Program, The Carter Center, Atlanta, Georgia, United States of America

  • Makoy Samuel Yibi Logora,

    Roles Resources

    Affiliation Guinea Worm Eradication Program, The Carter Center, Atlanta, Georgia, United States of America

  • Cheick Oumar Coulibaly,

    Roles Resources

    Affiliation Guinea Worm Eradication Program, The Carter Center, Atlanta, Georgia, United States of America

  • Adam Weiss,

    Roles Resources

    Affiliation Guinea Worm Eradication Program, The Carter Center, Atlanta, Georgia, United States of America

  • Albrecht I. Schulte-Hostedde,

    Roles Resources

    Affiliation Department of Biology, Laurentian University, Sudbury, Canada

  • Jeremy M. Foster,

    Roles Resources

    Affiliation New England Biolabs, Ipswich, Massachusetts, United States of America

  • Christopher A. Cleveland,

    Roles Resources, Writing – review & editing

    Affiliation Southeastern Cooperative Wildlife Disease Study, Department of Population Health, Veterinary Medicine, University of Georgia, Athens, Georgia, United States of America

  • Michael J. Yabsley,

    Roles Resources, Writing – review & editing

    Affiliations Southeastern Cooperative Wildlife Disease Study, Department of Population Health, Veterinary Medicine, University of Georgia, Athens, Georgia, United States of America, Warnell School of Forestry and Natural Resources, University of Georgia, Athens, Georgia, United States of America

  • Ernesto Ruiz-Tiben,

    Roles Conceptualization, Resources, Writing – review & editing

    Affiliation Guinea Worm Eradication Program, The Carter Center, Atlanta, Georgia, United States of America

  • Matthew Berriman ,

    Roles Conceptualization, Funding acquisition, Supervision, Writing – review & editing

    james.cotton@sanger.ac.uk (JAC); mb4@sanger.ac.uk (MB)

    Affiliation Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom

  • Mark L. Eberhard,

    Roles Conceptualization, Resources, Writing – review & editing

    Affiliation Retired, Parasitic Diseases Branch, Division of Parasitic Diseases and Malaria, Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America

  •  [ ... ],
  • James A. Cotton

    Roles Conceptualization, Formal analysis, Funding acquisition, Supervision, Writing – original draft, Writing – review & editing

    james.cotton@sanger.ac.uk (JAC); mb4@sanger.ac.uk (MB)

    Affiliation Parasites and Microbes, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom

  • [ view all ]
  • [ view less ]

Abstract

Background

Guinea worm–Dracunculus medinensis–was historically one of the major parasites of humans and has been known since antiquity. Now, Guinea worm is on the brink of eradication, as efforts to interrupt transmission have reduced the annual burden of disease from millions of infections per year in the 1980s to only 54 human cases reported globally in 2019. Despite the enormous success of eradication efforts to date, one complication has arisen. Over the last few years, hundreds of dogs have been found infected with this previously apparently anthroponotic parasite, almost all in Chad. Moreover, the relative numbers of infections in humans and dogs suggests that dogs are currently the principal reservoir on infection and key to maintaining transmission in that country.

Principal findings

In an effort to shed light on this peculiar epidemiology of Guinea worm in Chad, we have sequenced and compared the genomes of worms from dog, human and other animal infections. Confirming previous work with other molecular markers, we show that all of these worms are D. medinensis, and that the same population of worms are causing both infections, can confirm the suspected transmission between host species and detect signs of a population bottleneck due to the eradication efforts. The diversity of worms in Chad appears to exclude the possibility that there were no, or very few, worms present in the country during a 10-year absence of reported cases.

Conclusions

This work reinforces the importance of adequate surveillance of both human and dog populations in the Guinea worm eradication campaign and suggests that control programs aiming to interrupt disease transmission should stay aware of the possible emergence of unusual epidemiology as pathogens approach elimination.

Author summary

Guinea worm–Dracunculus medinensis–was historically one of the major parasites of humans and has been known since antiquity. Guinea worm is now on the brink of eradication, as a global effort seeking to make this parasite extinct has reduced the number of people infected each year from millions in the 1980s to only 54 human cases in 2019. Despite the enormous success of eradication efforts to date, one complication has arisen. Over the last few years, hundreds of dogs have been found infected with Guinea worm, which was previously considered to be transmitted between people. Almost all of these infected animals are in Chad. We have used whole-genome sequence data to try and understand this peculiar situation. We confirm that the same population of worms are causing both infections and show suspected transmission from humans to dogs. The diversity of worms in Chad appears to exclude the possibility that there were no, or very few, worms present in the country during a 10-year absence of reported cases. This work reinforces the importance of adequate surveillance of both human and dog populations and suggests that control programs should stay aware of the emergence of unusual epidemiology as pathogens approach elimination.

Introduction

Guinea worm–Dracunculus medinensis (Linnaeus, 1758) Gallandant, 1773 –has been an important human parasite for most of recorded history. It is also one of the best known human pathogens and has been known since antiquity [1]. This infamy is presumably due to its distinctive life cycle, where the large adult female worm (up to 1m long) causes excruciating pain as it emerges from a skin lesion. As recently as 1986, there were probably around 3 million cases of Guinea worm disease (GWD) reported annually from 22 countries in Africa and Asia [2] and historically probably very many more [3]). Called the quintessential “forgotten disease of forgotten people,” GWD was responsible for an enormous disease burden as patients are incapacitated for several weeks during worm emergence [4, and many other studies cited in 5], and subsequent complications and serious secondary infections of the resulting ulcer are common and occasionally fatal [1].

Following the eradication of smallpox in 1980, public health scientists at the US Centers for Disease Control and Prevention (CDC) recognised that Guinea worm disease was a potential target for eradication [e.g. 6, 7]. Since 1986, Guinea worm has been the target of a large-scale control program aiming for complete, global eradication of the disease and extinction of the parasite responsible [8]. The introduction of interventions to encourage residents to report cases of GWD, prevent infected persons from contaminating source of drinking water, provide new sources of safe water and promote greater use of existing sources, promote the use of cloth and pipe (“straw”) filters, and the application of vector control measures has subsequently reduced the incidence of GWD [5, 9]. As the program has progressed, these measures have been complemented by work to ensure sources of infection are traced and treated for cases, containment of cases to prevent contamination of water, and active searching for new cases [9]. The global Guinea worm eradication campaign has been a great success–by 2000 there were only 74,258 cases of GWD in 15 countries in sub-Sharan Africa [10], and this had fallen to just 15 cases in each of Chad and Ethiopia as of 2017 [11], while in 2019 only Angola, Chad and South Sudan reported human infections [12]. Either Guinea worm disease or polio [13] will soon become the second human disease to be eradicated, and Guinea worm is on track to be the first to be wiped out without a vaccine, and probably the first animal species to be deliberately made extinct.

The eradication campaign was predicated on D. medinensis being an anthroponotic parasite, with transmission between people via drinking water. Sporadic reports of animal infections were assumed to either be due to misidentification of the worm involved or to represent spill-over infection of little or no epidemiological importance. However, experimental infections of non-human animals–and particularly of dogs–have been performed successfully on a number of occasions [1]. In natural conditions, worms have particularly frequently been reported as emerging from dogs, but generally at a low prevalence and sporadically. When human infections with Guinea worm have been eliminated from a region, dog infections from that region have subsequently disappeared [14, 15]. There are some apparent exceptions: for example, in Bukhara, Uzbekistan, where a hotspot of very high Guinea worm prevalence (up to 20%) was eliminated in the 1930s, Guinea worm infections in dogs continued to be reported for a few years after this, but no human cases were found after 1931 [16, 17].

From 2012, however, a distinct, and apparently unique situation became evident in Chad, where large numbers of infections in domestic dogs have appeared, against the background of a small number of human cases [15]. Dog infections became evident beginning in April 2012, when the Chad Guinea worm eradication program (with assistance from The Carter Center) launched active village-based surveillance in nearly 700 villages, following the detection in 2010 of human cases for the first time in 10 years in Chad. With increasing surveillance of dog populations, the number of dog infections reported has subsequently steadily increased, and since 2016, there have been over 1,000 infected dogs reported from Chad each year (1,935 in 2019; 12]. Small numbers of dog infections have also been identified in the other recently endemic countries (in 2019, 2 from Ethiopia, 8 from Mali and none from South Sudan, although this country did report a single dog infection in 2015). There is an urgent interest in understanding the epidemiology of worm infections in dogs and wildlife, and their relationship with human infections in the same areas [18, 19]. Greater scrutiny of animals for potential Guinea worm infection has also revealed occasional infections in wildlife, such as cats and baboons (see [20] for a full description of the situation in 2016–2017).

In this context, there was some uncertainty as to whether the worms emerging from dog and human infections in Chad represented the same species. Most of the key defining morphological features for this group of nematodes are found on adult males, which are not recovered from natural infections [1,21]. The other described species of the genus Dracunculus are all from the New World and include D. insignis and D. lutrae from North American carnivorous mammals and D. fuelleborni from a Brazilian opossum [1,22,23]. There are also numerous reports of Dracunculus spp. in reptiles–particularly snakes [21]. Molecular phylogenetic work supports the mammalian parasites as a distinct clade to those found in other vertebrates [2427]. The diversity and phylogeny of the genus has recently been reviewed [21]. There is a relative scarcity of parasitological work on wild mammals, and particularly of work looking beyond gastrointestinal species. There are also a number of reports of cryptic species of parasitic worms in wildlife (reviewed in [28]). It is thus possible that other, undescribed mammal-infective species exist, and these could explain the few reports of human or mammal Guinea worm infections from countries otherwise considered non-endemic (see Muller, 1971 for references to case reports). A number of comprehensive reviews of D. medinensis biology, epidemiology and control are available (e.g. [1,8]) and the reader is referred to the extensive literature on the Guinea worm eradication program [5,9,2931], including regularly updated surveillance data (most recently in [11]).

Previous molecular work established that D. medinensis and D. insignis could be differentiated at the 18S rRNA locus, and that a single dog worm from Ghana was identical at that locus to D. medinensis collected from nearby human cases [25]. We subsequently reported data from the 18S rRNA locus and a mitochondrial marker for 14 worms that emerged from humans and 17 from dogs in Chad, together with whole-genome data from 6 worms [15]. A draft reference genome assembly based on sequence data from a single worm from Ghana [32] recently gave the first picture of the genome content of this species and confirmed the phylogenetic position of D. medinensis within a large spiruromorph clade of parasites related to filarial nematodes. Here, we present an improved genome assembly for D. medinensis and whole genome sequence data for a much larger set of adult D. medinensis and from two closely related species. Together, these data give a detailed picture of the relationships between D. medinensis from different hosts and countries, and confirms existing microsatellite genotyping and mitochondrial sequence data from the same populations [33] which showed that human cases of dracunculiasis and animal infections all originate from the same populations of D. medinensis. We also investigate whether the pattern of genetic variation suggests a small population size (a population bottleneck) in Chad in the recent period of 10 years that no human cases were reported in the country, and more generally whether D. medinensis populations show signals of declining population sizes that we might expect to be due to the eradication program.

Methods

Worm material from D. medinensis was collected by the national Guinea worm eradication programs in the relevant countries, except that material from experimentally infected ferrets were obtained as previously described [34]. D. insignis material was collected from an American mink (Neovison vison) and D. lutrae material was collected from an otter (Lutra canadensis) in Ontario, Canada [26]. Specific ethical approval was not required as material was derived from standard containment and treatment procedures sanctioned by WHO and national governments and performed by national control program staff, and molecular testing is part of the standard case confirmation procedure. Human case samples were anonymized prior to inclusion in this study.

Genomic DNA was extracted from either 5-15mm sections of adult female worm specimens or from the pool of L1 larvae visible in sample tubes, wherever larvae were visible. DNA extraction was performed using the Promega Wizard kit, but with worm specimens cut into small pieces before digestion with 200μg of Proteinase K overnight in 300 μl of lysis buffer, then following the protocol described in the manual. PCR-free 200–400 bp paired-end Illumina libraries were prepared from genomic DNA as previously described [35] except that Agencourt AMPure XP beads were used for sample clean up and size selection. DNA was precipitated onto the beads after each enzymatic stage with a 20% (w/v) Polyethylene Glycol 6000 and 2.5 M sodium chloride solution, and beads were not separated from the sample throughout the process until after the adapter ligation stage. Fresh beads were then used for size selection. Where there was insufficient DNA for PCR-free libraries, adapter-ligated material was subjected to ~8–14 PCR cycles. Libraries were run on an Illumina platform (HiSeq 2000, 2500 or HiSeq X) to generate 100 base pair or 150 bp paired-end reads.

Sequence data was compared to a reference genome assembled from a worm collected in Ghana in 2001. The sequence data and automated assembly of v2.0 of this reference is described fully elsewhere [32]. The v3.0 reference used here has undergone some manual improvement, with REAPR [36] used to identify problematic regions of the assembly to be broken, followed by iterative rounds of re-scaffolding as indicated by read-pair and coverage information visualised in GAP5 [37] and automated gap-filling with IMAGE [38] and Gap-filler v1.11 [39] and a final round of sequence correction with iCORN v2.0 [40]. Assembly statistics for v2.0 and v3.0 of the D. medinensis genome are shown in S1 Table. All data generated in this study are available from the European Nucleotide Archive short read archive, a project ERP117282; accession numbers for individual samples are shown in S1 Table.

Mapping was performed with SMALT v.0.7.4 (http://www.sanger.ac.uk/science/tools/smalt-0), mapping reads with at least 95% identity to the reference, mapping paired reads independently and marking them as properly paired if the reads within a pair mapped in the correct relative orientation and within 1000bp of each other (parameters–x–y 0.95 –r 1 –i 1000). To avoid problems with mitochondrial data, mapping was also performed similarly against a reference containing mitochondrial genomes for dog, human and ferret. Duplicate reads were removed using Picard v2.6 MarkDuplicates. The BAM files produced were used as input to Genome Analysis Toolkit v3.4.0 for variant calling, following the ‘best practice’ guidelines for that software release: briefly, reads were realigned around indel sites, after which SNP variants were called using HaplotypeCaller with ploidy 2. Variants were then removed where they intersected with a mask file generated with the GEM library mappability tool [41] with kmer length 100 and 5 mismatches allowed, or were within 100bp of any gap within scaffolds. Finally, SNPs were then filtered to keep those with DP > = 10; DP < = 1.75*(contig median read depth); FS < = 13.0 or missing; SOR < = 3.0 or missing, ReadPosRankSum < = 3.1 AND ReadPosRankSum > = -3.1; BaseQRankSum < = 3.1 AND BaseQRankSum > = -3.1; MQRankSum < = 3.1 AND MQRankSum > = -3.1; ClippingRankSum < = 3.1 AND ClippingRankSum > = -3.1. An additional mask was applied, based on the all-sites base quality information output by GATK HaplotypeCaller. The filters applied were DP > = 10, DP < = 1.75*(contig median read depth) and GQ > = 10. Finally, sites with only reference or missing genotypes were then removed. Variant calling on the mitochondrion was performed similarly, except reads were first filtered to retain only those for which both reads in a pair mapped uniquely to the mitochondrion in the correct orientation with mapping quality at least 20, the read depth filter was 10 for all samples, all heterozygous calls were removed and the mask file was generated manually by examining dot plots and removing regions with a high density of heterozygotes.

Synteny between the D. medinensis v3.0 assembly and the published O. volvulus v4.0 assembly was confirmed using promer from the Mummer package [42] to identify regions of >50% identity between the two sequences over 250 codons. These results were then visualised using Circos v0.67pre5 [43]. The phylogenetic tree for D. medinensis samples was based on the proportion of alleles matching between each pair of samples at those sites for which both samples in a pair had a genotype call that passed the filter criteria. A phylogeny based on these distances was inferred by neighbour-joining using the program Neighbour from Phylip v3.6 [44]. Principal components analysis was performed on a matrix of genotypes for sites with no missing data in R v3.3.0 [45] using the prcomp command. Population genetic summary statistics within and between populations were calculated for 10kb window of SNPs containing between 5 and 500 variants, using ANGSD v0.919-20-gb988fab [46]. This software estimates neutrality tests [47] or genetic differentiation between populations [48] following a probabilistic framework that employs genotype likelihoods. It is intended to be more robust to genotyping error than traditional calculations using the genotypes directly. Only sites with a minimal depth of 5 reads, a minimal base and mapping quality Phred score of 30 and a call rate of at least five individuals were used, and genotype likelihoods were estimated under the samtools [49] framework (GL = 1). In the absence of known ancestral states, folded site frequency spectra were generated to derive nucleotide diversity π, Watterson’s θ and Tajima’s D in 1kb windows. FST estimates were computed from maximum-likelihood joint site frequency spectra between pairs of populations derived using the reference genome as the ancestral state. Estimates were generated for 10-kb sliding windows (with 1 kb overlaps) containing between 5 and 500 variants. We report genome-wide averages across these windows, and confidence intervals for these statistics were calculated for 10kb from 100 bootstrap replicates, resampling from 10kb windows. Values for absolute divergence (dxy) were calculated similarly using pixy (https://pixy.readthedocs.io/en/latest/about.html), with summary statistics calculated as above. Unless otherwise specified, plots were produced in R v3.3.0 with ggplot2 [50].

Bayesian clustering was performed with MavericK v1.0 using thermodynamic integration to estimate the number of clusters (K) best describing the data [51]. MavericK was run for 3 independent runs of 1,000 burnin generations and then 10,000 generations for inference, and with each rung of the thermodynamic integration run for 1,000 burnin generations and 5,000 generations for inference, for the default 21 rungs. For comparison, Structure v2.3.4 was run [52], using the deltaK method to select a value of K. Input to both of these was a set of 19,983 SNP variants samples across the D. medinensis scaffolds at 5 kb intervals. Structure was run using an admixture model, with a burn-in of 100,000 generations and using another 100,000 generations for inference.

Population history of D. medinensis was inferred using BPP v4 [53] to infer the number and branching pattern of populations, and then GPhoCS v1.2.2 [54] to infer branching times and effective population sizes on the maximum posterior probability history. GPhoCS requires populations and the phylogeny to be specified a-priori, but is able to perform inference using a larger set of loci more efficiently. For GPhoCS a total of 781 loci were chosen as contiguous 1kb regions spaced every 100kb across all autosomal scaffolds. Results were scaled to time and effective population time using a mutation rate of 2.7 x 10−9 per generation, as estimated for Caenorhabditis elegans [55] and a generation time of 12 months: D. medinensis females emerge 10–14 months after infection [6]. At least three (3–6) independent MCMC chains were run for each of 5 different prior assumptions, with each chain running for at least 25,000 MCMC generations. In each case, identical priors were used for all θ and τ (population sizes and divergence times, respectively) parameters; priors for GPhoCS are specified as gamma distributions parameterised with a shape (α) and rate (β) parameters (hyperparameters). We held the β hyperparameter constant at 0.1, and chose α values varying by 4 orders of magnitude, from 10−4 to 10−8, so that the means of the prior distributions varied from 6.43335–64,335 for θs following scaling and from 25.7342 to 257,342 for τ parameters. The variance of the prior distributions also varied linearly with changes in the α hyperparameter. Convergence was confirmed by visual inspection of the chains for each prior. For inference, the first 15,000 generations of each chain were removed and the remaining steps concatenated; highest posterior density estimates and effective sample sizes were calculated using the R packages HDInterval and mcmcse respectively. For all parameters the effective sample size was at least 250.

BPP attempts to identify reproductively isolated populations and estimate the phylogeny underlying those populations in a joint Bayesian framework [56, 57]. Population size parameters were assigned the default inverse gamma priors with mean 0.002 and shape parameter (alpha) = 3, the root divergence time an inverse gamma prior with mean 0.001 and alpha 3, other divergence time parameters default Dirichlet prior. Each analysis is run at least twice to confirm consistency between runs, and each chain was run for 10,000 burnin generations and 50,000 generations for inference. Convergence was assessed by inspection of these chains in Tracer v.1.6. Due to difficulties in getting the software to run successfully to convergence using our complete dataset for BPP, a subset of 100 loci was chosen at random from these 781 loci. Three different random sets of loci gave essentially identical results (97.2%, 98.2% and 98.4% support for the same maximum-probability reconstruction; duplicate runs of the same loci varied by less than 0.5%).

Kinship between samples was calculated using King v1.4 [58]. Distances between latitude and longitude points were calculated using the online calculator at the US National Oceanic and Atmospheric Administration at https://www.nhc.noaa.gov/gccalc.shtml. The map in S7 Fig was based on the 1:50m land shape file from www.naturalearth.com, and drawn using the R packages maps (v3.3.0), maptools (v0.9) and the theme_map from ggthemes (v4.1).

Results

Whole-genome sequence data from Dracunculus specimens from a range of host species and geographic regions

In this study, we attempted to generate whole-genome shotgun sequence data for 90 D. medinensis specimens; for four samples we could not make sequencing libraries. We also sequenced two samples of D. insignis and one sample of D. lutrae. To aid interpretation of these data, we used the original Illumina data used to improve the v2.0.4 reference genome assembly for D. medinensis–based on a worm collected in Ghana in 2001 [32]–with a combination of both manual and automated approaches to produce an improved (v3.0) assembly for D. medinensis. This substantially improved contiguity and reduced misassemblies, for example the average scaffold length is twice that of the previously published assembly version (Table 1).

thumbnail
Table 1. Assembly statistics for Dracunculus medinensis assembly versions.

https://doi.org/10.1371/journal.pntd.0008623.t001

Despite extensive sequencing effort, mapping our data against this reference showed that we achieved a median depth of 10× coverage for only about one-third (33) of D. medinensis samples. Unless otherwise specified, subsequent analyses were restricted to this set of 33 D. medinensis samples. These samples were collected from a number of African countries (Fig 1), with 22 from Chad, 5 from Ethiopia, 2 each from Ghana and South Sudan and 1 each from Mali and Côte d’Ivoire. It included 15 samples from humans, 15 from dogs and 3 from other animals (2 Ethiopian baboons, Papio anubis, and one from a Chad cat, Felis catus). Full details of the samples and coverage achieved are shown in S1 Table. The low coverage we achieved was due to extensive contamination with bacterial and, in some cases, host DNA, so that the percentage of reads mapping to the reference varied from 0.07% up to 94.8% (S1 Fig; S2 Fig); even within the genome-wide coverage set of 33 samples as few as 17.9% of reads mapped in one case. For 9 D. medinensis samples, sequence libraries were generated from both adult female material and L1 larvae present in the same sample tubes (representing the offspring of that female).

thumbnail
Fig 1.

(A) Coverage variation across the Dracunculus medinensis genome in worms with known sex. Each point is the mean single read coverage across non-overlapping 5kb windows along the length of the five longest scaffolds for three juvenile worms recovered from an experimentally infected ferret. The 3 longest scaffolds show synteny to different Onchocerca volvulus chromosomes, the next 2 scaffolds are syntenic to the O. volvulus X chromosome (see S1 Fig). (B) Ratio of coverage across large autosomal and X-linked scaffolds for worm recovered from infected humans and animals in Africa. The y-axis shows the ratio of mean coverage on the 3 longest autosomal scaffolds to that of the mean coverage on the 2 longest X-linked scaffolds (these are the longest 5 scaffolds in the assembly, as shown in panel (A).

https://doi.org/10.1371/journal.pntd.0008623.g001

Coverage also varied across the genome, most strikingly for two of the longest 5 scaffolds, which were often lower in coverage than other large scaffolds, varying from around three-quarters of the expected coverage to approximately similar coverage. We hypothesised that these scaffolds could represent all or part of a sex chromosome (X) in D. medinensis. The L1 larval samples showed coverage on these scaffolds around 75% that for other large scaffolds, as expected for an XY or XO sex determination system if the pool of larvae consisted of an approximately equal ratio of male and female larvae. More surprisingly, most of the DNA samples extracted from female worms showed similarly low relative coverage of these two scaffolds. We suggest that this is because much of the material extracted from these specimens is actually from L1 larvae remaining in the body of the female worm section. This seems plausible, as much of the female body comprises uterus containing several million larvae [8], and if the female body was largely degraded that explains the difficulty we had in extracting DNA from many samples.

To confirm this, we generated sequence data for juvenile worms harvested from a domestic ferret experimentally infected with D. medinensis [34]. These worms were 83 days old and pre-patent (and thus comprised only or largely somatic tissue), but could be morphologically identified as male and female. Analysis of data from these worms (Fig 1A) confirmed that the scaffolds with low coverage showed this pattern specifically in male worms, while the female worm showed essentially even coverage across the largest scaffolds, including the putative X and likely autosomal scaffolds. Further evidence comes from a comparison with Onchocerca volvulus, in which the sex chromosomes are known, as there is clear synteny between the D. medinensis scaffolds with variable coverage and one end of the O. volvulus X chromosome (S3 Fig). This part of the O. volvulus X chromosome represents the ancestral X chromosome of filarial nematode [59]: these data suggest that this was already present in Dracunculus, as well as filarial nematodes as previously suggested [60].

We thus used the ratio of mean coverage between the 3 largest autosomal scaffolds and 2 longest X chromosome scaffolds as a measure of the proportion of genomic DNA in our sample derived from larval vs female tissue, under the assumption that larvae are an equal mixture of the two sexes (Fig 1B). These data confirm that many samples contain substantial amounts of larval-derived DNA. One sample had a particularly high value for this statistic–for this sample, the mean coverage on scaffold X_DME_002 was inflated by the presence of a small region of extremely high read depth. X chromosome scaffolds were also excluded from subsequent population genetic analyses likely to be sensitive to the different dosage of these chromosomes (see Methods).

African D. medinensis is highly divergent from other mammalian Dracunculus species

While most sequencing reads from high-quality D. medinensis samples mapped against our reference assembly (median across samples of 68.48%), the reads from the other two Dracunculus species mapped less comprehensively against the D. medinensis reference (S2 Table), which given the mapping parameters used suggests that many regions of the genome are more than 5% divergent between species. This was confirmed by variant calls in those regions of good read mapping: even given the poorer mapping quality, around 2.9 million sites varied between the three species, suggesting genome-wide divergence of at least 3% of the 103.8 Mb genome, as the mapping difficulty meant this is likely a significant underestimate. While interpretation of absolute divergence levels is difficult, our lower-bound estimate of divergence between these species is much greater than between different species of Onchocerca (O. ochengi and O. volvulus, respectively), which are less than 1% divergent (Cotton et al., 2016) and is consistent with having hundreds of thousands of years of independent evolutionary history. A principal components analysis (PCA) of SNP variants between these samples confirmed that samples from each species cluster closely together, and that the different species are well separated (Fig 2A). The first two principal component axes shown here explain 79.9% and 16.5% of the variation, respectively. More than 3-fold more sites were called as varying between Dracunculus spp. than observed across all 33 of our genome-wide D. medinensis samples, where about 981,198 sites vary.

thumbnail
Fig 2.

Principal components analysis of whole-genome data for (A) 33 Dracunculus medinensis samples, 2 D. insignis samples and 1 D. lutrae sample and (B) principal components analysis and (C) phylogenetic tree for just the 33 D. medinensis samples. The legend in the top right-hand corner of (C) applies to both panels (B) and (C). Dotted lines on panel (C) enclose three pairs of samples where both adult female tissue and L1 larvae from the same worm are included.

https://doi.org/10.1371/journal.pntd.0008623.g002

Geography rather than host species explains the pattern of variation within African D. medinensis

Clear geographic structure was observed in the pattern of genome-wide variation within D. medinensis. PCA (Fig 2B) of the variants show distinct clusters of parasites from Ghana, Mali and Côte d’Ivoire (referred to as the ‘West African’ cluster) the Ethiopia, South Sudan and one Chad sample (an ‘East African’ cluster), and a group of parasites from Chad. The first two principal components explain only 22% of the variation in these data (14% and 8% respectively). Additional principal components axes, up to the 8th axis, together explain 54% of the variation but none of these axes partition the genetic variation between host species (S4 Fig).

Phylogenetic analysis (Fig 2C) supported this pattern, with clear clades of West African and East African worms. The Chad sample visible as being part of the East African cluster in the PCA (2015-5ChD, a worm from a dog infection emerging in 2015) was part of the East African clade in the phylogenetic tree. A second Chad worm, from a human case in 2011, also appeared to be divergent from any other worm on our phylogeny. There was no apparent clustering by host species in Chad or Ethiopia, the two countries for which worms from multiple dog and human infections were included, and no clear clustering by year of worm emergence. In all three cases where both L1 larvae and adult sections from a single emerging worm yielded high-quality data, these two samples clustered very closely together.

Other approaches to investigate population structure support these conclusions. Bayesian clustering using MavericK strongly supported a model of only 2 populations (K = 2) for these data, with posterior probability of 1.0 for this value of K. The two populations divided worms collected in Chad from those collected elsewhere, with the exception of the Chad worm 2015-5ChD, which clustered with those from other countries, as in the PCA and phylogeny. Analysis with Structure suggested that K values of between 2 and 4 fitted the data well. In all cases these analyses clustered worms largely by geographical origin, and not by host. In the highest K values, most Chad worms had mixed ancestry between two Chad populations, and in no analysis did worms from different host species cluster together more than expected (S5 Fig). As expected, differences in allele frequencies between worms from dog infections and human cases within Chad are low (mean Fst 0.01806, 99% confidence interval 0.0172–0.0189; median Fst 0.0114, CI 0.0109–0.0118) as are absolute levels of genetic differentiation (mean dxy 0.00357; 99% CI 0.00352–0.00364; median dxy 0.00265; 99% CI 0.00260–0.00269) and remain consistently low across the genome (S6 Fig), confirming that there is no genetic difference between worms infecting dogs and humans.

Mitochondrial genome data confirms the geographic structure of the D. medinensis population

To allow us to study a wider range of samples, we called variants against the mitochondrial genome of D. medinensis for a total of 65 samples that had median coverage of at least 10× across this sequence. The additional samples included 14 dog and 18 human samples and included a single sample from Niger, slightly expanding the geographical range of samples included. Our variant calling approach identified 182 variable sites that could be reliably genotyped across those samples. The results of this analysis (Fig 3) are congruent with those from nuclear genome variation, with a strong signal of clustering by geographical origin. The worm collected in Niger joined a tight cluster that included all West African samples (Ghana, Mali and Côte d’Ivoire) with the exception of two worms from Mali collected in 2014: one was closely related to two worms from Chad cases in 2014 and 2015, and the second appears as an outgroup to a large clade of Chad worms. The two other exceptions to the clear geographical structure were a worm from a dog in Chad in 2015 which was most similar to one from a South Sudan case from 2014 within a small clade of Ethiopia and South Sudan worms, and one from a human case in Chad in 2014 that groups as part of a more diverse group of Ethiopia worms. As with the nuclear data, worms from human cases and infections in dogs and other animals often group together, with extremely similar mitochondrial haplotypes; there is no clear signature of clustering by host species.

thumbnail
Fig 3. Phylogenetic tree based on inferred mitochondrial genome sequences for 65 Dracunculus medinensis samples for which sufficient coverage of the mitochondrial genome was available.

For clarity, arrowed circles show host and geographic origin for samples with very similar mitochondrial haplotypes.

https://doi.org/10.1371/journal.pntd.0008623.g003

D. medinensis from Chad are genetically diverse but are in decline

Phylogenetic analysis of both nuclear and mitochondrial data, and the nuclear genome PCA appear to show that worms in Chad are considerably more diverse than those from the other regions included in our analysis, although this could be due to our much smaller number of samples from other countries. To ensure an adequate sample size for comparison, we combined samples from countries with small numbers of samples into three regional groups, combining Ethiopia and South Sudan samples into an East African group, and samples from Mali, Ghana and Côte d’Ivoire into a West African group, while Chad was considered alone. Population genetic summary statistics (Table 2) for these groups confirmed the pattern suggested by phylogenies and PCA: we see highest nucleotide diversity (π) in Chad, while the East African group is slightly, but significantly less diverse and the West Africa group has an order-of-magnitude lower nucleotide diversity. A second estimator of genetic diversity (Watterson’s θ) shows lower values for the East African and Chad populations, but is higher in East Africa than Chad. For neutral variants in a population at equilibrium π and θ are expected to be equal, but Watterson’s estimator is heavily influenced by rare alleles. The difference between π and θ that we observe indicates an excess of common variants in the East Africa and Chad populations over neutral expectations (see e.g. pp28-30 and pp288-289 of [61] for a full description). This is captured by high Tajima’s D values, which are simply a normalised difference between π and θ. While a variety of population genetic processes can influence these statistics, high Tajima’s D across the genome in these two regions is most likely indicative of a demographic process, such as a recent sharp decline in the worm populations [62].

thumbnail
Table 2. Population genetic summary statistics for Dracunculus medinensis populations.

Values are means and 95% bootstrap confidence intervals for the means of 1kb windows containing between 5 and 100 informative (variable) sites.

https://doi.org/10.1371/journal.pntd.0008623.t002

Coalescent models suggest a large population has been continuously present in Chad

To confirm the population structure of D. medinensis in Africa, we constructed coalescent models based on 1kb loci spaced every 100kb–much longer than the distance over which linkage disequilibrium decays to approximately background levels–across the large scaffolds of the D. medinensis reference genome assembly. An initial analysis used samples grouped by country and human vs animal hosts, which aimed to assign these samples to the most likely set of discrete populations, resulted in the highest posterior probability supporting a single group of West African parasites (samples from Mali, Ghana and Côte d’Ivoire into a West African group (96.44% posterior probability), an East African group of Ethiopia and South Sudan samples (58.26%) and uniting human and animal infections from Chad into their own group (58.83% support). There was weaker support for a group uniting East Africa and Chad samples (38.5%). No alternative grouping of samples–including the solution of having a single population for each country–received more than 3% posterior probability support. Due to our small number of samples, we thus combined samples into these three regional groups; except that given our more extensive sample of worms from Chad we could also investigate whether Chad worms were best explained as two host-specific populations of worms from dog infections and human cases, or as a single group. Only two scenarios for the population structure received support in the posterior sample from the Markov chain Monte Carlo (MCMC) procedure (see Fig 4A). In both scenarios, worms from Chad were more closely related to those from the East African group than to the West African group. By far the strongest support (average 97.9% of posterior samples, over 3 replicate sets of 100 random loci) supported a single Chad population of worms that emerged from both human and animal hosts.

thumbnail
Fig 4. Coalescent models of Dracunculus medinensis population structure.

(A) Out of all possible scenarios for up to 4 distinct isolated populations of D. medinensis, we find posterior support for only 2, with strong support only for a model in which all worms from Chad are part of one population, more closely related to worms from Ethiopia and South Sudan than to those from elsewhere in our sample set. (B) Estimates of divergence times and genetic (effective) population sizes under the supported model shown in (A). Values shown are posterior means and 95% highest posterior density estimates for each parameter in this model, under one set of prior assumptions.

https://doi.org/10.1371/journal.pntd.0008623.g004

Using this highly supported population history, we used a second coalescence approach to estimate parameters describing the demographic history of the three regional present-day populations and the ancestral populations that gave rise to them (Fig 4B, S3 Table). Assuming a similar per-generation mutation rate to C. elegans and a generation time of 1 year, these analyses suggest that the Chad and East African populations have been separated for at least several thousand years, and that divergence from the West African population was about 5-fold older. The long-term effective population sizes of the Chad and East Africa populations reflect the higher nucleotide and phylogenetic diversity, with Chad being around 4-fold higher with an estimated 20 to 40 thousand breeding individuals.

Relatedness between D. medinensis isolates

Our population genetic evidence supports the idea that a single, diverse population of Guinea worms exists in Chad and is infecting both humans and animal species. More direct evidence of transmission between host species would be genetic relatedness between worms that emerged in different species. We employed a method to estimate pairwise relatedness between isolates based on SNP variants that is intended to be robust to population structure. Kinship is the probability that a random allele sampled from each of two individuals at a particular locus are identical by descent. The expected value in an outbred diploid population is 0.5 for monozygotic twins and 0.25 for full sibs or parent-offspring pairs.

Excluding the L1 larval samples that have matching adult worm samples, the median kinship across all pairs of samples is low, but non-zero (0.0077; approximately that expected for third cousins), but worms from the same countries are much more highly related (e.g. median kinship of 0.087 for pairs within Chad). There is clear geographic structure to kinship in these data, as most worm samples from the same countries are related to at least one other sample from that country with kinship of close to 0.25 or higher (Fig 5), while only a single pair of closely related worms are from different countries. Notably, six pairs of worms have relatedness of higher than 0.45, close to the maximum possible value of 0.5 (Fig 5). These six pairs include all 3 sets of matched adult and larval samples in the whole-genome coverage set, providing support for the hypothesis that other pairs with a similarly high relatedness could represent first-degree relatedness (such as parent-offspring or full siblings). The inflated kinship values are explained by a high level of inbreeding within each country, with high median kinship even when these 6 pairs of related worms excluded in Chad (median remains 0.087) and more generally (median kinship for all remaining pairs 0.0038). The other three pairs of high-relatedness samples are all from different worms and from consecutive years, so in light of the semelparous biology of Guinea worm we interpret these as being parent-offspring pairs and these links thus represent putative direct transmission events between guinea worm infections.

thumbnail
Fig 5. Relatedness between Dracunculus medinensis samples.

Nodes on the graph represent worm samples, coloured by their country of origin, and node shapes indicate host species. Lines connect samples with high levels of identity by descent, indicative of direct relatedness. Thick lines indicate kinship > 0.45, whereas thinner lines indicate kinship between 0.45 and 0.235. For clarity, samples with other kinship coefficients are not connected in the graph, and sample names are shown only for those samples in high-relatedness pairs. Inset panel shows the distribution of kinship coefficients across all pairs of samples.

https://doi.org/10.1371/journal.pntd.0008623.g005

The three pairs we identify are all of significant epidemiological interest. One appears to confirm cross-border transmission, proposing that a worm emerging from a dog in Chad in 2015 was caused by a human case detected in South Sudan in 2014. A second pair links a human case in 2014 in Chad with a dog infection in 2015, apparently confirming transmission is possible between human cases and dog infections, while a third links two Chad dog infections in 2014 and 2015. One important note of caution is that all three of these events would imply long range transmission of the infection, with 1812km, 378km and 432km separating the three pairs of infections above, respectively; the two transmission events within Chad also imply movement in different directions on the Chari river basin (S7 Fig).

Discussion

Our data shows that a single population of D. medinensis is responsible for both dog infections and human cases in Chad, with genetic structure in D. medinensis being apparently driven by geographic separation rather than definitive host species. Our data suggest that all Guinea worm infections in African mammals are caused by a single species, D. medinensis. The two other species of Dracunculus with mammalian hosts for which we have sequence data are highly divergent from any D. medinensis specimen we investigated. Genetic variation does exist within D. medinensis in Africa, but follows a spatial pattern, with populations from South Sudan and Ethiopia being more closely related to worms from Chad, and more divergent population of D. medinensis being present in West African countries prior to the recent elimination of the parasite from that region. The set of samples we have investigated from Chad and East Africa show a particularly high genetic diversity, but also signals of a recent decline in population size (positive Tajima’s D), consistent with that expected to be caused by the ongoing work to eradicate Guinea worm in those areas.

We have identified three pairs of worms with high kinship that emerged in consecutive years, which we propose may represent transmission events. If so, our data confirm that cross-border transmission of Guinea worm infection can occur (from South Sudan to Chad in this case) and that infections can be passed from dog to dog and from humans to dogs. Unfortunately, we did not observe dog to human transmission directly in these data. This could be due to the small number of transmission events we could reconstruct, because these transmissions are rare, or a combination of these factors. Interpreting kinship in an inbred population is difficult, so these genealogical links must be considered only provisional, although we note that all three pairs of larval-adult samples for which we had good sequence coverage were correctly identified by this approach. While our data do not speak directly to the changes in lifecycle that might be driving transmission through dog hosts in Chad, both the long-range nature of these 3 transmission events and the fact that they imply different directions of movement along the Chari river basin would seem to lend some support to the idea that a paratenic or transport host could be involved. In particular, as one event is between two dog hosts, human movement may be less likely to be involved. It has been demonstrated experimentally that D. medinensis can pass through tadpoles as paratenic hosts and fish as transport hosts and that both routes can successfully infect ferrets [34, 63]. Furthermore, a frog naturally infected with D. medinensis has been found in Chad [64]. Wildlife infections are also being reported, for example with a number of infections recently reported in Baboons in Ethiopia [11].

Our coalescent models of the Guinea worm population genetic data appear to confirm the geographical structure of these populations, and that worms from human cases and dog infections in Chad form a single population. The estimates of population divergence dates imply that the genetic structure we observe between different regions of Africa predates recent control efforts and likely represents historical population structure. The oldest subdivision we observe, at around 20,000 years ago, coincides with the last glacial maximum when Africa was likely to be extremely arid, even compared to present-day conditions [65]. Similarly the more recent divergence between East African and Chad populations at around 4,000 years ago is during the drying-out of the Sahara at the end of the African humid period [65], which was probably accompanied by a major collapse in human habitation of much of this region [66]. Although our qualitative results appear robust, there are more caveats with the specific quantitative results, due both to the nature of our samples as mixtures of maternal and offspring genetic material and to our limited knowledge of the basic transmission genetics of Guinea worm. In particular, these estimates depend on assumptions about the mutation rate and generation time of D. medinensis. It is generally accepted that Guinea worm infections take approximately 10–14 months to reach patency in human infections [1, 8]. Less certain is whether larvae can remain viable in copepods or within a paratenic host for extended periods of time. No direct measurement of the mutation rate is available for D. medinensis or any related parasitic nematode, and while mutation rates are reasonably consistent across eukaryotes with similar genome sizes [67], variation of several fold from the value we have assumed would not be very surprising. We also note that the relative values of divergence time and population size estimates will remain unchanged under different mutation rates.

Our quantitative model suggests that all three present-day populations have large average effective population size (Ne) (of the order of thousands to low tens of thousands) over thousands of years. The modelling approach we have used is not able to detect more recent changes in these populations, and interpreting these estimates is challenging, as genetic effective population sizes are influenced by many factors such as breeding systems, demography and selection. In particular, historical fluctuations in population size have a strong influence on Ne, approximated by the geometric mean of the population sizes across generations (pp225-226 of [61]). While we have not exhaustively investigated possible demographic scenarios that might be consistent with these genomic data, the high Ne in Chad appears to exclude the possibility that the population of worms in Chad either disappeared or was reduced to a very small bottleneck during the decade without reported human cases. Our Ne is difficult to reconcile with a population size during this time much below hundreds of worms, but we cannot exclude the possibility of a much shorter bottleneck during the period without cases, or a very much higher ancestral population or some combination of these factors. In the absence of Chad samples prior to 2000 or more extensive sampling from neighbouring countries, we cannot exclude the possibility that the Chad worms we analyse–which all emerged in Chad following the 10-year gap in reported cases–migrated from elsewhere. However, we see few Chad worms that are closely related to worms from any of the neighbouring countries for which we had access to samples, so this possibility is purely speculation, and it would seem that quite large-scale influx would be required to explain the level of diversity we see in Chad by migration. Without historical samples, it also remains uncertain to what extent the population structure we see in African Guinea worm today would have been different 30 years ago, when the census population size of the worms was more than three orders of magnitude higher and worms were still widespread in Africa. Our coalescence model suggests that at least Chad and the East African populations we have sampled were still largely distinct at this time, but we have not been able to obtain worm samples suitable for molecular analysis from much of the ancestral range of D. medinensis.

A limitation is the nature of samples available to us, and in particular the very small quantity of genuinely adult material present in specimens despite these being very large for a parasitic nematode. Enrichment methods targeting parasite over host DNA cannot enrich for adult versus larval DNA and it is operationally difficult to alter the way that material is collected in the field in the context of the eradication campaign. The nature of our existing samples as mixtures of many diploid individuals makes some forms of analysis challenging, and is likely to have had an effect on many of our quantitative results–for example in the population genetic summary statistics and coalescent models–in a way that is hard to estimate without more data from genuinely ‘individual worms’–i.e without larval material present. Some forms of analysis appear impossible with our current data: for example, many of the most sensitive signatures of inbreeding we expect to see appearing as the population size declines rely on changes in the level and distribution of homozygous and heterozygous sites [68]. These are not readily apparent in analysis of the data presented here, presumably because of the mixture of genotypes present in each sample. These may be particularly complex if, like many other parasitic nematodes D. medinensis is polyandrous [6971]. We are currently generating sequence data from individual L1 larvae which should let us look for these signals, dissect the contribution of different males to a brood, and infer recombination and mutation rates in D. medinensis, avoiding the need to rely on estimates from C. elegans, which is both very distantly related to D. medinensis and has, of course, a very different life history. We have recently demonstrated the feasibility of this approach in a different parasitic nematode system [71]. Efforts to extract useful genome-wide information from the low-quality D. medinensis samples not analysed here are ongoing, with results from a sequence capture approach showing some promise. Our results are consistent with the findings of previously published targeted genotyping with mitochondrial and microsatellite markers, which also produced additional insights into the population genetics of D. medinensis from a much more extensive set of parasite samples [33].

Finally, the data we present here, together with other data from Guinea worm populations [15,25,33] preserve something of the genetics of D. medinensis in the final foci of infection. The genome sequence should help preserve some of the biology of this important human pathogen following the extinction of D. medinensis with eradication, but more importantly we expect these data to be crucial in the final steps of the eradication process. By defining much of the currently existing diversity of Guinea worm, these data will act as a reference to determine whether future cases for which the source of infection is unclear represent continuing transmission from these foci or previously unidentified worm populations. At the time the global campaign to eradicate Guinea worm began, the strong expert consensus was that Guinea worm was anthroponotic, and transmitted exclusively through ingestion of contaminated drinking water [9]. The emergence of large numbers of dog infections in Chad could not have been predicted, and the eradication campaign has uncovered other unexpected aspects of Guinea worm biology or epidemiology, such as the possibility of paratenic or transport hosts being involved in the lifecycle [34,64]. For example, a recent surprise is the emergence of Guinea worm infections in Angola, which has no previous history of Guinea worm disease [72]. The long incubation time may make it particularly easy for Guinea worm to take advantage of human movement to spread, leading to these kinds of sporadic cases. It is also likely that incorrect reports of emerging worms will appear post-eradication [73]: given the paucity of morphological features defining D. medinensis, molecular tools will be key in providing certainty about the pathogen involved, and thus ultimately in allowing the WHO to declare that the world is free of Guinea worm.

Our work has clear implications for other parasite systems as we move into an era intended to see enhanced control efforts, regional elimination and even eradication for several neglected tropical disease parasites [74]. The Guinea worm eradication program has in many ways set the scene for these efforts in other parasites. The small size of the remaining Guinea worm populations means it should be particularly feasible to employ whole-genome approaches to track changes in Guinea worm populations during the final stages of eradication [75], but the particular difficulties in generating high-quality sequence data and in interpreting these data for D. medinensis highlight the fact that every pathogen system is unique, and genetic surveillance will likely face unique challenges in each case. Whether the particular challenges of an apparently emerging zoonotic transmission cycle in the endgame of eradication are unique remains to be seen as programs for other pathogens advance. It seems clear that the endgame of elimination has different requirements to much of the process of reducing disease burden [76] and the strong selection pressure on pathogen populations to evade control measures near eradication will result in evolutionary responses. The ecological changes apparently occurring in Guinea worm may be the equivalent of the evolution of drug resistance in chemotherapy-lead campaigns [77].

Our results are entirely consistent with a single population of D. medinensis infecting both dogs and humans in Chad. We show genetic variation within D. medinensis is largely geographical, with significant differentiation between populations present in Chad, and those present in countries in East Africa (South Sudan and Ethiopia) and West Africa (Côte d’Ivoire, Ghana, Mali and Niger). Worms that were genetically very similar were recovered from human cases and animal infections in both Chad and Ethiopia. We find a particularly diverse population of worms in Chad and East Africa that appears to be shrinking, presumably due to the eradication program. Coalescent models confirm that a single population of worms infects both dogs and humans in Chad, and the long-term effective population size suggests that a significant Guinea worm population persisted in Chad during the ten-year period prior to 2010 during which no cases were reported. Kinship analysis shows that the Guinea worm population is highly inbred, as we might expect in a small and shrinking population, and suggests direct relatedness between 3 pairs of worms, including two recovered from human cases in one year and recovered from dogs in a subsequent season. In the context of epidemiological data and previous genetic data, this suggests that dog infections are likely to be central to maintaining Guinea worm transmission in Chad. Continued efforts to understand the biology of transmission in Chad, as well as sustained surveillance among both human and non-human hosts, will help ensure the continuing success of the eradication program.

Supporting information

S1 Fig. Proportion of sequencing reads mapping to the reference genome assembly for Dracunculus medinensis samples.

Each bar indicates the proportion of sequencing reads from each sample that mapped against the reference genome assembly. The density of each bar indicates whether whole-genome data is included in our analysis, only mitochondrial genome data or whether insufficient data was available for that sample.

https://doi.org/10.1371/journal.pntd.0008623.s001

(TIF)

S2 Fig. Sources of contamination in sequencing libraries.

Number of reads inferred by k-mer analysis to originate from different phyla. Data are shown for all phyla to which at least 50,000 reads were assigned. 68 samples are shown: those not shown here did not match any phyla with his cut-off. Note that no nematode sequences are in the database used for this search (see Methods).

https://doi.org/10.1371/journal.pntd.0008623.s002

(TIF)

S3 Fig. Synteny between Dracunculus medinensis and Onchocerca volvulus scaffolds.

Lines connect sequences for which the conceptual amino acid translations are at least 50% identical over 250 amino acids. D. medinensis scaffolds highlighted in orange are those shown in Fig 2A. Note that one of the longest scaffolds matches to the opposite end of the X-chromosome scaffold in O. volvulus to the scaffolds with reduced coverage in male worms. This region of O. volvulus X was not part of the ancestral filarial X chromosome (Cotton et al., 2016), and so is not expected to be part of D. medinensis X and is thus labelled as autosomal in Fig 2A, and considered as autosomal in our analyses here. D. medinensis scaffolds with reduced male coverage are shown inset.

https://doi.org/10.1371/journal.pntd.0008623.s003

(TIF)

S4 Fig. PCA axes 3–8 for Dracunculus medinensis variation data; axes 1 and 2 are shown in main text Fig 3C.

https://doi.org/10.1371/journal.pntd.0008623.s004

(TIF)

S5 Fig. Bayesian assignment of individual samples to populations from Structure, for k = 2, 3 and 4 hypothetical populations.

Vertical bars represent individuals, with the proportion of each color in each bar representing the proportion of inferred ancestry of that individual from the population. Shapes and colours of symbols along the x-axis indicate host and country worms were collected from.

https://doi.org/10.1371/journal.pntd.0008623.s005

(TIF)

S6 Fig. Fst between Dracunculus medinensis samples from dogs and humans in Chad, across the three longest autosomal scaffolds.

Values shown are mean Fst for non-overlapping 1kb windows centered at the position shown on the x-axis.

https://doi.org/10.1371/journal.pntd.0008623.s006

(TIF)

S7 Fig. Transmission events implied by three parent-offspring pairs inferred from high genome-wide identity by descent between worms isolated in consecutive years.

Sample locations are indicated by dots, colour-coded by country of isolation. Red arrows indicate inferred parent-offspring relationships between samples; samples involved in these links are highlighted by dark rings around the point at which the infection was detected. The locations of detection may not represent the locations at which infections were acquired, or the location of residence of the hosts.

https://doi.org/10.1371/journal.pntd.0008623.s007

(TIF)

S1 Table. Details of Dracunculus medinensis samples, sequencing data and sequencing libraries used in this study.

Note that mean and median coverage are defined over the whole nuclear genome assembly for both MIT and NUC samples. Reads and mapping statistics are for the sum across all sequenced libraries and lanes. ENA = European Nucleotide Archive.

https://doi.org/10.1371/journal.pntd.0008623.s008

(DOCX)

S2 Table. Details of Dracunculus insignis and D. lutrae samples, sequencing data and libraries used in this study.

Reads and mapping statistics are for the sum across all sequenced libraries and lanes. ENA = European Nucleotide Archive.

https://doi.org/10.1371/journal.pntd.0008623.s009

(DOCX)

S3 Table. Influence of different prior distribution assumptions on results of coalescence analysis.

Values are means of the posterior distribution and 95% highest posterior density confidence intervals. Θ values are effective population size estimates and τ are divergence time estimates. East+Chad and Africa representing the two ancestral populations, as shown on Fig 4.

https://doi.org/10.1371/journal.pntd.0008623.s010

(DOCX)

Acknowledgments

This paper is dedicated to the memory of Dr Caroline Durrant. We thank the national Guinea worm eradication programs for collecting and making worm specimens and associated data available and the support of the Guinea worm program teams from the Carter Center and Centers for Disease Control and Eradication.

References

  1. 1. Muller R. Dracunculus and dracunculiasis. Adv Parasitol. 1971;9:73–151. pmid:4252302
  2. 2. Watts SJ. Dracunculiasis in Africa in 1986: its geographic extent, incidence, and at-risk population. Am J Trop Med Hyg. 1987;37(1):119–25. pmid:2955710
  3. 3. Stoll NR. This wormy world. Journal of Parasitology. 1947;32:1–18. pmid:20284977
  4. 4. Weiss AJ, Vestergaard Frandsen T, Ruiz-Tiben E, Hopkins DR, Aseidu-Bekoe F, Agyemang D. What It Means to Be Guinea Worm Free: An Insider's Account from Ghana's Northern Region. Am J Trop Med Hyg. 2018;98(5):1413–8. pmid:29557333
  5. 5. Ruiz-Tiben E, Hopkins DR. Dracunculiasis (Guinea worm disease) eradication. Adv Parasitol. 2006;61:275–309. pmid:16735167
  6. 6. Muller R. Guinea worm disease: epidemiology, control, and treatment. Bull World Health Organ. 1979;57(5):683–9. pmid:161522
  7. 7. Bourne PG. Global eradication of guinea worm. J R Soc Med. 1982;75(1):1–3. pmid:6460099
  8. 8. Cairncross S, Muller R, Zagaria N. Dracunculiasis (Guinea worm disease) and the eradication initiative. Clin Microbiol Rev. 2002;15(2):223–46. pmid:11932231
  9. 9. Hopkins DR, Ruiz-Tiben E. Strategies for dracunculiasis eradication. Bull World Health Organ. 1991;69(5):533–40. pmid:1835673
  10. 10. Centers for Disease Control and Prevention. Guinea worm wrap-up #110. Atlanta, GA. 2001.
  11. 11. Hopkins DR, Ruiz-Tiben E, Eberhard ML, Weiss A, Withers PC, Roy SL, et al. Dracunculiasis Eradication: Are We There Yet? Am J Trop Med Hyg. 2018;99(2):388–95. pmid:29869608
  12. 12. Centers for Disease Control and Prevention. Guinea worm wrap-up #267. Atlanta, GA. 2020.
  13. 13. Grassly NC, Orenstein WA. Securing the Eradication of All Polioviruses. Clin Infect Dis. 2018;67(suppl_1):S1–S3. pmid:30376096
  14. 14. Eberhard ML, Ruiz-Tiben E, Hopkins DR. Dogs and Guinea worm eradication. Lancet Infect Dis. 2016;16(11):1225–6. pmid:27788982
  15. 15. Eberhard ML, Ruiz-Tiben E, Hopkins DR, Farrell C, Toe F, Weiss A, et al. The peculiar epidemiology of dracunculiasis in Chad. Am J Trop Med Hyg. 2014;90(1):61–70. pmid:24277785
  16. 16. World Health Organisation. Dracunculus eradication in Uzbekistan, report WHO/CDS/CEE/DRA/99.9. Geneva. 1998.
  17. 17. Litvinov SK, Lysenko A. Dracunculiasis: its history and eradication in the USSR. Workshop on opportunities for control of Dracunculiasis. Washington, D.C.: National Academy Press. 1985.
  18. 18. Guagliardo SAJ, Roy SL, Ruiz-Tiben E, Zirimwabagabo H, Romero M, Chop E, et al. Guinea worm in domestic dogs in Chad: A description and analysis of surveillance data. PLoS Negl Trop Dis. 2020;14(5):e0008207. pmid:32463811
  19. 19. McDonald RA, Wilson-Aggarwal JK, Swan GJF, Goodwin CED, Moundai T, Sankara D, et al. Ecology of domestic dogs Canis familiaris as an emerging reservoir of Guinea worm Dracunculus medinensis infection. PLoS Negl Trop Dis. 2020;14(4):e0008170. pmid:32310976
  20. 20. Hopkins DR, Ruiz-Tiben E, Eberhard ML, Roy SL, Weiss AJ. Progress Toward Global Eradication of Dracunculiasis, January 2016-June 2017. MMWR Morb Mortal Wkly Rep. 2017;66(48):1327–31. pmid:29216028
  21. 21. Cleveland CA, Garrett KB, Cozad RA, Williams BM, Murray MH, Yabsley MJ. The wild world of Guinea worms: A review of the genus Dracunculus in wildlife. IJP: Parasites and Wildlife. 2018;7:289–300. pmid:30094178
  22. 22. Jones HI, Mulder E. Dracunculus mulbus n. sp. (Nematoda: Spirurida) from the water python Liasis fuscus (Serpentes: Boidae) in northern Australia. Syst Parasitol. 2007;66(3):195–205. pmid:16972152
  23. 23. Travassos L. Dracunculus fuelleborni n. sp., parasito de Didelphis aurita Wied. Memórias do Instituto Oswaldo Cruz. 1934;28(2):235–7.
  24. 24. Wijova M, Moravec F, Horak A, Modry D, Lukes J. Phylogenetic position of Dracunculus medinensis and some related nematodes inferred from 18S rRNA. Parasitol Res. 2005;96(2):133–5. pmid:15812670
  25. 25. Bimi L, Freeman AR, Eberhard ML, Ruiz-Tiben E, Pieniazek NJ. Differentiating Dracunculus medinensis from D. insignis, by the sequence analysis of the 18S rRNA gene. Ann Trop Med Parasitol. 2005;99(5):511–7. pmid:16004710
  26. 26. Elsasser SC, Floyd R, Hebert PD, Schulte-Hostedde AI. Species identification of North American guinea worms (Nematoda: Dracunculus) with DNA barcoding. Mol Ecol Resour. 2009;9(3):707–12. pmid:21564728
  27. 27. Nadler SA, Carreno RA, Mejia-Madrid H, Ullberg J, Pagan C, Houston R, et al. Molecular phylogeny of clade III nematodes reveals multiple origins of tissue parasitism. Parasitology. 2007;134(Pt 10):1421–42. pmid:17506928
  28. 28. Cole R, Viney M. The population genetics of parasitic nematodes of wild animals. Parasit Vectors. 2018;11(1):590. pmid:30424774
  29. 29. Biswas G, Sankara DP, Agua-Agum J, Maiga A. Dracunculiasis (guinea worm disease): eradication without a drug or a vaccine. Philos Trans R Soc Lond B Biol Sci. 2013;368(1623):20120146. pmid:23798694
  30. 30. Al-Awadi AR, Al-Kuhlani A, Breman JG, Doumbo O, Eberhard ML, Guiguemde RT, et al. Guinea worm (Dracunculiasis) eradication: update on progress and endgame challenges. Trans R Soc Trop Med Hyg. 2014;108(5):249–51. pmid:24699360
  31. 31. Molyneux D, Sankara DP. Guinea worm eradication: Progress and challenges- should we beware of the dog? PLoS Negl Trop Dis. 2017;11(4):e0005495. pmid:28426663
  32. 32. International Helminth Genomes Consortium. Comparative genomics of the major parasitic worms. Nat Genet. 2019;51:163–164. pmid:30397333
  33. 33. Thiele EA, Eberhard ML, Cotton JA, Durrant C, Berg J, Hamm K, et al. Population genetic analysis of Chadian Guinea worms reveals that human and non-human hosts share common parasite populations. PLoS Negl Trop Dis. 2018;12(10):e0006747. pmid:30286084
  34. 34. Eberhard ML, Yabsley MJ, Zirimwabagabo H, Bishop H, Cleveland CA, Maerz JC, et al. Possible Role of Fish and Frogs as Paratenic Hosts of Dracunculus medinensis, Chad. Emerg Infect Dis. 2016;22(8):1428–30. pmid:27434418
  35. 35. Kozarewa I, Ning Z, Quail MA, Sanders MJ, Berriman M, Turner DJ. Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes. Nat Methods. 2009;6(4):291–5. pmid:19287394
  36. 36. Hunt M, Kikuchi T, Sanders M, Newbold C, Berriman M, Otto TD. REAPR: a universal tool for genome assembly evaluation. Genome Biol. 2013;14(5):R47. pmid:23710727
  37. 37. Bonfield JK, Whitwham A. Gap5—editing the billion fragment sequence assembly. Bioinformatics. 2010;26(14):1699–703. pmid:20513662
  38. 38. Tsai IJ, Otto TD, Berriman M. Improving draft assemblies by iterative mapping and assembly of short reads to eliminate gaps. Genome Biol. 2010;11(4):R41. pmid:20388197
  39. 39. Nadalin F, Vezzi F, Policriti A. GapFiller: a de novo assembly approach to fill the gap within paired reads. BMC Bioinformatics. 2012;13 Suppl 14:S8. pmid:23095524
  40. 40. Otto TD, Sanders M, Berriman M, Newbold C. Iterative Correction of Reference Nucleotides (iCORN) using second generation sequencing technology. Bioinformatics. 2010;26(14):1704–7. pmid:20562415
  41. 41. Derrien T, Estelle J, Marco Sola S, Knowles DG, Raineri E, Guigo R, et al. Fast computation and applications of genome mappability. PLoS One. 2012;7(1):e30377. pmid:22276185
  42. 42. Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5(2):R12. pmid:14759262
  43. 43. Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, et al. Circos: an information aesthetic for comparative genomics. Genome Res. 2009;19(9):1639–45. pmid:19541911
  44. 44. Felsenstein J. PHYLIP (Phylogeny Inference Package) version 3.6; 2005.
  45. 45. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2015.
  46. 46. Korneliussen TS, Albrechtsen A, Nielsen R. ANGSD: Analysis of Next Generation Sequencing Data. BMC Bioinformatics. 2014;15:356. pmid:25420514
  47. 47. Korneliussen TS, Moltke I, Albrechtsen A, Nielsen R. Calculation of Tajima's D and other neutrality test statistics from low depth next-generation sequencing data. BMC Bioinformatics. 2013;14:289. pmid:24088262
  48. 48. Fumagalli M, Vieira FG, Korneliussen TS, Linderoth T, Huerta-Sanchez E, Albrechtsen A, et al. Quantifying population genetic differentiation from next-generation sequencing data. Genetics. 2013;195(3):979–92. pmid:23979584
  49. 49. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. pmid:19505943
  50. 50. Wickham H. ggplot2: elegant graphics for data analysis. New York: Springer-Verlag; 2009.
  51. 51. Verity R, Nichols RA. Estimating the Number of Subpopulations (K) in Structured Populations. Genetics. 2016;203(4):1827–39. pmid:27317680
  52. 52. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155(2):945–59. pmid:10835412
  53. 53. Flouri T, Jiao X, Rannala B, Yang Z. Species Tree Inference with BPP Using Genomic Sequences and the Multispecies Coalescent. Mol Biol Evol. 2018;35(10):2585–93. pmid:30053098
  54. 54. Gronau I, Hubisz MJ, Gulko B, Danko CG, Siepel A. Bayesian inference of ancient human demography from individual genome sequences. Nat Genet. 2011;43(10):1031–4. pmid:21926973
  55. 55. Denver DR, Dolan PC, Wilhelm LJ, Sung W, Lucas-Lledo JI, Howe DK, et al. A genome-wide view of Caenorhabditis elegans base-substitution mutation processes. Proc Natl Acad Sci U S A. 2009;106(38):16310–4. pmid:19805298
  56. 56. Yang Z, Rannala B. Bayesian species delimitation using multilocus sequence data. Proc Natl Acad Sci U S A. 2010;107(20):9264–9. pmid:20439743
  57. 57. Rannala B, Yang Z. Efficient Bayesian Species Tree Inference under the Multispecies Coalescent. Syst Biol. 2017;66(5):823–42 pmid:28053140
  58. 58. Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26(22):2867–73. pmid:20926424
  59. 59. Cotton JA, Bennuru S, Grote A, Harsha B, Tracey A, Beech R, et al. The genome of Onchocerca volvulus, agent of river blindness. Nat Microbiol. 2016;2:16216. pmid:27869790
  60. 60. Post R. The chromosomes of the Filariae. Filaria Journal. 2005;4:10. pmid:16266430
  61. 61. Charlesworth B, Charlesworth D. Elements of evolutionary genetics. Greenwood, CO: Roberts and Company; 2010.
  62. 62. Tajima F. The effect of change in population size on DNA polymorphism. Genetics. 1989;123(3):597–601. pmid:2599369
  63. 63. Cleveland CA, Eberhard ML, Thompson AT, Smith SJ, Zirimwabagabo H, Bringolf R, et al. Possible Role of Fish as Transport Hosts for Dracunculus spp. Larvae. Emerg Infect Dis. 2017;23(9):1590–2. pmid:28820381
  64. 64. Eberhard ML, Cleveland CA, Zirimwabagabo H, Yabsley MJ, Ouakou PT, Ruiz-Tiben E. Guinea Worm (Dracunculus medinensis) Infection in a Wild-Caught Frog, Chad. Emerg Infect Dis. 2016;22(11):1961–2. pmid:27560598
  65. 65. Hoag C, Svenning JC. African Environmental Change from the Pleistocene to the Anthropocene. Annu Rev Env Resour. 2017;42:27–54.
  66. 66. Manning K, Timpson A. The demographic response to Holocene climate change in the Sahara. Quaternary Sci Rev. 2014;101:28–35.
  67. 67. Lynch M. Evolution of the mutation rate. Trends Genet. 2010;26(8):345–52. Epub 2010/07/03. pmid:20594608
  68. 68. Diez-Del-Molino D, Sanchez-Barreiro F, Barnes I, Gilbert MTP, Dalen L. Quantifying Temporal Genomic Erosion in Endangered Species. Trends Ecol Evol. 2018;33(3):176–85. pmid:29289355
  69. 69. Redman E, Grillo V, Saunders G, Packard E, Jackson F, Berriman M, et al. Genetics of mating and sex determination in the parasitic nematode Haemonchus contortus. Genetics. 2008;180(4):1877–87. pmid:18854587
  70. 70. Zhou C, Yuan K, Tang X, Hu N, Peng W. Molecular genetic evidence for polyandry in Ascaris suum. Parasitol Res. 2011;108(3):703–8. pmid:20949281
  71. 71. Doyle SR, Laing R, Bartley DJ, Britton C, Chaudhry U, Gilleard JS, et al. A Genome Resequencing-Based Genetic Map Reveals the Recombination Landscape of an Outbred Parasitic Nematode in the Presence of Polyploidy and Polyandry. Genome Biology and Evolution. 2018;10(2):396–409. pmid:29267942
  72. 72. Centers for Disease Control and Prevention. Guinea worm wrap-up #256. Atlanta, GA. 2018.
  73. 73. Mbong EN, Sume GE, Danbe F, Kum WK, Mbi VO, Fouda AA, et al. Not every worm wrapped around a stick is a guinea worm: a case of Onchocerca volvulus mimicking Dracunculus medinensis. Parasit Vectors. 2015;8:374. pmid:26178636
  74. 74. World Health Organisation. Accelerating work to overcome the global impact of neglected tropical diseases—a roadmap for implementation. Geneva, Switzerland: World Health Organisation, 2012.
  75. 75. Cotton JA, Berriman M, Dalen L, Barnes I. Eradication genomics-lessons for parasite control. Science. 2018;361(6398):130–1. pmid:30002241
  76. 76. Klepac P, Metcalf CJ, McLean AR, Hampson K. Towards the endgame and beyond: complexities and challenges for the elimination of infectious diseases. Philos Trans R Soc Lond B Biol Sci. 2013;368(1623):20120137. pmid:23798686
  77. 77. Whitty CJ. Milroy Lecture: eradication of disease: hype, hope and reality. Clin Med (Lond). 2014;14(4):419–21. pmid:25099846