Targeted Sequencing of Genomic Repeat Regions Detects Circulating Cell-free Echinococcus DNA

Background
Echinococcosis is a chronic zoonosis caused by tapeworms of the genus Echinococcus. Treatment of the disease is often expensive and complicated, sometimes requiring extensive surgery. Ultrasonographic imaging is currently the main diagnostic technique, while immunological analysis provides additional information. Confirmation still requires pathological analysis. However, these diagnostic techniques generally detect infection only in late stages of the disease. An accurate, early and non-invasive molecular diagnostic method is still unavailable.
Methodology/Principal findings
We sequenced the cell-free DNA (cfDNA) from plasma of echinococcosis patients and confirmed the presence of Echinococcus DNA. To improve detection sensitivity, we developed a method based on targeted next-generation sequencing of repeat regions. Simulation experiments demonstrate that the targeted sequencing is sensitive enough to detect as little as 0.1% of an Echinococcus genome in 1 mL of plasma. Results obtained using patient plasma show that the area under the curve (AUC) of the method is 0.862, with a detection sensitivity of 62.50% and specificity of 100%, corresponding to a Youden index of 0.625.
Conclusions/Significance
This study provides evidence that hydatid cysts release cfDNA fragments into patient plasma. Using the repeat region targeted sequencing method, highly specific detection of Echinococcus infection was achieved. This study opens a new avenue for potential non-invasive screening and diagnosis of echinococcosis.


Introduction
Echinococcosis is a severe parasitic disease that predominantly affects agricultural and pastoral areas, especially in South America, Africa, and Asia [1,2]. The etiological agents of echinococcosis are tapeworms of the genus Echinococcus [3]. Two main forms of the disease occur worldwide: cystic echinococcosis (CE), caused by dog-transmitted E. granulosus, and alveolar echinococcosis (AE), caused by fox-transmitted E. multilocularis [2]. Humans are accidental intermediate hosts in the life cycle of Echinococcus. CE, also known as hydatid disease, is the most common form worldwide. Once an individual is infected by E. granulosus, the larvae of the tapeworm develop into fluid-filled cysts in various organs [4]. As the cysts grow over time, compression causes symptoms and can permanently damage the affected organs [2,5]. In contrast, the larvae of E. multilocularis grow in a cancer-like manner, damaging infected organs by infiltrating surrounding tissues and sometimes even metastasizing to other organs.
Clinical symptoms of echinococcosis are atypical and often do not occur until late stages of the disease. Consequently, diagnosis of the disease, especially early diagnosis, is challenging. Ultrasonographic imaging, which is low-cost and rapid, is most commonly used [3,6,7]. Other imaging techniques with higher resolution, such as CT and MRI, are employed to detect lesions in specific anatomical locations or atypical echinococcosis [7]. The disadvantage of imaging techniques is that Echinococcus cysts are difficult to distinguish from other types of cysts. In addition, imaging requires relatively large cysts, so CE patients are typically already in late stages of the disease when the cysts become visible [8]. Serological tests may enable earlier diagnosis than imaging technology [9]. Several immunological methods to detect anti-Echinococcus antibodies have been developed [9]. However, the sensitivity and specificity of these immunological assays vary under different conditions, especially in cases of CE [10][11][12][13][14][15][16][17]. Therefore, immunological tests are generally used in combination with imaging techniques. Pathological examination of biopsy samples before surgery can be dangerous due to the regeneration capacity of the protoscolex and the risk of anaphylactic reaction during the biopsy procedure [7,8,18]. Thus, there is an urgent need for a diagnostic methodology that can detect Echinococcus infection etiologically and non-invasively.
Assays based on detecting Echinococcus-derived circulating antigens have also been developed and reported to have high specificity. Unfortunately, their sensitivity was relatively low [17][19][20][21][22][23][24]. Consequently, they are rarely used in the clinic. Detection of Echinococcus DNA by PCR, qPCR or LAMP (loop-mediated isothermal amplification) has also been reported [25][26][27][28][29][30][31]. However, these assays are only used for evaluating Echinococcus prevalence in dog or fox feces [25,26,28,29,31] or for Echinococcus genotyping [27,30]. Chaya et al. recently reported diagnosis of hydatid disease in humans using PCR detection of parasite DNA in patient serum samples. However, only 25% of hydatid patients tested positive, and all of these positive samples came from patients with a ruptured cyst confirmed by surgery [32]. No Echinococcus DNA was detected in the patients' urine samples [32]. To the best of our knowledge, no reliable molecular detection method is currently used for non-invasive clinical diagnosis of echinococcosis.
Cell-free DNA (cfDNA) refers to extracellular DNA in various body fluids, and its detection in fluids such as blood, urine, and saliva has received much attention. The main features of cfDNA include fragmentation, low abundance and fast degradation. The fragment length distribution of cfDNA peaks at around 166 bp, which is close to the length of DNA wrapped around a nucleosome [33,34]. The concentration of cfDNA is 1~10 ng per 1 mL of human plasma under normal circumstances [35]. It increases up to hundreds of nanograms after exercise [36] or under certain disease conditions [37][38][39]. The half-life of cfDNA in the free state is about 15 min, but cfDNA becomes more stable when bound to proteins [34,37,39]. cfDNA is commonly released during apoptosis and necrosis of cells from different tissues [40,41]. High-throughput sequencing can be used to analyze the source of cfDNA, allowing non-invasive, safe and accurate real-time monitoring of primary lesions. Also known as "liquid biopsy" [40], cfDNA detection is widely used in prenatal diagnosis [41][42][43], early tumor detection [44][45][46] and organ transplant monitoring [47][48][49][50]. Methods to detect diseases involving cell death in specific tissues based on cfDNA methylation patterns have also been developed [34,44][51][52][53].
Theoretically, DNA fragments from any foreign cells or organisms can be released into host blood, where they can be detected quickly and accurately by high-throughput sequencing [54][55][56]. In the present work, we demonstrate the presence of Echinococcus-derived cfDNA in blood plasma of both CE and AE patients. We further established an assay to detect the parasite DNA based on amplification of repeat regions followed by targeted next-generation sequencing (NGS). This technique opens the possibility of large-scale screening and diagnosis of echinococcosis with high specificity and efficiency.

Sample collection
Plasma and hydatid cyst fluid (HCF) samples from echinococcosis patients were collected in Qinghai province and Xinjiang province in northwest China. All patients fulfilled the diagnostic criteria for echinococcosis [57]. Of the 24 patients reported in this study, 21 were pathologically confirmed after their blood samples were collected. Three patients (E07, E15 and E20), who had previously been diagnosed with echinococcosis and had cysts surgically removed, declined surgery this time. However, they met the WHO-IWGE diagnostic criteria for probable cases (E07 and E20) or a possible case (E15). CE patients were mainly at stage CE1 or CE2 according to the WHO-IWGE classification. AE patients were at stage P1N0M0 or P2N0M0. None of the patients recruited to this study received chemotherapy before blood samples were collected. Patients with signs of cyst rupture, secondary infection, consolidation or calcification were excluded from the study. Control samples without Echinococcus infection and non-relevant control samples with Schistosoma infection were collected in Hunan province of central China. The study was approved by the Ethics Committee of Xiangya Hospital of Central South University (Approval No: 201610049).
All participants signed a written informed consent. The Standards for Reporting of Diagnostic Accuracy (STARD) checklist and participants flow chart are provided in S1 Checklist and S1 Fig, respectively.
For plasma samples, 10 mL of whole blood was collected from each patient and control individual using Streck cfDNA blood collection tubes (218997, Streck, USA). The blood was centrifuged at 1,600 g for 10 min at 4˚C. The plasma was transferred to a fresh 1.5 mL tube, followed by centrifugation at 16,000 g for 10 min at 4˚C to remove remaining cell debris. The plasma was then used for cfDNA extraction. For HCF samples, five intact unilocular hydatid cysts were surgically removed from five unrelated cystic echinococcosis patients. After washing with saline, 3~4 mL of HCF was carefully aspirated from each cyst. HCF samples were centrifuged at 16,000 g for 10 min at 4˚C. Supernatants were transferred to fresh tubes for cfDNA extraction. cfDNA was extracted from 3 mL of plasma or HCF using the QIAGEN circulating nucleic acid extraction kit (55114, QIAGEN, Germany) according to the manufacturer's protocol.

Untargeted cfDNA sequencing
NGS libraries were prepared from 30 ng of cfDNA from plasma or HCF using the standard protocol and reagents of the NEBNext ultra II library kit (E7645, NEB, USA). Because cfDNA fragments are short, no further fragmentation was performed. Libraries were then sequenced on an Illumina HiSeq X. For each library, 35 to 50 million reads were generated using the PE150 sequencing strategy. Raw data was first processed using trim-galore (Version 0.4.4) in paired-end mode with default parameters to cut adapter sequences and remove short (< 20 bp) or low-quality (Phred score < 20) reads. Duplicated reads were then removed using FastUniq (Version 1.1). For plasma samples, the de-duplicated reads were first mapped to a human reference genome (version hg38) using the bwa-mem algorithm with default parameters. Unmapped reads were extracted using SAMtools and searched against the NCBI nt database, which included 43,107,468 sequences in total (last updated in Jun 2017), using BLAST (Basic Local Alignment Search Tool, command line version 2.6.0+). The identity cut-off was 95%. Reads uniquely assigned to E. granulosus or E. multilocularis by BLAST were counted. In a "remapping strategy", no-hit reads from the BLAST analysis were collected and mapped to the reference genomes of E. granulosus (ASM52419v1) or E. multilocularis (EMULTI002) using the bwa-mem algorithm with default parameters. For HCF samples, data was preprocessed as described above and then mapped to a reference genome of E. granulosus (ASM52419v1) using bwa-mem. Unmapped reads were extracted and aligned to a human reference genome (version hg38).
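The read-counting step after the BLAST search can be illustrated with a minimal sketch. It assumes that the BLAST results have already been reduced to (read, species, percent identity) tuples and that "uniquely assigned" means every hit at or above the 95% identity cut-off for a read points to the same species. Both the input layout and the function name are hypothetical and are not taken from the authors' pipeline.

```python
from collections import defaultdict

IDENTITY_CUTOFF = 95.0  # percent identity cut-off stated in the text
TARGETS = {"Echinococcus granulosus", "Echinococcus multilocularis"}


def count_unique_echinococcus_reads(hits):
    """Count reads whose qualifying BLAST hits all point to one target species.

    `hits` is an iterable of (read_id, species_name, percent_identity) tuples.
    This layout is an assumption for illustration, not the authors' format.
    """
    species_by_read = defaultdict(set)
    for read_id, species, pident in hits:
        if pident >= IDENTITY_CUTOFF:
            species_by_read[read_id].add(species)

    counts = {name: 0 for name in TARGETS}
    for species in species_by_read.values():
        if len(species) == 1:          # the read is assigned to a single species
            (only,) = species
            if only in TARGETS:
                counts[only] += 1
    return counts


# Made-up example hits, for illustration only.
example_hits = [
    ("r1", "Echinococcus granulosus", 98.0),
    ("r2", "Echinococcus granulosus", 99.0),
    ("r2", "Homo sapiens", 97.0),            # ambiguous read, not counted
    ("r3", "Echinococcus multilocularis", 96.5),
    ("r4", "Echinococcus granulosus", 90.0), # below the identity cut-off
]
print(count_unique_echinococcus_reads(example_hits))
```

Reads with hits to more than one species are discarded rather than split between species, which keeps the count conservative.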

Identification of repeat sequences of E. granulosus and E. multilocularis
Repeat sequences were identified using the RepeatExplorer web server on the Galaxy platform following the instructions provided in the online manual (http://repeatexplorer.org/) [58]. E. granulosus and E. multilocularis sequencing data (ERR112220 and ERR065034, respectively) downloaded from the SRA database were used as input sequences. High copy number hits were selected using a genome proportion > 0.01% as the criterion. To confirm sequence specificity, each unique high copy number sequence was analyzed using BLAST (command line version 2.6.0+). Repeat sequences with hits to species other than Echinococcus spp. in the NCBI nt database at a percent identity > 95% were removed. Echinococcus-specific repeat sequences with lengths between 70 and 2000 bp were selected. These filtered sequences were then used for primer design.
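The three selection criteria described above (genome proportion, off-target identity and length) can be expressed as a simple filter. The sketch below is illustrative only: the `RepeatCandidate` record and its field names are hypothetical, and it assumes the RepeatExplorer and BLAST outputs have already been summarized per sequence.

```python
from dataclasses import dataclass


@dataclass
class RepeatCandidate:
    name: str
    sequence: str
    genome_proportion_pct: float       # share of the genome, in percent
    max_offtarget_identity_pct: float  # best identity to a non-Echinococcus hit; 0 if none


def select_repeats(candidates, min_proportion_pct=0.01, max_offtarget_pct=95.0,
                   min_len=70, max_len=2000):
    """Keep repeats that pass the thresholds quoted in the text."""
    kept = []
    for c in candidates:
        if c.genome_proportion_pct <= min_proportion_pct:
            continue  # not a high copy number repeat
        if c.max_offtarget_identity_pct > max_offtarget_pct:
            continue  # too similar to a sequence from another species
        if not (min_len <= len(c.sequence) <= max_len):
            continue  # outside the 70 to 2000 bp window
        kept.append(c)
    return kept


# Made-up candidates, for illustration only.
candidates = [
    RepeatCandidate("good", "ACGT" * 25, 0.05, 0.0),
    RepeatCandidate("rare", "ACGT" * 25, 0.005, 0.0),
    RepeatCandidate("offtarget", "ACGT" * 25, 0.05, 97.0),
    RepeatCandidate("short", "ACGT" * 10, 0.05, 0.0),
]
print([c.name for c in select_repeats(candidates)])
```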

Primer design and validation
Primers were designed from the identified highly repeated sequences using the NCBI online primer design tool (https://www.ncbi.nlm.nih.gov/tools/primer-blast/). One or two primer pairs were designed for each repeat sequence, with product sizes restricted to 70~100 bp and a Tm of 60 ± 1˚C. Primer specificity was checked against the nr/nt database to exclude primers likely to produce unintended products.
Each pair of primers was first validated by conventional PCR with a reaction mixture containing 10 μL of Takara Premix EX Taq (RR030A, Takara, Japan), 1 μL primers with a concentration of 10 μmol/L, 100 ng of human genomic DNA with or without 1pg of HCF DNA for the test and control group, respectively. The total reaction volume was brought to 20 μl using ultrapure water (10977023, Invitrogen, USA). The background human DNA was mixed human genomic DNA extracted from whole blood of 5 different individuals. The human genomic DNA was fragmented to size of approximately 160bp using the S2 Focused-ultrasonicator (Covaris, USA). For tests with a reduced amount of template DNA, 100 fg of HCF DNA combined with 10 ng of fragmented human genomic DNA was used as PCR templates. The PCR reaction was performed with an initial denaturation at 95˚C for 5 min, followed by 30 cycles of 30 seconds at 95˚C for denaturation, 30 seconds at 58˚C for annealing and 15 seconds at 72˚C for extension. The PCR products of each pair of primers were analyzed on a 2100 bioanalyzer using an Agilent DNA 1000 kit (5067-1504, Agilent, USA). Primer pairs that yielded the intended products were chosen to assemble a multiplex EcDNA (Echinococcus cfDNA) primer panel.
The EcDNA primer panel was further verified using HCF DNA samples. The total concentration of the primer panel was 20 μmol/L, with an equal molarity of each primer. Multiplex PCR reactions were performed in a total volume of 20 μL containing 2 μL of the primer panel, 1 μL of 10 ng/μL human genomic DNA, 6 μL of ultrapure water and 1μL of 1 pg/μL HCF DNA and 10 μL of Takara Premix EX Taq DNA polymerase. To increase amplification efficiency, 10 additional touch-down cycles were added prior to aforementioned 30 cycles of amplification, by decreasing the annealing temperature 1˚C per cycle from 68˚C to 58˚C. The multiplex PCR products were purified and subjected to NGS sequencing to confirm the sensitivity of each primer pair.
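The touch-down program described above can be written out as a per-cycle annealing schedule. The sketch below assumes one reading of the text: the ten touch-down cycles anneal at 68˚C down to 59˚C in 1˚C steps, and the 30 standard cycles that follow all anneal at 58˚C. The function name and its parameters are hypothetical.

```python
def annealing_schedule(start_c=68, final_c=58, touchdown_cycles=10, main_cycles=30):
    """Return the annealing temperature (in degrees Celsius) for every PCR cycle."""
    # Temperature drop per touch-down cycle; 1 degree with the default values.
    step = (start_c - final_c) / touchdown_cycles
    touchdown = [start_c - i * step for i in range(touchdown_cycles)]
    # After the touch-down phase, every cycle anneals at the final temperature.
    return touchdown + [float(final_c)] * main_cycles


schedule = annealing_schedule()
print(len(schedule))   # total number of cycles
print(schedule[:12])   # touch-down phase plus the first two standard cycles
```

Under this reading the complete program runs for 40 cycles.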

Analysis of patient cfDNA samples
The validated primer panel was used to detect Echinococcus-derived DNA in patient plasma cfDNA samples. The multiplex PCR reaction mixtures were composed of 10 μL of Takara Premix EX Taq, 2 μL of the EcDNA multiplex primer panel (20 μmol/L), and 8 μL of the circulating cfDNA sample from patients or controls. All samples were renamed before the multiplex PCR procedure for blind evaluation. Reactions were performed using the multiplex PCR conditions with touch down cycles as described above.
The multiplex PCR products were purified using 50 μL of AMPure XP beads (A63881, Beckman Coulter, USA) according to the manufacturer's protocol. To increase recovery efficiency of short products, 5 μL of PEG buffer consisting of 40% PEG8000 and 10 mM EDTA was added to cleanup reactions. For each sample, 10 ng of recovered DNA was used to construct an NGS library using the NEBNext ultra II library prep kit (E7645, NEB, USA) according to the standard protocol. Libraries were then sequenced on an Illumina MiSeq 500 platform (Illumina, USA) in PE75 mode. An average of 2 M reads was generated for each sample.
The raw sequencing data was initially processed using trim-galore as described above. Clean reads were analyzed using an in-house Perl script (https://github.com/wanzhq/Echinococcus_detection). Briefly, clean data was initially mapped to the hg19 human reference genome using bowtie2. Unmapped reads were extracted and aligned to each intended PCR product or repeat sequence. The mapping ratio was calculated as the number of reads mapped to Echinococcus repeats divided by the total number of reads unmapped to hg19.
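The mapping ratio defined above reduces to a single division. The sketch below is not the authors' Perl script; the function name and the read counts in the example are hypothetical and serve only to make the definition concrete.

```python
def mapping_ratio(repeat_mapped_reads, human_unmapped_reads):
    """Fraction of non-human reads that map to Echinococcus repeats."""
    if human_unmapped_reads == 0:
        return 0.0  # no non-human reads, so nothing can map to the repeats
    return repeat_mapped_reads / human_unmapped_reads


# Made-up counts: 1,200 of 10,000 reads unmapped to hg19 align to the repeats.
ratio = mapping_ratio(1200, 10000)
print(f"{ratio:.2%}")
```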

Copy number calculation of repeat sequences
The copy numbers of three enriched repeats (Egs-1, Egs-2, Egs-3) were determined using droplet digital PCR (ddPCR). For each ddPCR reaction, 5 pg of HCF DNA was used as template. Reactions were run on a Bio-Rad QX200 system (Bio-Rad, USA). A pair of primers specifically amplifying glyoxylate reductase/hydroxypyruvate reductase (EGR-05219) of the E. granulosus genome (forward: GTGTCTTCAACGACGAGGTTAG; reverse: GTCAGCGTAACCATGCAAATG) was included as a control. The copy numbers of the repeats were normalized to the genome copies of EGR-05219 in the same reaction.
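The normalization can be sketched using the standard Poisson correction for droplet counts, in which the mean number of template copies per droplet is -ln(1 - positive/total). The text does not describe how the instrument output was processed, so this is an assumption; the function names and droplet counts below are hypothetical.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean number of template copies per droplet."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -math.log(1.0 - positive / total)


def repeat_copies_per_genome(repeat_positive, reference_positive, total):
    """Repeat copy number relative to the single-copy reference gene."""
    # The droplet volume cancels when both targets are read from the same reaction.
    return (copies_per_droplet(repeat_positive, total)
            / copies_per_droplet(reference_positive, total))


# Made-up droplet counts, for illustration only.
print(round(repeat_copies_per_genome(5000, 1000, 10000), 2))
```

The corrected ratio is larger than the raw ratio of positive droplets because droplets containing several copies of the repeat are counted only once.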

Statistical analysis
The mapping ratio of each group was calculated and compared. Data is presented as means ± SEM. The unpaired t-test with Welch's correction was used to calculate p values. Statistical significance is shown as * p < 0.05, ** p < 0.01, *** p < 0.001, and **** p < 0.0001. ROC analysis was performed using SPSS statistical software version 16.0 (IBM Corp., USA).
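For reference, the statistic behind the unpaired t-test with Welch's correction can be computed as follows. This is a minimal sketch using only the Python standard library. It returns the t statistic and the Welch-Satterthwaite degrees of freedom; converting these to a p value requires the t distribution, which is not implemented here. The function name and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of the mean of a
    vb = variance(b) / len(b)  # squared standard error of the mean of b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up values, for illustration only.
t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t, 3), round(df, 3))
```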

Detection of Echinococcus DNA in plasma of echinococcosis patients and human DNA fragments in hydatid cyst fluids
To examine whether Echinococcus DNA is present in patient plasma samples, we initially performed direct NGS analysis of circulating cfDNA from plasma samples of 16 patients and 11 healthy controls. For each sample, sequencing data was first mapped to the human genome (hg38). Subsequently, unmapped reads were extracted and searched against the NCBI nt database using the BLAST algorithm. Reads uniquely aligned to E. granulosus or E. multilocularis were considered Echinococcus-derived sequences. As shown in Table 1, DNA of E. granulosus or E. multilocularis was detected in 5 out of 16 patient samples, indicating that Echinococcus DNA can be released from hydatid cysts during infection. Considering the low coverage of Echinococcus genome sequences in the nt database, we collected the reads that did not generate hits in the BLAST analysis and remapped them to the genome of E. granulosus (ASM52419v1) or E. multilocularis (EMULTI002) using the bwa-mem algorithm. These remapped reads can be regarded as a collection of sequences most likely derived from the Echinococcus genome. The ratio of remapped reads in patients was significantly higher than that in controls (S2 Fig). Echinococcus DNA was detected in 9 out of 16 patients with a remapping ratio above 100 × 10⁻⁶ (Table 1). For AE, parasite-derived DNA was detected in 4 out of 5 patients.
To further verify that the wall of hydatid cysts allows DNA fragments to pass through, we extracted cfDNA from the HCF of intact hydatid cysts from 5 unrelated patients and analyzed it by NGS. Despite limited contamination with large fragments of genomic DNA, the predominant size of cfDNA from HCF was approximately the same as that of human plasma cfDNA (Fig 1A). Sequencing reads mapped to the human reference genome were detected in all 5 HCF samples (Table 2). The proportion of human-derived DNA sequences reached as high as 80% of the total cfDNA sequencing reads. The lowest proportion of reads within one cyst that matched the human genome was about 12%. These results suggest that the hydatid cyst wall is permeable at least to short, freely circulating nucleic acids such as cfDNA. Further mapping analysis revealed that most of the identified E. granulosus cfDNA sequences were from non-coding, low-complexity regions. In contrast, human cfDNA sequences were from both coding and non-coding regions of the genome (Fig 1B).

Multiplex PCR panel design
To increase the detection sensitivity for cfDNA at trace levels, we employed targeted multiplex PCR of repeat regions to enrich Echinococcus-derived cfDNA before NGS. A total of 19,732 repeat sequences were identified using RepeatExplorer on the Galaxy platform (S1 Appendix), based on E. granulosus (ERR112220) and E. multilocularis (ERR065034) sequences downloaded from the NCBI SRA database. To improve template utilization efficiency, the product length was restricted to 70~100 bp, considering the typical cfDNA length of ~166 bp. A total of 16 primer pairs were initially selected for validation (Table 3). Each pair of candidate primers was validated by conventional PCR using a mixture of cfDNA from the HCF-1 sample and human genomic DNA as templates. A total of 12 pairs of primers generated the anticipated products with 1 pg of HCF cfDNA mixed with 100 ng of human genomic DNA. In contrast, no specific band was detected using human DNA alone as template (Fig 2A-2C). Further analysis revealed that 10 pairs of primers were sensitive enough to yield specific PCR products using 100 fg of HCF cfDNA as template (Fig 2D and 2E). Thirteen pairs of primers yielded specific PCR products using DNA extracted from the liver lesion of an alveolar echinococcosis patient (S3 Fig). As our primary goal was to detect CE, the 12 pairs of primers that gave specific products with HCF DNA were combined into an EcDNA (Echinococcus cfDNA) primer panel. Multiplex PCR followed by NGS showed that the primer panel effectively amplifies Echinococcus DNA from 1 pg of HCF DNA (Fig 3A, Table 4). Considering that human DNA is present in hydatid cysts, we further analyzed the copy number of the Echinococcus genome per 100 pg of HCF DNA by droplet digital PCR. The results showed that the copy number of the Echinococcus genome per 100 pg of HCF DNA varied from 20 to 400, which was consistent with the results of NGS analysis (Table 2).
NGS analysis of multiplex PCR products revealed that the ratio of reads mapped to the repeat sequence dataset (98.93% on average) was higher than that of reads mapped exactly to the anticipated PCR products (91.75% on average). The difference likely resulted from sequence variability among the copies of each repeat in the genome (Fig 3B, Table 4). Therefore, further analysis was based on the ratio of reads mapped to the repeat regions rather than to the anticipated PCR products.

The EcDNA primer panel detects Echinococcus DNA using femtogram quantities of HCF cfDNA as template
To further determine the detection limit of the EcDNA primer panel, we titrated HCF cfDNA templates mixed with human cfDNA. The results showed that the EcDNA panel was sensitive enough to stably detect as little as 2 fg cfDNA from the HCF-1 sample (Fig 4, S1 Table). To further verify the detection sensitivity, we added various quantities of HCF cfDNA to plasma isolated from healthy controls, followed by extracting cfDNA from the mixed plasma. cfDNA isolated from the mixtures were subjected to multiplex PCR and NGS. The results revealed that the detection sensitivity was as little as 5 fg HCF cfDNA per 1 mL of plasma (S2 Table). The quantitative results of ddPCR suggest approximately 30 copies of Echinococcus genome per 100 pg of DNA in the HCF-1 sample (S4 Fig). Therefore, the method is sensitive enough to detect 0.1% of Echinococcus genome per 1mL of plasma.
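The conversion from the spiked DNA mass to genome copies follows directly from the ddPCR estimate quoted above. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the default of 30 copies per 100 pg is the approximate HCF-1 value stated in the text.

```python
def genome_copies(mass_fg, copies_per_100_pg=30.0):
    """Echinococcus genome copies contained in a given mass of HCF DNA."""
    mass_pg = mass_fg / 1000.0                  # 1 pg = 1000 fg
    return mass_pg * copies_per_100_pg / 100.0


copies = genome_copies(5)  # 5 fg of HCF cfDNA spiked into 1 mL of plasma
print(f"{copies:.4f} genome copies ({copies:.2%} of one genome)")
```

With these inputs, 5 fg corresponds to roughly 0.0015 genome copies, which is on the order of the 0.1% figure given in the text.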

Detection of Echinococcus DNA using cfDNA from patient plasma samples
The EcDNA primer panel was next used to analyze cfDNA from plasma samples of patients infected with Echinococcus (n = 24). Healthy uninfected individuals (n = 27) and patients infected with Schistosoma (n = 9) were also analyzed as controls. A clinical summary of controls and patients is shown in Table 5. Since no method is available to exclude Echinococcus infection in early stages, all control samples were collected in Hunan province of central China, where echinococcosis is rarely reported. Further follow-up of the controls found no infection with Echinococcus in these individuals.

PLOS NEGLECTED TROPICAL DISEASES
Echinococcosis diagnosis by targeted next-generation sequencing
Detailed information on the patients is shown in Table 6. The mapping ratio, calculated as the number of reads mapped to the Echinococcus DNA repeats divided by the total number of reads unmapped to the human reference genome, was significantly higher in samples from alveolar and cystic echinococcosis patients than in those from schistosomiasis patients or uninfected control individuals (Fig 5). Echinococcus-derived cfDNA was detected in 15 out of 24 plasma samples from echinococcosis patients with the cut-off set to 10.0%. In contrast, no Echinococcus-derived DNA was detected in the cfDNA isolated from plasma of schistosomiasis patients and healthy control individuals (Table 7). These results indicate that the targeted detection method is more sensitive than the direct sequencing method. A ROC analysis based on the mapping ratio of plasma samples showed that the area under the curve (AUC) was 0.862, with a 95% confidence interval of 0.758~0.965 (Fig 6). When the cut-off was set to 9.79%, the ROC curve reached a Youden index of 0.625, with a detection sensitivity of 62.5% and specificity of 100%.
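The reported sensitivity, specificity and Youden index are related by J = sensitivity + specificity - 1. The sketch below shows how they follow from a cut-off on the mapping ratio. The mapping ratios used here are placeholders chosen only to reproduce the reported counts (15 of 24 patients above the cut-off and none of the 36 controls); they are not patient data. The function name is hypothetical, and treating a ratio equal to the cut-off as positive is an assumption.

```python
def diagnostic_summary(patient_ratios, control_ratios, cutoff):
    """Return (sensitivity, specificity, Youden index) at a mapping-ratio cut-off."""
    true_pos = sum(r >= cutoff for r in patient_ratios)
    true_neg = sum(r < cutoff for r in control_ratios)
    sensitivity = true_pos / len(patient_ratios)
    specificity = true_neg / len(control_ratios)
    return sensitivity, specificity, sensitivity + specificity - 1.0


# Placeholder ratios, for illustration only.
patients = [0.50] * 15 + [0.01] * 9   # 24 echinococcosis patients
controls = [0.01] * 36                # 27 healthy + 9 schistosomiasis controls
print(diagnostic_summary(patients, controls, cutoff=0.0979))
```

With 15 of 24 patients called positive and all 36 controls called negative, the function returns a sensitivity of 0.625, a specificity of 1.0 and a Youden index of 0.625, matching the values in the text.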

Discussion
Echinococcus parasitizes deeply in human organs such as liver, lung and brain. Due to the risk of disseminated infection, it is not advisable to conduct a biopsy to make a definitive diagnosis. Thus, non-invasive methodology for early diagnosis is urgently needed. In theory, hydatid cysts allow nutrients in and metabolites out in order to grow [59]. In this study, we show that Echinococcus derived cfDNA is present in plasma of Echinococcus infected individuals, which could facilitate non-invasive diagnosis. We further designed a targeted NGS strategy to detect Echinococcus-derived cfDNA in blood samples. The results suggest that targeted NGS is feasible to achieve non-invasive molecular diagnosis of echinococcosis.
Targeted NGS detected Echinococcus cfDNA in patient plasma in this study. To our knowledge, this finding provides the first evidence that E. granulosus DNA is released from intact hydatid cysts into the blood of infected individuals, laying the basis for a non-invasive and precise diagnosis of echinococcosis using DNA sequencing. It is likely that the identified Echinococcus DNA fragments pass through the wall of hydatid cysts to enter the bloodstream. This notion is further supported by the discovery of human DNA fragments inside hydatid cysts, although the proportion of patient cfDNA in hydatid cysts is remarkably higher than that of parasite cfDNA in patient serum. The concentration of Echinococcus cfDNA in patient plasma is likely very low, suggesting either that release of Echinococcus DNA fragments from hydatid cysts is limited or that the DNA is rapidly degraded. It is possible that protein-bound DNA has a limited capacity to pass through the wall of hydatid cysts. The proportion of Echinococcus DNA in the cfDNA isolated from patient plasma varies between 1/1000 and 1/100000, which is a major barrier for clinical detection using untargeted NGS. We therefore developed a targeted multiplex PCR primer panel to amplify repeat sequences from the Echinococcus genome in order to specifically enrich Echinococcus-derived cfDNA for further analysis via NGS. The combination of multiplex PCR and NGS incorporates both the sensitivity of PCR and the specificity of DNA sequencing while avoiding cross-reactions or false positives. This method is sensitive enough to detect a thousandth of an Echinococcus genome in simulation experiments. Meanwhile, targeted NGS cuts the sequencing cost from hundreds of dollars for an untargeted sequencing experiment to less than 1 dollar for a targeted sequencing detection. The overall cost of each detection is approximately $50 in the laboratory and could be reduced to less than $20 with further improvement.
Several PCR or qPCR methods for the detection of Echinococcus DNA have been reported by various research groups [25][26][27][28][29][30][31]. However, these methods have not demonstrated non-invasive diagnostic value for echinococcosis. There are several potential reasons. One possibility is the low abundance of Echinococcus DNA in patient plasma. Alternatively, the DNA extraction methods used in previous studies are generally optimized for genomic DNA rather than for short cfDNA fragments. Furthermore, the designed length of PCR products was usually longer than the average length of cfDNA fragments (166 bp), reducing detection of short DNA fragments. A recent study on the detection of parasite-derived cfDNA from serum of alveolar echinococcosis (AE) patients partly improved the cfDNA extraction and primer design protocol. These authors were able to distinguish all samples from an AE animal model and 30% of patient serum samples via qPCR and ddPCR [60]. The present study enriched targeted DNA fragments by multiplex PCR followed by NGS, which increased the chance of detecting low-abundance and short cfDNA fragments. In the present study, the AUC was 0.862, with a detection sensitivity of 62.50% and specificity of 100%, corresponding to a Youden index of 0.625. Interestingly, the sensitivity increased to 80% when only the 5 alveolar echinococcosis patients were considered. This is likely because the distinct, infiltrative growth of E. multilocularis in the host allows cfDNA to be released into the bloodstream. The detection sensitivity may be even higher for early infected patients before formation of a complete cyst, because the laminated layer serves as a physicochemical barrier that reduces the permeability of macromolecules [5]. Unfortunately, no technology that can diagnose Echinococcus infection in very early stages is available.
A more comprehensive clinical study with larger numbers of patients will be needed to further verify the effectiveness of the method for early diagnosis. Our data provides evidence that hydatid cysts release cfDNA into patient plasma even if the cyst membrane remains intact. This study also establishes a new and practical method based on the concept of cell-free DNA in conjunction with the high sensitivity of multiplex PCR and the specificity of NGS, paving a new avenue for a potential early, accurate and non-invasive diagnosis of echinococcosis.