Chagas disease results from infection with the diploid protozoan parasite Trypanosoma cruzi. T. cruzi is highly genetically diverse, and multiclonal infections in individual hosts are common, but little studied. In this study, we explore T. cruzi infection multiclonality in the context of age, sex and clinical profile among a cohort of chronic patients, as well as paired congenital cases from Cochabamba, Bolivia and Goias, Brazil using amplicon deep sequencing technology.
Methodology/Principal Findings
A 450bp fragment of the trypomastigote TcGP63I surface protease gene was amplified and sequenced across 70 chronic and 22 congenital cases on the Illumina MiSeq platform. In addition, a second, mitochondrial target (ND5) was sequenced across the same cohort of cases. Several million reads were generated, and sequencing read depths were normalized within patient cohorts (Goias chronic, n = 43; Goias congenital, n = 2; Bolivia chronic, n = 27; Bolivia congenital, n = 20). Among chronic cases, analyses of variance indicated no clear correlation between intra-host sequence diversity and age, sex or symptoms, while principal coordinate analyses showed no clustering by symptoms between patients. Between congenital pairs, we found evidence for the transmission of multiple sequence types from mother to infant, as well as widespread instances of novel genotypes in infants. Finally, non-synonymous to synonymous (dn:ds) nucleotide substitution ratios among sequences of TcGP63Ia and TcGP63Ib subfamilies within each cohort provided powerful evidence of strong diversifying selection at this locus.
Our results shed light on the diversity of parasite DTUs within each patient, as well as the extent to which parasite strains pass between mother and foetus in congenital cases. Although we were unable to find any evidence that parasite diversity accumulates with age in our study cohorts, putative diversifying selection within members of the TcGP63I gene family suggests a link between genetic diversity within this gene family and survival in the mammalian host.
Trypanosoma cruzi, the causal agent of Chagas disease, infects several million people in some of the most economically deprived regions of Latin America. T. cruzi infection is lifelong and has a variable prognosis: some patients never exhibit symptoms while others experience debilitating and fatal complications. Available data suggest that parasite genetic diversity within and among disease foci can be exceedingly high. However, little is known about the frequency of multiple genotype infections in humans, or about their distribution among different age classes and possible impact on disease outcome. In this study we develop a next generation amplicon deep sequencing approach to profile parasite diversity within chronic Chagas disease patients from Bolivia and Brazil. We were also able to compare parasite genetic diversity present in eleven congenitally infected infants with parasite genetic diversity present in their mothers. We did not detect any specific association between the number and diversity of parasite genotypes in each patient and their age, sex or disease status. We were, however, able to detect the transmission of multiple parasite genotypes between mother and foetus. Furthermore, we also detected powerful evidence for natural selection at the antigenic locus we targeted, suggesting a possible interaction with the host immune system.
Citation: Llewellyn MS, Messenger LA, Luquetti AO, Garcia L, Torrico F, Tavares SBN, et al. (2015) Deep Sequencing of the Trypanosoma cruzi GP63 Surface Proteases Reveals Diversity and Diversifying Selection among Chronic and Congenital Chagas Disease Patients. PLoS Negl Trop Dis 9(4): e0003458. https://doi.org/10.1371/journal.pntd.0003458
Editor: Armando Jardim, McGill University, CANADA
Received: September 20, 2014; Accepted: December 5, 2014; Published: April 7, 2015
Copyright: © 2015 Llewellyn et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This study was funded by the FP7 European Sequencing and Genotyping Infrastructure consortium, grant number 262055 and the FP7 research consortium ChagasEpNet, grant number 223034. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Trypanosoma cruzi is a kinetoplastid parasite and the causative agent of Chagas disease (CD) in Latin America. T. cruzi infects approximately 8 million people throughout its distribution and causes some 13,000 deaths annually. Chagas disease follows a complex course. Infection, often acquired in childhood, is generally lifelong but progression from the indeterminate (asymptomatic) to symptomatic stage occurs in only 30% of cases. A broad pathological spectrum is associated with clinical CD including potentially fatal cardiological and gastrointestinal abnormalities. The relative contributions of parasite and host immunity in driving disease pathology are a matter of continuing debate. Recently, for example, bioluminescent parasite infections in BALB/c mouse models have suggested that heart disease can progress in the absence of detectable local parasite load.
It is widely recognized that natural parasitic infections often comprise several parasite clones. Malariologists use the term ‘multiplicity of infection’ (MOI) to describe the occurrence of multiple Plasmodium sp. genotypes within the same host [7,8]. A similar phenomenon has been observed in T. cruzi in vectors, as well as in mammalian reservoir hosts and human hosts, using solid phase plating and cell sorting techniques. The occurrence of multi-genotype infections has fundamental implications for host immunity, as well as for accurate evaluation of pathogen drug resistance, transmission rate, epidemiology and population structure (e.g. [7,11]). The efficiency with which it is possible to sample pathogen clonal diversity from biological samples has soared in recent years with the advent of next generation sequencing. Deep sequencing approaches have long been applied to study the dynamics of HIV anti-viral therapy escape mutations. As a result, amplicon sequencing increasingly features in a clinical diagnostic context. Plasmodium falciparum MOI can be resolved at merozoite surface protein loci at far greater depths than possible by standard PCR approaches. Furthermore, targeting low copy number antigens in parasite populations via amplicon sequencing can provide important clues to frequency-dependent selection pressures within hosts, between hosts and between host populations.
T. cruzi can persist for several decades within an individual host. Unsurprisingly perhaps, therefore, T. cruzi shows significant antigenic complexity. T. cruzi surface proteins are encoded by several large, repetitive gene families that are distributed throughout the parasite genome. Among these gene families the mucins, trans-sialidases, ‘dispersed gene families’ (DGFs), mucin-associated surface proteins (MASPs) and GP63 surface proteases comprise the vast majority of sequences, some 10–15% of the total genome size [17,18]. Whilst the role of some of the proteins encoded by surface gene families in host cell recognition and invasion is relatively well understood (e.g. the trans-sialidases), the role of others (e.g. the MASPs, DGFs) is not. Furthermore, the role each plays in evading an effective host response remains largely unknown.
The GP63 surface proteases are found in a wide variety of organisms, including parasitic trypanosomatids. In Leishmania spp. GP63 proteases are the most common component of the parasite cell surface with crucial roles in pathogenicity, innate immune evasion, interaction with the host extracellular matrix and ensuring effective phagocytosis by macrophages. In T. brucei subspp. the role of GP63 proteins is less well defined, although some protein classes are thought to be involved with variant surface glycoprotein processing between different life cycle stages. In T. cruzi at least four classes of GP63 gene are recognized. Like many GP63 proteases in Leishmania spp., surface expressed T. cruzi GP63 (TcGP63) proteins are anchored to the cell membrane via glycosyl phosphatidylinositol moieties [23,24]. Among these are the products of the TcGP63 Ia & Ib genes (collectively TcGP63I). TcGP63 Ia & Ib encode 78kDa, 543 amino acid proteins, are expressed in all life cycle stages and are implicated in the successful invasion of mammalian cells in vitro [23,24].
In the current study we target TcGP63I genes as markers of antigenic diversity among three cohorts of Chagas disease patients: two in Cochabamba, Bolivia and one in Goias, Brazil. We also targeted a maxicircle gene for the NADH dehydrogenase subunit 5 to provide basic T. cruzi genotypic information for each case. Diversity at each of the two T. cruzi loci within each patient was characterized using a deep amplicon sequencing approach, generating several million sequence reads. Our results shed light on the diversity of parasite DTUs within each patient, as well as the extent to which parasite strains pass between mother and foetus in congenital cases. We were unable to find any evidence that parasite diversity accumulates with age in our study cohorts, or to detect a link between parasite diversity and clinical profile. However, we were able to detect evidence of putative diversifying selection within members of the TcGP63 gene family, suggesting a link between genetic diversity within this gene family and survival in the mammalian host.
Materials and Methods
Ethical permissions were in place at the two centres where human sample collections were made, as well as at the London School of Hygiene and Tropical Medicine (LSHTM). Local ethical approval for the project was given at the Plataforma de Chagas, Facultad de Medicina, UMSS, Cochabamba, Bolivia by the Comite de Bioetica, Facultad de Medicina, UMSS. Local ethical permission for the project was given at the Hospital das Clínicas da Universidade Federal de Goias (UFG), Goias, Brazil by the Comite de Etica em Pesquisa Médica Humana e Animal, protocol number 5659. Ethical approval for sample collection at the LSHTM was given for the overall study, “Comparative epidemiology of genetic lineages of Trypanosoma cruzi”, protocol number 5483. Samples were collected with written informed consent from the patient and/or their legal guardian.
Biological sample collection
Parasite isolation protocols differed between centres. At the UMSS, 0.5 mL of whole venous blood was taken from chronic patients and inoculated directly into biphasic blood agar culture. T. cruzi positive samples were minimally repassaged and cryopreserved at log phase (precise repassage history unavailable). For infants, 0.5 mL of cord blood was taken at birth and inoculated into culture. Again, positive samples were cryopreserved at log phase after minimal repassage (precise repassage history unavailable). DNA extractions, using a Roche High-Pure Template Kit, were made directly from the cryopreserved stabilate. At the UFG, 17 mL of whole blood was collected into EDTA, centrifuged for 10 minutes at 1200g at 4°C and the plasma replaced with 8 mL Liver Infusion Tryptone (LIT) medium. After a further 10 minutes at 1200g (4°C), the supernatant was again removed. Two mL of packed red blood cells were subsequently transferred to 3 mL of LIT medium and checked periodically for signs of epimastigote growth by light microscopy. Positive cultures were not repassaged. Instead, primary cultures were stabilized by the addition of guanidine 6 M-EDTA 0.2 M (Sigma-Aldrich, UK). DNA extractions were made from the full volume using the QIAamp DNA Blood Maxi Kit (Qiagen, UK) according to the manufacturer’s instructions. Among Bolivian strains, DNA concentrations submitted to PCR were standardized after quantitation using a PicoGreen assay. In view of the presence of human genetic material in Goias samples, parasite DNA concentrations were standardized to within the same order of magnitude via qPCR as previously described. All samples collected for this study are listed in Table 1.
Epidemiological and clinical observations
The two areas studied have dissimilar histories in terms of Chagas disease transmission intensity. Vector-borne T. cruzi transmission in Goias and its surrounding states (where samples were collected; Table 1) was interrupted approximately 20 years before the sampling detailed in this study [26,27]. In the sub-Andean semi-arid valleys of Cochabamba and its environs, however, vector-borne domestic transmission is still a likely source of new infections, albeit at a reduced rate since intensive spraying campaigns in the mid 2000s. Clinical data collected in this study were categorised simply into symptomatic and asymptomatic classes for statistical tests in view of sample sizes. Sub-categories within symptoms were defined as: 1) Cardiopathy (including any electrocardiographic and/or echocardiographic abnormalities, or X-ray with cardiac enlargement; patients with atypical cardiac abnormalities, i.e. those not exclusively associated with Chagas disease, were included in the symptomatic class in the context of this study); 2) Megaesophagus (including achalasia and barium swallow abnormalities); 3) Megacolon (constipation associated with dilation as shown by barium enema); and 4) Normal (no symptoms or signs on examination and a normal electrocardiogram) (Table 1).
Primer design, PCR conditions, amplicon sequencing and controls
Degenerate primers for a 450bp fragment of the maxicircle NADH dehydrogenase 5 gene were designed as described in Messenger et al. 2012. Degenerate primer design for the TcGP63I family surface proteases (including Ia and Ib subclasses) was achieved by reference to sequences retrieved from EuPathDB for Esmeraldo (TcII), CL Brener (TcVI), Silvio (TcI) and JR (TcI) (http://eupathdb.org/). Primer binding site positions in relation to TcGP63I putative functional domains are displayed in S1 Fig. Homologs were identified by BLAST similarity to a complete TcGP63I sequence (bit score (S) ≥ 1000). Alignments of resulting sequences were made in MUSCLE and primers were designed manually to target a variable region within and between individual strains with a final size of 450bp. ND5b primer sequences were ND5b_F ARAGTACACAGTTTGGRYTRCAYA; ND5b_R CTTGCYAARATACAACCACAA. The final TcGP63 primers were TcGP63_F RGAACCGATGTCATGGGGCAA and TcGP63_R CCAGYTGGTGTAATRCTGCYGCC. Amplification was undertaken using the Fluidigm platform, and the manufacturer’s recommended number of cycles was reduced to a total of 26 in an attempt to minimise PCR amplification bias. Thus, the manufacturer’s recommended conditions were adapted to the following protocol: one cycle of 50°C for 2 minutes, 70°C for 20 minutes, and 95°C for 10 minutes; six cycles of 95°C for 15 seconds, 60°C for 30 seconds, 72°C for 60 seconds; two cycles of 95°C for 15 seconds, 80°C for 30 seconds, 60°C for 30 seconds and 72°C for 60 seconds; five cycles of 95°C for 15 seconds, 60°C for 30 seconds, 72°C for 60 seconds; two cycles of 95°C for 15 seconds, 80°C for 30 seconds, 60°C for 30 seconds and 72°C for 60 seconds; five cycles of 95°C for 15 seconds, 60°C for 30 seconds, 72°C for 60 seconds, and finally five cycles of 95°C for 15 seconds, 80°C for 30 seconds, 60°C for 30 seconds and 72°C for 60 seconds. Amplifications were performed using the FastStart High Fidelity PCR System (Roche).
Three PCR reactions were pooled per sample prior to sequencing in an attempt to further reduce amplification biases. Equimolar concentrations of ND5 and TcGP63I amplicons from 96 DNA samples were multiplexed on Illumina runs using dual index sequence tags (Illumina Inc). Sequencing was undertaken on a MiSeq platform using 2 x 250 bp paired-end reads (Reagent Kit version 2) according to the manufacturer’s protocol. In addition to the clinical samples, we included a dilution series of control samples. The controls comprised artificial mixes of DTUs I-VI genomic DNA at equimolar concentrations. At the ND5 locus, the DTU abundance ratios and diversity expected for the artificial control mixes were compared with those recovered via amplicon sequencing (S2 Fig.).
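As an illustration of what the degenerate positions in these primers imply, the sketch below expands each primer using the standard IUPAC nucleotide ambiguity codes and counts how many distinct oligonucleotides each degenerate sequence represents. The primer sequences are taken from the text above; the function names (degeneracy, expand) are hypothetical helpers introduced here and are not part of the published workflow.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences exactly as listed in the text.
PRIMERS = {
    "ND5b_F": "ARAGTACACAGTTTGGRYTRCAYA",
    "ND5b_R": "CTTGCYAARATACAACCACAA",
    "TcGP63_F": "RGAACCGATGTCATGGGGCAA",
    "TcGP63_R": "CCAGYTGGTGTAATRCTGCYGCC",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n


def expand(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in primer))]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: length {len(seq)}, degeneracy {degeneracy(seq)}")
```

Under these definitions the forward ND5b primer, which carries five two-fold degenerate positions, represents the largest pool of oligonucleotides, whereas TcGP63_F carries a single degenerate base.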
Amplicon sequence data analysis
De-multiplexed paired-end sequences were submitted to quality control and trimming in Sickle and mate pairs trimmed in FASTX Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/). ND5, TcGP63 and contaminating sequences were then sorted against a reference using BOWTIE2. Individual paired reads were found to be overlapping in only a minority of cases. Thus we chose to proceed with analysis of a sequence fragment with a truncated central section for both targets. Further sequence manipulations were undertaken using FASTX Toolkit and custom awk scripts to parse files and concatenate each mate pair into a single sequence for downstream analysis. MUSCLE was used for alignment of amplicon sequences in each patient sample. Next, analysis was undertaken in the Mothur software package for the elimination of putative PCR chimeras and individual sequence clustering. The Shannon index of diversity was calculated at the intra-patient level based on sequence types (STs) defined at 97% and 99% identity in Mothur. Comparisons of Shannon diversity were made between patients in each cohort (Bolivia chronic, Bolivia congenital, Goias chronic) via analyses of covariance and linear regression in R (http://CRAN.R-project.org). TcGP63I sequence datasets for patients from each cohort were then merged and analyses conducted using 97% and 99% STs defined with UPARSE across patients. Weighted UniFrac distances between TcGP63I STs among samples were generated and subsequently clustered via a principal coordinates analysis in QIIME. Significance of association between UniFrac clustering, disease status and age was tested in the vegan package in R. Estimates of diversifying selection among TcGP63I STs were made in KaKs Calculator using Yang and Nielsen’s 2000 approximate method and tested for significance using a Fisher’s exact test.
Prior to selection calculations, sequences were clustered into 99% identity STs and singletons excluded in an attempt to exclude SNPs introduced as PCR artefacts. To test for diversifying selection across putative TcGP63I gene families (TcGP63Ia & Ib; 97% cut-off as defined by Cuevas and colleagues), 99% identity STs from each patient cohort were pooled (Table 2). To test for selection within TcGP63I gene families, STs within each 97% category (corresponding to TcGP63Ia & b respectively) were examined separately per cohort (Table 2). Amplicon sequences analysed in this study are available in the data appendix in supplementary information (S1 Appendix).
Sequence yields and discrete typing unit (DTU) designations
After quality filtering, trimming, decontamination and removal of unpaired reads, 6,736,749 reads were assigned to the ND5 mitochondrial marker and 871,855 to the TcGP63I marker across the 92 clinical samples, perhaps reflecting higher copy number in the former than the latter. After trimming, the overlap between individual mate pairs was marginally too short for them to be assembled into a single read. Thus paired reads were first aligned against a full-length reference fragment and the central portion excised to remove any gaps and ensure correct alignment. Sequence depth thresholds per sample for inclusion were set for each dataset (Goias: ND5 and TcGP63, 10,000; Cochabamba: ND5, 30,000; TcGP63, 10,000; see Fig. 1). Reads from samples in excess of this threshold were discarded and samples with read counts below this threshold excluded. Our aim in setting the threshold was: 1) to include as many samples as possible while maintaining a good depth of coverage; 2) to standardise sampling intensity across individuals and thus facilitate comparisons between them.
Read depths generated on the Illumina MiSeq platform were standardized across samples prior to analysis. Inclusion thresholds for TcGP63 (Goias: 10,000; Cochabamba: 3000; wide dash line; red bars) and ND5 (Goias: 10,000; Cochabamba: 30,000; thin dash line; blue bars) are shown for each population.
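The depth standardisation described above can be sketched as follows. This is a minimal illustration only: the text does not state how surplus reads were chosen for removal, so uniform random subsampling without replacement is an assumption made here, and the function name normalise_depth, its parameters and the toy read identifiers are hypothetical.

```python
import random


def normalise_depth(reads_by_sample, threshold, seed=0):
    """Standardise read depth across samples.

    Samples with fewer reads than `threshold` are excluded; samples with
    more reads are subsampled down to exactly `threshold` reads.
    Random subsampling is an assumption, not a documented detail.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    kept = {}
    for sample, reads in reads_by_sample.items():
        if len(reads) < threshold:
            continue  # below the inclusion threshold: sample excluded
        kept[sample] = rng.sample(reads, threshold)
    return kept


if __name__ == "__main__":
    # Toy data: read identifiers only, with a deliberately tiny threshold.
    toy = {
        "sample_A": [f"a{i}" for i in range(8)],
        "sample_B": [f"b{i}" for i in range(3)],
        "sample_C": [f"c{i}" for i in range(5)],
    }
    for name, reads in normalise_depth(toy, threshold=5).items():
        print(name, len(reads))
```

Fixing every retained sample at the same depth is what allows diversity indices to be compared between patients without confounding by sequencing effort.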
The ND5 mitochondrial target was sequenced to provide DTU I-VI identification of parasites circulating within and among patients by comparison to existing data. However, with reference to the results from the control samples, and due to the necessary truncation of the sequence fragment, only three groups could be reliably distinguished, corresponding to the three major T. cruzi maxicircle sequence classes. The three groups corresponded to TcI, TcII and TcIII-VI respectively. Furthermore, in reference to the control mixes, we found evidence that amplification bias dramatically skewed the recovery of sequence types (STs) towards the TcIII-VI group. Some skew is expected, as these four DTUs (TcIII-VI) share the same maxicircle sequence class, and this class is thus more abundant in the control mix. However, TcI and TcII, each of which should in theory have accounted for approximately 16.7% (1/6) of all sequences in the controls, were in fact present (on average) at only 2.9% and 0.03% respectively among the four samples where all three STs were recovered (S2 Fig.). Amplicon sequencing from the two most concentrated controls (57 ng/uL and 125 ng/uL genomic DNA respectively) resulted in poor sequence yields and a failure to recover all three STs.
Unsurprisingly perhaps in the light of the control data, most clinical samples were dominated by sequences from a single group, with minor contributions from others (Fig. 2). Indeed sequences recovered from many strains were monomorphic at the 97% identity level, especially in Cochabamba. As such, comparisons based on ND5 are necessarily descriptive and meaningful alpha (within sample) and beta (between sample) diversity statistics were not calculated. Fig. 2 shows the distribution of DTUs among samples as defined by the ND5 locus. Most Cochabamba chronic case samples were assigned to a single sequence within the TcIII-VI group (likely to be TcV, as we defined with standard genotyping assays), with the exception of two TcI cases, PCC 240 and PCC 289 (Fig. 2, Panel B). Sequence-type diversity in Goias was considerably higher (Fig. 2, Panel A). In this case the TcII group, rather than the TcIII-VI group, predominated. Unlike in Bolivia, sequences from other groups were present alongside TcII in multiple patients but at frequencies two orders of magnitude lower. Congenital pairs that originated from Cochabamba resembled chronic cases from the same region in their DTU composition (TcIII-VI group predominant, Fig. 2, Panel C). Strikingly, mother/child pair CIUF65 (B5) and CIUF75 (M5) share similar mixed infection profiles (TcI/TcIII-VI) at similar relative abundances (c.1:1000), consistent with the minor to major genotype abundance ratios observed in Goias. The same is also true for the Goias congenital pair (Fig. 2), both members of which showed TcII/TcI mixes. Finally, sequential isolates taken from the same Goias chronic patient at different time points suggest that minor abundance genotypes are not always consistently detectable in the blood (Fig. 2): TcI is absent at first sampling of patient y, but present at the second sampling. For patient z, the TcIII-VI genotype is only present in the first of the two sample points.
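The arithmetic behind the control comparison can be made explicit. The sketch below derives the expected share of each maxicircle group in an equimolar mix of six DTUs and compares it with the average observed percentages quoted above (2.9% for TcI and 0.03% for TcII). The variable and function names are hypothetical, and the fold values are simple ratios of expected to observed percentages rather than a statistic reported in the study.

```python
from fractions import Fraction

# Mapping of the six DTUs onto the three distinguishable ND5 groups.
GROUP_OF = {
    "TcI": "TcI",
    "TcII": "TcII",
    "TcIII": "TcIII-VI",
    "TcIV": "TcIII-VI",
    "TcV": "TcIII-VI",
    "TcVI": "TcIII-VI",
}

# Average observed percentages quoted in the text for the control mixes.
OBSERVED_PERCENT = {"TcI": 2.9, "TcII": 0.03}


def expected_group_fractions():
    """Expected share of each group when all six DTUs are equimolar."""
    share = Fraction(1, len(GROUP_OF))
    totals = {}
    for group in GROUP_OF.values():
        totals[group] = totals.get(group, Fraction(0)) + share
    return totals


def fold_under_representation(expected_fraction, observed_percent):
    """Ratio of expected to observed percentage for one group."""
    return float(expected_fraction) * 100.0 / observed_percent


if __name__ == "__main__":
    expected = expected_group_fractions()
    for group, obs in OBSERVED_PERCENT.items():
        fold = fold_under_representation(expected[group], obs)
        print(f"{group}: expected {float(expected[group]) * 100:.1f}%, "
              f"observed {obs}%, about {fold:.0f}-fold lower")
```

On these figures TcI was recovered at roughly one sixth of its expected share and TcII at several hundred-fold below expectation, which is why minor genotypes in the clinical samples may be more abundant than the read counts suggest.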
For both Cochabamba and Goias, reference to the control data suggests that ‘minor genotypes’ could be substantially more abundant in the patients than the amplicon sequence data suggest.
A: Goias cohort chronic/intermediate cases; B: Cochabamba chronic/intermediate cases; C: Cochabamba congenital cases. Y axes show log transformed abundance (read counts). X axes show clustered bars for individual samples. Sequence type identities are given in the legend. Stars denote the congenital pair from Goias. Labels x (6416 / 6452), y (6401 / 6536) and z (6379 / 6445) denote sample pairs from the same patient at different time points (see Table 1).
TcGP63I surface protease alpha diversity among clinical and congenital cases
Alpha diversity measurements aim to summarise the diversity of species (in this case STs) within an ecological unit (in this case a host). We summarized the number of STs and their relative abundance in each of our samples using the Shannon Index (SI). Among non-congenital cases, our aim was to evaluate possible associations between TcGP63I antigenic diversity and several epidemiological and clinical parameters: age, sex and disease status. We used analyses of covariance (ANCOVA) to test for the effect of these parameters on intra-host antigenic diversity (STs defined both at 97% and 99% for comparison), combining continuous (age) and categorical (sex, clinical forms) data. In Cochabamba, regardless of the order in which parameters were included as factors in the model, there was no evidence for a main effect of age, sex or symptoms on alpha diversity (SI) at either ST divergence level (97% ST: age, p = 0.734; sex, p = 0.298; clinical form, p = 0.136. 99% ST: age, p = 0.854; sex, p = 0.169; clinical form, p = 0.0988). Similarly, ANCOVAs were non-significant for an association between the SI and age, sex or symptoms in Goias (97% ST: age, p = 0.382; sex, p = 0.535; clinical form, p = 0.486. 99% ST: age, p = 0.319; sex, p = 0.696; clinical form, p = 0.697). Finally, we undertook linear regressions of SI with age in each population. As one might expect from the preceding ANCOVAs, no significant correlation was detected (Goias: R2 = 0.0233, p = 0.340 (97% ST); R2 = 0.0256, p = 0.3049 (99% ST). Cochabamba: R2 = 0.0287, p = 0.429 (97% ST); R2 = 0.0230, p = 0.479 (99% ST)).
Congenital comparisons were made pairwise between mother and infant at 99% ST similarity. In addition to the ten matched isolate pairs from Cochabamba, a single pair from Goias was also included (6718 & 6720) in the comparisons. The results of the alpha diversity comparisons are shown in Fig. 3; read depths were balanced between samples. In terms of the absolute number of STs identified, infants exceeded mothers in most instances (pairs 2, 3, 4, 5, 6, 8 & 9). In the remaining cases however (4/11), the number of antigenic sequence types was greater in the mother. Shannon diversity index comparisons between mothers and infants, which also take ST abundance into account, suggested that some differences (e.g. pairs 4, 5 & 6) might be marginal (Fig. 3).
Diversity indices were derived from STs defined at 99% sequence similarity. Bar plot and associated x-axis on the right hand side show the Shannon diversity index calculated in Mothur, with error bars defining upper and lower 95% confidence intervals.
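For readers unfamiliar with the Shannon Index, a minimal sketch of the calculation from per-sample ST read counts is given below. The natural logarithm is assumed, the function name shannon_index is hypothetical, and the counts in the usage example are invented purely for illustration; they are not data from this study.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over sequence types.

    `counts` holds the number of reads assigned to each ST in one sample.
    STs with zero reads contribute nothing to the sum.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one read is required")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


if __name__ == "__main__":
    # Invented read counts per ST for two hypothetical samples.
    dominated = [9990, 5, 5]       # one ST dominates
    even = [3334, 3333, 3333]      # three STs at similar abundance
    print(round(shannon_index(dominated), 4))
    print(round(shannon_index(even), 4))
```

Because the index weights STs by relative abundance, a sample dominated by one ST scores close to zero even when several rare STs are present, which is why ST counts and SI can rank mother and infant samples differently.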
TcGP63I ST distributions among clinical and congenital CD patients
Individual sample sequence datasets within each of the different study cohorts (Cochabamba congenital, Cochabamba non-congenital and Goias) were merged to facilitate analysis of the distribution of antigen 99% STs among individuals (i.e. beta-diversity comparisons). Pairwise weighted UniFrac distances were calculated within cohorts of chronic cases from Cochabamba and Goias to examine whether the sequence diversity of the TcGP63I antigenic repertoire present in each patient could be associated with disease outcome. Principal coordinate analyses of the resulting matrices are displayed in Fig. 4. Among cases from Goias, repertoires varied considerably, with several outliers. However, repertoires from symptomatic and asymptomatic cases were broadly overlapping in terms of sequence identity, and no clustering was noted among different symptom classes either (Fig. 4, Plot B). Permutational multivariate analysis confirmed the absence of a link between ST clustering and symptoms as well as symptom classes (p = 0.77 & 0.74 respectively). However, ST clustering and age were weakly associated (p = 0.049), consistent perhaps with exposure of individuals among different age groups to different circulating parasite genotypes at their time of infection. TcGP63I read yields permitted comparisons for only two pairs of sequential isolates from the same patients, x and y (see Table 1), both of which showed closely clustering, although non-identical, profiles. TcGP63I diversity between Cochabamba chronic cases was arguably lower, with the exception of two outliers unambiguously identified as TcI with reference to the ND5 locus (all others were classified as TcIII-VI, likely TcV). Again, however, symptomatic and asymptomatic cases were broadly overlapping.
Genetic distances are based on a weighted UniFrac metric. Plot A shows diversity comparisons among Goias asymptomatic (asympt) and symptomatic (sympt) clinical cases, as well as one acute case. Plot B shows Goias cases with symptoms categorised as acute, card (cardiopathy), card + mega (cardiopathy as well as megacolon and/or megaesophagus), mega (megacolon and/or megaesophagus) or asympt (asymptomatic). Plot C shows comparisons among Cochabamba clinical cases (not including congenital cases) classified as either asymptomatic (asympt) or symptomatic (sympt). The dashed circle on plot C indicates samples unambiguously defined as TcI at the ND5 locus. Pairs of sequential isolates from the same patient are labelled x and y respectively.
Sequence type profile comparisons among Cochabamba congenital cases were made for 99% STs and are displayed in heatmap format in Fig. 5. There are two key features of interest. The first is that profiles in mother and infant can match very closely (e.g. pairs 2 & 6). The second is that novel STs were present in the infant sample with respect to the mother in half of the cases. Indeed, in pair 9, the infant profile was radically different to that of the mother.
Pairs are indicated down the left hand side of the image (y axis). The mid-point rooted maximum likelihood tree on the x axis describes relationships among the 99% similarity sequence types (STs) identified in UPARSE and was generated in Topali under an equal-frequency transversion model, allowing gamma distributed weights across sites. Values on dendrogram nodes indicate % bootstrap support. Starred congenital pairs are those where STs are present in the infant but not in the mother.
Population-level Ka/Ks ratios within and between TcGP63I gene family members
Trimmed TcGP63 reads, pre-filtered for quality and PCR errors, were pooled within each study site (Bolivia, Goias). To further reduce minority SNPs and PCR errors, STs were defined at 99% within each site in UPARSE. Ka/Ks ratio estimates within each study area indicated a significant excess of synonymous mutations among STs (Goias = 0.8354, Bolivia = 0.7515) averaged across sites (Table 2). However, when calculations were based on diversity present among well represented STs of each gene family member (TcGP63Ia and TcGP63Ib, 97% cut-off), a powerful and significant excess of non-synonymous substitutions was noted within each study area (Ka/Ks, Goias, ST1 = 2.6436, ST4 = 6.3415; Bolivia ST3 = 2.8059; Table 2). Again, calculations were based not on individual sequences, but rather on 99% STs within predefined 97% clusters. The position of the 97% STs in question is shown in the tree in S3 Fig., with clear similarity between those clusters under apparent diversifying selection (Goias ST1 & 2, and Bolivia ST3) and the TcGP63Ia and TcGP63Ib references respectively.
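The direction of the reported ratios can be read off with a short sketch. The values below are those quoted in the text; the labels, the dictionary REPORTED_KA_KS and the helper names are hypothetical, and the sketch only classifies the direction of each ratio relative to 1. It does not reproduce the Yang and Nielsen estimation or the Fisher's exact test used to assess significance.

```python
# Ka/Ks estimates as quoted in the text (labels are illustrative only).
REPORTED_KA_KS = {
    "Goias, all 99% STs": 0.8354,
    "Bolivia, all 99% STs": 0.7515,
    "Goias, ST1": 2.6436,
    "Goias, ST4": 6.3415,
    "Bolivia, ST3": 2.8059,
}


def ka_ks(ka, ks):
    """Ratio of non-synonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    return ka / ks


def direction(ratio):
    """Describe which class of substitution is in excess."""
    if ratio > 1:
        return "non-synonymous excess"
    if ratio < 1:
        return "synonymous excess"
    return "neutral expectation"


if __name__ == "__main__":
    for label, ratio in REPORTED_KA_KS.items():
        print(f"{label}: Ka/Ks = {ratio} ({direction(ratio)})")
```

A ratio above 1 is conventionally taken as consistent with diversifying selection and a ratio below 1 with purifying selection, which is the contrast drawn above between the pooled estimates and those within the 97% clusters.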
In this study our aim was to collect a cohort of T. cruzi samples from clinical CD cases, representative of different endemic regions and of different ages and disease presentations, to explore links between CD epidemiology and multiplicity of infection. To provide a robust, sensitive and quantifiable means of assessing intra-host parasite diversity we first implemented standardized parasite isolation (and enrichment) strategies within each study cohort. Subsequently, we developed an amplicon sequencing approach to profile parasite diversity within each patient. Given the relatively short (400–500bp) read lengths generated by next generation sequencing platforms (at the time of experimentation), we chose a rapidly evolving maxicircle gene (ND5) in an attempt to resolve DTU level diversity. Current multilocus nuclear targets are generally too long (500bp+) to meet our selection criteria. To explore antigenic diversity, we chose a putatively low (5–10) copy number gene family member, TcGP63I, expressed on the parasite surface during the amastigote and trypomastigote lifecycle stages and thus exposed to the human immune system. Given that both ND5 and TcGP63I are present as several copies per parasite genome (and potentially show inter-strain copy number variation), one cannot presume a 1:1 relationship between ST and parasite individual, even if we were able to account for the PCR amplification bias we detected. The identification of a genetically variable, single copy, surface expressed antigen locus is a major challenge in T. cruzi, since antigen genes are by their nature highly repetitive [17,18]. TcGP63I, with its relatively low copy number, represents the closest currently available fit and, as we have shown, provides a useful target for revealing intra-host antigenic diversity. Merozoite surface proteins (MSP) 1 and 2 have traditionally provided useful targets for detecting MOI in P. falciparum (e.g. [45,46]). Furthermore, amplicon sequencing of the MSP locus has been shown to reveal as many as six-fold more variants than traditional PCR-based approaches.
The substantial historical interest in defining MOI among P. falciparum stems from the strong correlation between MOI and rate of parasite transmission. As such, fluctuations in transmission intensity can be tracked to evaluate the efficiency of vector eradication campaigns, drug treatments, the introduction of insecticide-treated nets and so on, without the need to directly estimate the entomological inoculation rate. Evaluation of CD transmission intensity has its own challenges. The presence of infected individuals, the presence of triatomine vectors in domestic buildings, and the incrimination of vectors via human blood meal identification can all help to build the overall picture. However, parasite transmission is likely to occur in only a tiny proportion of blood meals [49,50], and vector efficiency is thought to vary considerably between triatomine species; thus the presence of vectors is no guarantee of transmission. Infection with T. cruzi is lifelong, so positive patient serology is not a reliable indicator of active parasite transmission either. Traditionally, active T. cruzi transmission has been inferred from positive serology among younger age classes. Especially in hyperendemic areas of Bolivia, Paraguay and Argentina, the proportion of seropositive individuals increases with age [52,53]. MOI in T. cruzi patients should follow a similar trend given a stable force of infection. Furthermore, MOI comparisons between disease foci could, controlling for age, facilitate an appreciation of relative transmission intensities, a useful tool for those who wish to track the efficacy of interventions. In the current study, however, we were unable to identify a correlation between MOI and age, even once patient sex and clinical form had been corrected for. Our inability to validate this fundamental prediction has many possible causes. First, patients in each cohort originate from different communities within each study area (Table 1). Micro-geographic variation in T. cruzi genetic diversity is commonly observed (e.g. [11,54,55]), and the same is likely to be true for infection intensity. Thus, if patients from different sites have dissimilar histories in the intensity and diversity of exposure to T. cruzi clones, comparisons between them are difficult to make. Second, the relationship between MOI and age is not necessarily linear. If a degree of cross-genotype immunity accumulates with exposure, one might expect a slower increase in intra-host antigenic diversity in older age groups. However, this was not the case in our dataset, and neither a linear nor a unimodal relationship could be established.
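The prediction that MOI should rise with age under a stable force of infection can be illustrated with a simple catalytic model. The sketch below is an assumption-laden illustration rather than an analysis performed in this study: it assumes a constant force of infection, lifelong infection, no clearance of clones, no cross-genotype immunity, and that each infection event contributes one distinct clone. The function names and the value of 0.02 infections per year are hypothetical.

```python
import math


def seroprevalence(age_years, foi):
    """Expected proportion infected by a given age under a constant
    force of infection (foi, infection events per person per year)."""
    return -math.expm1(-foi * age_years)  # equals 1 - exp(-foi * age)


def expected_moi(age_years, foi):
    """Expected number of infection events among individuals infected at
    least once, i.e. the mean of a zero-truncated Poisson distribution."""
    mean_events = foi * age_years
    if mean_events == 0:
        return 1.0  # limiting value as the mean approaches zero
    return mean_events / -math.expm1(-mean_events)


# Hypothetical force of infection: 0.02 infection events per person per year.
for age in (10, 30, 50, 70):
    print(age, round(seroprevalence(age, 0.02), 3), round(expected_moi(age, 0.02), 3))
```

Under these assumptions both quantities increase monotonically with age. Accumulating cross-genotype immunity, as discussed above, would flatten the MOI curve in older age groups.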
Amplicon sequencing approaches to the study of transmission patterns in human parasites have so far been restricted to those species that replicate and reach high parasitemias in peripheral blood (i.e. T. brucei and P. falciparum [13,15]). T. cruzi trypomastigote circulating parasitemias, as measured by qPCR, are thought to vary considerably between acute (400 parasites/ml), newborn (150–12000 parasites/ml) and chronic (3–16 parasites/ml) cases [25,57]. Nonetheless, they remain several orders of magnitude lower than those that occur during T. brucei or P. falciparum infections. Low circulating T. cruzi parasitemia presents major problems to studies that aim to achieve molecular diagnosis of CD in chronic cases, and ours is no exception. One problem is that much of the parasite diversity present in the host is likely to be sequestered in the tissues at any given time, as our sequential samples from Goias also suggest. Thus blood stage parasite genetic diversity may be a poor representation of that actually present in the host. Another confounder is culture bias: differential growth of clones in culture, as well as loss of clonal diversity during repassage, can both influence diversity estimates. Attempts to generate amplicon sequence data directly from clinical blood samples would likely be thwarted by low circulating parasitemia [25, 56]. Instead we elected to enrich for parasite DNA via culture, in Goias without further repassage but in Bolivia with at least one repassage before cryopreservation. Low circulating parasitemia in Chagas patients also means that amplicon sequencing strategies might rapidly 'bottom out' if few parasites are present within a sample. In our dataset, for example, at the ND5 locus, minority DTUs at 97% divergence can be present at a proportion of < 1 in 1000 (Fig. 1), with the implication that several thousand parasites must be present in the sample.
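The reasoning that a 1 in 1000 minority implies several thousand parasites in the sample can be checked with a simple binomial sampling argument. The sketch below assumes that parasites are sampled independently and that read proportions reflect true proportions (an assumption the PCR bias discussed in this section calls into question). The function names and the 95% detection threshold are hypothetical choices for illustration.

```python
import math


def prob_detect(minor_freq, n_parasites):
    """Probability that at least one minority-type parasite is present
    among n_parasites drawn independently at frequency minor_freq."""
    return 1.0 - (1.0 - minor_freq) ** n_parasites


def parasites_needed(minor_freq, target_prob=0.95):
    """Smallest sample size giving at least target_prob of containing
    one or more minority-type parasites."""
    return math.ceil(math.log(1.0 - target_prob) / math.log(1.0 - minor_freq))


# A minority type at 1 in 1000, as observed for ND5 in some samples.
print(prob_detect(0.001, 1000))
print(parasites_needed(0.001))
```

With a minority frequency of 1 in 1000, a sample of 1000 parasites would miss the minority type more than a third of the time, and on the order of 3000 parasites are needed to include it with 95% probability.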
In both Goias and Bolivia, matched instances occurred in congenital cases where TcI exists in mother and infant as the minor DTU at similar relative abundance (i.e. 1 in 1000, Fig. 1). It is highly unlikely that these data directly reflect chronic CD parasitemia levels. Instead, with reference to the data we obtained from the controls, PCR amplification bias is a more likely source of unrealistic major to minor genotype ratios. For example, a fourfold over-representation of an ST in the original sample can result in 100–1000-fold over-representation after PCR. However, while the relative abundance of sequence types recovered using the amplicon approach may be an inaccurate reflection of those present for both ND5 and TcGP63, similar profiles between mother and infant suggest that this bias is likely to be consistent across samples. Thus comparisons between samples are still valid. Furthermore, for ND5 at least, it seems that T. cruzi frequently exchanges mitochondrial (maxicircle) genomes with little apparent evidence of nuclear exchange [11,29]. Fusion of maxicircle genomes occurs transiently during T. brucei genetic exchange events, and may also do so in T. cruzi. Even though standard maxicircle genotyping of progeny only ever reveals a single parent in both species, it is possible that heterologous maxicircle sequences may persist at low abundance in parasite clones. Such a phenomenon could explain the DTU sequence type ratios observed, and this study is the first to sequence a maxicircle gene to this depth.
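One way in which a modest starting imbalance could inflate to the 100–1000-fold range is through a small difference in per-cycle amplification efficiency compounding over many cycles. The sketch below is a toy model under that assumption only; the source of the bias was not characterized here, and the efficiencies and cycle number used are hypothetical values, not measurements from the control mixes.

```python
def ratio_after_pcr(initial_ratio, eff_major, eff_minor, cycles):
    """Major:minor template ratio after exponential amplification.

    Each template is multiplied by (1 + efficiency) per cycle, so the
    ratio grows by (1 + eff_major) / (1 + eff_minor) every cycle.
    Plateau effects and chimera formation are ignored.
    """
    per_cycle_gain = (1.0 + eff_major) / (1.0 + eff_minor)
    return initial_ratio * per_cycle_gain ** cycles


# Hypothetical values: fourfold initial excess, 30 cycles,
# 95% versus 70% per-cycle efficiency for major and minor types.
print(ratio_after_pcr(4.0, 0.95, 0.70, 30))

# With equal efficiencies the starting ratio is preserved.
print(ratio_after_pcr(4.0, 0.90, 0.90, 30))
```

If the efficiency difference is a property of the template sequence rather than of the individual sample, the same distortion applies to every sample, which is consistent with the argument above that between-sample comparisons remain valid.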
There is a general consensus in the literature that the likelihood of congenital CD transmission is not strongly influenced by the genotype of the parasite infecting the mother [60–62]. Nonetheless, the majority of cases are reported in the Southern Cone region of South America, providing a circumstantial link with the major human-associated T. cruzi genotypes TcV, TcII and TcVI. In this study, in the one mixed infection we found, major and minor DTUs (TcVI / TcI) detected in the mother at the ND5 locus were recovered from the infant in similar proportions. TcGP63I beta diversity comparisons of STs defined at 99% showed substantial sharing of STs between mother and infant (Fig. 5). However, both beta diversity comparisons (Fig. 5) and total ST diversity (alpha) comparisons (Fig. 3) at 99% indicate that while maternal diversity sometimes exceeds that of the infant (explicable perhaps by sequestration in the mother and selective or stochastic trans-placental transfer), the reverse is frequently true. The occurrence of STs in the infant that are not present in the mother has several possible explanations. The infants sampled in this study were neonates, thus superinfection can be ruled out as a source of further parasite clonal diversity. A recent study of infected neonates in Argentina estimated mean infant parasitemia at 1,789 parasites/ml via qPCR, far in excess of what one might expect in the mother. Thus the parasite sample size discrepancy between mother and infant perhaps explains the unexpected levels of diversity in the infant. Even though the TcGP63I gene family is apparently under intense diversifying selection, it seems unlikely that point mutation could generate novel variants over such a short time scale to explain genetic diversity in the infant. Structural variants and homologous recombination are a potential source of diversity, although most, if not all, recombinants should have been excluded in the quality filtering stages, and would be hard to distinguish from PCR chimeras in any case.
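The mother and infant comparisons described above rest on two kinds of summary: within-sample (alpha) diversity and between-sample (beta) sharing of STs. The following minimal sketch shows one common choice for each, the Shannon index and Jaccard similarity, applied to ST read counts. The specific metrics behind Figs. 3 and 5 are not restated in this section, so these choices are assumptions for illustration, and the ST labels and counts are invented.

```python
import math


def shannon_index(counts):
    """Shannon diversity (natural log) of a dict mapping ST -> read count."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)


def jaccard_similarity(counts_a, counts_b):
    """Shared STs divided by all STs seen in either sample (presence/absence)."""
    sts_a = {st for st, n in counts_a.items() if n > 0}
    sts_b = {st for st, n in counts_b.items() if n > 0}
    union = sts_a | sts_b
    return len(sts_a & sts_b) / len(union) if union else 1.0


def infant_only_sts(mother, infant):
    """STs detected in the infant but absent from the mother."""
    return sorted(st for st, n in infant.items()
                  if n > 0 and mother.get(st, 0) == 0)


# Invented read counts for one mother and infant pair.
mother = {"ST1": 900, "ST2": 100}
infant = {"ST1": 800, "ST2": 150, "ST5": 50}
print(shannon_index(mother), shannon_index(infant))
print(jaccard_similarity(mother, infant))
print(infant_only_sts(mother, infant))
```

In this invented pair the infant carries every maternal ST plus one additional ST, so infant alpha diversity exceeds maternal diversity, the pattern described above as frequently observed. Because read depths were normalized within cohorts before comparison, raw counts like these would need the same treatment before indices are compared across samples.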
Many important T. cruzi surface genes belong to large, recently expanded paralogous multigene families. The abundance of these gene copies highlights their likely adaptive significance in terms of infectivity and host immune evasion, especially because trypanosomatids exert so little control of gene expression at the level of transcription. In Leishmania major, for example, it has recently been shown that gene amplification may rapidly duplicate segments of the genome in response to environmental stress. As well as expansion, adaptive change is also likely to occur at the amino acid level among members of paralogous gene families, as has been suggested for T. brucei. Despite the relatively small size of the TcGP63I gene family, the amplicon sequencing approach we employed allowed us to explore selection at the level of the gene within the population, i.e. within and between parasite genomes, and within and between hosts. Highly elevated non-synonymous substitutions suggest intense diversifying selection within TcGP63Ia and TcGP63Ib STs respectively for those assigned to TcII or TcI. STs from patients infected with TcIII-TcVI (putative TcV) showed few apparent substitutions (Table 2), perhaps consistent with the recent origin of this DTU. The sequence fragment we studied was outside the zinc binding domain of this metalloprotease, indicating that selective forces can act on this protein independently of its core proteolytic function, perhaps through repeated exposure to host immunity.
It is important not to overlook the potential importance of multiclonal infections for parasitic disease, both as markers of population-level factors such as parasite transmission and at the host level, including immunity and disease progression. In this study we have developed an amplicon sequencing approach to probe parasite genetic diversity within and among clinical CD cases to unprecedented depth. While our approach shows the power of amplicon sequencing to resolve diversity in clinical and congenital CD cases, it also highlights the potential biases that might be introduced with the addition of a PCR step. A tool that allows the accurate evaluation of MOI would be valuable for tracking transmission rates at restricted disease foci (i.e. villages, outbreaks) in the context of measuring the success of intervention strategies. A similar tool could provide a powerful means of longitudinal tracking of T. cruzi infections in terms of disease progression, treatment failure and immunosuppression. Here we demonstrate that amplicon sequencing could have a role to play in this context. However, as sequencing costs decline and reference genome assemblies improve, whole genome deep sequencing, perhaps even of individual parasite cells, becomes an increasingly viable option, as it already has for Plasmodium sp. [7,67].
S1 Fig. TcGP63Ia and Ib amino acid alignments showing amplicon seq primer binding sites in relation to putative functional domains.
Amino acid sequences are derived from those defined by Cuevas and colleagues. The colour key on the left-hand side indicates primer binding sites and functional domains. The green shaded regions indicate the area covered by the Illumina paired-end reads along each amplicon. The purple shaded central region indicates the area not covered.
S2 Fig. Bar plot of amplicon sequence data generated from control DTU mixes.
Expected ratios of ND5 sequence types (far right) are compared to those recovered via amplicon sequencing. All three sequence types (I, II, III-VI) were recovered from all but the two most concentrated control mixes. However, the relative proportions of each sequence type derived from amplicon sequence data were radically different from those expected.
S3 Fig. Maximum likelihood phylogeny of 97% TcGP63I STs derived in this study and available T. cruzi and T. cruzi marinkellei TcGP63 paralogues.
Homologous sequences were recovered from www.TriTrypDB.org via BLAST. The appropriate substitution model was defined as the transversion model with invariable sites plus gamma in Topali. Abundant ST labels correspond with those indicated in Table 2. Branches are coloured by source DTU, or red for sequences generated in this study. Reference sequences TcGP63Ia and TcGP63Ib from the literature are also shown alongside 97% sequence types generated in this study.
We gratefully acknowledge the assistance of A Boland and R Olaso at the CNG for their help with sample processing. M Lewis at the LSHTM provided valuable comments in the analytical stages. S Creer at MEFGL Bangor provided helpful suggestions for appropriate software tools and analyses.
Conceived and designed the experiments: MSL MAM MD. Performed the experiments: MSL LAM AOL LG BC. Analyzed the data: MSL CB. Contributed reagents/materials/analysis tools: FT SBNT AOL ND JFD SS. Wrote the paper: MSL LAM MAM SS AOL.
- 1. Rassi A Jr., Rassi A, Marin-Neto JA (2010) Chagas disease. Lancet 375: 1388–1402. pmid:20399979
- 2. Tarleton RL, Reithinger R, Urbina JA, Kitron U, Gurtler RE (2007) The challenges of Chagas Disease—grim outlook or glimmer of hope. PLoS Med 4: e332. pmid:18162039
- 3. Prata A (2001) Clinical and epidemiological aspects of Chagas disease. Lancet Infect Dis 1: 92–100. pmid:11871482
- 4. Machado FS, Tyler KM, Brant F, Esper L, Teixeira MM, Tanowitz HB (2012) Pathogenesis of Chagas disease: time to move on. Front Biosci (Elite Ed) 4: 1743–1758. pmid:22201990
- 5. Lewis MD, Fortes Francisco A, Taylor MC, Burrell-Saward H, McLatchie AP, Miles MA, Kelly JM (2014) Bioluminescence imaging of chronic Trypanosoma cruzi infections reveals tissue-specific parasite dynamics and heart disease in the absence of locally persistent infection. Cell Microbiol.
- 6. Balmer O, Tanner M (2011) Prevalence and implications of multiple-strain infections. Lancet Infect Dis 11: 868–878. pmid:22035615
- 7. Manske M, Miotto O, Campino S, Auburn S, Almagro-Garcia J, Maslen G, O'Brien J, Djimde A, Doumbo O, Zongo I, Ouedraogo JB, Michon P, Mueller I, Siba P, Nzila A, Borrmann S, Kiara SM, Marsh K, Jiang H, Su XZ, Amaratunga C, Fairhurst R, Socheat D, Nosten F, Imwong M, White NJ, Sanders M, Anastasi E, Alcock D, Drury E, Oyola S, Quail MA, Turner DJ, Ruano-Rubio V, Jyothi D, Amenga-Etego L, Hubbart C, Jeffreys A, Rowlands K, Sutherland C, Roper C, Mangano V, Modiano D, Tan JC, Ferdig MT, Amambua-Ngwa A, Conway DJ, Takala-Harrison S, Plowe CV, Rayner JC, Rockett KA, Clark TG, Newbold CI, Berriman M, MacInnis B, Kwiatkowski DP (2012) Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing. Nature 487: 375–379. pmid:22722859
- 8. Assefa SA, Preston MD, Campino S, Ocholla H, Sutherland CJ, Clark TG (2014) estMOI: estimating multiplicity of infection using parasite deep sequencing data. Bioinformatics 30: 1292–1294. pmid:24443379
- 9. Yeo M, Lewis MD, Carrasco HJ, Acosta N, Llewellyn M, da Silva Valente SA, de Costa Valente V, de Arias AR, Miles MA (2007) Resolution of multiclonal infections of Trypanosoma cruzi from naturally infected triatomine bugs and from experimentally infected mice by direct plating on a sensitive solid medium. Int J Parasitol 37: 111–120. pmid:17052720
- 10. Llewellyn MS, Rivett-Carnac JB, Fitzpatrick S, Lewis MD, Yeo M, Gaunt MW, Miles MA (2011) Extraordinary Trypanosoma cruzi diversity within single mammalian reservoir hosts implies a mechanism of diversifying selection. Int J Parasitol 41: 609–614. pmid:21232539
- 11. Ramirez JD, Guhl F, Messenger LA, Lewis MD, Montilla M, Cucunuba Z, Miles MA, Llewellyn MS (2012) Contemporary cryptic sexuality in Trypanosoma cruzi. Mol Ecol 21: 4216–4226. pmid:22774844
- 12. Perez CJ, Lymbery AJ, Thompson RC (2014) Chagas disease: the challenge of polyparasitism? Trends Parasitol 30: 176–182. pmid:24581558
- 13. Taylor SM, Parobek CM, Aragam N, Ngasala BE, Martensson A, Meshnick SR, Juliano JJ (2013) Pooled deep sequencing of Plasmodium falciparum isolates: an efficient and scalable tool to quantify prevailing malaria drug-resistance genotypes. J Infect Dis 208: 1998–2006. pmid:23908494
- 14. Gibson RM, Meyer AM, Winner D, Archer J, Feyertag F, Ruiz-Mateos E, Leal M, Robertson DL, Schmotzer CL, Quinones-Mateu ME (2014) Sensitive deep-sequencing-based HIV-1 genotyping assay to simultaneously determine susceptibility to protease, reverse transcriptase, integrase, and maturation inhibitors, as well as HIV-1 coreceptor tropism. Antimicrob Agents Chemother 58: 2167–2185. pmid:24468782
- 15. Juliano JJ, Porter K, Mwapasa V, Sem R, Rogers WO, Ariey F, Wongsrichanalai C, Read A, Meshnick SR (2010) Exposing malaria in-host diversity and estimating population diversity by capture-recapture using massively parallel pyrosequencing. Proc Natl Acad Sci U S A 107: 20138–20143. pmid:21041629
- 16. Parobek CM, Bailey JA, Hathaway NJ, Socheat D, Rogers WO, Juliano JJ (2014) Differing patterns of selection and geospatial genetic diversity within two leading Plasmodium vivax candidate vaccine antigens. PLoS Negl Trop Dis 8: e2796. pmid:24743266
- 17. El-Sayed NM, Myler PJ, Bartholomeu DC, Nilsson D, Aggarwal G, Tran AN, Ghedin E, Worthey EA, Delcher AL, Blandin G, Westenberger SJ, Caler E, Cerqueira GC, Branche C, Haas B, Anupama A, Arner E, Aslund L, Attipoe P, Bontempi E, Bringaud F, Burton P, Cadag E, Campbell DA, Carrington M, Crabtree J, Darban H, da Silveira JF, de Jong P, Edwards K, Englund PT, Fazelina G, Feldblyum T, Ferella M, Frasch AC, Gull K, Horn D, Hou L, Huang Y, Kindlund E, Klingbeil M, Kluge S, Koo H, Lacerda D, Levin MJ, Lorenzi H, Louie T, Machado CR, McCulloch R, McKenna A, Mizuno Y, Mottram JC, Nelson S, Ochaya S, Osoegawa K, Pai G, Parsons M, Pentony M, Pettersson U, Pop M, Ramirez JL, Rinta J, Robertson L, Salzberg SL, Sanchez DO, Seyler A, Sharma R, Shetty J, Simpson AJ, Sisk E, Tammi MT, Tarleton R, Teixeira S, Van Aken S, Vogt C, Ward PN, Wickstead B, Wortman J, White O, Fraser CM, Stuart KD, Andersson B (2005) The genome sequence of Trypanosoma cruzi, etiologic agent of Chagas disease. Science 309: 409–415. pmid:16020725
- 18. Franzen O, Ochaya S, Sherwood E, Lewis MD, Llewellyn MS, Miles MA, Andersson B (2011) Shotgun sequencing analysis of Trypanosoma cruzi I Sylvio X10/1 and comparison with T. cruzi VI CL Brener. PLoS Negl Trop Dis 5: e984. pmid:21408126
- 19. Pereira-Chioccola VL, Schenkman S (1999) Biological role of Trypanosoma cruzi trans-sialidase. Biochem Soc Trans 27: 516–518. pmid:10917632
- 20. Ma L, Chen K, Meng Q, Liu Q, Tang P, Hu S, Yu J (2011) An evolutionary analysis of trypanosomatid GP63 proteases. Parasitol Res 109: 1075–1084. pmid:21503641
- 21. Yao C (2010) Major surface protease of trypanosomatids: one size fits all? Infect Immun 78: 22–31. pmid:19858295
- 22. Grandgenett PM, Otsu K, Wilson HR, Wilson ME, Donelson JE (2007) A function for a specific zinc metalloprotease of African trypanosomes. PLoS Pathog 3: 1432–1445. pmid:17953481
- 23. Kulkarni MM, Olson CL, Engman DM, McGwire BS (2009) Trypanosoma cruzi GP63 proteins undergo stage-specific differential posttranslational modification and are important for host cell infection. Infect Immun 77: 2193–2200. pmid:19273559
- 24. Cuevas IC, Cazzulo JJ, Sanchez DO (2003) gp63 homologues in Trypanosoma cruzi: surface antigens with metalloprotease activity and a possible role in host cell infection. Infect Immun 71: 5739–5749. pmid:14500495
- 25. Duffy T, Cura CI, Ramirez JC, Abate T, Cayo NM, Parrado R, Bello ZD, Velazquez E, Munoz-Calderon A, Juiz NA, Basile J, Garcia L, Riarte A, Nasser JR, Ocampo SB, Yadon ZE, Torrico F, de Noya BA, Ribeiro I, Schijman AG (2013) Analytical performance of a multiplex Real-Time PCR assay using TaqMan probes for quantification of Trypanosoma cruzi satellite DNA in blood samples. PLoS Negl Trop Dis 7: e2000. pmid:23350002
- 26. Marsden P, Garcia-Zapata MT, Castillo EA, Prata AR, Macedo VO (1994) [The first 13 years of controlling Chagas' disease in Mambai, Goias, Brazil 1980–1992]. Bol Oficina Sanit Panam 116: 111–117. pmid:8161419
- 27. Schofield CJ, Dias JC (1999) The Southern Cone Initiative against Chagas disease. Adv Parasitol 42: 1–27. pmid:10050271
- 28. Espinoza N, Borras R, Abad-Franch F (2014) Chagas disease vector control in a hyperendemic setting: the first 11 years of intervention in Cochabamba, Bolivia. PLoS Negl Trop Dis 8: e2782. pmid:24699407
- 29. Messenger L, Llewellyn M, Bhattacharyya T, Franzén O, Lewis M, Ramírez J, Carrasco H, Andersson B, Miles M (2011) Multiple mitochondrial introgression events and heteroplasmy in Trypanosoma cruzi revealed by maxicircle MLST and Next Generation Sequencing PLoS Negl Trop Dis In press.
- 30. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792–1797. pmid:15034147
- 31. Zhou J, Wu L, Deng Y, Zhi X, Jiang Y-H, Tu Q, Xie J, Van Nostrand JD, He Z, Yang Y (2011) Reproducibility and quantitation of amplicon sequencing-based detection. ISME J 5: 1303–1313. pmid:21346791
- 32. Joshi N, Fass J (2011) Sickle: A sliding-window, adaptive, quality-based trimming tool for FastQ files (Version 1.29) [Software]. https://github.com/najoshi/sickle.
- 33. Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9: 357–359. pmid:22388286
- 34. Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, Lesniewski RA, Oakley BB, Parks DH, Robinson CJ, Sahl JW, Stres B, Thallinger GG, Van Horn DJ, Weber CF (2009) Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol 75: 7537–7541. pmid:19801464
- 35. Edgar RC (2013) UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nat Methods 10: 996–998. pmid:23955772
- 36. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Pena AG, Goodrich JK, Gordon JI, Huttley GA, Kelley ST, Knights D, Koenig JE, Ley RE, Lozupone CA, McDonald D, Muegge BD, Pirrung M, Reeder J, Sevinsky JR, Turnbaugh PJ, Walters WA, Widmann J, Yatsunenko T, Zaneveld J, Knight R (2010) QIIME allows analysis of high-throughput community sequencing data. Nat Methods 7: 335–336. pmid:20383131
- 37. Oksanen J, Blanchet F, Kindt K, Legendre P, Minchin P, O'Hara R, Simpson G, Solymos P, Stevens M, Wagner H (2015) vegan: Community Ecology Package. R package version 2.2-1. http://CRAN.R-project.org/package=vegan.
- 38. Zhang Z, Li J, Zhao XQ, Wang J, Wong GK, Yu J (2006) KaKs_Calculator: calculating Ka and Ks through model selection and model averaging. Genomics Proteomics Bioinformatics 4: 259–263. pmid:17531802
- 39. Yang Z, Nielsen R (2000) Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol Biol Evol 17: 32–43. pmid:10666704
- 40. Machado CA, Ayala FJ (2001) Nucleotide sequences provide evidence of genetic exchange among distantly related lineages of Trypanosoma cruzi. Proc Natl Acad Sci U S A 98: 7396–7401. pmid:11416213
- 41. Lewis MD, Ma J, Yeo M, Carrasco HJ, Llewellyn MS, Miles MA (2009) Genotyping of Trypanosoma cruzi: systematic selection of assays allowing rapid and accurate discrimination of all known lineages. Am J Trop Med Hyg 81: 1041–1049. pmid:19996435
- 42. Keylock CJ (2005) Simpson diversity and the Shannon–Wiener index as special cases of a generalized entropy. Oikos 109: 203–207.
- 43. Yeo M, Mauricio IL, Messenger LA, Lewis MD, Llewellyn MS, Acosta N, Bhattacharyya T, Diosque P, Carrasco HJ, Miles MA (2011) Multilocus sequence typing (MLST) for lineage assignment and high resolution diversity studies in Trypanosoma cruzi. PLoS Negl Trop Dis 5: e1049. pmid:21713026
- 44. Westenberger SJ, Cerqueira GC, El-Sayed NM, Zingales B, Campbell DA, Sturm NR (2006) Trypanosoma cruzi mitochondrial maxicircles display species- and strain-specific variation and a conserved element in the non-coding region. BMC Genomics 7: 60. pmid:16553959
- 45. Ojurongbe O, Fagbenro-Beyioku AF, Adeyeba OA, Kun JF (2011) Allelic diversity of merozoite surface protein 2 gene of P. falciparum among children in Osogbo, Nigeria. West Indian Med J 60: 19–23. pmid:21809706
- 46. Buchholz U, Kobbe R, Danquah I, Zanger P, Reither K, Abruquah HH, Grobusch MP, Ziniel P, May J, Mockenhaupt FP (2010) Multiplicity of Plasmodium falciparum infection following intermittent preventive treatment in infants. Malar J 9: 244. pmid:20796302
- 47. Talisuna A, Okello P, Erhart A, Coosemans M, D’Alessandro U (2007) Intensity of Malaria Transmission and the Spread of Plasmodium falciparum–Resistant Malaria: A Review of Epidemiologic Field Evidence. In: Breman J, Alilio M, White N, editors. Defining and Defeating the Intolerable Burden of Malaria III Progress and Perspectives: American Society of Tropical Medicine and Hygiene.
- 48. Ibanez-Cervantes G, Martinez-Ibarra A, Nogueda-Torres B, Lopez-Orduna E, Alonso AL, Perea C, Maldonado T, Hernandez JM, Leon-Avila G (2013) Identification by Q-PCR of Trypanosoma cruzi lineage and determination of blood meal sources in triatomine gut samples in Mexico. Parasitol Int 62: 36–43. pmid:22995149
- 49. Rabinovich JE, Wisnivesky-Colli C, Solarz ND, Gurtler RE (1990) Probability of transmission of Chagas disease by Triatoma infestans (Hemiptera: Reduviidae) in an endemic area of Santiago del Estero, Argentina. Bull World Health Organ 68: 737–746. pmid:2127382
- 50. Nouvellet P, Dumonteil E, Gourbiere S (2013) The improbable transmission of Trypanosoma cruzi to human: the missing link in the dynamics and control of Chagas disease. PLoS Negl Trop Dis 7: e2505. pmid:24244766
- 51. Borges-Pereira J, Pessoa I, Coura J (1988) Observações sobre as dejeções e o número de T. cruzi eliminados por diferentes espécies de triatomíneos durante a alimentação. Mem Inst Oswaldo Cruz 83: 7.
- 52. Samuels AM, Clark EH, Galdos-Cardenas G, Wiegand RE, Ferrufino L, Menacho S, Gil J, Spicer J, Budde J, Levy MZ, Bozo RW, Gilman RH, Bern C, Working Group on Chagas Disease in B, Peru (2013) Epidemiology of and impact of insecticide spraying on Chagas disease in communities in the Bolivian Chaco. PLoS Negl Trop Dis 7: e2358. pmid:23936581
- 53. Dias JC (2007) Southern Cone Initiative for the elimination of domestic populations of Triatoma infestans and the interruption of transfusional Chagas disease. Historical aspects, present situation, and perspectives. Mem Inst Oswaldo Cruz 102 Suppl 1: 11–18. pmid:17891281
- 54. Baptista Rde P, D'Avila DA, Segatto M, Valle IF, Franco GR, Valadares HM, Gontijo ED, Galvao LM, Pena SD, Chiari E, Machado CR, Macedo AM (2014) Evidence of substantial recombination among Trypanosoma cruzi II strains from Minas Gerais. Infect Genet Evol 22: 183–191. pmid:24296011
- 55. Ocana-Mayorga S, Llewellyn MS, Costales JA, Miles MA, Grijalva MJ (2010) Sex, subdivision, and domestic dispersal of Trypanosoma cruzi lineage I in southern Ecuador. PLoS Negl Trop Dis 4: e915. pmid:21179502
- 56. Oberle M, Balmer O, Brun R, Roditi I (2010) Bottlenecks and the maintenance of minor genotypes during the life cycle of Trypanosoma brucei. PLoS Pathog 6: e1001023. pmid:20686656
- 57. Bua J, Volta BJ, Perrone AE, Scollo K, Velazquez EB, Ruiz AM, De Rissio AM, Cardoni RL (2013) How to improve the early diagnosis of Trypanosoma cruzi infection: relationship between validated conventional diagnosis and quantitative DNA amplification in congenitally infected children. PLoS Negl Trop Dis 7: e2476. pmid:24147166
- 58. Burgos JM, Diez M, Vigliano C, Bisio M, Risso M, Duffy T, Cura C, Brusses B, Favaloro L, Leguizamon MS, Lucero RH, Laguens R, Levin MJ, Favaloro R, Schijman AG (2010) Molecular identification of Trypanosoma cruzi discrete typing units in end-stage chronic Chagas heart disease and reactivation after heart transplantation. Clin Infect Dis 51: 485–495. pmid:20645859
- 59. Gibson W, Garside L (1990) Kinetoplast DNA minicircles are inherited from both parents in genetic hybrids of Trypanosoma brucei. Mol Biochem Parasitol 42: 45–53. pmid:2233899
- 60. Ortiz S, Zulantay I, Solari A, Bisio M, Schijman A, Carlier Y, Apt W (2012) Presence of Trypanosoma cruzi in pregnant women and typing of lineages in congenital cases. Acta Trop 124: 243–246. pmid:22906640
- 61. Burgos JM, Altcheh J, Bisio M, Duffy T, Valadares HM, Seidenstein ME, Piccinali R, Freitas JM, Levin MJ, Macchi L, Macedo AM, Freilij H, Schijman AG (2007) Direct molecular profiling of minicircle signatures and lineages of Trypanosoma cruzi bloodstream populations causing congenital Chagas disease. Int J Parasitol 37: 1319–1327. pmid:17570369
- 62. Virreira M, Alonso-Vega C, Solano M, Jijena J, Brutus L, Bustamante Z, Truyens C, Schneider D, Torrico F, Carlier Y, Svoboda M (2006) Congenital Chagas disease in Bolivia is not associated with DNA polymorphism of Trypanosoma cruzi. Am J Trop Med Hyg 75: 871–879. pmid:17123980
- 63. Clayton C (2013) The regulation of trypanosome gene expression by RNA-binding proteins. PLoS Pathog 9: e1003680. pmid:24244152
- 64. Ubeda JM, Raymond F, Mukherjee A, Plourde M, Gingras H, Roy G, Lapointe A, Leprohon P, Papadopoulou B, Corbeil J, Ouellette M (2014) Genome-wide stochastic adaptive DNA amplification at direct and inverted DNA repeats in the parasite Leishmania. PLoS Biol 12: e1001868. pmid:24844805
- 65. Emes RD, Yang Z (2008) Duplicated paralogous genes subject to positive selection in the genome of Trypanosoma brucei. PLoS One 3: e2295. pmid:18509460
- 66. Lewis MD, Llewellyn MS, Yeo M, Acosta N, Gaunt MW, Miles MA (2011) Recent, independent and anthropogenic origins of Trypanosoma cruzi hybrids. PLoS Negl Trop Dis 5: e1363. pmid:22022633
- 67. Nair S, Nkhoma SC, Serre D, Zimmerman PA, Gorena K, Daniel BJ, Nosten F, Anderson TJ, Cheeseman IH (2014) Single-cell genomics for dissection of complex malaria infections. Genome Res 24: 1028–1038. pmid:24812326
- 68. Milne I, Lindner D, Bayer M, Husmeier D, McGuire G, Marshall DF, Wright F (2009) TOPALi v2: a rich graphical interface for evolutionary analyses of multiple alignments on HPC clusters and multi-core desktops. Bioinformatics 25: 126–127. pmid:18984599