Putative Panmixia in Restricted Populations of Trypanosoma cruzi Isolated from Wild Triatoma infestans in Bolivia

Trypanosoma cruzi, the causative agent of Chagas disease, is subdivided into six discrete typing units (DTUs; TcI–TcVI) of which TcI is ubiquitous and genetically highly variable. While clonality is the dominant mode of propagation, recombination events play a significant evolutionary role. Recently, foci of wild Triatoma infestans have been described in Bolivia, mainly infected by TcI. Hence, for the first time, we evaluated the level of genetic exchange within natural, potentially panmictic TcI populations (single DTU, host, area and sampling time). Seventy-nine TcI stocks from wild T. infestans, belonging to six populations, were characterized at eight microsatellite loci. For each population, Hardy-Weinberg equilibrium (HWE), linkage disequilibrium (LD), and presence of repeated multilocus genotypes (MLG) were analyzed using a total of seven statistics to test the null hypothesis of panmixia (H0). For three populations, none of the seven statistics allowed H0 to be rejected; for another one, the small sample size did not allow us to reach a conclusion; and for the two others, the tests gave contradictory results. Interestingly, apparent panmixia was only observed in very restricted areas, and was not observed when grouping populations only two kilometers or more apart. Nevertheless, it is worth stressing that for the statistical tests of HWE, in order to minimize the type I error (i.e., incorrect rejection of a true H0), we used the Bonferroni correction (BC), which is known to considerably increase the type II error (i.e., failure to reject a false H0). For the other tests (LD and MLG), we did not use BC and the risk of type II error in these cases was acceptable. Thus, these results should be considered a good indicator of the existence of panmixia in the wild environment, but this must be confirmed on larger samples to reduce the risk of type II error.


Introduction
Trypanosoma cruzi is the causative agent of Chagas disease, which affects about eight million people in Latin America, of whom 30-40% either suffer from or will develop cardiomyopathy, digestive megasyndromes, or both. Moreover, Chagas disease is becoming an emerging health problem in nonendemic areas because of the increasing number of migrants from endemic areas [1]. The T. cruzi species exhibits a very high genetic variability similar to that observed within different species of other kinetoplastidae such as Leishmania [2]. Consensual taxonomy recognizes six discrete typing units (DTUs named TcI-TcVI) [3] and one additional group only found in bats (Tcbat) [4] within T. cruzi [5]; TcI is the most genetically diversified and ubiquitous of them, spreading from the United States to Argentina, and present in both sylvatic and domestic biotopes. As a result of the dominant clonal multiplication, identical multilocus genotypes (MLGs) have been sampled over several years and over large geographical distances, leading the species to be considered multiclonal [6]. Long-term clonal evolution is involved in the current substantial genetic diversity of the species, but more and more "genetic exchange" events are being described. Scarce hybridization events are the source of two hybrid DTUs [7][8][9], mitochondrial introgression events have been detected [10,11], and different levels of gene recombination have been described [12][13][14]. In addition, high genome plasticity is also a source of variability. Aneuploidy is suspected [15], occurrence of allele loss is possible during genetic exchanges, the mitochondrial genome is probably more complex than previously described, and maxicircle gene recombination occurs as well as intragenic recombination [14]; heteroplasmy has also been reported [16]. Several of these genetic exchange mechanisms have been triggered in vitro [17] and are still hotly debated in the field.
As previously stated [18]: "From an epidemiological and medical point of view, the important parameter to evaluate is the stability of the genetic clones in space and time." This stability directly depends on the level of genetic exchanges (in the broad sense). Indeed, within a strict clonal framework the clones are stable in space and time, and they convey similar biological characteristics that can be crucial for epidemiological and medical features generation after generation. In contrast, with more or less frequent recombination, such correlations are not necessarily expected, hence the importance of studying genetic exchanges between stocks.
In general terms, to test panmixia, two prerequisites are needed: (i) the use of an appropriate genetic marker not subjected to selection and with a sufficient level of polymorphism and (ii) populations isolated in restricted areas where parasites are assumed to be in sympatry. Our previous work showed that microsatellite markers are relevant for studying the population genetics of T. cruzi at the DTU level [19]. Moreover, abundant and accessible foci of wild Triatoma infestans vectors mainly infected by TcI have been recently described in Bolivia [20,21]; hence, in the present work it was possible to evaluate the level of genetic exchanges in potentially panmictic T. cruzi TcI populations isolated from sylvatic T. infestans in Bolivia.

Data analysis
"For a majority of pathogens, including the Trypanosomatidae family, the reproductive strategy was mainly deduced from population genetics analysis" [24]. Here, the analyses focused on two kinds of events involved in sexual exchanges: allelic segregation and genetic recombination. Allelic segregation was explored through Hardy-Weinberg equilibrium (HWE) or FIS, while genetic recombination was explored through linkage disequilibrium analysis (LD, nonrandom association between genotypes at independent loci) and the presence or absence of repeated multilocus genotypes (MLG). A previous simulation-based study aiming to estimate the level of clonal reproduction in diploids [25] advised the simultaneous use of FIS (mean and variance) and LD estimators.
FIS is a measure of inbreeding of individuals within a subsample; it also represents the deviation from random union of gametes and varies from −1 (fixed heterozygous) to +1 (fixed homozygous), with FIS = 0 corresponding to Hardy-Weinberg equilibrium. This Wright F-statistic [26] was estimated with Weir and Cockerham's unbiased estimator [27], called f. Negative values of FIS (excess heterozygosity) can be caused by accumulation of mutations in an ancient clonal lineage, a phenomenon called the Meselson effect [28], and are generally regarded as a mark of clonality, as observed in Bdelloid rotifers [29]. Positive values of FIS correspond to inbreeding within the sample, a particular case being the Wahlund effect, when the sample comes from heterogeneous and structured populations. It is worth noting that although mean FIS values are good estimators of deviation from HWE, low FIS values associated with substantial variance of FIS among loci (with some loci displaying an extreme heterozygote deficit and others an extreme excess) can reveal very low levels of sex (cryptic sex) [30]. All statistical tests were based on randomization: data sets fitting the null hypothesis (H0 = panmixia) were generated by randomizing the relevant unit (allele, genotype, etc.). Here, to test HWE within the subsamples, the alleles were permuted among individuals within each subsample and FIS was used as the HWE estimator, while for testing the overall HWE, alleles were permuted among subsamples and FIT was used as the estimator. Moreover, since the presence of null alleles artificially increases FIS estimates, we tested the impact of null alleles on the increased FIS values.
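The within-subsample randomization test described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: it uses the simple estimator FIS = 1 − Ho/He at a single locus rather than Weir and Cockerham's f used in the study, it tests only for heterozygote deficit, and the function names (`fis`, `hwe_permutation_test`) and the toy genotypes are hypothetical.

```python
import random
from collections import Counter


def fis(genotypes):
    """Simple F_IS = 1 - Ho/He at one locus (an assumption; not Weir & Cockerham's f).

    genotypes: list of (allele1, allele2) tuples for diploid individuals.
    """
    n = len(genotypes)
    ho = sum(1 for a, b in genotypes if a != b) / n          # observed heterozygosity
    counts = Counter(allele for g in genotypes for allele in g)
    he = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())  # expected under HWE
    if he == 0.0:
        return float("nan")  # monomorphic locus: F_IS is undefined
    return 1.0 - ho / he


def hwe_permutation_test(genotypes, n_perm=999, seed=0):
    """Permute alleles among individuals within the subsample (H0 = panmixia).

    Returns the observed F_IS and the proportion of randomized data sets
    whose F_IS is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = fis(genotypes)
    alleles = [allele for g in genotypes for allele in g]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(alleles)
        shuffled = list(zip(alleles[0::2], alleles[1::2]))  # re-pair alleles at random
        if fis(shuffled) >= observed - 1e-12:
            hits += 1
    # The observed data set is counted as one of the randomizations.
    return observed, (hits + 1) / (n_perm + 1)
```

With this estimator, a sample made only of heterozygotes for two alleles gives FIS = −1 and a sample made only of homozygotes for two equally frequent alleles gives FIS = +1, matching the bounds given in the text.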
Linkage disequilibrium (LD) is another measure of deviation from panmixia. Here it was estimated in three different ways: (i) by the classical index IA [31], which has the disadvantage of increasing with the number of loci, so we also used a slightly modified index (r̄d) that is independent of the number of loci [32]; (ii) by the log-likelihood ratio G-statistic [33], whose P-value is obtained as follows: genotypes at the two loci are associated at random a number of times and the statistic is recalculated on each randomized data set; the P-value is estimated as the proportion of statistics from randomized data sets that are larger than or equal to the observed one; and (iii) by comparing the observed number of MLGs and the frequency of the most frequent MLG to those expected under simulated panmixia. As for FIS, all the LD statistical tests are based on H0 = panmixia (i.e., the genotypes at the two loci are associated at random a number of times depending on the sample size and the statistics are recalculated on the randomized data set).
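The randomization procedure in point (ii) can be sketched as follows. This is a hedged illustration, not the Fstat implementation: the G-statistic is computed on a plain contingency table of genotype pairs, and the function names and toy data are assumptions introduced for the example.

```python
import math
import random
from collections import Counter


def g_statistic(geno1, geno2):
    """Log-likelihood ratio G for the association between genotypes at two loci."""
    n = len(geno1)
    joint = Counter(zip(geno1, geno2))   # observed counts of genotype pairs
    row = Counter(geno1)                 # genotype counts at locus 1
    col = Counter(geno2)                 # genotype counts at locus 2
    # G = 2 * sum(O * ln(O / E)) with E = row * col / n
    return 2.0 * sum(
        o * math.log(o * n / (row[a] * col[b])) for (a, b), o in joint.items()
    )


def ld_permutation_test(geno1, geno2, n_perm=999, seed=0):
    """Associate genotypes at the two loci at random and recompute G each time.

    The P-value is the proportion of randomized statistics that are larger
    than or equal to the observed one.
    """
    rng = random.Random(seed)
    observed = g_statistic(geno1, geno2)
    shuffled = list(geno2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if g_statistic(geno1, shuffled) >= observed - 1e-9:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

For two loci whose genotypes are perfectly associated across individuals, almost no randomized data set reaches the observed G, so H0 is rejected; for a balanced table with no association, G is zero.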
To test FIS, LD, and MLG, we examined nine subsamples: the six populations under study (Luribay, Mecapaca, Sap-Sap, Sap-Cosi, Qui-Urk, and Qui-Bsia); the subsample "overall Sapini," which clusters the two populations from Sapini (Sap-Sap + Sap-Cosi); the subsample "overall Quillacollo," which clusters the two populations from Quillacollo (Qui-Urk + Qui-Bsia); and the "overall" sample including all stocks (N = 79). The different indices and p-values were associated with their level of significance (NS, not significant; * significant at 5% and ** significant at 1%). As several tests were applied for FIS, LD, and repeated MLG, a decision about rejecting or not rejecting H0 is proposed in each case, namely "reject H0" or "not reject H0" when all tests are congruent, and "ambiguous" when at least one of the tests gave a discordant result.
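The decision rule stated above is simple enough to write down explicitly. The sketch below is an assumption-laden illustration: the function names are hypothetical, and significance is taken as a p-value strictly below the threshold, which the text does not specify.

```python
def significance_label(p_value):
    """Map a p-value to the labels used in the text: NS, * (5%), ** (1%)."""
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "NS"


def decide(p_values, alpha=0.05):
    """Combine several tests of H0 = panmixia into a single decision.

    All tests significant     -> "reject H0"
    No test significant       -> "not reject H0"
    Any discordance           -> "ambiguous"
    """
    rejections = [p < alpha for p in p_values]
    if all(rejections):
        return "reject H0"
    if not any(rejections):
        return "not reject H0"
    return "ambiguous"
```

For example, `decide([0.30, 0.002])` returns "ambiguous" because one test rejects H0 and the other does not.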
To process the data, different programs were used: (i) the HierFstat package [35] in R [36] to compute the 95% confidence intervals of FIS, (ii) the "binom.test" function in R to test the null hypothesis about the probability of success in Bernoulli's experiments, (iii) MicroChecker v.2.2.3 [37] to test the load of null alleles, (iv) Multilocus v1.3b [32] for the IA and r̄d indices and to test the probabilities of repeated MLG and different MLG, (v) Populations (v.1.2.30 © 1999, Olivier Langella, CNRS UPR9034) to build a general clustering analysis between all stocks using the Cavalli-Sforza and Edwards' chord genetic distances [38], and (vi) Fstat [39] for all other tests.

Genetic diversity of the six populations under study
Genetic diversity was explored within the six local wild T. cruzi TcI populations (79 stocks) and within the 21 reference strains. Details of the origin and allelic microsatellite composition of each stock studied are listed in Table 1.
Null alleles: Only two stocks from the Luribay population did not amplify at locus MCL08 and one reference stock at locus A427. Analyzing the six potentially panmictic populations with MicroChecker, 43 null alleles were expected at loci presenting high FIS out of 1264 alleles (3.40%), which is already very low. The proportion of observed null alleles in this sample (n = 4, i.e., 0.32%) is lower than expected (exact binomial test, p = 4e-14). Thus, the role of null alleles in inflating FIS may be considered negligible here.
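The arithmetic behind this comparison can be checked with a short sketch using only the figures given in the text (1264 alleles, 43 expected and 4 observed null alleles). It is an illustration under an assumption: it computes the one-sided lower tail of the binomial distribution, whereas the study used R's `binom.test`, so the value obtained is not expected to reproduce the reported p-value exactly. The function names are hypothetical.

```python
import math


def binom_log_pmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, computed in log space."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log1p(-p)


def binom_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.exp(binom_log_pmf(i, n, p)) for i in range(k + 1))


n_alleles = 1264       # alleles scored in the six populations (from the text)
expected_nulls = 43    # null alleles expected by MicroChecker (from the text)
observed_nulls = 4     # null alleles observed (from the text)

expected_rate = expected_nulls / n_alleles   # about 3.40%
observed_rate = observed_nulls / n_alleles   # about 0.32%

# Probability of observing 4 or fewer null alleles if the true rate were 43/1264.
p_lower = binom_lower_tail(observed_nulls, n_alleles, expected_rate)
```

The lower-tail probability is many orders of magnitude below any usual significance threshold, which is consistent with the conclusion that observed null alleles are far rarer than expected.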
Overall polymorphism: The main indices of genetic diversity as well as observed and expected heterozygotes and FIS by locus and by population are listed in Table 2. It is worth noting that, as expected, the subsample of the reference strains (n = 21) is by far the most polymorphic. Moreover, 42 alleles out of 82 (51.2% of the total number of alleles) were specific to reference strains (see Table 1). Eighty-nine different multilocus genotypes (MLGs) were observed among the 100 stocks (including references) versus only 68 MLGs among the 79 stocks under study (without references). The most repeated MLG (no. 89, repeated five times) was identified in a single population, Qui-Urk, in the Cochabamba valley (Table 1). The number of alleles per locus ranged from 4 to 18 and from 2 to 8 with and without references, respectively. Similarly, the mean allelic richness by locus systematically decreased when reference strains were removed. For the six local populations, the FIS values per locus and per population showed high variance, ranging from −1.00 (fixed heterozygosity for locus SCLE10 in the Qui-Bsia population) to 1.00 (fixed homozygosity for loci MCLE01 in Sap-Cosi, SCLE11 in Qui-Urk, and C875 in Luribay), while only positive FIS values were observed for the reference population (ranging from 0.30 to 0.82), as is expected when pooling differentiated reproductive units within a single subpopulation [25]. The mean allelic richness in local populations was weakly variable, ranging from 1.49 (Qui-Urk) to 2.27 (Sap-Sap), and was higher within the reference strains (4.49). The clustering analysis (NJ tree not shown) of all the stocks using the Cavalli-Sforza and Edwards distance method showed that six of the reference strains, namely P209cl93, SABP3, Cutiacl1, SP31, V120, and Cuicacl1, were closely related to some of the wild stocks under study, the other reference strains forming a separate group not supported by a significant bootstrap value.
The analysis of genetic distances between each of the 21 reference strains and the 79 wild stocks (mean of pairwise distances) showed that the three reference strains closest to the Bolivian wild stocks were SABP3 from Peru, Cuicacl1 from Brazil, and P209cl93 from Bolivia, with genetic distances of 0.36, 0.41, and 0.50, respectively; the three reference strains farthest from the wild stocks were FX18 from Colombia and 93041401P and 93070103P from the US, with mean genetic distances of 0.89, 0.88, and 0.87, respectively.
... with Qui-Bsia (3.3 km apart), and all the populations (overall). FIS varied from −0.08 (Qui-Urk population) to 0.29 (overall). Considering the significance using the Bonferroni correction (BC), none of the FIS values were significant (H0 not rejected, see Table 3) except for the overall sample. Because BC increases the risk of failing to reject a false H0, we also considered the p-values without BC: here H0 was rejected at α = 1% within the "overall" sample and for only one sample grouping two local populations ("overall Sapini"), and was not rejected in any of the local populations.

Panmixia tests within the six populations under study
Consequently, the decisions about panmixia were rejection for the "overall" sample, ambiguous for "overall Sapini," and no rejection for all local populations (Table 3).
Linkage disequilibrium (LD): three parameters were tested: (i) the proportion of significant LD tests over the total number of comparisons by pairs of loci, using the binomial test, (ii) the ... Table 3. Of the six local populations under study, H0 was not rejected in four of them (Luribay, Sap-Cosi, Qui-Urk and Qui-Bsia); two results were ambiguous (Mecapaca and overall Quillacollo) and three rejected H0 (Overall, Overall Sapini and Sap-Sap).
Repeated multilocus genotypes: We tested two parameters, the number of different MLGs and the maximum frequency of the most repeated MLG. The results showed (Table 3) that H0 was rejected in only one sample (Overall), not rejected in five populations (Luribay, Mecapaca, Sap-Cosi, Qui-Urk, and Qui-Bsia) and ambiguous in three samples (Overall Sapini, Sap-Sap, and Overall Quillacollo).
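The two MLG parameters tested here (number of different MLGs and frequency of the most repeated MLG) can be illustrated with a small sketch. This is not the Multilocus v1.3b procedure: as a simplifying assumption, panmixia is simulated by shuffling whole single-locus genotypes independently at each locus, and the function names and toy stocks are hypothetical.

```python
import random
from collections import Counter


def mlg_summary(stocks):
    """Return (number of different MLGs, count of the most repeated MLG).

    stocks: list of multilocus genotypes, each a tuple with one genotype per locus.
    """
    counts = Counter(stocks)
    return len(counts), max(counts.values())


def mlg_permutation_test(stocks, n_perm=999, seed=0):
    """Compare the observed number of MLGs with that of randomized data sets.

    Genotypes are shuffled among stocks independently at each locus; the
    P-value is the proportion of randomized data sets showing as few or
    fewer different MLGs than observed.
    """
    rng = random.Random(seed)
    observed_n, _ = mlg_summary(stocks)
    columns = [list(col) for col in zip(*stocks)]  # one list of genotypes per locus
    hits = 0
    for _ in range(n_perm):
        for col in columns:
            rng.shuffle(col)
        n_mlg, _ = mlg_summary(list(zip(*columns)))
        if n_mlg <= observed_n:
            hits += 1
    return observed_n, (hits + 1) / (n_perm + 1)
```

A sample made of two MLGs each repeated six times would show far fewer different MLGs than its randomized counterparts, leading to rejection of H0; a sample in which nearly every stock has its own MLG would not.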
Considering only the six potentially panmictic populations under study, in four of them (Luribay, Sap-Cosi, Qui-Urk, and Qui-Bsia) the decisions for FIS, LD, and MLG were "not reject H0", while in the two others (Mecapaca and Sap-Sap) contradictory results were observed between the different tests of panmixia. Nevertheless, for the FIS tests alone within the populations from Luribay and Sap-Cosi, there is a potential risk of type II error

Likely panmixia in several T. cruzi populations isolated from wild T. infestans
As previously recommended [25], we used three classes of classical population genetics parameters to study the mode of reproduction (i.e., Hardy-Weinberg equilibrium, linkage equilibrium, and presence of repeated MLG) and we showed that in four out of six potentially panmictic T. cruzi populations (Luribay, Sap-Cosi, Qui-Urk, and Qui-Bsia) sampled in

Table 3. Analysis of FIS, linkage disequilibrium (LD) and repeated multilocus genotypes (MLGs) of the 79 Trypanosoma cruzi strains isolated from six potentially panmictic populations. Results of statistical tests and decisions about H0 (reject or not reject panmixia). For all tests: NS = not significant; * = significant at 5% risk; ** = significant at 1% risk. (1) p-value for FIS within samples without Bonferroni correction (BC); (2) significance of the test with BC; (3) Ratio: significant loci pairwise comparisons / total comparisons, tested by the binomial test with R program; (4) Value of index of association; (5)

Role of sympatry and sampling design
To test panmixia, the first condition is natural sympatry; indeed, a nonsympatric sample may show genetic structuring and generate a Wahlund effect, and consequently a false rejection of H0. As nobody knows precisely what sympatry means for this parasite, we selected populations within a very small area, not more than 1 ha, in which the triatomines and mammal hosts are assumed to move enough to allow parasite transmission from one host to another and hence generate opportunities for genetic exchanges; we named these populations "potentially panmictic" and tested them. Consequently, in such populations, when H0 is not rejected, and excluding the type II error discussed above, we can consider, a posteriori, that these populations were truly sympatric. Conversely, when H0 is rejected by some tests, as is the case for the Sap-Sap population and to a lesser extent for Mecapaca, a Wahlund effect due to a hidden genetic structure (itself possibly due to a lack of sympatry) could be inferred. Interestingly, when we analyzed the microsatellite data with the software Structure [40], we found two distinct genomes only in Sap-Sap and Mecapaca, hence a hidden genetic structure, which can explain the rejection of H0 for some tests within these two populations (data not shown). For these two populations, however, choosing between the two alternative hypotheses (i.e., lack of sympatry or presence of some extent of clonality) is almost impossible. Sampling in areas that are not actually sympatric may therefore result in falsely rejecting H0. Conversely, as previously stated by others [41], selecting only one individual per subpopulation and pooling them into an artificial population generates misleading patterns and false conclusions regarding the mode of reproduction, in particular a significant reduction of LD and a modified HW equilibrium, sometimes giving an erroneous picture of a recombining organism despite a high level of clonality.
Obviously, our sampling method did not fit this pattern, and consequently the absence of H0 rejection cannot be attributed to this sampling bias. All these remarks emphasize the importance of matching the sampling design to the hypothesis under study: here, testing panmixia requires potentially sympatric areas, not allopatric ones.
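The Wahlund effect invoked in this section can be made concrete with a small numerical sketch. It rests on simplifying assumptions that are not taken from the study: a single biallelic locus, subpopulations of equal size, and each subpopulation exactly at Hardy-Weinberg equilibrium. The function name and the allele frequencies are hypothetical.

```python
def pooled_fis(freqs):
    """F_IS at a biallelic locus after pooling equal-sized subpopulations in HWE.

    freqs: frequency of one of the two alleles in each subpopulation.
    """
    k = len(freqs)
    # Heterozygosity actually present: mean of 2pq within each subpopulation.
    ho = sum(2 * p * (1 - p) for p in freqs) / k
    # Heterozygosity expected if the pooled sample were one panmictic unit.
    p_bar = sum(freqs) / k
    he = 2 * p_bar * (1 - p_bar)
    return 1 - ho / he
```

Under these assumptions, `pooled_fis([0.2, 0.8])` is approximately 0.36: two subpopulations that are each panmictic produce an apparent heterozygote deficit once pooled, which would lead to a false rejection of H0. When the subpopulations share the same allele frequency, as in `pooled_fis([0.5, 0.5])`, the pooled FIS is zero.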

Clonality versus recombination in T. cruzi species
Since the pioneering studies using isoenzymes [6], T. cruzi has been considered by most authors to have a basically clonal population structure, with occasional bouts of genetic exchange or hybridization. These facts were confirmed on many occasions with other genetic markers and a clonal theory of parasitic protozoa was proposed [2,42], with the notable exception of Plasmodium falciparum in which sex occurs [43]; the theory was reaffirmed with both Trypanosoma and Leishmania genera [44] and extended to fungi, bacteria, and viruses in a recent review [45]. The question of determining whether sex occurs or not in T. cruzi is neither trivial nor needless. Because of a reduced or absent gene flow, clonality must have a major impact on the biological and medical properties of the parasites, which has been explored [46,47]. On the other hand, genetic exchanges can take different forms, the best known being hybridization, which has been provoked in vitro [17] and has occurred naturally, playing a crucial role in T. cruzi evolution (generating new DTUs). It is generally admitted that two hybridization events have defined the population structure of T. cruzi [7], the first one very ancient, between TcI and TcII, leading to TcIII and TcIV, and the second one, recent, between TcII and TcIII, leading to TcV and TcVI. The in vitro hybrids showed a fusion of parental genotypes, loss of alleles, homologous recombination, and uniparental inheritance of kinetoplast maxicircle DNA [17], and it is accepted that natural hybridization might occur in a similar but contrasted way [48].
In addition to hybridization, many authors have reported incongruence between phylogenetic trees, which is generally a sign of recombination (for example [13,49]), as well as mitochondrial introgression [10,11]; even mitochondrial heteroplasmy (heterogeneous mitochondrial genomes in an individual cell) was demonstrated recently [16] using the promising mtMLST method (mitochondrial multilocus sequence typing), itself derived from the MLST method using nuclear genes [50]. The last form of genetic exchange might be conventional recombination mechanisms, as in sexual diploids, which can be detected by the usual tools of population genetics (FIS, LD, etc., as here). Because we do not know the cytological mechanisms involved, we named these events "recombination-like" in order to differentiate them from the known genetic exchanges involving meiosis in sexual diploids. One of the first studies regarding this event [51] reported, at one isoenzyme locus (phosphoglucomutase), observed homozygous and heterozygous frequencies almost identical to those predicted by the theoretical Hardy-Weinberg distribution in sylvatic TcI. Later, using microsatellites, some recombination was suggested in a general clonal framework in sylvatic TcI over the endemic area [52], TcI in Ecuador [53], and TcIII [54]; in the latter, the authors could not effectively discriminate recombination from a high genome-wide frequency of gene conversion. Finally, three recent studies emphasize the role of genetic exchanges and the extraordinary genome plasticity of T. cruzi: (i) one using genomic CNV (copy number variation) [15]; (ii) another [14] reporting gross incongruence in Colombian TcI between nuclear and mitochondrial markers, mosaic maxicircle sequences, and the genetic resorting mechanism; and (iii) a third [55] showing that hybrid stocks contain haplotypes that are mosaics probably originating from intragenic recombination.
In all these examples, it is worth noting that hybridization or introgression may occur between distant DTUs, whereas "recombination-like" events are generally intra-DTU, as shown in the present study. The "clonality or genetic exchanges" duality for T. cruzi has definitively become obsolete; this species has obviously used both mechanisms to evolve and probably to adapt to its multiple hosts, associated with an extraordinarily plastic genome shaped by clonal evolution and several kinds of genetic exchanges. The mode of reproduction of T. cruzi could oscillate between clonality and sexuality, and the true questions are why, when, how, and to what extent T. cruzi recombines. Nevertheless, we agree with Tibayrenc and Ayala's [45] definition of clonality as "restrained recombination on an evolutionary scale," which has already been observed in T. cruzi since the same MLGs can be sampled at different times and in distant regions. The same authors stated that "recombination seems easier between closely related genotypes pertaining to the same near-clade in both fungi and parasitic protozoa"; this probably constitutes the most parsimonious explanation for the co-occurrence of recombination at restricted space / time scales and of clonality at larger space / time scales. Interestingly, in bacteria "the probability of acceptance of a recombination event decreases exponentially with genetic distance between the donor and recipient DNA" [56], which is an effect of sexual isolation in bacteria [57]; this could be true for T. cruzi and should be further investigated.

Conclusion, limitations and warning
For the first time we report panmixia, notably through linkage disequilibrium statistics, in T. cruzi TcI populations isolated from wild T. infestans in Bolivia. In the absence of additional studies involving other sylvatic vectors, it is not possible to associate panmixia with the sylvatic biotopes; further studies of panmixia should be conducted in other biotopes where parasites should be sympatric. As previously mentioned, "mixed clonal / sexual reproduction is nearly indistinguishable from strict sexual reproduction as long as the proportion of clonal reproduction is not strongly predominant" [30], so, although unlikely, we cannot exclude a certain level of clonality in these populations, even when none of the tests rejected the panmixia hypothesis. Moreover, it is worth noting that the parasite strains used here were not cloned, and some artifacts due to multiple infections could be a possible explanation for some contradictory results between the different tests. The Leishmania genome is aneuploid [58]; every chromosome in every cell may be present in different ploidy states (monosomic, disomic, or trisomic). If this is the case for T. cruzi, as suspected [15], there could be a serious bias with all the codominant nuclear markers, particularly in studies involving microsatellites: artificially decreasing FIS in the trisomic state (excess heterozygosity) and artificially increasing FIS in the monosomic state (excess homozygosity). Hence, all the FIS results should be interpreted with caution, especially when there is a substantial variance of FIS between loci. Moreover, FIS is not linearly related to the rate of clonal reproduction [59]. As stated above, the sampling strategy is crucial to confirm or reject these results in other natural contexts, avoiding sampling stocks that have a foreign origin because of passive transport by humans.
For this purpose (to specify the mating system at the local scale), we recommend starting with a reduced time and space scale in order to avoid the Wahlund bias as much as possible, which does not hamper the opposite strategy previously proposed [45], "taking a birdseye view of genetic variability over years and continents, from different hosts and ecosystems" to look at the evolution of the species over space and time.