Genetic variation and phylogeography of the Triatoma dimidiata complex evidence a potential center of origin and recent divergence of haplogroups having differential Trypanosoma cruzi and DTU infections

The population genetics of Triatoma dimidiata haplogroups was analyzed at landscape and sub-regional scales in Chiapas and regional level across the Mexican Neotropics, and phylogeography of the complex was re-analyzed across its complete geographic range. Two contiguous fragments of the ND4 gene were analyzed due to bias from differential haplogroup specificity using a previously designed sequence. At both landscape (anthropic modification gradient) and regional (demographic, fragmentation, biogeographic, climate) scales, lowest T. dimidiata genetic diversity occurs where there is greatest historical anthropic modification, and where T. cruzi infection prevalence is significantly highest. Trypanosoma cruzi prevalence was significantly higher than expected in haplogroups 1 and 3, while lower than expected in haplogroup 2. There was also a significant difference of DTUI and DTUVI infection frequencies in both haplogroups 1 and 3, while no difference of either in haplogroup 2. All haplogroups from the Mexican Neotropics had moderate to high haplotype diversity, while greatest genetic differentiation was between haplogroups 1 and 3 (above FST = 0.868, p < 0.0001). Divergence of the complex from the MRCA was estimated between 0.97 MYA (95% HPD interval = 0.55–1.53 MYA) and 0.85 MYA (95% HPD interval = 0.42–1.5 MYA) for ND4A and both concatenated fragments, respectively, with primary divergence from the MRCA of haplogroups 2 and 3. Effective population size for Mexican haplogroups 1 and 2 increased between 0.02 and 0.03 MYA. This study supports previous ecological niche evidence for the complex´s origin surrounding the Tehuantepec Isthmus, and provides evidence for recent divergence of three primary dimidiata haplogroups, with differential T. cruzi infection frequency and DTU specificity, important components of vector capacity.

In Mexico, three haplogroups of the T. dimidiata complex have been reported. Haplogroup 1 (Hg1) is reported to the east of the Tehuantepec Isthmus, originally reported from the Yucatan Peninsula (YP) and within the Palenque archeological site (northeast Chiapas) [12,23], although it has now also been reported from northern Guatemala. In the latter case nomenclature was reverted from the original designation, and samples of Hg1 were labeled Hg3 [8,14,18,20,35]. Most recently, a proposal has been made to reassign Hg1 (original designation) as T. spp. aff. dimidiata [36]. Haplogroup 2 (Hg2) is only reported from Mexico, this Hg, Isthmus from northern and coastal counties across Chiapas (Berriozábal, Palenque, Tapachula, Mapastepec and the Soconusco region), and the Yucatan Peninsula (multiple sites in the Yucatán and the Zoh Laguna landscape, in southeast Campeche). Triatomine specimens were collected by inhabitants or by members of the research group (following informed consent) in the framework of community-based triatomine surveillance programs and transmission ecology analyses. Collection locations were primarily from human communities (S1 Table) or regional aggregates (Berriozábal county sub-regions, other Mexican neotropical), and from several complete landscapes, which included human communities, ecotone areas, and sylvatic conserved fragments (Ignacio Zaragoza, Nuevo Montecristo and Rio Blanco, Berriozábal; Zoh Laguna, Campeche). Partial data from the latter site have been reported previously [5], although specimens were re-analyzed with different markers herein and data are reported and included in other Mexican neotropical site and continental scale analyses.
Berriozábal county (16˚48'00" N and 93˚16 '22" O) in northwest Chiapas was the principal collection area for comparative genetic and Trypanosoma cruzi infection analyses in human communities on a regional scale (five sub-regions), and in two landscapes within two of the sub-regions. This area was chosen, based on previous T. dimidiata collections, community engagement and surveillance activities, T. cruzi infection prevalence, historical and geographic importance of the region, and failure by state health services and federal oversight, to treat and attend diagnosed cases of Chagas disease. The county has highly variable topography, with over 70% covered by the Sierra Madre Oriental mountain range in the north, and elevation ranging from 200 m to 1200 m above sea level (Fig 1). The climate is sub-humid tropical with rains ranging between 50 to 1200 mm between June and November. The vegetation is heterogeneous, principally tropical deciduous, medium sub-perennial forest, and perennial forests, with a gradient of progressive anthropic disturbance and land use change northwest to southeast. The southern sub-regions of the county concentrate large farm agriculture, livestock grazing, and small business. Communities in the county were assigned to five sub-regions, based on vegetation, landscape modification (gradient from conserved to permanent human settlement), and transport infrastructure network. Sub-region A is the most northern and conserved (proportionally) abutting the Grijalva river watershed and Malpaso dam and contains the landscape of Ignacio Zaragoza (IZ). Sub-region B is a low plain south of "A", which contains the landscape comprised of conserved sylvatic forest, ecotone (grazing, agriculture), and two human communities (Nuevo Montecristo, Río Blanco). Sub-region C is the western region which has greatest mountain coverage abutting the Chimalapas rainforest, principally conserved, and with least human presence. Sub-region D is south of B and east of C and has the principal road running north-south, hence it is the second most modified sub-region. Sub-region E is the southernmost, historically deforested since the 17 th century and Spanish conquest, where the county seat is located along the highway which connects the county capital (Berriozábal) to the state capital (Tuxtla Gutierrez) and eastern Chiapas and Guatemala, and to the west, the Tehuantepec Isthmus/Chimalapas and the rest of Mexico (Fig 1). Inhabitants from all communities having over 50 houses in Berriozábal county collected bugs found inside or outside houses, placing collections in plastic bags, along with their name, date, specific collection site, and community, and these were turned over to volunteer entomology comites, before channelling to the research team. Bugs were registered and coded, taxonomically verified [43] and preserved in 90% EtOH.
The present study, as others analysing landscape modification, biotic communities [5,44] and sociocultural components related to vector amplification along modification gradients [41], use the definition of Danielson 1991 [45] for landscape: large areas (measured at spatial scales of km 2 or higher) that comprise more than one type of habitat distributed in numerous patches. For the purpose of landscape ecology analyses, which require specific intensive study designs, we have defined three habitats to categorize landscapes for T. cruzi dispersal: sylvatic (> 85% conserved), ecotone (partially modified, usually agriculture and livestock grazing, but not having permanent human presence), and human domestic (permanent communities) [5]. Intensive bug collections both by the population and by members of the research group (following verbal informed consent) were carried out in both rainy and dry seasons in the landscape containing the two rural communities, Nuevo Montecristo (NM) and Río Blanco (RB) (sub-region B). The landscape had been historically modified more than 70 years ago by establishment of the original community, Río Blanco. Nuevo Montecristo was recently established, as a result of the Ejército Zapatista de Liberación Nacional (EZLN) conflict, on forested land to the west side of the only road through the area and RB. The area was given to 45 families for relocation and subsistence farming by the government in 1995. Primary forest was felled and houses built central to the land grant, in accordance with agrarian reform collective holding rules. Land plots surrounding the new housing community of NM were established, further modifying the original forested area. The combined NM-RB diversity are reported in the section for sub-region B, but were also analyzed individually.
Data generated from specimens were registered in databases (collection site, date, tentative species and stages, town or georeference, county, census information, ecoregion), and were aggregated at community and landscape levels, at sub-regional, regional (Berriozábal), and other Mexican neotropical scales, to analyse haplogroup presence and abundance, and T. cruzi infection and DTU haplotype. Sampling design, size, and individual haplogroup distributions guided these analyses. More than 98% of collections were carried out by inhabitants or technical personnel in the human domestic habitat (intra or peridomestic) and hence all continental (all data from this study and registered in GenBank), regional (Berriozábal county, all other Mexican neotropical sites), and sub-regional (Berriozábal sub-regions) data are representative  Table)  of this habitat category. Samples from several other regions were also collected using a representative collection design for human community aggregates (Berriozábal county, Nopala county, Yucatán state, Chiapas Soconusco coast). Finally, samples from three landscape studies are included (IZ, MC/RB, and Zoh Laguna, Campeche), since they represent a more complete survey of vector and parasite populations independent of a bias from anthropic modification.

DNA extraction, amplification and sequencing of ND4 fragments
Genomic DNA was obtained from each sample from two legs or gonads for bug genetic analyses and from the midgut contents for T. cruzi detection and bloodmeal source identification [5]. DNA was extracted using DNAzol (Invitrogen, San Diego, California, USA), following manufacturer's instructions, and stored at -20˚C. The ND4 fragment was amplified using primers described by Dotson & Beard [42] and is designated ND4A herein. The PCR for this fragment was conducted in a final volume of 25 μl using a 50 ng DNA template, 1.5 mM MgCl 2 , buffer PCR 5X, 10 mM dNTPs, 10 pM of each primer and 0.2 U of Taq DNA polymerase (GoTaq DNA Polymerase, Promega, USA). The amplification protocol was 5 min at 94˚C, followed by 36 cycles of 30 s at 94˚C, 30 s at 48˚C, 2 min at 72˚C, followed by a 72˚C extension for 7 min, and 4˚C indefinitely.
Only a low proportion of samples from Mexico amplified using the ND4A (< 60%; N = 51/ 92), despite sample DNA quality (amplification controls). In order to test the hypothesis that low specificity was related to the population´s genetic background, and in order to develop other tools for landscape analyses, a second fragment from the ND4 gene was identified. The primers for fragment ND4B were designed from the complete genome of T. dimidiata [42] initiating at the 3´terminal (8500-8759). Primers were designed in silico using the Primer3 programme (http://bioinfo.ut.ee/primer3-0.4.0/), which amplified a 230 bp fragment, 8500 F: 5 CAC AGC CCA CAA AAA CCA 3´and 8759 R: 5´TGA CTT CCA AGG GCT CAT GT 3´. The PCR reaction was standardized for a final volume of 30 μl using a 50 ng DNA template, 2.08 mM MgCl 2 , buffer PCR 5X, 10 mM dNTPs, 10 pM of each primer and 0.16 U of Taq DNA polymerase (GoTaq DNA Polymerase, Promega, USA). The amplification protocol was 5 min at 94˚C, followed by 40 cycles of 30 s at 94˚C, 45 s at 50˚C, 45 s at 72˚C, followed by a 72˚C extension for 10 min, and 4˚C indefinitely. Trypanosoma cruzi infection and bloodmeal source were identified in all specimens using PCR as previously described [5], and parasite populations were genotyped using the mini-exon [46] and 18S rRNA genes [47,48]. All PCR products were separated by electrophoresis in 1.5% agarose gels stained with 0.5 μg/ml ethidium bromide and visualized under UV light. PCR products were purified using the QIAquick PCR purification kit (QIAGEN, Valencia, CA) and sequencing was carried out on an ABI Prisma 310 (High-Throughput Genomics Center, University of Washington, Department of Genome Sciences) or using ABI 3730XLs (Macrogen, Korea).
Amplification success for each ND4 fragment was analysed separately and compared among haplogroups for the complete Berriozábal county, and all other Mexican Neotropical sites, using a chi-squared frequency analysis with Bonferroni correction. Significant differences in genotype success between ND4A and ND4B for each haplogroup was analysed using the log-likelihood ratio (G-test) for goodness of fit with Williams´correction for sample size variation. All tests were considered significant if p < 0.05 [49].

Genetic diversity and differentiation (F ST ) between T. dimidiata haplogroups
Forward and reverse sequences from all samples were used to generate consensus sequences. These sequences were manually aligned using MEGA v.7 [50], and all sequences of unique haplotypes were deposited in GenBank (accession numbers ND4A: MH410755-MH410810; ND4B: S1 File). Genetic diversity was analyzed separately for both ND4A and ND4B for each haplogroup from Berriozábal, and likewise, haplogroup diversity was analyzed for all samples across the Mexican Neotropical region, without Berriozábal (S1 Table). At the landscape scale, Hg2 haplotype diversity was calculated for NM and RB separately, and for Hg2 and Hg3 from IZ (where all 3 haplogroups were collected). Genetic diversity only for Hg2 was compared among four of the five sub-regions within Berriozábal, again due to insufficient collection of the two minority haplogroups (Hg1, Hg3) in all but one of the sub-regions. It was also compared between all Berriozábal samples vs. all other Mexican neotropical samples. Genetic diversity measures were estimated from the number of mutations (ƞ), number of segregating sites (S), number of unique sites (Su), mean number of pairwise differences (k), number of haplotypes (h), haplotype diversity (Hd), nucleotide diversity (π) and nucleotide polymorphism index (Ɵ) using DNASP v.5.10 [51]. Neutrality tests Fs [52] and Tajima's D [53] were estimated using Arlequin v.3.5 [54].
The pairwise genetic differentiation (F ST ) between T. dimidiata haplogroups was analysed separately for both ND4A and ND4B fragments, for Berriozábal county and for all other Mexican Neotropical sites using Arlequín v.3.5. Genetic distance was estimated between haplogroups, from Berriozábal and from all other Mexican neotropical sites, using the Kimura 2-parameter (K2P) and MEGA v.7 with 10,000 random permutations.

Phylogenetic relationship and divergence times
Phylogenetic inferences were analyzed using two datasets: a) ND4A (including, Berriozábal, all other Mexican Neotropical sites, and continental GenBank sequences), and b) concatenated ND4A and ND4B from Berriozábal and all other Mexican neotropical sites. The ND4A sequences used for the phylogeographic analysis from GenBank were: GQ202181-GQ202190;GQ202161-GQ202167; GQ202158; GQ202155; GQ202151; AF454697-AF454699; AF454695; AF454685-AF454694; JN620155-JN620169; JN620171-JN620177 [4,18]. A median joining (MJ) haplotype network was obtained using Network v.4.6.1.1 [55] (http://www.fluxus-engineering.com) assuming epsilon of 0, to evaluate the genealogical relationships between haplogroups. Divergence times among haplogroups were assessed using Beast v.1.7 [56]. The best-fitting model of evolution for each dataset was estimated using the Bayesian Information Criterion (BIC) implemented in JModeltest v.2 [57]: a) the GTR+I+G for ND4A (Berriozábal, all other Mexican Neotropical sites from this study, and continental GenBank sequences), and b) the HKY+G model for haplotypes from concatenated sequences of ND4A + ND4B (Berriozábal and all other Mexican Neotropical sites). An uncorrelated lognormal relaxed-clock was calibrated using the nucleotide substitution rate estimated by Pfeiler et al. [58]. This mutation rate was corrected from that previously described [59], which resulted in a rate of 0.11 substitutions/site/million year (Myr). The priors for the molecular clock were normally distributed with a mean equal to 0.11 and a standard deviation of 0.011. The posterior distributions of parameters were estimated using the Coalescent Exponential Growth option for the Metropolis-coupled Markov chain Monte Carlo (MCMC) analysis. A total of 100 x 10 6 steps sampled every 10,000 generations was used for the ND4A dataset. A total of 30 x 10 6 steps samples every 3,000 generations were used for the concatenated sequences of ND4A + ND4B. The first 10% of trees were discarded as burn-in. The appropriate mixing of the MCMC search was analyzed using Tracer v.1.7 [60] by calculating the effective sampling sizes (ESS) for each parameter; ESS higher than 200 indicates convergence of the search. The maximum clade credibility tree was produced using TreeAnotator v.1.7 of the Beast package and visualized using FigTree v.1.4 [61]. Sequences of T. phyllosoma (access number GenBank JX848650) which belongs to the closely related phyllosoma complex and T. nitida (access number GenBank JX848652) belonging to the more distant protracta complex, were used as outgroups for phylogenetic inferences of ND4A, while sequences of T. pallidipennis (phyllosoma complex) generated in this study was used as outgroup also for ND4A (accession number Gen-Bank MH410811), and for the concatenated ND4A + ND4B sequence (S1 File).

Historical T. dimidiata demographic changes
Historical T. dimidiata population changes were inferred using the Bayesian Skyline Plot (BSP) implemented in Beast v.1.7 [56] (http://beast.bio.ed.ac.uk/). This coalescent-based demographic model uses standard MCMC sampling procedures to estimate the posterior distribution of the effective population size over time directly from heterochronous sequence data using best-fit substitution models, which were calculated using JModeltest v.2 [57]. Analyses were performed for A) ND4A from Berriozábal + other Mexican Neotropical sites + continental GenBank sequences, B) ND4A for Hg3 from CA and Colombia, C) ND4A from Berriozábal + other Mexican Neotropical sites, D) the concatenated ND4A + ND4B sequences from Berriozábal and other Mexican Neotropical sites, E) the concatenated ND4A + ND4B sequences only from Hg2 from Berriozábal and other Mexican Neotropical sites, and F) ND4A from Hg3 of Colombia [19]. Convergence to the stationary distribution and acceptable mixing were analyzed for effective sampling sizes greater than 200, using the Tracer v.1.7 [56]. The plots were obtained with Tracer v.1.7.

Trypanosoma cruzi infection and bloodmeal source of T. dimidiata haplogroups
Trypanosoma cruzi infection prevalence, genotype success, and DTU prevalence for each dimidiata haplogroup was calculated for each sub-region in Berriozábal, for all Berriozábal county, for all other Mexican neotropical sites, and for the complete Mexican dataset. A Chi squared frequency analysis of independence with a Bonferroni correction was used to compare infection frequency among haplogroups. Differences between haplogroups in T. cruzi genotype success for DTUI and DTUVI were analysed using the log-likelihood ratio (G-test) with Wil-liams´correction for sample size variation. All tests were considered significant if p < 0.05 [49]. An animal:human bloodmeal index (% animal blood/ % human blood) in bugs was calculated for each haplogroup and sub-region in Berriozábal separately.

Specificity of ND4A and ND4B fragments for dimidiata complex haplogroups
A total of 314 specimens from Berriozábal and an additional 92 specimens from other Mexican Neotropical sites were analyzed using both ND4 fragments. Overall, only 57.3% of specimens amplified (180/314) of those collected in Berriozábal. 55.4% of all samples with quality DNA amplified using the ND4A fragment and 85.9% amplified using the ND4B. ND4A primers amplified all Hg1, but only 56.6% of Hg2, and 38.5% of Hg3 (all having quality DNA; Fig 2). In contrast, ND4B did not amplify from two specimens of Hg1(with quality DNA) from Berriozábal, although it did from 86.8% of Hg2, and 100% of Hg3. Amplification success frequency was significantly less than expected for Hg1 in Berriozábal, but this may have been due to low sample size (X 2 = 18.2, df = 2, p < 0.00001). The difference between amplification success using ND4A and ND4B was significant for Hg2 (G = 4.9, df = 1, p = 0.02).
The amplification success rate for other Mexican Neotropical specimens was 80.2% globally, 67.4% using the ND4A and 82.0% using the ND4B. The ND4A amplified in over 88.0% of Hg1 specimens, significantly more than expected (X 2 = 8.2, df = 2, p = 0.004), while it amplified significantly less than expected in Hg2 (44.7%, X 2 = 8.9, df = 2, p = 0.002), and amplified 66.6% of Hg3 specimens. Using the ND4B, Hg1 were amplification significantly less than expected (X 2 = 6.7, df = 2, p = 0.009), while 94.7% of Hg2 and 100% of Hg3 specimens amplified (Fig 2). The difference in amplification success using ND4A and ND4B was significant only for Hg2 (G = 6.9, df = 1, p = 0.008), similar to that for Berriozábal populations. The distribution of haplogroups identified from Berriozábal (Fig 1) and other Mexican neotropical sites in Mexico is summarized in Fig 3 and S1 Table. Comparative haplotype diversity at the landscape level Haplotype diversity was analyzed from single landscapes in two sub-regions: A) IZ from subregion A, and B) NM and RB in sub-region B. Ignacio Zaragoza is the only community sampled to date in Mexico where all three dimidiata haplogroups are sympatric (nymphs and adults). However, only one Hg1 specimen had quality DNA in IZ, while another specimen of this haplogroup was collected in another community in the same sub-region, and hence we only analyze Hg2 diversity. Hg2 from IZ had two haplotypes using ND4A (N = 3, Hd = 0.667 ± 0.314), and three for ND4B (N = 8, Hd = 0.679 ± 0.22). There was only one haplotype for Hg3 in IZ using either ND4A (N = 2) or ND4B (N = 6) fragment. Five haplotypes (N = 12) with moderate diversity for ND4A (Hd = 0.667 ± 0.141) were identified in Hg2 from NM, while two haplotypes (N = 13) and low diversity (Hd = 0.154 ± 0.126) were identified in Hg2 from RB. Using the ND4B, there were four Hg2 haplotypes (N = 13, Hd = 0.423 ± 0.164) in NM, and four in RB (N = 24, Hd = 0.236 ± 0.109). There was only one haplotype of Hg3 in NM.

Triatoma dimidiata Hg2 genetic diversity among Berriozábal sub-regions
A total of 43 sequences of ND4A from T. dimidiata Hg2 were analyzed from four of the Berriozábal sub-regions: A (N = 5, 3 communities), B (N = 25, 2 communities), C (N = 6, 2 communities), and D (N = 7, 3 communities) ( Table 1). There were 3.2% polymorphic sites and 10 haplotypes, varying from two to six according to sub-region. Global haplotype diversity was  Phylogeography and potential center of origin of the Triatoma dimidiata complex moderate (Hd = 0.482 ± 0.094), while nucleotide diversity and polymorphism index were low (π = 0.005 ± 0.001; θ = 0.008 ± 0.001, respectively). Highest genetic diversity was in sub-regions A and C, north and west in the county, located furthest from primary transportation routes and where there was the lowest proportion of modified landscape (Fig 1). The B sub-region had a significant negative neutrality test (population expansion signal) (Tajima test D = -1.926, p < 0.05) ( Table 1A,  A total of 66 sequences of ND4B were analyzed from Hg2 from the sub-regions A (N = 11), B (N = 37), C (N = 6), and D (N = 12). There were 8.9% polymorphic sites and 11 haplotypes (ranging between 3 and 5 in each sub-region). Global ND4B haplotype diversity was moderate (Hd = 0.584 ± 0.070), but nucleotide diversity and polymorphism index were low (π = 0.010 ± 0.001; θ = 0.018 ± 0.005, respectively). The same sub-regions with greatest genetic diversity for the ND4A fragment, A and C, were also the highest using the ND4B. A significant positive neutrality test (bottleneck signal) was identified in the A sub-region only (Table 1B,

Population differentiation and genetic distances among haplogroups
A high genetic differentiation was observed among T. dimidiata haplogroups from Berriozábal, with global F ST values of 0.936 (p < 0.0001) and 0.918 (p < 0.0001), for ND4A and ND4B,  (H44, H46), and an additional 8 mutations from the remaining principal Hg2 sub-clade west and north of the Tehuantepec Isthmus (Fig 4, S1 Table).  Table).

Phylogenetic relationship and divergence times using the concatenated ND4A and ND4B fragments
The relationship between haplogroups from all Mexican neotropical sites (including Berriozábal) using both concatenated ND4 fragments was analyzed using 51 unique haplotypes (Fig 6). The network distinguishes three haplogroups separated by 48-73 mutational steps. The Hg1 has samples from YP and Palenque (Chiapas), which are separated by at least 50 mutational steps from Hg2 (Playa Vieja, Oaxaca). The Hg3 contains samples from Chiapas, H51 has samples from Berriozábal (Hg3B; Nuevo Progreso and IZ), which are separated from the Pacific coast Chiapas clade (Hg3A; Manacal, Nuevo Horizonte, Tapachula). The greatest number of mutational steps separating haplogroups using the concatenated fragments is between Hg1 and Hg3, not between Hg1 and Hg2, as when only the ND4A is analyzed (including all Central America and Colombia samples).  Divergence time from the concatenated fragments ND4A and ND4B was analyzed using the same 51 unique haplotypes (Fig 7). The most appropriate model was HKY+G (-InL = 2045.9746, Delta BIC = 0, BIC = 5170.6641) with gamma = 0.196. The phylogenetic tree separates again the three primary haplogroups (PP = 1.0), a topology similar to that using only ND4A. The MRCA for the primary divergence was estimated at 0.85 MYA (95% HPD interval = 0.42-1.5 MYA). The combined Hg2-Hg3 clade divergence was estimated at 0.46 MYA (95% HPD interval = 0.24-0.76 MYA), more recent than using the ND4A alone. Secondary divergence of both Hg2 and Hg3 clades occurred prior to that of Hg1, at 0.22 MYA (95% HPD interval = 0.11-0.39 MYA) and 0.25 MYA (95% HPD interval = 0.11-0.45 MYA), respectively. While divergence time of the Hg2 is similar whether analyzed using ND4A alone, or with both fragments, this was not the case for Hg3 (probably due to greater sample bias of only 35 to 60% of Hg3 populations amplifying with the ND4A primers). Hg3 divergence gives rise on the one hand to the current designated clade Hg3B (H51, Berriozábal), and the Hg3A clade, the latter only with haplotypes from the Chiapas Pacific coast. Hg2 divergence gives rise to three distinct sub-clades using the concatenated sequences, the oldest or last to diverge contains Pacific coast populations (Pacific coast west of the Tehuantepec Isthmus), while the primary Hg2 sub-clade further diverges (PP = 1.0) giving rise to one sub-clade containing

Historical population changes of T. dimidiata haplogroups
The BSP for the concatenated sequences (ND4A + ND4B) using corrected mutation rates of all haplogroups of only Mexican neotropical populations reported herein, indicates an increase in effective population size approximately 0.025 MYA (Fig 8A). The BSP for Hg2 alone, using the concatenated dataset from the Mexican neotropical sites, also indicates a population increase at approximately 0.02 MYA (Fig 8B). Re-analysis of a Colombian dataset for ND4A alone [19], using a corrected mutation rate, indicates an increase in population size approximately 0.005 MYA (Fig 8C). Since previous studies had suggested much older increases in effective population size, but only used the ND4A fragment, we reanalysed the continental dataset and separately, the CA/Colombian dataset (Hg3) which estimated an MRCA of 0.05

Trypanosoma cruzi and DTU infection in T. dimidiata haplogroups
A total of 314 specimens of T. dimidiata were analyzed for T. cruzi infection, from 16 communities in Berriozábal. Overall prevalence was 18.8% (Table 4), with significantly more infected in the dry (29.1%) vs. the rainy season (9.6%) (G = 8.801, df = 1, p � 0.003). Trypanosoma cruzi infection in bugs was not associated with higher zoonotic blood sources (animal:human bloodmeal index), although highest human bloodmeal rates (9.5% vs 11.9%) occurred in the region with greatest human population movement (sub-region D; Table 4). Infection prevalence was similar among sub-regions overall, but for Hg2, the only haplogroup collected in all sub-regions, it was not uniform across the communities within sub-regions, with null or lowest rates measured from the southern historically modified landscapes (E) and where there was highest elevation (D) ( Table 5). High infection prevalence was found in the most conserved Mexican Neotropical sites and for C) ND4A from Colombia only Hg3 [19]. The thick black line is the median estimate and the solid (blue) interval shows the 95% highest posterior density limits.

Discussion
Phylogeographic analyses of Mexican neotropical populations of T. dimidiata analyzed herein, support previous studies providing evidence for three primary haplogroups (Hg1, Hg2, Hg3) [8,9,12,14,18,19,23,30,35]. There is also significant support for further genetic divergence of both Hg2 and Hg3 with clear geographic structuring [18,35]. A recent re-analysis [36] suggests designation of three species for the dimidiata complex which do not coincide with phylogeographic analyses: T. dimidiata s. s. (includes Hg2 and Hg3 from the present study), T. sp. aff. dimidiata (Hg1 from the present study) and T. sp. aff. dimidiata-caves (a cave haplogroup not present in Mexico). This latter proposal does not consider evidence for genetic, phenetic, and ecological niche traits differentiating Hg2 and Hg3 haplogroups. The MRCA for the primary divergence of all current T. dimidiata haplogroups was dated herein at 0.97 MYA, within the latter half of the middle Pleistocene. This is more recent than another estimate using uncorrected substitution rates (1.1-1.8% per Myr) by Monteiro et al. [18]. The correction is necessary since the loss of transient polymorphisms by selection and drift, particularly for mitochondrial genes, reduces the long-term substitution rate relative to the short-term mutation rate by 10-fold. The first clade to further diverge at 0.60 MYA gave rise to two haplogroups, Hg2 distributed almost exclusively west and north of the Tehuantepec Isthmus (recent collections in the YP are the exception) [5,8], and Hg3 only distributed east and south of the Tehuantepec Isthmus into northern South America (NSA). Both of these haplogroups further diverged, Hg2 most recently at 0.25 MYA separating Gulf and Pacific coast clades. The second clade, Hg3, had multiple divergence events, the first at 0.44 MYA giving rise to the two currently designated Hg3B (Berriozábal/northern Chiapas, CA and NSA), and Hg3A (Pacific coast Chiapas, Santander and Guajira, Colombia). The former clade, Hg3B, again diverged at 0.25 MYA, around the same time as Hg2, into one sub-clade with populations only from areas around the Chimalapas/Tehuantepec Isthmus (both west and in Berriozábal), and a second sub-clade with all CA and NSA populations. It is important to highlight that Hg3 is the only group successfully dispersing south from southern Mexico to NSA and has the broadest distribution among the three haplogroups [9,10]. The last major group to diverge was Hg1, the basal clade, at 0.20 MYA, coincidental with other divergence events of all haplogroups, and with the second to last interglacial period. Hg1 distributions are restricted east of the Tehuantepec Isthmus (no collections west to date), predominately into the YP and its continuous northern Guatemala region, but no further south through CA or NSA [8,9,14,36]. Ibarra-Cerdeña et al. [34] estimated that speciation events of Neotropical triatomines occurred principally during the Pleistocene. Indeed, more than nine interglacial cycles of varying magnitude were recorded since 0.97 MYA in the American continent, principally in the NA Nearctic realm, with alternating phases of cold and semi-arid conditions during the glaciations and warm/humid conditions during the interglacial or deglaciation periods [62,63]. Although less important or existent in the Neotropical Mesoamerican region, northern continental glacial/deglaciation events played an important role on atmospheric conditions in the Neotropics, as well as on landscape fragmentation, mammal extinctions and divergence, provoking population contractions, followed by expansions, and dispersal [64]. Effective population increases of the three majors Mexican haplogroups analyzed together, and separately for Hg2, coincided with the peak of the Last Glacial Maximum (LGM; 0.03-0.02 MYA). This would coincide with the last great faunal extinctions, and subsequent faunal expansions (south and northward), including presence and expansion of humans in the American continent. Mexico was one of the major agricultural crop developers and exporters south to CA and SA during the Holocene [65]. Effective population expansion of Hg3 from Colombian sub-groups is the most recent (0.005 MYA), although this may be preliminary, and should be reanalysed using populations previously analyzed, perhaps using the ND4B fragment to balance regional representation [19].
The driest and coolest period in the Neotropical region (both Mexican coasts and the Yucatan peninsula, south through the Tehuantepec Isthmus/Chimalapas rainforest, to CA) was post-LGM, based on a greater proportion of carbon from tropical forests in the region as compared to LGM, or pre-industrial Holocene [64,[66][67][68]. The Chimalapas rainforest traverses the Tehuantepec Isthmus, an important biogeographic barrier [69,70], which is the southern border of the Mexican Trans-volcanic belt and Nearctic realm, and it is the northwest limit of the Mesoamerican biodiversity region, and Neotropical realm. The Mexican neotropical bioregion also extends north only along the Gulf of Mexico through Veracruz, and west along the Pacific coast to Jalisco, but not the intermediate high plains region (between Sierra Madres). All three T. dimidiata haplogroups have only been collected sympatric (all life stages) in the landscape along the northwest region of Chiapas in Berriozábal, which is the eastern edge of the Chimalapas. This may be a fortuitous collection (in more than one community), or it may be key evidence to help interpret T. dimidiata complex evolution. Ecological niche models for the three haplogroups overlap in this region [10], and intensive bug collections over representative portions of the Mexican neotropical region over two decades, as well as passive collections reported in other studies over the last 70 yrs, have not reported the three haplogroups sympatric since it was assumed that they would not be reproductively isolated. This leads us to propose that the MRCA of the T. dimidiata complex evolved in the Chimalapas/Tehuantepec Isthmus region. In order to analyze this hypothesis further, more complete collections of the two minority haplogroups (Hg1 and Hg3) and further genetic analyses of all three haplogroups will be necessary with appropriate sample design for geographic and ecological relevance. Both at regional (Berriozábal) and Mexican neotropical geographic scales, all haplogroups were significantly differentiated (F ST > 0.7), indicating an insignificant amount of genetic exchange among them. Similar differentiation has been reported previously using ND4A and nuclear markers [4,9,13,18,19]. However, greatest differentiation was estimated herein between Hg1 and Hg3 using both fragments, in contrast to average but highest differentiation between Hg1 and Hg2 reported in a previous study, albeit with a sample representation bias [14], and herein using only the ND4A. Hence, care should be taken to interpret data solely using this latter fragment alone. The evidence further supports the clear differentiation of Hg2 and Hg3 for future systematic revision and species designations, which should also consider the vast evidence for phenetic distinctions, most recently using evidence for dimidiata haplogroup interactions mediated by volatile compounds [30][31][32]. Antennal phenotype was significantly different between Hg2 and Hg3 females, and significantly different between Hg1 and Hg3 males [32], whileHg3 (specifically Hg3A) released significantly fewer alarm compounds as compared to both Hg1 and Hg2 [30].

Genetic diversity of haplogroups
At a regional analytical scale (Berriozábal county), lowest genetic diversity (haplotypic and nucleotide), with evidence of bottlenecks, occurs in the most anthropically modified and transited eastern and southern sub-regions. Highest diversity for Hg2 (the most abundant haplogroup in Berriozábal county) was, as expected, in the most environmentally conserved subregions, with moderate to high haplotype, although low nucleotide diversity, indicating at least specifically for this haplogroup, recent bottlenecks [71]. This trend was also evident in other regions where it is distributed in Mexico [5]. Hg2, the second most widely distributed haplogroup of the complex, is distributed along two principal axes only in Mexico naturally (1) the Gulf of Mexico coast and northern foothills of the Eastern Sierra Madre, and (2) the Pacific coast and foothills west and south of the Western Sierra Madre from Colima to the Isthmus in Oaxaca. Both of these distributions are west and north of the Tehuantepec Isthmus. Recent Hg2 dispersal to the YP may have occurred due to human migratory trends to and from the Chol and Tzeltal populations in Tabasco and northern Chiapas (northern Tehuantepec Isthmus) to the YP and from human migrations in the last 50 yrs, an important consideration for vector control programs [5,8,41]. Hg1 is the oldest of the lineages, distributed in the YP, northern Chiapas (Palenque National Park and Arqueological area only, northern Berriozábal) and northern Guatemala, and has high genetic diversity and expanding populations. This basal haplogroup has the second narrowest distribution of the complex, T. hegneri having the most restricted [10].
Diversity of the Hg3 from the Pacific coast of Chiapas (Hg3A) was higher than that of Hg3B from Berriozábal (only one haplotype). Hg3 is distributed east and south of the Tehuantepec Isthmus through CA to Colombia and other NSA countries. It has also been reported from the northern edge of the Tehuantepec Isthmus in Tuxpan, Veracruz [8], where the Grijalva river (origin in Huehuetenango, Guatemala), opens into the Gulf of Mexico. The Grijalva runs west and north from Guatemala through the northern Chiapas mountain range and borders Berriozábal´s sub-region A. Its course is a common human migratory route, also used to transport wood, exotic species, agricultural products, and commerce [72,73]. Hg3 diverged approximately 0.44 MYA into two subgroups, the first designated Hg3B distributed from the Chimalapas/Grijalva river basin of northern Chiapas, through CA, to Colombia, Peru, and Ecuador. The Hg3A is only found in one restricted region of the southern Pacific Soconusco coast of Chiapas in Mexico (no more than a few samples collected across the Suchiate river in San Marcos in the Mexico-Guatemala border region) and in two Colombian regions (Guajira and Santander) [9]. More importantly, it is absent from all other regions in Chiapas where the Hg3B occurs, and any region thus far analyzed in the Mexican Neotropics. Phylogenetic inferences could be explained based on the historical population exchange that has occurred between this pre-classical Mayan region (Izapa, Chiapas) with Incan populations of Colombia, and may have established either in Chiapas or Colombia through cultural/agricultural hitch-hiking and a founder effect. Since the Hg3 is the only haplogroup shared from Mexico to NSA, information regarding its population structure, dispersal ecology, and vector capacity will be important to design, monitor, and analyze regional vector control and T. cruzi transmission prevention.

Vector capacity and DTU specificity in dimidiata haplogroups
Hg1 and Hg2 infection frequencies for bugs from Berriozábal and from sub-region A (where the three are sympatric in two modified landscapes) were similar, both significantly higher than Hg3. Hg1 infection was also significantly more prevalent than expected in the rest of the Mexican neotropical region (double that of Hg2), while Hg2 infection was significantly less than expected overall, but similar to that in Berriozábal. All pairwise differences in haplogroup infections were significant using the complete Mexican dataset. Since there is important mammal species turnover where the three haplogroups are distributed, differences in T. cruzi infection may not be related to intrinsic specificity, but may be due to differing infection rates in reservoirs. However, data from synanthropic rodents present in all three distribution regions indicate similar infection prevalence across the region which does not coincide with haplogroup infection differences based on host infections. There was also similar infection prevalence in bats in areas with either Hg1 or Hg2 [5,6,74].
Trypanosoma cruzi infection in bugs overall was not associated with higher zoonotic blood sources, although in the domestic habitat, there was high mixed human-zoonotic bloodmeal rates. The availability and mix of bloodmeal sources may provide the opportunity for vector population amplification, and a benefit for T. cruzi maintenance in human modified habitats. Triatoma dimidiata (Hg3) collected from houses in Guatemala, (examined using ELISA), had fed most on humans, although mixed meals with opossums and cows were also reported [75], while studies of feeding patterns have frequently documented multiple hosts by domestic insects [76][77][78][79]. Bugs from Berriozábal having only human or mixed bloodmeals, had predominantly T. cruzi DTUI, while those with only zoonotic sources had both DTUI and DTUVI [80]. In addition to feeding patterns, insect density and spatial and temporal occurrence of hosts and insects contribute to host choice [81], and the number, identity, and proximity of synanthropic species also influence feeding patterns [77]. Biotic interaction models for all haplogroups predict significant links are strongest with synanthropic mammal reservoirs [39], an important indicator for T. cruzi transmission exposure hazard [41]. The threshold for reservoir diversity and abundance below which T. cruzi or specific DTUs are no longer transmitted, has yet to be analyzed.
Independent of infection prevalence, a triatomine´s capacity to sustain one DTU, may not be similar to another, but may be also related to the presence or efficacy of parasite acquisition from reservoir communities. DTUI and DTUVI frequencies were significantly different among the three T. dimidiata haplogroups. DTUI and DTUVI infection rates in Hg1 and Hg3 were significantly different, while there was no difference between them in Hg2, indicating DTUVI amplification in this latter haplogroup. The DTUI frequency in Hg1 was significantly more than expected, while that in Hg2 was significantly less. In contrast, DTUVI frequency in Hg1 specimens was significantly less, while more than expected in Hg2. Not only is the prevalence of T. cruzi DTUs among haplogroups different, but specific parasite haplotypes may be haplogroup-specific. A unique DTUI haplotype found only in Hg2 in IZ/sub-region A, was not found in any of the few Hg1 and Hg3 infected specimens from the same site. Additionally, two unique haplotypes of DTUVI were identified in Hg2 from Berriozábal, and only from one other site in the Mexican Neotropics, on the western edge of the Chimalapas/Tehuantepec Isthmus; the zoonotic source of these haplotypes has yet to be identified [74,80]. There was no genetic differentiation of DTUI (both 18S and ME haplotypes) from sylvatic or human reservoirs, and bugs, within Berriozábal, although there was genetic differentiation of these populations with those from Palenque (eastern Chiapas, Hg2 and Hg1), from Tapachula (southern Pacific coast Chiapas, Hg3A), and from southern Campeche (Hg1 and Hg2) [74,80]. This information is important for vector control strategies, since not only are there specific vector populations expanding along the socioeconomic gradient, but they are transmitting distinct parasite populations, despite relative proximity (300-1000 km).
There is sufficient genetic evidence currently, not only to distinguish the three principal T. dimidiata haplogroups, but also to identify evolving differences in parasite populations, a key component of vector capacity. Characterizing and understanding bug haplogroups (genetic and phenetic traits), and this species complex´evolution, will allow us to analyze associations of life history traits with population amplification across regions, and expansion potential across each haplogroup´s distribution. It also allows us to understand the impact of humanmodification and practices within landscapes, and the transmission of specific parasite populations to the human population, as well as other synanthropic reservoirs.
On a final note, preliminary studies regarding the dimidiata complex population genetics in Mexico, alerted us to a bias in annealing of parasite DNA to the existing ND4 primers, which were designed based on a large fragment of the ND4 gene [42]. This bias occurred differentially depending on the collection site in Mexico, which also coincided with haplogroup type. This bias is probably based on an increase in nucleotide polymorphisms at primer binding sites of the originally designed ND4 oligonucleotides (herein ND4A), and therefore in order to reduce bias for landscape genetic studies, and phylogeography, a complementary fragment of the same gene was identified and used. Although principal phylogenetic topology is the same using only the previously existing fragment (ND4A), the use of the two fragments (concatenated ND4A and ND4B), reveals greater structure of more recent divergence of subclades (i.e. 3 within Hg2 and potentially 3 in Hg3). Together, both fragments eliminate the bias regarding proportional sample representation across geographic sites, and provide more precise analysis of genetic diversity and differentiation particularly at the landscape scale, since the ND4B is more polymorphic. Additionally, there is an important bias in network analyses using the ND4A alone, since Hg1 is the most distant from Hg3, when the concatenated fragments are used. Effective population increases, probably biased by haplogroup representation in sample sets, was double when the ND4A alone was used. Hence, recent regional and landscape level population shifts cannot be analyzed appropriately using the ND4A alone. Other mitochondrial markers, also used in previous studies (cyt b, COI) may have similar bias.
Trypanosoma cruzi infection prevalence in the Berriozábal region was associated with a reduction from north to south, which also coincided with conserved to highly modified landscapes, and highest human bloodmeal rates (occurring in the region with greatest human population movement). Understanding gene flow and migration of insect vectors is critical to effective control and elimination of vector-borne diseases, and in the present study we have contributed to a growing body of evidence regarding differences among the three dimidiata complex haplogroups. These primary haplogroups have different traits related to ecological niche and distribution, persistence or expansion in modified habitats, modification of food source abundance in anthropic habitats, and T. cruzi infection and DTU specificity. At least Hg1 has expanding populations in Mexico, an alarm bell for vector control programs. Despite the presence of the dimidiata complex across the Neotropical region of Mexico, the three main haplogroups are sympatric only in the western region of Chiapas, which is the eastern fringe of the Chimalapas rainforest, primary ecotope of the Tehuantepec Isthmus. The Isthmus is an important biogeographic barrier diverging many taxa west and east between 2.4 and 10 MYA [69,70]. In the present study, there is evidence of a significant difference of T. cruzi lineages in the three T. dimidiata haplogroups. Knowledge of the parasite´s genetic diversity in an area is important to analyze efficacy of therapeutic tools, since T. cruzi DTU or haplotype may be associated with cardiomyopathy manifestations, parasitemia in infected individuals, or susceptibility to pharmacological agents [82]. Vector diversity has direct implications for the design and success of vector control strategies of the dimidiata complex in Mexico and the Mesoamerican biodiversity region [83].
Supporting information S1