Human T-cell lymphotropic virus type 1 transmission dynamics in rural villages in the Democratic Republic of the Congo with high nonhuman primate exposure

The Democratic Republic of the Congo (DRC) has a history of nonhuman primate (NHP) consumption and exposure to simian retroviruses yet little is known about the extent of zoonotic simian retroviral infections in DRC. We examined the prevalence of human T-lymphotropic viruses (HTLV), a retrovirus group of simian origin, in a large population of persons with frequent NHP exposures and a history of simian foamy virus infection. We screened plasma from 3,051 persons living in rural villages in central DRC using HTLV EIA and western blot (WB). PCR amplification of HTLV tax and LTR sequences from buffy coat DNA was used to confirm infection and to measure proviral loads (pVLs). We used phylogenetic analyses of LTR sequences to infer evolutionary histories and potential transmission clusters. Questionnaire data was analyzed in conjunction with serological and molecular data. A relatively high proportion of the study population (5.4%, n = 165) were WB seropositive: 128 HTLV-1-like, 3 HTLV-2-like, and 34 HTLV-positive but untypeable profiles. 85 persons had HTLV indeterminate WB profiles. HTLV seroreactivity was higher in females, wives, heads of households, and increased with age. HTLV-1 LTR sequences from 109 persons clustered strongly with HTLV-1 and STLV-1 subtype B from humans and simians from DRC, with most sequences more closely related to STLV-1 from Allenopithecus nigroviridis (Allen’s swamp monkey). While 18 potential transmission clusters were identified, most were in different households, villages, and health zones. Three HTLV-1-infected persons were co-infected with simian foamy virus. The mean and median percentage of HTLV-1 pVLs were 5.72% and 1.53%, respectively, but were not associated with age, NHP exposure, village, or gender. We document high HTLV prevalence in DRC likely originating from STLV-1. 
We demonstrate regional spread of HTLV-1 in DRC with pVLs reported to be associated with HTLV disease, supporting local and national public health measures to prevent spread and morbidity.


Introduction
Human T-cell lymphotropic virus type 1 (HTLV-1), the first discovered human retrovirus, is widespread globally and can be highly oncogenic in some infected persons [1,2]. With increased human mobility and migration and limited communication regarding effective prevention strategies, HTLV-1 remains a significant public health threat [3,4]. Furthermore, two new HTLV groups were discovered recently and their epidemiology is poorly understood [5][6][7][8]. Thus, a deeper understanding of the epidemiology of HTLV is increasingly important, with leading HTLV scientists and public health experts calling for renewed efforts to eradicate HTLV [3].
Molecular characterization of primate T-lymphotropic viruses (PTLVs), which comprise both simian and human T-lymphotropic viruses, reveals that some HTLV subtypes share closer genetic ties to certain simian T-lymphotropic viruses (STLVs) than to other HTLV subtypes, suggesting sustained zoonotic transmission of STLV between nonhuman primates (NHPs) and humans [12]. Phylogenetic and epidemiologic evidence from Central Africa supports the notion that, in addition to known human transmission pathways of HTLV (sexual, mother-to-child, sharing of needles, transplantation of infected tissues, and blood transfusions), interspecies crossover of STLV to humans has occurred [4,7,12,14]. Since HTLV testing is not routinely employed in most countries except for blood banks in developed nations, asymptomatic carriers can unknowingly transmit HTLV both vertically and horizontally. HTLV-1, HTLV-3, and HTLV-4 have all been shown to originate from closely related STLVs (STLV-1, STLV-3, and STLV-4, respectively), whereas HTLV-2 is more distantly related to STLV-2, making its origin less clear. Phylogenetic analysis shows that all HTLV-1 subtypes except for cosmopolitan subtype A likely have primate origins [22].
Regions of frequent and close contact with wild animals have been implicated in the initiation and propagation of major infectious disease zoonoses throughout history, highlighted by the current COVID-19 pandemic [23][24][25]. In settings where environmental barriers between humans and animal habitats are diminished, exposure to the tissues and bodily fluids of wild animals can lead to the spread of zoonotic agents, including STLV. In the Democratic Republic of the Congo (DRC), an estimated 52 million people reside in rural, often densely forested areas and rely heavily on the hunting, trading, and consumption of bushmeat, including that from NHPs, as a major source of nutrition and income [26][27][28]. Studies from densely forested, populous areas of this Central African nation estimate that small diurnal monkeys, specifically Cercopithecus spp. and Cercocebus spp., are preferred protein sources and account for around one third of the bushmeat market [29,30]. The encroachment of local populations on these forests of the biodiverse Congo Basin for nutritional and economic supplementation provides opportunities for cross-species transmission via bodily fluid exchange with smaller species as well as NHPs, increasing the probability of novel HTLV emergence [9,10,19]. Divergent STLV-1, STLV-2, and STLV-3 have been reported in monkeys and apes in DRC, which, combined with the hunting and eating of NHPs in this area, increases the likelihood of exposure to these viruses [31][32][33][34]. In fact, one recent study reported an association of severe NHP bites with HTLV-1 infection and showed that a significant number of these infections were genetically related to STLV-1 from gorillas and monkeys, though human-to-human transmission could not be excluded [33].
Concerningly, high-risk exposures to multiple species of wild animals have been observed in the DRC and other ecologically and anthropologically similar regions of Central Africa, making this region a hotspot for continued pathogen spillover with potential novel disease initiation events [35][36][37][38].
Despite the ubiquity of close animal contact throughout DRC and other parts of Central Africa, little evidence exists to describe the contact type most likely to result in increased zoonotic transmission of PTLV infection, based on species encountered, animal interactions, and contact frequency [32,33,39,40]. Data to examine the risk of familial and intra-household transmission of HTLV once STLV crosses over into humans are similarly sparse, hampering our understanding of secondary transmission of HTLV [41,42]. To better understand these zoonotic and person-to-person transmission pathways, we conducted a population-based survey among residents from two health zones in the rural Sankuru province of the DRC to assess the prevalence, epidemiologic risk factors, and markers of HTLV infection in this highly bushmeat-exposed population. Previously, we have shown that persons exposed to NHPs in this population were infected with simian foamy virus (SFV), another simian retrovirus, highlighting their risk for exposure to additional simian retroviruses [36]. In our current study, we aimed to characterize the transmission dynamics of HTLV, and to evaluate pathways for STLV zoonotic transmission from animals to humans and from person-to-person (sexual, vertical) in this rural Congolese population.

Ethics statement
The UCLA Institutional Review Board (IRB #10-000094-CR-00009) and the Kinshasa School of Public Health Ethics Committee approved collection, storage, and future testing of blood samples collected in 2007 from all consenting study participants. A non-research determination was approved for retrovirus testing of anonymized samples at CDC.

Study population
This study was originally conceived of as a means of conducting zoonotic surveillance of monkeypox disease in DRC, and the study design and population have been described [35,36]. Briefly, we conducted a population-based survey in rural villages of Sankuru province, DRC from August to September 2007. Two monkeypox-endemic health zones within Sankuru province, Kole and Lomela, were chosen for study activities; village lists of these two health zones provided by local officials were used to randomly select 9 villages as study sites (Fig 1). All healthy individuals ≥ 1 year of age in selected villages were eligible for enrollment. Local, trained health care workers obtained verbal informed consent from all participating adults and assent from children 7-18 years with parental or guardian consent and administered a questionnaire in either French or the local language, Tetela. Consenting parents and guardians of participants <7 years of age answered on behalf of their children. All participant data was anonymized using a unique ID number of randomly assigned check digits, which was attached to both survey data and biological samples.

Questionnaire administration
We collected socio-demographic information via an orally administered questionnaire for each participant. Household information, including location of household and an individual's role within the household, was also collected, with household role categorized according to each respondent's relationship to their respective head of household. Animal exposure data were collected with special care taken to reduce misclassification by translating scientific taxa to local nomenclature. We used focus groups to identify local names of commonly hunted animals in the region and created a handout with representative photos or drawings of each species to aid in bushmeat classification and identification. Participants were asked about the frequency and types of exposures they may have had to over 26 different animal species in the past month, including 11 NHPs found in the Sankuru province of DRC, and they also had the opportunity to specify additional species not included in the standard curated list (see S1 Table for the full list of animal species). All surveys were administered by local, trained interviewers. To minimize bias that may be associated with unauthorized hunting activities, questions regarding animal exposures were never prompted or asked in a framework of legality, nor were species grouped by their vulnerability or conservation status.

Biological specimen collection and laboratory analysis
Venous blood specimens were collected by trained phlebotomists from all consenting participants using ethylenediaminetetraacetic acid (EDTA)-treated vacutainer tubes (Fisher Scientific, Pittsburgh, PA). Blood specimens were processed for plasma and buffy coats in DRC, stored at -80˚C, and shipped to collaborating laboratories at the US National Institutes of Health before being sent to the US Centers for Disease Control and Prevention (CDC) for final analysis.
Amplicons from the tax and LTR PCRs were purified using the Qiaquick PCR purification kit (QIAGEN Inc., Germantown, MD) and sequenced using the Big Dye Terminator Reaction Mix (Applied Biosystems) and an ABI 3500 sequencer. Sequences were assembled using Geneious v 9.0. Genetically related sequences were identified using BLAST searches (http://www.ncbi.nlm.nih.gov/BLAST/) and added to the analyses for comparison along with HTLV references, HTLV sequences previously isolated from DRC, and STLV sequences from various species native to the region. As with other HTLV molecular epidemiology studies, the inclusion of other DRC and African sequences in the phylogenetic analyses helps to elucidate their origin of transmission and evolutionary histories [7,14,32,46,51,52,54-60]. We kept the top ten PTLV sequences identified by the BLAST search and then removed any duplicates and LTR sequences < 400 nucleotides in length. For example, STLV-1 sequences (n = 34) recently reported from a variety of NHPs in DRC, including Allenopithecus nigroviridis, Cercopithecus ascanius, C. denti, and C. mitis, were excluded from our analysis since these sequences overlapped our HTLV-1 sequences and most other PTLV-1 sequences at GenBank by < 220 nt and hence were not sufficient for phylogenetic analysis [34]. In addition, the lack of phylogenetic signal in these short tax sequences in the alignment was confirmed using likelihood mapping analysis in IQ-Tree v1.6.0 [61]. We also limited the number of HTLV-1 sequences from a specific study and country, except DRC, to 2-4 taxa to reduce the computational complexity of the Bayesian analysis. We performed DNA alignments using MAFFT v7.017. GUIDANCE2 was used to identify and remove phylogenetically unreliable regions in the alignment at the recommended confidence score of 0.93 [62]. All LTR sequences in the final alignment passed both the composition chi-square test and likelihood mapping analysis in IQ-Tree.
We inferred HTLV-1 LTR phylogenies by Bayesian inference in BEAST v1.8.4 [63] with an uncorrelated, lognormal relaxed molecular clock, a birth-death tree prior, and 450 million Markov Chain Monte Carlo (MCMC) iterations with a 10% burn-in. These parameters have been shown previously to accurately infer PTLV evolutionary histories [56,60,64]. Convergence of the MCMC was assessed by calculating the effective sample size (ESS) of duplicate runs using the program Tracer v1.6 (http://tree.bio.ed.ac.uk/software/tracer/). We used the model test algorithm in MEGA v6 to determine the best fitting nucleotide substitution model, which was inferred to be the general time reversible (GTR) model with gamma (G) distribution (GTR+G). An XML file that includes the sequences and parameters for the BEAST analysis is provided in the supplementary material (S1 BEAST). All parameter estimates showed ESSs > 750, indicating sufficient mixing. The tree with the maximum product of the posterior clade probabilities (maximum clade credibility (MCC) tree) was chosen from the posterior distribution of 9,001 sampled trees after burning in the first 1,000 sampled trees with the program TreeAnnotator. Branch support was determined using posterior probabilities. Trees were displayed and edited in FigTree v 1.4.0 (http://tree.bio.ed.ac.uk/software/figtree/). LTR sequences were considered as belonging to transmission clusters if the inferred tree node posterior probabilities were > 0.8 and the members of the cluster shared > 99% nucleotide identity.
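The two-part cluster criterion above (node posterior probability > 0.8 and > 99% nucleotide identity) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study. The function names are hypothetical, and two points are assumptions on our part: that identity is computed over aligned columns where neither sequence has a gap, and that "shared > 99% identity" means every pair of cluster members must exceed the threshold.

```python
from itertools import combinations


def pairwise_identity(seq_a, seq_b):
    """Fraction of identical bases over aligned columns without gaps.

    Assumption: gap columns ('-') are skipped rather than counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no overlapping ungapped positions")
    return matches / compared


def is_transmission_cluster(aligned_seqs, node_posterior,
                            min_posterior=0.8, min_identity=0.99):
    """Apply both thresholds; every pair of members must exceed min_identity."""
    if node_posterior <= min_posterior:
        return False
    return all(pairwise_identity(a, b) > min_identity
               for a, b in combinations(aligned_seqs, 2))


# Hypothetical 200-nt aligned sequences differing at a single position.
ref = "ACGT" * 50
variant = "T" + ref[1:]
print(is_transmission_cluster([ref, variant], node_posterior=0.9))
```

Skipping gap columns means that a shared deletion does not lower the identity between two sequences; counting gaps as mismatches would be an equally defensible convention and would give stricter clusters.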

Statistical analyses
We used t-tests, Wald chi-square analyses, and Fisher's exact tests to assess differences in HTLV WB positivity across demographic, behavioral, and NHP exposure groups, as well as differences in pVLs. Wald chi-square tests of proportions were also used to determine differences in demographic breakdown between those with and without eligible biological specimens. From the serologic assays, HTLV WB positivity was classified for EIA-reactive samples showing HTLV-1-like, HTLV-2-like, or untypeable profiles. Samples with indeterminate WB results were classified as negative since these results are rarely from persons with HTLV infection [43,49,65]. Stratified and adjusted analyses were performed via logistic regression to compare the magnitude and significance of various demographic and behavioral risk factors on HTLV seroreactivity. Only households with at least one HTLV seroreactive individual and complete, identifiable household information were included in household-centered analyses (n = 534 individuals in 217 households). All statistical computations were performed in SAS version 9.3 (SAS Institute, Cary, NC).

Participant sociodemographics and HTLV serology
In total, 4,572 of 5,687 (80.4%) eligible persons were enrolled in the study from four villages in Lomela and five villages in Kole (Fig 1). Of these enrollees, 3,071 (67%) had biological specimens of sufficient quality for testing. We found 281/3,071 (9.15%) samples were repeat reactive by ELISA, of which 172 were seropositive by WB, indicating an HTLV seroprevalence of 5.6% in this population. Of these, 135 (78.56%) were HTLV-1-like, three (1.8%) were HTLV-2-like, and 34 (19.8%) were HTLV-positive but untypeable. HTLV-indeterminate profiles were seen in 80 (2.6%) samples (Table 1). The majority of samples with untypeable WB results (58.8%,

Three individuals had HTLV-1-like WBs and were also found to be positive for simian foamy virus (SFV) antibodies and sequences in our previous study [36], indicating dual retroviral infection. Fourteen other study participants were SFV seropositive but HTLV-negative. An association between SFV and HTLV co-infection was not observed (Fisher's p-value = 0.08995).
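As a worked check of the seroprevalence arithmetic, the sketch below recomputes the proportion from the counts given above (172 WB-seropositive of 3,071 tested) and attaches a Wilson score interval. The interval is our illustration only; the source text does not report a confidence interval for this estimate, and the function name is hypothetical.

```python
from math import sqrt


def proportion_with_wilson_ci(successes, n, z=1.96):
    """Return (proportion, lower, upper) using the Wilson score interval.

    z = 1.96 corresponds to an approximate 95% interval (an assumption here).
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, centre - half_width, centre + half_width


# Counts taken from the paragraph above.
p, lower, upper = proportion_with_wilson_ci(172, 3071)
print(f"seroprevalence = {100 * p:.1f}% "
      f"(Wilson interval {100 * lower:.1f}% to {100 * upper:.1f}%)")
```

The Wilson interval is used here instead of the simpler normal approximation because it behaves better for proportions near zero.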

Household relationships, role, and HTLV WB seropositivity
To analyze potential intra-household transmission of HTLV, we established a sub-cohort of all 165 HTLV seropositive individuals and their 158 known family members. A total of 132 households had just one HTLV seropositive individual, 12 households had two, and four households had three. Across the 11 different household roles for which we collected data, 84% of HTLV seropositive persons were either the head of household (HH), a primary or secondary wife (PW or SW, respectively), or a biologically related child (i.e., the nuclear family).

NHP exposure
NHP exposures were divided into two analytical frameworks: exposures based on animal species and exposures based on activity type. For the former, any type of exposure activity with each of nine distinct NHP species was assessed: Cercocebus chrysogaster, Cercopithecus ascanius, Cercopithecus neglectus, Cercopithecus nictitans, Cercopithecus wolfii, Colobus angolensis, Lophocebus aterrimus, Piliocolobus tholloni, and Pan paniscus. For the latter framework, exposure to any NHP during each of six distinct exposure activities was examined: hunting, picking up dead animals, butchering and skinning, cooking, eating, and playing with or being bitten or scratched by a live animal. Chi-squared analysis showed no association between HTLV seropositivity and exposure to any individual NHP species or to any NHP exposure type. Rates of any NHP contact were similar across age categories (p = 0.1652), ranging from 66.7% to 83.9%. Although the odds of HTLV seropositivity had low p values for the majority of NHP activities, the confidence intervals of the odds ratios included the null value (1.0), and the associations were not considered significant (Table 3). NHP exposure activities were also heavily gendered. Over 95% of hunters were men, whereas 76.7% of those reporting cooking NHPs were women. Women were more likely than men to be exposed to NHPs in general, to cook NHPs, and to eat NHPs (χ2 p value = 0.0006, < .0001, and 0.0051, respectively), whereas men were more likely to hunt or pick up dead NHPs (χ2 p value < .0001 for both) (Table 3). The frequencies of men and women who reported butchering and skinning NHPs were nearly equal and did not differ significantly (Table 3).
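The statement that an odds ratio "crosses" the null value can be made concrete with a short sketch that computes an unadjusted odds ratio and its Wald confidence interval from a 2x2 table. This is a simplification: the study used stratified and adjusted logistic regression in SAS, whereas the code below is a crude, unadjusted calculation. The function names and all counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald interval for the table
    [[a, b], [c, d]] = [[exposed+, exposed-], [unexposed+, unexposed-]].
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    standard_error = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * standard_error)
    upper = exp(log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


def interval_includes_null(lower, upper, null_value=1.0):
    """True when the interval contains the null value (no association)."""
    return lower <= null_value <= upper


# Invented counts: 20 of 100 exposed and 10 of 100 unexposed persons seropositive.
or_, lo, hi = odds_ratio_wald_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, interval {lo:.2f} to {hi:.2f}, "
      f"includes 1.0: {interval_includes_null(lo, hi)}")
```

In this invented example the point estimate is well above 1.0, yet the interval still includes 1.0, which is the situation described for most NHP exposure activities above.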

HTLV PCR and phylogenetic analyses
Two hundred and forty HTLV WB seropositive and seroindeterminate persons (96.0%) had buffy coats available for testing by nested PCR using generic tax primers for determination of HTLV group and by generic tax qPCR for simultaneous viral detection and pVL determination. All DNA specimens from the 240 buffy coats had amplifiable β-actin sequences demonstrating the integrity of the extracted nucleic acids. About 40% of the WB seroreactive specimens tested positive for tax sequences by either nested (n = 110) and/or qPCR (n = 107  [66]. One sample with an HTLV-2 WB profile had HTLV-1 tax and LTR sequences. We obtained partial LTR sequences from 109 persons and used these for phylogenetic analyses. A BLAST search identified a total of 36 unique LTR sequences (28 HTLV-1, 8 STLV-1) that were genetically related to those from our study and were included with other African and subtype reference sequences (65 HTLV-1 and 63 STLV-1) for a total of 273 PTLV-1 sequences in the phylogenetic analyses. The Bayesian topology showed the LTR sequences clustered by subtype as expected demonstrating the robustness of our analysis (Fig 2), with those sequences from our study clustering in the Central African subtype B clade. Within the subtype B clade our DRC LTR sequences clustered within two separate clades with strong support (posterior probability (PP) = 1) and both contained STLV-1 sequences (Fig 3). The first clade consisted of 32 HTLV-1 and five STLV-1, including three STLV-1 from different Cercopithecus monkeys from DRC (Cne8, Cwo39, Cas88) and two apes from Cameroon (PtrCAR.875, GgoGolda). Seventeen DRC LTR sequences from our study clustered with three DRC HTLV-1 sequences (N276, N305, N307) from a previous study where all three reported monkey exposure (PP = 0.89) [32]. Two additional DRC sequences from our study clustered with two HTLV-1 from Gabon (StDen, PH559) and one from Cameroon (Lobak89) [33,51,67]. 
Person Lobak89 reported a severe gorilla bite; NHP exposure information was not available for StDen and PH559. The second clade consisted of 117 HTLV-1 and 10 STLV-1 sequences, including three STLV-1 from Allenopithecus nigroviridis (Angwis, Angmer, Angven) and one from a Cercopithecus monkey (Cwo5), both from DRC (Fig 3) [46,58]. The remaining six STLV-1 were from apes from Cameroon (Cam43, GGoCam12, Ggo02Cam3157, GgoM10431) and the Ivory Coast (25.4_Pve, Regina_Ptr). Eighty-five DRC HTLV-1 from our study clustered strongly (PP = 0.97) in a subclade in the 127-member clade with four HTLV-1 from DRC (MWMG, GL, MOMS, ITIS) from the 1990s [66,68]. This 89-member DRC taxa clade was sister to a five-member clade containing three DRC sequences from our study (MPX4653, MPX5655, MPX41392) and two DRC HTLV-1 from a previous study (PH1250, L047) for whom NHP exposure was not reported [32,67]. However, separation of these two DRC clades was weakly supported (PP = 0.13). Interestingly, the three A. nigroviridis STLV-1 were ancestral to this 89-member DRC taxa clade with good support (PP = 0.77). Two additional DRC sequences from our study (MPX3275, MPX4653) clustered in an 8-member clade with two HTLV-1 from Nigeria (TBU, HFS), one HTLV-1 from the Central African Republic (12503), an STLV-1 from a Cercopithecus monkey from DRC (Cwo5), and an ape from Cameroon (Ptr_Cam43).

Fig 3 legend (partial). The clade containing 127 taxa, including 85 HTLV-1 LTR sequences from our study, is highlighted with a yellow background. The branches for those clusters with high support are colored green and marked with an asterisk. Taxa for clusters of persons belonging to the same household are colored orange and are annotated with demographics in the format: StudyID_HouseholdID_sex_village_relationship to head-of-household (HH). F, female; M, male; PW, primary wife; sib, sibling; R, responsible (same as HH); SW, secondary wife; BC, biological child; P, parent; GC, grandchild; GP, grandparent. Taxa for the eight-person cluster with the 11-bp deletion are in green; however, ones from the same household are in orange. STLV-1 taxa are blue and simian species of origin are provided as three-letter codes (ANG, Allenopithecus nigroviridis; Ggo, Gorilla gorilla; Ptr, Pan troglodytes; Pve, P. vellerosus; Cne, Cercopithecus neglectus; Cwo, C. mona wolfii; Cas, C. ascanius). Country of origin is provided in taxon name when known; Zaire is now DRC; CAM, Cameroon; IC, Ivory Coast; EG, Equatorial Guinea; CAR, Central African Republic; GAB, Gabon; NG, Nigeria; GAM, Gambia. Some branches were collapsed to improve visualization of the DRC genetic relationships. Accession numbers are provided for taxa sequences obtained at GenBank for the analysis. https://doi.org/10.1371/journal.pntd.0008923.g003
In the BEAST analysis, we found 18 potential transmission clusters that included a total of 50 persons (Table 4). Three pairs of HTLV-1 LTR sequences, 2 persons in a three-member cluster, and three persons in an 8-member cluster all clustered with good support (PP > 0.83) and the participants in each group were from the same household (Fig 3, Table 4). Among these eleven persons from five households, just one cluster had an additional HTLV seroreactive member who was not a part of the cluster. In this house, from the village of Tokondo, a primary wife and a female biological child (MPX8304 and MPX8326) clustered with strong support and one of the three additional male biological children in this household was seroindeterminate, but PCR-negative. Four of these households had additional family members, primarily other biological children who were not positive for HTLV. In the fifth house, only the two found positive participated in the study (MPX20031 and MPX20064). Interestingly, in one cluster a primary wife and a female biological child (MPX7501 and MPX7991) clustered with a male biological child from a different household (MPX7243) despite there being two other male biological children in their household. In one 10-person household in the village of Ndjale, three women were a part of a large eight-person cluster.
It is important to note that we also found several instances (11 pairs, two triads, two five-member clusters, one 8-member cluster) where sequences from different households, and even different villages, clustered together with good phylogenetic support (Fig 3, Table 4). The DRC map (Fig 1) shows the connectivity of the villages by roadways; the two health zones are 130-160 km apart. The majority (8/15, 53.3%) contained only females, but there were five male/female pairs, one male/male pair, and one triad consisting of two males (7 and 28 yo) and one female (78 yo). One male/female pair (Cluster 13) and the two females and one male in a triad (Cluster 9) were from the same village but from different households. Both females in the triad were from the same household; one was the primary wife and the other her biological child. Persons in these two pairs differed in age by only four years. In one triad (Cluster 16) and in one pair (Cluster 5), at least one member was from a village in a different health zone. Both five-member clusters (Clusters 7 and 18) consisted of four females and one male and were from different villages in the Lomela health zone. The latter five-member cluster also clustered with the HTLV-1-ITIS LTR sequence from an adult male from DRC with HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) [66].
The eight-member cluster (Cluster 8) consisted entirely of women of different ages from different villages in the Lomela health zone. Examination of the HTLV-1 LTR alignment showed all eight LTRs had the same 11-bp deletion. The deletion is located just before the first transcription enhancer element in the LTR. Five women were from Ndjale, of whom three (MPX5423, 5434, and 5456) were from the same household as described earlier. In relation to the head of household, MPX5423 is the primary wife and MPX5456 is a sibling; the relationship of MPX5434 is not known. Two of the eight were from Tokondo, and one was from Bahamba. The MPX8223 HTLV-1 sequence in this cluster is from the only one of the three confirmed SFV-infected persons in our study whose SFV originated from an Angolan colobus monkey (Colobus angolensis) endemic to DRC [36].

HTLV-1 proviral loads
The percentage of infected cells in 107 persons with detectable pVLs ranged from 0.01-73.06% with a mean and median of 4.38% and 1.58%, respectively. Almost 70% (74/107) of persons had over 1% of their PBMCs infected, of whom 69.12% (47/68; six persons did not report gender) were female. Thirty-seven of these 47 women (78.72%) reported NHP exposures compared to 8/17 (47.06%) men with > 1% infected PBMCs. Five women and one man had > 10% infected PBMCs (range 11.77-73.06%), of whom four women (66.7%) reported NHP exposure. We next examined the mean and median percentage of infected PBMCs in the 20 phylogenetic clusters consisting of two to 10 members. The overall mean and median percentages of infected PBMCs in these 20 clusters were 5.18% and 1.46% compared to 3.54% and 4.69% for singletons, respectively, but these differences were not statistically significant. The average and median percentages of infected PBMCs for potential transmission pairs, one three-member cluster, and the five- and eight-member clusters were 2.83% and 1.50%, 4.37% and 4.21%, 1.30% and 3.98%, 0.92% and 0.35%, 11.91% and 2.76%, respectively. Differences in mean and median percentages of infected cells in each cluster in comparison to those of singletons were not significant. Two women in potential transmission clusters had the highest percentages (71.20% and 73.06%) of infected PBMCs compared to the highest value among singletons (19.11%; a male). We did not find a difference in the average and median percentage of infected PBMCs between women (5.055% and 1.79%, n = 99) and men (2.78% and 1.47%, n = 46), respectively. Participants in the 0-5 yo age group had the highest mean and median percentages (19.74% and 2.95%, respectively) of infected PBMCs but this was likely skewed by the highest value (73.06%) in our study in a 15 yo female (MPX3146).
Interestingly, the 8-member cluster with the highest mean percentage of infected cells (11.91%) contains LTR sequences with the 11-bp deletion, but again this is likely skewed by the high 73.06% result for the female member of this group (MPX3146).

Discussion
Limited information exists on the current prevalence and characteristics of HTLV infection in DRC, with most studies conducted in the 1990s [13,32,[69][70][71][72][73]. To fill this knowledge gap, we conducted a cross-sectional, population-based survey among participants from two health zones in the rural Sankuru province of DRC to assess the prevalence, transmission risk factors, and biomarkers of HTLV infection in this population with high bushmeat exposure. We found an overall HTLV seroprevalence of 5.4% in our population and an HTLV-1-specific prevalence of 4.2%, further adding to our understanding of HTLV burden in the DRC. In comparison, previous prevalences reported in DRC ranged from 3.1-19.6%, excluding studies that focused on clusters of persons with HAM/TSP in which the prevalence was as high as 78.1% [13,32,[69][70][71][72][73]. For example, a 1993 study of randomly sampled persons from the general population residing in Inongo, DRC, reported a crude HTLV-1 prevalence rate of 3.1% [72]. In Inongo, the authors report fish as the primary dietary protein, which may help explain the lower prevalence observed in this population compared to Sankuru, where NHPs are a common food staple. The 19.6% seroprevalence was seen in adults from a leprosy hospital in Northwest DRC in the province of Mbandaka [70]. The most recent study, reported in 2017 by Mossoun et al. in three villages in the Bandundu province of DRC, around 450 km west of Sankuru, found a 1.3% prevalence rate for HTLV-1, though this sample only included 302 DRC individuals and only three persons were confirmed with infection by WB and PCR testing [32]. Two HTLV studies in DRC's capital, Kinshasa, reported prevalence rates of 7.3% and 3.2% among sex workers, a population at higher risk for sexually transmitted infections, albeit less likely to have ubiquitous contact with NHP bushmeat [69,74].
The burden of HTLV-associated pathologies such as ATLL and TSP/HAM has been difficult to estimate in DRC due to a lack of diagnostic facilities and trained medical personnel and to limited health system infrastructure [75]. Hence, research on HTLV-1-related diseases in DRC and across Africa is urgently needed and would help improve public health and disease prevention [3]. HTLV prevalence in sub-Saharan Africa varies across country, region, and ethnic group as a result of forest proximity, hunting activities, and other high-risk behaviors. A recent meta-analysis of HTLV-1 from published population-based studies in sub-Saharan Africa showed a higher seroprevalence in Central Africa (4.16%) compared to Western (2.66%) and Southern Africa (1.56%) [76]. This meta-analysis also found higher seroprevalence in women (3.27% vs 2.26%) and rural locations (3.34% vs 3.18%), congruent with our findings. Jeannel et al. also reported a higher seroprevalence in women (3.5% vs 2.6%) and showed the highest seroprevalence (6.5%) in the Bolia ethnic group in Inongo, DRC compared to 1.5% in the Sengele [77]. A more recent study in rural Gabon also reported a higher overall HTLV-1 prevalence rate (7.3%) with higher infection prevalence in women (9.0%) [77]. Taken together, the literature consistently shows an increased HTLV infection risk among women, especially as they age, matching our findings. Others have hypothesized that increased infection in women is likely from sexual transmission via condomless sex and higher viral loads in their male partners [1,2,41,77]. Notably, several of these studies were conducted in the 1990s and the lower specificity of HTLV assays at that time likely inflated these reported numbers [2].
Among HTLV-1 PCR-positive persons, three quarters were female compared to about 58% of the total study population. We previously reported more SFV infections in females in this population, including a woman with concurrent HTLV-1 infection, and identified an association of SFV seropositivity with butchering and skinning NHPs, which parallels the reported association of SFV infection with severe NHP bites in NHP hunters in Cameroon [33,36,40]. Some of these SFV-infected Cameroonians were also infected with HTLV-1 subtype B and F strains like those found in NHPs from Cameroon.
To better understand HTLV-1 transmission dynamics in our population, we conducted Bayesian phylogenetic cluster analyses. While most clusters were pairs, we also identified four clusters with three, five, and eight members each. As with our statistical analysis of HTLV seroreactivity, we found that most clusters consisted of females reporting NHP exposures and frequent forest activity. Importantly, we found that most transmission flowed across households, villages, and health zones. Some of the villages in these clusters were 2 to 118 km apart along both major and local roads, suggesting possible HTLV transmission across long geographic distances and not just within and between proximal households. We only observed transmission within five households, including two male-female pairs, indicating likely sexual transmission, and two wife and child pairs which, based on age and household dynamics, could be indicative of vertical transmission, though the biological relatedness between wives and children was not assessed. Our findings are similar to those reported for HIV-1 in rural Africa that showed clusters of pairs within the same household that were connected to infections in other villages mostly via sexual contact with females [78]. The finding of mostly women in these clusters also suggests more opportunities for vertical transmissions and that we are likely missing important epidemiologic links. This is further supported by our finding of an equal number of singletons (persons not clustering), of which the majority were also women. Nonetheless, our finding of HTLV-1 disseminated across households, villages, and health zones indicates that public health prevention programs at both the local and national levels are needed to interrupt transmission.
As for HIV prevention, increased testing and educational strategies with focused cluster detection and response efforts can help stem the spread of HTLV in these communities and may also help fill the prevention gaps identified here [79].
Phylogenetic analyses showed that our DRC HTLV-1 LTR sequences shared an evolutionary history with those from STLV-1. Most significantly, three STLV-1 LTR sequences from A. nigroviridis (STLV-1ang) from DRC were ancestral to most DRC HTLV-1 sequences from our study [58]. The three STLV-1ang sequences were obtained from captive A. nigroviridis from a zoo in Paris, France [58]. Reportedly, these three monkeys were the only seropositive animals at the zoo and were the offspring of an older dam from Central Africa, but the country was not provided. The habitat range of A. nigroviridis is in swamp forests in the Congo Basin and includes eastern Republic of Congo, western DRC where our study sites are located, and southern parts of the Central African Republic [80,81]. A. nigroviridis is not yet an endangered species because swamp forests are not being logged or cleared, but these monkeys are commonly found in the swamp trees, are easily hunted from boats in the Congo River, and are sold as bushmeat [80,81]. Interestingly, A. nigroviridis has the highest prevalence of infection (36.2%) among seven STLV-1-infected NHPs tested in DRC, suggesting that STLV-1 may be endemic in this species (S1 Table).
However, our phylogenetic results for this large clade of DRC HTLV-1 do not suggest multiple and recent STLV-1 introductions but rather a likely older introduction of STLV-1 that continued to spread in this population after becoming established in humans as a divergent HTLV-1 subtype B infection. This finding is analogous to the cosmopolitan HTLV-1a genotype for which a parental STLV-1 sequence has not yet been identified. Given that all HTLV likely originated from STLV, then HTLV-1a must have originated from a closely related STLV-1 after introduction into humans and then became endemic as for this HTLV-1 clade in DRC. Nonetheless, we did find STLV-1 sequences from apes and monkeys from DRC and Cameroon that phylogenetically clustered with our DRC sequences, but which were not strongly supported limiting our conclusions for these genetic relationships. Inclusion of additional STLV-1 sequences from DRC may help to resolve the origins of HTLV-1 in this population. Similar results have been reported in central Africa (Cameroon and Gabon) that showed the HTLV-1 in persons with severe NHP bite exposures did not always share a direct evolutionary history with STLV-1 including those from the same region and in persons with dual SFV infection [32,33,40]. However, our results are supported by the lack of an association of NHP exposures and HTLV WB positivity in our study suggesting community spread from person-to-person versus multiple primary zoonotic infections. While bonobos (Pan paniscus) in DRC are the only STLV-2-infected NHPs identified to date (S1 Table), and three persons in our study had positive HTLV-2 WB profiles, we did not observe any confirmed PTLV-2 infections in our study population [82]. Likewise, we did not find any HTLV-3 or HTLV-4 infections although STLV-3 is endemic in various monkeys in DRC but at lower prevalences than STLV-1 (S1 Table) [46].
We did, however, find evidence for dual SFV and HTLV-1 infection in three persons in our study [36]. Although all three persons (MPX8223, MPX21044, MPX40224) were infected with SFV most similar to that of NHPs endemic to DRC, including SFVcan from Angolan colobus and SFVasc from C. ascanius (red-tailed guenon) monkeys, all three HTLV-1s from these persons were within the large cluster of DRC sequences from our study that are potentially descendant from STLV-1ang. MPX8223 was also infected with the unique HTLV-1 strain identified in our study with an 11-bp deletion in the LTR region, though this is likely unrelated to their SFV infection. Recently, it was shown that STLV-1 co-infection is associated with increased blood SFV pVLs, and the authors showed that the STLV-1 tax protein can transactivate the SFV LTR to increase its replication [83]. While little is known about pVLs in dually infected humans, SFV pVLs in these three persons were within the range reported for SFV-infected humans and NHPs, suggesting that their dual infections with HTLV-1 may not have affected their SFV pVL [36,84,85]. However, the HTLV-1 pVLs for MPX8223 and MPX21044 were both greater than 1%, which, as described below, can indicate risk for HTLV-1-associated disease. Cross-sectional studies such as ours cannot discriminate which retrovirus infection, SFV or HTLV, occurred first in dually infected persons.
Previous studies have estimated that elevated HTLV-1 pVLs can be indicative of progression to disease or are found in persons with HTLV-1 disease, including ATLL and HAM/TSP, and can also increase the risk for person-to-person transmission. For example, persons with ATLL and HAM/TSP have much higher pVLs compared to asymptomatic carriers, and pVLs can also help predict disease progression in infected carriers [86]. Higher pVLs have also been associated with shorter survival times in ATLL patients [87]. A meta-analysis for HAM/TSP patients from the UK and US showed all had pVLs > 1% in PBMCs (> 100 proviral copies/10^4 PBMCs), though a definitive cutoff has not been established [88,89]. Nearly 66% of HTLV-1-infected persons in our study had a mean percentage of infected cells > 1% (range 1.1-73.2%), of which 69% were females. Interestingly, more women (n = 5) than men (n = 1) had pVLs > 10% (range 11.7-73.2%), though differences in the mean percentage of infected PBMCs by gender and age were not significant despite more women testing PCR-positive in our study. While we did not record health status for our participants, our results indicate a large proportion may be susceptible to HTLV-1-associated diseases, supported by previous studies that identified clusters of HAM/TSP in DRC [72,73]. Indeed, the two women with extremely high percentages of HTLV-infected cells (71.2 and 73.2%) are more similar to the high pVLs seen in persons with ATLL than to those with HAM/TSP, which in one study had a mean of 50.3% median pVLs compared to 14.7%, respectively [86]. Our high pVL results are less than 100%, suggesting that they are not due to multiple HTLV-1 integrations per PBMC, which could complicate interpretation of the results and their potential association with transmission and/or disease [86].
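The equivalence between a pVL of 1% and 100 proviral copies per 10^4 PBMCs can be illustrated with a short calculation. The following is a minimal sketch and not the assay calculation used in this study: the function names are hypothetical, and it assumes one provirus per infected cell, so that copies per cell equals the fraction of infected cells. The 1% default is only the non-definitive value discussed above.

```python
def pvl_percent(proviral_copies: float, cells: float) -> float:
    """Percentage of infected cells.

    Assumption (illustrative): one provirus per infected cell.
    """
    if cells <= 0:
        raise ValueError("cells must be positive")
    return 100.0 * proviral_copies / cells


def exceeds_threshold(pvl: float, threshold_percent: float = 1.0) -> bool:
    """True when a pVL (in percent) is strictly above the given threshold.

    The 1% default is not an established clinical cutoff.
    """
    return pvl > threshold_percent


if __name__ == "__main__":
    # 100 copies per 10^4 PBMCs corresponds to 1% under the assumption above.
    print(pvl_percent(100, 10_000))
```

Under this assumption a value above 100% would imply more than one provirus per cell, which is why values below 100% are consistent with single integrations.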
We did not find a clear association of pVLs in potential transmission pairs or clusters identified in our study, except for the cluster containing LTR sequences with the 11-bp deletion, which had the highest mean percentage of infected cells (11.9%) for clusters larger than two persons. This finding may reflect that the 11-bp deletion provides a viral replication advantage, though the deletion occurs before the first transcription enhancer element in the LTR. It is possible the deletion changes the secondary structure of the LTR to increase replication, though additional experiments are required to test this hypothesis. It should also be noted that this eight-person cluster contains the person with the highest percentage of infected cells, which positively skews the mean percentage of infected cells in this group.
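The effect of a single very high value on a cluster mean can be shown with a small numerical sketch. The values below are invented for illustration and are not study data; the helper name `summarize` is likewise hypothetical. The sketch compares the mean and median of a small group with and without one extreme value.

```python
from statistics import mean, median

# Hypothetical pVL percentages for one cluster; NOT study data.
cluster_pvls = [1.0, 2.0, 3.0, 4.0, 70.0]


def summarize(values):
    """Return the mean and median of a list of pVL percentages."""
    return {"mean": mean(values), "median": median(values)}


if __name__ == "__main__":
    print(summarize(cluster_pvls))       # with the extreme value
    print(summarize(cluster_pvls[:-1]))  # without the extreme value
```

In this made-up example the mean shifts far more than the median when the extreme value is included, which is the kind of skew described for the eight-person cluster.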
Overall, our epidemiologic findings were consistent with previous PTLV studies in DRC and reflect the challenges of studying a low incidence disease in participants with overlapping exposures to multiple animals. We found that rates of seroreactivity increased with each increasing age stratum, suggesting continued exposure to PTLVs over the life course. Despite this, we were unable to detect a relationship between the nine individual NHP species included in our study questionnaire and HTLV seropositivity. Previous studies of PTLVs in Central Africa have found a near ubiquitous exposure to NHP, whereas our study found only 65% of participants reporting an NHP exposure in the past month [32]. This distinction might be explained by regional practices of Sankuru, our classification system, which focused on exposures from the previous month only, or other measurement errors arising from NHP misidentification by participants. We aimed to limit potential bias in NHP exposure reporting by providing pictorial representations of all species included in our study, rather than relying on name-based identification. Nonetheless, we also did not find any evidence of recent STLV-1 infection in our population despite high NHP exposure and previous identification of SFV infection in this same group. In the inferred phylogeny, recent infection of persons from DRC in our study with STLV-1 would have been indicated by a direct link to an STLV-1, i.e., an HTLV-1/STLV-1 pair. Rather, our finding of only HTLV-1 infection in our study that is most closely linked to other HTLV-1 from DRC likely suggests a more evolutionarily distant cross-species transmission that has since become established in this area.
Our study had several limitations. First, blood transfusions and injection drug use, two possible pathways for horizontal transmission, were excluded from the questionnaire due to their low prevalence in the community. Due to challenges of collecting robust biological samples among young children ages 0-5, this age group was underrepresented in the analytical sample. This may have made our observed prevalence of HTLV appear higher in the study population as the risk of viral infection was found to increase with age (lifetime exposure). Additionally, our capture of household family trees and completeness of biological data was based on convenience sampling of all household members present during sample collection and may likely explain the over-representation of women in our study and missing potential transmission linkages. Alternatively, some of the singletons could result from older infections in which the transmission link is lost or could reflect dead-end infections. We were also unable to obtain HTLV sequences and pVLs from all seroreactive persons which could have uncovered additional potential transmission linkages or phylogenetic relatedness to HTLV and STLVs and helped to further understand transmission and pathogenicity in our study population. The negative PCR results in these seroreactive persons could reflect low proviral loads, sequence divergence at the PCR primer binding sites, false-reactive serology results, and/or other factors. Indeed, commercial HTLV serology assays that only include HTLV-1 and HTLV-2 antigens have limited validation for detecting STLV-2, -3 and -4 which may limit their sensitivity for detecting these divergent viruses [47,90]. Nonetheless, we PCR-tested all WB reactive samples with PTLV generic tax PCR primers and did not find HTLV-2, -3 or -4 in our study population. 
The addition of HTLV-3 and -4 specific antigens to existing serologic assays could help improve detection of these variants, but the cost must be weighed against the public health benefits of doing so. In Sankuru, defining family relations presented challenges, particularly due to customs of polygamy and arrangements for which certain wives and their children may live in separate physical spaces overseen by a single head of household. In addition, due to the nature of household role and its inextricable link with both age and sex, a full explanatory model of the relationship between these three factors and HTLV serostatus could not be tested. This necessarily limits our understanding of how social structures, familial duties, and household clustering may impact person-to-person transmission of HTLV in the study population. Deep investigation into social networks within and between villages would be needed to trace transmission events with greater certainty, but our sequence analyses suggest evidence of some human-to-human transmission events consistent with the epidemiology of HTLV. Finally, although we identified high HTLV-1 pVLs in PBMC specimens from persons without a reported clinical diagnosis of disease, recent studies suggest that patient testing should also include pVL testing of cerebrospinal fluid or tissues from persons with suspected HAM/TSP or ATLL-lymphoma for diagnosis confirmation [86]. Furthermore, additional studies are required to determine pVL cutoffs to distinguish asymptomatic carriers from persons with HAM/TSP and ATLL and to standardize variation between assays as suggested [91].
Without the ability to determine a specific species or activity of concern, public health risk communication for zoonoses remains a challenge. Nonetheless, our results provide insight into the spread of HTLV-1 within and across distant villages, which requires clear HTLV-1 prevention communication and effective strategies at both the local and national levels as proposed to help eradicate HTLV-1 infection [3]. Further research is required to understand why, if exposure is constant across the life course and occurring at a high rate, only some individuals become seropositive for PTLVs. Future work should involve molecular typing of HTLV and STLV strains to help reveal zoonotic transmission links and further explore person-to-person transmission risks.

S1 Table. Animals in the Democratic Republic of Congo included in the study participant questionnaire. (DOCX)
S1 BEAST. Phylogenetic analysis BEAST xml file that includes the study sequences and parameters used in the analysis. (XML)