Water, Sanitation and Hygiene (WASH) and environmental risk factors for soil-transmitted helminth intensity of infection in Timor-Leste, using real time PCR

Background No investigations have been undertaken of risk factors for intensity of soil-transmitted helminth (STH) infection in Timor-Leste. This study provides the first analysis of risk factors for intensity of STH infection, as determined by quantitative PCR (qPCR), examining a broad range of water, sanitation and hygiene (WASH) and environmental factors, among communities in Manufahi District, Timor-Leste.
Methods A baseline cross-sectional survey of 18 communities was undertaken as part of a cluster randomised controlled trial, with additional identically-collected data from six other communities. qPCR was used to assess STH infection from stool samples, and questionnaires administered to collect WASH, demographic, and socioeconomic data. Environmental information was obtained from open-access sources and linked to infection outcomes. Mixed-effects multinomial logistic regression was undertaken to assess risk factors for intensity of Necator americanus and Ascaris infection.
Results 2152 participants provided stool and questionnaire information for this analysis. In adjusted models incorporating WASH, demographic and environmental variables, environmental variables were generally associated with infection intensity for both N. americanus and Ascaris spp. Precipitation (in centimetres) was associated with increased risk of moderate-intensity (adjusted relative risk [ARR] 6.1; 95% confidence interval [CI] 1.9–19.3) and heavy-intensity (ARR 6.6; 95% CI 3.1–14.1) N. americanus infection, as was sandy-loam soil around households (moderate-intensity ARR 2.1; 95% CI 1.0–4.3; heavy-intensity ARR 2.7; 95% CI 1.6–4.5; compared to no infection). For Ascaris, alkaline soil around the household was associated with reduced risk of moderate-intensity infection (ARR 0.21; 95% CI 0.09–0.51), and heavy-intensity infection (ARR 0.04; 95% CI 0.01–0.25). Few WASH risk factors were significant.
Conclusion In this high-prevalence setting, strong risk associations with environmental factors indicate that anthelmintic treatment alone will be insufficient to interrupt STH transmission, as conditions are favourable for ongoing environmental transmission. Integrated STH control strategies should be explored as a priority.



Introduction
Surprisingly little evidence convincingly demonstrates the benefits of water, sanitation and hygiene (WASH) interventions on reducing soil-transmitted helminth (STH) infections [1,2]. Yet it is widely believed that WASH improvements together with anthelmintics could break STH transmission cycles in settings in which anthelmintics alone are insufficient [3,4]. There has been inadequate epidemiological investigation of the role of improved WASH in reducing the STH burden, but there is a growing need for evidence to enable more effective investment in WASH and integrated strategies for STH control.
Intensity of STH infection is important to assess in epidemiological analyses. STH are highly aggregated in humans, with a small number of people harbouring large numbers of helminths, and the majority harbouring few or none [5]. As with prevalence, intensity of worm burden is marked within various groups of the community such as different age groups and gender [6]. This well-described phenomenon is a key feature of this macroparasite relationship with the human host. For quantitative investigations it is therefore problematic to use solely prevalence of infection as an indicator of STH burden or transmission, because large changes in intensity may only be accompanied by small changes in prevalence [6]. STH do not reproduce within the host; infection intensity depends on the time and extent of exposure [7]. Where STH are endemic, maximum worm intensity usually occurs at ages five to ten for Ascaris lumbricoides and Trichuris trichiura, and in adolescence or early adulthood for hookworm [6]. Whilst the reasons for this are unknown, it may be due to behavioural and social factors, nutritional status, genetic and immunological factors [5,8-11]. There is evidence that some individuals are predisposed to heavy or light STH infections [9,12]. Intensity of T. trichiura infection reacquired by an individual after treatment has been found to be significantly correlated with the intensity of infection prior to treatment [13]. Additionally, intensity of infection with STH has been identified as substantially greater when any of the species occurred in combination with one or more of the others [14], probably also due to exposure, genetic and immunological factors, which could then act in determining risk of associated morbidities. Despite this knowledge, there is much focus on the use of prevalence to measure STH infection endemicity.
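The statement that large changes in intensity may be accompanied by only small changes in prevalence can be illustrated numerically. The sketch below assumes that worm counts per host follow a negative binomial distribution, a model commonly used for aggregated macroparasites, with mean burden m and aggregation parameter k. The function name and the values of m and k are illustrative assumptions and are not estimates from this study.

```python
def prevalence(mean_burden: float, k: float) -> float:
    """Expected prevalence when worm counts are negative binomial.

    For mean m and aggregation parameter k, P(no worms) = (1 + m/k) ** (-k),
    so prevalence is one minus that quantity.
    """
    if mean_burden < 0 or k <= 0:
        raise ValueError("mean_burden must be >= 0 and k must be > 0")
    return 1.0 - (1.0 + mean_burden / k) ** (-k)


# Hypothetical aggregation parameter (small k means strong aggregation).
K_ASSUMED = 0.3

if __name__ == "__main__":
    for m in (5, 10, 20, 40):
        print(f"mean burden {m:>2}: prevalence {prevalence(m, K_ASSUMED):.2f}")
```

Under these assumed values, doubling the mean burden from 10 to 20 worms moves prevalence only from about 65% to about 72%, which is why prevalence alone can hide substantial differences in intensity.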
The relationships between intensity of STH infection and risk factors have been inadequately explored, yet could provide useful information as to why intensities differ by host age, environment, and helminth species. Because a key feature of the STH life cycle is the soil-dwelling stage, STH survival, development and transmission potential all rely on a complex assortment of environmental, social, behavioural and host factors. Therefore, in addition to investigating associations between WASH and STH, community-based associations must be considered within their environmental context [15,16]. Although more evidence is required, STH associations with WASH have been systematically appraised [2]. Studies have additionally identified temperature, rainfall, soil porosity and pH, vegetation and elevation ranges as influencing N. americanus larval development and STH transmission [16,17]. We have previously separately reported on WASH [18] and environmental [19] risk factors for STH prevalence in Manufahi District, Timor-Leste. Given exposure-related risks, and associations between heavy-intensity infection and morbidity, this analysis was conducted to investigate whether WASH- and environmental-related risk factors in this district may also be associated with infection intensity, using categories derived from quantitative PCR (qPCR), a highly sensitive and specific diagnostic technique [20]. By combining data on both WASH and environmental risk factors this analysis provides a more complete picture of risks and thereby augments the current knowledge of risk factors for STH in Timor-Leste. Knowledge of WASH risk factors will be used to inform control strategies in this country. Whilst many environmental risk factors may not be modifiable, the inclusion of these factors will enable targeting of control strategies to areas of greatest need. This is one of very few extensive investigations of combined WASH and environmental risk factors for STH undertaken.
It is additionally the first epidemiological analysis of risk factors undertaken using categorised intensity of STH infection from qPCR.

Results
From 24 communities, 2827 eligible people provided baseline survey data, of whom 2152 participants (1038 males, 1114 females) both completed an individual questionnaire and provided a stool sample and were included in this analysis (Table 1, [18]). Using our infection intensity cut-points, more than half (52%) of participants had heavy-intensity N. americanus infection; 10% had heavy-intensity Ascaris infection (Table 1). There was very low prevalence of water or sanitation infrastructure, and most households owned few assets. Most heavy-intensity Ascaris infection occurred in children (Fig 1). Heavy-intensity N. americanus infections were more spread across age groups (Fig 2). Heavy-intensity Ascaris infection varied significantly by socioeconomic quintile (P = 0.012); N. americanus infection intensity did not (P = 0.468).

Notes: (a) Baseline WASH and environmental risk factors for this population have previously been reported [18,19]. Parasitological outcomes determined by PCR, types of household latrines observed by interviewer, remaining WASH data are self-reported. Household water source reported at household (N = 594) not individual level. Education level asked of adults only (N = 1090). "Improved" household water source as defined by WHO/UNICEF Joint Monitoring Programme (JMP) for Water Supply and Sanitation to include piped water into dwelling or yard, public tap or standpipe, tubewell or borehole, protected dug well, protected spring [33]. "Unhygienic toilet" defined as any people who did not use a hygienic toilet (this included people who used a mixture of hygienic and non-hygienic toilets; hygienic toilets defined as use of a house/school/village/neighbour toilet and nothing else). "Always wearing shoes" was contrasted to sometimes/never wearing shoes.

Factors associated with N. americanus intensity of infection
Environmental factors were associated with N. americanus infection ( increasing from 3.2 to 9.6; see Table 2); however, this was less evident for moderate-intensity infection (with the exception of being aged 65 years or older having four-fold increased risk of infection; ARR 4.4, 95%CI 1.6, 11.9). For males, relative to no infection, being aged 18 to 64 years was significantly associated with more than three-fold increased risk of any intensity infection (moderate-intensity ARR 3.3, 95%CI 1.3, 8.7; heavy-intensity ARR 3.6, 95%CI 1.8, 7.3). Sex in participants aged one to five years (i.e. reference group) was not associated with intensity of infection. A gradient of generally increasing risk of moderate- and heavy-intensity infection was also evident with worsening socioeconomic quintile (being significant across most subgroups for heavy-intensity), with people in the poorest quintile having more than twice the risk of infection for both intensity levels (moderate-intensity ARR 2.0, 95%CI 1.1, 3.7; heavy-intensity ARR 2.2, 95%CI 1.3, 3.6).
Few associations were found between WASH variables and STH outcomes in adjusted analyses. Of note is that a shared piped water supply was associated with strongly reduced risk of heavy-intensity infection compared to an unprotected stream (ARR 0.32, 95%CI 0.12, 0.84), and use of surface water was associated with twice the risk of moderate-intensity infection compared to an unprotected stream (ARR 1.9, 95%CI 1.1, 3.2). Boiling household water was associated with half the risk of moderate-intensity N. americanus infection compared to not boiling water (ARR 0.52, 95%CI 0.34, 0.80). Having one preschool-aged child in the household was protective against heavy-intensity N. americanus infection (ARR 0.57, 95%CI 0.40, 0.82). For moderate-intensity infection having one preschool-aged child in the house was not significant (ARR 0.81, 95%CI 0.52, 1.3), but having more than one was associated with reduced risk (ARR 0.57, 95%CI 0.34, 0.94). Reporting three or more bowel motions during the previous 24 hours (indicating diarrhoea) was associated with reduced risk of heavy-intensity infection compared to reporting fewer than three bowel motions (ARR 0.40, 95%CI 0.17, 0.96). Neither reported access to anthelmintic drugs nor reported deworming treatment within the previous 12 months was associated with risk of infection in adjusted models, despite these factors being highly significant in univariable analysis for heavy-intensity infection. Methods of post-defecation anal cleansing, and shoe wearing, all of which were highly significant in univariable analyses for heavy-intensity infection, did not emerge as risk factors in adjusted analyses.
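As a reading aid for the ratios above, the following sketch recovers approximate log-scale quantities from a reported ARR and its 95% CI. It assumes the intervals are Wald-type, that is, symmetric on the log scale; the method used to compute the published intervals is not stated here, so this is an assumption, and the function name is illustrative. Because the inputs are rounded published figures, the resulting p-values are approximate.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_scale_summary(arr: float, lower: float, upper: float) -> dict:
    """Approximate log-scale summary from a ratio and its 95% CI.

    Assumes a Wald interval: log(ARR) +/- Z_95 * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(arr) / se
    # Two-sided normal p-value expressed with the complementary error function.
    p_approx = math.erfc(abs(z) / math.sqrt(2))
    return {
        "se_log": se,
        "z": z,
        "p_approx": p_approx,
        # For a Wald interval the point estimate sits at the geometric midpoint.
        "geometric_midpoint": math.sqrt(lower * upper),
        "excludes_one": lower > 1.0 or upper < 1.0,
    }


if __name__ == "__main__":
    # Shared piped water, heavy-intensity infection (values from the text).
    print(log_scale_summary(0.32, 0.12, 0.84))
    # One preschool-aged child, moderate-intensity infection (values from the text).
    print(log_scale_summary(0.81, 0.52, 1.3))
```

An interval that excludes 1 corresponds to significance at the 5% level, and a point estimate close to the geometric midpoint of its interval is consistent with the log-scale symmetry assumed here.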

Factors associated with intensity of Ascaris infection
Factors significantly associated with Ascaris infection were age, and environmental variables, particularly alkaline soil and elevation above sea level.

Table notes: [...] [33], with the exception of "piped water" which was grouped due to low observation numbers.
Definitions: "Used unhygienic toilet" any people who did not use a hygienic toilet (this included people who used a mixture of hygienic and non-hygienic toilets; hygienic toilets defined as use of a house/school/village/neighbour toilet and nothing else). "Other household toilet type" indicates hanging latrines (low observation numbers). "Household rubbish disposed of by other method" includes disposing it into a bin, a river, burying it or composting it. "Village rubbish disposed of by other method" includes burying it or disposing of it in the river.
Reference categories: General domain: lowest age group in the stratum (age 1-5 years); female sex. Individual hygiene domain: uses soap/ash to wash hands; washes hands after defecation only; always wears shoes outside house; always wears shoes when toileting. Individual sanitation domain: uses hygienic toilet only; household has no toilet; cleans self with leaves only after toileting; village has no public toilet. Household sanitation domain: household toilet being a pit latrine with slab; household toilet observed (by interviewer) to be dirty; household rubbish disposed of in bush only. Household water supply domain: main water supply being an unprotected spring; main water supply located separate from household compound; main water supply always running; household water not stored; no secondary water source used; secondary water supply used was unimproved (according to JMP definitions [33]) or no answer provided; household water is not boiled.
Interpretation notes: Because this table has an interaction term, parameterisation and correct interpretation in the adjusted model are as follows: (b) the age group main effect is the stated age group (e.g. six to 11 years) relative to age group one to five years in females because females are the reference group (i.e. demonstrating the relationship between age group and N. americanus infection intensity in females). Similarly, (c) the male sex main effect is being male relative to being female in age group one to five years (reference group). Males in age groups six to 11 years and older are relative to males in age group one to five years (because males and older age groups are not the reference groups).
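The interpretation notes for the age-by-sex interaction can be made concrete with a small worked example. The ratios below are hypothetical values chosen for illustration, not figures from the tables. The sketch shows one common way to derive stratum-specific contrasts from main effects and an interaction term on the log scale; whether the tables report the raw interaction term or an already-combined contrast is not stated here.

```python
import math

# Hypothetical relative risk ratios for one intensity category (illustrative only).
RR_AGE_IN_FEMALES = 2.0    # (b) age 18-64 vs 1-5 years, among females
RR_MALE_IN_YOUNGEST = 1.2  # (c) male vs female, among those aged 1-5 years
RR_INTERACTION = 1.5       # age-by-sex interaction term for males aged 18-64


def combine(*ratios: float) -> float:
    """Multiply ratios by summing their logs, as the model does on the log scale."""
    return math.exp(sum(math.log(r) for r in ratios))


# Males aged 18-64 relative to males aged 1-5.
rr_age_in_males = combine(RR_AGE_IN_FEMALES, RR_INTERACTION)
# Males relative to females, both aged 18-64.
rr_male_in_adults = combine(RR_MALE_IN_YOUNGEST, RR_INTERACTION)
# Males aged 18-64 relative to the overall reference group (females aged 1-5).
rr_vs_reference = combine(RR_AGE_IN_FEMALES, RR_MALE_IN_YOUNGEST, RR_INTERACTION)

if __name__ == "__main__":
    print(round(rr_age_in_males, 2), round(rr_male_in_adults, 2), round(rr_vs_reference, 2))
```

With these assumed inputs the three contrasts are 3.0, 1.8 and 3.6, which shows why a main effect on its own describes only the reference stratum of the other variable.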

Discussion
This analysis presented the first investigation of combined WASH, environmental and demographic factors for intensity of STH infection in Timor-Leste. Using PCR-derived intensity of infection categorisation, similar infection intensity profiles to previous epg-based profiles [25] were found for each of Ascaris and N. americanus, with the most intense Ascaris infections in children, declining intensity and prevalence in adulthood, and prevalence and intensity of N. americanus being high in both childhood and adulthood. For N. americanus, heavy-intensity infections occurred in older age groups, although at low proportions. Whilst current risk factor models were not separately analysed by age groups, these results are in agreement with previous findings of different age-specific risk factors for different STH species in the study area [18]. This highlights the potential role of exposure-related risk factors, although other factors, such as acquisition of some level of immunity, may play a role [25]. It has previously been hypothesised that sex and age associations with STH are strongly related to exposure-associated behaviours [26]. Females showed a highly significant, increasing gradient for risk of heavy N. americanus infection with increasing age. Although less significant, a gradient was also evident for moderate-intensity N. americanus infection. Whilst overall male sex in those aged one to five was not a significant risk factor for N. americanus infection intensity compared to females of this age group, there was again an observation of greater heavy-intensity infection in males aged 18 to 64 relative to males aged one to five. These observations suggest that there are additional age- and sex-related factors occurring. This may include age as an expected indicator of time-accumulation given STH do not multiply in the host [14] and the longevity of N. americanus [27].
Alternatively, there could be exposure behaviours in older adults (compared to children) that are important to identify as they may be amenable to modification. There could also be differences in host immunity, particularly at different ages. Increased animal and soil contact through agricultural activities represents a direct potential transmission pathway (particularly in males) that requires further exploration. Further investigation into the female-age group association with heavy-intensity infection needs to be undertaken; this could reflect particular household-related practices undertaken by women but not men. Further activities, such as constructing daily activity diaries, would be valuable to enable further insights in this setting. Alternatively, the findings of different sex and age patterns may be indicative of other factors such as host genetics [26].
Mixed-effects multinomial models were used to investigate the statistical relationship between intensity of STH infection, and WASH and environmental risk factors, whilst accounting for heterogeneity within village and household random effects. The lack of autocorrelation identified in semivariograms after accounting for large-scale environmental trends indicates that environmental variables explained the majority of spatial correlation in the data [19]. In our adjusted models incorporating WASH, demographic and environmental variables, environmental variables were generally associated with the greatest ARRs for infection intensity for both N. americanus and Ascaris spp. Precipitation was associated with increased risk of N. americanus infection of any intensity, but not Ascaris intensity. It is important to note that the precipitation variable included in these analyses was derived from 50-year averaged data from the driest month [19]; it is not reflecting seasonality, which could have had an impact on N. americanus survival rates in the soil [17]. Seasonal fluctuations could affect transmission potential but are not considered likely to have a strong influence on infection patterns given the longevity of N. americanus in the human host [17].
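To make the structure of such a model concrete, the sketch below shows how log relative-risk coefficients and random intercepts translate into probabilities of no, moderate-intensity and heavy-intensity infection, with no infection as the reference category. All coefficient values and covariate names are hypothetical, and adding the same village and household intercepts to each non-reference category is an assumption, since the exact random-effects structure is not described in this section. The sketch does not fit a model.

```python
import math

# Hypothetical log relative-risk coefficients per non-reference category.
# Illustrative values only; they are not estimates from this analysis.
COEFS = {
    "moderate": {"intercept": -1.5, "precip_cm": math.log(1.8), "sandy_loam": math.log(1.4)},
    "heavy": {"intercept": -1.0, "precip_cm": math.log(2.0), "sandy_loam": math.log(1.6)},
}


def linear_predictor(category, covariates, u_village=0.0, u_household=0.0):
    """Fixed effects plus village and household random intercepts (log scale)."""
    beta = COEFS[category]
    fixed = beta["intercept"] + sum(beta[name] * value for name, value in covariates.items())
    return fixed + u_village + u_household


def category_probabilities(covariates, u_village=0.0, u_household=0.0):
    """Probabilities for each outcome; 'none' is the reference with weight 1."""
    weights = {"none": 1.0}
    for category in COEFS:
        eta = linear_predictor(category, covariates, u_village, u_household)
        weights[category] = math.exp(eta)
    total = sum(weights.values())
    return {category: weight / total for category, weight in weights.items()}


if __name__ == "__main__":
    base = category_probabilities({"precip_cm": 1.0, "sandy_loam": 0})
    wetter = category_probabilities({"precip_cm": 2.0, "sandy_loam": 0})
    # The heavy-to-none ratio changes by exp(coefficient) per unit of the covariate.
    print((wetter["heavy"] / wetter["none"]) / (base["heavy"] / base["none"]))
```

The printed ratio equals the assumed relative risk ratio for the covariate (2.0 here), which is the quantity an ARR summarises: the multiplicative change in the risk of a given intensity category relative to no infection.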
High rainfall contributes to suitably moist conditions for eggs and larvae to survive in soil, including the propensity for N. americanus larvae to remain near the soil surface and thus be available for human infection [17], but for Ascaris, excess rainfall may have negative impacts, possibly because the eggs sink lower in the soil as rainfall drains away. Our analysis showed strong associations between sandy-loam soil and highly increased risks of N. americanus infection, yet conversely, no significance in adjusted models for Ascaris spp. Observational associations between hookworms and sandy soil have been reported since the early 1900s (reviewed in [17]). Significance of soil type and rainfall likely reflects an important difference in life cycles and transmission potential between these two STH. N. americanus survive in the external environment as motile ensheathed larvae, but Ascaris spp. are present as (non-motile) eggs. Large-particle "sandy" soil tends to be less dense, which aids both larval motility and water drainage during and after rainfall; it is therefore more amenable to N. americanus larval survival [17,26] and subsequent transmission potential. Ascaris eggs, on the other hand, are more susceptible to extremes, being able to desiccate in dry soil and to retard development in extremely wet soils [28]; this supports the lack of association between Ascaris spp., sandy soil and precipitation in this analysis. These factors, plus the shorter developmental time to infectivity in the soil of N. americanus compared to Ascaris [16], may contribute to the considerably greater prevalence of N. americanus.
We have previously reported on a protective association observed between alkaline soil type and Ascaris infection [19]; in this current analysis there was some evidence of a gradient of increasing infection intensity, although numbers were low and this finding therefore requires verification. Other studies use soil acidity data in spatial analyses [29,30], with one study reporting associations between acidic soil and increased infection risk ([30]; although this study used categories of pH that were all considered acidic compared to our definitions which defined neutral soil type as pH 6.6 to 7.3 [31]). Generally, soil acidity information is still rarely collected, yet this is an important potential determinant that could vary with precipitation, and other ecological or land use factors. Further analysis of pH ranges in epidemiological studies will contribute to knowledge of the optimal conditions for survivability of these helminths.
Differences in motility and survivability also potentially explain the direct association between increased elevation and Ascaris intensity of infection, with downhill runoff and draining after rainfall potentially facilitating survivability (and hence transmission) of those Ascaris eggs that remain in soil at higher elevations (i.e. those that do not get washed away); it would be plausible that Ascaris eggs that are washed downhill may be washed into rivers and streams, or lie within saturated environments that are less conducive to development. For N. americanus this was an inverse relationship for heavy-intensity infection; the protective association seen from elevation may reflect lower temperatures at higher elevations (as temperature was not included in multivariable analyses due to its high correlation with elevation). Negative correlations between hookworms and elevation, and, less consistently, positive correlations between A. lumbricoides and elevation, have previously been reported (reviewed in [17]).
Given high STH prevalence, poverty and poor existing WASH infrastructure [18], and the large quantity of risk factors investigated, few WASH risk factors have emerged in these analyses. Homogeneously poor access to improved WASH resources in study communities would limit our ability to find major associations [18], and is the most likely explanation for this. No significant WASH risk factors for Ascaris intensity of infection were found in adjusted models. For N. americanus, the protective association with boiling water against moderate-intensity infection is slightly surprising. There is evidence that N. americanus larvae can survive and remain infective for several days in water (decreasing with duration of water exposure) [32]. Whilst there is negligible published evidence for N. americanus infection via ingestion, this finding points to faecal contamination of drinking water sources as a possible exposure pathway.
Water supply effects were also not significantly associated with intensity of N. americanus infection in the expected direction, with different levels of risk between surface water and an unprotected spring; both of which are unimproved water sources [33]. This could possibly be due to location of communities downhill from springs (thus positioned for gravity-fed flow), whilst communities may be generally uphill from surface water. There may additionally be a greater tendency for people to remove footwear when going to surface water, compared to (potentially smaller) springs. Alternatively, this may reflect heavy N. americanus contamination in the vicinity of particular water sources in the study area. Self-reporting error or a misunderstanding of water source definitions used in our study are also possible explanations. The protective effect of shared piped water, but not other 'improved' sources such as tubewells, is of interest and may reflect a heightened level of hygiene awareness in situations where multiple households use the same source, or alternatively, high correlation between some other variable and this one (although confounding and collinearity were investigated). The general lack of WASH associations, particularly with levels of sanitation, is similar to results that we have reported previously for prevalence [18]; however it was not previously clear whether this was because prevalence models were age-stratified, which could have affected power to detect effects. Lack of WASH risk factors therefore most likely reflects homogeneously poor access to WASH infrastructure, with flow-on impacts on amenable hygiene behaviours, in these communities, or a true lack of association with STH in this district. Alternatively, with a multinomial outcome, analyses could have adequate power to detect only moderate-large associations (see limitations).
Prevalence of N. americanus has previously been reported to be significantly associated with low socioeconomic status in this study area [18]. As has previously been identified, socioeconomic status in this community reflects relative poverty that was still measurable within a general setting of poverty [18], and it is interesting that, for N. americanus, slightly higher estimates of association were seen for socioeconomic strata in heavy-intensity relative to moderate-intensity infection. This highlights an advantage of investigating socioeconomic status in defined districts at a high-resolution (i.e. village-level) scale as opposed to national scale; it has been previously reported that between- and within-village heterogeneity may limit the usefulness of socioeconomic proxies in aggregated large-scale analyses [28]. The greater level of detail from this multinomial model provides additional insight into the N. americanus-poverty relationship. An interesting protective effect was the presence in the household of preschool-aged children; possibly this reflects adoption of hygienic behaviours when there are young children to protect them from disease exposures. The finding of reduced risk of heavy infection in people who reported three or more bowel motions is not surprising given that diarrhoea causes dilutive effects on quantities of helminths [34]. Our finding that recent deworming was not significantly associated with infection in adjusted models may be due to self-report error, with possible confusion about medications received.
The risks associated with environmental variables have important implications for STH control. The high rainfall, mountainous, tropical environment combined with high levels of poverty, poor WASH infrastructure and behaviour, and the longevity of STH eggs and larvae survival in soil [16], provides a fertile environment for STH transmission in this district. This is a challenge for helminth control because environmental variables themselves are not modifiable. Despite this, awareness of high-risk factors can influence other activities, primarily hygiene- and sanitation-associated behaviours to manage environmental risks. This provides a strong justification for investment in WASH activities irrespective of their individual statistical significance in risk factor analyses, as this is an exposure-reduction pathway that can potentially be manipulated. As current evidence for hygiene behaviours on STH control is sparse (reviewed in [1,2]), further research on the hygiene behaviours that could have greatest impact in this scenario needs to be undertaken as a priority. This analysis is an important contribution to an ongoing RCT that will assess the benefits of augmented albendazole with WASH for STH control in Manufahi District, Timor-Leste [21]. As well as a detailed understanding of baseline WASH infrastructure and behaviours upon which to benchmark trial-related improvements in WASH, the knowledge of environmental factors is an essential prerequisite for effective targeting of interventions.

Limitations and strengths
This is an observational analysis and, as such, cause and effect cannot be determined. As has been noted previously [18], much of the WASH data collected involved self-report of infrastructure and behaviours. Presence, type and cleanliness of household and village latrines were verified by interviewer observation. Self-reporting is a frequently-encountered drawback of measuring WASH characteristics. Further, extensive heterogeneity in assessing WASH behaviours on STH outcomes makes assessment of WASH characteristics challenging [15,35]. An important research priority is to develop specific WASH measurement guidelines for STH control. Power calculations indicated power to detect low associations for N. americanus and moderate associations for Ascaris infection intensity in multinomial models.
There are particular strengths to this study. This is one of very few epidemiological investigations of risk factors for STH infection intensity; this is particularly important to assess for environmental factors, given the links to STH transmission dynamics and correlations with morbidity. In this paper a community-based risk analysis is presented that combines high-resolution environmental, WASH and demographic variables in adjusted models. Advanced statistical techniques have been used to adjust for multinomial intensity outcomes, dependency of observations, effects of poverty, and confounding from other measured variables. As with all analyses, there is the possibility of residual confounding from unmeasured factors. However, this provides the most comprehensive assessment of STH risk factors that we have identified in any setting.
A further strength is the use of PCR; a highly sensitive and specific technique [20] that is increasingly used for STH diagnosis. PCR-derived intensity of infection categorisation is a recent development, and requires further validation in different epidemiological settings [23]. Notwithstanding the need for further refinement of cut-points, different risk factors for moderate and heavy-intensity STH infections were found in this study area, with some evidence of a scale of increasing risk for factors such as soil type. This contributes useful, and highly relevant, information on risk factors within these communities. Use of infection intensity to determine risk factor associations requires more investigation. In particular, use of prevalence alone could mask significant intensity-related associations. This may mean that key evidence for WASH benefits may be overlooked in epidemiological studies that use prevalence of infection as the outcome. The possibility that WASH significance may be underreported in this way has been inadequately explored.

Conclusion
With intensity of STH infection as the outcome, a comprehensive risk analysis of environmental, WASH and demographic variables is presented for communities in Manufahi District, Timor-Leste. Strong risk associations with environmental variables were identified. However, generally few associations with WASH risk factors were evident. This raises the importance of accurate measurement of WASH, and the need for clear guidelines on measuring WASH in epidemiological research. This result also has important implications for STH control activities. Even in the absence of WASH significance, WASH infrastructure and behavioural-related activities are the only identified mechanism that could reduce or prevent transmission in an environment of high STH transmission potential. In this setting, anthelmintic treatment alone will not interrupt STH transmission; this provides a strong justification for application of integrated STH control strategies in this district.

Ethical approval and consent
This analysis used baseline data from 18 communities in a cluster randomised controlled trial (RCT), supplemented with data from an additional six communities, in Manufahi District, Timor-Leste (Australian and New Zealand Clinical Trials Registry ACTRN12614000680662) [21]. STH have recently been reported as endemic in this community, with prevalence of N. americanus of 60% and Ascaris spp. of 24%, as detected by qPCR [18].
The University of Queensland Human Research Ethics Committee; the Australian National University Human Ethics Committee; the Timorese Ministry of Health Research and Ethics Committee; and the University of Melbourne Human Research Ethics Committee approved the study protocol. Participant informed consent processes included explaining the study purpose and methods, and obtaining signed consent from all adults and parents or guardians of children under 18 years [21]. Children aged less than 12 months were excluded [21].

Study setting, design and collection of data
The RCT commenced in May 2012. Detail on the RCT design is provided in the trial protocol [21]. A baseline survey of 18 communities involved in the RCT, and six additional communities, was conducted between May 2012 and October 2013. All communities surveyed were rural, and agrarian occupations predominated. Manufahi District has terrain varying from flat coastal plains to relatively mountainous inland areas (with elevation exceeding 1100 metres in some communities). It is a tropical region, with very high average rainfall of 190 cm [19] and a wet season extending for close to ten months of the year. The average annual temperature is 24.5 °C [19].
A single stool sample per participant was collected and fixed in 5% potassium dichromate. Multiplex qPCR was used to analyse stool samples for the presence and intensity of STH infection. Details on the qPCR diagnostic method are provided elsewhere [20]. Village, household and individual level questionnaires encompassing a broad range of potential WASH and socioeconomic risk factors were administered by trained field workers [18,21]. Interviewer observation of household and village latrines, their type and cleanliness was undertaken; all other questions were self-reported. Data were collated and entered into a Microsoft Access database and extracted to STATA 13.0 (Stata Corporation, College Station, Texas) for error checking.

Data analysis
Individual-level data were linked to questionnaire and parasitological outcomes and household GPS coordinates [18,21]. Principal component analysis was used to create a wealth index, based on ownership of household assets (animals, transport and appliances), house floor type, reported income, and presence of electricity [18,22]. Using eigenvalues above 1, four principal components were retained and used to produce a final wealth score which was categorised into quintiles of relative socioeconomic status [18].
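The two rule-based steps in the wealth index construction (retaining components with eigenvalues above 1, then categorising the final score into quintiles) can be illustrated with a minimal sketch. The function names, the eigenvalues and the scores below are hypothetical and are not study data; the rank-based quintile rule is an assumption, as the method used to form the quintiles is not described here.

```python
def retain_components(eigenvalues, threshold=1.0):
    """Return the indices of components whose eigenvalue exceeds the threshold."""
    return [i for i, ev in enumerate(eigenvalues) if ev > threshold]


def assign_quintiles(scores):
    """Rank-based quintiles: 1 = lowest fifth of scores, 5 = highest fifth."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    quintile = [0] * len(scores)
    for rank, idx in enumerate(order):
        quintile[idx] = rank * 5 // len(scores) + 1
    return quintile


# Hypothetical eigenvalues from a PCA of household asset variables.
eigenvalues = [2.9, 1.6, 1.2, 1.05, 0.9, 0.7]
kept = retain_components(eigenvalues)

# Hypothetical final wealth scores for ten households.
scores = [0.3, -1.2, 2.0, 0.0, 1.1, -0.4, 0.8, -2.0, 1.5, -0.9]
quintiles = assign_quintiles(scores)
```

With these illustrative inputs, four components are retained and each quintile contains two households.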
Outcome variables were intensity of N. americanus and Ascaris infection, which were analysed separately. Intensity of infection was derived from qPCR DNA cycle threshold (Ct) values and categorised into two groups: (i) heavy-intensity, and (ii) moderate- to light-intensity infection (hereafter called "moderate intensity"), using algorithms generated from seeding experiments to correlate Ct values with eggs per gram of faeces (epg) equivalents. Full detail of this method is provided elsewhere [20,23]. Exposure variables were WASH variables from study questionnaires, grouped into domains of related variables (e.g. household sanitation; household water supply; household hygiene; household socioeconomic status), and environmental variables that were sourced separately. Environmental variables were selected for analysis based on reported prior relationships with STH development [17] and availability via open-access sources. Temperature, precipitation, elevation, soil texture, soil pH, landcover and vegetation data were selected for analysis (Table 4) and processed using the geographical information system ArcMap 10.3 (ESRI, Redlands, CA) [19]. Very few environmental analyses incorporate information on soil texture and soil pH. It was possible to incorporate these variables here because soil surveys were conducted in the study region between 1960 and 1965 [24], and soil type was not considered to have changed dramatically since that time. A range of environmental variables related to the above factors was produced according to long-term average data, seasonal periods, and spatial resolution [19], with household as the data point and a 1 km buffer applied (whereby the median raster value within a 1 km radius of the household was used [19]). Quality checks and exploratory analyses were undertaken to determine the most suitable version of each variable for analysis.
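The 1 km buffer can be illustrated with a minimal sketch. This is not the ArcMap workflow used in the study: the function name, the representation of the raster as cell-centre tuples in projected metres, and the sample values are all assumptions made for illustration.

```python
import math
import statistics


def buffer_median(household_xy, cells, radius_m=1000.0):
    """Median value of raster cells whose centre lies within radius_m of a household.

    household_xy: (x, y) in projected metres (assumed coordinate system).
    cells: iterable of (x, y, value) tuples giving cell centres and raster values.
    Returns None when no cell centre falls inside the buffer.
    """
    hx, hy = household_xy
    inside = [v for (x, y, v) in cells
              if math.hypot(x - hx, y - hy) <= radius_m]
    return statistics.median(inside) if inside else None


# Hypothetical raster cells: three inside the buffer, one 1500 m away.
cells = [(0, 0, 10.0), (500, 0, 20.0), (0, 900, 30.0), (1500, 0, 99.0)]
value = buffer_median((0, 0), cells)
```

Using the median rather than the mean keeps a single extreme cell near the buffer edge from dominating the household-level value.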
Separate assessment of spatial autocorrelation was undertaken using semivariograms of residuals from multivariable models of selected environmental variables, with household and village random effects [19]; no additional autocorrelation was identified [19]. The analysis of environmental covariates in this study was limited to risk factor investigation. Predictive risk maps for STH infection in Manufahi District are published separately [19]. Variables were investigated for multicollinearity according to likely relationships determined from the literature, using tetrachoric analysis and the STATA "collin" user-written package, according to the type of variable. Temperature and elevation were collinear; each variable was analysed in separate univariable models and subsequent variable selection was based on lower Akaike's Information Criterion (AIC), indicating better predictive performance of the model. Chi-squared tests were conducted to compare intensity of infection by age, sex and socioeconomic quintile. Using categorised intensity of infection as the outcome, univariable and multivariable mixed-effects multinomial regression was undertaken, with household and village random effects to account for dependence of observations. Regression analyses were undertaken for N. americanus and Ascaris spp. separately.
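Selection between two collinear variables on the basis of lower AIC can be sketched as follows, using the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and ln L the maximised log-likelihood. The candidate names, log-likelihoods and parameter counts are hypothetical and do not represent the fitted study models.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical univariable models, one per collinear variable:
# name -> (maximised log-likelihood, number of estimated parameters)
candidates = {
    "candidate_1": (-1520.4, 6),
    "candidate_2": (-1524.9, 6),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}

# The variable whose model has the lower AIC is carried forward.
selected = min(scores, key=scores.get)
```

Because both hypothetical models have the same number of parameters, the comparison here reduces to choosing the higher log-likelihood.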
Regression models were not age-stratified due to insufficient numbers for some combinations of outcome and explanatory variables. Univariable regression was undertaken for each risk factor, with inclusion of variables in multivariable regression if they had P<0.2 on the Wald test in univariable analyses. All multivariable models included age group, sex, and socioeconomic quintile as covariates. Forward stepwise variable addition was used, with variables retained if P<0.1 within, then across, domains of variables, until the most parsimonious adjusted model for each outcome was achieved. A categorised age variable, and a sex × age interaction term, were investigated, as the association between sex and the outcome was anticipated to vary by age group. Interactions were investigated by developing models without, then with, the interaction term and comparing these using the likelihood ratio test, with P<0.1 being the inclusion criterion for the interaction. Applying this criterion, the interaction term was retained in the final N. americanus model, but not the Ascaris model. A 5% significance level was used; however, this analysis also reports results significant at up to the 10% level, as these are important for epidemiological interpretation. Analyses were conducted using generalised structural equation models in STATA 14.1 (Stata Corporation, College Station, Texas). Due to uncertainty regarding the linearity of the association between continuous environmental variables and the infection outcomes, quadratic terms were also investigated in all models; however, as none of these quadratic terms were significant in the adjusted models, these results are not presented. Post-analysis power calculations indicated 80% power, with a 5% significance level, to detect relative risks of 1.2 to 1.8 for N. americanus infection intensity (depending on level of intensity), and, reflecting lower prevalence overall, relative risks of 2.7 to 3.9 for Ascaris infection intensity.
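The likelihood ratio comparison of models without and with the interaction term can be sketched as below. The log-likelihoods and degrees of freedom are hypothetical, not values from the study, and the helper names are illustrative. The chi-squared tail probability is computed with the standard library only, for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-squared probability for integer df >= 1.

    Uses the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / gamma(a + 1)
    for the regularised upper incomplete gamma function, with z = x / 2.
    """
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # Q(1/2, z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(ll_reduced, ll_full, df):
    """Return (statistic, P value) comparing nested models.

    df is the number of extra parameters in the full model.
    """
    stat = 2.0 * (ll_full - ll_reduced)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods without and with the sex x age interaction,
# which is assumed here to add four parameters.
stat, p_value = likelihood_ratio_test(ll_reduced=-1500.0, ll_full=-1495.0, df=4)

# Inclusion criterion described in the text: P < 0.1.
retain_interaction = p_value < 0.1
```

With these illustrative inputs the statistic is 10.0 on 4 degrees of freedom, giving P of about 0.04, so the interaction would be retained under the P<0.1 criterion.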