Evaluation of facial cleanliness and environmental improvement activities: Lessons learned from Malawi, Tanzania, and Uganda

The World Health Organization promotes the SAFE (Surgery, Antibiotics, Facial cleanliness, and Environmental improvements) strategy for trachoma control and prevention. The F&E components of the strategy focus on promotion of healthy hygiene and sanitation behaviors. In order to monitor F&E activities implemented across villages and schools in Malawi, Tanzania, and Uganda, an F&E Monitoring and Evaluation (FEME) framework was developed to track quarterly program outputs and to provide the basis for a pre and post evaluation of the activities. Results showed an increase in knowledge at the school and household levels, and in some cases, an increase in presence of hand/face washing stations. However, this did not always result in a change in trachoma prevention behaviors such as facial cleanliness or keeping compounds free of human feces. The results highlight that the F&E programs were effective in increasing awareness of trachoma prevention but not able to translate that knowledge into changes in behavior during the time between pre and post-surveys. This study also indicates the potential to improve the data collection and survey design and notes that the period of intervention was not long enough to measure significant changes.


Trachoma and water, sanitation, and hygiene
Trachoma, the result of ocular infection with the bacterium Chlamydia trachomatis, is the leading infectious cause of blindness worldwide [1]. Over time, repeated infection can cause the inside of the eyelid to become scarred and the eyelid and eyelashes to turn inward. Left untreated, the painful condition of the in-turned eyelashes scraping the cornea can cause the individual to become irreversibly visually impaired or blind. Children have been documented to have higher levels of trachoma infection compared to other age groups, with the prevalence of active trachoma greatest among preschool age children [2][3][4]. The progression of the disease to the advanced blinding stage occurs over time with the onset of visual impairment typically occurring in those aged 35 years and over [5].
Transmission of the bacterium Chlamydia trachomatis can occur when fingers and fomites such as clothes and towels come into contact with infected ocular and nasal secretions from one individual and then contact the eyes of another individual [6]. Flies have also been documented to transmit Chlamydia trachomatis when the fly feeds on the ocular and nasal secretions of an infected person and then lands on the eyes of another person [7]. These same flies are strongly attracted to odors produced by human feces and lay their eggs on exposed feces on the ground [8]. Trachoma frequently affects poor and marginalized populations that typically sleep in close quarters, experience overcrowding, and lack adequate access to water and sanitation [1,9,10]. It has been challenging to determine the relative importance of one route of transmission compared to others, and it is unlikely that any single route of transmission is responsible for all trachoma transmission [6,11]. Despite not knowing the extent that Water, Sanitation, and Hygiene (WASH) programs impact trachoma prevalence within communities, certain human behaviors have been demonstrated to reduce the risk of trachoma transmission. First, reducing exposed human feces, through methods such as using pit latrines, reduces fly breeding grounds [7,8]. Second, avoiding sharing cloth with infected ocular or nasal discharge reduces the transmission of the bacterium between people [11]. Third, keeping one's face clean from ocular and nasal secretions reduces the attractiveness of the face to the flies that feed on the secretions for nutritional purposes while simultaneously transmitting the bacterium [7]. Lastly, increasing communities' sufficient and reliable access to water helps increase the likelihood that residents will be able to keep their faces, clothes, and bedding clean thereby reducing risk [12]. 
It is because of the transmission dynamics and progression of the disease that the World Health Organization (WHO) promotes the SAFE strategy (Surgery, Antibiotics, Facial cleanliness, and Environmental improvements) for trachoma control and prevention [1]. The F&E components of the strategy are focused on promoting healthy hygiene and sanitation behaviors and have a significant overlap with WASH activities. Given that the F&E components are outcomes, there are diverse types of interventions that can be used to achieve these outcomes. The lack of standardization and consistency in how these interventions are designed and implemented presents a challenge in comparing across settings and clearly demonstrating effectiveness.

Selection of F&E interventions
The trachoma control programs in Malawi, Tanzania, and Uganda received funding from The Queen Elizabeth Diamond Jubilee Trust (hereafter referred to as the Trust) for F&E activities, with additional funding provided to Tanzania from the United Kingdom's Department For International Development (DFID) [13,14]. In order to guide decision making about how these funds should be spent the Ministry of Health in each country used a multi-step process. First, a situational analysis of WASH and trachoma programming within each country was conducted in order to understand the partners, resources, and existing WASH, trachoma, and F&E activities already taking place at the regional and district levels [15]. Second, the "All You Need for F&E" International Coalition for Trachoma Control (ICTC) toolkit was used to guide an F&E stakeholder workshop [16]. Representatives from ministries of water and sanitation, health, and education participated in this workshop alongside representatives from NGOs from the WASH, neglected tropical diseases (NTDs), and research sectors. Workshop participants identified a list of F&E activities and then participants narrowed down the number of activities based on available donor funding and what they felt should be prioritized and could be achievable in three years. Table 1 shows an overview of the different activities chosen for implementation by region within each country. A more detailed description of activities

Monitoring and evaluating F&E
An F&E Monitoring and Evaluation (FEME) framework was developed for each country to assist with conducting quarterly monitoring of F&E activities and to provide the basis for a pre and post evaluation of F&E activities. The indicators used within each country's FEME included a combination of WASH and NTD indicators identified during a Delphi consultative process [17] and country specific indicators requested by the Ministry of Health. Country specific FEMEs are provided in supplemental information (S1, S2 and S3 Tables). The FEME can be divided into two parts: 1) the logical framework that includes outcome and output indicators; and 2) program implementation activities. Theoretically implementation of activities leads to the achievement of the desired outputs and outcomes reflected by changes in their indicators. For example, conducting community meetings about trachoma (activity) should result in an increase in the percentage of people who have knowledge of hygiene practices in relation to trachoma prevention (output) which thereby contributes to an increase in the percentage of children with clean faces (outcome). Throughout the life of the Trust funded project within the three countries the F&E activities were reported on a quarterly basis. For measuring progress on achieving outcomes and outputs, each country conducted a pre and post-survey.

Objectives of the paper
The purpose of this paper is threefold: 1) present the methods and results of the pre and post surveys in each of the three countries; 2) discuss challenges and successes of the survey process and indicator measures selected; and 3) provide recommendations based on this experience for implementing F&E monitoring and evaluation mechanisms. The study conducted in Uganda had additional ethical approval from Emory University (eIRB#: IRB00093647). All survey participants gave written informed consent prior to participating; head teachers provided written informed consent on behalf of school children, who individually assented to take part. Assent was documented.

Evaluation units
F&E programmatic intervention regions were used as evaluation units (EUs) with sampling of villages from a sample frame of all intervention villages. Where the scale of interventions was small (Tanzania) multiple regions were grouped into a single EU. For regions where there was both school and community programming the EUs were considered separately (Table 2).
In all three countries, implementation of F&E activities began prior to the pre-surveys being conducted. Pre-surveys were conducted in 2017 and post-surveys conducted in 2018.

Sampling
The sample size calculations were performed in STATA using the "sampsi" function. Sample size was based on the indicator 'percentage of children with a clean face' assuming 50% at baseline and the desire to detect a policy relevant change of 10 percentage points. We used 80% power in all calculations, incorporated a design effect of 2 (villages) and 1.5 (schools) and an anticipated non-response rate of 15%.
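The calculation described above can be approximated outside STATA. The following is a minimal sketch, not the authors' actual "sampsi" call: it assumes a two-sided test at a 5% significance level (the significance level is not stated in this section), omits any continuity correction, and assumes non-response is handled by dividing by (1 - rate). The function names are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Per-survey sample size to detect a change from p1 to p2.

    Two-sided test, normal approximation, no continuity correction
    (an assumption; other software defaults may differ).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return numerator / (p1 - p2) ** 2


def inflate(n, design_effect, non_response):
    """Apply the design effect, then inflate for anticipated non-response.

    Assumption: non-response is handled by dividing by (1 - rate).
    """
    return math.ceil(n * design_effect / (1 - non_response))


# Values taken from the text: 50% baseline, 10 percentage point change,
# 80% power, design effects of 2 (villages) and 1.5 (schools), 15% non-response.
base_n = two_proportion_sample_size(0.50, 0.60)
village_n = inflate(base_n, design_effect=2.0, non_response=0.15)
school_n = inflate(base_n, design_effect=1.5, non_response=0.15)
print(math.ceil(base_n), village_n, school_n)
```

Because the correction and inflation conventions are assumptions, the numbers printed here may differ from the sample sizes actually used in the surveys.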

Community sampling
Household data was collected using a population-based survey following a two-stage cluster sampling methodology. In each community EU all villages targeted for F&E interventions were listed along with their estimated population sizes. The village was the primary sampling unit and was selected using probability proportional to size. The same villages were surveyed at both pre and post-survey. The secondary sampling unit, the household, was randomly selected using a household listing approach. Households that did not have at least one adult (≥18 years in Tanzania and Uganda, ≥15 years in Malawi) who identified as the primary care giver and at least one child under nine years old were excluded from participating and replaced. Every effort was made to collect data from the selected households, which were visited three times before being replaced in cases of recurrent non-occupancy. Non-participating selected households were also replaced. Twelve villages in Lindi which had both school and community programming were excluded from the sample frames for this EU.
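First-stage selection with probability proportional to size is often done systematically along the cumulative population list. The sketch below illustrates that general technique; the text does not say which selection procedure was used, and the village names and population figures are invented for illustration only.

```python
import random


def pps_systematic_sample(villages, n_clusters, rng):
    """Select n_clusters villages with probability proportional to size.

    villages: list of (name, estimated_population) tuples.
    A village larger than the sampling interval can be selected more than once.
    """
    total = sum(pop for _, pop in villages)
    interval = total / n_clusters
    start = rng.random() * interval  # random start in [0, interval)
    targets = [start + i * interval for i in range(n_clusters)]

    selected = []
    cumulative = 0.0
    idx = 0
    for name, pop in villages:
        cumulative += pop
        # Every target falling inside this village's population range selects it.
        while idx < n_clusters and targets[idx] < cumulative:
            selected.append(name)
            idx += 1
    return selected


# Hypothetical sample frame (names and populations are made up).
frame = [("Village A", 1200), ("Village B", 450), ("Village C", 3100),
         ("Village D", 800), ("Village E", 2200), ("Village F", 650)]
clusters = pps_systematic_sample(frame, n_clusters=3, rng=random.Random(2017))
print(clusters)
```

Keeping the returned list fixed between rounds corresponds to surveying the same villages at both pre and post-survey.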

School sampling
In Uganda schools were randomly selected from a list of all intervention schools in the EU. In Malawi they were selected using probability proportionate to size. In Tanzania, all intervention schools were included in the survey. In all three countries, in each selected school, two classes were randomly selected from all classes targeted with the trachoma F&E interventions. Within each of these two classes, 21 students were randomly selected for questionnaires. The same schools were used in the pre and post-surveys.
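The within-school stage (two classes, then 21 students per class) can be sketched as two nested simple random samples. This is an illustrative sketch only: the class names and student identifiers are hypothetical, and the handling of classes with fewer than 21 students (take everyone) is an assumption not stated in the text.

```python
import random


def sample_students(classes, rng, n_classes=2, n_students=21):
    """Randomly pick classes, then students within each picked class.

    classes: dict mapping class name -> list of student identifiers.
    Assumption: classes smaller than n_students contribute all their students.
    """
    chosen = rng.sample(sorted(classes), min(n_classes, len(classes)))
    return {
        name: rng.sample(classes[name], min(n_students, len(classes[name])))
        for name in chosen
    }


# Hypothetical school with three targeted classes.
school = {
    "P3": [f"P3-{i}" for i in range(40)],
    "P4": [f"P4-{i}" for i in range(35)],
    "P5": [f"P5-{i}" for i in range(18)],
}
picked = sample_students(school, random.Random(42))
print({name: len(students) for name, students in picked.items()})
```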

Data collection
The surveys were designed and conducted through collaboration of partners in each country. Independent teams were recruited for the data collection who were blind to the interventions conducted and other indicators that would be generated from the data. Household questionnaires were conducted with the primary caregiver. All present household residents were directly observed for signs of ocular or nasal discharge [18]. As a proxy indicator for hand and face washing, data was collected on the presence and functionality of hand/face washing stations [19]. Direct observation of the availability of handwashing and toilet facilities was conducted. School questionnaires were conducted with head teachers along with observations of availability of toilet and handwashing facilities. Individual questionnaires were conducted with students along with observations of presence of ocular or nasal discharge on the face. Student hand and face washing behaviors were observed for a period of at least three hours in each school. In order to reduce bias we ensured that all observers/research assistants were from the respective countries, spoke the local language, and observed the events as discreetly as possible.
We also did not tell the schools exactly what we were observing (i.e. we did not say we were there to observe hand and face washing practices specifically). The extended observation time (3 hours) was selected in an effort to capture different opportunities in which children would wash their hands and face and also allowed the children to get used to the observer being present. There is always a risk of observer bias in the study, and we recognized this as a limitation and made as many efforts as possible to limit these biases. Household questionnaire guides were translated into the predominant local language. All questionnaires were pilot tested in the language in which it was to be conducted and revisions were made to increase clarity. School questionnaires with the head teacher were conducted in English while student ones were conducted in the local language. The English version of the questionnaires and observational data collection tools are provided in supporting information (S4, S5 and S6 Tables). Table 3 documents the different types of data collection methods used in schools and communities. With the exception of the student 'hand/face washing behavior observation', all observations were embedded components of the school or household questionnaire.

Data management and analysis
Data was collected electronically using a purpose-built Open Data Kit-based Android smartphone application (LINKS) for the pre-survey and CommCare for the post-survey [20,21]. Data was downloaded, imported into STATA 13, weighted according to the sampling design, and analyzed for a difference between pre and post-surveys within EUs using the chi-squared test. A p-value of <0.05 was used to indicate a statistically significant difference between pre and post-surveys.
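To make the pre/post comparison concrete, the sketch below computes a plain Pearson chi-squared test on a 2x2 table. It is a simplification: it is unweighted and ignores the clustered sampling design, whereas the analysis described above was weighted in STATA, so it will not reproduce the published p-values. The counts are hypothetical.

```python
import math


def chi_squared_2x2(pre_yes, pre_n, post_yes, post_n):
    """Pearson chi-squared test (1 df, no continuity correction).

    Compares the proportion of 'yes' responses between pre and post-survey.
    Unweighted: survey weights and clustering are NOT accounted for.
    Requires at least one 'yes' and one 'no' overall.
    """
    table = [[pre_yes, pre_n - pre_yes], [post_yes, post_n - post_yes]]
    total = pre_n + post_n
    row_totals = [pre_n, post_n]
    col_totals = [pre_yes + post_yes, total - pre_yes - post_yes]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: 40 of 100 respondents at pre-survey, 60 of 100 at post-survey.
stat, p_value = chi_squared_2x2(40, 100, 60, 100)
print(round(stat, 3), p_value < 0.05)
```

A design-based alternative (for example a Rao-Scott corrected test) would be needed to match a weighted cluster-survey analysis.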

Results of pre and post surveys
The implementation of F&E activities and quantitative evaluations of those activities were not intended as a one size fits all approach; however, common themes emerged across the three countries. In order to clearly present results of the pre and post-surveys, results are organized into the thematic groupings of WASH infrastructure, trachoma knowledge, and F&E related behaviors and are presented first for school EUs and then community EUs.

School EU Results
Key results from each school EU within each country are provided in Table 4.
Only the EU in Tanzania showed a significant increase in the percentage of schools that had hand washing facilities with soap with 4.8% (Confidence Interval (CI): 0.6-30.7) at baseline and 35.0% (CI: 17.3-58.0) at post-survey (P = 0.043). In Uganda, there was significant increase in the percentage of schools that had at least one clean latrine for both boys and girls, with 27.8% (CI: 9.9-57.3) at baseline and 70.8% (CI: 39.1-90.2) at post-survey (P = 0.039).
There was minimal positive behavior change across the three countries. Only the two school EUs in Malawi showed a significant change in the percentage of school compounds free of human feces, with those in the Central region increasing from 19.6% (CI: 6.0-48.1) at baseline to 92.6% (72. .3) at post-survey (P < 0.01) and those in the Southern region increasing

Community EU Results
WASH Infrastructure. Key WASH infrastructure results from each EU within each country are provided in Table 5. Access to WASH infrastructure varied throughout the EUs.
Trachoma knowledge. Key results on trachoma knowledge from EUs within each country are provided in Table 6. For purposes of this manuscript, three indicators were used to measure a change in trachoma knowledge. These include: percentage of household respondents who knew one or more symptoms of trachoma; percentage of household respondents who had seen or heard any message about trachoma; and percentage of household respondents who knew one or more ways in which trachoma spreads. The Southern region of Malawi had no significant change in these three indicators. In Central Malawi, there was only an increase in percentage of respondents who knew one or more symptoms of trachoma with an increase from 39.9% (CI: 33.5-46.7) at baseline to 51.5% (CI: 42.9-60.1) at post-survey (P = 0.035). In Tanza Table 7 regarding trachoma behavior related indicators. Two indicators are reported here: clean face, defined as a face free from ocular and nasal discharge; and households free of human feces. The facial cleanliness indicator within households was broken down into three age groups: children nine years and younger; children 14 years and younger; and adults 15 years and above. This classification accommodates the nine years and younger that is typically used within trachoma programs to measure trachoma prevalence in children and the classification of adults as those 15 years and above for purposes of determining the advanced stage of trachoma (trachomatous trichiasis) in the adult population. Results across the facial cleanliness

Discussion
In order to meet the stated objectives of the paper, the discussion is broken down into three sub-sections. First, a discussion of country specific programmatic achievements in WASH infrastructure, trachoma knowledge, and F&E related behavior. Second, an examination of the challenges and successes in the survey design and indicator measures used, and finally recommendations for future implementers.
Programmatic achievements
WASH infrastructure. Data was collected on a range of WASH related indicators. For purposes of this paper, results and discussion focus on indicators that highlighted a household's hygiene and sanitation related behaviors such as hand and face washing and latrine use, as these are believed to decrease the likelihood of trachoma transmission. As a proxy indicator for hand and face washing, data was collected on the presence and functionality of hand/face washing stations. The presence of soap and water were assumed to show an increased likelihood that the household was using the hand/face washing station. It was only in the Karamoja region of Uganda that there was a significant increase in households with hand/face washing stations and the presence of soap and water at those locations. Despite the statistically significant increases in Karamoja and a few of the other regions for select WASH indicators, programmatically the results are not encouraging. At post-survey, the overall percentages of households with hand/face washing stations ranged from a low of 3.8% in Tanzania to a high of 41.7% in the Southern region of Malawi. Hand/face washing stations with water, a likely sign of their proper use, ranged from a low of 4.0% in Karamoja to a high of 31.9% in the Southern region of Malawi. These results highlight that even where there were hand/face washing stations, the percentage that had soap and/or water was much smaller. This could signify that simple presence of hand/face washing stations did not guarantee their use and issues revolving around access to and prioritization of water use remain.
Trachoma knowledge. Results showed that there was increased knowledge around trachoma in villages receiving interventions in most EUs. Though in some cases it was minimal, the results imply that at minimum F&E programs were effective at increasing knowledge within communities, a factor in behavior change programs [22].
However, it is important to recognize that knowledge is one element. The study also looked at risk perception and noted there was little change in perception of risk across the different evaluation units. It is unclear if this lack of change in perception was because the respondents felt they were putting measures in place to reduce their risk and therefore they were not concerned or if they consistently did not feel trachoma was an issue in their communities.
F&E related behaviors. Though the percentage of children and adults with a clean face was at least 60% across the five community-based EUs, it was not expected that the post-survey results would show a significant decrease across all age groups and regions. This could be due to a number of reasons. For example, the months and time of day when data was collected were not always standardized. The months of the pre and post-surveys varied, with some data collection occurring during the rainy season and some in the dry season when water is typically less available. This could have impacted availability of water for hand and face washing. The time of day when clean face was documented also varied, as survey teams began at one house and moved through the village over the course of the day. Other studies have shown that time of day and physical location of data collection may impact the likelihood of a face being clean [23][24][25].
Ultimately the survey results from schools and households showed that though there was an increase in knowledge at the school and household level, and, in some cases, an increase in presence of hand/face washing stations, this did not always result in a measurable change in trachoma prevention behaviors such as facial cleanliness at the household level. This shows that the interventions used were effective in increasing awareness of trachoma prevention, which is a first step to changing behavior, but there remains a gap either to translate that knowledge into changes in behavior or to measure the behaviors effectively. It may also show that the data collection itself needs improvement or that the period of intervention was not long enough to measure significant changes.

Challenges and successes
The pre and post-surveys were powered to measure changes within the EU and did not measure the success or failure of specific F&E interventions used within EUs, most of which had multiple partners and heterogeneous intervention designs. This means we could not compare specific F&E interventions across districts, EUs, or countries to determine what did or did not work. It is also unknown if the F&E activities were truly implemented as intended, as the only point of reference was the quarterly reports submitted by F&E implementing partners to the donors. Additionally, the surveys did not include trachoma infection data; therefore, these surveys cannot claim that particular F&E related activities directly led to a decrease in trachoma prevalence.
Throughout the period of implementation of F&E activities there was not a consensus on the percentage change needed within each indicator to determine if the programs were achieving success. For example, in Tanzania, the percentage of household respondents who knew one or more ways that trachoma spreads increased statistically significantly from 22.8% to 35.0%; however, programmatically 35% would be considered a sub-optimal achievement following almost two years of F&E related programming. On the other end of the spectrum, there were instances when the baseline was already high, such as facial cleanliness in schools or household levels of sanitation. Programmatically having over 90% of school children with a clean face, greater than 78% of households having access to latrines, and 88% of households free of human feces would be considered a high level of sanitation coverage in trachoma endemic regions. For indicators that began with a high baseline, it was difficult to power statistical evidence of increase [26]. The challenge therefore is determining what the minimal thresholds are for WASH related indicators rather than purely focusing on percentage change or statistically significant changes during evaluations.
There were multiple school related indicators in the FEME. These included children having a clean face; students having access to hand/face washing stations with water and soap present; schools having functional clean latrines for staff and students; improved water sources located on premises and accessible to all users during school hours; and an awareness about how to prevent and treat trachoma. These indicators are still viewed by the authors as adequate indicators to measure F&E school-based programming for trachoma prevention and education.
There were challenges implementing the pre and post-surveys within the schools. While observations were attempted in every school, observations could not be made in schools that had no water supply or no hand/face washing facilities. In this situation, observations were made for an hour and if no hand or face washing occurred the observer proceeded to help with the school questionnaires. There was no provision in the pre-survey data collection form for recording the absence of handwashing when a student used the latrine and did not wash their hands afterwards. The student was only recorded when they did wash their hands. Therefore, a comparison between handwashing and not handwashing could not be calculated. In the post-survey, the observation form was updated and allowed for individual observations to be collected in addition to collecting when a hand/face washing event should have occurred but did not (i.e. going to the toilet but not washing hands).
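The change to the post-survey observation form amounts to recording every handwashing opportunity, not only the events where washing occurred, so that a rate can be calculated. The sketch below shows one possible record layout; the field names and example records are hypothetical and are not taken from the actual data collection tools.

```python
from dataclasses import dataclass


@dataclass
class WashOpportunity:
    """One observed moment when hand/face washing should have occurred."""
    school_id: str
    trigger: str          # e.g. "after_latrine" (hypothetical coding)
    washed_hands: bool
    washed_face: bool


def handwashing_rate(observations):
    """Share of opportunities where hands were washed; None if nothing was observed."""
    if not observations:
        return None
    return sum(o.washed_hands for o in observations) / len(observations)


# Hypothetical observations from one school.
observed = [
    WashOpportunity("school-01", "after_latrine", True, False),
    WashOpportunity("school-01", "after_latrine", False, False),
    WashOpportunity("school-01", "after_latrine", True, True),
    WashOpportunity("school-01", "after_latrine", False, False),
]
rate = handwashing_rate(observed)
print(rate)
```

With the pre-survey form only the `washed_hands=True` rows would exist, so the denominator, and therefore the rate, could not be recovered.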
In enumeration of school WASH facilities, latrines/toilets were not assessed individually for each characteristic, rather, there was a count of all latrines meeting each characteristic. This means for percentage of schools with clean and functional latrines we could not look at both clean and functional latrines together and therefore could only pick one indicator. We chose to report on cleanliness of latrines only. This decision was made due to the survey design. There were many elements that made up a functional latrine and because we did not assess each latrine individually, we could not say, for example, if the one latrine counted for having a super structure was also the one latrine counted for having a drop cover. Therefore, we selected clean latrines because that was a stand-alone question included in the survey (i.e. how many of the latrines are clean). The 'percentage of children washing their faces when washing their hands during the school day' is not a critical data collection point as it does not inform if the face needed to be washed. Children not washing their face during this observed moment does not directly imply dirtier faces or an improper behavior. We would therefore not recommend this indicator but rather simply 'percentage of children with clean faces' at defined observation points, such as arrival at the school in the morning and before leaving school to return home.
There were lessons learned from the data collection process during the pre-survey that were implemented for the post-survey to improve data quality and assurance. This included allowing data collectors and supervisors to check the data before sending to ensure increased data quality and control. In addition, mobile data capture forms were designed to validate eligibility criteria before allowing enumerators to proceed with data collection. Based on the experience from the pre-survey, a supervisor form was developed for the post-survey to calculate household replacement rates and capture the population of the village for weightings (S7 Table).
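An eligibility check of the kind built into the mobile forms might look like the sketch below. It is not the actual CommCare form logic. It assumes the age thresholds given in the community sampling description are minimum caregiver ages (18 years in Tanzania and Uganda, 15 years in Malawi) and that "under nine years old" means an age strictly below nine; the function and constant names are hypothetical.

```python
# Assumed minimum age of the primary caregiver, by country.
MIN_CAREGIVER_AGE = {"Malawi": 15, "Tanzania": 18, "Uganda": 18}
# Assumption: "under nine years old" is read as age < 9.
CHILD_AGE_LIMIT = 9


def household_is_eligible(country, caregiver_age, child_ages):
    """Return True if the household meets the survey inclusion criteria.

    caregiver_age: age of the self-identified primary caregiver, or None if absent.
    child_ages: ages of the children living in the household.
    """
    if caregiver_age is None:
        return False
    if caregiver_age < MIN_CAREGIVER_AGE[country]:
        return False
    return any(age < CHILD_AGE_LIMIT for age in child_ages)


print(household_is_eligible("Malawi", 16, [4, 11]))
print(household_is_eligible("Uganda", 16, [4, 11]))
```

Running such a check before the questionnaire opens lets the enumerator replace an ineligible household immediately instead of discovering the problem during data cleaning.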
In all three countries, F&E activities funded by the Trust and DFID began before the pre-surveys were conducted, due to funding dynamics and the time it took to get survey protocols developed, approved, and implemented. This delay in pre-survey implementation creates a limitation in that the surveys might not have detected some of the changes produced by the interventions in the months before the pre-survey was conducted.
Though the F&E programs were implemented for multiple years in each country, pre and post-surveys were conducted 16 months apart in Malawi, 13 months apart in Tanzania, and 20 months apart in Uganda. As behavior change is a long process, this is likely not enough time to measure a change in behavior. The surveys were expensive due to sample sizes across multiple regions and long household survey questionnaires that collected data on a range of topics, much of which was ultimately unused or was not processed in time to be of use to programming.

Recommendations
Based on the experience of selecting, implementing, monitoring, and evaluating F&E activities, there are several recommendations. First, the F&E activities chosen for implementation in these three countries were based on donor funding, implementing partner presence and perceptions of what was achievable in three years. Though these factors are important for program implementation, it is recommended that behavior change theory also be used to develop programs and M&E. This will not only help tailor programming to the specific needs of each community but also aid in consistent and clear measurement directly linked to the activities. Second, ideally any future pre-surveys should occur before F&E activities are implemented if time and resources allow; however, in areas that have on-going F&E related activities, this might not be possible. Third, programs should determine what their output targets are before implementing F&E programs and when determining methods of measuring success. This should include not only the change between pre and post-survey but also pre-identification of indicator levels that are considered optimal, for which maintaining the level (as opposed to achieving a statistically significant increase) would be considered a successful outcome. Fourth, in order to evaluate whether activities are effective, it is important to know if they are being implemented as intended. Routine monitoring of activity implementation is critical and can provide insights and add depth to results from program evaluations. The FEME (S1, S2 and S3 Tables) provides a document that could be used as an example on which to improve. Fifth, there should be careful consideration of possible confounders of seasonality and time of day that data is collected, with efforts made to standardize the months and time of day data is collected.
Sixth, unless multiple surveys are being conducted to measure longitudinal data, at least 24 months should pass between pre- and post-surveys, allowing more time for behavioral change practices to take root. Lastly, where possible, instead of conducting a standalone F&E survey, F&E data should be collected in connection with other planned programmatic, demographic, or health surveys in order to streamline human and financial resources.

Conclusion
The application of the FEME in Malawi, Tanzania, and Uganda, and the implementation of a pre and post-survey to measure change for select F&E indicators represent an effort to fill a gap in understanding how best to evaluate F&E activities in trachoma programs. It is clear from the lessons learned and recommendations that the FEME framework, survey indicators, and survey methodologies need improvement or modification to make monitoring and evaluation of F&E activities more effective. Additionally, a more robust system for monitoring implementation of F&E activities would have helped programs make quicker programmatic decisions and allowed a better understanding of the results of the post-surveys. Despite the limitations, the experience gained from implementing the FEME, the pre and post-surveys, and the supplemental materials provided in this manuscript contribute towards the effort to progress our understanding of how best to evaluate F&E activities.