Support To Rural India’s Public Education System (STRIPES2) and impact on numeracy and literacy scores: A cluster randomized trial in rural villages of Madhya Pradesh, India

  • Ila Fazzio ,

    Roles Conceptualization, Investigation, Methodology, Supervision, Writing – original draft, Writing – review & editing

    if@effint.org

    Affiliation Effective Intervention, London, United Kingdom

  • Siddharudha Shivalli,

    Roles Conceptualization, Investigation, Methodology, Project administration, Supervision, Writing – original draft, Writing – review & editing

    Affiliation London School of Hygiene and Tropical Medicine, London, United Kingdom

  • Nicholas Magill,

    Roles Data curation, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review & editing

    Affiliation London School of Hygiene and Tropical Medicine, London, United Kingdom

  • Diana Elbourne,

    Roles Conceptualization, Data curation, Investigation, Methodology, Project administration, Validation, Writing – original draft, Writing – review & editing

    Affiliation London School of Hygiene and Tropical Medicine, London, United Kingdom

  • Suzanne Keddie,

    Roles Data curation, Formal analysis, Validation, Writing – original draft, Writing – review & editing

    Affiliation London School of Hygiene and Tropical Medicine, London, United Kingdom

  • Dropti Sharma,

    Roles Conceptualization, Data curation, Methodology, Writing – original draft, Writing – review & editing

    Affiliation Pratham Education Foundation, New Delhi, India

  • Sajjan Singh Shekhawat,

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliation Pratham Education Foundation, New Delhi, India

  • Arjun Agarwal,

    Roles Conceptualization, Project administration, Supervision, Writing – review & editing

    Affiliation Pratham Education Foundation, New Delhi, India

  • Rukmini Banerji,

    Roles Conceptualization, Project administration, Supervision, Writing – review & editing

    Affiliation Pratham Education Foundation, New Delhi, India

  • Sridevi Karnati,

    Roles Conceptualization, Data curation, Investigation, Methodology, Project administration, Software, Supervision, Validation, Writing – review & editing

    Affiliation GH Training and Consulting, Hyderabad, Telangana, India

  • Harshavardhan Reddy,

    Roles Conceptualization, Data curation, Methodology, Project administration, Resources, Software, Supervision, Validation, Writing – review & editing

    Affiliation GH Training and Consulting, Hyderabad, Telangana, India

  • Tony Brady,

    Roles Data curation, Project administration, Software, Supervision, Validation, Writing – review & editing

    Affiliation Sealed Envelope, London, United Kingdom

  • Piotr Gawron,

    Roles Data curation, Project administration, Software, Supervision, Validation, Writing – review & editing

    Affiliation Sealed Envelope, London, United Kingdom

  • Pei-Tseng Jenny Hsieh,

    Roles Data curation, Investigation, Methodology, Supervision, Validation, Writing – review & editing

    Affiliation National Foundation for Educational Research, London, United Kingdom

  • Alex Eble,

    Roles Conceptualization, Data curation, Investigation, Methodology, Validation, Writing – review & editing

    Affiliation Teachers College Columbia University, New York, United States of America

  • Peter Boone,

    Roles Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Writing – original draft

    Affiliation Effective Intervention, London, United Kingdom

  • Chris Frost

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliation London School of Hygiene and Tropical Medicine, London, United Kingdom


Abstract

Introduction

Rates of primary school enrolment have improved in India, but levels of learning achievement remain low. In the Support To Rural India’s Public Education System (STRIPES) trial, a para-instructor intervention improved numeracy and literacy levels in Telangana, India (2008−10). The STRIPES2 trial was designed to assess whether a similar intervention in a younger cohort of children would have similar effects in Satna and Maihar districts of Madhya Pradesh, India, and be cost-effective.

Methods

In this Madhya Pradesh cluster-randomized controlled trial, 196 villages (clusters) were randomized to receive either a health (CHAMPION2: community health promotion and medical provision and impact on neonates) or education (STRIPES2) intervention. Villages receiving the health intervention were controls for the education intervention and vice versa. For children newly enrolled in primary school, the STRIPES2 intervention comprised before/after-school classes (2 hours per day, 6 days a week) given by trained para-instructors from the local community, frequent monitoring, and engagement with caregivers to motivate children, delivered by the Pratham Education Foundation. STRIPES2 activities had to be suspended twice for around ten and a half months, and some components of the intervention modified due to the COVID-19 pandemic. The period of the trial was extended with the primary outcome (a composite literacy and numeracy score of Early Grade Reading and Mathematics Assessments) assessed around 30 months after classes started.

Results

Composite test scores were significantly higher in the intervention arm (98 villages; 3054 children) than in the control arm (98 villages; 3275 children) at the end of the trial. The mean difference on a percentage point scale was 14.17; 95% CI 11.36 to 16.97; p < 0.001, equating to a 0.58 (95% CI 0.47 to 0.71) standard deviation difference. The cost per child per 0.1 SD increase in composite test score was INR 2476 (US$33.5).

Conclusion

Despite COVID-19 interruptions and disruptions, STRIPES2 resulted in a major improvement in children’s literacy and numeracy. However, the cost of achieving such benefits was substantial.

Introduction

In common with many other low- and middle-income countries (LMICs), India has witnessed a massive expansion in school enrolment over the last 20 years, and yet many students finish primary education without the foundational literacy and numeracy skills expected for their grade under national curricula [1,2]. More than half of children in LMICs who have finished, or are close to finishing, their primary education have not acquired the basic skills to read and comprehend a short story appropriate for their age [3–6]. In Madhya Pradesh, India, according to ASER (Annual Status of Education Report) 2018 [5], the overall school enrolment of children between six and fourteen years old in rural villages was very high (96%), with about 72% of them enrolled in public schools in 2017–18 [5]. Yet, the results of the ASER reading tests showed that only 41.6% of children in Grade 5 (10–11 years old) were able to read the short story (with familiar words and a simple sentence construction that is typical of Indian Grade 2 textbooks). Among the Grade 3 children (7, 8 and 9 years old), only about 10.4% of those enrolled in public schools were able to read the short story. In mathematics, ASER 2018 data showed that only about 8.5% of Grade 3 students in public schools in rural Madhya Pradesh could perform at least 2-digit by 2-digit subtraction with borrowing (this being expected of children who finish Grade 2 in most states). This low reading and mathematics attainment of students despite high school enrolment can be partly attributed to weak engagement, support, and monitoring of teachers. There is some evidence that teachers frequently do not attend classes or are not engaged in teaching even when present in the classroom [7].

Moreover, in public schools in Madhya Pradesh (as in all Indian states), teachers are instructed to complete the syllabus for each grade, regardless of students’ learning outcomes and ability to understand the content of what is being taught. As a result, slower-performing students do not have a chance to catch up on what is being taught and frequently end up performing very poorly even in basic skills.

Studies aiming to raise students’ learning, followed by several reviews [8–10], have provided frameworks for comparing interventions and recommending the best ones in terms of years of learning and cost benefits, so governments and organizations can make informed investments to improve the quality of education. In these reviews, remedial instruction taught by community instructors with an improved pedagogy to match teaching to students’ learning level is cited as one of the most effective strategies for primary school age children [6,8,9,11,12]. Remedial instruction targets the foundational skills that a child needs to master. Its core idea is that if a child has not acquired the foundational skills in literacy and numeracy, it is not possible to assimilate more complex concepts and skills.

Many of the strategies aimed at addressing the poor quality of education fall short of enhancing children’s learning. This inadequacy is mainly due to their focus on interventions that were limited to providing facilities, materials, or school access, rather than focusing on improving the child’s learning experience [8–10,13–15].

Studies from Chile and various parts of India reported that extra teaching by tutors or volunteers from the local community improved literacy and numeracy skills among early primary grade children [16–18]. For instance, offering remedial classes to third and fourth graders in public schools located in Mumbai and Vadodara, India, led to an increase of 0.28 standard deviations in overall test scores for literacy and numeracy in the second year [17]. Another study from Uttar Pradesh, India, demonstrated a large positive impact in both language (Hindi) and mathematics of children in Grades 3–5 who consistently attended learning camps facilitated by community volunteers for 40 days where they were grouped and taught according to their learning levels for 1.5 hours per day [11]. Similarly, a tutoring initiative spanning three months and targeting low performing Grade 4 students in Chile resulted in an important improvement in their reading abilities [16].

The STRIPES trial [18] in Telangana, India (from 2008−10) evaluated the effectiveness of an intervention that provided 18 months of supplementary, remedial teaching and learning materials (and an additional “kit” of materials for girls) to children in Grades 2–4 at baseline. The primary outcome was a composite of language and numeracy test scores, revealing significantly higher scores in the intervention arm (107 villages; 2364 children) compared to the control arm (106 villages; 2014 children) at the end of the trial (mean difference on a percentage scale 15.8; 95% CI 13.1 to 18.6; p = 0.001; 0.75 standard deviation (SD) difference). The cost per 0.1 SD increase in composite test score per child was INR 382.97 (£4.45, $7.13). The STRIPES trial thus provided evidence that supplementary teaching, when implemented in remote rural areas, can significantly enhance numeracy and literacy skills at a reasonable cost. Given the scarcity of evidence from previous trials, there is potential to adapt and expand the STRIPES trial to ascertain the generalizability of its findings across diverse settings.

Building on STRIPES, the STRIPES2 trial was conducted to assess whether structured supplementary classes [19] in early grades had a similar effect on the literacy and numeracy of primary school age children in rural villages of Madhya Pradesh, and if it was cost-effective.

Despite the STRIPES and STRIPES2 interventions being designed and implemented by two different organizations (the Naandi Foundation in Telangana and Pratham in Madhya Pradesh) the core approach remained the same: providing supplementary classes led by para-educators recruited from local communities [19].

Methods

Study design, setting and participants

The CHAMPION2/STRIPES2 (Community Health Promotion And Medical Provision and Impact On Neonates, and Support To Rural India’s Public Education System and impact on numeracy and literacy scores) trial was a cluster-randomized trial conducted in rural villages in the Satna and Maihar districts of Madhya Pradesh. At the time the trial was conducted the Maihar district was part of the Satna district. Full details of the STRIPES2 trial design and planned statistical analysis can be found in the CHAMPION2/STRIPES2 protocol paper [20], protocol update [21] and statistical analysis plan [22]. In brief, 196 villages were randomized, with 98 receiving a package of interventions to improve literacy and numeracy (STRIPES2) and 98 (CHAMPION2) receiving a package of maternal and newborn interventions. The villages receiving the CHAMPION2 intervention acted as controls for the STRIPES2 trial and vice-versa. The reason this design was adopted was (as with the original STRIPES and CHAMPION trials) to create a positive presence in all the villages in both trial arms, thereby encouraging STRIPES2 control children to give us data and attend the midline and endline tests even though they were not receiving any educational intervention. The CHAMPION2 intervention was primarily aimed at improving outcomes in pregnancy, whereas the STRIPES2 intervention aimed to improve educational standards in primary school-age children. We accordingly anticipated that the CHAMPION2 intervention would not materially impact on outcomes in STRIPES2 and vice-versa.

The trial was conducted in Satna district, Madhya Pradesh, India. Satna district is further divided into 10 tehsils (sub-districts). Three tehsils (Birsinghpur, Majhgawan and Raghurajnagar) were excluded due to difficult access (forest area), risk of violent robbery, and being an urban sub-district. The remaining seven tehsils comprised 1263 villages (68% of all villages in Satna), with a population of 1,441,930 [23]. From these we selected the 484 villages that (i) were considered ‘rural’ with a population less than 2500 inhabitants (ii) had at least 120 children under the age of 6 years and (iii) were accessible by road. An algorithm was written to select a maximally sized subset of these villages such that (iv) each village center was at least 5 km away from the nearest Community Health Centre (CHC) and (v) there was a minimum of 3 km between village centers to avoid contamination (buffer zones). The requirement that villages should have at least 120 children under the age of 6 was included because this gave an expectation of 20 children in each academic year and therefore a good chance of there being at least 15 children eligible for the intervention. In the event that (vi) fewer than 15 eligible children were identified at enumeration (or that villages were considered to be urban rather than rural) villages were not randomized.
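The spacing rules in criteria (iv) and (v) can be illustrated with a minimal sketch. The article does not describe the selection algorithm itself, so the greedy strategy, the coordinate format, and all names below (`haversine_km`, `select_villages`) are assumptions made for illustration only. A greedy pass returns a valid subset that respects both distance rules, but it is not guaranteed to be maximally sized.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def select_villages(villages, chcs, min_chc_km=5.0, min_gap_km=3.0):
    """Greedy selection (an assumed heuristic, not the trial's algorithm).

    villages: dict mapping village name -> (lat, lon)
    chcs:     list of (lat, lon) for Community Health Centres
    """
    # Criterion (iv): keep villages at least min_chc_km from every CHC.
    candidates = {name: pos for name, pos in villages.items()
                  if all(haversine_km(pos, c) >= min_chc_km for c in chcs)}

    def conflicts(name):
        # Number of other candidates inside this village's buffer zone.
        return sum(1 for other, pos in candidates.items()
                   if other != name
                   and haversine_km(candidates[name], pos) < min_gap_km)

    # Criterion (v): take villages with the fewest conflicts first and
    # accept each one only if it is far enough from all already selected.
    selected = []
    for name in sorted(candidates, key=lambda n: (conflicts(n), n)):
        if all(haversine_km(candidates[name], candidates[s]) >= min_gap_km
               for s in selected):
            selected.append(name)
    return selected
```

Finding a truly maximal subset is a maximum independent set problem, so an exact search or an integer-programming formulation would be needed where the greedy result is not good enough.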

In STRIPES2, the target group was children born between 16 June 2010 and 15 June 2013 who were resident in the trial villages and whose caregivers were planning to enroll them in Grade 1 for the first time in the 2018–2019 school year. Because of delays in getting the trial started, enumeration took place twice (July 2017 and April 2019), so eligible children who were missed the first time had a second chance to participate in the trial.

All data collection and related research activities for the STRIPES2 trial were organized by independent teams recruited, trained and monitored by GH Training and Consultancy (GHTC), India.

This trial employed multiple tiers of consent: village-level consent, and individual-level consent from the caregiver on behalf of the child. Agreement to approach eligible villages was first obtained from the village heads (Sarpanchs). In the trial villages, consent was obtained from the village after the trial had been presented in a meeting with village elders representing all the castes and village residents. Consent was given orally by the elders during a village meeting, and the Sarpanchs of the villages signed (or gave their thumbprint on) a document that described the trial. During the enumeration survey, GHTC interviewers read a simple standard consent explanation of the trial in Hindi (local language) and caregivers of eligible children were offered the chance to participate, giving signed or thumbprint consent if they wished to. This trial’s protocol [20] and its amendment [21] were approved by the Ethics Committees of L V PRASAD Eye Institute, Hyderabad, India (LEC 02–16−008) and London School of Hygiene and Tropical Medicine (LSHTM Ethics Ref: 10482). We also obtained approvals from the Indian Council of Medical Research, New Delhi and the Government of Madhya Pradesh to conduct this trial in Satna district. The trial complies with the Declaration of Helsinki, local laws, and the International Conference on Harmonization Good Clinical Practice (ICH-GCP).

Intervention design

The STRIPES2 intervention aimed at building foundational literacy and numeracy skills for Grade 1 and 2 children in a holistic manner, combining multiple interventions to promote learning outside and inside the child’s home. Caregivers (especially mothers) had a crucial role in engaging with their children’s learning. Also, its content extended beyond academic learning, as it included activities to develop children’s social and emotional skills. The approach involved a combination of individual, small group, and large group activities, incorporating play-based teaching methods that included: storytelling, reading, playing, singing, coloring, drawing, and role-playing. This approach aims to help make learning enjoyable and engage children in a manner that aligns with their natural instincts.

The STRIPES2 intervention design was based on Pratham’s experience across multiple states in India, and especially in Madhya Pradesh [11]. Pratham has been working in Madhya Pradesh with Grade 1 and 2 children since 2005, indirectly supporting projects that involved training and fostering community volunteers to interact with local schools’ teachers and mothers to engage in children’s academic, emotional and physical learning through the development of workbooks for Grades 1 and 2. However, this was the first time that Pratham was working directly in the trial area (Satna and Maihar districts).

Over the years, Pratham has worked closely with the state government to provide primary school teachers with innovative teaching methods and curricula, aiming to enhance children’s proficiency in core subjects. With a focus on Grades 1 and 2, programs in partnership with the government have addressed essential competencies in language and mathematics. Teacher training, utilizing Pratham’s “teaching-learning” and “teaching-at-the-right level” methods, has been central to these initiatives, overseen by both government and Pratham personnel. In 2013, Pratham partnered with the Madhya Pradesh government in selecting, training and sometimes hiring education coordinators to provide on-site support to teachers, enhance instructional materials and practices, ensure effective curriculum implementation, address challenges in the classroom (such as diversity in learning levels) and improve overall quality in primary education. The design of the teaching component of the STRIPES2 intervention was based on Pratham’s innovative [18,24] activity-based content and curriculum package developed for Grades 1–2 that focuses on improving foundational literacy and numeracy (FLN) whilst fostering other developmental areas such as physical, social, and emotional growth. It follows a phased approach, starting with a warm-up phase to prepare children from diverse backgrounds (see Manual for Instructors/Teachers: Warm Up – Phase I Language & Math Class 1 & 2, in S1 Appendix), followed by instructional phases that strengthen literacy and numeracy skills [18,24].

The key components of Pratham’s method are:

  • Simple Testing Tools: One-on-one oral tools and worksheets are used to track progress, with assessments conducted at baseline and end-line stages.
  • Daily free play (10–15 minutes), group socio-emotional activities like storytelling and role-playing, and physical tasks such as running and jumping (see examples of games and activities in the Manmauji Ganit Booklet – S2 Appendix).
  • Activity-Based Instruction: Activities are adapted to individual learning levels and encourage gradual improvement in reading and arithmetic.
  • Contextual and Low-Cost Materials: Locally relevant materials are used, including stories, flashcards, and print-rich environments to support language development. Context-specific materials and culturally adapted rhymes are included as optional elements. To ensure effective implementation, teachers are trained with hands-on practice, supported by activity booklets and a structured two-hour lesson plan framework.
  • Focus Domains: Emphasis is placed on language (oracy, phonological awareness, vocabulary, fluency, comprehension, writing) and mathematics (pre-mathematics concepts, numbers and operations, measurement, geometry, data handling).

The para-educators, called Pratham instructors (PIs), were female residents of the intervention villages. All had completed 12th Grade and passed a written exam during the selection process. By hiring female teachers, Pratham also intended to empower local women. PIs received 6 days of training from Trainers (Manual for Master Trainers – Grades 1 and 2 in S3 Appendix).

PIs were responsible for conducting classes either before or after school hours, maintaining contact with families, and engaging mothers in the learning process. They were overseen by a team of full-time cluster leaders (CLs), each CL being responsible for approximately 10–12 villages. CLs provided continuous monitoring, training, and mentoring support to PIs. Additionally, the CLs played a crucial role in facilitating the engagement of PIs with the local community, assisting them in talking to parents about the importance of these before/after school classes and finding adequate places in the village to hold the classes.

Before the launch of the program, Pratham teams piloted the activities and materials for these before/after school classes in rural villages of Bhopal district, Madhya Pradesh. In November 2019, Pratham started the 3-month preparatory phase for students’ readiness. The initial 2–3 weeks period was a warm-up phase and focused on nurturing attendance (see S1 Appendix). During that stage PIs worked on mobilizing families to send their children to the classes, motivating children to come on time to classes and getting them used to sitting for long activities. PIs carried out a series of activities such as playing, singing, coloring, and drawing, to make children feel at ease within the school and class so that they were able to interact comfortably and independently with the instructors and their peers (see Activity Book Manmauji Ganit – S2 Appendix).

After the warm-up phase, the STRIPES2 teaching-learning approach was implemented in three phases: readiness (phase I) and instructional (phases II and III). The readiness phase (for 4 weeks) included simple, interesting, and well-designed activities focusing on a preparatory stage, aiming to familiarize children with the classroom environment, develop emergent literacy and numeracy skills, and promote overall growth in various developmental areas. The emphasis on diverse activities and the teacher’s active role in understanding individual needs contribute to a comprehensive approach to early childhood education. Each of the instructional phases II and III were implemented for about 2–3 months and aimed to strengthen children’s literacy (Hindi) and numeracy learning.

In addition to teaching, the PIs were actively involved in maintaining frequent communication with caregivers through door-to-door interactions (home visits) and fortnightly meetings with caregivers (caregivers’ meetings). During these home visits and caregivers’ meetings, the PIs engaged with caregivers to discuss children’s learning progress, offer suggestions for fun activities that could support the learning process and investigate whether caregivers were conducting these activities (reading and mathematics), and check whether the activities were appropriate to the child’s learning level.

In March 2020, about 4 months after the intervention had started in the communities, the government of India declared a nationwide lockdown due to the COVID-19 pandemic. As a result, the intervention activities were first suspended and then resumed with adaptations. Between August and November 2020, the program had to be limited to calls and text messages by the PIs. The initial text messages focused on health precautions and then shifted to basic language and mathematics. These activities were adapted from Pratham’s national campaign “Karona: Thodi Masti, Thodi Padhai” (Do it: A little fun, a little study) [25]. It is important to highlight that these activities were not intended as formal teaching but rather as a means of promoting caregivers’ engagement and sparking children’s interest. In contrast, the instructor-led model that was adopted as soon as COVID-19 restrictions were lifted involved focused, structured learning for two hours daily [25]. Moreover, PIs also conducted daily follow-up calls to ensure the seamless delivery of the content. These calls played a crucial role in building a rapport with the parents and ensuring their active participation in the activities. Despite the changes made during the pandemic, the education program remained mainly instructor-led, not caregiver-led.

During the lockdown period, the relief schemes (like distribution of activity books and text messaging) provided by the government were designed and implemented uniformly, ensuring similar access and support in both trial arms.

As COVID-19 restrictions eased (November 2020), the subsequent phase involved setting up small group interactions with 2–5 children (small group classes) (Fig 1) to understand learning levels of children after a prolonged break and prepare them to resume classroom activities. In parallel, PIs restarted home visits to actively engage with children and their caregivers/family members (both individually and in small groups), during which they talked with caregivers/family members and children about queries or concerns. The frequency of these visits and group activities was related to the availability and willingness of parents and the village head, as well as the number of children that each PI was responsible for. In parallel with these activities, Pratham teams conducted six ASER-like assessments throughout the trial period to keep track of how children’s learning was improving. These are represented in Fig 1 as red stars (December 2019, November 2020, April and August 2021, February and June 2022).

Fig 1. Timeline of the STRIPES2 trial activities in Madhya Pradesh, India.

https://doi.org/10.1371/journal.pone.0330203.g001

Around July 2021, the intervention team, with the help of the local community, identified suitable class locations that allowed for social distancing, so that PIs could restart teaching the before/after school classes. Additionally, PIs were trained to discuss the risks associated with COVID-19, ways to reduce its transmission and sensitize caregivers to avoid sending their children to classes if anyone at home had symptoms.

From January 2022 onwards, a group of 163 young volunteers started conducting 20- to 40-minute-long fun activities with groups of 4–5 children on a regular basis (volunteer classes). The volunteers were primarily young females from the community who had demonstrated a keen interest in working with children on a consistent basis. They had foundational skills in reading and mathematics, enabling them to facilitate learning activities with the children. To mobilize these potential volunteers, a month-long collaborative effort was undertaken by PIs and cluster leaders (CLs). This initiative involved various strategies, including door-to-door meetings and large group gatherings within villages, facilitated with the support of the local Sarpanch (village head) and other key community stakeholders.

A month later, the same volunteers were trained in performing (and creating materials for) storytelling (for instance: drawing, painting, and cutting cardboard to create characters and scenarios; and ways of dramatizing when telling a story). In March 2022, informal libraries were created in the local shops to provide a platform where children’s books could be easily borrowed, encouraging interest in books, and providing material for children to practice reading.

In April 2022, volunteers began using tablet games to reinforce and support learning through fun activities in sixty-five of the intervention arm villages. These sessions, conducted daily for 25–30 minutes with groups of 4–5 children, were offered to all enrolled children but targeted especially at children who were struggling to recognize and interpret the sounds associated with syllables (which is essential for reading and writing) and number recognition.

Table 1 presents a list of the components of Pratham’s intervention, indicating whether each was considered core or optional. Although the components could not all be implemented simultaneously in all villages, Pratham teams followed a consistent process of implementation for each of them except the informal libraries.

Table 1. Core and optional components of Pratham’s intervention.

https://doi.org/10.1371/journal.pone.0330203.t001

Outcomes

The primary outcome was the arithmetic mean of the child’s scores on EGRA (Early Grade Reading Assessment) and EGMA (Early Grade Mathematics Assessment) tests [26,27]. We chose to use EGRA and EGMA because of our positive experiences applying them in similar contexts (as they are sensitive in measuring small differences in ability among children who have very low levels of learning), and because they are widely used tests for assessing early grade literacy and numeracy.

Secondary outcomes included separate scores for literacy and numeracy; caregivers’ engagement in their child’s learning; enrolment in school; caregiver’s report of school attendance; and the cost effectiveness of the intervention. Owing to the interruptions caused by the COVID-19 pandemic, the STRIPES2 intervention finished later than had initially been planned, meaning that the intervention (and the final testing from July-September 2022) was carried out in older children than would otherwise have been the case.

Sample size and randomization

The sample size calculation was primarily driven by the CHAMPION2 (health) intervention (details published in the protocol and update) [20,21]. Originally, the intention was to randomize 300 villages, which would give over 90% statistical power to detect a difference of 0.25 standard deviations in mean standardized test scores for STRIPES2. In the first STRIPES trial (Telangana) [8], the estimated effect was a 0.75 standard deviation (SD) increase in mean score; however, an effect of smaller magnitude than this would still be important to detect.

After drawing the buffer zones, it turned out that only 204 villages could be selected. These 204 villages had a mean population of 1487 (minimum 558, maximum 2490) and a standard deviation of 505 (equating to a coefficient of variation of 0.34). Estimating the number of children in each school year from the number under the age of six years old, the mean number of children in each school year was 38.3 (minimum 20, maximum 71) with a standard deviation of 13.3 (a coefficient of variation of 0.35). Assuming that 25% of the children would not satisfy the eligibility criteria, the mean number of eligible children per village was estimated as 28.7 with a minimum of 15. Further assuming that i) 95% of the 204 villages would ultimately participate, ii) 60% of the eligible children in these villages would take the test at the end of the trial, iii) an intra-cluster correlation coefficient of 0.23 (as seen in the STRIPES trial [8]) and iv) a coefficient of variation in numbers taking the test by village of 0.35 gave the trial 88% power to detect a difference of 0.25 SD in mean standardized scores between intervention and control villages using a conventional 2-sided statistical significance level of 5%. In fact, 196 of the 204 villages were randomized (6 villages being removed because they were found to be too close to urban areas to be considered rural, and 2 being removed because there were insufficient eligible children).
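The power figure can be roughly reproduced from the stated assumptions. The sketch below is illustrative only: the article does not say which formula was used, so the design effect for unequal cluster sizes, 1 + ((CV² + 1)·m − 1)·ICC, and the normal approximation are assumptions, as is the function name `cluster_trial_power`.

```python
import math
from statistics import NormalDist

def cluster_trial_power(clusters_per_arm, mean_cluster_size, icc, cv,
                        effect_sd, alpha=0.05):
    """Approximate power of a two-arm cluster-randomized trial (normal approximation)."""
    # Assumed design effect allowing for variation in cluster size.
    design_effect = 1 + ((cv ** 2 + 1) * mean_cluster_size - 1) * icc
    effective_n_per_arm = clusters_per_arm * mean_cluster_size / design_effect
    # Standard error of the difference in means, in SD units.
    se = math.sqrt(2 / effective_n_per_arm)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(effect_sd / se - z_crit)

# Inputs taken from the text: 95% of 204 villages participate, and 60% of
# a mean 28.7 eligible children per village take the endline test.
villages = 204 * 0.95
tested_per_village = 28.7 * 0.60
power = cluster_trial_power(villages / 2, tested_per_village,
                            icc=0.23, cv=0.35, effect_sd=0.25)
print(f"approximate power: {power:.2f}")
```

Under these assumptions the result lands close to the 88% reported; small differences would be expected if the original calculation used a different approximation.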

All children were enumerated and enrolled before randomization. Randomization of clusters was performed by the trial statistician based in London in June 2019 using a random number generator, with stratification by village size (above or below median) and distance to the nearest Community Health Centre or Civil Hospital (above or below median).
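As an illustration of stratified cluster randomization, the sketch below splits villages into four strata (size above or below the median, distance above or below the median) and allocates roughly half of each stratum to each arm. It is a sketch under stated assumptions, not the trial statistician's procedure: the field names, the seed, the within-stratum shuffling scheme and the example data are all hypothetical.

```python
import random
from statistics import median


def stratified_randomize(villages, seed=2019):
    """Allocate villages to arms within size-by-distance strata.

    villages: list of dicts with hypothetical keys 'id', 'population', 'distance_km'.
    """
    rng = random.Random(seed)
    pop_median = median(v["population"] for v in villages)
    dist_median = median(v["distance_km"] for v in villages)

    # Group village ids by stratum (above/below each median).
    strata = {}
    for v in villages:
        key = (v["population"] > pop_median, v["distance_km"] > dist_median)
        strata.setdefault(key, []).append(v["id"])

    allocation = {}
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)
        # With an odd stratum size, pick at random which arm gets the extra village.
        extra_to_intervention = rng.random() < 0.5
        n_intervention = len(ids) // 2
        if len(ids) % 2 and extra_to_intervention:
            n_intervention += 1
        for position, village_id in enumerate(ids):
            allocation[village_id] = (
                "intervention" if position < n_intervention else "control"
            )
    return allocation


# Made-up villages purely to exercise the function.
example_villages = [
    {"id": i, "population": 558 + 10 * i, "distance_km": (i * 7) % 23}
    for i in range(196)
]
allocation = stratified_randomize(example_villages)
n_intervention = sum(arm == "intervention" for arm in allocation.values())
print(n_intervention, len(allocation) - n_intervention)
```

Stratifying in this way keeps the arms balanced on the stratification variables by design, which is why those variables then appear as covariates in the analysis models.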

Adherence

We measured adherence through children’s attendance at before/after school classes recorded by PIs, the participation of caregivers in fortnightly meetings and their compliance with reading and mathematics exercises (that were given by PIs for caregivers to practice with their children).

Before/after-school classes were not run in exactly the same way in all of the villages. This was partly due to COVID-19 concerns and restrictions and to the difficulty some children had in reaching the place where the classes were held. As a result, there was variability in the number of planned classes per week, their length and the size of classes.

For simplicity, we decided to use counts of the numbers of classes i) offered to and ii) attended by each child. We assumed that, had the intervention run as planned, then each child would have been offered 360 classes (6 classes a week for 60 weeks, this corresponding approximately to a 16-month period with allowance for holidays etc.). We refer to this as the ideal number of classes. We calculated the total number of classes that were offered to each child and the total number of classes that each child attended. Where more than 360 classes were offered or attended, we took the number(s) to be 360. We defined adherence at child level as the proportion attended of ideal, proportion offered of ideal, and proportion attended of offered. We also did this at village level by calculating mean proportions within each village.
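The adherence definitions above can be written out as follows. This is a minimal sketch: the function names are hypothetical, and returning no value for "attended of offered" when no classes were offered is an assumption, since the text does not say how that case was handled.

```python
from statistics import mean

IDEAL_CLASSES = 360  # 6 classes a week for 60 weeks, as stated in the text


def child_adherence(offered, attended, ideal=IDEAL_CLASSES):
    """Return the three child-level adherence proportions, capping counts at the ideal."""
    offered = min(offered, ideal)
    attended = min(attended, ideal)
    return {
        "attended_of_ideal": attended / ideal,
        "offered_of_ideal": offered / ideal,
        # Assumption: undefined (None) when no classes were offered.
        "attended_of_offered": attended / offered if offered else None,
    }


def village_adherence(children):
    """Mean of each child-level proportion within a village.

    children: list of (offered, attended) pairs for one village.
    """
    rows = [child_adherence(offered, attended) for offered, attended in children]
    summary = {}
    for key in rows[0]:
        values = [row[key] for row in rows if row[key] is not None]
        summary[key] = mean(values) if values else None
    return summary


# Hypothetical child offered 400 classes (capped to 360) who attended 180.
example = child_adherence(offered=400, attended=180)
print(example)
print(village_adherence([(400, 180), (360, 360)]))
```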

As well as recording attendance in classes, PIs recorded the attendance of caregivers at caregivers’ group meetings and asked whether they had carried out the language and mathematics activities/exercises with their children. These meetings occurred every two weeks so that caregivers could share experiences and talk about doubts related to their children’s learning. Attendance at these meetings was summarized in an analogous way to attendance in classes, with ideal attendance being 24 meetings. If the caregiver did not come to the meeting it was assumed that she had not carried out the activities with the child.

Baseline data collection

Prior to randomization, data were collected on children’s gender, age, whether or not parents were alive and who their primary female and male caregivers were (mother, father, grandmother, grandfather etc.). In addition, data were collected on the caregivers’ religion, caste, literacy and education levels.

Midline test and survey

GHTC teams conducted the midline tests with trial children about six months after PI classes restarted and a month after schools reopened. These midline ASER-like tests were developed by an expert in educational evaluations using the structure and rules of the main ASER reading and mathematics tools adapted to the context of rural Madhya Pradesh (S4 Appendix). They were conducted in both trial arms between December 2021 and January 2022 with the aim of understanding the learning levels of children soon after the pandemic-long break.

To ensure efficient and timely testing, this midline test and a midline survey were conducted in two separate periods. This minimized the risk of children gaining access to the tools before being tested.

The midline survey was conducted between February and April 2022 with the main caregivers. The aim of doing this survey was to monitor residence and school enrolment of participant children (both before and after the lockdown – academic years of 2019−20 and 2020−21) and gather some information about expenses related to schooling and educational challenges faced by and support given to children a few months after primary schools reopened in Madhya Pradesh.

Data were also collected at the midline survey (or at the endline survey if missed at midline) on the materials that houses were constructed from (whether the materials used for the floor, roof and walls were synthetic or natural) and on whether household members owned a television, radio, motorbike and/or a 4-wheeled vehicle. These variables were used to construct wealth indices.

Endline tests and survey

The EGRA and EGMA protocols were adapted to the local context by the National Foundation for Educational Research (NFER, [28]) and a board of local language and curriculum experts and primary teachers. Three rounds of workshops were organized to adapt and evaluate the subtasks and the individual items in the test in terms of relevance, accordance with the curriculum, and suitability to the local context and language. The validation of the Hindi test instruments included expert review and qualitative trials with a small number of children. The validated EGRA and EGMA protocols were uploaded onto tablets using the Tangerine platform, which is designed for EGRA and EGMA (https://www.tangerinecentral.org/tangerine), to facilitate administration of the tests. This provided an opportunity for immediate online uploads of the test scores and monitoring of the data (the English versions of the full test papers are given in S5 Appendix). A quantitative pilot of the assessments was conducted with children from nearby non-trial villages to further trial the test administration process, investigate possible order effects, and inform the assessment design. NFER conducted psychometric analysis (Rasch Model) with the pilot data and the result informed the final revision of the EGRA/EGMA protocols. To minimize the risk of children sharing the test content after they had taken the test, we conducted the test in each trial village over the course of a single day.

GHTC recruited an independent team of test administrators for the Endline test. They were otherwise not involved in the trial and were unaware of the randomization. The tests were administered orally and sequentially by two groups of test administrators (one group each for EGRA and EGMA). Tests were administered in one-on-one sessions to each participant child present in the village on the day of the assessment. High attendance at the tests was ensured by the village level mobilization team (see S6 Appendix for a more detailed description of the process of data collection). Tests were conducted about one month after the intervention teams had stopped the main activities (between 24 July and 19 September 2022).

The NFER team coordinated the process of adaptation, translation, and initial monitoring of the assessment administration.

The endline survey was a brief survey conducted by GHTC between November 2022 and January 2023 with the main caregivers to assess the enrolment of participant children in school (2022–2023 academic year), their reported attendance in school during the two weeks prior to the interview, and caregivers’ engagement with children in reading and mathematics activities at home (estimated hours per week).

Statistical analysis

The main analysis was conducted according to the intention-to-treat principle. All enumerated children satisfying the eligibility criteria were included in the primary analysis. The primary outcome of the trial was the composite literacy and numeracy test score. Each test score was calculated as a simple arithmetic mean of the percentage of correct answers on each of the subtasks, evenly weighting each task and not accounting for time remaining (for subtasks composed of two parts (a and b), we used a mean of the two results): for a detailed description of the statistical analysis see the Statistical Analysis Plan [22]. Summary statistics were reported at both child level (in the main paper) and at cluster level (in S7 Appendix).

In the primary analysis the composite language and mathematics test scores at follow-up were compared using a linear regression model with the clustered sandwich estimator of variance (allowing for clustering at the level of the village). Covariates were randomization arm and the randomization stratification variables. The primary analyses were conducted using scores calculated on a percentage scale. We also present the comparison as a standardized effect to allow comparison with other studies. No external standard deviation (SD) was available, so the standardized effect was estimated by fitting a linear mixed model with the same covariates as above to the scores on the percentage scale. The linear mixed model included cluster-specific random effects, with the variance of these random effects and the residual variance allowed to differ between randomization arms. The standardized effect was estimated by dividing the estimated adjusted difference in scores by the estimated total SD in the control arm (this being the square root of the sum of the between- and within-cluster variances in the control arm). A nonparametric bootstrap confidence interval (bias corrected and accelerated, 2000 replications at cluster level, stratified by randomization arm) was computed for this standardized effect.
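The standardization step described above reduces to one line of arithmetic once the mixed model has been fitted. The sketch below shows only that final step, with round placeholder inputs that are not trial estimates; fitting the mixed model and computing the bootstrap interval are not shown.

```python
from math import sqrt


def standardized_effect(adjusted_difference, between_var_control, within_var_control):
    """Adjusted difference divided by the total SD in the control arm.

    The total SD is the square root of the sum of the between-cluster and
    within-cluster variance components estimated in the control arm.
    """
    total_sd_control = sqrt(between_var_control + within_var_control)
    return adjusted_difference / total_sd_control


# Placeholder values chosen for easy arithmetic (total SD = 20).
effect = standardized_effect(
    adjusted_difference=10.0, between_var_control=100.0, within_var_control=300.0
)
print(effect)
```

Using the control arm SD as the denominator means the standardized effect is not affected by any change in score variability caused by the intervention itself.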

We used similar linear regression models to the main analysis to explore the effect of interactions between baseline factors and the intervention on the primary outcome. These factors were village population size, gender, two wealth indices, caste, female caregiver literacy and male caregiver education (the male caregiver was often not present to read the sentence used to assess literacy, meaning many cases were missing). The data used to compute the two wealth indices were collected post-randomization, but it is implausible that the intervention affected these variables. In addition to these pre-specified analyses, we carried out a post-hoc interaction analysis investigating whether the effect on the primary outcome differed according to whether or not there was a pregnancy in the household as recorded in the CHAMPION2 trial. This was done to assess whether household involvement in the CHAMPION2 trial might have affected the STRIPES2 primary outcome. Were the CHAMPION2 intervention to have materially affected STRIPES2 outcomes, it is likely that the most marked effects would be seen in children from households where there was a pregnancy.

For the primary outcome we also conducted a pre-specified per-protocol analysis comparing those with high adherence to all controls. High adherence was defined as attending more than 75% of the ideal number of before/after-school classes. We also utilized an instrumental variables approach to explore how the primary outcome varied with attendance as a percentage of ideal in a post-hoc analysis. We adopted the two-stage structural mean model (SMM(G)) approach described by Maracy and Dunn (2011) [29], assuming a linear relationship between attendance and outcome, stratifying by the randomization stratifiers and computing bootstrap 95% confidence intervals (bias corrected and accelerated, 2000 replications at cluster level, stratified by randomization arm) for the whole process.

We estimated the effects of the intervention on reading and mathematics test scores at midline, learning support (hours spent engaging child in reading or writing activities) and number of school days missed in the previous two weeks using analogous linear regression models to those used in the main analysis of the primary outcome. The intervention effect on whether the child was enrolled in school was expressed as an odds ratio with a 95% confidence interval obtained from a GEE model with a binary outcome, a logit link, and a ‘working’ assumption of independence, with robust standard errors to take account of clustering.

In a secondary analysis of the primary outcome, we addressed missing data using multiple imputation by chained equations. We included as auxiliary variables the randomization stratification factors, caste, gender, male and female primary caregiver literacy, the wealth indices, the adherence to intervention variables defined above, the midline test scores, school enrolment at endline, number of school days missed in the two weeks before endline interview, number of hours the caregiver spent engaging child in reading or writing activities post lockdown, caregiver’s report of school attendance, whether or not the child was enrolled in school pre and post the COVID-19 lockdown, school grade at endline, the child’s residence status and the variables quantifying the learning support (and spending) provided by family, school teachers, NGOs and/or private tutors during the time when schools were closed. We performed multiple imputation using the “jomo” package in R [30], which is able to impute clustered data. We carried out imputations separately by trial arm; we used 20 imputations.

In a further secondary analysis, we considered the composite score and the reading test score with one problematic item omitted. Soon after the tests started being administered, the NFER consultant (unaware of randomization status) noticed an issue with a reading comprehension task (first item in subtask 5b), as fewer children than expected were getting that item correct and the result was somewhat inconsistent with individual performances in the rest of the test. The question asked the child what the day/weather was like, after the child had read a related text about it being a windy day. This could have happened because in some parts of India, wind is associated with good weather (because it makes the usually hot days more pleasant), hence quite a number of children answered “good, nice” and sometimes “cool”, or because the question asking about the state of the weather was not accurately translated. Given that this issue was not flagged during the validation or the training, we decided to carry on conducting the tests as they were and conduct a sensitivity analysis omitting the score from EGRA subtask 5b question 1, which was judged to be potentially misleading.

We used a significance level of 0.05 and report 95% confidence intervals. For the interaction tests, claims of different effects in subgroups were only made if there was strong evidence (p < 0.01) of an interaction. We conducted all analyses apart from the multiple imputation in Stata 18 [31]. Full details of analyses are described in the statistical analysis plan [22].

Cost effectiveness

Cost effectiveness was calculated using actual budget expenditures. In 2020 and 2021, the before/after school classes had to be suspended for 10 and a half months because of COVID-19 restrictions. Although salaries continued to be paid normally for all the staff working for the program, we present the results of cost effectiveness per child both including and excluding the expenses during the period that these classes were not being conducted due to the COVID-19 pandemic.

For cost effectiveness, we assumed the total number of children impacted by the program to be the numbers taking the endline test.

Annual costs were converted to 2021 prices using the Indian GDP deflator, and then adjusted to USD using the average annual Indian Rupees (INR) to US dollars exchange rate.
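The two-step conversion described above (deflate to 2021 prices, then convert to USD) can be sketched as below. All numbers in the example are placeholders, not the deflator or exchange-rate values used in the trial, and the function name is hypothetical.

```python
def to_2021_prices_and_usd(cost_inr, deflator_cost_year, deflator_2021, inr_per_usd):
    """Convert a cost in INR from its own year to 2021 INR and then to USD.

    deflator_cost_year / deflator_2021: GDP deflator index in the year the
    cost was incurred and in 2021 (same base year for both).
    inr_per_usd: average annual exchange rate (assumed here to be for 2021).
    """
    cost_2021_inr = cost_inr * deflator_2021 / deflator_cost_year
    cost_usd = cost_2021_inr / inr_per_usd
    return cost_2021_inr, cost_usd


# Placeholder inputs: 1,000 INR spent when the deflator index was 90.
inr_2021, usd_2021 = to_2021_prices_and_usd(
    cost_inr=1000.0, deflator_cost_year=90.0, deflator_2021=100.0, inr_per_usd=75.0
)
print(round(inr_2021, 2), round(usd_2021, 2))
```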

Results

The flow diagram in Fig 2 shows the selection of villages and flow of participants throughout the study. At randomization it was thought that there were 7103 children (3419 in the intervention arm, 3684 in the control arm) included in the trial. However, during the course of the trial it was discovered that 31 of the children had been erroneously enrolled twice, leaving 7072 (3405 in the intervention arm, 3667 in the control arm).

Fig 2. Flow diagram showing the selection of villages and the flow of participants.

https://doi.org/10.1371/journal.pone.0330203.g002

3054 out of 3405 (89.7%) children in the intervention arm and 3275 out of 3667 (89.3%) children in the control arm completed the final test. For the Endline survey, 3163 of 3405 (92.9%) caregivers in the intervention arm and 3358 of 3667 (91.6%) caregivers in the control arm were interviewed.

Baseline characteristics of villages and children are shown by trial arm in Tables 2 and 3 respectively. Table 3 and all subsequent tables present results at the child level (cluster level means and standard deviations are given in S7 Appendix). Following recommended best practice [32–34], we do not conduct statistical tests for differences in baseline characteristics, as any differences would necessarily have arisen by chance. Villages in the two arms were very similar in terms of size and distance to the nearest hospital/community health center. Children in the intervention and control arms were similar in all characteristics surveyed at baseline, with the gender ratio, and distributions of family caste, caregivers’ education, caregivers’ literacy, and children’s age all being comparable. In both arms, almost all families were Hindu. Almost all children in both control and intervention arms had biological parents as the main caregivers. There was little missing data for most variables, the exception being male caregiver literacy (assessed by handing over a literacy card to be read aloud), because many fathers were not present at the time of the enrolment interview.

Migration and mortality

The participant children’s residence status was verified at midline (March 2022) and endline (December 2022). A high percentage (89.6% in the intervention arm, 89.0% in the control arm) of the primary caregivers completed the midline survey interviews, with the percentage being slightly higher at the endline survey (92.9% in the intervention arm, 91.6% in the control arm). Table 4 shows the percentage of children who were resident in the village in the intervention and control groups at both surveys conducted during the trial. In both intervention and control arms, very few children were not resident in their village at each of the surveys. In total, 40 children died during follow-up (16 in the intervention arm and 24 in the control arm).

Table 4. Children resident in the study villages at midline and endline.

https://doi.org/10.1371/journal.pone.0330203.t004

Adherence and attrition

Adherence measurement results can be found in Table 5. Adherence varied markedly between children. In some intervention arm villages, classes ran for up to 23 months (October 2019 to March 2020, January 2021 to April 2021 and July 2021 to July 2022 (inclusive)), longer than the initially planned ideal length of 17 months. However, there were significant disruptions due to the COVID-19 pandemic in many villages, impacting the continuity of classes even during these periods, leading to interruptions and potential gaps in learning for the children. On average 77% of the planned ideal number of classes were offered to children, with the average number of classes attended being 53% of that regarded as ideal. Only 37% of children attended more than 75% of the number of the classes that were regarded as ideal before the trial started. There were no substantial differences in patterns of attendance according to baseline characteristics (Table S4A in S7 Appendix).

Table 5. Adherence: Before/after-school classes attended in the intervention arm.

https://doi.org/10.1371/journal.pone.0330203.t005

In the intervention arm 3054 out of 3405 children (89.7%) contributed the primary outcome as did 3275 out of 3667 (89.3%) children in the control arm.

Table 6 shows attendance at the caregivers’ meetings. 75% of caregivers attended at least one meeting, but only 32% attended more than 75% of the number regarded as ideal. Amongst those who attended, only a few reported that they had not done the language and mathematics activities.

Table 6. Attendance at the intervention arm caregivers’ group meetings.

https://doi.org/10.1371/journal.pone.0330203.t006

Primary outcome

The primary outcome from the endline test was obtained in 3054 out of 3405 children (89.7%) in the intervention arm and 3275 out of 3667 (89.3%) in the control arm. The primary outcome (composite score) and total scores for the reading (EGRA) and mathematics (EGMA) tests are displayed by randomization arm in Fig 3 with summary statistics shown in Table 7. On average, intervention children scored 14.17 percentage points higher than control children on the tests (95% CI 11.36, 16.97; p < 0.001), adjusting for the randomization stratification factors and including all children who did the test (even those who did not attend classes or attended very few of them). The standardized intention-to-treat estimate was 0.580 (95% CI 0.466 to 0.706). In the same table, we present the results for the total mathematics and reading scores separately. The differences between intervention and control children’s overall mathematics and reading test scores were similar to their difference in the primary outcome and both were highly statistically significant: 13.90 (95% CI 10.91 to 16.88) and 14.44 (95% CI 11.54 to 17.34) percentage points respectively, both p < 0.001. Results without the problematic item 5b were almost identical (Table S5 in S7 Appendix).

Table 7. EGRA and EGMA composite and separate total scores.

https://doi.org/10.1371/journal.pone.0330203.t007

Fig 3. Distribution of child endline test scores by intervention arm.

https://doi.org/10.1371/journal.pone.0330203.g003

Baseline characteristics in those children who did and did not take the test and contributed the primary outcome were similar (Table S4B in S7 Appendix) with minor differences being consistent in size and direction across the two trial arms (for example the percentages of children where the biological mother and father were not the primary caregivers were slightly higher when the test result was not taken in both trial arms). In the intervention arm the proportion of children with the primary outcome increased with increasing attendance at the before/after school classes (with 280/434 of those attending no classes contributing the primary outcome compared with 413/425 in those attending 100% of the classes, Tables 5 and 7). Our multiple imputation sensitivity analysis explored the potential for this and other differences between those with and without the primary outcome to introduce bias. With multiple imputation the estimated intervention effect was 14.04 (95% CI 9.17, 18.90; p < 0.001), adjusting for the randomization stratification variables, very similar to the result from the primary complete-case analysis.

Per protocol analysis and dose-response relationship with class attendance

The per-protocol population was defined as the children enumerated in the intervention villages who attended more than 75% of the ideal number of before/after-school classes. There were 1259 (37%) such children in the intervention arm. The mean composite score was 66.19 (SD 19.42) among intervention children who attended more than 75% of before/after-school classes. The per-protocol estimate was 23.77 (95% CI 20.86, 26.68; p < 0.001) percentage points. In a post-hoc analysis we explored how the effect of the intervention varied according to attendance at the classes. The mean composite score in the intervention arm increased with increasing attendance (Table 7) being similar to that in the control arm when no classes were attended and rising to 71.17 when attendance was ideal. Using the structural mean model (SMM(G)) described by Maracy and Dunn [29] it was estimated that each 10 percentage point increase in attendance increased the composite score by 2.63 (95% CI 2.13, 3.16).

EGRA and EGMA subtasks

Table 8 shows scores for each EGRA subtask, with the average percent correct for all subtasks, and the fluency scores for timed subtasks. Intervention children outperformed control children in reading in all subtasks, and the control-intervention difference was at least 10 percentage points (out of 100) for each. Children in the intervention arm demonstrated higher reading skill mastery across subtasks of all difficulty levels.

Analogous results from the EGMA subtask are shown in Table 9. The mean scores for children in the intervention arm are higher for all mathematics tasks, and the differences are broadly similar in magnitude to the ones observed in the reading tasks.

Subgroup analyses

The results of subgroup analyses of the primary outcome by pre-specified variables of interest based on characteristics of the child and village are shown in Table 10. Numbers of observations, means and standard deviations are given by trial arm for each level of the moderators. As per our statistical analysis plan, we considered male caregiver education, rather than male caregiver literacy, as a potential moderator due to the large amount of missing data for the latter variable. The intervention effect was broadly consistent across subgroups, albeit with a suggestion of larger effects in poorer households. In particular, there was statistically significant evidence at the 1% level of a differential impact of the intervention by wealth and female caregiver literacy. Our results suggest that the positive gain in learning was higher among children from poorer households (as per the materials households are made of – wealth index 1). The children whose female caregivers were less literate also had higher gains compared to children of female caregivers who could read the entire sentence. A smaller difference is seen between the intervention effects in boys and girls, where the positive impact of the intervention seems higher in girls, albeit with the p-value being above the threshold (p = 0.01) we used for these interaction tests. There is also a suggestion of a larger difference in smaller villages: the 95% confidence interval here is wide because this is a completely between cluster comparison. In a post-hoc analysis there was no evidence that the intervention effect was different when there was a pregnancy in the household.

Table 10. Composite test scores by subgroup, with interaction tests.

https://doi.org/10.1371/journal.pone.0330203.t010

School enrolment and attendance

School enrolment is presented in Table 11. School enrolment levels were high for both intervention and control children in the academic years of 2019−20, 2021−22, and 2022−23. Differences in the percentage enrolled between the control and intervention arms were small at all three time points, albeit statistically significant at endline (2022−23), the pre-specified hypothesis testing time.

At the endline survey, caregivers were asked to report the number of days children had missed in school over the past two weeks (relative to the day of the interview). Attendance reported by the caregivers was similar for both arms (intervention arm mean 2.81 days missed; control arm mean 2.89 days missed; adjusted difference −0.10 days, 95% CI −0.40, 0.20; p = 0.492; full details in Table S10 in S7 Appendix).

Learning support

There was no evidence (Table 12) that caregivers in the intervention arm gave more time to support their children with learning-related activities, despite efforts from the intervention teams to engage caregivers (primarily mothers) in their children’s learning (through home visits, telephone messages and caregivers’ meetings).

Midline test and survey results

Table 13 shows the results of the midline tests. In these ASER-like tools, the tasks increase or decrease in complexity depending on whether a child can comfortably solve a given task according to the rules of the test (S4 Appendix). The midline test results show that basic literacy and numeracy were statistically significantly higher among children in the intervention arm. In mathematics, children in the intervention arm scored significantly higher than those in the control arm. In the most difficult task, subtraction with borrowing, there were 13 percentage points more children in the intervention arm who could solve two operations correctly. The differences in the reading test are sizeable, with 39% of the intervention children reading a grade 2 level short story (i.e., fluently and with ease at a good pace and making 3 or fewer mistakes) compared to 23% of the control arm children.

Table 14 shows access to technology, learning materials and help at home during the time schools were closed (due to COVID-19) for children in the intervention and control arms separately, whilst Table 15 shows the expenses related to schooling and learning done by caregivers during the first academic year that schools were closed. There were no large differences in caregivers’ support at home for studying. Most caregivers in both arms were helping their children to study at home. There were not many families who bought new devices in either arm. Most children were studying through textbooks and worksheets (many of them provided by the school) and had very limited access to tablets. The percentage of children with access to a smartphone was slightly higher among intervention children (24%) than control children (17%).

Table 14. Support during the COVID-19 schools’ closures (midline).

https://doi.org/10.1371/journal.pone.0330203.t014

Table 15. Adult caregiver spending (INRs) on education between July 2020 and June 2021.

https://doi.org/10.1371/journal.pone.0330203.t015

Costs and cost-effectiveness

The primary expenditures of the intervention program along with their corresponding percentages are presented in Table 16. Forty-three percent of expenditures were on teaching staff (PIs) and 20% of spending covered cluster supervisors and manager salaries. Although PI salaries accounted for a large portion of the expenses, the net amount they received per month was 4,362 INR, in line with the government’s salary guidelines for half-time, unskilled labor. This amount is substantially lower than what local government teachers earn. For example, a third-grade teacher receives a minimum salary of 25,000 INR per month plus dearness allowance during their two-year probationary period, with the total nearly doubling after probation. There were minor capital costs (laptops and tablets), and these were fully depreciated over the three years of the trial. Since the program had been previously developed by the implementing partner, Pratham, only minor costs related to the development and piloting of the activities and materials for the before/after school classes were included.

The main components of Pratham’s education program were highly affected by the disruptions caused by COVID-19. In our primary analysis we therefore calculated the cost per child excluding the ten-and-a-half-month period in which the before/after school classes could not be conducted. Excluding the two periods of interruption (mid-March to October 2020 and April to June 2021) and converting all the prices to 2021 prices using the India GDP deflator, the cost per child was 13,631 INR (184 USD) and the cost per 0.1 SD improvement in learning was 2,476 INR (33.5 USD). Equivalently, the intervention yielded a gain of 0.299 SD per $100 spent.

Using the same price conversion, and accounting for the period PIs could not conduct before/after school classes due to COVID-19 restrictions, the cost per child in 2021 values would be 20,213 INR (279 USD), and the cost per 0.1 SD improvement would be 3,744 INR (50.66 USD), a gain of 0.197 SD per $100 spent.
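The two ways of expressing cost-effectiveness used here are reciprocals of one another, which can be checked from the reported USD figures. The sketch below is a simple arithmetic check; the function name is hypothetical.

```python
def sd_gain_per_100_usd(cost_per_tenth_sd_usd):
    """Convert a cost per 0.1 SD improvement (USD) into SD gained per 100 USD."""
    return 0.1 / cost_per_tenth_sd_usd * 100


# Reported costs per 0.1 SD improvement, in USD.
gain_excluding_closure = sd_gain_per_100_usd(33.5)
gain_including_closure = sd_gain_per_100_usd(50.66)
print(round(gain_excluding_closure, 3), round(gain_including_closure, 3))
```

Both values reproduce the gains per $100 quoted in the text to three decimal places.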

Discussion

In this study we addressed the potential for improving foundation skills in reading and mathematics among children in their first years at primary school in rural villages of Madhya Pradesh, India. We showed that this strategy was very successful, with children in the intervention arm scoring materially better in all reading and mathematics subtasks of EGRA and EGMA tests. In the composite score, the difference was 14.17 percentage points and highly statistically significant (95% CI 11.36, 16.97; p < 0.001).

The main components of our intervention were: (i) supplementary classes (either before or after school) provided by a female para-educator (PI) from the community, (ii) involvement of caregivers in their children’s education (through proposed activities via mobile text messages, home visits by para-educators and biweekly meetings), and (iii) implementation of teaching-learning methods and materials together with frequent training of PIs.

Supplementary teaching with para-instructors and volunteers has shown large effects, increasing literacy and numeracy among primary-school age children in India, The Gambia, China and Chile [11,17,18,24,35,36]. Experiences in India and The Gambia showed that para-teachers (or in some cases volunteers) who receive regular monitoring and coaching and are trained on structured lessons are highly effective [11,17,24]. This is likely to be because para-instructors, like contract teachers, feel highly motivated due to their short-term contracts and tend to be more open-minded about new educational methods [17,24,37]. On the other hand, implementing regular extra classes outside school can be challenging, especially with young children in rural settings. In fact, this was one of the limitations of our study, as discussed below.

The second important strategy of our intervention was the frequent and in-depth interaction between PIs and the main caregiver. The main purpose of these interactions was to promote an environment at home to boost learning through simple activities tailored to the child’s learning level and get regular feedback on the child’s progress. This was done through daily mobile messages, weekly home visits and biweekly group meetings. Evidence shows that the home learning environment is important to foster children’s literacy and numeracy development [38–40]. A study targeting mothers in rural India found that mathematics skills of young children as well as other aspects of the home learning environment improved when mothers (some illiterate) were encouraged and guided to help with children’s learning [38]. Parents’ involvement in their children’s education became even more important when face-to-face contact was restricted due to the COVID-19 pandemic. In fact, exchanging phone messages became the backbone of the ongoing interactions in several programs implemented by Pratham during 2020 [41]. Our results suggest that caregivers in the control and intervention arms spent similar amounts of time engaging with children while studying at home (Table 12). However, this does not imply that encouraging and guiding caregiver-child interactions has no effect on learning outcomes. The impact may lie in the quality of the engagement rather than the amount of time spent with the child.

A key aspect of the intervention was the pedagogical approach, the “teaching-learning method”, which is designed specifically for early education (Grades 1 and 2). This method goes beyond the curriculum content and also emphasizes the child’s emotional, social, and physical development. The PIs began classes by discussing with the children their recent experiences and the emotions associated with them. After this, the children participated in physical activities and engaged in group tasks to develop their social skills.

Important methodological strengths of the study are its randomized design, large size, careful implementation, and thorough data collection (with a rigorous follow-up of the trial children done by independent teams) (see S6 Appendix). The high proportion of randomized children taking the endline test (close to 90% in both arms) is a major strength. The fact that this proportion was very similar in the two arms (as was data collection in general) suggests that simultaneously carrying out the STRIPES2 and CHAMPION2 trials was successful in encouraging STRIPES2 control children and their families to contribute data.

It is theoretically possible that CHAMPION2’s monthly health promotion activities, such as participatory women’s groups, may have raised awareness around maternal and child health, hygiene, or caregiving practices at the community level. Because all STRIPES2 control villages received the CHAMPION2 intervention this cannot be investigated empirically, but we believe that due to the focus (pregnancy and neonatal care) and low frequency of these activities, it is very unlikely that they influenced parents’ behaviour related to early childhood education or cognitive development. It is also theoretically possible that children in those households where a woman became pregnant and received CHAMPION2 services benefited from improved caregiver knowledge, more caregiver free time and reduced stress. To investigate this possibility, we carried out a (post-hoc) analysis investigating the extent to which the effect of the STRIPES2 intervention differed according to whether or not the child’s household had active participation in CHAMPION2 through a household member becoming pregnant. The estimated interaction effect was small and not statistically significant, suggesting that the simultaneous implementation of the CHAMPION2 intervention in the STRIPES2 control arm did not materially bias results.

We had low attrition rates, and they were very similar in the control and intervention arms. Our primary outcome (EGRA and EGMA composite test score) was missing for about 10% of the children in the intervention arm (351/3405) and 11% in the control arm (392/3667). The results of the sensitivity analysis using multiple imputation and the primary analysis were very similar, with the estimated intervention effect being around 14 percentage points in both analyses.
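As a quick check of the attrition figures, the missing-outcome proportions can be recomputed from the counts given above. This is a minimal sketch; the dictionary layout and names are illustrative assumptions, and the denominators are taken to be the number of trial children in each arm as reported in the text.

```python
# Illustrative sketch: percentage of children with a missing primary outcome,
# using the counts reported in the text (missing, total children in arm).
arms = {
    "intervention": (351, 3405),
    "control": (392, 3667),
}

missing_pct = {
    arm: round(100 * missing / total, 1) for arm, (missing, total) in arms.items()
}

for arm, pct in missing_pct.items():
    print(f"{arm}: {pct}% missing, {round(100 - pct, 1)}% with endline score")
```

These values (about 10.3% and 10.7% missing) are consistent with the statement that close to 90% of children in both arms took the endline test.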

Interestingly, our subgroup analysis (Table 10) reveals that the intervention has a somewhat larger impact on children coming from poorer families and those with female caregivers with lower literacy levels. Similarly, a randomized controlled trial that offered community-based preschools in Mozambique found larger effects on primary school enrolment and learning outcomes for children from poorer households [42].

The study has a few key limitations. First, it was not possible to blind the participants due to the nature of the intervention. However, outcome assessors who administered the EGRA and EGMA tests were blinded. Second, we do not know whether the effect of the intervention persisted after the intervention was completed, as our study did not include further follow-up.

The supply of Pratham Instructors (PIs) was a major challenge. PIs should ideally be from the local area to ensure a deeper understanding of the community’s needs and culture. Throughout the trial many PIs married and were absent due to maternity leave, which significantly disrupted class schedules, and finding a substitute for these PIs proved to be a complex process. Moreover, this significantly added to the cost of the intervention.

Another limitation is the relatively low adherence to the intervention. Just 37% of children in the intervention villages attended 75% or more of the ideal number of before/after-school classes (our definition of high adherence, as defined prior to the start of the statistical analysis). Many children could not come to the classes because the venue where these before/after-school classes were conducted was too far from their houses. Even though the Pratham team had access to a detailed map with the location of all participants’ households (prepared by GHTC), it was challenging to identify a spot that would be easily accessible by all the participants. In rural Madhya Pradesh, households are typically highly dispersed within villages. Pratham instructors anecdotally reported that exhaustion was also a reason given by some caregivers to explain why some children were not coming to classes. They also reported that in some villages, it was difficult to find a time when all the children could attend the before/after school classes. Our findings suggest that the magnitude of the impact could have been greater with improved adherence to the before/after-school classes. The per-protocol analysis shows that children in the intervention villages who attended more than 75% of before/after-school classes had a mean score of 66.19 (SD 19.42), with a treatment difference that was about ten percentage points higher than that in the intention-to-treat results.

We cannot completely rule out the possibility of spillovers through community meetings or shared resources in some villages, although we believe that this did not occur to any material extent. Such spillovers would tend to reduce the observed effects of our intervention, not increase them. Pratham also took measures to try to ensure that control group children did not attend classes in intervention villages. Registers of children eligible to attend classes were maintained, and though on occasion additional children were allowed to attend classes, Pratham carried out visits to households to ensure that such children were resident in the intervention villages. Before randomization, we explored the possibility of having a larger distance between villages, but this would have drastically reduced the number of villages in the trial. Thus, the decision to go for 3 km buffer zones was pragmatic.

Finally, the COVID-19 pandemic caused considerable disruption to the original plans. In March 2020, all activities had to be suspended a few months after their launch in the communities because of the imposed severe lockdown due to COVID-19. India experienced “lockdown” and movement restrictions for several months in 2020 and in 2021, and schools remained closed for almost two years. Not all Pratham Instructors had been hired when lockdown was imposed and therefore in about 18 villages the activities started much later. Even when “lockdown” restrictions were loosened, activities such as school-going or attending neighborhood community-based classes did not resume immediately. These interruptions inevitably affected the process of learning.

Cost-effectiveness

In the STRIPES2 trial, cost per child (13,631 INR, 184 USD) and cost per 0.1 SD improvement (2,476 INR or 33.5 USD) were substantially higher than in the original STRIPES trial [18], where the cost per child in 2021 values was 4,143 INR (56 USD), and the cost per 0.1 SD improvement was 557 INR or 7.54 USD. Equivalently, STRIPES yielded 1.326 SD per $100 spent, while STRIPES2 yielded only 0.299 SD per $100 spent.

This relatively lower cost-effectiveness observed in the STRIPES2 trial on children’s outcomes can be attributed to several factors. The higher costs primarily reflected substantially higher labor costs, as well as the additional costs added due to an extra effort by the Pratham teams to compensate for the learning losses during the time schools were closed when the contact with children and parents was limited to mobile text messages. These additional costs reflect both a greater outlay of raw resources in the STRIPES2 intervention, and a greater expense of operating in India (holding resource levels constant) today as compared to then, given India’s rapid development and steady inflation over the 13 years that have elapsed since the end of the original study.

The STRIPES2 program design was labor intensive as we needed to hire many Pratham Instructors. This was partly needed because of the trial structure, given that houses in our trial area were spread over a large area and therefore it was challenging to visit all families and ensure that all trial children would come to the before/after-school classes. The overhead costs for management of the trial could be reduced if implemented within the existing educational structures. For that, teachers need to be fully supported by the existing structures, have clear instructions to implement the methods and devote a time slot of the daily class to apply them. As an example of this approach, a randomized evaluation of a teacher incentive program [43] found that children in the treatment groups, where teachers received incentives calibrated to 3% of salaries, scored 0.17 and 0.27 SD higher than controls on language and math, respectively. For a teacher earning a salary of 600,000 INR per year and teaching 20 students, these incentives would cost 1,800 INR per student over two years, markedly less than the cost per student reported for STRIPES2. Another option is to train volunteers to lead the activities in the schools. These two models have proven to be the most effective when scaling up the “Teaching at the Right Level” methods [11].
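The per-student incentive cost quoted above follows directly from the stated salary, incentive share, class size and duration. The following is a minimal sketch of that calculation; the function and parameter names are illustrative assumptions, and the comparison with the STRIPES2 cost per child (13,631 INR) is only a rough ratio, since the two programs differ in content and setting.

```python
# Illustrative sketch: cost per student of a salary-linked teacher incentive.
# All inputs are the figures stated in the text; names are hypothetical.

def incentive_cost_per_student(
    annual_salary_inr: float,
    incentive_percent: float,
    class_size: int,
    years: int,
) -> float:
    """Total incentive paid over `years`, divided by the number of students."""
    annual_incentive = annual_salary_inr * incentive_percent / 100
    return annual_incentive * years / class_size


cost = incentive_cost_per_student(
    annual_salary_inr=600_000,  # teacher salary per year
    incentive_percent=3,        # incentive calibrated to 3% of salary
    class_size=20,              # students per teacher
    years=2,                    # duration considered in the text
)
print(f"Incentive cost per student over two years: {cost:.0f} INR")

# Rough ratio against the STRIPES2 cost per child in the primary analysis.
stripes2_cost_per_child_inr = 13_631
ratio = stripes2_cost_per_child_inr / cost
print(f"STRIPES2 cost per child is about {ratio:.1f} times higher")
```

The sketch returns 1,800 INR per student, matching the figure in the text.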

The STRIPES2 trial was also costlier than a similar project implemented in The Gambia (pre-COVID-19). Although the average cost per child (adjusted to endline test takers) in the SCORE trial in The Gambia [24] was high, the improvement in test scores was greater, and the cost per 0.1 SD improvement was 25.59 USD (i.e., 0.389 SD per 100 USD). In rural Gambia, learning levels are much lower and learning trajectories much flatter than in India, allowing more room for improvement there. Therefore, cost-effectiveness will also vary depending on the existing learning levels.

Conclusion

STRIPES2 results showed that bundled interventions that combine community-trained para-instructors, caregivers’ engagement in their child’s learning and teaching-at-the-right-level methods can accelerate the end of learning poverty in areas where learning has been low and stagnant for many years. This is consistent with other studies that found that well-structured supplementary education programs can have a large positive impact on learning levels. The approach used in STRIPES2 could be applied in numerous other settings, in India and beyond, which closely resemble our trial area in terms of size, remoteness and level of services provided by the government. Although learning gains were not dramatic relative to the high unit cost, which was driven by trial logistics and labor costs, the intervention may be sustainable if the package can be integrated into existing services, either in public schools by existing teachers or in the community by unpaid volunteers.

Supporting information

S1 Appendix. Manual for instructors/teachers: warm up – phase I language & math class 1 & 2.

https://doi.org/10.1371/journal.pone.0330203.s001

(PDF)

S3 Appendix. Manual for master trainers – grades 1 and 2.

https://doi.org/10.1371/journal.pone.0330203.s003

(PDF)

S7 Appendix. Statistical analysis – full results.

https://doi.org/10.1371/journal.pone.0330203.s007

(PDF)

S1 File. Inclusivity in global research questionnaire answered.

https://doi.org/10.1371/journal.pone.0330203.s011

(DOCX)

Acknowledgments

We would like to acknowledge all the PIs, volunteers, and shopkeepers who kept the village libraries for their work in implementing the STRIPES2 intervention; all the teams of cluster leaders and other members of the Pratham team; Jitendra Ahirwar for helping develop the content; Nikhil Swaminathan and Varsha Hari Prasad for helping develop the internal measurement systems and processes for STRIPES2; Ketan Verma for helping to develop the midline assessment; and Ravi Prakash Rudroj for his contribution to the development and implementation of the endline assessment. We would like to acknowledge the VEs, Test Administrators, data supervisors, cluster coordinators and other members of the GHTC team for their work in enumerating children, monitoring and achieving high follow-up rates for tests and parental surveys. We would like to extend our deepest gratitude to all the children and their caregivers for their participation in STRIPES2 and to village heads (Sarpanchs) and villagers for their continued support throughout the trial period. We also want to thank the members of the Trial Steering Committee (TSC) and the Data Monitoring Committee (DMC) for their contribution in providing an independent view, monitoring and guidance.

References

  1. Global Education Monitoring Report 2016: education for people and planet: creating a sustainable future for all; 2016. doi: https://doi.org/10.54676/AXEQ8566
  2. Pritchett L. Rebirth of education: schooling ain’t learning. Center for Global Development; 2013. Available from: https://books.google.com/books/about/The_Rebirth_of_Education.html?id=PQ72AAAAQBAJ
  3. Akmal M, Pritchett L. Learning equity requires more than equality: learning goals and achievement gaps between the rich and the poor in five developing countries. Int J Educ Dev. 2021;82:102350. pmid:33814691
  4. Pritchett L, Viarengo M. Learning outcomes in developing countries: four hard lessons from PISA-D. RISE Working Paper Series; 2021;21. doi: https://doi.org/10.35489/BSG-RISE-WP_2021/069
  5. Annual Status of Education Report (Rural) 2018. New Delhi; 2019. Available from: https://asercentre.org/aser-2018//
  6. Gust S, Hanushek EA, Woessmann L. Global universal basic skills: current deficits and implications for world development. J Dev Econ. 2024;166:103205.
  7. Chaudhury N, Hammer J, Kremer M, Muralidharan K, Rogers FH. Missing in action: teacher and health worker absence in developing countries. J Econ Perspect. 2006;20(1):91–116. pmid:17162836
  8. Glewwe P, Muralidharan K. Improving education outcomes in developing countries: evidence, knowledge gaps, and policy implications. In: Machin S, Woessmann L, Hanushek EA, editors. Handbook of the economics of education. Elsevier; 2016. doi: https://doi.org/10.1016/B978-0-444-63459-7.00010-5
  9. Evans DK, Popova A. What really works to improve learning in developing countries? An analysis of divergent findings in systematic reviews. World Bank Res Obs. 2016;31(2):242–70.
  10. Ganimian AJ, Murnane RJ, Alfonso M, Álvarez Trongé M, Barrera-Osorio F, Beuermann D, et al. Improving educational outcomes in developing countries: lessons from rigorous impact evaluations. Cambridge (MA); 2014. Report No.: 20284. doi: https://doi.org/10.3386/W20284
  11. Banerjee A, Banerji R, Berry J, Duflo E, Kannan H, Mukherji S, et al. Mainstreaming an effective intervention: evidence from randomized evaluations of “teaching at the right level” in India. National Bureau of Economic Research; 2016 [cited 2024 Mar 28]. doi: https://doi.org/10.3386/W22746
  12. Angrist N, Patrinos H, Schlotter M. An expansion of a global data set on educational quality: a focus on achievement in developing countries; 2016. Report No.: 6536. Available from: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2295861
  13. McEwan PJ. Improving learning in primary schools of developing countries. Rev Educ Res. 2015;85(3):353–94.
  14. Masino S, Niño-Zarazúa M. What works to improve the quality of student learning in developing countries? Int J Educ Dev. 2016;48:53–65.
  15. Banerjee AV, Banerji R, Duflo E, Glennerster R, Khemani S. Pitfalls of participatory programs: evidence from a randomized evaluation in education in India. Am Econ J Econ Policy. 2010;2(1):1–30.
  16. Cabezas V, Cuesta J, Gallego FA. Effects of short-term tutoring on cognitive and non-cognitive skills: evidence from a randomized evaluation in Chile. J-PAL Working Paper; 2011. Available from: https://www.povertyactionlab.org/sites/default/files/research-paper/493%20-%20short-term%20tutoring%20May2011.pdf
  17. Banerjee AV, Cole S, Duflo E, Linden L. Remedying education: evidence from two randomized experiments in India. Q J Econ. 2007;122(3):1235–64.
  18. Lakshminarayana R, Eble A, Bhakta P, Frost C, Boone P, Elbourne D, et al. The Support to Rural India’s Public Education System (STRIPES) trial: a cluster randomised controlled trial of supplementary teaching, learning material and material support. PLoS One. 2013;8(7):e65775. pmid:23874383
  19. Education – Pratham. [cited 2024 Mar 28]. Available from: https://pratham.org/programs/education//
  20. Agarwal A, Banerji R, Boone P, Elbourne D, Fazzio I, Frost C, et al. Protocol for a cluster randomised trial in Madhya Pradesh, India: community health promotion and medical provision and impact on neonates (CHAMPION2); and support to rural India’s public education system and impact on numeracy and literacy scores (STRIPES2). Trials. 2020;21(1):569. pmid:32586400
  21. Shivalli S, et al. Community health promotion and medical provision and impact on neonates (CHAMPION2) and support to rural India’s public education system and impact on numeracy and literacy scores (STRIPES2): an update to the study protocol (v 11) for a cluster randomised trial in Madhya Pradesh, India. Trials. 2024. Available from: https://trialsjournal.biomedcentral.com/articles/10.1186/s13063-024-08245-z
  22. Keddie S, Fazzio I, Shivalli S, Magill N, Elbourne D, Sharma D, et al. Statistical analysis plan for a cluster randomised trial in Madhya Pradesh, India: support to rural India’s public education system and impact on numeracy and literacy scores (STRIPES2). Trials. 2023;24(1):469. pmid:37481559
  23. Satna District Population, Madhya Pradesh, List of Tehsils in Satna. [cited 2024 Mar 29]. Available from: https://www.censusindia2011.com/madhya-pradesh/satna-population.html
  24. Eble A, Frost C, Camara A, Bouy B, Bah M, Sivaraman M, et al. How much can we remedy very low learning levels in rural parts of low-income countries? Impact and generalizability of a multi-pronged para-teacher intervention from a cluster-randomized trial in the Gambia. J Dev Econ. 2021;148:102539.
  25. COVID-19 response – Pratham. [cited 2024 Mar 28]. Available from: https://www.pratham.org/covid-19-response/
  26. Dubeck MM, Gove A. The early grade reading assessment (EGRA): Its theoretical foundation, purpose, and limitations. Int J Educ Dev. 2015;40:315–22.
  27. Platas LM, Ketterlin-Geller LR, Sitabkhan Y. Using an assessment of early mathematical knowledge and skills to inform policy and practice: examples from the early grade mathematics assessment. IJEMST. 2016;4(3):163. Available from: https://www.learntechlib.org/p/175776/
  28. Home - NFER. [cited 2024 Mar 28]. Available from: https://www.nfer.ac.uk//
  29. Maracy M, Dunn G. Estimating dose-response effects in psychological treatment trials: the role of instrumental variables. Stat Methods Med Res. 2011;20(3):191–215. pmid:19036909
  30. Carpenter JR, Kenward MG. Multiple imputation and its application; 2013. p. 1–345. doi: https://doi.org/10.1002/9781119942283
  31. StataCorp. Stata statistical software: release 18. College Station (TX): StataCorp LLC; 2023.
  32. Campbell MK, Piaggio G, Elbourne DR, Altman DG, CONSORT Group. Consort 2010 statement: extension to cluster randomised trials. BMJ. 2012;345:e5661. pmid:22951546
  33. Moher D, Hopewell S, Schulz KF, Montori V, Gotzsche PC, Devereaux PJ, et al. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ. 2010;340(mar23 1):c869–c869.
  34. Bruhn M, McKenzie D. In pursuit of balance: randomization in practice in development field experiments. Am Econ J: Appl Econ. 2009;1(4):200–32.
  35. Behrman JR, Fan CS, Wei X, Zhang H, Zhang J. After-school tutoring, household substitution and student achievement: experimental evidence from rural China; 2020. Report No.: 2020–36. Available from: https://scholars.ln.edu.hk/en/publications/after-school-tutoring-household-substitution-and-student-achievem
  36. Cabezas V, Cuesta JI, Gallego FA. Does short-term school tutoring have medium-term effects?: Experimental . . . Pontificia Universidad Católica de Chile; 221AD. Available from: https://books.google.co.in/books/about/Does_Short_term_School_Tutoring_Have_Med.html?id=PzSyzwEACAAJ&redir_esc=y
  37. Duflo E, Dupas P, Kremer M. School governance, teacher incentives, and pupil–teacher ratios: experimental evidence from Kenyan primary schools. J Public Econ. 2015;123:92–110.
  38. Banerji R, Berry J, Shotland M. The impact of maternal literacy and participation programs: evidence from a randomized evaluation in India. Am Econ J: Appl Econ. 2017;9(4):303–37.
  39. Niklas F, Cohrssen C, Tayler C. Parents supporting learning: a non-intensive intervention supporting literacy and numeracy in the home learning environment. Int J Early Years Educ. 2016;24(2):121–42.
  40. Susperreguy MI, Douglas H, Xu C, Molina-Rojas N, LeFevre J-A. Expanding the Home Numeracy Model to Chilean children: Relations among parental expectations, attitudes, activities, and children’s mathematical outcomes. Early Child Res Q. 2020;50:16–28.
  41. Banerji R. Learning for all: lessons from ASER and Pratham in India on the role of citizens and communities in improving children’s learning. In: Ra S, Jagannathan S, Maclean R, editors. Powering a learning society during an age of disruption education in the Asia-Pacific region: issues, concerns and prospects. Singapore: Springer; 2021. p. 181–94. doi: https://doi.org/10.1007/978-981-16-0983-1_13
  42. Martinez S, Naudeau S, Pereira VA. Preschool and child development under extreme poverty: evidence from a randomized experiment in rural Mozambique; 2017. Report No.: 8290. Available from: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3092440
  43. Muralidharan K, Sundararaman V. Teacher performance pay: experimental evidence from India. J Pol Econ. 2011;119(1):39–77.