Validation of the Malawi Developmental Assessment Tool for children in the Dominican Republic: Preliminary results

Background. This study initiated the validation process of a translated and adapted version of the Malawi Developmental Assessment Tool (MDAT) for children in the Dominican Republic (DR). Like Malawi before the development of the MDAT, the DR did not have early childhood development (ECD) tools explicitly designed for low-resource areas that are also valid assessments of child development. We chose the MDAT because it underwent a rigorous validation process and retained measurements of test items that were culturally adaptable from the Denver Developmental Screening Test II. We aimed to test the internal consistency and inter-rater reliability of the MDAT in children under the age of two years living in low-income neighborhoods in Santo Domingo in 2017.

Methods and findings. Forty-two children from 2 to 24 months of age (mean = 11.26, SD = 6.37, boys = 22, girls = 20) and their corresponding caregivers participated in the study. We conducted a cross-sectional, pre-experimental study. The primary outcome measure was an index of ECD, as assessed by the Dominican adaptation of the MDAT. The tool evaluates children in four domains: social, fine motor, language, and gross motor. To determine internal consistency, we obtained Spearman-Brown split-half reliability for each sub-scale. The results showed good consistency (>.6) for social, fine motor, and gross motor, and acceptable consistency (>.5) for language. Second, to test the inter-rater reliability, we conducted a Kendall's tau-b test of independence for both the general scale and each sub-scale. Significant rτ scores ranged from .923 to .966, indicating appropriate inter-rater reliability. Third, we correlated the age variable with each subscale to determine if the development scale followed a progression of abilities that are expected to increase with maturation. The age variable correlated positively with all the subscales (social r = .887, p < .001; fine motor r = .799, p < .001; language r = .834, p < .001; gross motor r = .805, p < .001), indicating that the older the child, the higher the scores on the development measurements, as expected. There were no adverse events. This study, however, has multiple limitations. We did not gather information about socioeconomic position, which is an important variable when assessing child development; however, all participants lived in a low-income neighborhood. Given that this is the first ECD tool specific to the Dominican Republic, norm-referenced scores for the Dominican population do not yet exist. The sample size of this study is insufficient to make inferences about the national population.

Conclusions. This study represents the first attempt to obtain a valid tool to screen for development milestones in children living in poverty in the DR. More research is needed to refine the instrument. The availability of the tool will enable impact evaluations of ECD intervention programs and the development of evidence-based public policies in the DR.



Introduction
Developmental assessment or screening tools provide a standardized method of assessing a child's neurological and musculoskeletal growth through the observation of the child's performance of age and culturally-specific activities [1]. The child is observed performing a set of tasks associated with specific interrelated domains and evaluated based on direct structured observations of the expected behavior, caregiver reports, or unstructured observation from evaluators [2]. As the assessment progresses, the child engages in activities of increasing difficulty [2].
There are numerous benefits associated with the availability and use of developmental screening tools. At the individual level, these screening tools help determine if a child is on track in his or her development, identify interventions to compensate for any eventual delay, and implement early interventions that help improve their health and educational outcomes [3]. At the program level, developmental screening tools are used as baseline and outcome variables in impact evaluations to help determine a program's effectiveness [4]. At the public policy level, the use of screening tools helps guide the development of evidence-based health and education policies [5].
Several tools have been created to measure early childhood development (ECD) in a range of domains, standardized with large representative samples in places that have health data readily available, piloted, and validated. These data-backed assessments of the tools' ability to assist health professionals in the measurement of ECD make them appropriate resources for assessing different aspects of child development in those locations [6]. Despite the availability of these tools and their translation into a variety of languages, they may not necessarily be adequate to measure ECD in cultural and socioeconomic contexts for which the instruments were not specifically created. For example, a study in Chile adapted the Bayley-III developmental tool and validated it with a sample of children from families of higher socioeconomic position, which was "representative of the private medical center where the study was conducted" [7]. This shows that while the adapted screening tool was valid for that specific context, it was not necessarily applicable to participants of lower socioeconomic position regardless of their shared geographic location and language. For this reason, it is essential to ensure that development tools are designed with the input of participating communities and validated with a sample representative of the specific population in which it will be used.
Children's development depends on multiple factors, including childrearing practices that are culture-specific. Therefore, using development tools without validating them in the cultural and socioeconomic context where they will be used can lead to an under- or overestimation of ECD [8]. Contextualized ECD screening tools have been developed for specific populations in India, Pakistan, and Zambia [9], Malawi [10], Sri Lanka [11], Cambodia [12], and Aboriginal Australia [13]. These tools were designed and validated with as many culture-free items as deemed possible, but also with items that account for specific population characteristics of environments that are frequently not represented by the most commonly used developmental screening tools.
In addition to having a more culturally relevant measurement to assess ECD, it is necessary for these screening tools to be accessible for projects, programs, and research at the national level. Accessibility enables the consistent use of the instrument and the standardization of ECD measurement across projects. However, commercial ECD screening tools used to measure development or to screen for developmental delay in children are expensive and are used mostly in clinical settings [14]. Tools that can help health professionals in these areas identify children at risk for developmental delays and assess whether they are developing according to their age need to be available at low or no cost to the provider to maximize their use [6].
The purpose of our study was to test an ECD tool that could be used in the Dominican Republic (DR) at the community level in a resource-poor setting and at no cost. The DR faces multiple challenges in educational attainment, as reflected by international educational reports, which show that Dominican students have the lowest scores among a subset of fifteen Latin American countries in reading, writing, and math in third and sixth grades [15]. An early literacy national study conducted in 2015 showed that second graders had still not acquired basic literacy skills [16], partly due to low oral comprehension, a skill that the education system implicitly assumes the child has acquired before entering formal educational settings [17]. In addition, no ECD testing tools have been developed specifically for the Dominican context; the only ones in use are available in private clinics, such as the Developmental Profile 3 [18] and the Denver Developmental Screening Test II (DDST-II) [19].
In our study, we aimed to test the internal consistency and inter-rater reliability of the Dominican version of the Malawi Developmental Assessment Tool (MDAT) [10] in a group of children under the age of two years in the Dominican Republic. The MDAT is an ECD screening tool that focuses on a continuum of skills from four different domains (gross motor, fine motor, language, and social) with the purpose of identifying children with severe disabilities. Like Malawi before the development of the MDAT, the Dominican Republic did not have ECD tools designed specifically for low-resource areas in the country that are also valid assessments of child development. After reviewing a variety of ECD screening tools, we chose the MDAT because it was developed for children ages 0 to 5 years, underwent a rigorous validation process informed by Malawian health workers and pediatricians, and retained measurements of test items that were culturally adaptable [6,10] from the Denver Developmental Screening Test II (DDST-II) [19]. The DDST-II is one of the most used instruments to assess child development in a short amount of time and can be used by "anyone who works well with children and meticulously follows directions for administration" [6]. These qualities are ideal for use in low-resource environments where many children must be assessed quickly and highly trained health care workers are not available.

Participants
Forty-two children from 2 to 24 months of age (mean = 11.26, SD = 6.37, boys = 22, girls = 20) and their corresponding caregiver (the mother in all cases) participated in the study. We recruited study participants in Los Guandules and Guachupita, two areas belonging to the neighborhood of Domingo Savio in Santo Domingo, the capital city of the DR. According to the latest assessment of multidimensional poverty in Santo Domingo, conducted in 2012, Domingo Savio is the city neighborhood with the highest concentration of extreme poverty (22.7%) and one of the highest in moderate poverty (57.6%), totaling 80.3% of the population living with multiple deprivations [20].
Inclusion criteria for the study were being a child from 0 to 24 months of age with a parent or guardian aged 18 years or older who understood Spanish, regardless of whether their first language was Spanish or Haitian Kreyol. Since our goal was to determine the validity of a tool that measures typical ECD, we excluded children with diagnosed developmental disabilities.
Volunteers from the Pastoral Materno Infantil (PMI) recruited participants by phone, via convenience sampling, from the pool of PMI beneficiaries. PMI is a Jesuit organization that promotes maternal and child health among low-income families throughout the Dominican Republic through trained community mobilizers who live in the community. Once the community mobilizers had identified a group of participants interested in the study, they gathered them and brought them to the evaluation setting. The Institutional Review Boards from the Universidad Iberoamericana (UNIBE) in Santo Domingo and Tulane University approved the study. We obtained oral and written consent from the child's caregiver before data collection.

Instruments
Sociodemographic interview. The interview consisted of two parts to assess participants' background and their home environment: (a) information related to the child, including presence of low birthweight and prematurity, daycare attendance, access to stimulating materials such as books and toys, and interaction with other people such as singing, speaking, and storytelling; and (b) information about the primary caregiver, including level of education and the relationship with the child.
Malawi Developmental Assessment Tool, Dominican version. At our request, the MDAT team provided us with materials to assist in our adaptation, along with thorough documentation of the process they underwent to create and validate the tool. We translated the MDAT from English into Spanish by first directly translating the MDAT and then reviewing this version with community volunteers from PMI. As part of the assessment of the translation, we adapted the tasks of the original MDAT to the Dominican context by accounting for different availability of materials and participants' familiarity with certain activities. We named this new version of the test Tamizaje de Desarrollo Infantil Dominicano (TDID) or Dominican Child Development Screening Tool. The appropriateness of the choice of words used and tasks involved in the TDID was informed by discussions with staff and volunteers from PMI.
The TDID consists of four subtests that assess development in four different domains: social, fine motor, language, and gross motor. Each subtest contains a list of 34 items of behaviors that progress in complexity, totaling 136 items. The child's age determines the starting point of each domain. We tested each item and recorded the result as "pass observed" if the evaluator observed the behavior, "pass reported" if the caregiver reported that the child performs the task, or "fail" when the task was neither observed by the evaluator nor reported by the caregiver as having been performed. We gave a score of 1 when the task was accomplished and a score of 0 when it was not. We administered the items sequentially and, when the child failed to complete six tasks in a row, the evaluator moved on to the next subtest.
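The scoring and discontinuation rule described above can be illustrated with a minimal Python sketch. The function name, the string labels, and the example data are hypothetical and are not part of the TDID materials. The sketch assumes the input lists only the items actually administered, in order, because the text does not state how items below the age-based starting point are credited.

```python
from typing import List

# Result labels as described in the text (the constant names are illustrative).
PASS_OBSERVED = "pass observed"
PASS_REPORTED = "pass reported"
FAIL = "fail"


def score_subtest(results: List[str], stop_after: int = 6) -> int:
    """Sum the passes of one subtest, stopping after `stop_after` fails in a row."""
    score = 0
    consecutive_fails = 0
    for result in results:
        if result in (PASS_OBSERVED, PASS_REPORTED):
            score += 1  # an accomplished task scores 1
            consecutive_fails = 0
        elif result == FAIL:
            consecutive_fails += 1  # a failed task scores 0
            if consecutive_fails == stop_after:
                break  # the evaluator moves on to the next subtest
        else:
            raise ValueError(f"unknown result label: {result!r}")
    return score


# Hypothetical administration: four passes, six fails in a row, then an item
# that is never reached because the subtest was discontinued.
example = [PASS_OBSERVED] * 3 + [PASS_REPORTED] + [FAIL] * 6 + [PASS_OBSERVED]
example_score = score_subtest(example)
```

Under these assumptions, the hypothetical child above receives a subtest score of 4, and a child who passes all 34 items receives the maximum score of 34.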

Procedure
Data collection took place throughout eight days in February 2017 at Centro Bonó, another Jesuit center in the same sector of Santo Domingo. A group of nine evaluators conducted the assessments in three separate rooms; two evaluators assessed each child and each of them provided their own set of scores. These evaluators were clinical psychology undergraduate students from UNIBE who had already completed research and ECD measurement courses. The local principal investigator (PI), a cognitive neuroscientist, provided them with a 4-hour training on the study protocol, participant protection, and childhood development, and supervised them when interacting with participants to ensure participant safety and study integrity.
First, the evaluators conducted the sociodemographic interview with each caregiver using a structured multiple-choice questionnaire that took approximately 10 minutes. Upon completion of the interview, the evaluators administered the TDID under the supervision of the local PI. Once the evaluators completed data collection, the data entry team, consisting of UNIBE undergraduate psychology students, inputted the data, which were reviewed by the local PI. By numerically adding the "pass" responses, each child received a score from 0 to 34 on a continuum for each subscale. We analyzed the scores for internal consistency and inter-rater reliability.

Results
The codebook (S1 File), dataset in Excel (S2 File), and dataset in SPSS (S3 File) are available in the supporting information section.

Sociodemographic information
The age of the 42 children who participated in the study ranged between 2 and 24 months, as shown in Table 1.
According to their caregivers, 23.8% of the children were born with low birth weight, and 16.7% were born prematurely. When asked who was regularly in charge of caring for their children, most reported that the main caregiver was the mother, followed by the mother and father, and the mother and grandmother (see Table 2). Ten (23.8%) of the children's mothers had an elementary education level, 27 (64.3%) had a secondary school education level, and five (11.9%) attended college. Forty-one caregivers spoke Spanish as a first language, and one caregiver spoke Haitian Kreyol as a first language. Ten caregivers (23.8%) reported having at least one book at home; six households had one book and four had two books. All the caregivers reported having between one and 20 toys at home, with an average of 6.3 toys per home. Table 3 depicts the home background analysis, which includes access to stimulating materials and stimulating activities. None of the children in our sample attended daycare. About a quarter of the children had stories read to them, about half were told stories, more than a third had a caregiver who counted and named objects with them, most were taken for walks, and all the caregivers played with the children.

TDID psychometric properties
First, we analyzed the TDID's internal consistency to determine the degree to which all parts of the test contribute to each measurement. We obtained Spearman-Brown split-half reliability for each sub-scale: social, gross motor, language, and fine motor. Table 4 contains general descriptive statistics of each sub-scale, in addition to the Spearman-Brown coefficient. The Spearman-Brown coefficients indicate a good consistency (>.6) for social, fine motor, and gross motor, and an acceptable consistency (>.5) for language [21].
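As an illustration of the split-half procedure, the following Python sketch computes a Spearman-Brown coefficient, r_SB = 2r / (1 + r), where r is the Pearson correlation between the two half-test scores. It is a sketch under stated assumptions rather than the analysis used in the study: the odd/even item split, the function names, and the toy data are all hypothetical, and the text does not specify how the halves were formed.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown_split_half(item_scores: Sequence[Sequence[int]]) -> float:
    """Split-half reliability with the Spearman-Brown correction.

    `item_scores` holds one row per child and one 0/1 score per item.
    Assumption: items are split into odd- and even-positioned halves.
    """
    half_a = [sum(row[0::2]) for row in item_scores]  # items 1, 3, 5, ...
    half_b = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r = pearson_r(half_a, half_b)
    return 2 * r / (1 + r)


# Toy data: four hypothetical children, four items each (not study data).
toy_items = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
toy_reliability = spearman_brown_split_half(toy_items)
```

The correction estimates the reliability of the full-length sub-scale from the correlation between its two halves, which is why the coefficient is larger than the raw half-test correlation whenever that correlation lies between 0 and 1.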
Second, to test the inter-rater reliability and ensure that multiple observers would obtain similar scores, we conducted a Kendall's tau-b test of independence for the general scale, as well as for each sub-scale. Scores obtained by the first evaluator were not independent from scores obtained by the second evaluator in any of the tests (social rτ = 0.953, p < 0.001; fine motor rτ = 0.923, p < 0.001; language rτ = 0.966, p < 0.001; gross motor rτ = 0.977, p < 0.001; total rτ = 0.954, p < 0.001). Our interpretation of these results is that the scale has appropriate inter-rater reliability.
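To make the inter-rater statistic concrete, the sketch below computes Kendall's tau-b for two raters' scores using the standard definition: concordant minus discordant pairs, divided by the geometric mean of the numbers of pairs not tied on each variable. The function name and the toy scores are hypothetical, and the sketch does not compute the p-value reported in the text.

```python
from collections import Counter
from itertools import combinations
from math import sqrt
from typing import Sequence


def kendall_tau_b(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-b, which adjusts for ties in either variable."""
    if len(x) != len(y):
        raise ValueError("both raters must score the same children")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both raters order this pair the same way
        elif direction < 0:
            discordant += 1  # the raters order this pair in opposite ways
    total_pairs = len(x) * (len(x) - 1) // 2
    tied_x = sum(c * (c - 1) // 2 for c in Counter(x).values())
    tied_y = sum(c * (c - 1) // 2 for c in Counter(y).values())
    return (concordant - discordant) / sqrt(
        (total_pairs - tied_x) * (total_pairs - tied_y)
    )


# Toy subscale scores from two hypothetical evaluators (not study data).
rater_a = [10, 12, 12, 20, 25]
rater_b = [10, 13, 12, 20, 24]
toy_tau = kendall_tau_b(rater_a, rater_b)
```

The tau-b variant is a natural fit for data like these because subscale scores are bounded integers and several children are likely to share the same score. In practice, a statistics package that also reports a significance test would normally be used instead of this pairwise loop.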

Correlations
We correlated the age variable with each subscale to determine if the development scale followed a progression of abilities that are expected to increase with maturation. The age variable correlated positively with all the subscales (social r = .887, p < .001; fine motor r = .799, p < .001; language r = .834, p < .001; gross motor r = .805, p < .001), indicating that the older the child, the higher the scores on the development measurements, as expected. See Fig 1 for a visual representation.
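The age analysis can be sketched in Python as one correlation per subscale. The text reports the coefficient as r without naming it, so the sketch assumes a Pearson correlation; the function names, ages, and scores are hypothetical toy values, not study data.

```python
from math import sqrt
from typing import Dict, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation (assumed here; the text only reports 'r')."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlate_age_with_subscales(
    age_months: Sequence[float],
    subscale_scores: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Return the correlation between age and each subscale score."""
    return {
        name: pearson_r(age_months, scores)
        for name, scores in subscale_scores.items()
    }


# Toy data for five hypothetical children.
toy_ages = [2, 6, 12, 18, 24]
toy_scores = {
    "social": [3, 8, 15, 20, 27],
    "language": [2, 5, 11, 14, 22],
}
age_correlations = correlate_age_with_subscales(toy_ages, toy_scores)
```

A strong positive coefficient for every subscale is the pattern expected of a developmental scale, since older children should pass more items.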

Discussion
The primary objective of this study was to determine the feasibility of using an adapted version of the MDAT in a community context in vulnerable areas of Santo Domingo, Dominican Republic. We evaluated children who were between 2 months and 2 years of age, since items in developmental scales in such early stages are less culture-dependent and, therefore, require minimal adaptations. We obtained measurements of internal consistency of the instrument, as well as inter-rater reliability, while informally assessing the logistics and methodology of the study. The instrument showed appropriate psychometric properties, including good internal consistency for three sub-scales and acceptable internal consistency for the fourth, and good inter-rater reliability. This indicates a low probability of measurement errors from the design and content of the test itself. A good inter-rater reliability index indicates instrument stability across observers. By reducing error variance, threats to internal validity are reduced. As expected in any developmental scale that follows a path in child development, we confirmed that scores increased with the children's age.
In our study, we asked the caregiver about low birthweight using the same question as the 2014 Multiple Indicator Cluster Survey (MICS) [22], and we found that 23.8% of the children were born with low birthweight; we also asked the caregiver about prematurity, and we found that 16.7% of the children were born prematurely. According to the 2014 MICS, among Dominican children whose mothers have no education or primary education, 21.9% are born below the average size and 15.7% are born with low birthweight; among those from the poorest quintile, 23.9% are born below the average size and 17% are born with low birthweight. Although our finding for low birthweight is higher than expected, it corresponds with the proportion of children born below the average size in the Dominican context [22].
All of the families in our study were low income and only 23.8% had books at home. According to the 2014 MICS, only 10% of Dominican households with children under the age of 5 years have three or more books at home, and among households with a child under the age of 2 years it decreases to 3.7%; among children whose mothers have none or primary education, it is 3% [22]. In Santo Domingo, there is one public library with children's books. The library is not at walking distance from the study site and it requires tremendous efforts from caregivers to get there with their child, which means that it is not likely that they are reading books from the public library.
Regarding logistics, one of the main strengths of this study was the affiliation with the Pastoral Materno Infantil. We chose PMI because they have a history of engaging the community and providing services that enable them to access health services. By partnering with PMI, we respected the way in which the community engages the health system. The evaluation setting was a space that participants already knew and visited regularly, and the parents trusted the community mobilizers who invited them to participate in the study. It would be interesting to explore the possibility of training the community mobilizers in the application of the screening tool, increasing the benefit of this project to the community and making this a sustainable community-engaged project.
This study, however, has multiple limitations. While there was no language requirement for participation by either participants or their caregivers, we observed that the child of the one caregiver with more limited Spanish abilities did not perform as well as the other children. This could be because some questions were asked directly of parents, and if the parent did not understand the questions being asked, the child's score could be affected negatively. Even though this was not common in this pilot study, for additional studies in communities with immigrant populations, we recommend translating the materials and recruiting bilingual evaluators to ensure appropriate representation of minority language groups. Since we conducted the study, we have translated the tool into Haitian Kreyol for this purpose.
The present study did not gather information about socioeconomic position, which is an important variable when assessing child development. The community coordinator and community mobilizers recruited participants from two neighborhoods that include a large proportion of households under the local poverty line, but we did not take into account socioeconomic variability among the participants.
Given that the purpose of this study was to evaluate the feasibility of using the TDID and determine its psychometric properties, we chose a small sample size. Therefore, the data cannot be used to make inferences about the population of Dominican children. Because there are no available ECD tools specific to the Dominican Republic, there has been no developmental assessment on a national level. We therefore recommend using the TDID nationwide to obtain norm-referenced scores for the Dominican population. The standardization of the scores would allow the use of the TDID for clinical and monitoring purposes at the community and national level. However, although developmental screening tools can support inferences about general developmental milestones, and can probably detect children with significant impairments that require further testing, they may not be able to identify subtle developmental delays [2].
This study represents the first attempt to obtain a valid tool to screen for development milestones in children living in poverty in the Dominican Republic. More research is needed to refine the instrument, to have an available tool that is reliable and accessible to be used by health workers, and that could be used in future studies on factors that affect or enhance early childhood development. The availability of the tool will enable impact evaluations of early child development intervention programs and the development of evidence-based public policies on early childhood development in the Dominican Republic.