The INTER-NDA is a novel assessment of early child development measuring cognition, language, motor skills, behaviour, attention, and socio-emotional reactivity in 2-year-olds in 15 minutes. Here, we present the results of an evaluation of the INTER-NDA against the Bayley Scales of Infant Development, Third Edition (BSID-III), its sensitivity and specificity, and its psychometric properties.
Eighty-one infants from Oxford, UK, aged 23.1–28.3 months, were evaluated using the INTER-NDA and the BSID-III. The agreement between the INTER-NDA and the BSID-III was assessed using intraclass correlations (for absolute agreement), Bland-Altman analyses (for bias and limits of agreement), and sensitivity and specificity analyses (for accuracy). The internal consistency of the INTER-NDA and uni-dimensionality of its subscales were also determined.
The intraclass correlation coefficients between the BSID-III and the INTER-NDA cognitive, motor and behaviour scores ranged between 0.745 and 0.883 (p<0.001). The Bland-Altman analysis showed little to no bias in the aforementioned subscales. The sensitivity and specificity of INTER-NDA cognitive scores ≤1 SD below the mean are 66.7% and 98.6% respectively, with moderate agreement between INTER-NDA and BSID-III classifications (κ = 0.72, p<0.001). The sensitivity and specificity of INTER-NDA scores <2 SD below the mean, in predicting low BSID-III scores (<70), are 100% each for cognition, and 25% and 100% respectively for language. More than 97% of children who scored in the normal range of the INTER-NDA (<1SD below mean) also scored in the normal range in the BSID-III (≥85). The INTER-NDA demonstrates satisfactory internal consistency and its subscales demonstrate good unidimensionality.
Citation: Murray E, Fernandes M, Newton CRJ, Abubakar A, Kennedy SH, Villar J, et al. (2018) Evaluation of the INTERGROWTH-21st Neurodevelopment Assessment (INTER-NDA) in 2 year-old children. PLoS ONE 13(2): e0193406. https://doi.org/10.1371/journal.pone.0193406
Editor: Shi Wu Wen, University of Ottawa Faculty of Medicine, CANADA
Received: July 9, 2017; Accepted: February 9, 2018; Published: February 28, 2018
Copyright: © 2018 Murray et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: The authors gratefully acknowledge the contribution of the Wellcome Trust who funded the OPT study (grant no 090139 to AS) and the Bill and Melinda Gates Foundation who funded the INTERGROWTH-21st Project (Grant ID# 49038 to SK and JV). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
The first 1000 days of life, from conception to age 2 years, is the most important period of human neurodevelopment[1–5]. The age of 2 years marks the point at which early child development (ECD) can reliably be assessed because after this period: (i) neurodevelopment is no longer confounded by transient neurological syndromes of prematurity; and (ii) instruments, such as the Bayley Scales of Infant Development (BSID), have good predictive validity in the medium- to long-term. However, ECD assessment at population level depends on: (i) the availability of large numbers of appropriately trained professionals[7, 8]; and (ii) objective assessment tools that can be administered reliably and easily. These requirements are often the limiting steps in population-wide ECD screening and surveillance efforts.
A number of short, resource-light neuropsychological assessments, targeted at 2-year-old children, endeavor to address these requirements and are currently being used in clinical and research settings. These include the Ages and Stages Questionnaire (ASQ), the Brief Infant Toddler-Social and Emotional Assessment (Brief-ITSEA), the Child Behavior Checklist (CBCL), the Developmental Milestones Checklist (DMC), the Rapid Neurodevelopment Assessment (RNDA) and the Ten Questions Questionnaire (TQQ). While some focus on sensitively assessing target areas of neurodevelopment (for example, the CBCL for behavior and the Brief-ITSEA for social and emotional regulation), others (such as the DMC and RNDA) employ a more generalist and culturally appropriate approach. Nevertheless, many evaluations, including ‘gold standard’ assessment tools, are vulnerable to administrative and technical errors. In the BSID-III, for example, it is estimated that these errors occur in approximately 39% of assessments, accounting for inaccuracies in >10% of scores.
The challenges for ECD surveillance are: (i) the assessment of multiple dimensions of neurodevelopment, at population-level, with a high degree of reliability, objectivity and precision; and (ii) the suitability of such assessments for international use, without compromising on features essential for scalability (namely short administration time, low costs and relative ease of administration by non-specialists).
To address this issue, the International Fetal and Newborn Growth Consortium for the 21st Century (INTERGROWTH-21st) developed, in 2014, an objective, rapid ECD assessment for use by non-specialists, in low-, middle- and high-income countries. Entitled the INTERGROWTH-21st Neurodevelopmental Assessment (INTER-NDA), it assesses cognition, expressive and receptive language, gross and fine motor skills, behaviour, attention and social-emotional reactivity in 2-year-olds with an administration time of 15–20 minutes. It consists of 53 directly administered, concurrently observed and caregiver-reported items. The INTER-NDA was designed to be free from cultural biases and is based upon objective reporting (rather than subjective judgement) of the child’s performance. Outcomes are reported on a 5-point scale characterising the child’s performance in each domain across a spectrum. The inter-rater and test-retest reliability of the INTER-NDA, determined across 21 assessors in Brazil, India, Italy, Kenya and the UK using Cohen’s kappas, were k = 0.70, 95% CI: 0.47–0.88 and k = 0.79, 95% CI: 0.48–0.96 respectively. The training materials and operation manual for the INTER-NDA are freely available at www.intergrowth21.org.uk/protocol.aspx?lang51.
This report presents the results of a study evaluating the performance of the INTER-NDA against a well-established measure of child development, the BSID-III. The specific aims of the current study were to: (i) evaluate agreement between the INTER-NDA and the BSID-III using intraclass correlations and Bland-Altman analyses; (ii) determine the sensitivity and specificity of the INTER-NDA in identifying low scores on the BSID-III; and (iii) determine the internal consistency of the INTER-NDA and the uni-dimensionality of its subscales as an ECD measure.
Material and methods
Eighty-one children (44 boys and 37 girls), with a mean age of 25.4 (SD = 1.1) months, were drawn randomly from UK-based families enrolled in the Oxford Postnatal Treatment (OPT) Study, a treatment trial for women with postnatal depression, between March 2011 and December 2013. The families resided in Oxfordshire, Buckinghamshire and Berkshire, England. GPs and Health Visitors gave potential participants an overview of the study and provided a leaflet and contact information. Interested mothers could then telephone, return self-referral forms in the post, or email the study team. Mothers needed to contact the study directly rather than be referred by a healthcare professional. In some instances, following discussion with a mother, a Health Visitor or GP contacted the team on the mother’s behalf. A member of the study team would then contact the mother to assess her eligibility for the study. Women were eligible for inclusion if they met full diagnostic criteria for Major Depressive Disorder (MDD) persistently over the previous 3 months, were ≥18 years old, and their infants were born at ≥35 weeks’ gestation, had a birth weight of ≥2000 g, were aged 4.5 to 9 months, and had no serious medical conditions. Women were excluded if they were unable to converse in English, suffered from a severe psychiatric diagnosis (other than MDD) or serious physical illness, were not cohabiting with the child, or were currently receiving psychological therapy. The mean maternal age at delivery was 31.80 years (SD = 5.49 years), 91% of mothers (n = 73) had an education level of GCSE (A*-C) and above, and 98% were married or cohabiting (n = 79). The children had a mean birth weight of 3539 g (SD = 543 g) and a mean gestational age at birth of 39+6 weeks (SD = 1+4). Further demographic information is presented in S1 Table.
The cognitive, expressive communication, receptive communication, overall language and behaviour subscales of the Bayley Scales of Infant Development-Third Edition (BSID-III) were administered at follow-up when the children were approximately two years of age.
The BSID-III is a well-established child development assessment, measuring cognition, language skills, social-emotional skills, motor skills and adaptive behavior from 1 to 42 months. It has a binary (pass/fail) scoring system for each item with assessments continuing until a child fails five consecutive items. This results in a ceiling score, which yields a developmental age for the child. The latter is clinically important in terms of diagnosis, monitoring and evaluating the impact of interventions. The administration time is 60–90 minutes. The cost of the test, at the time of writing this paper, is US$ 1135.00.
The version of the BSID-III used in the OPT study comprised five of the seven subscales (cognitive, expressive communication, receptive communication, overall language and behaviour subscales). Fine and gross motor subscales were not administered as part of the OPT study. In keeping with the BSID-III scoring system, scaled scores (M = 10, SD = 3) were derived for expressive and receptive communication, and composite scores (M = 100, SD = 15) were derived for cognitive and overall language. The total behaviour score was converted to a z-score.
Using a subset of subscales is an accepted approach to validating a new measure[20, 21], because comparisons are conducted at subscale level rather than at the level of a single, global result[20, 21].
As described above, the INTER-NDA measures cognition, language, motor skills, behavior, attention and social-emotional reactivity in the 22 to 26 month age group. The cost of the INTER-NDA, at the time of writing this paper, is US$ 121.00 (Fig 1).
At the time of their second birthday, children enrolled in the OPT study were assessed for cognition, expressive and receptive communication, overall language and behavior on the BSID-III by qualified BSID-III assessors. Assessments were carried out in a clinical research setting with a mean administration time of 60 minutes. All assessments were video-recorded using three cameras at different angles to ensure adequate capture of all aspects of the child’s performance and interaction with the assessor. An independent assessor (EM), trained in the INTER-NDA, reviewed each recording and extracted INTER-NDA cognitive, language and behavior scores (Table 1) for each child from these recordings. The motor, attention and emotional reactivity subscales of the INTER-NDA were not extracted as these subscales were not administered during the BSID-III assessment in the OPT study.
The video-based approach was selected over a conventional 2-session approach (one for the INTER-NDA and one for the BSID-III) to ensure that the child’s scores were not affected by: (i) temporal differences in the child’s mood and rapport with the assessor; or (ii) familiarization with tasks, which may result in higher scores at the second session (given the conceptual overlap between some INTER-NDA and BSID-III items). Furthermore, as the motor subscale of the INTER-NDA overlaps significantly with the motor subscale of the BSID-III, and the attention and social-emotional reactivity subscale of the INTER-NDA overlaps significantly with the Child Behaviour Checklist, it was not considered necessary to perform a separate evaluation of these subscales.
The OPT study was approved by the Research Ethics Committee (REC ref: 10/H0505/55). The INTERGROWTH-21st Project was approved by the Oxfordshire Research Ethics Committee ‘C’ (reference: 08/H0606/139). In both studies, parents/guardians provided written informed consent on behalf of their children.
I. Agreement between the INTER-NDA and the BSID-III.
BSID-III and INTER-NDA assessors independently calculated BSID-III and INTER-NDA scores respectively, and were blinded to each other’s calculations. Scaled BSID-III scores (M = 10, SD = 3) were derived for expressive and receptive communication, and composite scores (M = 100, SD = 15) were derived for cognitive and overall language. The total behavior score was converted to a z-score. For the INTER-NDA, mean and total scores were calculated for each subscale in accordance with the INTER-NDA scoring system (Table 1), and converted to z-scores.
The total and mean INTER-NDA z-scores for each subscale were assessed against the corresponding BSID-III subscales. Difference scores between BSID-III and INTER-NDA (BSID-III minus INTER-NDA) were also calculated.
The agreement between the INTER-NDA and the BSID-III was evaluated using four statistical methods, as recommended by Lee et al. and by Bland and Altman: (i) repeated measures t-tests to assess whether there was a difference between INTER-NDA and BSID-III scores within subjects; (ii) single measure intraclass correlation coefficients (ICCs) for absolute agreement for each subscale, using a two-way mixed effects model (to quantify the strength of the association between INTER-NDA and BSID-III scores); (iii) bias and limits of agreement statistics; and (iv) Bland-Altman plots to identify whether the INTER-NDA scores differed systematically across different levels of the BSID-III, and linear regression analyses of the relationship between difference score and BSID-III score.
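As an illustration of method (ii), the following minimal sketch computes a single-measure ICC for absolute agreement from the mean squares of a two-way layout (children by instruments). It is not the authors' analysis code: the function name `icc_absolute_single` and the example scores are hypothetical, and both instruments are assumed to be expressed on the same scale (for example, z-scores).

```python
from statistics import mean


def icc_absolute_single(scores_a, scores_b):
    """Single-measure ICC for absolute agreement (two-way model, k = 2 instruments).

    scores_a and scores_b hold one score per child, in the same child order.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long lists with at least 2 children")
    n, k = len(scores_a), 2
    rows = list(zip(scores_a, scores_b))
    grand = mean(list(scores_a) + list(scores_b))
    row_means = [mean(r) for r in rows]                 # per-child means
    col_means = [mean(scores_a), mean(scores_b)]        # per-instrument means

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement: systematic differences between instruments
    # (ms_cols) lower the coefficient, unlike a consistency ICC.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores: the second instrument reads exactly 1 unit higher.
first = [1.0, 2.0, 3.0, 4.0]
second = [2.0, 3.0, 4.0, 5.0]
icc = icc_absolute_single(first, second)
```

In this toy example the two sets of scores are perfectly correlated, yet the constant offset reduces the coefficient below 1, which is why an absolute-agreement ICC is a stricter criterion than a correlation.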
II. Sensitivity and specificity analysis.
To assess the accuracy of the INTER-NDA in determining low scores on the BSID-III, INTER-NDA scores were dichotomized into low (≤1SD) and normal (>1SD) and BSID-III scores were dichotomized as borderline (≤85) and normal (>85). Accuracy was assessed using sensitivity and specificity analyses to determine the ability of low INTER-NDA scores to predict borderline composite BSID-III scores. Cohen’s kappa was used to determine the level of agreement between INTER-NDA and BSID-III classifications.
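The dichotomization and accuracy statistics described above can be sketched as follows. This is an illustrative example rather than the study's code: the helper names and the six example children are hypothetical, and the INTER-NDA cut-off is assumed to be a z-score of -1 (that is, 1 SD below the mean).

```python
def classify_low(scores, cutoff):
    """Return True for every score at or below the cut-off."""
    return [s <= cutoff for s in scores]


def accuracy_stats(test_low, reference_low):
    """Sensitivity, specificity and Cohen's kappa for two binary classifications."""
    pairs = list(zip(test_low, reference_low))
    tp = sum(1 for t, r in pairs if t and r)          # low on both
    fn = sum(1 for t, r in pairs if not t and r)      # missed by the test
    fp = sum(1 for t, r in pairs if t and not r)      # flagged by the test only
    tn = sum(1 for t, r in pairs if not t and not r)  # normal on both
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference standard must contain both low and normal cases")
    n = len(pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Hypothetical data for six children
inter_nda_z = [-1.5, -1.2, -0.4, 0.3, 1.1, -0.9]
bsid_composite = [80, 85, 84, 100, 115, 95]

sens, spec, kappa = accuracy_stats(
    classify_low(inter_nda_z, -1.0),    # assumed INTER-NDA cut-off
    classify_low(bsid_composite, 85),   # BSID-III borderline cut-off from the text
)
```

Here two of the three children who are borderline on the reference measure are also flagged by the test, and no child is flagged wrongly, so sensitivity is 2/3 and specificity is 1.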
III. Internal consistency and uni-dimensionality of the INTER-NDA.
Cronbach’s alphas were calculated for each INTER-NDA subscale. Cronbach’s alpha values were considered “good” if they were above a threshold of 0.7. In conceptualizing the INTER-NDA, a key consideration was that all the items measure an underlying construct, i.e. ‘neurodevelopment at 2 years of age’. To evaluate the extent to which this is true, the unidimensionality of the scales was evaluated using a confirmatory factor analysis (CFA) in STATA 15. The Comparative Fit Index, chi-square, Tucker-Lewis Index and Root Mean Square Error of Approximation were selected to test the CFA model that best represented the data.
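For readers unfamiliar with Cronbach’s alpha, a minimal sketch of the calculation is given below. It assumes complete item data for every child (no items scored as unable to assess); the function name and the toy item scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one subscale.

    items is a list of item columns; each column holds one score per child,
    with children in the same order in every column.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total subscale score per child
    totals = [sum(child_scores) for child_scores in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Three items that rank four children identically: maximal consistency
consistent = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])

# Two items that are unrelated to each other: no shared variance
unrelated = cronbach_alpha([[1, 2, 1, 2], [1, 1, 2, 2]])
```

The first toy subscale yields an alpha of 1 and the second an alpha of 0, bracketing the 0.7 threshold used in the text.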
All INTER-NDA subscales are presented for mean (and not total) INTER-NDA scores, unless expressly stated otherwise. Mean INTER-NDA subscale scores were selected over total INTER-NDA subscale scores because the former are not affected by external factors (such as the caregiver interfering in the assessment, a sudden distracting influence in the assessment room, or the assessor being aware that he/she has made an error in the task administration). In such situations, the assessor would score the child as ‘X’ (i.e., unable to assess) for that item. This item would be reflected in the total score but not the mean score.
Results
I. Agreement between the INTER-NDA and BSID-III
The mean INTER-NDA and BSID scores for the domains of cognition, receptive language/communication, expressive language/communication, overall language, behavior, positive behavior and negative behavior are presented in Table 2. The results of the four statistical approaches are:
- Repeated measures t-tests showed no significant differences in BSID-III and INTER-NDA scores across all subscales (Table 2).
- The ICCs for the BSID-III and INTER-NDA subscales (Table 3) indicate a strong association between all INTER-NDA subscales and BSID-III equivalents. All INTER-NDA subscales are within the limit of acceptability for the lower limit of the ICC confidence interval proposed by Bland & Altman (not <0.75)[23, 24]. Fig 2 depicts the association between BSID-III scores and INTER-NDA z-scores across the subscales.
- The Bland-Altman analysis indicated no, or very low, bias in the subscales (Table 4), suggesting very little difference between INTER-NDA and BSID-III scores.
- The Bland-Altman plots and linear regression analyses of the difference scores (BSID-III minus INTER-NDA) revealed positive associations between the subscales (Table 5; Fig 3) such that variation in the BSID-III score accounted for 7.2%, 13.4%, 6.7%, 5.1%, 7.6%, 6.6% and 12.2% of the difference between the BSID-III and INTER-NDA score in the cognitive, receptive language, expressive language, overall language, total behavior, positive behavior and negative behavior subscales, respectively.
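The bias, limits of agreement and regression of the difference score on the reference score can be sketched as follows. This is a simplified illustration, not the study's code: the function names and data are hypothetical, both measures are assumed to be on a common z-score scale, and the limits use the conventional bias ± 1.96 SD of the differences.

```python
from statistics import mean, stdev


def bland_altman(reference, new):
    """Return (bias, lower limit, upper limit) for reference minus new."""
    diffs = [r - x for r, x in zip(reference, new)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def r_squared(x, y):
    """Share of variance in y explained by a simple linear regression on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


# Hypothetical z-scores: the new measure compresses the extremes
reference_z = [-2.0, -1.0, 0.0, 1.0, 2.0]
new_z = [-1.5, -0.8, 0.0, 0.8, 1.5]

bias, lower, upper = bland_altman(reference_z, new_z)
diffs = [r - x for r, x in zip(reference_z, new_z)]
explained = r_squared(reference_z, diffs)
```

In this toy data the bias is zero, yet the differences rise with the reference score, so the regression explains most of their variance. A zero bias on its own therefore does not rule out systematic disagreement at the extremes of the scale.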
II. Sensitivity and specificity analysis
The results of this analysis are presented in Table 6. The sensitivity and specificity of INTER-NDA cognitive scores <2 SD below the mean, for determining low BSID-III scores (<70), are both 100%, with strong agreement between INTER-NDA and BSID-III classifications (κ = 1.00, p<0.001). The sensitivity and specificity of INTER-NDA language scores <2 SD below the mean, for determining low BSID-III scores (<70), are 25.0% and 100% respectively, with fair agreement between INTER-NDA and BSID-III classifications (κ = 0.39, p<0.05).
III. Internal consistency and uni-dimensionality of the INTER-NDA
The Cronbach’s alpha scores are presented in Table 7. These were good for the cognitive, receptive language, expressive language, and positive behaviour subscales of the INTER-NDA and acceptable for the negative behaviour subscale. The corresponding internal consistencies of the BSID-III, calculated from the OPT data, are also presented in the table for comparison.
The seven subscales of the INTER-NDA (cognition, receptive language, expressive language, fine motor, gross motor, positive behaviour and negative behaviour) showed a good fit to the model since all the fit indices were above the recommended values with the exception of the Root Mean Square Error of Approximation. The fit indices were: Comparative Fit Index = 0.90 (recommended: >0.90); χ2 = 66.52, p < .001; Tucker-Lewis Index = 0.94 (recommended: >0.90); and Root Mean Square Error of Approximation = 0.16 (recommended: <0.08). Fig 4 presents the standardized coefficients from the loadings.
Discussion
The INTER-NDA is a multi-dimensional ECD instrument measuring cognition, language, motor skills, behavior, attention and social-emotional reactivity in the 22–26 month age group. It has been used to assess neurodevelopment in children from the INTERGROWTH-21st Project sites in Brazil, India, Italy, Kenya and the UK. In this paper, we demonstrate that: (i) it has substantial agreement with the BSID-III for children in the 22–26 month age group; (ii) its sensitivity and specificity for determining low BSID-III scores are both 100% for the cognitive domain, and 25% and 100% respectively for the language domain; and (iii) it demonstrates satisfactory internal consistency and its subscales demonstrate good uni-dimensionality in contributing to neurodevelopment at 22–28 months.
The results of our evaluation of the agreement between the INTER-NDA and BSID-III show no significant differences between scores for the same child, and substantial associations between the subscales of the two measures. Importantly, the Bland-Altman plots and linear regression analyses (Fig 3; Tables 5 and 6) reveal that, for the extremely low scores of the BSID-III, children scored relatively higher on the INTER-NDA (floor effect) and, where children scored extremely highly on the BSID-III, they scored relatively lower on the INTER-NDA (ceiling effect). This finding reflects the differences in the age range that the tests were designed to assess, i.e. the INTER-NDA for 22–26 months and BSID-III for 1–42 months. It confirms that the INTER-NDA functions well in agreement with the BSID-III within the age group for which it was designed. Moreover, the results of the Bland-Altman analysis confirm the agreement between the INTER-NDA and BSID-III by showing that a child with a mean cognitive score on the INTER-NDA is likely to score 101 on the cognitive composite score of the BSID-III, with a 95% probability that the true value of their score will fall in the range of 85–118. Given that the SD of the BSID-III composite score is 15, the results indicate that 95% of the children scoring at the mean of the INTER-NDA will fall within 1SD below and just over 1SD above the BSID-III mean, evidencing that the two scales are functioning similarly from a statistical and distributional point of view. These results indicate that the likelihood of a child scoring >±1SD of the BSID-III mean, when he/she achieves a mean score on the INTER-NDA, is statistically negligible. We have also shown that the internal consistency of the INTER-NDA subscales is good to acceptable, and comparable with the BSID-III, even though the INTER-NDA has five times fewer items than the BSID-III, and a fifth of the BSID-III’s administration time.
Our study was limited in that all children were UK-based, despite the INTER-NDA being designed for international use. It was also limited in that the agreement between the BSID-III and the INTER-NDA for the gross and fine motor domains could not be ascertained as these subscales were not administered in the OPT Study. Furthermore, the INTER-NDA was scored using video-recordings of the BSID-III and not in real-time. Nevertheless, there are some strengths to using the video-recording evaluation design over the conventional 2-session evaluation approach. First, as the child did not have two separate assessment sessions, changes in the child’s mood, and his/her familiarization with items, were less likely to confound scores. Second, the design permitted BSID-III and INTER-NDA assessors to score the child’s performance on the respective scales independently, without the presence of multiple examiners in the assessment room, but effectively during the same assessment. Third, in essence, each child completed the INTER-NDA and BSID-III assessments at exactly the same age and time, with the same level of rapport with their mother and assessor. A further limitation is that the children were drawn from the OPT trial and therefore were exposed to maternal postnatal depression at some stage during their infancy (assessment was conducted at the end of the treatment trial), which in itself might bias the sample. However, it was unlikely that this would affect the validation results, as low-scorers on the BSID-III might be expected to score low on the INTER-NDA as well. Finally, estimates of accuracy (sensitivity and specificity) are based on the underlying assumption that the reference standard is 100% sensitive.
Although the Mental Development Index of the BSID has demonstrated moderate sensitivity (57%) and high specificity (90–100%), and is considered the most comprehensive assessment available for infant neurodevelopment, it does have a degree of imperfection, which must be considered when interpreting results.
Despite these limitations, this study provides evidence of the agreement between the INTER-NDA and BSID-III in assessing cognitive, language and behavioural components of early childhood development in the 22–28 month age group. This is important because the motivation behind the development of the INTER-NDA was to overcome the dependence on time and infrastructure in the context of ECD measurement, by providing a validated, objective assessment that is rapid, reliable and easy to administer in high-, middle- and low-income settings. The use of such a measure might provide a useful, scalable solution for population-based ECD assessments by shifting the emphasis of delivery channels from time-intensive in-depth assessments, to rapid, easy to administer assessment tools. The INTER-NDA is currently in use in the INTERGROWTH-21st Project study sites in Brazil, India, Italy, Kenya and the UK where more than 1000 children have been assessed to date. In this study, the INTER-NDA has therefore proved to be a satisfactorily valid measure of ECD. Nevertheless, its discriminant and predictive validity, and its potential to be used as a diagnostic ECD measure, remain to be explored. Confirmatory validation across larger sample sizes, and in different socio-geographical contexts, is also needed.
The authors gratefully acknowledge the contribution of the mothers, children and researchers who are part of the OPT study and the contribution of the Bill and Melinda Gates Foundation in supporting the INTERGROWTH-21st Project (Grant ID# 49038).
- 1. Vuguin P, Hartil K, Kruse M, Kaur H, Lin C-L, Fiallo A, et al. Shared Effects of Genetic and Intrauterine and Perinatal Environment on the Development of Metabolic Syndrome. PLoS ONE. 2013;8(5): e63021. pmid:23690974
- 2. Seckl JR, Holmes MC. Mechanisms of Disease: glucocorticoids, their placental metabolism and fetal 'programming' of adult pathophysiology. Nat Clin Pract End Met. 2007;3(6):479–88.
- 3. Patandin S, Lanting CI, Mulder PG, Boersma ER, Sauer PJ, Weisglas-Kuperus N. Effects of environmental exposure to polychlorinated biphenyls and dioxins on cognitive abilities in Dutch children at 42 months of age. The Journal of pediatrics. 1999;134(1):33–41. pmid:9880446
- 4. Ernst M, Moolchan ET, Robinson ML. Behavioral and neural consequences of prenatal exposure to nicotine. Journal of the American Academy of Child & Adolescent Psychiatry. 2001;40(6):630–41.
- 5. Rauh VA, Garfinkel R, Perera FP, Andrews HF, Hoepner L, Barr DB, et al. Impact of prenatal chlorpyrifos exposure on neurodevelopment in the first 3 years of life among inner-city children. Pediatrics. 2006;118(6):e1845–e59. pmid:17116700
- 6. Bode MM, D'eugenio DB, Mettelman BB, Gross SJ. Predictive validity of the Bayley, at 2 years for intelligence quotient at 4 years in preterm infants. Journal of Developmental & Behavioral Pediatrics. 2014;35(9):570–5.
- 7. Engle PL, Black MM, Behrman JR, Cabral de Mello M, Gertler PJ, Kapiriri L, et al. Strategies to avoid the loss of developmental potential in more than 200 million children in the developing world. The Lancet. 369(9557):229–42. http://dx.doi.org/10.1016/S0140-6736(07)60112-3.
- 8. NHS England and Public Health England; Guide to the Early Years http://www.england.nhs.uk/wp-content/uploads/2014/03/hv-ey-hlth-prof.pdf (2014) NHS England. Accessed 12 Dec 2014.
- 9. Department of Health, London, UK. Healthy Child Programme–The two year review (2009) http://www.partnershipforyounglondon.org.uk/data/files/Health/review_healthy_child.pdf. Accessed 12 Dec 2014.
- 10. Bricker D, Squires J. Ages and Stages Questionnaire: A Parent-Completed, Child-Monitoring System. 2 ed: Paul H. Brookes Publishing Co.; 1999.
- 11. Briggs-Gowan MJ, Carter AS. Applying the Infant-Toddler Social & Emotional Assessment (ITSEA) and Brief-ITSEA in early intervention. Infant Mental Health Journal. 2007;28(6):564–83. pmid:28640493
- 12. Nolan TM, Bond L, Adler R, Littlefield L, Birleson P, Marriage K, et al. Child Behaviour Checklist classification of behaviour disorder. Journal of Paediatrics and Child Health. 1996;32(5):405–11. pmid:8933400
- 13. Abubakar A, Holding P, Van de Vijver F, Bomu G, Van Baar A. Developmental monitoring using caregiver reports in a resource-limited setting: the case of Kilifi, Kenya. Acta Pædiatrica. 2010;99(2):291–7. pmid:20353499
- 14. Khan NZ, Muslima H, Begum D, Shilpi AB, Akhter S, Bilkis K, et al. Validation of Rapid Neurodevelopmental Assessment Instrument for Under-Two-Year-Old Children in Bangladesh. Pediatrics. 2010;125(4):e755–e62. pmid:20308214
- 15. Durkin MS, Davidson LL, Desai P, Hasan ZM, Khan N, Shrout PE, et al. Validity of the Ten Questions Screen for Childhood Disability: Results from Population-Based Studies in Bangladesh, Jamaica, and Pakistan. Epidemiology. 1994;5(3):283–9. pmid:7518697
- 16. Costantini L, D'Ilario J, Moddemann D, Penner K, Schmidt B. Accuracy of Bayley Scores as Outcome Measures in Trials of Neonatal Therapies. JAMA Pediatrics. 2014. pmid:25546560
- 17. Fernandes M, Stein A, Newton CRJ, Ismail LC, Kihara M, Wulff K, et al. The INTERGROWTH-21st Project Neurodevelopment Package: A novel method for the multi-dimensional assessment of neurodevelopment in pre-school age children. PLoS ONE. 2014;9(11): e113360. Epub 25 November 2014. pmid:25423589
- 18. Stein A, Netsi E, Lawrence PJ, Granger C, Kempton C, Craske MG, et al. Mitigating the effect of persistent postnatal depression on child outcomes through an intervention to treat depression and improve parenting: a randomised controlled trial. The Lancet Psychiatry. 5(2):134–44. pmid:29413138
- 19. Bayley N. Bayley scales of infant and toddler development, third edition. San Antonio, TX: Pearson Education Inc.; 2006.
- 20. Aina OF, Morakinyo O. The validation of Developmental Screening Inventory (DSI) on Nigerian children. Journal of tropical pediatrics. 2001;47(6):323–8. pmid:11827298
- 21. Gollenberg AL, Lynch CD, Jackson LW, McGuinness BM, Msall ME. Concurrent validity of the parent‐completed Ages and Stages Questionnaires, with the Bayley Scales of Infant Development II in a low‐risk sample. Child: care, health and development. 2010;36(4):485–90.
- 22. Lee J, Koh D, Ong C. Statistical evaluation of agreement between two methods for measuring a quantitative variable. Computers in biology and medicine. 1989;19(1):61–70. pmid:2917462
- 23. Martin Bland J, Altman D. Statistical Methods for Assessing Agreement Between Two Methods of Clinical Measurement. The Lancet. 1986;327(8476):307–10. http://dx.doi.org/10.1016/S0140-6736(86)90837-8.
- 24. Bland JM, Altman DG. A note on the use of the intraclass correlation coefficient in the evaluation of agreement between two methods of measurement. Computers in Biology and Medicine. 1990;20(5):337–40. http://dx.doi.org/10.1016/0010-4825(90)90013-F. pmid:2257734
- 25. Bland JM, Altman DG. Statistics notes: Cronbach's alpha. BMJ. 1997;(314):572. http://dx.doi.org/10.1136/bmj.314.7080.572.
- 26. Tavakol M, Dennick R. Making sense of Cronbach's alpha. International journal of medical education. 2011;2:53. pmid:28029643
- 27. Rutjes A, Reitsma J, Coomarasamy A, Khan K, Bossuyt P. Evaluation of diagnostic tests when there is no gold standard. A review of methods. Health Technology Assessment—Southampton 2007;11(50).
- 28. Cifuentes J, Mariani G, Nelson K, Johnson S, Biasini F, Steele R. Sensitivity of Bayley Scales of Infant Development at One Year for Later Severe Neurodevelopmental Delay in ECMO Patients. Pediatric Research. 1996;39:103.