Screening for developmental delay at 18 months using the Infant Toddler Checklist: A validation study

  • Cornelia M. Borkhoff ,

    Contributed equally to this work with: Cornelia M. Borkhoff

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Supervision, Validation, Visualization, Writing – original draft

    Affiliations Institute of Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada, Child Health Evaluative Sciences, The Hospital for Sick Children Research Institute, Toronto, Ontario, Canada

  • Haris Imsirovic,

    Roles Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Writing – review & editing

    Affiliation ICES, Toronto, Ontario, Canada

  • Imaan Bayoumi,

    Roles Conceptualization, Investigation, Methodology, Writing – review & editing

    Affiliations ICES, Toronto, Ontario, Canada, Department of Family Medicine, Queen’s University, Kingston, Ontario, Canada

  • Colin Macarthur,

    Roles Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Resources, Writing – review & editing

    Affiliations Child Health Evaluative Sciences, The Hospital for Sick Children Research Institute, Toronto, Ontario, Canada, Department of Pediatrics, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada

  • Kimberly M. Nurse,

    Roles Conceptualization, Investigation, Methodology, Writing – review & editing

    Affiliation Child Health Evaluative Sciences, The Hospital for Sick Children Research Institute, Toronto, Ontario, Canada

  • Teresa To,

    Roles Conceptualization, Data curation, Investigation, Methodology, Resources, Writing – review & editing

    Affiliations Institute of Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada, Child Health Evaluative Sciences, The Hospital for Sick Children Research Institute, Toronto, Ontario, Canada, ICES, Toronto, Ontario, Canada

  • Mark E. Feldman,

    Roles Conceptualization, Investigation, Methodology, Writing – review & editing

    Affiliation Department of Pediatrics, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada

  • Eddy Lau,

    Roles Conceptualization, Investigation, Methodology, Writing – review & editing

    Affiliation Department of Pediatrics, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada

  • Braden Knight,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – review & editing

    Affiliation ICES, Toronto, Ontario, Canada

  • Catherine S. Birken,

    Roles Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Resources, Writing – review & editing

    Affiliations Institute of Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada, Child Health Evaluative Sciences, The Hospital for Sick Children Research Institute, Toronto, Ontario, Canada, ICES, Toronto, Ontario, Canada, Department of Pediatrics, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada

  • Jonathon L. Maguire,

    Roles Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Resources, Writing – review & editing

    Affiliations ICES, Toronto, Ontario, Canada, Department of Pediatrics, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada, MAP Centre for Urban Health Solutions, Li Ka Shing Knowledge Institute, Toronto, Ontario, Canada

  • Patricia C. Parkin

    Roles Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Writing – original draft

    patricia.parkin@sickkids.ca

    Affiliations Institute of Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada, Child Health Evaluative Sciences, The Hospital for Sick Children Research Institute, Toronto, Ontario, Canada, Department of Pediatrics, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada

Abstract

Objective

The Infant Toddler Checklist (ITC) may be promising as a single tool at the 18-month visit to detect a range of developmental concerns. We examined the predictive validity of the ITC and the association between positive ITC screening and health care utilization (HCU).

Methods

Prospective cohort study of children at average risk for developmental delay attending their 18-month visit in primary care in Toronto, Canada. Parents completed the ITC. HCU data from the single-payer provincial health system were collected from health administrative databases, ensuring complete follow-up. A physician billing code for a neurodevelopmental consultation was the primary outcome and criterion measure. Six other HCU types were assessed.

Results

Of 1,460 children (mean age at screening 18 months), 11% screened positive on the ITC. By a mean age at follow-up of 8 years, 2.6% had had a neurodevelopmental consultation. Screening test properties (with neurodevelopmental consultation as the criterion measure): 40% sensitivity (95% CI 24%, 57%), 90% specificity (95% CI 88%, 91%), 10% false positive rate (95% CI 9%, 12%). Using multivariable negative binomial regression, a positive ITC was associated with higher rates of 6 of 7 HCU types, including neurodevelopmental consultation (aRR 2.78, 95% CI 1.37, 5.67, p = 0.005).

Conclusion

The ITC had high specificity and a low false positive rate, suggesting that most children with a negative ITC will not have a later neurodevelopmental consultation, and use of the tool may minimize unintended harms such as anxiety and resource use. The low sensitivity highlights the importance of ongoing developmental surveillance. Low sensitivity of other screening tools is discussed.

Introduction

Professional organizations in many countries advocate for identification of developmental delays and disorders in early childhood through developmental surveillance and use of standardized developmental screening tools in primary care and public health settings [1–9]. Screening has been shown to be superior to surveillance for identification, referral, and eligibility for intervention services [10]. The American Academy of Pediatrics (AAP) recommends that all children with positive screening should receive a referral for a medical and developmental evaluation [1]. While developmental surveillance is a long-standing clinical practice, there is international variation in the implementation of developmental screening with regard to age of screening and type of screening tool [1–9].

Developmental screening may occur at multiple ages or once at a specific age. For example, the American Academy of Pediatrics recommends screening at multiple ages (9, 18, 24, 30 months) [1,2]; the Canadian Paediatric Society recommends screening only once at 18 months [3]; and the U.K. Healthy Child Programme recommends screening at 2 years [7,8]. Despite these differences, screening at 18–24 months is recommended, highlighting the importance of this age at which delays in communication and language development are often evident [1].

Types of screening tools include general/broadband tools and domain/disorder-specific tools (including autism spectrum disorder [ASD]-specific tools). At the 18-month visit, the AAP recommends a general tool plus an ASD-specific tool [1,2]. A survey of American pediatricians found that commonly used screening tools are the Ages and Stages Questionnaire (ASQ) as a general tool and the Modified Checklist for Autism in Toddlers (M-CHAT) as an ASD-specific tool [11]. However, there is some evidence that these tools have low sensitivity and positive predictive value in younger (18–24 months) average-risk children [12–15]; and a recent study showed that a positive screen on the ASQ communication domain was a stronger predictor of referral than a positive M-CHAT screen [16]. This raises the question of the need for two types of screening tools to identify children needing referral. Administration of two screening tools may be burdensome and is not recommended in other countries such as Canada and the U.K.

In this context, the Infant Toddler Checklist (ITC) may be promising as a single screening tool at the 18-month visit. The ITC developers have reported its ability to detect a range of developmental concerns, including language delay, global developmental delay, and ASD [17–20]. The ITC is freely available, takes 5 minutes for parents to complete and 2 minutes for scoring [21]. The feasibility of administering the ITC at 18 months has been assessed by 203 primary care pediatricians in over 40,000 children in California [21,22]; and in over 2,600 children in publicly-funded Swedish child health services [23,24]. However, the validity of the ITC in routine primary care has not been fully evaluated.

The validity of developmental screening tools may be assessed with measurement of the tool and criterion at the same time (concurrent validity) or measurement of the criterion at some time after the tool (predictive validity) [25]. Concurrent validity studies often use a diagnostic assessment as the criterion measure [12,13,26–29]. Predictive validity studies examine future outcomes that are meaningful to children, families, practitioners and the health care system as the criterion measure [25,30,31]. Tool accuracy is influenced by child age at screening, with lower sensitivity at younger compared with older age; risk for developmental delay, with lower sensitivity in average-risk compared with high-risk children; and diagnostic assumptions made for children lost to follow-up, with higher sensitivity when a lower prevalence of diagnosis is assumed [26,30,32]. For application in our primary care and public health settings, we aimed to assess the predictive validity of the ITC in average-risk children screened at 18 months of age with complete follow-up at a later age.

Developmental screening tools are not intended to be used as a diagnostic assessment [33]. Rather, these tools are intended to identify children who should receive close monitoring (i.e., a follow-up visit) or referral (i.e., for a diagnostic assessment), as recommended by the AAP [1]. It has also been emphasized that screening tools must not only be accurate but also improve care by influencing clinician and parent decision-making [32]. Therefore, health care utilization for consultation and monitoring are meaningful outcomes for assessing the predictive validity of a screening tool and its association with clinical care.

We hypothesized that children with developmental concerns identified in early childhood would be more likely to receive a neurodevelopmental consultation; and that children requiring close monitoring and/or referral would have greater health care utilization including unscheduled primary care visits and non-primary care and specialist visits.

Our primary objective was to examine the predictive validity of the ITC in average-risk children at the 18-month primary care visit, using neurodevelopmental consultation as the criterion measure. The secondary objective was to examine the association between positive screening using the ITC at the 18-month visit with later health care utilization (HCU).

Methods

Design

We used a prospective cohort study design. Developmental screening results from parent-completed ITC were collected at the child’s 18-month visit in primary care. Later neurodevelopmental consultation and other HCU were collected from health administrative databases. We followed the Standards for Reporting of Diagnostic Accuracy (STARD) guideline [34]. The study period was October 1, 2011 to March 31, 2021.

Setting and usual care

The study was embedded in real-world primary care practices participating in the TARGet Kids! practice-based research network (www.targetkids.ca) in the province of Ontario, Canada, where primary care is delivered through the universal health care system by both pediatricians and family physicians. The Canadian Paediatric Society recommends 11 health supervision visits from birth to 5 years [35] and use of a developmental screening tool once at the 18-month visit [3]. Beginning in 2009, as part of a physician incentive program, the provincial government provided the Nipissing District Developmental Screen (NDDS) free of charge, and it has been widely used as the developmental screening tool at the 18-month visit [3,36,37]. Despite its widespread use, the NDDS has been reported to have poor performance characteristics, and in 2023 the province announced that it would no longer renew the province-wide license or provide the NDDS free of charge [36,38]. However, as our study period preceded 2023, and in keeping with usual practice, parents completed the NDDS, which was used by the child’s physician for clinical care purposes.

Referral is required for neurodevelopmental assessments which are completed by developmental and consultant pediatricians. During this study, usual care (including developmental surveillance, screening using the NDDS, and later health care utilization such as monitoring and referral) was conducted as per the child’s primary care physician, with no involvement by the research team.

Data sources

The TARGet Kids! database was the source for participant characteristics and ITC results which were collected prospectively. At the 18-month visit, parents completed a standardized data collection form and the ITC. The TARGet Kids! cohort profile has been published and registered at ClinicalTrials.gov (NCT01869530) [39].

Population-based health administrative databases held at ICES (Ontario, www.ices.on.ca) were the source for HCU data. ICES is an independent, not-for-profit research institute which holds health data that are routinely collected for the publicly-funded, single-payer healthcare system in the province of Ontario, Canada. Use of these data is authorized under Ontario’s Personal Health Information Protection Act. Several ICES databases were used as the source for HCU data: The Ontario Health Insurance Plan (OHIP) physician claims database includes physician fee-for-service billings; the Canadian Institute for Health Information Discharge Abstract Database (DAD) includes acute-care hospitalizations; the National Ambulatory Care Reporting System (NACRS) includes emergency department (ED) visits. Data were available from January 1, 2010 to March 31, 2022.

Data from children in TARGet Kids! were deterministically linked at the individual level using each child’s encrypted OHIP number to ICES databases. Using data from this single-payer system ensured complete follow-up of all participants.

Participants

Children were eligible if they were 16–23 months of age attending a scheduled 18-month health supervision visit and had a parent-completed ITC between October 1, 2011 and July 31, 2019.

To ensure a study population at average risk for developmental delay, we excluded children with health conditions affecting growth and development, an established diagnosis of developmental delay, gestational age < 35 weeks, or birthweight < 2.5 kg. We also excluded children who were seen at an unscheduled visit, whose parents were unable to communicate in English, who had no valid OHIP number, who were not continuously eligible for OHIP between birthdate and index screening or between index screening and 1 year post screening, whose index date was after March 31, 2021, or who did not have at least 1 year of follow-up.

Infant Toddler Checklist

Wetherby and Prizant developed the ITC for children ages 6–24 months (see the checklist and scoring available online) [17] and evaluated the concurrent and predictive validity in non-primary care settings [18,19]. In children 12–24 months, the developers reported a sensitivity of 83%–89% and specificity of 70%–77% for a communication disorder, and a sensitivity of 93% for ASD [19,20].

The ITC includes 24 items which yield three composite scores (social, speech, symbolic) and their sum produces a total score [17–20]. The 10th percentile cut-off for each month of age indicates concern/no concern for each of the four scores (three composite scores and total score).

For this study, parents completed the ITC once at the 18-month visit. The ITC was completed for research purposes; physicians and parents were blind to the results. We examined three components of the ITC using definitions from the developers’ recommendations, which were included as predictors in our analytical models. First, a positive ITC was defined as concern for speech delay (defined as concern on the speech composite), or concern for other communication delay (defined as concern on the social composite, symbolic composite, or total score), or concern for both. Then, we separately examined: concern for speech delay (recommendation: re-screen in 3 months to determine if referral is advisable) and concern for other communication delay (recommendation: refer).
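The three screening definitions above can be sketched as a small classification routine. This is only an illustration of the logic as described in the text: the class and function names are hypothetical, and the concern flags are assumed to have already been derived from the age-specific 10th percentile cut-offs, which are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItcConcerns:
    """Concern flags for one child (hypothetical container).

    True means the score was flagged as a concern by the
    age-specific cut-off; the cut-off values are not modelled here.
    """
    social: bool
    speech: bool
    symbolic: bool
    total: bool


def classify_itc(c: ItcConcerns) -> dict:
    """Apply the three definitions described in the text."""
    # Concern for speech delay: concern on the speech composite.
    speech_delay = c.speech
    # Concern for other communication delay: concern on the social
    # composite, the symbolic composite, or the total score.
    other_communication_delay = c.social or c.symbolic or c.total
    return {
        "speech_delay": speech_delay,
        "other_communication_delay": other_communication_delay,
        # Positive ITC: either concern, or both.
        "positive": speech_delay or other_communication_delay,
    }


# A child flagged only on the speech composite is screen-positive
# through the speech-delay definition alone.
print(classify_itc(ItcConcerns(social=False, speech=True, symbolic=False, total=False)))
```

Keeping the two concern types separate mirrors the analysis, in which each was also examined as its own predictor.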

Neurodevelopmental consultation

The physician billing code for a neurodevelopmental consultation (at least once at any time in the follow-up period) was the primary outcome and criterion measure. The Ontario Ministry of Health Schedule of Benefits defines a neurodevelopmental consultation (billing code A667, fee CAD$401.30) as: “a consultation in which the physician provides all the elements of a consultation (A265) for an infant, child or adolescent with complex neurodevelopmental conditions (e.g., autism, global developmental disorders, etc.) and spends a minimum of 90 minutes of direct contact with the patient and caregiver. The service is limited to a maximum of one per patient, per physician, per 12-month period” [40]. We have recently published an analysis of the validity of the NDDS using this same criterion [38].

Other health care utilization

For our secondary objective to assess other HCU at any time in the follow-up period, which may represent monitoring and/or referral, the following were collected: special pediatric consultation/assessment (at least once), scheduled primary care visit billings (cumulative number of billings), unscheduled primary care or minor visit billings (cumulative number of billings), other non-primary care visit billings (cumulative number of billings), hospitalizations (0, 1, 2+), ED visits (0, 1, 2+). The category special pediatric consultation/assessment includes two physician billing codes for extended pediatric consultation (75–90 minutes) and three specialized codes for developmental and/or behavioral care. Billing codes were grouped by three experienced primary care physicians by consensus. See S1 Table for physician billing codes.

Statistical analysis

Descriptive statistics were used to characterize all participants, as well as for those children who screened positive and who screened negative. We also described health care utilization at baseline from birth until index screening using the ICES data prior to index date (i.e., the look-back window).

To examine the predictive validity of the ITC, we calculated the screening test properties (sensitivity, specificity, false positive rate, positive predictive value [PPV], negative predictive value [NPV]), with 95% confidence intervals (CIs), for each of the three components of the ITC at screening. We used the neurodevelopmental consultation physician billing code as the criterion measure.

We also examined the association between ITC (predictor) with HCU (outcome) using multivariable negative binomial regression analyses. The logarithm of observation time was used as an offset to account for variation in the window of observation. For our binary outcomes (e.g., neurodevelopmental consultation), the offset was the time to first event of outcome or end of follow-up, for those without the outcome. For count outcomes (e.g., scheduled primary care visits), the offset was the entire time a child spent in the study. We used separate regression models to estimate the rate ratio (RR) and 95% CIs for each of the 7 HCU outcomes, which were measured after the index date when the ITC was administered. Models were adjusted for child age, child sex, number of siblings, maternal ethnicity, self-reported annual family income (CAD$), family immigration status, and family history of developmental concern (ASD, attention deficit disorder, learning disability) regardless of statistical significance [41]. All covariates were measured at index screening. Child age and child sex had complete data. All other covariates had < 12% missing data, except for family history of developmental concern, which had 22% missing data; multiple imputation by chained equation (MICE) was used to impute missing covariate data [42]. Statistical significance was defined as p < 0.05; all statistical tests were two-sided. Statistical analysis was conducted using SAS V.9.4 statistical software (SAS Institute).
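As a sketch of the regression described above, assuming the standard log-link negative binomial form with a quadratic variance function (the symbols are introduced here for illustration and do not appear in the original analysis), the model for child i can be written as:

```latex
\log \mu_i = \log(t_i) + \beta_0 + \beta_1\,\mathrm{ITC}_i + \boldsymbol{\gamma}^{\top}\mathbf{x}_i,
\qquad
\mathrm{Var}(Y_i) = \mu_i + \alpha\,\mu_i^{2},
\qquad
\mathrm{aRR} = \exp(\beta_1)
```

Here Y_i is the HCU outcome with mean \mu_i, t_i is the observation time entering as the offset, ITC_i indicates a positive screen, x_i holds the adjustment covariates, and \alpha is the dispersion parameter. Because the coefficient on \log(t_i) is fixed at 1, \exp(\beta_1) is interpreted as a ratio of rates per unit of observation time rather than a ratio of raw counts.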

Ethics

Ethics approval was obtained from the Hospital for Sick Children and Unity Health Toronto Research Ethics Boards, Toronto, Canada. Parent and/or legal guardian informed written consent was obtained. For clinical care, physicians followed usual care as described in ‘Setting and usual care’.

Results

Participants

Fig 1 shows the participant flow for the final cohort (n = 1,460, mean age 18.2 months). Table 1 shows the participant characteristics at index screening, and HCU from birth until index screening. Of 1,460 children screened, 160 (11.0%) screened positive. Table 2 shows HCU from index screening to follow-up. The mean (SD) time to follow-up was 6.5 (2.2) years; 38 children (2.6%) had a neurodevelopmental consultation, and 165 (11.3%) had a special pediatric consultation/assessment.

Table 1. Participant characteristics at index screening with the Infant Toddler Checklist (n = 1,460) and health care utilization from birth until index screening.

https://doi.org/10.1371/journal.pone.0326751.t001

Table 2. Health care utilization from index screening at 18 months to follow up for children screened using the Infant Toddler Checklist (n = 1,460).

https://doi.org/10.1371/journal.pone.0326751.t002

Predictive validity

The screening test properties with neurodevelopmental consultation as the criterion measure are shown in Table 3. A positive ITC screen had a sensitivity of 39.5% (95% CI 24.0%, 56.6%), specificity of 89.8% (95% CI 88.1%, 91.3%), false positive rate of 10.2% (95% CI 8.6%, 12.0%), PPV of 9.4% (95% CI 5.3%, 15.0%), and NPV of 98.2% (95% CI 97.4%, 98.9%). Low sensitivity and high specificity were also found for concern for speech delay and concern for other communication delay.
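The point estimates above can be reproduced from a 2×2 table. The sketch below is illustrative only: the cell counts are inferred from the figures reported in the text (1,460 screened, 160 screen-positive, 38 with a consultation, sensitivity 39.5%) rather than quoted from Table 3, and the function name is hypothetical. Confidence intervals are omitted because the interval method is not stated.

```python
# Totals reported in the text.
n_total = 1460
n_screen_positive = 160
n_outcome = 38  # children with a neurodevelopmental consultation

# Assumption: 15 true positives, since 15/38 = 39.5% matches the
# reported sensitivity. The remaining cells follow by subtraction.
true_positive = 15
false_positive = n_screen_positive - true_positive
false_negative = n_outcome - true_positive
true_negative = n_total - true_positive - false_positive - false_negative


def screening_properties(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return the standard screening test properties as proportions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


props = screening_properties(true_positive, false_positive,
                             false_negative, true_negative)
for name, value in props.items():
    print(f"{name}: {value:.1%}")
```

Under these inferred counts, the low PPV despite high specificity follows from the low prevalence of the criterion outcome (38 of 1,460 children).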

Table 3. Screening test properties of the 18-month Infant Toddler Checklist (n = 1,460) at the 18-month visit compared with later neurodevelopmental consultation at mean follow-up of 6.5 years.

https://doi.org/10.1371/journal.pone.0326751.t003

The adjusted rate ratios (aRR) are shown in Table 4. A positive ITC screen was associated with higher rates of neurodevelopmental consultation (aRR 2.78, 95% CI 1.37, 5.67, p = 0.005), special pediatric consultation/assessment (aRR 1.75, 95% CI 1.17, 2.61, p = 0.007), scheduled primary care visit billings (aRR 1.11, 95% CI 1.02, 1.21, p = 0.02), unscheduled primary care or minor visit billings (aRR 1.44, 95% CI 1.14, 1.81, p = 0.002), other non-primary care visit billings (aRR 1.15, 95% CI 1.00, 1.32, p = 0.05), and hospitalizations (aRR 1.48, 95% CI 1.01, 2.19, p = 0.046) compared with a negative ITC screen.

Table 4. Association between positive developmental screening at 18 months using the Infant Toddler Checklist (n = 1,460) and later health care utilization.

https://doi.org/10.1371/journal.pone.0326751.t004

Discussion

In this study of 1,460 children receiving developmental screening at 18 months, 11% had a positive ITC (similar to rates in California [14%] and Sweden [13%]) [22,24], and 2.6% received a neurodevelopmental consultation at an average age of 8 years. We examined both tool accuracy and influence on clinical decision-making [32]. A positive ITC had high specificity (90%) and low false positive rate (10%) but low sensitivity (40%). This suggests that most children with a negative ITC will not have a later neurodevelopmental consultation (specificity), but the tool cannot accurately identify those who will have a later neurodevelopmental consultation (sensitivity) resulting in many false negatives. The low false positive rate may minimize unintended harms such as anxiety and resource use [43]. A positive ITC at 18 months was associated with higher rates of six of seven types of health care utilization measured at approximately 8 years. This suggests a strong association between the ITC and clinical decision-making with respect to children’s need for health care services, given that physicians and parents were blind to the results of the ITC.

Pierce and colleagues have reported on outcomes and feasibility of implementing the ITC in a network of 203 primary care pediatricians in San Diego County and established a system for screening, evaluation, and referral for neurodevelopmental risk [21,22]. The investigators selected the ITC as the screening tool due to its ability to identify children with a range of developmental concerns, including young children with ASD who do not exhibit full symptoms and who may be missed using ASD-specific screening tools [22]. Over 40,000 children were screened at 12, 18, and/or 24 months (median age of 18 months), 14% had a positive ITC screen, and 2% were referred and completed at least one diagnostic evaluation (median age of 20 months) [22]. Following evaluation, children received a range of diagnoses including language delay, global developmental delay, and ASD or ASD features. Of pediatricians completing a satisfaction survey, 96% provided a positive evaluation [21]. The authors reported a positive predictive value of 75%, but noted that their studies were not designed to assess sensitivity and specificity which could not be calculated due to incomplete follow-up of all screen positive and screen negative children [21]. The authors have also reported that the mean age of screening and evaluation was similar among children of diverse ethnic or racial backgrounds, suggesting their system may ensure equitable access to care for all children [44]. This system of screening using the ITC followed by diagnostic evaluation has been successfully used for other clinical and genetic studies of ASD [45,46].

Fäldt and colleagues have reported on the implementation of the ITC at the 18-month visit in publicly-funded Swedish child health services using the RE-AIM framework (reach, effectiveness, adoption, implementation, maintenance) [23,24]. Of 2,633 children screened, 13% had a positive ITC and the referral rate to a speech and language pathologist increased from 0.4% pre-implementation to 6.9% post-implementation [24]. However, calculation of concurrent validity of the ITC was limited as not all children with a positive screen and only a random sample of children with a negative screen were referred and assessed with the criterion measure [23].

We previously examined the predictive validity of the ITC using two different criterion measures: parent-reported developmental diagnosis at 3–5 years (n = 488), and teacher-reported school readiness at 4–6 years (n = 293) [47,48]. The previous analyses showed comparable sensitivity, specificity and false positive rates despite different criterion measures and prevalence rates (prevalence: 5.3% for parent-reported developmental diagnosis, 18.4% for school readiness vulnerability). Our current analysis provides a larger sample size (n = 1,460) and a health-system-related criterion measure (physician billing for a neurodevelopmental consultation, prevalence 2.6%). In addition to evaluating predictive validity of the ITC using a variety of criterion measures, we have examined factors associated with a positive ITC, and found an association with low family income, child male sex, and low birthweight, providing additional evidence of ‘known-groups’ validity of the ITC [49].

Strengths of the ITC include its focus on communication, which is highly relevant to developmental milestones at the 18-month visit, the easy-to-complete 1-page format, accessibility free-of-charge, and ability to detect a range of developmental concerns (language delay, global developmental delay, and ASD). The ITC does not screen specifically for motor delays and disorders. To address this limitation, a milestone approach to developmental surveillance, including physical examination and history, may lead to identification of motor delays without the use of a screening tool [1,50]. A recent study examining the predictors of early intervention referral after a positive developmental screen in primary care, reported no association with the scores on the gross motor domain of the ASQ [16].

Other general and ASD-specific screening tools are widely used. Review of the accuracy of these tools reveals low sensitivity and high specificity when used in young (18–24 months), average-risk children, especially when predictive validity is assessed (described as the ‘thorny nature’ of predictive validity studies of developmental screening), and complete follow-up is ensured [25,26,32].

The Ages and Stages Questionnaires (ASQ), now in its third edition (ASQ-3), is a general development tool for children 1–66 months, and screens 5 domains: communication, gross motor, fine motor, problem solving and personal-social [51]. Screening test properties have been evaluated in several recent systematic reviews [27,28,30,31]. However, reported summary measures of sensitivity and specificity may be misleading, as included studies represent a wide age range (0–60 months), mixed populations (average- and high-risk), and various criterion measures. Sheldrick et al examined concurrent validity of the ASQ-3, compared with the Bayley Scales of Infant and Toddler Development (BSID-III), in children 0–42 months recruited from primary care offices and reported a sensitivity of 35% and specificity of 89% [12]. Veldhuizen et al examined the concurrent validity of the ASQ, compared with the BSID-III, in a community sample of children 1–36 months (n = 587) and reported a sensitivity of 41% and specificity of 82% [13]. Solgi et al reported that only the communication domain was associated with referral to early intervention following screening with the ASQ-3 at 18–24 months in primary care [16].

The Modified Checklist for Autism in Toddlers, Revised with Follow-Up (M-CHAT-R/F) is an ASD-specific tool for children ages 16–30 months and a risk score determines if a follow-up structured questionnaire is required [52]. Screening test properties have been evaluated in two recently published systematic reviews [26,29]. Aishworiya et al [26] included 15 studies; however, summary measures of sensitivity and specificity were based on limited follow-up and evaluation of screen-negative children [53]. Using results from a published large-scale screening study [52], Sheldrick et al developed simulation models with six scenarios, each using different assumptions regarding loss to follow-up, and reported a sensitivity ranging from 40% to 94% across scenarios [32]. Wieckowski et al [29] included 50 studies; however, sensitivity varied widely according to child age at screening (14–86 months), population (average- or high-risk), and design (concurrent or predictive). In two recent predictive validity studies of average-risk children 18–24 months in primary care, sensitivity was 33–39% and specificity was 94–98% [14,15].

Strengths of our study include a large sample size, recruitment of average-risk children from primary care practices during scheduled health supervision visits, recruitment of families with diverse characteristics (included in the adjusted analyses), blinding of parents and physicians to the ITC results, use of standardized physician billing codes, and a predictive validity design. Whereas many studies are limited by the challenges of completing follow-up and evaluation of all screened children [26,32,33,53], we were able to ensure complete follow-up of both screen-positive and screen-negative children by linking individual parent-reported data with population-based health administrative databases.

Limitations include the use of physician billing codes for neurodevelopmental consultation rather than diagnosis codes for developmental disorders (which have not been validated); however, the aim of screening is to identify children who should receive monitoring or referral, and other predictive validity studies have used outcomes other than diagnosis, such as educational outcomes and referral to early intervention [16,30,31]. Although billing code algorithms have not been validated either, billing codes are likely to be more accurate than diagnosis codes because they represent a remunerative event rather than a subjective clinical impression. Our study was also limited by the exclusion of parents unable to communicate in English. Given our aim to assess ITC validity in average-risk 18-month-old children to predict physician health care, we were unable to report on ITC validity in high-risk children, older children, or prediction of non-physician health care (such as speech-language therapy).

Conclusion

There are several implications for practice and policy. First, screening at 18 months using the ITC has high specificity and a low false positive rate. Second, the ITC has low sensitivity, consistent with current literature showing low sensitivity of other commonly used screening tools when examined prospectively in young, average-risk children in real-world primary care settings; this highlights the importance of ongoing developmental surveillance beyond the 18-month visit. Third, screening with a single tool which may identify children with a range of developmental concerns, rather than administering two tools, may be an efficient approach warranting further study.

Acknowledgments

We thank all participating children and families for their time and involvement in this study. We thank the TARGet Kids! Collaboration for supporting this study (details may be found on our website www.targetkids.ca). The TARGet Kids! Collaboration is a primary care practice–based research network and includes practice site physicians, research staff, collaborating investigators, trainees, methodologists, biostatisticians, data management personnel, laboratory management personnel, and advisory committee members.

This study was supported by ICES (details may be found on the ICES website https://www.ices.on.ca/). Parts of this material are based on data and/or information compiled and provided by the Canadian Institute for Health Information and the Ontario Ministry of Health. The analyses, conclusions, opinions, and statements expressed herein are solely those of the authors and do not reflect those of the data sources; no endorsement is intended or should be inferred.

Supporting information

S1 Table. OHIP fee codes used to define study outcomes.

https://doi.org/10.1371/journal.pone.0326751.s001

(DOCX)

References

  1. Lipkin PH, Macias MM, Council on Children with Disabilities, Section on Developmental and Behavioral Pediatrics. Promoting optimal development: identifying infants and young children with developmental disorders through developmental surveillance and screening. Pediatrics. 2020;145(1):e20193449. pmid:31843861
  2. Hyman SL, Levy SE, Myers SM, Council on Children with Disabilities, Section on Developmental and Behavioral Pediatrics. Identification, evaluation, and management of children with autism spectrum disorder. Pediatrics. 2020;145:e20193447.
  3. Williams R, Clinton J, Canadian Paediatric Society, Early Years Task Force. Getting it right at 18 months: In support of an enhanced well-baby visit. Paediatr Child Health. 2011;16(10):647–54. pmid:23204907
  4. Zwaigenbaum L, Brian JA, Ip A, Canadian Paediatric Society, Autism Spectrum Disorder Guidelines Task Force. Canadian Paediatric Society Position Statement: Early detection for autism spectrum disorder in young children. Paediatr Child Health. 2019;24(7):424–32.
  5. Garg P, Ha MT, Eastwood J, Harvey S, Woolfenden S, Murphy E. Health professional perceptions regarding screening tools for developmental surveillance for children in a multicultural part of Sydney, Australia. BMC Fam Pract. 2018;19:42.
  6. McLean K, Goldfeld S, Molloy C, Wake M. Screening and surveillance in early childhood health: Rapid review of evidence for effectiveness and efficiency of models. An Evidence Check Review brokered by the Sax Institute for NSW Kids and Families, Australia. 2014.
  7. Bellman M, Byrne O, Sege R. Developmental assessment of children. BMJ. 2013;346:e8687. pmid:23321410
  8. UK Office for Health Improvement & Disparities, Department of Health and Social Care. The Healthy Child Programme: Resources to help keep children healthy and well from preconception to adulthood. 2023. [cited 2025 Mar 19]. Available from: https://www.gov.uk/government/collections/healthy-child-programme
  9. Wilson P, Wood R, Lykke K, Hauskov Graungaard A, Ertmann RK, Andersen MK, et al. International variation in programmes for assessment of children’s neurodevelopment in the community: understanding disparate approaches to evaluation of motor, social, emotional, behavioural and cognitive function. Scand J Public Health. 2018;46(8):805–16. pmid:29726749
  10. Guevara JP, Gerdes M, Localio R, Huang YV, Pinto-Martin J, Minkovitz CS, et al. Effectiveness of developmental screening in an urban setting. Pediatrics. 2013;131(1):30–7. pmid:23248223
  11. Lipkin PH, Macias MM, Baer Chen B, Coury D, Gottschlich EA, Hyman SL. Trends in pediatricians’ developmental screening: 2002-2016. Pediatrics. 2020;145(4):e20190851.
  12. Sheldrick RC, Marakovitz S, Garfinkel D, Carter AS, Perrin EC. Comparative accuracy of developmental screening questionnaires. JAMA Pediatr. 2020;174(4):366–74. pmid:32065615
  13. Veldhuizen S, Clinton J, Rodriguez C, Wade TJ, Cairney J. Concurrent validity of the ages and stages questionnaires and bayley developmental scales in a general population sample. Acad Pediatr. 2015;15(2):231–7. pmid:25224137
  14. Guthrie W, Wallis K, Bennett A, Brooks E, Dudley J, Gerdes M, et al. Accuracy of autism screening in a large pediatric network. Pediatrics. 2019;144(4):e20183963. pmid:31562252
  15. Carbone PS, Campbell K, Wilkes J, Stoddard GJ, Huynh K, Young PC, et al. Primary care autism screening and later autism diagnosis. Pediatrics. 2020;146(2):e20192314. pmid:32632024
  16. Solgi M, Calub C, Feryn A, Hoang A, Fombonne E, Matushak C, et al. Predictors of early intervention referral after a positive developmental screen in community primary care clinics. Acad Pediatr. 2025;25(3):102591. pmid:39395610
  17. Infant Toddler Checklist, First Words Project – checklist and scoring. [cited 2025 Mar 19]. Available from: https://firstwords.fsu.edu/pdf/checklist_scoring_cutoffs.pdf
  18. Wetherby AM, Prizant G. CSBS DP Manual. First Normed Edition ed. Baltimore: Brookes Publishing; 2008.
  19. Wetherby A, Goldstein H, Cleary J, Allen L, Kublin K. Early identification of children with communication disorders: concurrent and predictive validity of the CSBS developmental profile. Infant Young Child. 2003;16:161–74.
  20. Wetherby AM, Brosnan-Maddox S, Peace V, Newton L. Validation of the Infant-Toddler Checklist as a broadband screener for autism spectrum disorders from 9 to 24 months of age. Autism. 2008;12(5):487–511. pmid:18805944
  21. Pierce K, Carter C, Weinfeld M, Desmond J, Hazin R, Bjork R, et al. Detecting, studying, and treating autism early: the one-year well-baby check-up approach. J Pediatr. 2011;159(3):458-465.e1-6. pmid:21524759
  22. Pierce K, Gazestani V, Bacon E, Courchesne E, Cheng A, Barnes CC, et al. Get SET early to identify and treatment refer autism spectrum disorder at 1 year and discover factors that influence early diagnosis. J Pediatr. 2021;236:179–88.
  23. Fäldt A, Fabian H, Dahlberg A, Thunberg G, Durbeej N, Lucas S. Infant-Toddler Checklist identifies 18-month-old children with communication difficulties in the Swedish child healthcare setting. Acta Paediatr. 2021;110(5):1505–12. pmid:33251672
  24. Dahlberg A, Levin A, Fäldt AE. Implementation of the Infant-Toddler Checklist in Swedish child health services at 18 months: an observational study. BMJ Paediatr Open. 2024;8(1):e002406. pmid:38531549
  25. Marks K, Glascoe FP, Aylward GP, Shevell MI, Lipkin PH, Squires JK. The thorny nature of predictive validity studies on screening tests for developmental-behavioral problems. Pediatrics. 2008;122(4):866–8. pmid:18829812
  26. Aishworiya R, Ma VK, Stewart S, Hagerman R, Feldman HM. Meta-analysis of the Modified Checklist for Autism in Toddlers, revised/follow-up for screening. Pediatrics. 2023;151(6):e2022059393. pmid:37203373
  27. Muthusamy S, Wagh D, Tan J, Bulsara M, Rao S. Utility of the ages and stages questionnaire to identify developmental delay in children aged 12 to 60 months: a systematic review and meta-analysis. JAMA Pediatr. 2022;176(10):980–9.
  28. Balasubramanian H, Ahmed J, Ananthan A, Srinivasan L, Mohan D. Comparison of parent or caregiver-completed development screening tools with Bayley Scales of Infant Development: a systematic review and meta-analysis. Arch Dis Child. 2024;109(9):759–66. pmid:38811056
  29. Wieckowski AT, Williams LN, Rando J, Lyall K, Robins DL. Sensitivity and specificity of the modified checklist for Autism in Toddlers (Original and Revised): a systematic review and meta-analysis. JAMA Pediatr. 2023;177(4):373–83. pmid:36804771
  30. Cairney DG, Kazmi A, Delahunty L, Marryat L, Wood R. The predictive value of universal preschool developmental assessment in identifying children with later educational difficulties: a systematic review. PLoS One. 2021;16(3):e0247299. pmid:33661953
  31. Schonhaut L, Maturana A, Cepeda O, Serón P. Predictive validity of developmental screening questionnaires for identifying children with later cognitive or educational difficulties: a systematic review. Front Pediatr. 2021;9:698549. pmid:34900855
  32. Sheldrick RC, Hooker JL, Carter AS, Feinberg E, Croen LA, Kuhn J, et al. The influence of loss to follow-up in autism screening research: taking stock and moving forward. J Child Psychol Psychiatry. 2024;65(5):656–67. pmid:37469104
  33. Wallis KE, Usher R. Applying autism screening research to real-world scenarios: a commentary on Sheldrick et al. (2023). J Child Psychol Psychiatr. 2024;65:720–2.
  34. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, et al. Standards for Reporting of Diagnostic Accuracy. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD Initiative. BMJ. 2003;326:41–4.
  35. Canadian Paediatric Society, Caring for Kids. Information for parents from Canada’s paediatricians. Schedule of well-child visits. 2021. [cited 2025 Mar 19]. Available from: https://caringforkids.cps.ca/handouts/pregnancy-and-babies/schedule_of_well_child_visits
  36. Cairney J, Clinton J, Veldhuizen S, Rodriguez C, Missiuna C, Wade T, et al. Evaluation of the revised Nipissing District Developmental Screening (NDDS) tool for use in general population samples of infants and children. BMC Pediatrics. 2016;16:42.
  37. Guttmann A, Saunders NR, Kumar M, Gandhi S, Diong C, MacCon K, et al. Implementation of a physician incentive program for 18-month developmental screening in Ontario, Canada. J Pediatr. 2020;226:213–220.e1. pmid:32451126
  38. Borkhoff CM, Imsirovic H, Bayoumi I, Macarthur C, Nurse KM, To T, et al. Developmental screening at 18 months using the Nipissing district developmental screen. Paediatr Child Health. 2025:pxaf023.
  39. Carsley S, Borkhoff CM, Maguire JL, Birken CS, Khovratovich M, McCrindle B, et al. Cohort Profile: The Applied Research Group for Kids (TARGet Kids!). Int J Epidemiol. 2015;44(3):776–88. pmid:24982016
  40. Ministry of Health, Schedule of Benefits Physician Services. [cited 2025 Mar 18]. Available from: https://www.ontario.ca/page/ohip-schedule-benefits-and-fees
  41. Harrell F. Regression modeling strategies with applications to linear models, logistic regression, and survival analysis. New York, NY: Springer; 2001.
  42. van Buuren S, Groothuis-Oudshoorn K. Mice: Multivariate imputation by chained equations in R. J Stat Softw. 2011;45:1–67.
  43. Canadian Task Force on Preventive Health Care. Recommendations on screening for developmental delay. CMAJ. 2016;188(8):579–87. pmid:27026672
  44. Pham C, Bacon EC, Grzybowski A, Carter-Barnes C, Arias S, Xu R, et al. Examination of the impact of the Get SET Early program on equitable access to care within the screen-evaluate-treat chain in toddlers with autism spectrum disorder. Autism. 2023;27(6):1790–802. pmid:36629055
  45. Pierce K, Wen TH, Zahiri J, Andreason C, Courchesne E, Barnes CC, et al. Level of attention to motherese speech as an early marker of autism spectrum disorder. JAMA Netw Open. 2023;6(2):e2255125. pmid:36753277
  46. Bao B, Zahiri J, Gazestani VH, Lopez L, Xiao Y, Kim R. A predictive ensemble classifier for the gene expression diagnosis of ASD at ages 1 to 4 years. Mol Psychiatry. 2023;28(2):822–33.
  47. Borkhoff CM, Atalla M, Bayoumi I, Birken CS, Maguire JL, Parkin PC. Predictive validity of the Infant Toddler Checklist in primary care at the 18-month visit and developmental diagnosis at 3-5 years: a prospective cohort study. BMJ Paediatr Open. 2022;6(1):e001524. pmid:36053584
  48. Nurse KM, Janus M, Birken CS, Keown-Stoneman CDG, Omand JA, Maguire JL, et al. TARGet Kids! Collaboration. Predictive Validity of the Infant Toddler Checklist in Primary Care at the 18-month Visit and School Readiness at 4 to 6 Years. Acad Pediatr. 2023;23(2):322–8. pmid:36122830
  49. Nurse KM, Parkin PC, Keown-Stoneman CDG, Bayoumi I, Birken CS, Maguire JL, et al. Association between family income and positive developmental screening using the Infant Toddler Checklist at the 18-month health supervision visit. J Pediatr. 2024;264:113769. pmid:37821023
  50. Dosman CF, Andrews D, Goulden KJ. Evidence-based milestone ages as a framework for developmental surveillance. Paediatr Child Health. 2012;17(10):561–8. pmid:24294064
  51. Ages and Stages Questionnaires. [cited 2025 Mar 19]. Available from: https://agesandstages.com/
  52. Robins DL, Casagrande K, Barton M, Chen C-MA, Dumont-Mathieu T, Fein D. Validation of the modified checklist for Autism in toddlers, revised with follow-up (M-CHAT-R/F). Pediatrics. 2014;133(1):37–45. pmid:24366990
  53. Aishworiya R, Ma VK, Feldman HM. Commentary: Taking stock and moving forward - the need to consider the influence of loss to follow-up in autism screening research. J Child Psychol Psychiatry. 2024;65(9):1243–4.