A scoping review and quality assessment of machine learning techniques in identifying maternal risk factors during the peripartum phase for adverse child development

Abstract

Maternal exposure to environmental risk factors (e.g., heavy metal exposure) or mental health problems during the peripartum phase has been shown to lead to negative and lasting impacts on child development and life in adulthood. Given the importance of identifying early markers within highly complex and heterogeneous perinatal factors, machine learning techniques emerge as a promising tool. The main goal of the current scoping review was to summarize the evidence on the application of machine learning techniques in predicting or identifying risk factors during peripartum for child development. A critical appraisal was also conducted to evaluate various aspects, including representativeness, data leakage, validation, performance metrics, and interpretability. A systematic search was conducted in PubMed, Web of Science, Scopus, and Google Scholar to identify studies published prior to the 14th of January 2025. Review selection and data extraction were performed by three independent reviewers. After removing duplicates, the searches yielded 10,336 studies, of which 60 studies were included in the final report. Among these 60 machine learning studies, a majority were pattern-focused, using machine learning primarily as a tool to more accurately describe associations between variables, while 16 studies were prediction-focused (26.7%), exploring the predictive performance of their models. For prediction-focused machine learning studies, a diverse range of methodologies was observed. The quality assessment showed that all studies had some important criteria that were not fully met, with deviations ranging from minor to major, limiting the interpretability and generalizability of the reported findings. Future research should aim at addressing these limitations to enhance the robustness and applicability of machine learning models in this field.

Introduction

The peripartum phase, broadly defined as the period during pregnancy and up to 12 months after delivery [1], is a critical phase in children’s development [2,3]. Previous evidence shows that exposure to risk factors during this critical period (e.g., depression, alcohol, heavy metals) may result in structural and functional changes in the brain of neonates [4,5], and cognitive and neuropsychological development delays in infancy and early childhood [6,7]. The identification of early markers and predictors in the developmental pathway can help to improve personalized prevention programs and timely intervention. However, the influence of perinatal factors on child development is complex and often arises from the interplay of sociodemographic, psychosocial, biological, and environmental factors [7–10], making it challenging to identify clear risk factors and reliable predictors. Recent advancements in interdisciplinary research have enabled modern machine learning (ML) algorithms to effectively process large-scale, high-dimensional, and unstructured data [11]. In this review, we define ML as a set of algorithms that leverage heuristic, iterative optimization processes, thus adapting to data patterns without the constraints of requiring a predefined functional form or determining a closed-form solution. To capture the breadth of ML definitions in the literature, we distinguish between two categories of ML-based studies in this scoping review. Pattern-focused ML studies use machine learning methods to build models and gain insights into relationships among variables, but do not explicitly evaluate these models for predictive performance. In contrast, prediction-focused ML studies develop and assess models specifically for forecasting new data points. ML has been widely utilized by healthcare-related experts [12,13], epidemiologists [14], and developmental scientists [15] for predicting child outcomes.
However, although research has increasingly focused on big data in recent years, challenges such as the lack of high-quality data and the limited transparency of ML approaches (often referred to as the “black box problem”) might reduce the quality of results and limit their interpretability [16]. Furthermore, critical appraisal of ML-related research, particularly in the form of review studies, is still uncommon. In this scoping review, we synthesized existing evidence from studies that employed ML approaches to examine maternal risk factors during the peripartum phase as predictors of child development. Moreover, we employed a novel assessment tool developed for evaluating the quality of ML research [17] to detect methodological challenges and limitations specifically in prediction studies.

Potential maternal risk factors during the peripartum phase for developmental problems in offspring

It has been hypothesized that exposure to risk factors during pregnancy might shape the in-utero environment and change the developmental trajectories during the critical period of neurogenesis [18]. Additionally, the postpartum period, extending up to 12 months after delivery, is important as several maternal and family psychosocial risk factors (e.g., persistent depression, disharmony) [19,20], postnatal exposure to pollutants [21], and traumatic experiences [22], can elevate the risk for behavioral problems, depression, or academic challenges in children. Specifically, extensive research has shown that internal risk factors during the perinatal phase, such as perinatal mental health problems and adverse traumatic childhood experiences, are related to offspring’s internalizing and externalizing problems [23], social-emotional, cognitive, and motor development [6,7,24,25]. Moreover, studies have reported that perinatal mental distress and the use of selective serotonin reuptake inhibitors (a type of antidepressant drug) might cause disturbances in fetal brain development and increase the risk of neurodevelopmental disorders [4,26–28]. For external risk factors, it has been shown that toxin exposure [29,30], even prior to pregnancy, may alter a child’s developmental trajectories, and may affect brain structure in early childhood [31]. Research focused on environmental factors has demonstrated the long-term adverse effects of prenatal exposure to various metals on offspring’s cognition, intelligence quotient (IQ) [32], self-regulatory function, learning difficulties [33], and more. Moreover, emerging evidence indicates that prenatal exposure to alcohol [34], chemical substances [35], air pollution [32], nutritional problems [36], and smoke [37] might elevate the risk of neurodevelopmental disorders (e.g., attention deficit/hyperactivity disorder and autism spectrum disorder) in children.
Though the etiologies of attention deficit/hyperactivity disorder (ADHD) and autism spectrum disorder (ASD) are still unclear, several epigenetic studies have provided evidence that perinatal exposure to these internal and external risk factors is associated with an increased risk for ADHD or ASD [38,39]. In recent years, the increasing prevalence of ADHD [40] and ASD [41] has brought more attention to early detection and intervention. Furthermore, besides risk factors during the peripartum period, research focusing on cross-generational transmission has reported that fetal brain and child outcomes may be vulnerable to a mother’s adverse experiences (e.g., maternal childhood maltreatment, experience of stressful legal problems) or substance exposure prior to pregnancy [42–45]. Together, these findings underscore the complex interplay between various risk factors, making it a major challenge to achieve early identification of developmental problems and identify early predictors. This complexity also hinders efforts to support diagnosis, prevention, and timely intervention.

Applications of machine learning techniques in identifying maternal risk factors during the peripartum phase for adverse child development

ML provides insights into large amounts of data using robust probabilistic techniques. One of the main advantages of ML is its ability to incorporate interacting predictors and account for variables with non-linear relationships. To address the challenges posed by complex and heterogeneous data in predictive modeling, ML techniques have emerged as a promising tool for supporting the diagnosis of psychiatric disorders [46,47], cancers [48], and ASD [49]. Furthermore, extensive research has shown that ML can utilize perinatal data to assist in predicting fetal health, neonatal outcomes (e.g., birth weight), preterm birth, stillbirth, length of intensive care, and mortality [17,50–54]. In the context of perinatal risk factors and clinical outcomes of the child, several previous studies have focused on prenatal exposure to alcohol [55,56], cytokines [57], toxins and metals [58], or chemical compositions [59], and have investigated various prediction models of developmental disorders, such as ASD or ADHD. While the application of ML for prediction is considered promising in many fields, potential pitfalls related to biases in the data, data quality, and the lack of standardized evaluation of prediction performance, as well as generalizability and its transfer into practice, have been discussed [15,16]. With the increasing adoption of ML applications in the field, several systematic reviews have examined their use in prediction models for pregnancy outcomes [60], preterm birth [61], neurodevelopmental outcomes after preterm birth [17], and children’s cognitive outcomes [15]. However, there is a notable gap in synthesizing the evidence on the application of ML to predict later child development based on maternal risk factors in infants who were born neither preterm nor with extremely low birth weight.
This gap exists partly because full-term infants are often considered at lower risk compared to preterm or extremely low-birth-weight infants, who are more closely monitored and frequently included in predictive studies. Furthermore, although guidelines for ML applications exist [62], an encompassing evaluation of the quality of evidence required to overcome methodological challenges is often lacking [17].

This scoping review

This scoping review summarizes studies focusing on the application of ML techniques for describing or predicting child development based on maternal risk factors during the peripartum phase, covering full-term and healthy-born children. While preterm and low-birth-weight infants often receive early interventions due to known risks, identifying risk factors or predictors of developmental delay in full-term healthy infants is challenging. Additionally, a critical appraisal was performed, evaluating several aspects, including representativeness, data leakage, internal and external validation, performance metrics, and interpretability. This review provides an overview of the current research on predicting child development outcomes, highlights gaps in the literature, and suggests directions for future studies to enhance the applicability of ML techniques in this context.

Methods

The current review was not pre-registered publicly in a repository. For an overview of search results and information extracted from the studies, please see the following link on the Open Science Framework (https://osf.io/n5gyd/).

Search strategy

Systematic searches were conducted using PubMed, Web of Science, Scopus, and Google Scholar to identify published studies. The literature search process was carried out with the assistance of two senior librarians, one of whom is a co-author. The searches were restricted to articles published up to January 14, 2025. Keywords related to maternal factors included perinatal, antenatal, prenatal, postnatal factors, postpartum, and pregnancy. Keywords related to child development included infant outcome, motor development, cognitive development, regulation, social-emotional development, and neurodevelopment. Other keywords included terms such as ML, artificial intelligence, artificial neural networks, Bayesian kernel machine regression (BKMR), decision tree, deep learning, k-means, neural network, random forest, regression tree, supervised learning, semi-supervised learning, and unsupervised learning. These keyword groups were used in a Boolean search across all four databases. Additionally, a MeSH search was conducted in PubMed and a topic search was performed in Web of Science and Scopus to ensure broader coverage of potentially related content (see S1 Table). After the search, we used Rayyan [63] to support the removal of duplicates before initiating the screening process.
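As an illustration of how the three keyword groups combine in a Boolean search, the following minimal Python sketch joins terms within a group with OR and joins the groups with AND. The term lists are abbreviated from the keywords named above, the helper name `build_boolean_query` is hypothetical, and the exact search strings used in each database are those reported in S1 Table, not this sketch.

```python
def build_boolean_query(groups):
    """Join terms within each group with OR, then join the groups with AND."""
    clauses = []
    for terms in groups:
        # Quote multi-word terms so they are treated as phrases.
        quoted = ['"{}"'.format(t) if " " in t else t for t in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Abbreviated, illustrative term lists (assumption: not the full search strategy).
maternal_terms = ["perinatal", "antenatal", "prenatal", "postpartum", "pregnancy"]
child_terms = ["infant outcome", "motor development", "cognitive development",
               "neurodevelopment"]
method_terms = ["machine learning", "artificial intelligence", "random forest",
                "deep learning"]

query = build_boolean_query([maternal_terms, child_terms, method_terms])
print(query)
```

A record matches such a query only if it contains at least one term from every group, which is why adding synonyms within a group broadens the search while adding a group narrows it.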

Inclusion criteria for study eligibility

This review follows the PRISMA guidelines (Preferred Reporting Items for Systematic Reviews and Meta-Analyses, S2 Table). The relevance of the identified articles was evaluated through a three-step process. In the first step, authors assessed the articles’ titles, then reviewed the abstracts in the second step, and finally, examined the full articles in the third step. Each step was independently completed by two authors. To maximize the number of articles included, the inclusion criteria were applied conservatively, meaning that an article was retained as long as at least one author selected it. All articles at each step were evaluated based on the following inclusion criteria: (1) journal article in English, (2) original study (not review or commentary), (3) use of artificial intelligence or ML in their methodology (as defined in the Introduction), (4) use of maternal risk factors during the peripartum phase (e.g., psychosocial problems, or environmental exposure) as independent variables, and (5) use of children’s developmental outcomes (e.g., cognitive, social-emotional, motor development, or the presence of neurodevelopmental disorders) as dependent variables. Case studies and book chapters without empirical study results were excluded. Studies focused on fetal development, birth outcome (e.g., preterm, very low weight, stillbirth, mortality), or physiological health-related outcomes (e.g., allergy, thyroid function, and congenital heart disease) were also excluded. Additionally, studies of children born through in vitro fertilization, which is not in the scope of the current review, were excluded. Neuroimaging studies, such as those using electroencephalogram or magnetic resonance imaging, were also excluded due to their complexity and challenges in data interpretation compared to more accessible and interpretable methods, such as questionnaire-based data and electronic health records.
An overview of the inclusion and exclusion criteria is provided in Table 1.

Screening and data extraction

We performed three steps to screen titles, abstracts, and full texts based on the inclusion and exclusion criteria. After screening full texts, information such as publishing details, study population, data source, perinatal risk factors, timing of exposure to risk factors, child outcome assessments and timing, ML techniques, other statistical approaches for validation, and main findings was systematically extracted. All procedures were executed by at least two authors, and conflicting assessments were resolved through consensus-based discussions.

Quality assessment

The current scoping review used the framework developed by Boven et al. [64]. Zierow, Tu, and Schweitzer extracted information related to six different criteria, including participants (sample size), data leakage, validation, performance metrics, interpretability, and open science. Note that these criteria depend on each other sequentially. First, only a sufficient number of observations allows for an appropriate split of the dataset for validation purposes. Second, a clean splitting strategy and caution in the tuning of model parameters ensure that the validation is meaningful. Third, only if the previous criteria are fulfilled do performance metrics become meaningful. In a similar vein, transparency and proper documentation are prerequisites for fully assessing the studies. Specifically, this is captured by the criteria of interpretability and open science in the quality assessment.
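To make the notion of a clean splitting strategy concrete, the sketch below shows one common way to avoid a simple form of data leakage: the data are split first, and preprocessing parameters (here a mean and standard deviation for standardization) are estimated on the training portion only and then reused unchanged on the test portion. The data are synthetic and the function names are hypothetical; this is an illustration under those assumptions, not a procedure taken from the assessment framework or from any included study.

```python
import random
import statistics


def split_train_test(values, test_fraction=0.25, seed=0):
    """Shuffle a copy of the data and split it into train and test parts."""
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(train_values):
    """Estimate standardization parameters from the training data only."""
    mean = statistics.fmean(train_values)
    sd = statistics.pstdev(train_values)
    if sd == 0:
        sd = 1.0  # avoid division by zero for constant features
    return mean, sd


def apply_scaler(values, params):
    """Standardize values with previously fitted parameters."""
    mean, sd = params
    return [(v - mean) / sd for v in values]


# Synthetic exposure measurements (assumption: purely illustrative data).
data_rng = random.Random(42)
exposure = [data_rng.gauss(10.0, 2.0) for _ in range(200)]

train, test = split_train_test(exposure)
params = fit_scaler(train)            # the test data are never seen here
train_z = apply_scaler(train, params)
test_z = apply_scaler(test, params)   # reuse training parameters unchanged
```

Fitting the scaler on all 200 observations before splitting would let information from the test set influence the training features, which is the kind of leakage that can inflate reported performance.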

For sample size and participants, several hundred or more participants are considered appropriate. For data leakage, it is evaluated whether inflated performance due to overfitting has been introduced. For validation, the use of a completely independent sample for external validation is considered indicative of high quality. For the performance metrics, it is evaluated whether metrics such as the area under the receiver operating characteristic curve (AUC) and base rate-sensitive metrics are reported. For interpretability, it is evaluated whether authors provide additional insights into the models, for example, by conducting a comparison with previously developed models. Finally, for open science, high quality is indicated by a study’s support of transparency of data, code, and models, which would allow for the future reproduction of the methodology. Each criterion was rated as appropriate (score = 2), minor deviation (score = 1), or major deviation (score = 0). The total score ranges from 0 to 12 with a higher total score indicating a higher quality study. In addition to calculating the total score for each study, the scores were also aggregated for each of the six criteria across all included predictive studies, providing a more detailed overview of performance in specific areas.
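The scoring scheme described above can be sketched in a few lines of Python. The two example studies and their ratings are invented for illustration, and the identifiers are hypothetical; only the six criteria, the 0/1/2 rating scale, and the two kinds of totals come from the text.

```python
CRITERIA = [
    "participants",
    "data_leakage",
    "validation",
    "performance_metrics",
    "interpretability",
    "open_science",
]

RATING_SCORES = {"appropriate": 2, "minor deviation": 1, "major deviation": 0}


def study_total(ratings):
    """Total quality score of one study (0 to 12 for six criteria)."""
    return sum(RATING_SCORES[ratings[c]] for c in CRITERIA)


def criterion_totals(all_ratings):
    """Sum of scores per criterion across all rated studies."""
    return {
        c: sum(RATING_SCORES[r[c]] for r in all_ratings.values())
        for c in CRITERIA
    }


# Invented ratings for two hypothetical studies (not data from the review).
example_ratings = {
    "Study A": {
        "participants": "appropriate",
        "data_leakage": "minor deviation",
        "validation": "minor deviation",
        "performance_metrics": "appropriate",
        "interpretability": "minor deviation",
        "open_science": "major deviation",
    },
    "Study B": {
        "participants": "minor deviation",
        "data_leakage": "major deviation",
        "validation": "major deviation",
        "performance_metrics": "appropriate",
        "interpretability": "appropriate",
        "open_science": "major deviation",
    },
}

totals_by_study = {name: study_total(r) for name, r in example_ratings.items()}
totals_by_criterion = criterion_totals(example_ratings)
max_per_criterion = 2 * len(example_ratings)  # every study rated "appropriate"

print(totals_by_study)
print(totals_by_criterion, "out of", max_per_criterion)
```

With n rated studies, the maximum attainable total for any single criterion is 2n, which is the reference value against which per-criterion totals are read.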

Results

The search resulted in 10,336 identified studies, of which 3,365 were duplicates. After screening and removing ineligible studies, 60 studies were retained in the current review. The PRISMA flow diagram [65] depicting the selection of studies is presented in Fig 1. Tables 2 and 3 summarize the study characteristics and main findings. For those studies with prediction models (prediction-focused ML studies), results of the quality assessment are reported in Table 2. Among the 16 studies included in this subgroup, the populations were from Ukraine, Australia, Sweden, France, Italy, Iran, Turkey, Brazil, Korea, and China (each with one study), while Israel, Ireland, and the USA each had two studies. These studies examined various outcomes, with neurodevelopmental diagnoses accounting for the largest proportion (40%), while cognitive and motor development each comprised 30% (S1 Fig). The detailed rating for the quality assessment of each of the quality criteria in each study is reported in the S3 Table.

Table 2. Characteristics of prediction-focused ML studies.

https://doi.org/10.1371/journal.pone.0321268.t002

Fig 1. PRISMA flowchart illustrating the study selection process for the scoping review.

A total of 10,336 articles were identified through database searches. Duplicate records were removed using Rayyan, resulting in a unique set of articles. The screening was conducted in three steps (title, abstract, and full-text review), each independently performed by 2–3 authors. Following this rigorous selection process, 60 studies were included in the final review.

https://doi.org/10.1371/journal.pone.0321268.g001

For studies employing ML techniques to study only the association between variables without an objective to predict outcomes (pattern-focused ML studies), the quality assessment tool used in the current review is not applicable. Among the 44 studies in this subgroup, listed in Table 3, the majority examined populations from China (17 studies) and the USA (10 studies), followed by South Korea (3 studies) and the Seychelles (2 studies). The remaining studies were conducted in the Faroe Islands, Costa Rica, Spain, Mexico, Bangladesh, Japan, Canada, and South Africa, each represented by one study. These studies examined a range of developmental outcomes, with neurodevelopment, including motor development, accounting for the largest proportion at 36%, followed by cognitive development at 31%. Behavior and social-emotional development accounted for 19%, while neurodevelopmental disorders represented the smallest share at 14%. This distribution highlights the varying emphasis on different aspects of child development (S2 Fig).

Two main objectives of machine learning techniques in studying maternal exposure to risk factors and child outcomes

From the analysis of the 60 studies in this review, two distinct subgroups were identified based on their objectives for employing ML techniques in constructing statistical models for child development outcome variables including ADHD, ASD, fetal alcohol spectrum disorders (FASD), and developmental problems. The first subgroup, comprising 16 studies, prioritizes predictive diagnostic applications. These prediction-focused ML studies are characterized by a diversity of risk factors and methodological approaches. The second subgroup, which comprises the majority (44 studies, 73.3%), aims to infer relationships between risk factors and child development outcome variables. A predominant focus of this subgroup of pattern-focused ML studies is on external risk factors, such as exposure to air pollutants [69–71], organic pollutants [72,73], chemical mixtures [58,74–80], metals [74,76,81–89], pesticides [74,83,90,91], methylmercury [82,92,93], neurotoxicants [74], or alcohol [55,56,94–96]. Additionally, some studies emphasized internal risk factors (or covariates), such as mental and nervous system disorders during pregnancy [97], birth conditions [98], psychological symptoms [96,66,67,99,100], psychosocial stressors, gestational diabetes, or other perinatal complications [58,59,70,73,77,87,91].

In the subgroup of prediction-focused ML studies (Table 2), six studies focused on predictors mainly during the prenatal period, six included both the pregnancy and postnatal periods, and four considered only the postnatal period. A wide range of risk factors were included across these studies, with alcohol exposure highlighted in five studies [55,56,94–96] and smoking during pregnancy in two [55,94]. Maternal depression or psychopathological symptoms were assessed in six studies [94,96,66,67,99,100], while one study focused on perinatal factors such as diseases during pregnancy, mental and nervous system disorders, and complications like intrapartum hemorrhage [97]. Exposure to cytokines was investigated in one study [57], and bio-physiological predictors, including maternal data and milk chemical composition, were included in two [75,98]. Toxic exposures, such as paints, solvents, and heavy metals, were reported in one study [58], while biological hazards spanning pre-, peri-, and post-natal periods were covered in another [59]. The authors reported moderate to good accuracy for prediction models for ADHD [56], and FASD (1st year postpartum) [55], and diagnosis of ASD [57,58,97,98]. In two studies aiming to predict ASD in children aged 4 years or older, one study based on register data yielded low accuracy and no improvement after adding data from maternal inflammatory markers from early pregnancy [57]. In two large cohort studies focusing on early ASD symptoms and ASD diagnosis in children aged 2 years or older, depressive symptoms were not included in the final feature importance results [66,67]. The other study, based on several biomarkers such as maternal familial history of auto-immune diseases, reported moderate accuracy [98]. In the context of predicting developmental problems, Li et al. [75] and Soleimani et al. [59] reported results with moderate accuracy based on milk samples and biological problems (e.g., high-risk pregnancy, respiratory distress syndrome, etc.), respectively. Bowe et al. [95] reported that prenatal exposure to an air pollution mixture was linked to lower general memory and attention/concentration scores in children aged 5 years, suggesting poorer memory function. Furthermore, based on maternal psychological symptoms, Usta & Karabekiroğlu [99] reported moderate accuracy in predicting children’s social-emotional problems between 12 and 42 months of age. Finally, alcohol consumption, along with various maternal socioeconomic factors, was found to predict the likelihood of a child having an IQ score below 90 at 5 years old [94].

In the subgroup of pattern-focused ML studies (Table 3), five studies investigated risk factors from multiple ambient air pollutants or fine particulate matter (PM2.5) [69–71,101,102], while the other 39 studies focused on various chemical or metal exposures. Regarding the timing of exposure, only four studies included both prenatal and postnatal phases. The other 40 studies investigated exposure only during the prenatal phase. Among these 40 studies, six reported that there is little to no evidence to support the impact of exposure to external risk factors (e.g., phenols) during pregnancy on neurodevelopment or psychomotor development [77,78,82,93,103,104]. The other 33 studies reported significant associations, showing a negative impact on cognitive development [74,76,84,86,90,92,101,105–111], neurodevelopment [69,74,79,83,88,89,91,102,112–115], behavior problems [70,102,116], motor development [71,117,118], word retrieval, language development [72,117,119], and reading skills [73]. Interestingly, one study reported that second-trimester copper exposure was positively associated with cognitive development at 24 months and cognitive trajectories from 6–24 months, with an interaction effect between copper and lead exposures [81]. In addition, Qiu et al. reported that prenatal exposure to manganese was linked to a lower risk of non-optimal cognitive development, suggesting a protective effect [118]. Another study reported that higher levels of a flame-retardant chemical were linked to more social difficulties, while higher levels of a pesticide were associated with fewer social challenges. Two industrial chemicals were connected to better cognitive skills, and several pollutants were linked to improved adaptive functioning [120]. The timing of exposure, targeted predictors, outcome assessment tools, and age range of the children all differed from one study to another.

Type of machine learning techniques used

In the subgroup of prediction-focused ML studies, the procedure of splitting the data into training, test, and validation sets as well as the corresponding performance metrics were typically reported. By contrast, in the subgroup of pattern-focused ML studies, ML techniques were employed to describe relationships in complex data that can be challenging to analyze with traditional methods. Given that the studies in this subgroup did not provide predictions, validation techniques for prediction performance were typically not employed and issues such as overfitting were not investigated.

An overview of the methodological characteristics of the prediction-focused ML studies is provided in Table 2 and Fig 2. In these studies, the most frequently used techniques were Decision Tree Models, Logistic Regressions, and Artificial Neural Networks (ANNs). Eleven of the fourteen decision tree models were tree-based ensemble algorithms such as Random Forest Models (RFMs) [55,57,75,80,94,95,100] and Gradient Boosting Decision Trees [66,67,97,98]. The remaining three models were simple single decision trees [56,96,99]. The three ANNs were all relatively small fully connected feedforward ANNs. In the first study by Grossi et al., the ANN had an input layer of 16 nodes, one hidden layer of 12 nodes, and an output layer of two nodes [58]. In the second study by Soleimani et al., the ANN had an input layer of 14 nodes, one hidden layer also with 14 nodes, and an output layer with one node [59]. In the third study by Zhou et al., the final ANNs had an input layer of five nodes, one hidden layer of four nodes, and a single output node [80]. Consistent with typically small sample sizes, model size and complexity were relatively small in all of these studies. To put this into the context of general sample size recommendations for ANNs: Alwosheel et al. [121], for instance, recommended a sample size of 50–1,000 times the number of model weights. Thus, even for the relatively small networks in the ANN studies reported here, between 1,000 and 20,000 independent observations would have seemed appropriate.
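The arithmetic behind this sample size estimate can be reproduced with a short Python sketch. It counts the trainable parameters of a fully connected feedforward network from its layer sizes and applies the lower multiplier of 50 mentioned above. Whether bias terms are counted as weights is an assumption made here (they are included by default), and the function name is hypothetical.

```python
def count_parameters(layer_sizes, include_biases=True):
    """Count weights (and optionally biases) of a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # connection weights between two layers
        if include_biases:
            total += n_out         # one bias per node in the receiving layer
    return total


# Layer sizes (input, hidden, output) as described in the text.
networks = {
    "Grossi et al. [58]": (16, 12, 2),
    "Soleimani et al. [59]": (14, 14, 1),
    "Zhou et al. [80]": (5, 4, 1),
}

LOWER_MULTIPLIER = 50  # lower end of the 50-1,000 range quoted above

for name, layers in networks.items():
    n_params = count_parameters(layers)
    print(name, n_params, "parameters ->",
          LOWER_MULTIPLIER * n_params, "observations at 50x")
```

Under these assumptions the three networks have 230, 225, and 29 parameters, so the lower multiplier alone implies roughly 1,450 to 11,500 observations, which falls within the range quoted above.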

Fig 2. Overview of the frequency of machine learning and statistical techniques used in the included studies.

This figure displays the various techniques employed across the included studies. On the left, the specific techniques are listed, while the bars represent their frequency of use. The following techniques were each employed in only one of the included studies: Bayesian additive regression trees, Bayesian multiple outcomes model, Bayesian profile regression, distributed lag non-linear models, generalized linear mixed models, latent profile analysis, least absolute shrinkage and selection operator, principal component analysis, Super Learner (ensemble machine learning). *Includes variants of the technique such as Bayesian varying coefficient kernel machine regression. ** For prediction-focused ML studies, only the machine learning/statistical technique in the main part of the analysis but not those used for robustness checks were included in the count.

https://doi.org/10.1371/journal.pone.0321268.g002

In the subgroup of pattern-focused ML studies, the most frequently used technique was Bayesian Kernel Machine Regression (BKMR). This statistical approach was originally introduced for estimating the health effects of pollutant mixtures by Bobb et al. [122]. It combines Bayesian inference and kernel-based ML to model complex, non-linear relationships in high-dimensional data, yielding interpretable outcomes. BKMR was employed by 36 out of 44 studies in this subgroup. Of the other eight studies, two employed methods related to BKMR, namely Bayesian Varying Coefficient Kernel Machine Regression (BVCKMR) [81] and Mean Field Variational Bayes for Lagged Kernel Machine Regression (MFVB-LKMR) [115]. Two studies employed clustering algorithms in combination with traditional regression methods [74,76]. In this context, clustering served the purpose of handling the complexity and high dimensionality of the data before applying simpler methods that are easier to interpret. One study employed Distributed Lag Non-linear Models (DLNMs) for the main part of the analysis, while in a pre-step to this analysis, an RFM was used to estimate the input variable chemical exposure level [69]. Another study employed a Multivariable Poisson Regression (MPR) with Generalized Linear Mixed Models (GLMMs), while again, in a pre-step to the analysis, chemical exposure levels were estimated by an RFM [71].

The level of methodological maturity appears to differ substantially between the two types of studies. In the prediction-focused ML studies, a high degree of heterogeneity of algorithms and procedures indicates that predictive applications of ML are still in a phase of methodological experimentation. Among these studies, it is therefore difficult to identify a clear best-practice example. Soleimani et al. [59], for instance, provide a well-documented example of how to construct, document, and evaluate a simple ANN. However, this method has yet to gain traction and is constrained by limitations in transparency and ability to predict outcomes with small datasets. By contrast, a commendable example of accomplishing transparency and using a method appropriate for their sample size was provided by Goh et al. [56]. They utilized a simple decision tree that was presented in its entirety within their article and validated its performance with an independent sample. Despite its strengths, such as transparency and scalability to smaller datasets, this approach would struggle to handle larger, more complex datasets due to its limited expressiveness and vulnerability to overfitting. In these scenarios, more sophisticated methods such as RFMs, as employed in other studies [55,57,75], could offer a solution. However, these studies fell short in terms of interpretability and failed to consistently apply rigorous performance validation.

In contrast to the prediction-focused ML studies, the pattern-focused ML studies showed a certain convergence to a standard methodology. In this context, ML techniques coalesced into a pre-defined template or recipe to address the dimensionality and complexity of data. As an example, Valeri et al. [88] provided a BKMR template for handling complex and multi-dimensional data that was used successfully by several later studies. As mentioned earlier, BKMR works by combining Bayesian hierarchical modeling with kernel methods. This method is particularly well-suited for situations where the relationships between variables are nonlinear and interactive, such as when modelling environmental exposure mixtures.
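To make the combination of Bayesian modeling and kernel methods more concrete, the BKMR model can be written down in the form commonly given following Bobb et al. [122]. The notation below is our own shorthand and is an assumption for illustration; it is not taken from any of the reviewed studies.

```latex
\begin{align}
Y_i &= h(z_{i1}, \dots, z_{iM}) + \mathbf{x}_i^{\top}\boldsymbol{\beta} + \epsilon_i,
  \qquad \epsilon_i \sim \mathcal{N}(0, \sigma^2), \\
K(\mathbf{z}_i, \mathbf{z}_j) &= \exp\Big\{ -\sum_{m=1}^{M} r_m \,(z_{im} - z_{jm})^2 \Big\},
  \qquad r_m \ge 0 .
\end{align}
```

Here Y_i is the outcome of child i, z_{i1}, ..., z_{iM} are the M exposures in the mixture, x_i are covariates with linear effects, and h is an unknown exposure-response function represented through the Gaussian kernel K. Because h is left unspecified, non-linear and interactive effects of the exposures can be captured. In the variable-selection variant, a weight r_m equal to zero removes exposure m from the kernel, and the posterior probability that r_m is non-zero serves as the posterior inclusion probability of that exposure.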

Overall quality assessment

In Table 2, we report the quality of each prediction-focused ML study according to the criteria described in the Methods Section. Across the 16 prediction-focused ML studies, the total study quality ranges from a low score of 5 to a high score of 10 (full score = 12, mean score = 7.2, standard deviation = 1.8), suggesting that typical study quality is moderate. In Fig 3, we present the total scores for each of the six quality criteria across the 16 studies. If all 16 studies had completely met a criterion, the corresponding total score for that criterion would have been the full score of 32 (16 studies × 2 points). Across the six criteria, total scores range from 6 (open science) to 27 (performance metrics). The remaining four criteria show moderate quality. All observed studies suffered from some limitations in the procedures either to properly validate results or to avoid data leakage. The specific issues were different in each case, ranging from missing or incomplete splitting strategies to issues of data leakage. At the same time, all studies reported at least some common performance metrics. Most studies, however, indicated metrics only for a fixed classification threshold value, implying a specific choice of the trade-off between precision and recall that was not discussed or motivated explicitly. Four studies addressed this issue by providing AUC values, underscoring the value of considering the full range of trade-offs represented in the ROC curve. These findings highlight a need for future prediction-focused ML studies to adopt more robust validation practices.
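The difference between metrics reported at a single classification threshold and the threshold-free AUC can be illustrated with a minimal sketch. The labels and scores below are hypothetical values chosen for illustration and do not come from any reviewed study. The function names are our own, and the AUC is computed with the standard pairwise-ranking formulation.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall when scores >= threshold are classified as positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(y_true, predicted) if y == 1 and p)
    fp = sum(1 for y, p in zip(y_true, predicted) if y == 0 and p)
    fn = sum(1 for y, p in zip(y_true, predicted) if y == 1 and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcome labels (1 = affected) and model scores.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.35, 0.40, 0.45, 0.60, 0.70, 0.80, 0.90]

for threshold in (0.3, 0.5):
    print(threshold, precision_recall(y_true, scores, threshold))
print("AUC:", roc_auc(y_true, scores))
```

In this toy example, lowering the threshold from 0.5 to 0.3 raises recall at the expense of precision, while the AUC stays the same because it summarizes all thresholds at once.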

Fig 3. Summary of the quality assessment of 16 prediction-focused ML studies.

This figure provides an overview of the quality assessment of 16 prediction-focused ML studies focused on child development. The first author and publication year are listed on the left. Six domains of the evaluation criteria are displayed from left to right: participants, data leakage, validation, performance metrics, interpretability, and open science. The sum scores for each study are detailed in Table 2. The aggregated scores for each domain across all studies are shown in light gray boxes at the bottom of the figure. Color coding indicates the level of quality: green represents appropriate, blue indicates minor deviations, and yellow highlights major deviations.

https://doi.org/10.1371/journal.pone.0321268.g003

Most studies show major deviations on the open science criterion, with only one study providing most of the related code and data. In fact, the majority of studies provide no disclosure on the availability of data and code. The apparent reluctance to make more technical details available might be a further indication of the experimental state of the methodology. Overall, the results indicate that the existing studies may not yet provide a clear template or best-practice example to inform future research. Interestingly, for each criterion, there is at least one study that appropriately fulfills it. As these positive examples occur across very different basic approaches, however, it is not obvious how they could be combined into a single study design in the future.

Discussion

This review offers an overview of how ML is applied in studies that employ statistical models to analyze maternal exposure to risk factors and child development. The majority of the included studies have a descriptive focus on the associations between perinatal factors and child development outcomes. Additionally, 26.7% of the included studies were prediction-focused and employed predictive ML models to support differential diagnoses. Among the ML techniques reported, Decision Tree Models and Artificial Neural Networks were the most prevalent.

While the pattern-focused ML studies discussed in this review have provided valuable insights into potential relationships within complex biological, psychosocial, and environmental systems, their limitations in establishing causality must be acknowledged, particularly with regard to their capacity to influence medical routines and interventions. Association studies identify correlations, and correlation does not equate to causation. This restricts their direct applicability in developing preventive and therapeutic strategies [123]. In the pursuit of precise identification of early markers or predictors in developmental pathways, as stated in our introduction, relying solely on association studies risks overlooking critical causal factors. Association studies are prone to various biases, such as selection bias and confounding, which can lead to spurious conclusions [124]. Therefore, a shift towards more prediction-oriented research, and ideally, studies that establish causality, is paramount. Causal studies, supported by predictive analytics, would provide a more robust framework for discerning the complex interplay of factors at work.

The evaluation of the 16 prediction-focused ML studies reveals that several quality indicators, including sample sizes, interpretability, data leakage, and open science, show minor to major deviations. In other words, most of the results need to be interpreted with caution due to several methodological limitations. First, studies based on large-scale populations and longitudinal datasets in healthcare are still limited [125]. Studies with sample sizes of fewer than 100 mother-child dyads and a large number of predictors might suffer from overfitting, which might not be detected in the absence of external validation metrics [55,75]. Second, potential limitations of the representativeness of the samples should also be noted. Third, in prediction models, standardized internal validation and model evaluation are lacking. Fourth, perinatal factors and child outcomes are subject to changes over time. Most studies focused on a particular time period and the results might not be replicable due to the dynamic nature of data. Fifth, there is a general lack of transparency. Possibly due to privacy concerns or sensitive data, most studies do not provide information and materials according to open science guidelines [126]. Overall, ML models are suitable for identifying correlations and drawing statistical inferences based on complex and large datasets. For prediction modelling observed in the current review, however, the methodology remains immature, with limitations in sample size, validation, transparency, and adaptability. On a positive note, over the past two years, there has been a notable increase in publications on ML, encompassing both pattern-based and prediction-based approaches.
Among prediction-based ML studies published after 2023, a positive trend was that all reported appropriate performance metrics and offered meaningful model interpretations. However, we did not observe any substantial improvements in other quality criteria and the ML techniques used in prediction-based studies continued to vary widely.
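As one possible form of the standardized internal validation that is noted as lacking above, k-fold cross-validation assigns every sample to exactly one held-out fold. The following is a minimal sketch using only the Python standard library. The function name, the sample size of 20, and the choice of five folds are assumptions for illustration and are not taken from the reviewed studies.

```python
import random


def k_fold_indices(n_samples, k, seed):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    folds = [indices[i::k] for i in range(k)]  # k disjoint, near-equal folds
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Hypothetical dataset of 20 mother-child dyads, evaluated with 5 folds.
splits = list(k_fold_indices(n_samples=20, k=5, seed=0))
for train_idx, test_idx in splits:
    # A model would be fitted on train_idx and scored on test_idx here.
    print(len(train_idx), len(test_idx))
```

Reporting the mean and spread of a performance metric across the folds gives a more stable estimate than a single split, which matters most for the small samples discussed above. It does not replace validation on an independent external sample.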

The challenges for ML applications we describe are consistent with several recent reviews. For instance, in the fields of predicting population health and developmental outcomes, the methods of hyperparameter selection, the number of selected features, and the methods of feature selection are often under-reported [127,128]. Moreover, similar to our observations, small samples are frequently used [17,128]. Another common issue is the difficulty of assessing data quality, data leakage, and the handling of missing data. In the healthcare sector, these challenges are particularly pronounced due to the sensitive nature of patient information and the complexity of medical data. ML techniques are increasingly being used to address these issues. Ensuring data quality involves using advanced algorithms to validate and clean data, maintaining accuracy, completeness, and consistency, which are crucial for reliable clinical outcomes [129]. Data leakage in ML occurs when information that should not be available during training inadvertently influences model creation, leading to overly optimistic performance estimates that fail to generalize to new data [130]. This issue often arises when practitioners use ML tools without fully understanding the underlying algorithms and the nature of their datasets, relying instead on automated processes [131]. Understanding and mitigating data leakage is crucial for developing robust and reliable ML models that can generalize effectively to real-world scenarios [130,131]. Additionally, handling missing data effectively is vital to avoid biases in research findings. ML methods like Multiple Imputation by Chained Equations (MICE), K-Nearest Neighbors (KNN), and Multiple Imputation with Denoising Autoencoders (MIDAS) have shown promise in accurately imputing missing data, ensuring that healthcare datasets remain robust and useful for improving patient care and advancing medical research [129].
Furthermore, nontransparent reporting of how data are handled might contribute to low validity and prediction accuracy of the reported prediction models [132]. Future studies should therefore report the selection and data processing procedures in more detail. With clear and transparent reporting, results obtained from small training samples are more likely to help improve the predictive performance of models trained on larger future datasets.
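One common source of the data leakage described above is preprocessing that is fitted on the full dataset before it is split. The following minimal sketch shows the leakage-free order: split first, then learn the imputation value and the scaling parameters from the training portion only. The exposure values are hypothetical, the function names are our own, and simple mean imputation is used here only as a stand-in for the more advanced methods named above (MICE, KNN, MIDAS).

```python
import random
import statistics


def split_train_test(values, test_fraction, seed):
    """Randomly split values into a training and a held-out test portion."""
    indices = list(range(len(values)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(values) * test_fraction))
    test_idx = set(indices[:n_test])
    train = [v for i, v in enumerate(values) if i not in test_idx]
    test = [v for i, v in enumerate(values) if i in test_idx]
    return train, test


def fit_preprocessing(train):
    """Learn the imputation value, mean, and SD from the training data only."""
    observed = [v for v in train if v is not None]
    fill = statistics.fmean(observed)
    filled = [fill if v is None else v for v in train]
    sd = statistics.pstdev(filled)
    return fill, statistics.fmean(filled), (sd if sd > 0 else 1.0)


def apply_preprocessing(values, fill, mean, sd):
    """Impute and standardize using parameters learned elsewhere."""
    return [((fill if v is None else v) - mean) / sd for v in values]


# Hypothetical exposure measurements; None marks a missing value.
exposure = [1.2, None, 0.8, 2.5, 1.9, None, 3.1, 0.4, 2.2, 1.5]

train, test = split_train_test(exposure, test_fraction=0.3, seed=0)
fill, mean, sd = fit_preprocessing(train)  # the test data are never seen here
train_scaled = apply_preprocessing(train, fill, mean, sd)
test_scaled = apply_preprocessing(test, fill, mean, sd)
```

Fitting `fit_preprocessing` on the complete `exposure` list instead would let the held-out values influence the imputation and scaling, which is exactly the kind of leakage that inflates performance estimates.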

Besides increasing transparency, several areas of improvement could be explored in future research. For the pattern-focused ML studies using the BKMR approach, the analysis could be extended for the purpose of prediction. For this purpose, a validation scheme based on a corresponding split of the dataset would be necessary. However, potential hurdles for adopting BKMRs for research projects beyond the specific domain of modeling of multi-pollutant mixtures might be their limitation to continuous variables, the sensitivity of posterior inclusion probabilities to the choice of tuning parameters, and the lack of procedures for assessing the statistical significance of pollutants mixture interactions [133,134].

For prediction-focused ML studies, beyond well-defined special cases, future research will still face a challenging landscape of competing ML techniques with distinct advantages and disadvantages. In ML research, it is well-established that RFMs and other decision-tree based models demonstrate high performance, even when the amount of available data is limited [135]. However, recent advances in transfer learning could make the use of more expressive ANNs feasible even for small datasets [136]. For this approach, a neural network is trained on a large, often domain-general dataset and then fine-tuned on a specific smaller dataset. Whether transfer learning can be effectively applied to prediction-focused ML studies such as those discussed in this review remains an open question that could be addressed by future research.

Finally, across both prediction- and pattern-focused ML studies, a range of maternal external and internal risk factors for child development have been identified. This review specifically examines maternal risk factors throughout pregnancy and the first 12 months postpartum. Capturing the dynamics and changes over time in the mother is inherently complex due to the multifaceted nature of maternal health, behavior, and environment. Similarly, child development is a highly dynamic and intricate process, influenced by a myriad of factors that evolve over time. Future research challenges will involve not only capturing these parallel complexities but also understanding the interplay between maternal and child development. This will require sophisticated longitudinal studies and advanced analytical methods, such as ML techniques, to unravel the intricate web of influences that shape early childhood development.

Conclusions

This scoping review shows that a majority of the observed studies have employed ML techniques to examine statistical associations between perinatal factors and child outcomes. Among prediction-focused ML studies, (Decision Tree-based) Ensemble Algorithms and Artificial Neural Networks were the most frequently utilized ML techniques. In contrast to inferential findings, the interpretability and generalizability of prediction studies are more limited. While several performance metrics were reported, to fully meet the evaluation criteria used in the quality assessment, it will be necessary to develop a more systematic methodology and enhance the transparency of future research.

Limitations

There are some limitations in the current scoping review. First, the literature search was limited to PubMed, Scopus, and Web of Science, with Google Scholar used as a supportive tool. Unpublished data, thesis work, and grey literature were not included, potentially limiting the comprehensiveness of the findings. Second, while some included studies had relatively small sample sizes, the review does not aim to determine the optimal sample size for studies in this field. However, the variability in sample sizes across studies could influence the robustness and generalizability of the findings. Third, the quality assessment of the studies was limited to prediction-focused ML studies. Finally, the studies included were highly heterogeneous, with considerable variation in ML techniques, timing of exposure, outcome measures, and child development assessment tools. This heterogeneity makes it difficult to draw firm conclusions regarding the overall impact of perinatal factors on child developmental problems.

Supporting information

S1 Fig. Distribution of different outcomes across 16 prediction-focused ML studies.

https://doi.org/10.1371/journal.pone.0321268.s001

(PDF)

S2 Fig. Distribution of different outcomes across 44 pattern-focused ML studies.

https://doi.org/10.1371/journal.pone.0321268.s002

(PDF)

Acknowledgments

We would like to express our gratitude to Umeå University Library, Umeå, Sweden, for their support, particularly to librarian Magnus Olsson for their assistance with the literature search. We would also like to thank the editor and the reviewers at PLOS ONE for their helpful comments and suggestions. Our sincere thanks also go to ESB Business School, Reutlingen University, Reutlingen, Germany, for their support in facilitating the publication. The article processing charge was funded by Reutlingen University in the funding program Open Access Publishing.

References

  1. 1. Leach LS, Poyser C, Cooklin AR, Giallo R. Prevalence and course of anxiety disorders (and symptom levels) in men across the perinatal period: A systematic review. J Affect Disord. 2016;190:675–86. pmid:26590515
  2. 2. Feliciano DM, Bordey A. Newborn cortical neurons: only for neonates? Trends Neurosci. 2013;36(1):51–61. pmid:23062965
  3. 3. De Asis-Cruz J, Andescavage N, Limperopoulos C. Adverse Prenatal Exposures and Fetal Brain Development: Insights From Advanced Fetal Magnetic Resonance Imaging. Biol Psychiatry Cogn Neurosci Neuroimaging. 2022;7(5):480–90. pmid:34848383
  4. 4. Roos A, Fouche J-P, Ipser JC, Narr KL, Woods RP, Zar HJ, et al. Structural and functional brain network alterations in prenatal alcohol exposed neonates. Brain Imaging Behav. 2021;15(2):689–99. pmid:32306280
  5. 5. Lugo-Candelas C, Cha J, Hong S, Bastidas V, Weissman M, Fifer WP, et al. Associations Between Brain Structure and Connectivity in Infants and Exposure to Selective Serotonin Reuptake Inhibitors During Pregnancy. JAMA Pediatr. 2018;172(6):525–33. pmid:29630692
  6. 6. Rogers A, Obst S, Teague SJ, Rossen L, Spry EA, Macdonald JA, et al. Association Between Maternal Perinatal Depression and Anxiety and Child and Adolescent Development: A Meta-analysis. JAMA Pediatr. 2020;174(11):1082–92. pmid:32926075
  7. 7. Stein A, Pearson RM, Goodman SH, Rapa E, Rahman A, McCallum M, et al. Effects of perinatal mental disorders on the fetus and child. The Lancet. 2014;384(9956):1800–19. pmid:25455250
  8. 8. Harron K, Gilbert R, Fagg J, Guttmann A, van der Meulen J. Associations between pre-pregnancy psychosocial risk factors and infant outcomes: a population-based cohort study in England. The Lancet Public Health. 2021;6(2):e97–105. pmid:33516292
  9. 9. Liu Z, Cai L, Liu Y, Chen W, Wang Q. Association between prenatal cadmium exposure and cognitive development of offspring: A systematic review. Environ Pollut. 2019;254(Pt B):113081. pmid:31473391
  10. 10. Takegata M, Matsunaga A, Ohashi Y, Toizumi M, Yoshida LM, Kitamura T. Prenatal and Intrapartum Factors Associated With Infant Temperament: A Systematic Review. Front Psychiatry. 2021;12:609020. pmid:33897486
  11. 11. Bzdok D, Altman N, Krzywinski M. Statistics versus machine learning. Nat Methods. 2018;15(4):233–4. pmid:30100822
  12. 12. Germain P, Vardazaryan A, Padoy N, Labani A, Roy C, Schindler TH, et al. Deep Learning Supplants Visual Analysis by Experienced Operators for the Diagnosis of Cardiac Amyloidosis by Cine-CMR. Diagnostics. 2022;12(1):69.
  13. 13. Hannun AY, Rajpurkar P, Haghpanahi M, Tison GH, Bourn C, Turakhia MP, et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat Med. 2019;25(1):65–9. pmid:30617320
  14. 14. Wiemken TL, Kelley RR. Machine Learning in Epidemiology and Health Outcomes Research. Annu Rev Public Health. 2020;41:21–36. pmid:31577910
  15. 15. Bowe AK, Lightbody G, Staines A, Murray DM. Big data, machine learning, and population health: predicting cognitive outcomes in childhood. Pediatr Res. 2023;93(2):300–7. pmid:35681091
  16. 16. Rudin C. Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. Nat Mach Intell. 2019;1(5):206–15. pmid:35603010
  17. 17. van Boven MR, Henke CE, Leemhuis AG, Hoogendoorn M, van Kaam AH, Königs M, et al. Machine Learning Prediction Models for Neurodevelopmental Outcome After Preterm Birth: A Scoping Review and New Machine Learning Evaluation Framework. Pediatrics. 2022;150(1):e2021056052. pmid:35670123
  18. 18. Barker DJP. In utero programming of chronic disease. Clinical Science. 1998;95(2):115–28.
  19. 19. Netsi E, Pearson RM, Murray L, Cooper P, Craske MG, Stein A. Association of Persistent and Severe Postnatal Depression With Child Outcomes. JAMA Psychiatry. 2018;75(3):247–53. pmid:29387878
  20. 20. Bekkhus M, Rutter M, Barker ED, Borge AIH. The role of pre- and postnatal timing of family risk factors on child behavior at 36 months. J Abnorm Child Psychol. 2011;39(4):611–21. pmid:21181435
  21. 21. Ni Y, Loftus CT, Szpiro AA, Young MT, Hazlehurst MF, Murphy LE, et al. Associations of Pre- and Postnatal Air Pollution Exposures with Child Behavioral Problems and Cognitive Performance: A U.S. Multi-Cohort Study. Environ Health Perspect. 2022;130(6):67008. pmid:35737514
  22. 22. Whitaker RC, Orzol SM, Kahn RS. Maternal mental health, substance use, and domestic violence in the year after delivery and subsequent behavior problems in children at age 3 years. Arch Gen Psychiatry. 2006;63(5):551–60. pmid:16651512
  23. 23. Tien J, Lewis GD, Liu J. Prenatal risk factors for internalizing and externalizing problems in childhood. World J Pediatr. 2020;16(4):341–55. pmid:31617077
  24. 24. Tu H-F, Skalkidou A, Lindskog M, Gredebäck G. Maternal childhood trauma and perinatal distress predict the development of attention in infants from 6 to 18 months in a Swedish cohort study. medRxiv. 2021:2021.09.13.21263510.
  25. 25. Cooke JE, Racine N, Pador P, Madigan S. Maternal Adverse Childhood Experiences and Child Behavior Problems: A Systematic Review. Pediatrics. 2021;148(3):e2020044131. pmid:34413250
  26. 26. Buss C, Entringer S, Moog NK, Toepfer P, Fair DA, Simhan HN, et al. Intergenerational Transmission of Maternal Childhood Maltreatment Exposure: Implications for Fetal Brain Development. J Am Acad Child Adolesc Psychiatry. 2017;56(5):373–82. pmid:28433086
  27. 27. Lindsay KL, Buss C, Wadhwa PD, Entringer S. The Interplay Between Nutrition and Stress in Pregnancy: Implications for Fetal Programming of Brain Development. Biol Psychiatry. 2019;85(2):135–49. pmid:30057177
  28. 28. El Marroun H, White TJH, van der Knaap NJF, Homberg JR, Fernández G, Schoemaker NK, et al. Prenatal exposure to selective serotonin reuptake inhibitors and social responsiveness symptoms of autism: population-based study of young children. Br J Psychiatry. 2014;205(2):95–102. pmid:25252317
  29. 29. Martínez-Martínez MI, Alegre-Martínez A, Cauli O. Prenatal exposure to phthalates and its effects upon cognitive and motor functions: A systematic review. Toxicology. 2021;463:152980. pmid:34624397
  30. 30. Römer P, Mathes B, Reinelt T, Stoyanova P, Petermann F, Zierul C. Systematic review showed that low and moderate prenatal alcohol and nicotine exposure affected early child development. Acta Paediatr. 2020;109(12):2491–501. pmid:32603488
  31. 31. Lebel C, Walton M, Letourneau N, Giesbrecht GF, Kaplan BJ, Dewey D. Prepartum and Postpartum Maternal Depressive Symptoms Are Related to Children’s Brain Structure in Preschool. Biol Psychiatry. 2016;80(11):859–68. pmid:26822800
  32. 32. Liu W, Xin Y, Li Q, Shang Y, Ping Z, Min J, et al. Biomarkers of environmental manganese exposure and associations with childhood neurodevelopment: a systematic review and meta-analysis. Environ Health. 2020;19(1):104. pmid:33008482
  33. 33. Margolis AE, Greenwood P, Dranovsky A, Rauh V. The Role of Environmental Chemicals in the Etiology of Learning Difficulties: A Novel Theoretical Framework. Mind Brain Educ. 2023;17(4):301–11. pmid:38389544
  34. 34. Flak AL, Su S, Bertrand J, Denny CH, Kesmodel US, Cogswell ME. The association of mild, moderate, and binge prenatal alcohol exposure and child neuropsychological outcomes: a meta-analysis. Alcohol Clin Exp Res. 2014;38(1):214–26. pmid:23905882
  35. 35. Andersen HR, Dalsager L, Jensen IK, Timmermann CAG, Olesen TS, Trecca F, et al. Prenatal exposure to pyrethroid and organophosphate insecticides and language development at age 20-36 months among children in the Odense Child Cohort. Int J Hyg Environ Health. 2021;235:113755. pmid:33962121
  36. 36. Chorniy A, Currie J, Sonchak L. Does prenatal WIC participation improve child outcomes? Am J Health Econ. 2020;6(2):169–98. pmid:33178883
  37. 37. Peixinho J, Toseeb U, Mountford HS, Bermudez I, Newbury DF. The effects of prenatal smoke exposure on language development ‐ a systematic review. Infant and Child Development. 2022;31(4).
  38. 38. Saxena R, Babadi M, Namvarhaghighi H, Roullet FI. Role of environmental factors and epigenetics in autism spectrum disorders. Prog Mol Biol Transl Sci. 2020;173:35–60. pmid:32711816
  39. 39. DeSocio JE. Reprint of “Epigenetics, maternal prenatal psychosocial stress, and infant mental health”. Arch Psychiatr Nurs. 2019;33(3):232–7. pmid:31227075
  40. 40. Davidovitch M, Koren G, Fund N, Shrem M, Porath A. Challenges in defining the rates of ADHD diagnosis and treatment: trends over the last decade. BMC Pediatr. 2017;17(1):218. pmid:29284437
  41. 41. Solmi M, Song M, Yon DK, Lee SW, Fombonne E, Kim MS, et al. Incidence, prevalence, and global burden of autism spectrum disorder from 1990 to 2019 across 204 countries. Mol Psychiatry. 2022;27(10):4172–80. pmid:35768640
  42. 42. Moog NK, Entringer S, Rasmussen JM, Styner M, Gilmore JH, Kathmann N, et al. Intergenerational Effect of Maternal Exposure to Childhood Maltreatment on Newborn Brain Anatomy. Biol Psychiatry. 2018;83(2):120–7. pmid:28842114
  43. 43. Tu H-F, Skalkidou A, Lindskog M, Gredebäck G. Maternal childhood trauma and perinatal distress are related to infants’ focused attention from 6 to 18 months. Scientific Reports. 2021;11(1):24190.
  44. 44. Hoang Reede DH, Tancredi DJ, Schmidt RJ. Maternal preconception and prenatal stressful life events in association with child neurodevelopmental outcome in MARBLES: A high familial likelihood cohort. Research in Autism Spectrum Disorders. 2024;114:102364.
  45. 45. Jabbar S, Chastain LG, Gangisetty O, Cabrera MA, Sochacki K, Sarkar DK. Preconception Alcohol Increases Offspring Vulnerability to Stress. Neuropsychopharmacology. 2016;41(11):2782–93. pmid:27296153
  46. 46. Alatrany AS, Hussain AJ, Mustafina J, Al-Jumeily D. Machine Learning Approaches and Applications in Genome Wide Association Study for Alzheimer’s Disease: A Systematic Review. IEEE Access. 2022;10:62831–47.
  47. 47. Bracher-Smith M, Crawford K, Escott-Price V. Machine learning for genetic prediction of psychiatric disorders: a systematic review. Mol Psychiatry. 2021;26(1):70–9. pmid:32591634
  48. 48. Moon I, LoPiccolo J, Baca SC, Sholl LM, Kehl KL, Hassett MJ, et al. Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary. Nat Med. 2023;29(8):2057–67. pmid:37550415
  49. 49. Santana CP, de Carvalho EA, Rodrigues ID, Bastos GS, de Souza AD, de Brito LL. rs-fMRI and machine learning for ASD diagnosis: a systematic review and meta-analysis. Sci Rep. 2022;12(1):6030. pmid:35411059
  50. 50. Baker S, Kandasamy Y. Machine learning for understanding and predicting neurodevelopmental outcomes in premature infants: a systematic review. Pediatr Res. 2023;93(2):293–9. pmid:35641551
  51. 51. Mangold C, Zoretic S, Thallapureddy K, Moreira A, Chorath K, Moreira A. Machine Learning Models for Predicting Neonatal Mortality: A Systematic Review. Neonatology. 2021;118(4):394–405. pmid:34261070
  52. 52. Mennickent D, Rodríguez A, Opazo MC, Riedel CA, Castro E, Eriz-Salinas A, et al. Machine learning applied in maternal and fetal health: a narrative review focused on pregnancy diseases and complications. Front Endocrinol (Lausanne). 2023;14:1130139. pmid:37274341
  53. 53. Eick SM, Barr DB, Brennan PA, Taibl KR, Tan Y, Robinson M, et al. Per- and polyfluoroalkyl substances and psychosocial stressors have a joint effect on adverse pregnancy outcomes in the Atlanta African American Maternal-Child cohort. Sci Total Environ. 2023;857(Pt 2):159450. pmid:36252672
  54. 54. Malacova E, Tippaya S, Bailey HD, Chai K, Farrant BM, Gebremedhin AT, et al. Stillbirth risk prediction using machine learning for a large cohort of births from Western Australia, 1980-2015. Sci Rep. 2020;10(1):5354. pmid:32210300
  55. 55. Balaraman S, Schafer JJ, Tseng AM, Wertelecki W, Yevtushok L, Zymak-Zakutnya N, et al. Plasma miRNA Profiles in Pregnant Women Predict Infant Outcomes following Prenatal Alcohol Exposure. PLoS One. 2016;11(11):e0165081. pmid:27828986
  56. 56. Goh PK, Doyle LR, Glass L, Jones KL, Riley EP, Coles CD, et al. A Decision Tree to Identify Children Affected by Prenatal Alcohol Exposure. J Pediatr. 2016;177:121–7.e1. pmid:27476634
  57. 57. Brynge M, Gardner RM, Sjöqvist H, Lee BK, Dalman C, Karlsson H. Maternal Levels of Cytokines in Early Pregnancy and Risk of Autism Spectrum Disorders in Offspring. Front Public Health. 2022;10:917563. pmid:35712277
  58. 58. Grossi E, Veggo F, Narzisi A, Compare A, Muratori F. Pregnancy risk factors in autism: a pilot study with artificial neural networks. Pediatr Res. 2016;79(2):339–47. pmid:26524714
  59. 59. Soleimani F, Teymouri R, Biglarian A. Predicting developmental disorder in infants using an artificial neural network. Acta Med Iran. 2013;51(6):347–52. pmid:23852837
  60. 60. Islam MN, Mustafina SN, Mahmud T, Khan NI. Machine learning to predict pregnancy outcomes: a systematic review, synthesizing framework and future research agenda. BMC Pregnancy Childbirth. 2022;22(1):348. pmid:35546393
  61. 61. Sharifi-Heris Z, Laitala J, Airola A, Rahmani AM, Bender M. Machine Learning Approach for Preterm Birth Prediction Using Health Records: Systematic Review. JMIR Med Inform. 2022;10(4):e33875. pmid:35442214
  62. 62. Luo W, Phung D, Tran T, Gupta S, Rana S, Karmakar C, et al. Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View. J Med Internet Res. 2016;18(12):e323. pmid:27986644
  63. 63. Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A. Rayyan-a web and mobile app for systematic reviews. Syst Rev. 2016;5(1):210. pmid:27919275
  64. 64. Aafjes-van Doorn K, Kamsteeg C, Bate J, Aafjes M. A scoping review of machine learning in psychotherapy research. Psychother Res. 2021;31(1):92–116. pmid:32862761
  65. 65. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Int J Surg. 2010;8(5):336–41. pmid:20171303
  66. 66. Ben-Sasson A, Guedalia J, Ilan K, Shaham M, Shefer G, Cohen R, et al. Predicting autism traits from baby wellness records: A machine learning approach. Autism. 2024:13623613241253311.
  67. 67. Ben-Sasson A, Guedalia J, Nativ L, Ilan K, Shaham M, Gabis LV. A Prediction Model of Autism Spectrum Diagnosis from Well-Baby Electronic Data Using Machine Learning. Children (Basel). 2024;11(4):429. pmid:38671647
  68. Huang L-S, Cory-Slechta DA, Cox C, Thurston SW, Shamlaye CF, Watson GE, et al. Analysis of Nonlinear Associations between Prenatal Methylmercury Exposure from Fish Consumption and Neurodevelopmental Outcomes in the Seychelles Main Cohort at 17 Years. Stoch Environ Res Risk Assess. 2018;32(4):893–904. pmid:30323714
  69. Liu B, Fang X, Strodl E, He G, Ruan Z, Wang X, et al. Fetal Exposure to Air Pollution in Late Pregnancy Significantly Increases ADHD-Risk Behavior in Early Childhood. Int J Environ Res Public Health. 2022;19(17):10482. pmid:36078201
  70. Shin J, Park H, Kim HS, Kim E-J, Kim K-N, Hong Y-C, et al. Pre- and postnatal exposure to multiple ambient air pollutants and child behavioral problems at five years of age. Environ Res. 2022;206:112526. pmid:34921822
  71. Xu X, Tao S, Huang L, Du J, Liu C, Jiang Y, et al. Maternal PM2.5 exposure during gestation and offspring neurodevelopment: Findings from a prospective birth cohort study. Sci Total Environ. 2022;842:156778. pmid:35724775
  72. Oulhote Y, Coull B, Bind M-A, Debes F, Nielsen F, Tamayo I, et al. Joint and independent neurotoxic effects of early life exposures to a chemical mixture: A multi-pollutant approach combining ensemble learning and g-computation. Environ Epidemiol. 2019;3(5):e063. pmid:32051926
  73. Vuong AM, Xie C, Jandarov R, Dietrich KN, Zhang H, Sjödin A, et al. Prenatal exposure to a mixture of persistent organic pollutants (POPs) and child reading skills at school age. Int J Hyg Environ Health. 2020;228:113527. pmid:32521479
  74. Yonkman AM, Alampi JD, Kaida A, Allen RW, Chen A, Lanphear BP, et al. Using Latent Profile Analysis to Identify Associations Between Gestational Chemical Mixtures and Child Neurodevelopment. Epidemiology. 2023;34(1):45–55. pmid:36166205
  75. Li K, Bertrand K, Naviaux JC, Monk JM, Wells A, Wang L, et al. Metabolomic and exposomic biomarkers of risk of future neurodevelopmental delay in human milk. Pediatr Res. 2023;93(6):1710–20. pmid:36109618
  76. Kalloo G, Wellenius GA, McCandless L, Calafat AM, Sjodin A, Sullivan AJ, et al. Chemical mixture exposures during pregnancy and cognitive abilities in school-aged children. Environ Res. 2021;197:111027. pmid:33744271
  77. Xie Z, Tan J, Fang G, Ji H, Miao M, Tian Y, et al. Associations between prenatal exposure to perfluoroalkyl substances and neurobehavioral development in early childhood: A prospective cohort study. Ecotoxicol Environ Saf. 2022;241:113818. pmid:35777342
  78. Yim G, Minatoya M, Kioumourtzoglou M-A, Bellavia A, Weisskopf M, Ikeda-Araki A, et al. The associations of prenatal exposure to dioxins and polychlorinated biphenyls with neurodevelopment at 6 Months of age: Multi-pollutant approaches. Environ Res. 2022;209:112757.
  79. Zhang B, Wang Z, Zhang J, Dai Y, Feng C, Lin Y, et al. Prenatal perfluoroalkyl substances exposure and neurodevelopment in toddlers: Findings from SMBCS. Chemosphere. 2023;313:137587. pmid:36535498
  80. Zhou T, Shen Y, Lyu J, Yang L, Wang H-J, Hong S, et al. Medication Usage Record-Based Predictive Modeling of Neurodevelopmental Abnormality in Infants under One Year: A Prospective Birth Cohort Study. Healthcare. 2024.
  81. Liu SH, Bobb JF, Claus Henn B, Gennings C, Schnaas L, Tellez-Rojo M, et al. Bayesian varying coefficient kernel machine regression to assess neurodevelopmental trajectories associated with exposure to complex mixtures. Stat Med. 2018;37(30):4680–94. pmid:30277584
  82. Fruh V, Rifas-Shiman SL, Coull BA, Devick KL, Amarasiriwardena C, Cardenas A, et al. Prenatal exposure to a mixture of elements and neurobehavioral outcomes in mid-childhood: Results from Project Viva. Environ Res. 2021;201:111540. pmid:34166661
  83. Guo J, Wu C, Zhang J, Qi X, Lv S, Jiang S, et al. Prenatal exposure to mixture of heavy metals, pesticides and phenols and IQ in children at 7 years of age: The SMBCS study. Environ Int. 2020;139:105692.
  84. Lee K-S, Kim K-N, Ahn YD, Choi Y-J, Cho J, Jang Y, et al. Prenatal and postnatal exposures to four metals mixture and IQ in 6-year-old children: A prospective cohort study in South Korea. Environ Int. 2021;157:106798. pmid:34339957
  85. Li C, Xia W, Jiang Y, Liu W, Zhang B, Xu S, et al. Low level prenatal exposure to a mixture of Sr, Se and Mn and neurocognitive development of 2-year-old children. Sci Total Environ. 2020;735:139403. pmid:32473430
  86. Liu C, Huang L, Huang S, Wei L, Cao D, Zan G, et al. Association of both prenatal and early childhood multiple metals exposure with neurodevelopment in infant: A prospective cohort study. Environ Res. 2022;205:112450. pmid:34861232
  87. Rokoff LB, Shoaff JR, Coull BA, Enlow MB, Bellinger DC, Korrick SA. Prenatal exposure to a mixture of organochlorines and metals and internalizing symptoms in childhood and adolescence. Environ Res. 2022;208:112701. pmid:35016863
  88. Shah-Kulkarni S, Lee S, Jeong KS, Hong Y-C, Park H, Ha M, et al. Prenatal exposure to mixtures of heavy metals and neurodevelopment in infants at 6 months. Environ Res. 2020;182:109122. pmid:32069757
  89. Valeri L, Mazumdar MM, Bobb JF, Claus Henn B, Rodrigues E, Sharif OIA, et al. The Joint Effect of Prenatal Exposure to Metal Mixtures on Neurodevelopmental Outcomes at 20-40 Months of Age: Evidence from Rural Bangladesh. Environ Health Perspect. 2017;125(6):067015. pmid:28669934
  90. Coker E, Gunier R, Bradman A, Harley K, Kogut K, Molitor J, et al. Association between Pesticide Profiles Used on Agricultural Fields near Maternal Residences during Pregnancy and IQ at Age 7 Years. Int J Environ Res Public Health. 2017;14(5):506. pmid:28486423
  91. Wei H, Zhang X, Yang X, Yu Q, Deng S, Guan Q, et al. Prenatal exposure to pesticides and domain-specific neurodevelopment at age 12 and 18 months in Nanjing, China. Environ Int. 2023;173:107814.
  92. Huang L-S, Myers GJ, Davidson PW, Cox C, Xiao F, Thurston SW, et al. Is susceptibility to prenatal methylmercury exposure from fish consumption non-homogeneous? Tree-structured analysis for the Seychelles Child Development Study. Neurotoxicology. 2007;28(6):1237–44. pmid:17942158
  93. LaLonde A, Love T, Thurston SW, Davidson PW. Discovering structure in multiple outcomes models for tests of childhood neurodevelopment. Biometrics. 2020;76(3):874–85. pmid:31729013
  94. Bowe AK, Lightbody G, Staines A, Kiely ME, McCarthy FP, Murray DM. Predicting Low Cognitive Ability at Age 5-Feature Selection Using Machine Learning Methods and Birth Cohort Data. Int J Public Health. 2022;67:1605047. pmid:36439276
  95. Bowe AK, Lightbody G, O’Boyle DS, Staines A, Murray DM. Predicting low cognitive ability at age 5 years using perinatal data and machine learning. Pediatr Res. 2024;95(6):1634–43. pmid:38177251
  96. Viegas da Silva E, Hartwig FP, Santos TM, Yousafzai A, Santos IS, Barros AJD, et al. Predictors of early child development for screening pregnant women most in need of support in Brazil. J Glob Health. 2024;14:04143. pmid:39173149
  97. Betts KS, Chai K, Kisely S, Alati R. Development and validation of a machine learning-based tool to predict autism among children. Autism Res. 2023;16(5):941–52. pmid:36899450
  98. Caly H, Rabiei H, Coste-Mazeau P, Hantz S, Alain S, Eyraud J-L, et al. Machine learning analysis of pregnancy data enables early identification of a subpopulation of newborns with ASD. Sci Rep. 2021;11(1):6877. pmid:33767300
  99. Usta MB, Karabekiroğlu K. Does the Psychopathology of the Parents Predict the Developmental-Emotional Problems of the Toddlers? Noro Psikiyatr Ars. 2020;57(4):265–9. pmid:33354115
  100. Yang S-W, Lee K-S, Heo JS, Choi E-S, Kim K, Lee S, et al. Machine learning analysis with population data for prepregnancy and perinatal risk factors for the neurodevelopmental delay of offspring. Sci Rep. 2024;14(1):13993. pmid:38886474
  101. Chiu Y-HM, Wilson A, Hsu H-HL, Jamal H, Mathews N, Kloog I, et al. Prenatal ambient air pollutant mixture exposure and neurodevelopment in urban children in the Northeastern United States. Environ Res. 2023;233:116394.
  102. Wang X, Li C, Zhou L, Liu L, Qiu X, Huang D, et al. Associations of prenatal exposure to PM2.5 and its components with offsprings’ neurodevelopmental and behavioral problems: A prospective cohort study from China. Ecotoxicol Environ Saf. 2024;282:116739. pmid:39029225
  103. Yang X, Zheng L, Zhang J, Wang H. Prenatal exposure to per-and polyfluoroalkyl substances and child executive function: Evidence from the Shanghai birth cohort study. Environ Int. 2024;183:108437. pmid:38232503
  104. Zhou T, Abrishamcar S, Christensen G, Eick SM, Barr DB, Vanker A, et al. Associations between prenatal exposure to environmental phenols and child neurodevelopment at two years of age in a South African birth cohort. Environ Res. 2025;264(Pt 1):120325. pmid:39528036
  105. Chen Y, Miao M, Wang Z, Ji H, Zhou Y, Liang H, et al. Prenatal bisphenol exposure and intelligence quotient in children at six years of age: A prospective cohort study. Chemosphere. 2023;334:139023. pmid:37230300
  106. Enright EA, Eick SM, Morello-Frosch R, Aguiar A, Woodbury ML, Sprowles JLN, et al. Associations of prenatal exposure to per- and polyfluoroalkyl substances (PFAS) with measures of cognition in 7.5-month-old infants: An exploratory study. Neurotoxicol Teratol. 2023;98:107182. pmid:37172619
  107. Gu J, Huang H, Tang P, Liao Q, Liang J, Tang Y, et al. Association between maternal metal exposure during early pregnancy and intelligence in children aged 3-6 years: Results from a Chinese birth cohort. Environ Res. 2024;261:119685. pmid:39068966
  108. Long J, Liang J, Liu T, Huang H, Chen J, Liao Q, et al. Association between prenatal exposure to alkylphenols and intelligence quotient among preschool children: sex-specific effects. Environ Health. 2024;23(1):21. pmid:38365736
  109. Ni Y, Szpiro AA, Loftus CT, Workman T, Sullivan A, Wallace ER, et al. Prenatal exposure to polycyclic aromatic hydrocarbons and executive functions at school age: Results from a combined cohort study. Int J Hyg Environ Health. 2024;260:114407. pmid:38879913
  110. Zhang B, Wang Z, Zhang J, Dai Y, Ding J, Guo J, et al. Prenatal exposure to per-and polyfluoroalkyl substances, fetal thyroid function, and intelligence quotient at 7 years of age: Findings from the Sheyang Mini Birth Cohort Study. Environ Int. 2024;187:108720.
  111. Zhou J, Tong J, Liang C, Wu P, Ouyang J, Cai W, et al. Prenatal metals and offspring cognitive development: Insights from a large-scale placental bioassay study. Environ Res. 2025;267:120684. pmid:39716677
  112. Cao Z, Yang M, Gong H, Feng X, Hu L, Li R, et al. Association between prenatal exposure to rare earth elements and the neurodevelopment of children at 24-months of age: A prospective cohort study. Environ Pollut. 2024;343:123201. pmid:38135135
  113. Li H, Tong J, Wang X, Lu M, Yang F, Gao H, et al. Associations of prenatal exposure to individual and mixed organophosphate esters with ADHD symptom trajectories in preschool children: The modifying effects of maternal Vitamin D. J Hazard Mater. 2024;478:135541. pmid:39154480
  114. Li Z, Han Y, Li X, Xiong W, Cui T, Xi W, et al. Polycyclic aromatic hydrocarbons exposure in early pregnancy on child neurodevelopment. Environ Pollut. 2025;366:125527. pmid:39675657
  115. Oskar S, Balalian AA, Stingone JA. Identifying critical windows of prenatal phenol, paraben, and pesticide exposure and child neurodevelopment: Findings from a prospective cohort study. Sci Total Environ. 2024;920:170754. pmid:38369152
  116. Hernandez-Castro I, Eckel SP, Howe CG, Niu Z, Kannan K, Robinson M, et al. Prenatal exposures to organophosphate ester metabolite mixtures and children’s neurobehavioral outcomes in the MADRES pregnancy cohort. Environ Health. 2023;22(1):66. pmid:37737180
  117. Conejo-Bolaños LD, Mora AM, Hernández-Bonilla D, Cano JC, Menezes-Filho JA, Eskenazi B, et al. Prenatal current-use pesticide exposure and children’s neurodevelopment at one year of age in the Infants’ Environmental Health (ISA) birth cohort, Costa Rica. Environ Res. 2024;249:118222. pmid:38272290
  118. Qiu Y, Liu Y, Gan M, Wang W, Jiang T, Jiang Y, et al. Association of prenatal multiple metal exposures with child neurodevelopment at 3 years of age: A prospective birth cohort study. Sci Total Environ. 2024:173812.
  119. Kou X, Millán MP, Canals J, Moreno VR, Renzetti S, Arija V. Effects of prenatal exposure to multiple heavy metals on infant neurodevelopment: A multi-statistical approach. Environ Pollut. 2025;367:125647. pmid:39761717
  120. Song AY, Kauffman EM, Hamra GB, Dickerson AS, Croen LA, Hertz-Picciotto I, et al. Associations of prenatal exposure to a mixture of persistent organic pollutants with social traits and cognitive and adaptive function in early childhood: Findings from the EARLI study. Environ Res. 2023;229:115978. pmid:37116678
  121. Alwosheel A, van Cranenburgh S, Chorus CG. Is your dataset big enough? Sample size requirements when using artificial neural networks for discrete choice analysis. Journal of Choice Modelling. 2018;28:167–82.
  122. Bobb JF, Valeri L, Claus Henn B, Christiani DC, Wright RO, Mazumdar M, et al. Bayesian kernel machine regression for estimating the health effects of multi-pollutant mixtures. Biostatistics. 2015;16(3):493–508. pmid:25532525
  123. Mohyuddin GR, Prasad V. Detecting Selection Bias in Observational Studies-When Interventions Work Too Fast. JAMA Intern Med. 2023;183(9):897–8. pmid:37306983
  124. Dahabreh IJ, Bibbins-Domingo K. Causal Inference About the Effects of Interventions From Observational Studies in Medical Journals. JAMA. 2024;331(21):1845–53. pmid:38722735
  125. de Hond AAH, Leeuwenberg AM, Hooft L, Kant IMJ, Nijman SWJ, van Os HJA, et al. Guidelines and quality criteria for artificial intelligence-based prediction models in healthcare: a scoping review. NPJ Digit Med. 2022;5(1):2. pmid:35013569
  126. Nosek BA, Alter G, Banks GC, Borsboom D, Bowman SD, Breckler SJ, et al. SCIENTIFIC STANDARDS. Promoting an open research culture. Science. 2015;348(6242):1422–5. pmid:26113702
  127. Flaxman AD, Vos T. Machine learning in population health: Opportunities and threats. PLoS Med. 2018;15(11):e1002702. pmid:30481173
  128. Morgenstern JD, Buajitti E, O’Neill M, Piggott T, Goel V, Fridman D, et al. Predicting population health with machine learning: a scoping review. BMJ Open. 2020;10(10):e037860. pmid:33109649
  129. Prakash P, Street K, Narayanan S, Fernandez BA, Shen Y, Shu C. Benchmarking Machine Learning Missing Data Imputation Methods in Large-Scale Mental Health Survey Databases. medRxiv. 2024:2024.05.13.24307231.
  130. Sasse L, Nicolaisen-Sobesky E, Dukart J, Eickhoff SB, Götz M, Hamdan S, et al. On Leakage in Machine Learning Pipelines. arXiv preprint arXiv:2311.04179. 2023.
  131. Apicella A, Isgrò F, Prevete R. Don’t Push the Button! Exploring Data Leakage Risks in Machine Learning and Transfer Learning. arXiv preprint arXiv:2401.13796. 2024.
  132. Huqh MZU, Abdullah JY, Wong LS, Jamayet NB, Alam MK, Rashid QF, et al. Clinical applications of artificial intelligence and machine learning in children with cleft lip and palate—a systematic review. Int J Environ Res Public Health. 2022;19(17):10860.
  133. Bobb JF, Claus Henn B, Valeri L, Coull BA. Statistical software for analyzing the health effects of multiple concurrent exposures via Bayesian kernel machine regression. Environ Health. 2018;17(1):67. pmid:30126431
  134. Yu L, Liu W, Wang X, Ye Z, Tan Q, Qiu W, et al. A review of practical statistical methods used in epidemiological studies to estimate the health effects of multi-pollutant mixture. Environ Pollut. 2022;306:119356. pmid:35487468
  135. Fernández-Delgado M, Cernadas E, Barro S, Amorim D. Do we need hundreds of classifiers to solve real world classification problems? The Journal of Machine Learning Research. 2014;15(1):3133–81.
  136. Zhuang F, Qi Z, Duan K, Xi D, Zhu Y, Zhu H, et al. A Comprehensive Survey on Transfer Learning. Proc IEEE. 2021;109(1):43–76.