In vitro fertilisation (IVF) is a common mode of conception. Understanding the long-term implications for these children is important. The aim of this study was to determine the causal effect of IVF conception on primary school-age childhood developmental and educational outcomes, compared with outcomes following spontaneous conception.
Methods and findings
Causal inference methods were used to analyse observational data in a way that emulates a target randomised clinical trial. The study cohort comprised statewide linked maternal and childhood administrative data. Participants included singleton infants conceived spontaneously or via IVF, born in Victoria, Australia between 2005 and 2014 and who had school-age developmental and educational outcomes assessed. The exposure examined was conception via IVF, with spontaneous conception the control condition. Two outcome measures were assessed. The first, childhood developmental vulnerability at school entry (age 4 to 6), was assessed using the Australian Early Development Census (AEDC) (n = 173,200) and defined as scoring <10th percentile in ≥2/5 developmental domains (physical health and wellbeing; social competence; emotional maturity; language and cognitive skills; and communication skills and general knowledge). The second, educational outcome at age 7 to 9, was assessed using National Assessment Program–Literacy and Numeracy (NAPLAN) data (n = 342,311) and defined by overall z-score across 5 domains (grammar and punctuation, reading, writing, spelling, and numeracy). Inverse probability weighting with regression adjustment was used to estimate population average causal effects.
The study included 412,713 children across the 2 outcome cohorts. Linked records were available for 4,697 IVF-conceived cases and 168,503 controls for AEDC, and 8,976 cases and 333,335 controls for NAPLAN. There was no causal effect of IVF-conception on the risk of developmental vulnerability at school-entry compared with spontaneously conceived children (AEDC metrics), with an adjusted risk difference of −0.3% (95% CI −3.7% to 3.1%) and an adjusted risk ratio of 0.97 (95% CI 0.77 to 1.25). At age 7 to 9 years, there was no causal effect of IVF-conception on the NAPLAN overall z-score, with an adjusted mean difference of 0.030 (95% CI −0.018 to 0.077) between IVF- and spontaneously conceived children. The models were adjusted for sex at birth, age at assessment, language background other than English, socioeconomic status, maternal age, parity, and education. Study limitations included the use of observational data, the potential for unmeasured confounding, the presence of missing data, and the necessary restriction of the cohort to children attending school.
In this analysis, under the given causal assumptions, the school-age developmental and educational outcomes for children conceived by IVF are equivalent to those of spontaneously conceived children. These findings provide important reassurance for current and prospective parents and for clinicians.
Why was this study done?
- More than 8 million children have been conceived globally with the assistance of in vitro fertilisation (IVF).
- Some studies suggest these children have an increased risk of congenital abnormalities, autism spectrum disorder, developmental delay, and intellectual disability.
- Educational and school-age developmental outcomes following IVF conception have not yet been adequately characterised.
What did the researchers do and find?
- Using statewide, linked population data from Victoria, Australia, we investigated the school-age developmental and educational outcomes for children born following IVF-assisted conception.
- The study examined 2 separate assessments of school-age development and educational outcomes among 585,659 children, including 11,059 children who were conceived via IVF.
- This study was designed and performed within a causal framework, in order to produce the best possible estimate of exposure effect using observational data.
- We found no difference in school-age childhood developmental and educational outcomes between IVF- and spontaneously conceived children.
Citation: Kennedy AL, Vollenhoven BJ, Hiscock RJ, Stern CJ, Walker SP, Cheong JLY, et al. (2023) School-age outcomes among IVF-conceived children: A population-wide cohort study. PLoS Med 20(1): e1004148. https://doi.org/10.1371/journal.pmed.1004148
Received: July 3, 2022; Accepted: November 23, 2022; Published: January 24, 2023
Copyright: © 2023 Kennedy et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data for this study were provided by various data custodians and linked by the Centre for Victorian Data Linkage (https://www.health.vic.gov.au/reporting-planning-data/the-centre-for-victorian-data-linkage). With relevant ethical approval, data are available upon request to the governing data custodians.
Funding: This work was supported by the National Health and Medical Research Council through the Australian Federal Government Graduate Research Scheme (AK) and Mercy Foundation, through Mercy Perinatal (AK). Ferring Pharmaceutics supported this work through an unconditional research grant (AK). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: BV has a paid role as a member of the Therapeutic Goods Administration. BV, FA and KS own shares in respective IVF companies (Monash IVF, Virtus Health and Melbourne IVF).
Abbreviations: AEDC, Australian Early Development Census; ATE, average treatment effect; ATSI, Aboriginal and Torres Strait Islander; BMI, body mass index; CI, confidence interval; CVDL, Centre for Victorian Data Linkage; DAG, directed acyclic graph; ICSI, intracytoplasmic sperm injection; IPW, inverse probability weight; IPWRA, inverse-probability-weighted regression adjustment; IVF, in vitro fertilisation; MD, mean differences; NAPLAN, National Assessment Program–Literacy and Numeracy; NMS, national minimum standard; POM, potential outcome means; PS, propensity score; RD, risk difference; RR, relative risk; SAP, statistical analysis plan; SE, standard error; SEIFA, Socio-Economic Indexes for Areas; TMLE, targeted maximum likelihood estimation; VPDC, Victorian Perinatal Data Collection
In vitro fertilisation (IVF) is a common mode of conception worldwide. Since the first successful IVF birth in 1978, more than 8 million babies have been born globally following IVF conception [2,3]. In Australia, it is now estimated that 1 in 20 babies are born following IVF conception [4,5].
As the number of children born following IVF conception continues to rise, a deeper understanding of the long-term implications for these children is important. It is well established that there are increased risks of maternal and perinatal complications following IVF conception [6–8]. Large cohort studies have suggested an increase in the frequency of congenital abnormalities, autism spectrum disorder, developmental delay, and intellectual disability in children conceived via IVF or intracytoplasmic sperm injection (ICSI) techniques [9–13]. However, reports detailing longer term outcomes after IVF beyond the neonatal period remain sparse.
Educational and cognitive outcomes following IVF conception have not yet been thoroughly investigated. Several small cohort studies [14–17] have reported conflicting results. One large population-based study suggested a small difference in school performance in favour of spontaneous conception. Another population study recently concluded that school performance was not adversely affected by the process of IVF but, rather, by the condition of subfertility.
Parents of IVF- and spontaneously conceived children possess inherently different health and sociodemographic characteristics [20,21]. Factors such as increased maternal age and higher education are known to be associated with both the use of fertility treatment and better early childhood outcomes [22–24]. It is thus critical that such factors are appropriately acknowledged when examining the association between mode of conception and childhood outcomes. Proper adjustment in any statistical analysis is required before any association can be given a causal interpretation.
Our study aimed to overcome some of the limitations of the analysis of observational (non-randomised) data by using a causal inference approach that seeks to emulate the results of a randomised comparison in a clinical trial (Table 1) [25,26]. This analytical approach attempts to simulate a randomised trial by (1) requiring an a priori statistical analysis protocol; (2) addressing a causal question reflecting the effect of an intervention at a specific clinical decision point on a prespecified outcome; and (3) using inverse probability weighting via propensity score (PS) models to balance the outcome propensity differences between exposed and control populations, with the aim of producing exchangeable comparison groups. This allowed us to estimate the population-average effect of mode of conception (IVF versus spontaneous conception) on childhood developmental and educational outcomes with a causal interpretation. Specifically, we aimed to estimate the total causal effect of IVF conception on school-age childhood developmental and educational outcomes, under the necessary causal assumptions.
This study is reported as per the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline (Checklist in S13 File).
The study population included all singleton livebirths in Victoria between 2005 and 2014. Twins and higher order multiple births were excluded. Perinatal information was collected from audited birth outcome data through the Victorian Perinatal Data Collection (VPDC) [28,29]. The 3 largest IVF units in Victoria provided maternal records from all cycles that resulted in a birth during the study period. Creation of linked maternal/child data pairs required matching of the VPDC data with birth records, which were obtained from the Victorian Births, Deaths and Marriage registry.
The exposure was conception via IVF compared with spontaneous conception. The term “IVF” is used collectively to include conventional IVF, IVF with ICSI, and associated laboratory techniques. IVF cases were identified through the IVF database. Victorian births not identified in the IVF database were allocated to the control group. Pregnancies recorded as “IVF conception” in the VPDC but not identified within the IVF database were excluded, ensuring the control group did not contain any IVF conceptions. These cases likely represent overseas or interstate IVF conceptions, Victorian IVF conceptions not captured by our database, or failed linkages between the IVF database and state birth records.
Main outcome measures.
Childhood educational and developmental outcomes were assessed using 2 standardised, national assessments: the Australian Early Development Census (AEDC) and the National Assessment Program–Literacy and Numeracy (NAPLAN). See the Supporting information file (Methods in S1 File) for a detailed description of each measure.
Australian Early Development Census (AEDC).
The AEDC assesses broad childhood functional development at school entry (age 4 to 6) across 5 domains: physical health and wellbeing; social competence; emotional maturity; language and cognitive skills (school-based); and communication skills and general knowledge. The primary AEDC outcome for this study was a global measure, developmental vulnerability, defined as scoring <10th percentile in ≥2 of the 5 developmental domains. The secondary outcomes included developmental vulnerability in each of the 5 domains.
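The global outcome definition above can be expressed as a simple rule. The following is a minimal sketch only: the domain keys, the function name, and the use of percentile ranks as inputs are illustrative assumptions, not the AEDC's own scoring procedure.

```python
from typing import Mapping

# Hypothetical keys for the 5 AEDC domains named in the text.
AEDC_DOMAINS = (
    "physical_health_wellbeing",
    "social_competence",
    "emotional_maturity",
    "language_cognitive_skills",
    "communication_general_knowledge",
)


def is_developmentally_vulnerable(
    percentiles: Mapping[str, float],
    cutoff: float = 10.0,
    min_domains: int = 2,
) -> bool:
    """Return True if the child scores below `cutoff` in at least `min_domains` domains."""
    low_domains = sum(1 for d in AEDC_DOMAINS if percentiles[d] < cutoff)
    return low_domains >= min_domains


# Example: a child below the 10th percentile in exactly two domains.
child = {d: 50.0 for d in AEDC_DOMAINS}
child["social_competence"] = 5.0
child["emotional_maturity"] = 9.9
flag = is_developmentally_vulnerable(child)
```

Setting `min_domains=1` and passing a single domain would correspond to the domain-level secondary outcomes, again under the same assumptions.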
The National Assessment Program–Literacy and Numeracy (NAPLAN).
NAPLAN is a school-based psychometric assessment, assessing 5 educational domains: grammar and punctuation, reading, writing, spelling, and numeracy. The study cohort’s grade 3 NAPLAN (fourth year of primary school) results were investigated. For this study, an overall z-score was calculated and used as the primary outcome, with the individual domain z-scores examined as secondary outcomes. By a priori consensus, a mean z-score difference of 0.2 standard deviations was considered to be clinically relevant. Individual domain scores below the published national minimum standard (NMS) NAPLAN scores for each year and for each domain were analysed as secondary (binary) outcomes.
Covariates to be considered for inclusion in the statistical analysis models were decided a priori by the authorship team, whose expertise included epidemiology, perinatology, reproductive endocrinology, and education. These covariates included child’s sex (as assigned at birth), child’s age in years at assessment, language background other than English (LBOTE), maternal age (at birth of the child), parity, maternal and paternal highest level of education attained, and socioeconomic status. Gestational age at birth, mode of delivery, and birthweight were considered to be mediators on the causal pathways of interest and were therefore not adjusted for in this analysis. A directed acyclic graph (DAG) was created to describe the structure of the relationships between all variables and identify the adjustment variable set, in line with the methodology recommendations of Tennant and colleagues. Our prespecified statistical analysis plan (SAP) and the DAG were agreed upon and signed off by all authors in May 2020, prior to the commencement of data analysis (Protocol in S2 File).
Administrative record linkage techniques were employed to match cases with the exposure (conception via IVF) through to childhood outcome data. Data linkage was performed by the Centre for Victorian Data Linkage (CVDL), a third-party government-funded data linkage unit. Probabilistic linkage was performed between the 5 individual databases: birth records, birthing outcomes, IVF records, AEDC, and NAPLAN. Post-linkage data were manually screened for false matches using secondary variables (e.g., residential postcode). False matches and duplicates were removed (Table A in S3 File outlines number and percentage of successful linkages).
Two separate, linked study populations were identified, children with a linked AEDC record and children with a linked NAPLAN record. These 2 cohorts were analysed and are reported separately. Some children were included in both cohorts.
The ATE (average treatment effect) estimand used in this study is based upon the Potential Outcomes Framework. If a set of assumptions is met, then a causal interpretation can be made. The causal assumptions are counterfactual consistency, ignorability (conditional exchangeability), and positivity. Counterfactual consistency means that the definition of exposure is consistent for all individuals. Ignorability states that treatment assignment can be considered random after controlling for (that is, conditioning on) a set of covariates. By identifying confounding variables, and importantly, the structure of the relationships between variables, via a DAG (S1 Fig) and by performing appropriate statistical modelling to balance the population (for example, inverse probability weighting), observed populations can be considered exchangeable or “unconfounded.” Exchangeability requires that there are no important unmeasured confounders; this assertion is untestable. The positivity assumption means that for all observations, the conditional probability of being exposed (receiving treatment/no treatment) is greater than zero. This is likely violated if overlap of the control and exposed populations is poor.
To best emulate a target trial, it must be possible for all participants to potentially receive both treatments. To ensure the assumptions underlying our causal approach were as robust as possible, we considered our observational data in direct comparison with the conditions of a target trial (Table 1).
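For readers who prefer notation, the estimand and assumptions described above can be written compactly. The symbols here are assumptions introduced for illustration and do not appear in the original text: $A$ is the exposure (1 = IVF, 0 = spontaneous conception), $L$ the adjustment set, $Y$ the observed outcome, and $Y^{a}$ the potential outcome under exposure level $a$.

```latex
\begin{align*}
\text{Estimand:}\quad & \mathrm{ATE} = E\!\left[Y^{a=1}\right] - E\!\left[Y^{a=0}\right] \\
\text{Consistency:}\quad & Y = Y^{a} \ \text{whenever } A = a \\
\text{Conditional exchangeability:}\quad & Y^{a} \perp\!\!\!\perp A \mid L \quad \text{for } a \in \{0, 1\} \\
\text{Positivity:}\quad & 0 < P(A = 1 \mid L = l) < 1 \ \text{for all } l \text{ with } P(L = l) > 0
\end{align*}
```

In this form, positivity requires that both exposure levels are possible at every covariate pattern, which is the condition that poor overlap between the exposed and control populations calls into question.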
Handling of missing data
The proportions of missing data are described in Table 2. Data were missing for outcomes and covariates; there were no missing exposure data. Missing covariates and outcomes that were considered to be either missing completely at random or missing at random were imputed. Children identified as having special needs are not allocated an AEDC domain category, and thus their outcome data are missing. Likewise, children who attend school but have a disability that precludes them from being able to appropriately participate in the NAPLAN are exempt from sitting the test; by definition, these children are below the NAPLAN national minimum standard for each domain from which they have been excluded. Outcome data for these children for both AEDC domain categories and NAPLAN z-scores were considered missing not at random. In the analysis of all primary and secondary AEDC outcomes, children with special needs have been included and assumed to be “developmentally vulnerable.” In the analysis of NAPLAN outcomes, “exempt” children have also been included and allocated either the lowest possible test z-score or deemed to be below the national minimum standard for binary NAPLAN domain outcomes.
All covariates in the analysis model that had missing data were imputed, even if missingness was very low. For the AEDC analysis, imputed variables included parity, age at assessment, maternal education, Socio-Economic Indexes for Areas (SEIFA), and outcome score. For the NAPLAN analysis, imputed variables included parity, age at assessment, maternal education, paternal education, SEIFA, and outcome score. Maternal body mass index (BMI) was excluded from imputation and analysis because the missingness was too high (>50%). For AEDC imputation models, second parent education level (42% missing) was excluded due to non-convergence when included in the imputation model.
Multiple imputation of missing data was performed under a fully conditional specification using a predictive mean model for continuous and unordered categorical covariates and a logistic model for binary covariates, with standard errors (SEs) accounting for maternal clustering (Methods, Table 2 and Figs A–C in S5 File). The model contained outcome, exposure, model covariates, and auxiliary variables (AEDC: remote locality, Aboriginal and Torres Strait Islander (ATSI) status, and maternal country of origin; NAPLAN: ATSI status and maternal country of origin) along with interaction terms (exposure-parity, exposure-maternal age, exposure-age at assessment, gender-age at testing) and 1 higher order term (test age²). At 20 imputations, the Monte Carlo errors were less than 10% of the corresponding SE for all covariates. Each imputation model was subjected to the recommended diagnostic tests.
Descriptive statistics were calculated and are reported for each cohort by IVF exposure status, according to type and distribution of data.
Treatment effect size modelling.
All multivariate models were adjusted for the listed covariates identified in the prespecified SAP, except for (1) maternal BMI; and (2) second parent education level, for AEDC outcome models only.
For each of the imputed datasets, the predicted probability of exposure or PS and associated inverse probability weight (IPW = 1/PS) were estimated using a logistic regression model, conditional on all analysis model covariates. These weights were then stabilised by including as a factor in the numerator the proportion of each treatment group within the population, i.e., the prevalence of IVF and spontaneous conception. Diagnostic tests performed after planned treatment effect modelling (Figs A–D in S6 File) demonstrated poor overlap of exposed and non-exposed cohorts. We therefore restricted our analysis to the IVF population whose weights overlapped with the control group to ensure that the overlap (“positivity”) assumption was not violated. This reduced the IVF cases to 31.6% of the original AEDC cohort and 22.3% of the NAPLAN cohort. For each covariate, the standardised mean difference between the exposure arms was calculated to assess if balance between weighted pseudo-populations was achieved (Figs A and B in S8 File).
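The weighting and balance check described above can be sketched as follows. This is an illustrative toy, not the study's code: the function names and data are hypothetical, the propensity scores are supplied directly rather than fitted by logistic regression, and the sketch assumes the conventional form in which unexposed observations receive a weight based on 1 − PS.

```python
import math


def stabilised_weights(exposed, ps):
    """Stabilised inverse probability weights.

    exposed: list of 0/1 exposure indicators.
    ps: list of propensity scores, P(exposed = 1 | covariates).
    Assumption: exposed units get prevalence / PS and unexposed units get
    (1 - prevalence) / (1 - PS).
    """
    prevalence = sum(exposed) / len(exposed)
    return [
        prevalence / p if a == 1 else (1 - prevalence) / (1 - p)
        for a, p in zip(exposed, ps)
    ]


def _weighted_mean_var(values, weights):
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var


def standardised_mean_difference(x, exposed, weights):
    """Weighted mean difference in x between arms, divided by the pooled SD."""
    stats = {}
    for arm in (0, 1):
        vals = [v for v, a in zip(x, exposed) if a == arm]
        wts = [w for w, a in zip(weights, exposed) if a == arm]
        stats[arm] = _weighted_mean_var(vals, wts)
    (m0, v0), (m1, v1) = stats[0], stats[1]
    return (m1 - m0) / math.sqrt((v1 + v0) / 2)


# Toy data: one binary covariate that strongly predicts exposure.
x = [0, 0, 0, 0, 1, 1, 1, 1]
exposed = [1, 0, 0, 0, 1, 1, 1, 0]
ps = [0.25] * 4 + [0.75] * 4  # hypothetical, equal to the true stratum proportions

w = stabilised_weights(exposed, ps)
smd_raw = standardised_mean_difference(x, exposed, [1.0] * len(x))
smd_weighted = standardised_mean_difference(x, exposed, w)
```

In this toy example the covariate is badly imbalanced before weighting and balanced afterwards, which is the pattern a balance diagnostic is looking for. Observations with propensity scores close to 0 or 1 would receive very large weights, which is one way that poor overlap shows up in practice.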
For each imputed dataset, a doubly robust inverse-probability-weighted regression adjustment (IPWRA) model [38,39] was then used to estimate the respective potential outcome means (POM) followed by (1) the risk difference (RD) and relative risk (RR) for binary outcomes (AEDC and NAPLAN); and (2) mean differences (MD) for continuous outcomes (NAPLAN z-score).
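The step from potential outcome means to RD and RR can be illustrated with a deliberately simplified sketch. It assumes a single categorical confounder and a saturated outcome model, so the weighted regression in each arm reduces to weighted stratum means; the function name, record layout, and data are hypothetical, and the study's actual models adjusted for several covariates.

```python
from collections import defaultdict


def ipwra_binary(records):
    """Toy IPWRA for a binary outcome with one categorical confounder.

    records: iterable of (stratum, exposed, outcome, weight) tuples.
    Every stratum must contain both exposed and unexposed records
    (positivity); otherwise a ZeroDivisionError is raised.
    Returns (pom_exposed, pom_unexposed, risk_difference, relative_risk).
    """
    num = defaultdict(float)  # weighted sum of outcomes per (stratum, arm)
    den = defaultdict(float)  # sum of weights per (stratum, arm)
    n_stratum = defaultdict(int)
    for stratum, arm, outcome, weight in records:
        num[(stratum, arm)] += weight * outcome
        den[(stratum, arm)] += weight
        n_stratum[stratum] += 1
    n = sum(n_stratum.values())

    pom = {}
    for arm in (0, 1):
        # Predict every individual's outcome under this arm, then average
        # over the whole population (standardisation).
        pom[arm] = sum(
            count * num[(s, arm)] / den[(s, arm)] for s, count in n_stratum.items()
        ) / n
    return pom[1], pom[0], pom[1] - pom[0], pom[1] / pom[0]


# Toy data: risk is 0.5 in stratum A and 0.25 in stratum B for both arms,
# but exposure is more common in the lower-risk stratum B.
prevalence = 6 / 16
w_a1, w_a0 = prevalence / 0.25, (1 - prevalence) / 0.75
w_b1, w_b0 = prevalence / 0.5, (1 - prevalence) / 0.5
records = (
    [("A", 1, y, w_a1) for y in (1, 0)]
    + [("A", 0, y, w_a0) for y in (1, 1, 1, 0, 0, 0)]
    + [("B", 1, y, w_b1) for y in (1, 0, 0, 0)]
    + [("B", 0, y, w_b0) for y in (1, 0, 0, 0)]
)
pom1, pom0, rd, rr = ipwra_binary(records)
```

Here the crude risks differ between arms (2/6 versus 4/10) purely because of confounding by stratum, while the adjusted RD is zero and the adjusted RR is one. With a saturated model the weights do not change the stratum means; they matter when the outcome model is not saturated, which is where the doubly robust property becomes relevant.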
Finally, estimates for each imputed dataset were pooled to provide overall ATE with associated 95% confidence limits using Rubin’s method.
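As a sketch of the pooling step, Rubin's rules combine the per-imputation estimates and their standard errors as shown below. The function name and inputs are hypothetical, and for brevity the interval uses a normal approximation rather than the t reference distribution with Rubin's degrees of freedom.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors, z=1.96):
    """Pool per-imputation estimates with Rubin's rules.

    Returns (pooled_estimate, pooled_se, (ci_lower, ci_upper)).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(se ** 2 for se in std_errors)  # average within-imputation variance
    between = variance(estimates)                # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    pooled_se = math.sqrt(total)
    return pooled, pooled_se, (pooled - z * pooled_se, pooled + z * pooled_se)


# Hypothetical mean differences from three imputed datasets.
estimate, se, ci = pool_rubin([0.02, 0.03, 0.04], [0.02, 0.02, 0.02])
```

The pooled standard error is larger than any single-imputation standard error whenever the estimates vary between imputations, which is how the uncertainty due to missing data is carried into the final interval.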
Provided the assumptions outlined above are satisfied, the estimates generated from these analyses can be interpreted as the population average causal effect, that is, the mean effect on the outcome if the treatment was applied to the entire population and contrasted with the outcome if the entire population received the control condition.
Clustering of data within mothers due to more than 1 singleton birth during the study period was accounted for in the imputation models, the calculation of inverse probability weights and estimation of the treatment effect by using robust SEs.
Sensitivity analyses were also performed to address identified sources of potential bias. For both AEDC (special needs status) and NAPLAN (exempt status) cohorts, sensitivity analyses were performed: (1) by excluding these cases completely; and (2) by imputing their outcomes (Fig A in S4 File). Targeted maximum likelihood estimation (TMLE) modelling (a machine learning ensemble that is less sensitive to violations of positivity and does not require data distribution assumptions) was undertaken for comparison. Additionally, calculation of E-values for our 2 primary outcomes was performed to quantify the magnitude of unobserved bias required to alter our findings.
The total cohort included 585,659 singleton births in Victoria between 2005 and 2014. Among this cohort, 173,200 children, including 4,697 IVF births, were linked to AEDC outcome data. Additionally, 342,311 children, including 8,976 IVF births, were linked to NAPLAN data (Fig 1). Overall, a total of 11,059 IVF-conceived children and 401,654 spontaneously conceived children were included in the study (2,614 IVF cases and 100,184 controls were in both study arms). We estimate that our study cohort includes >95% of IVF conceptions during the study timeframe (Tables A and B in S3 File). Analysis of the linked and non-linked cases showed little evidence of association between linkage and exposure status (χ² p = 0.80); that is, IVF cases were just as likely to be included in the final linked cohort as controls. There were no births from 2014 that linked to outcome data.
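For orientation, the commonly used E-value formula for a risk ratio is sketched below. The function name is hypothetical, and the sketch is not expected to reproduce the E-values reported in this study, which may rest on different inputs (for example, a confidence limit or an effect-size conversion for the continuous outcome).

```python
import math


def e_value(risk_ratio: float) -> float:
    """E-value for a risk ratio: RR + sqrt(RR * (RR - 1)), with RR taken as >= 1.

    Risk ratios below 1 are inverted first, so protective and harmful
    associations of the same strength give the same E-value.
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1 else 1 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1))


# Hypothetical inputs: a doubling of risk and the equivalent halving.
ev_harmful = e_value(2.0)
ev_protective = e_value(0.5)
```

An E-value of 1 corresponds to a null association, and larger values mean that stronger unmeasured confounding would be needed to shift the estimate.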
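The overall counts follow from the two linked cohorts once children appearing in both are counted only once. As a worked check, using the IVF and control counts reported for each cohort:

```latex
\begin{align*}
\text{IVF-conceived:}\quad & 4{,}697 + 8{,}976 - 2{,}614 = 11{,}059 \\
\text{Spontaneously conceived:}\quad & 168{,}503 + 333{,}335 - 100{,}184 = 401{,}654 \\
\text{Total included:}\quad & 11{,}059 + 401{,}654 = 412{,}713
\end{align*}
```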
Abbreviations used in Fig 1: IVF, in vitro fertilisation; IVF cases, pregnancies and children identified with conception assisted by IVF; Controls, pregnancies and children not identified as IVF assisted conception; ART, assisted reproductive technology; non-IVF ART, ovulation induction and intrauterine insemination; VPDC, Victorian Perinatal Data Collection; BDM, Victorian Births Death Marriages Registry; IVF database, combined Victorian IVF pregnancy record database.
Baseline population characteristics differed considerably between the 2 exposure groups (Table 2). Compared with spontaneously conceived controls, children conceived via IVF had older, more highly educated parents and mothers with lower parity. IVF-conceived children resided in postal areas with higher socioeconomic ranking and were less likely to be from non-English speaking backgrounds. Age at assessment was similar between the exposure groups.
Global developmental vulnerability at school entry (The Australian Early Development Census, AEDC)
Our findings suggest no causal effect of IVF conception on developmental vulnerability, with 13.6% of IVF-conceived children predicted to be developmentally vulnerable (<10th percentile in 2 or more of the 5 AEDC domains) compared with 13.9% of spontaneously conceived children. The adjusted RD was −0.3%, indicating that 0.3% fewer children who were conceived by IVF were developmentally vulnerable compared with those conceived spontaneously. However, the 95% CI (−3.7% to 3.1%) indicates that this result is indistinguishable from zero. Similarly, the adjusted relative risk showed no detectable difference in risk of developmental vulnerability, where IVF-conceived children were 3.0% less likely to be developmentally vulnerable than spontaneously conceived children (RR 0.97, 95% CI: 0.77 to 1.25) (Table 3).
For secondary outcomes, we examined each of the 5 AEDC domains individually. The unadjusted observed results and causal model results for each individual domain are reported in Table 3. There were no differences between IVF- and spontaneously conceived children in adjusted risk difference for any of the individual AEDC domains.
Outcome data were missing for 5.6% of the AEDC-linked cohort. The vast majority (92%) of these missing cases were children with special needs (5.2% of overall cohort). There was no evidence of an association between the presence of missing outcome and exposure status (χ² p = 0.68). Sensitivity analysis was performed by (1) excluding children with special needs; and (2) including these children, with multiple imputation of their missing outcomes (Tables A and B in S8 File). Most covariates had minimal or no missing data (<1.0%). Maternal education level was missing for 30.5% and maternal post-school education was missing for 31.6%.
Psychometric assessment of 5 educational domains at primary school (The National Assessment Program–Literacy and Numeracy, NAPLAN)
Our findings indicate the causal effect of IVF conception on overall NAPLAN z-score was indistinguishable from zero. The predicted outcome mean z-score was 0.013 (SE 0.024) for IVF-conceived children and −0.016 (SE 0.002) for spontaneously conceived controls, with an adjusted mean difference of 0.030 (95% CI −0.018 to 0.077) (Table 4).
For secondary outcomes, we examined individual NAPLAN domain z-scores (Table 4). IVF-conceived children performed better on average in measures of writing than their spontaneously conceived peers with a z-score mean difference of 0.068 (95% CI 0.004 to 0.132), but this is unlikely to be a clinically important difference. The estimated effect is less than 0.07 of a standard deviation, and a difference of 0.2 standard deviations or greater was determined a priori as representing a finding of importance.
Additionally, for each domain, a binary outcome (domain scores above or below the national minimum standard) was examined. In 4 of 5 domains (numeracy, reading, spelling, and writing), IVF-conceived children were less likely to be below the national minimum standard compared with their spontaneously conceived peers (Table 5). For these 4 domains, the RD was between −0.7% and −1.25%. In absolute terms, this equates to approximately 1 additional IVF-conceived child, for every 100, predicted to score above the national minimum standard compared with their spontaneously conceived peers.
Spontaneously conceived children were more likely to have missing NAPLAN data (7.6%) than IVF-conceived children (5.9%, χ² p < 0.001). During the primary analysis, missing outcomes related to a child being absent or withdrawing from the test were imputed. The results presented include 7,222 children who were exempt from sitting the NAPLAN, with their results set to the lowest possible outcome score. Sensitivity analysis was performed by (1) excluding these children; and (2) including the exempt cases, with multiple imputation of their missing outcomes. There was no meaningful difference in the results (Tables A and B in S9 File). Most covariates had minimal or no missing data (<4.0%). Second parent school education level was missing in 13.8% of cases and post-school education missing in 15.4% of cases.
To validate our analysis model, we re-examined our AEDC primary outcome and the NAPLAN binary domain outcomes using TMLE modelling. Results from the TMLE model did not meaningfully differ from the findings of the primary analysis (Table A in S10 File).
An E-value was estimated for both primary outcomes and was found to be 1.90 and 1.77 for AEDC and NAPLAN outcomes, respectively, suggesting that an unknown bias of sufficient magnitude to change the study findings is unlikely (Figs A and B in S11 File).
Using a causal inference approach, we found no effect of IVF conception on developmental vulnerability at school entry in Victorian children born between 2005 and 2014. Additionally, IVF-conceived children performed as well as their spontaneously conceived peers in school-based psychometric testing at age 7 to 9 years.
For the first time, our study has estimated the causal effect of IVF conception on global childhood development at school entry and educational outcomes at primary school, under the assumptions of causal inference. Using an updated epidemiological approach, this study provides robust evidence about the longer term implications of IVF conception. The findings of this study offer timely reassurance about the impact of IVF conception on the developmental and educational outcomes at primary school age of the children conceived. Neither the outcomes of developmental vulnerability at school entry nor educational achievement at age 7 to 9 differed substantially between IVF- and spontaneously conceived children. Among 4 out of 5 NAPLAN individual domain national minimum standard results, there was a trend towards better performance in the IVF cases, but the clinical and social implications of these findings are difficult to quantify.
Two large Scandinavian studies have reported on childhood outcomes following IVF conception. Norrman and colleagues found that IVF-conceived children perform worse on school-based assessment in year 9, among their cohort of just over 8,000 IVF-conceived children. Wienecke and colleagues reported that IVF-conceived children had poorer school performance than controls and that spontaneously conceived children of subfertile parents also had poorer outcomes. By examining a subpopulation of spontaneously conceived children of subfertile parents, the authors of this Danish study concluded that the IVF process itself was not responsible for the differences demonstrated.
These past studies are limited by examining historical birth cohorts dating back prior to the year 2001. Our study examines a more contemporary birth cohort (2005 to 2014), which is important given the advances in assisted reproductive techniques that have occurred since the turn of the century. IVF technologies that have evolved since this time include the introduction of blastocyst culture, vitrification, and single-embryo transfer [4,43]. Thus, our study findings are more generalisable to contemporary fertility practice. Importantly, our use of updated epidemiological and statistical methods means that, under the stated assumptions, the effects we have estimated have a causal interpretation. It is important that our methods are replicated in future studies to strengthen the existing evidence base.
Given the use of observational data, there were missing data and inherent differences in the covariate profile of the exposure cohorts. An a priori SAP was developed to overcome these limitations. First, inverse probability weighting with regression adjustment was used to mimic exchangeable treatment and control comparison groups, similar to those that would be generated by randomisation in a controlled trial. The success of this procedure is demonstrated by achieving adequate covariate balance and thus sufficient overlap of covariate distributions between exposure groups after inverse probability weighting (Figs A–D in S6 File). Second, we sought to mitigate the potential biases resulting from missing data. In order to do this, we performed multiple imputation of covariates included in our model and then compared the results of analyses that were based on complete cases with those of multiply imputed datasets (Tables A and B in S12 File).
It is possible that unmeasured common-cause confounders led to bias in estimating the ATEs. Many important factors (socioeconomic status, maternal age, and education) were identified a priori, measured, and included in the estimation procedure. Potential known but unmeasured sources of bias include subfertility and maternal BMI. Current evidence suggests that subfertility is likely to be associated with poorer childhood outcomes. Consequently, had this variable been measured and included in our causal model, adjusting for it would likely have favoured IVF-conceived children in our analysis. Maternal BMI is likely to act in the same direction: average BMI is likely to be higher in the IVF group (after accounting for socioeconomic position), and high BMI is associated with poorer perinatal and childhood outcomes [44].
Other unmeasured variables may also have influenced the outcome. Factors such as childcare attendance or grandparent involvement are preceded on causal pathways by covariates that were measured and included in the model, such as maternal age and socioeconomic status [45–47]. These factors were therefore considered to mediate rather than confound the relationship between these covariates and the outcome. Sensitivity analyses were performed to further evaluate unmeasured confounding, with E-values calculated for the AEDC and NAPLAN primary outcomes. Within the limitations of E-values, these analyses indicate that an unmeasured confounder with the magnitude of effect and prevalence needed to change our conclusions is unlikely to exist (Figs A and B in S11 File) [48].
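For readers unfamiliar with E-values, the following sketch shows the standard closed-form calculation for a risk ratio, together with the commonly used approximation for converting a standardised mean difference to a risk ratio. The function names are hypothetical, and treating the NAPLAN z-score difference as a standardised mean difference is an assumption made for illustration; the study's own E-value figures are those reported in S11 File.

```python
import math

def e_value(rr):
    """E-value for a risk ratio point estimate: RR + sqrt(RR * (RR - 1))."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1.0:
        rr = 1.0 / rr  # protective estimates are inverted before applying the formula
    return rr + math.sqrt(rr * (rr - 1.0))

def e_value_ci(lower, upper):
    """E-value for the confidence limit closest to the null (1.0 if the interval contains 1)."""
    if lower <= 1.0 <= upper:
        return 1.0
    return e_value(lower if lower > 1.0 else upper)

def rr_from_smd(d):
    """Approximate risk ratio from a standardised mean difference: RR ~ exp(0.91 * d)."""
    return math.exp(0.91 * d)

# Estimates quoted in this paper's abstract
print("AEDC point estimate:", e_value(0.97))
print("AEDC confidence interval:", e_value_ci(0.77, 1.25))
print("NAPLAN point estimate (approximate):", e_value(rr_from_smd(0.030)))
```

An E-value is the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need to have with both exposure and outcome to fully explain away the observed estimate. When the confidence interval already includes the null, the E-value for the interval is 1 by definition.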
Generalisation of our findings to all IVF births is a potential study limitation. As described in our Methods, observations with non-overlapping propensity scores (PSs) were excluded from analysis in order to meet the assumption of positivity, which is required for causal inference under the potential outcomes framework. Our findings therefore generalise to other IVF births only insofar as the baseline characteristics of the population of interest are comparable to those of the IVF cases analysed in our final cohort.
Due to the use of school-based outcome assessments, our cohort was limited to children attending school. AEDC, as a triennial assessment, limited our sample to children captured during assessment years, and the later years of our birth cohort had not yet reached the assessment age for NAPLAN outcomes to be captured. However, our study included 70% of the relevant birth cohort for the study timeframe, and in the years where both AEDC and NAPLAN data were available, over 95% of the Victorian birth cohort was sampled (Table A in S3 File). The remaining approximately 5% of children not sampled represent failed linkages as well as excluded IVF conceptions (due to non-Victorian IVF or non-IVF-assisted reproduction). A small percentage will also represent children with a disability significant enough to preclude attendance at mainstream school, introducing potential selection bias. Importantly, however, our study was not designed to assess severe disability or developmental delay, but rather an overall measure of global development and school achievement.
Furthermore, through the examination of school-based outcomes, our study was inherently designed to examine outcomes for liveborn children. “Live birth bias”, as it is known, is a recognised limitation of observational studies that investigate periconception and antenatal exposures [49]. For the purposes of this study, failed conception, miscarriage, and stillbirth are considered alternative endpoints and are less relevant to the research question, which aims to compare the school-age outcomes of children born following IVF conception with those of children who were conceived without assistance.
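One simple way to enforce overlap is to keep only observations whose propensity score lies within the range observed in both exposure groups. The sketch below illustrates that rule; the min/max common-support criterion, the function name, and the toy values are assumptions, as the exact exclusion rule used in the study is described in the Methods rather than here.

```python
import numpy as np

def common_support_mask(ps, treat):
    """Flag observations whose propensity score lies in the region shared by both groups."""
    lower = max(ps[treat == 1].min(), ps[treat == 0].min())
    upper = min(ps[treat == 1].max(), ps[treat == 0].max())
    return (ps >= lower) & (ps <= upper), (lower, upper)

# Toy example (hypothetical values): three exposed and three unexposed observations
ps = np.array([0.30, 0.55, 0.90, 0.05, 0.25, 0.60])
treat = np.array([1, 1, 1, 0, 0, 0])
mask, bounds = common_support_mask(ps, treat)
print("common support:", bounds)
print("retained:", mask.tolist())
```

Observations outside the shared range have no comparable counterpart in the other exposure group, so excluding them protects positivity at the cost of narrowing the population to which the estimate applies.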
Under the specified assumptions, this analysis has demonstrated that there is no causal effect within the population studied of IVF conception on early childhood developmental vulnerability and school-age educational outcomes. Compared with spontaneously conceived children, children conceived by IVF were no more likely to be developmentally vulnerable at school entry and had equivalent numeracy and literacy performance by age 7 to 9 years. These findings provide important reassurance for current and prospective parents and their treating clinicians.
S1 File. Methods: Description of outcome metrics.
Tables A and B. Table A. Successful linkages by birth year. Table B. Annual cycle summaries from major Victorian IVF providers 2010–2014.
Fig A. Analysis flow chart (NAPLAN).
S5 File. Methods: Multiple imputation model summary and model diagnostics.
Table A. Missing data summary (NAPLAN). Fig A. NAPLAN convergence. Fig B. NAPLAN density plots of observed and imputed data. Fig C. NAPLAN distribution of outcome and covariates after imputation in m = 1 dataset.
Figs A–D: Distribution and overlap of manually calculated stabilised weights. Fig A. AEDC imputation #1. Fig B. AEDC imputation #13. Fig C. NAPLAN imputation #1. Fig D. NAPLAN imputation #13.
Figs A–C: Variable standardised mean differences. Fig A. NAPLAN imputation #1 variable standardised mean differences. Fig B. NAPLAN imputation #13 variable standardised mean differences. Fig C. AEDC imputation #7 variable standardised mean differences.
Tables A and B. Table A. Sensitivity analysis–AEDC (special needs multiply imputed). Table B. Sensitivity analysis–AEDC (special needs excluded).
Tables A and B. Table A. Sensitivity analysis–NAPLAN (exempt multiply imputed). Table B. Sensitivity analysis–NAPLAN (exempt excluded).
Table A–Sensitivity analysis–binary outcomes TMLE.
Figs A and B. Fig A. Sensitivity analysis AEDC primary outcome–E-value estimation. Fig B. Sensitivity analysis NAPLAN primary outcome–E-value estimation.
Tables A and B. Table A. Traditional regression and treatment effect models (AEDC). Table B. Traditional regression and treatment effect models (NAPLAN).
This paper uses data from the Australian Early Development Census (AEDC). The AEDC is funded by the Australian Government Department of Education, Skills and Employment. The findings and views reported are those of the author(s) and should not be attributed to the Department or the Australian Government. We are grateful for the provision of data by the AEDC.
We are grateful to CCOPMM for providing access to the data used for this project and for the assistance of the staff at Safer Care Victoria. The conclusions, findings, opinions and views, or recommendations expressed in this paper are strictly those of the author(s). They do not necessarily reflect those of CCOPMM.
We are thankful for the contribution of Victorian IVF providers, Melbourne IVF, Monash IVF, and City Fertility Centre to this research. We acknowledge the significant amount of work undertaken on behalf of this project and appreciate the opportunity to work with staff from each unit.
Finally, we are grateful to the Australian Curriculum Assessment and Reporting Authority (ACARA) for their assistance, collaboration, and for providing the National Assessment Program for Literacy and Numeracy (NAPLAN) data.
- 1. Adamson GD, de Mouzon J, Chambers G, Zegers-Hochschild F, Ishihara O, Banker M, et al. International Committee for Monitoring Assisted Reproductive Technology: world report on assisted reproductive technology. 2017. Available from: https://www.icmartivf.org/reports-publications/.
- 2. European Society of Human Reproduction and Embryology. More than 8 million babies born from IVF since the world’s first in 1978: European IVF pregnancy rates now steady at around 36 percent, according to ESHRE monitoring. ScienceDaily. 2018 [cited 2021 Nov 8]. Available from: www.sciencedaily.com/releases/2018/07/180703084127.htm.
- 3. European Society of Human Reproduction and Embryology. World’s number of IVF and ICSI babies has now reached a calculated total of 5 million. ScienceDaily. 2012 [cited 2021 Nov 8]. Available from: www.sciencedaily.com/releases/2012/07/120702134746.htm.
- 4. Newman JE, Paul RC, Chambers GM. Assisted reproductive technology in Australia and New Zealand 2018. National Perinatal Epidemiology and Statistics Unit, the University of New South Wales, Sydney, 2020. Available from: https://npesu.unsw.edu.au/surveillance/assisted-reproductive-technology-australia-and-new-zealand-2018.
- 5. Australian Bureau of Statistics. Births, Australia. Canberra, Australia: Australian Bureau of Statistics, 2018. Available from: https://www.abs.gov.au/statistics/people/population/births-australia/2018.
- 6. Pandey S, Shetty A, Hamilton M, Bhattacharya S, Maheshwari A. Obstetric and perinatal outcomes in singleton pregnancies resulting from IVF/ICSI: a systematic review and meta-analysis. Hum Reprod Update. 2012;18(5):485–503. pmid:22611174.
- 7. Qin J, Liu X, Sheng X, Wang H, Gao S. Assisted reproductive technology and the risk of pregnancy-related complications and adverse pregnancy outcomes in singleton pregnancies: a meta-analysis of cohort studies. Fertil Steril. 2016;105(1):73–85.e6. pmid:26453266
- 8. Marino JL, Moore VM, Willson KJ, Rumbold A, Whitrow MJ, Giles LC, et al. Perinatal outcomes by mode of assisted conception and sub-fertility in an Australian data linkage cohort. PLoS ONE. 2014;9(1):e80398. Epub 2014/01/15. pmid:24416127; PubMed Central PMCID: PMC3885393.
- 9. Stromberg B, Dahlquist G, Ericson A, Finnstrom O, Koster M, Stjernqvist K. Neurological sequelae in children born after in-vitro fertilisation: a population-based study. Lancet. 2002;359(9305):461–5. Epub 2002/02/21. pmid:11853790.
- 10. Bay B, Mortensen EL, Hvidtjorn D, Kesmodel US. Fertility treatment and risk of childhood and adolescent mental disorders: register based cohort study. BMJ. 2013;347:f3978. Epub 2013/07/09. pmid:23833075; PubMed Central PMCID: PMC3702157.
- 11. Pinborg A, Loft A, Schmidt L, Andersen AN. Morbidity in a Danish national cohort of 472 IVF/ICSI twins, 1132 non-IVF/ICSI twins and 634 IVF/ICSI singletons: health-related and social implications for the children and their families. Hum Reprod. 2003;18(6):1234–43. Epub 2003/05/30. pmid:12773452.
- 12. Sandin S, Nygren KG, Iliadou A, Hultman CM, Reichenberg A. Autism and mental retardation among offspring born after in vitro fertilization. JAMA. 2013;310(1):75–84. Epub 2013/07/04. pmid:23821091.
- 13. Davies MJ, Moore VM, Willson KJ, Van Essen P, Priest K, Scott H, et al. Reproductive technologies and the risk of birth defects. N Engl J Med. 2012;366(19):1803–13. Epub 2012/05/09. pmid:22559061.
- 14. Leslie GI, Gibson FL, McMahon C, Cohen J, Saunders DM, Tennant C. Children conceived using ICSI do not have an increased risk of delayed mental development at 5 years of age. Hum Reprod. 2003;18(10):2067–72. Epub 2003/09/26. pmid:14507822.
- 15. Knoester M, Helmerhorst FM, Vandenbroucke JP, van der Westerlaken LA, Walther FJ, Veen S, et al. Cognitive development of singletons born after intracytoplasmic sperm injection compared with in vitro fertilization and natural conception. Fertil Steril. 2008;90(2):289–96. Epub 2007/11/06. pmid:17980875.
- 16. Mains L, Zimmerman M, Blaine J, Stegmann B, Sparks A, Ansley T, et al. Achievement test performance in children conceived by IVF. Hum Reprod. 2010;25(10):2605–11. Epub 2010/08/19. pmid:20716561.
- 17. Barbuscia A, Mills MC. Cognitive development in children up to age 11 years born after ART-a longitudinal cohort study. Hum Reprod. 2017;32(7):1482–8. Epub 2017/05/26. pmid:28541549; PubMed Central PMCID: PMC5850752.
- 18. Norrman E, Petzold M, Bergh C, Wennerholm UB. School performance in singletons born after assisted reproductive technology. Hum Reprod. 2018;33(10):1948–59. Epub 2018/08/31. pmid:30165380.
- 19. Wienecke LS, Kjaer SK, Frederiksen K, Hargreave M, Dalton SO, Jensen A. Ninth-grade school achievement in Danish children conceived following fertility treatment: a population-based cohort study. Fertil Steril. 2020;113(5):1014–23. Epub 2020/05/11. pmid:32386613.
- 20. Ronfani L, Vecchi Brumatti L, Mariuz M, Tognin V, Bin M, Ferluga V, et al. The Complex Interaction between Home Environment, Socioeconomic Status, Maternal IQ and Early Child Neurocognitive Development: A Multivariate Analysis of Data Collected in a Newborn Cohort Study. PLoS ONE. 2015;10(5):e0127052. Epub 2015/05/23. pmid:25996934; PubMed Central PMCID: PMC4440732.
- 21. Ferreira L, Godinez I, Gabbard C, Vieira JLL, Cacola P. Motor development in school-age children is associated with the home environment including socioeconomic status. Child Care Health Dev. 2018;44(6):801–6. Epub 2018/08/02. pmid:30066336.
- 22. Falster K, Hanly M, Banks E, Lynch J, Chambers G, Brownell M, et al. Maternal age and offspring developmental vulnerability at age five: A population-based cohort study of Australian children. PLoS Med. 2018;15(4):e1002558. Epub 2018/04/25. pmid:29689098; PubMed Central PMCID: PMC5915778.
- 23. Hanly M, Falster K, Banks E, Lynch J, Chambers GM, Brownell M, et al. Role of maternal age at birth in child development among Indigenous and non-Indigenous Australian children in their first school year: a population-based cohort study. Lancet Child Adolesc Health. 2020;4(1):46–57. Epub 2019/11/24. pmid:31757762.
- 24. Jeong J, Kim R, Subramanian SV. How consistent are associations between maternal and paternal education and child growth and development outcomes across 39 low-income and middle-income countries? J Epidemiol Community Health. 2018;72(5):434–41. Epub 2018/02/14. pmid:29439191.
- 25. Hernan MA. Methods of Public Health Research—Strengthening Causal Inference from Observational Data. N Engl J Med. 2021;385(15):1345–8. Epub 2021/10/02. pmid:34596980.
- 26. Hernán MA, Robins JM. Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available. Am J Epidemiol. 2016;183(8):758–64. Epub 20160318. pmid:26994063; PubMed Central PMCID: PMC4832051.
- 27. Hernán MA. Methods of Public Health Research—Strengthening Causal Inference from Observational Data. N Engl J Med. 2021;385(15):1345–1348. pmid:34596980
- 28. Flood MM, McDonald SJ, Pollock WE, Davey MA. Data accuracy in the Victorian Perinatal Data Collection: Results of a validation study of 2011 data. Health Inf Manag. 2017;46(3):113–26. Epub 20170127. pmid:28537203.
- 29. Davey MA, Sloan ML, Palma S, Riley M, King J. Methodological processes in validating and analysing the quality of population-based data: a case study using the Victorian Perinatal Data Collection. Health Inf Manag. 2013;42(3):12–9. pmid:24067237.
- 30. The Australian Early Development Census. ABOUT THE AEDC 2022. 2020 [cited 2022 Feb 18]. Available from: https://www.aedc.gov.au/about-the-aedc.
- 31. The Australian Curriculum Assessment and Reporting Authority (ACARA). The National Assessment Program for Literacy and Numeracy. Available from: https://nap.edu.au.
- 32. The Australian Curriculum Assessment and Reporting Authority (ACARA). Test development. 2018. Available from: https://www.nap.edu.au/about/test-development.
- 33. Australian Bureau of Statistics. Socio-Economic Indexes for Areas. 2016. Available from: https://www.abs.gov.au/websitedbs/censushome.nsf/home/seifa.
- 34. Tennant PWG, Murray EJ, Arnold KF, Berrie L, Fox MP, Gadd SC, et al. Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations. Int J Epidemiol. 2021;50(2):620–632. pmid:33330936
- 35. A Guide for Data Integration Projects Involving Commonwealth Data for Statistical and Research Purposes. Australian Government, National Statistical Service [23 July 2021]. Available from: https://statisticaldataintegration.abs.gov.au/topics/applying-the-separation-principle.
- 36. Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70(1):41–55.
- 37. Eddings W, Marchenko Y. Diagnostics for multiple imputation in Stata. Stata J. 2012;12(3):353–367.
- 38. Williamson EJ, Forbes A, Wolfe R. Doubly robust estimators of causal exposure effects with missing data in the outcome, exposure or a confounder. Stat Med. 2012;31:4382–400. pmid:23086504
- 39. Desai RJ, Franklin JM. Alternative approaches for confounding adjustment in observational studies using weighting based on the propensity score: a primer for practitioners. BMJ. 2019;367:l5657. Epub 2019/10/28. pmid:31645336.
- 40. Luque-Fernandez MA. “ELTMLE: Stata module to provide Ensemble Learning Targeted Maximum Likelihood Estimation”. Statistical Software Components S458337. revised 04 Jul 2021 ed: Boston College Department of Economics; 2017.
- 41. StataCorp. Stata Statistical Software, Release 17. College Station, Texas. 2021.
- 42. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2021.
- 43. Chambers GM, Paul RC, Harris K, Fitzgerald O, Boothroyd CV, Rombauts L, et al. Assisted reproductive technology in Australia and New Zealand: cumulative live birth rates as measures of success. Med J Aust. 2017;207(3):114–8. Epub 2017/08/03. pmid:28764619.
- 44. Pearce A, Scalzi D, Lynch J, Smithers LG. Do thin, overweight and obese children have poorer development than their healthy-weight peers at the start of school? Findings from a South Australian data linkage study. Early Child Res Q. 2016;35:85–94. Epub 2016/05/10. pmid:27158187; PubMed Central PMCID: PMC4850238.
- 45. Sadruddin AFA, Ponguta LA, Zonderman AL, Wiley KS, Grimshaw A, Panter-Brick C. How do grandparents influence child health and development? A systematic review. Soc Sci Med. 2019;239:112476. Epub 2019/09/21. pmid:31539783.
- 46. Chung EO, Hagaman A, LeMasters K, Andrabi N, Baranov V, Bates LM, et al. The contribution of grandmother involvement to child growth and development: an observational study in rural Pakistan. BMJ Global Health. 2020;5(8). Epub 2020/08/14. pmid:32784209; PubMed Central PMCID: PMC7418670.
- 47. Allen L, Kelly B, editors; Committee on the Science of Children Birth to Age 8: Deepening and Broadening the Foundation for Success. Transforming the Workforce for Children Birth Through Age 8: A Unifying Foundation. Washington (DC); 2015.
- 48. VanderWeele TJ, Ding P. Sensitivity Analysis in Observational Research: Introducing the E-Value. Ann Intern Med. 2017;167(4):268–74. Epub 20170711. pmid:28693043.
- 49. Goin DE, Casey JA, Kioumourtzoglou MA, Cushing LJ, Morello-Frosch R. Environmental hazards, social inequality, and fetal loss: Implications of live-birth bias for estimation of disparities in birth outcomes. Environ Epidemiol. 2021;5(2):e131. Epub 20210226. pmid:33870007; PubMed Central PMCID: PMC8043739.