Reader Comments

Lessons to be learned from the DESiGN Trial

Posted by pcorcoran on 14 Jul 2022 at 20:24 GMT

Randomised controlled trials (RCTs) are challenging undertakings and this is especially true for cluster RCTs of complex interventions. The DESiGN Collaborative Group are to be commended for their cluster RCT of the Growth Assessment Protocol (GAP) intervention to improve antenatal detection of small for gestational age (SGA) [1]. At the outset, the Group faced recruiting from a limited number of maternity units, primarily in London. None of these units had adopted the GAP intervention while 64% of all UK units had done so. This suggests there was a risk that these units would be reluctant or poor adopters of the intervention. Even though 13 of the 16 invited sites enrolled, two of seven intervention sites never even contacted the GAP provider and implementation by the other five was sub-optimal.

Unsurprisingly, randomisation of 13 clusters (and exclusion of the two non-starters) failed to produce comparison groups amenable to providing a valid estimate of effectiveness, i.e. similar in all respects except for access to the intervention. Women in the intervention sites were less likely to be of white ethnicity, less likely to live in affluent areas and more likely to be having their first baby, all relevant to risk and detection of SGA. Striking site differences were evident at baseline with greater utilisation of ultrasound and greater antenatal detection of SGA in the intervention sites.

Masking can be as important as randomisation in providing unbiased assessments of intervention effectiveness in RCTs. The article's only reference to masking states that due to the nature of the intervention, concealment was not possible. Presumably, the pregnant women in each arm were unaware of the study so their antenatal behaviour would not have been influenced. Some degree of masking may have been possible for staff, for example, concealing from staff in the standard care sites that they were in the control arm of an RCT that would be compared against other sites implementing the GAP program. This might have reduced the risk of changes in standard care, changes that seem to have happened. Concealment from those conducting the statistical analysis may have been feasible and could have been useful.

The Saving Babies' Lives care bundle was being rolled out nationally during the study period, creating a serious risk of concurrent implementation of interventions that would confound efforts to assess effectiveness of the GAP program versus standard care. We are informed that most of the five London-based standard care clusters chose to implement some of the care bundle strategies but they were exempted from implementing the component related to fetal growth restriction. The authors considered it unethical to stop standard care clusters that were willing to implement concomitant strategies for improved detection of SGA and prevention of stillbirths. Knowing that the units were exempt but would not be stopped does not tell us what actually happened. If they did implement measures to improve antenatal detection of SGA, as appears to have happened, then details of these measures and the extent of their implementation should have been provided.

In their Discussion, the authors state that they did not perform statistical testing to assess for changes between the prerandomisation and outcome periods as per prespecified analysis plan. This is a significant omission from the analysis plan. This analysis would have highlighted the significant increase in utilisation of ultrasound and detection of SGA in the standard care sites, which is understated in the published article and, apparently, was not mentioned at all in the original submission. As stated in the article, the primary aim of the DESiGN trial was to determine whether implementation of GAP results in improved ultrasound detection of SGA, when compared to standard care. To improve is to make or become better. It does not mean to be better. Therefore, the primary aim required change over time in intervention sites to be compared to change over time in standard care sites, rather than comparing intervention and standard care sites after implementation.

We made this comparison using the numerators and denominators specified in Table F of Appendix S3, which reported improvement in ultrasound detection of SGA from 26.2% to 28.0% in intervention sites and from 19.0% to 27.8% in standard care sites. Table F reported the effect of GAP implementation as very slightly better detection of SGA in intervention sites: by 2.6 percentage points (95% CI: -6.7, 11.9) before, and by 0.1 percentage points (95% CI: -13.4, 13.5; p-value=0.99) after, adjusting for baseline, age, ethnicity, parity and stratification factor. However, improvement in intervention sites (1.8 percentage points) was seven percentage points less than improvement in standard care sites (8.8 percentage points), a statistically significant effect (95% CI: -11.9, -2.2; p-value=0.005) based on binary regression with robust standard errors. Although we could not adjust for clustering or confounding factors that differed between the trial arms at baseline, this shows that the DESiGN trial did observe improvement in ultrasound detection of SGA; implementation of GAP resulted in improved detection in the sites that did not implement it.
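The point estimate of this difference in change can be reproduced from the four percentages quoted above. The following is a minimal sketch for illustration only, not the analysis code: the variable and function names are assumptions, and it does not reproduce the confidence interval or p-value, which require the numerators and denominators from Table F and a regression model.

```python
# Ultrasound detection of SGA (%), as quoted above from Table F of Appendix S3.
# The dictionary layout and names are illustrative assumptions.
detection = {
    "intervention": {"before": 26.2, "after": 28.0},
    "standard_care": {"before": 19.0, "after": 27.8},
}


def change(arm):
    """Change in detection, in percentage points, between the two periods."""
    # Rounded to one decimal place to match the precision of the inputs.
    return round(arm["after"] - arm["before"], 1)


change_intervention = change(detection["intervention"])
change_standard = change(detection["standard_care"])

# Difference in change: negative values mean less improvement
# in intervention sites than in standard care sites.
difference_in_change = round(change_intervention - change_standard, 1)

print(f"Change in intervention sites:  {change_intervention:+.1f} points")
print(f"Change in standard care sites: {change_standard:+.1f} points")
print(f"Difference in change:          {difference_in_change:+.1f} points")
```

This crude calculation ignores clustering and baseline differences between the arms, so it only illustrates where the seven percentage point figure comes from.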

The cluster-summary approach used to assess intervention effects was inefficient, with wide confidence intervals for the differences between intervention and standard care sites. The one departure from the cluster-summary approach led to a statistically significant finding regarding stillbirth. The 0.36% risk in standard care sites and 0.31% risk in intervention sites were reported as a statistically significant adjusted difference of -0.07 (95% CI: -0.14, -0.01; p-value=0.03; Table 5) based on logistic regression with robust standard errors. The cluster-summary approach showed no evidence of an effect on stillbirth (adjusted difference = -0.06, 95% CI: -0.20, 0.09; p-value=0.40; Table H).

In general, it is accepted that the RCT is the best study design to establish the effectiveness of health interventions. However, this can only apply if the key elements of the study design are delivered successfully. Unfortunately, this was not the case for the DESiGN trial. The randomisation failed, the intervention delivery was suboptimal, the controls received intensified care, assessment of the intervention effect focused too much on outcomes after implementation rather than on change in outcomes, and the findings were sensitive to the statistical approach adopted. As a result, lessons from the DESiGN trial relate more to cluster RCT implementation than GAP implementation.

Paul Corcoran (1), Sara Leitao (1), Keelin O’Donoghue (2), Richard A Greene (1)
(1) National Perinatal Epidemiology Centre and (2) INFANT Research Centre, University College Cork, Ireland

References
1. Vieira M, Relph S, Mureut-Gutierrez W, Elstad M, Bolaji C, Moitt N, et al. Evaluation of the Growth Assessment Protocol (GAP) for antenatal detection of small for gestational age: The DESiGN cluster randomised trial. PLoS Med. 2022; 19(6):e1004004.

No competing interests declared.