The effectiveness of the quality improvement collaborative strategy in low- and middle-income countries: A systematic review and meta-analysis

  • Ezequiel Garcia-Elorrio ,

    Roles Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Healthcare quality and safety department, Instituto de Efectividad Clínica y Sanitaria (IECS-CONICET), Buenos Aires, Argentina

  • Samantha Y. Rowe,

    Roles Data curation, Formal analysis, Methodology, Resources, Software, Validation, Writing – review & editing

    Affiliations Malaria Branch, Division of Parasitic Diseases and Malaria, Center for Global Health, Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America, CDC Foundation, Atlanta, Georgia, United States of America

  • Maria E. Teijeiro,

    Roles Formal analysis, Resources, Validation, Writing – review & editing

    Affiliation Quality Department, Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia (FLENI), Escobar, Buenos Aires Province, Argentina

  • Agustín Ciapponi ,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Software, Validation, Writing – original draft, Writing – review & editing

    ‡ AC and AKR are joint senior authors on this work.

    Affiliation Argentine Cochrane Centre, Instituto de Efectividad Clínica y Sanitaria (IECS-CONICET), Buenos Aires, Argentina

  • Alexander K. Rowe

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Resources, Validation, Writing – original draft, Writing – review & editing

    ‡ AC and AKR are joint senior authors on this work.

    Affiliation Malaria Branch, Division of Parasitic Diseases and Malaria, Center for Global Health, Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America



Quality improvement collaboratives (QICs) have been used to improve health care for decades. Evidence on QIC effectiveness has been reported, but systematic reviews to date have little information from low- and middle-income countries (LMICs).


To assess the effectiveness of QICs in LMICs.


We conducted a systematic review following Cochrane methods, the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach for quality of evidence grading, and the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) statement for reporting. We searched published and unpublished studies between 1969 and March 2019 from LMICs. We included papers that compared usual practice with QICs alone or combined with other interventions. Pairs of reviewers independently selected and assessed the risk of bias and extracted data of included studies. To estimate strategy effectiveness from a single study comparison, we used the median effect size (MES) in the comparison for outcomes in the same outcome group. The primary analysis evaluated each strategy group with a weighted median and interquartile range (IQR) of MES values. In secondary analyses, standard random-effects meta-analysis was used to estimate the weighted mean MES and 95% confidence interval (CI) of the mean MES of each strategy group. This review is registered with PROSPERO (International Prospective Register of Systematic Reviews): CRD42017078108.


Twenty-nine studies were included; most (21/29, 72.4%) were interrupted time series studies. Evidence quality was generally low to very low. Among studies involving health facility-based health care providers (HCPs), for “QIC only”, effectiveness varied widely across outcome groups and tended to have little effect for patient health outcomes (median MES less than 2 percentage points for percentage and continuous outcomes). For “QIC plus training”, effectiveness might be very high for patient health outcomes (for continuous outcomes, median MES 111.6 percentage points, range: 96.0 to 127.1) and HCP practice outcomes (median MES 52.4 to 63.4 percentage points for continuous and percentage outcomes, respectively). The only study of lay HCPs, which used “QIC plus training”, showed no effect on patient care-seeking behaviors (MES -0.9 percentage points), moderate effects on non-care-seeking patient behaviors (MES 18.7 percentage points), and very large effects on HCP practice outcomes (MES 50.4 percentage points).


The effectiveness of QICs varied considerably in LMICs. QICs combined with other intervention components, such as training, tended to be more effective than QICs alone. The low evidence quality and large effect sizes for QIC plus training justify additional high-quality studies assessing this approach in LMICs.


Major failures in health care have been reported elsewhere but are most evident in low- and middle-income countries (LMICs). An evaluation of the health-related Millennium Development Goals (MDGs) found that, in 2015, when they were to be achieved, major health care quality gaps still were present in LMICs, which ignited a strong demand for quality improvement [1]. The MDGs have now been replaced by the Sustainable Development Goals (SDGs), instituted by the United Nations with the aim to contribute to the achievement of universal health coverage with quality care for all [2]. Concurrently in 2017, The Lancet Global Health Commission on High-Quality Health Systems in the SDG Era was established to review current knowledge, conduct new focused research, and propose policies for measuring and improving health care quality to reach new levels of performance in LMICs. This Commission advocated for a revision of methods that could contribute to the advance of the field of quality of care worldwide [3].

Among the several quality improvement strategies available, quality improvement collaboratives (QICs) (also known as collaborative improvement and learning collaboratives) have been used to improve health care for several decades [4]. However, reporting on specific components of QICs has been imprecise [5].

Formal QICs involve the use of healthcare teams from different sites to improve performance on a specific topic by collecting data and testing ideas with improvement cycles (usually plan-do-study-act cycles, involving planning a change, trying it, observing the results, and acting upon what is learned) supported by coaching and learning sessions [6]. QICs are supported by the concept that district managers and networks of facilities can be harnessed into learning systems that accelerate improvement in health care performance, with the potential to achieve results at large scale. The district level of the health system is well positioned to facilitate systematic group learning among facilities of similar types and across tiers of the health system. District-led area-based learning and planning bring together providers and administrators responsible for a catchment area to solve clinical and system problems, harmonize approaches, maximize often limited resources and create better communication and referral between facilities [7].

The use of QICs has increased rapidly despite the absence of strong evidence for effectiveness, cost-effectiveness or long-term impact. Published systematic reviews on QICs, which predominantly include studies from high-income countries, show modest improvements, particularly when addressing straightforward aspects of care where there is a clear gap between recommended and actual practice. There is still limited information from LMICs, unpublished studies, or non-English studies [8–10].

Recently, an extensive systematic review has been published characterizing the effectiveness of a wide array of strategies to improve health care provider (HCP) performance in LMICs (the Health Care Provider Performance Review, or HCPPR) [11]. Although this review includes QICs, thus far, these strategies have been analyzed under the broader strategy category of “group problem solving,” which includes other, non-QIC, strategies. Additionally, the most recent literature search for the HCPPR was conducted in May 2016.

The objective of this work was specifically to estimate the effectiveness of QICs in LMICs using data from the HCPPR and results of studies from an updated literature search. We aimed to inform decisions about whether to use QICs and how best to implement them, to identify knowledge gaps on QICs in LMICs, and to provide direction for future evaluations of this strategy.

Materials and methods

We conducted a systematic review following Cochrane Collaboration methods and the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) statement for reporting [12, 13]. The study protocol was registered in PROSPERO International prospective register of systematic reviews (registration number CRD42017078108).

Study eligibility criteria

Type of study designs.

Studies meeting the Cochrane Effective Practice and Organisation of Care (EPOC) Review Group criteria for inclusion in a systematic review of interventions [14]:

  1. Randomized controlled trials (RCTs)
  2. Controlled before-and-after (CBA) trials
  3. Interrupted time series (ITS) designs with at least 3 data points before and after the intervention, with or without comparison groups

Types of participants.

HCPs (and patients that they care for) from LMICs (defined as countries with a low or middle-income economy, according to the World Bank at the time of the literature search) [15]. HCPs included hospital-, clinic-, and community-based health workers, pharmacists, and medicine vendors.

Type of intervention.

Studies were included if they had an intervention arm exposed to QIC with or without other strategy components (e.g., training) compared to a non-exposed control group (or historical controls, for ITS studies) that could be defined as usual practice. QIC was defined as a strategy with the following core elements: a) a team of experts (in clinical care and quality improvement) involved in bringing together the scientific evidence, practical contextual knowledge and quality improvement methods, usually within a “change package” or toolkit; b) multiple teams from multiple sites that chose to participate; c) a model or framework for improvement that included measurable aims, data collection, implementation and evaluation of small tests of change; and d) a set of structured activities that promoted a collaborative process to learn and share ideas, innovations, and experiences (e.g., face-to-face or virtual meetings; visits to other sites; visits by experts or facilitators; web-based activities to report changes, results and comparisons with other teams; and coaching and feedback by improvement experts). The comparator was non-exposed control groups that represent usual practice.

Type of outcomes.

There was no restriction on outcome type. Outcomes were grouped into the following categories.

  • Facilitators (i.e., elements that facilitate HCP performance, such as supplies and HCP knowledge)
  • Health worker practices (i.e., processes of care, such as correct treatment)
  • Patient health outcomes
  • Patient behaviors related to care-seeking or use of health services
  • Other patient behaviors (i.e., those not related to care-seeking, such as adherence to treatment regimen)

Effect sizes were based on primary outcomes, with the following exclusions.

  • For outcomes expressed as a percentage, effect sizes based on <20 observations per study group and time point, for a given comparison
  • Effect sizes based on a simulation study and not actually observed data
  • Effect sizes for which baseline and follow-up measures in the intervention group were both 100%, as this indicates that HCP performance in the intervention group had no room for improvement and did not worsen over time. Similarly, for HCP practice outcomes expressed as a percentage, we excluded effect sizes based on a baseline value of 95% or greater, as there was little room for improvement.
  • Effect sizes based on outcome measures that were not taken at comparable times between study groups. For example, if the outcome for a control group was measured at –1 month, 3 months, and 9 months since the intervention began, and the outcome for an intervention group was measured at –1 month, 3 months, and 21 months since the intervention began, the effect size based on the 9-month and 21-month outcome measures would be ineligible.
  • Outcomes from ITS studies for which the time series was highly unstable and thus could not be reliably modeled, and outlier outcome measures that probably did not represent the true trend in HCP performance.
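
The rule-based exclusions above can be sketched as a simple filter. This is a minimal illustration in Python, not the review's actual code: the `EffectSize` record, its field names, and the `"hcp_practice"` label are all hypothetical, and the judgment-based exclusion of unstable ITS series or outliers is left out because it cannot be reduced to a fixed rule.

```python
from dataclasses import dataclass


@dataclass
class EffectSize:
    # All field names are hypothetical, chosen only for this illustration.
    outcome_group: str             # e.g. "hcp_practice", "patient_health"
    is_percentage: bool            # outcome expressed as a percentage
    min_observations: int          # smallest n per study group and time point
    simulated: bool                # derived from a simulation, not observed data
    baseline_intervention: float   # baseline value in the intervention group
    followup_intervention: float   # follow-up value in the intervention group
    comparable_timing: bool = True # measured at comparable times across groups


def is_eligible(es: EffectSize) -> bool:
    """Apply the rule-based exclusions listed in the text."""
    # Percentage outcomes need at least 20 observations per group and time point.
    if es.is_percentage and es.min_observations < 20:
        return False
    # Simulated (not actually observed) data are excluded.
    if es.simulated:
        return False
    # Baseline and follow-up both at 100%: no room for improvement.
    if (es.is_percentage and es.baseline_intervention == 100
            and es.followup_intervention == 100):
        return False
    # HCP practice percentage outcomes with a baseline of 95% or greater.
    if (es.is_percentage and es.outcome_group == "hcp_practice"
            and es.baseline_intervention >= 95):
        return False
    # Outcome measures taken at non-comparable times between study groups.
    if not es.comparable_timing:
        return False
    return True
```

For example, `is_eligible(EffectSize("hcp_practice", True, 50, False, 96.0, 99.0))` is rejected by the 95% baseline rule, while the same values under a patient health outcome group pass.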

Search strategy

The literature search was conducted in two phases (see S1 File for details). In summary, we first searched results of the HCPPR, which is a comprehensive systematic review of the effectiveness of strategies to improve health worker performance in LMICs. The HCPPR study team searched 52 electronic databases for published studies and 58 document inventories for unpublished studies from the 1960s to 2016, screened personal libraries, asked colleagues for unpublished studies, and performed hand searches of 854 bibliographies from previous reviews. Second, we updated the HCPPR literature search with a focus on studies of QICs (search date was March 15, 2019). This update involved the search of electronic databases (S1 File, page 14), screening bibliographies of included study reports (referred to as “reports from additional sources” in Fig 1), and seeking reports from colleagues. There were no language restrictions.

Data collection

In the first phase of the review, a team of researchers assessed study eligibility, and each researcher screened studies independently. Before the screening began, concordance testing was conducted against a “gold standard” list of reports until at least 80% was identified by each researcher. In the second phase of the review, a pair of investigators (MET, EGE) independently assessed study eligibility, and discrepancies were reconciled in consultation with a third team member (AC). The study eligibility process was conducted using Covidence© from the Cochrane Collaboration. Also, two investigators (AKR, SYR) assessed the eligibility of study reports that we received from colleagues. Data were extracted from the included studies independently by a pair of investigators (SYR, AKR) or researchers using a standardized form, and discrepancies were resolved through discussion. Before beginning data extraction, concordance testing of all data abstractors was conducted until the percent agreement between individual abstractors and a gold standard set of abstracted data (based on consensus by investigators SYR, AKR) was at least 80%. Data from each study were entered into a Microsoft Access database (Microsoft Inc., Redmond, Washington). Data elements included: study setting (where, when, HCP types, other contextual factors), study design, health conditions addressed, strategy description, outcome description, outcome measurements, the timing of outcome measurements in relation to the implementation of the strategy, effect sizes, sample sizes, sampling details, and data elements needed to assess risk of bias (RoB). If details regarding study characteristics or the QIC intervention were not available in study reports, we contacted study authors. Except for the purpose of meta-analysis, missing data were not imputed. For meta-analysis, we used estimates of standard errors of effect sizes that were available from the HCPPR database. A small proportion of the standard error estimates for percentage outcomes from the HCPPR database were based on imputed data (usually because sample size data were missing). Effect sizes with missing standard errors were excluded from meta-analysis.

Risk of bias (quality) assessment

We categorized RoB with methods based on guidance from the Cochrane EPOC Group [16]. RoB at the study level was categorized as low, moderate, high, or very high. We assessed the following RoB domains: number of clusters per study arm, completeness of dataset, balance in baseline outcome measurements, balance in baseline characteristics, reliability of outcomes, adequacy of concealment of allocation (where relevant), intervention unlikely to affect data collection, intervention plausibly independent of other changes, and number of data points before and after the intervention.

We used the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach to assess the quality of evidence related to each of the key outcomes [17]. For assessments of the overall quality of evidence for each outcome, randomized studies, ITS studies, and other non-randomized studies started at “high quality”, “moderate quality” and “low quality” of evidence, respectively. Although the traditional approach is to start non-randomized studies as “low quality” [18], ITS studies with multiple periods and measurements during each period with no other limitations may constitute “moderate quality” of evidence [19, 20]. We downgraded the study one or two levels depending on the extent of violation across the following criteria: study limitations (RoB); indirectness of evidence; inconsistency; imprecision of effect estimates; or publication bias. If we did not find study limitations, we upgraded the evaluation of the quality of the evidence when the pooled estimates revealed negligible concerns about confounders, a strong dose-response gradient, or a large magnitude of effect. Considering a mean baseline health worker performance level at 40% for a process-of-care outcome expressed as a percentage, an absolute increase of 40 percentage points or more, representing a relative risk of 2 or more, allowed us to upgrade the quality of evidence by one level.
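
As a rough check of the upgrade threshold, and assuming the comparison level stays at the 40% baseline (an assumption made here only for illustration), an absolute increase of 40 percentage points corresponds to:

```latex
\mathrm{RR} = \frac{40\% + 40\%}{40\%} = \frac{80\%}{40\%} = 2
```

Any larger absolute increase from the same baseline gives a ratio above 2.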

Data synthesis

Effect sizes were defined as absolute percentage-point differences; positive values meant improvement.

In non-ITS studies with pre- and post-intervention outcome measures, for outcomes that were dichotomous or expressed as a percentage, the effect size was calculated with Eq 1.
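
Eq 1 is not reproduced in this text. A form consistent with the surrounding definitions (an absolute percentage-point difference, with pre- and post-intervention measures in both study groups) would be the difference-in-differences below; the notation is an assumption for illustration and may differ from the published equation.

```latex
\text{Effect size} =
  (\text{follow-up} - \text{baseline})_{\text{intervention}}
  - (\text{follow-up} - \text{baseline})_{\text{control}}
\tag{1}
```

For instance, if the intervention group rose from 40% to 70% and the control group from 42% to 50%, this form would give 30 − 8 = 22 percentage points.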


In non-ITS studies with pre- and post-intervention outcome measures, for outcomes that were continuous but not obviously bounded (e.g., a mortality rate), the effect size was calculated with Eq 2.
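
Eq 2 is likewise not reproduced in this text. One plausible form, assumed here because unbounded continuous outcomes have no natural percentage scale, expresses each group's change relative to its own baseline before differencing; treat both the form and the notation as assumptions rather than the published equation.

```latex
\text{Effect size} =
  100\% \times \left( \frac{\text{follow-up} - \text{baseline}}{\text{baseline}} \right)_{\text{intervention}}
  - 100\% \times \left( \frac{\text{follow-up} - \text{baseline}}{\text{baseline}} \right)_{\text{control}}
\tag{2}
```

Under this form, values above 100 percentage points are possible, since a continuous outcome can more than double relative to its baseline.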


For ITS studies, segmented linear regression modeling was performed to estimate a summary effect size that incorporated both the level and trend effects. The summary effect size was the outcome level at the mid-point of the follow-up period as predicted by the regression model minus a predicted counterfactual value that equals the outcome level based on the pre-intervention trend extended to the mid-point of the follow-up period. This summary effect size was used because it allows the results of ITS studies to be combined with those of non-ITS studies.
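
The summary effect size described above can be sketched as follows. This is a minimal illustration, not the review's code: the function names are hypothetical, the mid-point is taken as halfway between the first and last follow-up measurement times (an assumption), and no adjustment for autocorrelation is made. Fitting separate least-squares lines to the pre- and post-intervention segments is equivalent to a segmented model with both a level-change and a trend-change term.

```python
def fit_line(points):
    """Ordinary least squares line through (time, value) points.

    Returns (slope, intercept). Requires at least two distinct time values.
    """
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def its_summary_effect(pre_points, post_points):
    """Predicted level at the follow-up mid-point minus the counterfactual.

    The counterfactual is the pre-intervention trend extended to the same
    mid-point, so the result combines the level and trend effects.
    """
    pre_slope, pre_intercept = fit_line(pre_points)
    post_slope, post_intercept = fit_line(post_points)
    post_times = [t for t, _ in post_points]
    midpoint = (min(post_times) + max(post_times)) / 2.0  # assumed definition
    predicted = post_intercept + post_slope * midpoint
    counterfactual = pre_intercept + pre_slope * midpoint
    return predicted - counterfactual
```

With a pre-intervention series rising one point per month and a post-intervention series that jumps by 10 points and then rises three points per month, the summary effect at the follow-up mid-point combines the jump with the accumulated difference in slopes.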

To estimate strategy effectiveness from a single study comparison, the effect size was defined as the median of all effect sizes (MES) in the comparison for outcomes in the same outcome category. Results were stratified by HCP type (health facility-based vs. lay or community HCP).
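
The MES calculation can be sketched in a few lines of Python. The tuple layout and the identifiers used for comparisons and outcome categories are hypothetical, chosen for illustration.

```python
from collections import defaultdict
from statistics import median


def median_effect_sizes(effect_sizes):
    """Collapse effect sizes to one MES per (comparison, outcome category).

    effect_sizes: iterable of (comparison_id, outcome_category, effect_size).
    Returns a dict mapping (comparison_id, outcome_category) to the median.
    """
    grouped = defaultdict(list)
    for comparison_id, outcome_category, effect_size in effect_sizes:
        grouped[(comparison_id, outcome_category)].append(effect_size)
    return {key: median(values) for key, values in grouped.items()}
```

Taking the median within each comparison and outcome category means a study that reports many outcomes contributes a single value per category, the same as a study that reports one.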

For the primary analysis, we reported median, interquartile range, minimum, and maximum MES. The median effect size has been used in other systematic reviews of strategies to improve HCP performance [21, 22]. Median MES for strategy groups that were based on fewer than five study comparisons were not weighted, as weighting with small samples might cause the median to be a poor measure of central tendency when outliers are present. Median MES for strategy groups with five or more study comparisons were weighted, where the weight = 1 + the natural logarithm of the number of HCPs or (if the number of HCPs in a study was not reported) the number of service provision sites (e.g., health facilities) or (if the number of service provision sites was not reported) the number of administrative areas (e.g., districts) in the study. Strategy groups tested by at least three study comparisons were considered to have enough evidence to form generalizations—although caution is increasingly warranted as the minimum of three comparisons is approached. Strategy groups tested by only one or two study comparisons were interpreted separately.
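
The weighting rule can be sketched as below. The function names are hypothetical, and the weighted median is defined here as the smallest value at which the cumulative weight reaches half of the total weight; the text does not specify which weighted-median definition was used, so that choice is an assumption.

```python
import math
from statistics import median


def comparison_weight(n_hcps=None, n_sites=None, n_areas=None):
    """Weight = 1 + ln(n), using the first size measure that was reported."""
    for n in (n_hcps, n_sites, n_areas):
        if n is not None:
            return 1.0 + math.log(n)
    raise ValueError("no size information available for this comparison")


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    return None  # only reached for empty input


def strategy_group_median(mes_values, weights):
    """Unweighted median below five comparisons, weighted median otherwise."""
    if len(mes_values) < 5:
        return median(mes_values)
    return weighted_median(mes_values, weights)
```

The logarithm keeps very large studies from dominating: under this rule a comparison with 1,000 HCPs has a weight of about 7.9, compared with about 3.3 for a comparison with 10 HCPs.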

In a secondary analysis, standard random-effects meta-analysis was used to estimate the weighted mean MES and 95% confidence interval (CI) of the mean MES of each strategy group. We used I² as a measure of consistency for each meta-analysis, considering low heterogeneity <30%, moderate heterogeneity 30–60%, and high heterogeneity >60% [23]. We conducted a meta-analysis on one median effect size per study comparison for each outcome group, and we performed a sensitivity analysis considering all effect sizes individually to test consistency of the results.
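
A random-effects pooling step of this kind can be sketched as follows. The text says only "standard random-effects meta-analysis", so the DerSimonian-Laird estimator used here is an assumption, as are the function names and the normal-approximation 95% CI; the heterogeneity labels follow the thresholds stated above.

```python
import math


def random_effects_meta(effects, std_errors):
    """DerSimonian-Laird random-effects pooling (an assumed estimator).

    effects: one MES per study comparison; std_errors: their standard errors.
    Returns the pooled mean, 95% CI, between-study variance, and I-squared.
    """
    k = len(effects)
    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    mean = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_mean = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "mean": mean,
        "ci_low": mean - 1.96 * se_mean,
        "ci_high": mean + 1.96 * se_mean,
        "tau2": tau2,
        "i2": i2,
    }


def heterogeneity_label(i2):
    """Classify I-squared (in percent) using the thresholds given in the text."""
    if i2 < 30:
        return "low"
    if i2 <= 60:
        return "moderate"
    return "high"
```

Effect sizes with missing standard errors cannot enter this calculation, which matches the exclusion described under Data collection.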

Publication bias was assessed by visually inspecting funnel plots for asymmetry, for strategy-outcome groups with at least 10 studies.


During the first phase of the literature search, 216,477 citations were identified (S1 File). After screening and assessing eligibility, 46 reports from 25 studies were included (left side of Fig 1). In the second phase, which updated the search through 15 March 2019, 3,207 articles were identified, and seven more reports from four studies were included after removing duplicates. Altogether, 53 reports from 29 studies with 30 study comparisons were included for this systematic review (Fig 1).

Description of included studies

The included studies were published between 2008 and 2019, from 12 LMICs on four continents. Most studies (24/29, 82.8%) were from Africa, three were from the Russian Federation, and one each was from Georgia and Mexico (Table 1). Most studies were ITS studies without controls (21/29, 72.4%), two were CBAs with randomized controls, three were CBAs with non-randomized controls, two were post-only cluster randomized trials (CRTs), and one was an ITS study with controls.

Fig 2 presents the RoB of included studies individually by specific domains. Most studies (25/29, 86.2%) had a high or very high RoB. Two studies had a moderate RoB and two had a low RoB. The 30 study comparisons from 29 studies tested six different strategies that included QICs (Table 2). The most commonly tested QIC intervention had no additional strategy components (21 study comparisons). Other QIC interventions that were tested usually combined QIC with training, with or without additional components. The median study follow-up time was about one year.

Fig 2. Risk of bias of included studies: Summary and by domain item.

√ Yes/done; Unclear; X No/not done; NA Not Applicable. CBA (NRC): Controlled Before-After study with non-randomized controls; CBA (RC): Pre-post study with randomized controls; CITS: Controlled interrupted time series (with non-randomized controls); HCPFI: Health Care Professional-directed financial incentives; ITS: Interrupted time series; OMT: Other management techniques; POS-CRT: Post-only study-Cluster randomized trial; QIC: Quality Improvement Collaborative; R&G: Regulation and governance; S: Supervision; SI: Strengthening infrastructure; TR: Training.

Table 2. Number of comparisons and risk of bias by quality improvement collaborative strategy.

In our assessment of publication bias, no strategy-outcome group had the minimum of 10 studies. However, for the one strategy-outcome group with the most studies (QIC intervention, health worker practice outcomes expressed as a percentage, n = 9 studies), the funnel plot revealed no evidence of asymmetry (S2 File).

Effect of interventions

The findings are summarized in Table 3, which presents QIC intervention effectiveness in terms of median MES (left column) and mean MES (right column and S2 File) from the random effects meta-analysis. Individual effect sizes are presented in Table 1. We had five main findings. First, for the “QIC only” strategy, effectiveness varied widely across outcome groups. For patient behaviors not related to care-seeking, the effect was moderate (median MES: 17.6 percentage points) (Table 3, row 3). For patient health outcomes, there was essentially no effect (0.3 and 1.4 percentage points for percentage and continuous outcomes, respectively). The results ranged from modestly to highly effective for health worker practice outcomes (30.2 to 44.2 percentage points) and patient care-seeking outcomes (7.7 to 62.2 percentage points).

Second, for the “QIC + training” strategy for health facility-based HCPs, although there were only 4 studies, effectiveness was very high: MES 52.4 to 63.4 percentage points for health worker practice outcomes, 111.6 percentage points for patient health outcomes, and 87.7 percentage points for non-care-seeking patient behaviors (Table 3, rows 6–8). An additional study on a similar strategy (QIC + training + other management techniques) also found very high effectiveness (101.1 percentage points) for its one outcome on care-seeking patient behaviors.

Third, for the “QIC + training + strengthening infrastructure (bicycles for facilitators) + supervision + other management techniques (group process between HCP and community)” strategy, the one study found essentially no effect (MES 0.1 percentage points, for patient health outcomes) (Table 3, row 10). Fourth, for the “QIC + strengthening infrastructure (report cards) + regulation and governance (community scorecards)” strategy, the effectiveness from two studies ranged from essentially no effect (-2.8 percentage points, for non-care-seeking patient behaviors) to modest effect (9.5 percentage points, for care-seeking patient behaviors) (Table 3, rows 11–12).

Finally, the one study of lay health workers found highly variable results, ranging from essentially no effect (-0.9 percentage points, for care-seeking patient behaviors) to moderately large effects (18.7 percentage points, for non-care-seeking patient behaviors) to very large effects (50.4 percentage points, for health worker practice outcomes) (Table 3, rows 13–15).

Both the random effects meta-analysis considering one median effect size per study comparison for each outcome (Table 3), and the sensitivity analysis considering all effect sizes individually (S3 File) were consistent with the primary analysis. The certainty of the evidence according to GRADE criteria was low or very low for all strategy-outcome combinations, except for the effect of QIC + training on health worker practice outcomes for lay health workers (moderate certainty). However, as the result for this last group is based on only a single study, the generalizability is extremely limited.


This systematic review and meta-analysis on QICs in LMICs showed variable effectiveness across different outcomes and strategies. The quality of the evidence was mainly low or very low [17]. We found consistent results using different statistical approaches.

In summary, among studies of health facility-based HCPs, for the “QIC only” strategy, effectiveness varied widely across outcome groups, with essentially no effect for patient health outcomes. For the “QIC + training” strategy, effectiveness might be very high for patient health outcomes, HCP practice outcomes, and non-care-seeking patient behaviors. Adding other management techniques to this strategy might also be highly effective for patient care-seeking behaviors. The effect of “QIC + training + strengthening infrastructure + supervision + other management techniques” or “QIC + strengthening infrastructure + regulation and governance” strategies seemed small to modest.

The only study assessing lay health workers showed effects that varied from essentially no effect on care-seeking patient behaviors to a large effect on non-care-seeking patient behaviors and HCP practice outcomes.

The main limitations of our systematic review were low quality of the evidence, scarce data on long-term effects, and heterogeneous outcomes. Also, some included studies came from unpublished gray literature, and several were conducted by the same group of authors. We attempted to address any potential imbalance in the quality of these studies by applying the same risk-of-bias assessment to all included studies. Furthermore, the random effects meta-analysis in this review was limited by the low quality of studies and wide diversity of outcomes. However, we believe meta-analysis as a secondary analysis tool provided useful complementary information about the direction, magnitude, and precision of intervention effects. Strengths of our review were that it was based on an extensive literature review from multiple sources, it used a single analytic framework with comparable effect sizes (as opposed to reporting different effect sizes, such as odds ratios and risk differences, from different studies), and it focused on LMIC settings. Its results can inform decision-making for health programs and intervention implementers with regard to which QIC-based interventions are most effective for improving which aspects of health systems in LMICs. Considering the small number of studies for each main comparison and the low quality of evidence, this review also highlights substantial evidence gaps and important opportunities for improvement in the conduct of future QIC studies.

Previous systematic reviews have approached the topic of QIC effectiveness in different ways and did not include several studies captured by our work [8–10]; nevertheless, they found similar effects and evidence gaps. Numerous potential determinants of QIC success were evaluated in a systematic review that did not include any of the primary studies included in our review, and only a few related to empirical effectiveness [24]. For example, some aspects of teamwork and participation in specific collaborative activities seem to improve short-term success, while sustainability of teams and continued data gathering enhanced the chances of long-term success. In a study currently underway, the impact of district-led learning on clinical practice and patient outcomes, communication, HCP motivation, and team dynamics is being explored [25, 26]. It would be desirable for future studies to examine what core components of QICs are related to patient- and provider-level outcomes.

Our findings clearly show that there is still not a solid evidence base on the effect of QICs in LMICs, although our results suggest that there are situations in which QICs could be considered. QICs are not static structures; rather, they have been implemented and adapted in a number of ways to achieve their stated aims. Some common adaptations include their use for generating new ideas and for empowering HCPs. Although based on relatively few studies, our review’s results suggest that combining QICs with training might be the most effective approach for implementing QICs.

Finally, on the recommendation for additional studies on QICs, we think that the ideal study design would be an interrupted time series with a randomized control group. The justification is that such a design would allow for an overall evaluation of intervention effectiveness as well as an evaluation of heterogeneity of effectiveness among sites. The design would also allow for a characterization of the effect over time. Other attributes include a follow-up time of at least 12 months, an objective data source for the evaluation (i.e., not only data collected by the QI teams unless the data quality is reasonably good and data quality does not change over time), a sample size that reflects real-world QICs (i.e., at least 20 facilities per study arm), qualitative and process evaluation components to describe how the intervention worked, a costing and economic evaluation, and an assessment of whether the intervention had any negative effects (e.g., drawing health workers’ attention to one aspect of care that decreases quality for other aspects of care).

In conclusion, the overall quality of the evidence on the effectiveness of QICs in LMICs was low. Based on the large and variable effect sizes seen in some outcome groups, additional research with high-quality studies is warranted to provide a more reliable and precise estimation of the effect of this promising intervention.

Supporting information

S2 File. Meta-analysis results, forest plots, and funnel plots.


S3 File. Sensitivity analysis and list of excluded studies.



This study was performed as part of the Lancet Global Health Commission on High Quality Health Systems in the SDG Era, which the authors thank. We also thank the HCPPR team members and Daniel Comandé and Cintia Spira from IECS for their assistance. This article is based upon information in the Health Care Provider Performance Review, a joint program of CDC, Harvard Medical School, World Health Organization, Management Sciences for Health, Johns Hopkins University, and the CDC Foundation.

Disclaimer: the findings and conclusions presented in this report are those of the authors and do not necessarily reflect the official position of the Centers for Disease Control and Prevention and the CDC Foundation.


  1. The Millennium Development Goals Report 2015. New York: United Nations; 2015.
  2. Sachs JD. From millennium development goals to sustainable development goals. Lancet. 2012;379(9832):2206–11. pmid:22682467
  3. Kruk ME, Gage AD, Arsenault C, Jordan K, Leslie HH, Roder-DeWan S, et al. High-quality health systems in the Sustainable Development Goals era: time for a revolution. Lancet Glob Health. 2018;6(11):e1196–e252. pmid:30196093
  4. Ovretveit J, Bate P, Cleary P, Cretin S, Gustafson D, McInnes K, et al. Quality collaboratives: lessons from research. Quality & Safety in Health Care. 2002;11(4):345–51.
  5. Nadeem E, Olin SS, Hill LC, Hoagwood KE, Horwitz SM. Understanding the components of quality improvement collaboratives: a systematic literature review. Milbank Q. 2013;91(2):354–94. pmid:23758514
  6. Institute for Healthcare Improvement. The Breakthrough Series: IHI’s Collaborative Model for Achieving Breakthrough Improvement. Cambridge, Massachusetts: Institute for Healthcare Improvement. 2003.
  7. Kruk ME, Pate M, Mullan Z. Introducing The Lancet Global Health Commission on High-Quality Health Systems in the SDG Era. Lancet Glob Health. 2017;5(5):e480–e1. pmid:28302563
  8. Schouten LM, Hulscher ME, van Everdingen JJ, Huijsman R, Grol RP. Evidence for the impact of quality improvement collaboratives: systematic review. BMJ. 2008;336(7659):1491–4. pmid:18577559
  9. de Silva D. Improvement collaboratives in health care. London: Health Foundation, 2014.
  10. Wells S, Tamir O, Gray J, Naidoo D, Bekhit M, Goldmann D. Are quality improvement collaboratives effective? A systematic review. BMJ Qual Saf. 2018;27(3):226–40. pmid:29055899
  11. Rowe AK, Rowe SY, Peters DH, Holloway KA, Chalker J, Ross-Degnan D. Effectiveness of strategies to improve health-care provider practices in low-income and middle-income countries: a systematic review. The Lancet Global Health. 2018;6(11):e1163–e75. pmid:30309799
  12. Higgins J, Green S, editors. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. Chichester: The Cochrane Collaboration 2011.
  13. Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gotzsche PC, Ioannidis JP, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ. 2009;339:b2700. pmid:19622552
  14. Cochrane Effective Practice and Organization of Care (EPOC). What study designs should be included in an EPOC review? EPOC Resources for review authors. 2017 [23 January, 2018]. Available from:
  15. World Bank Country and Lending Groups Washington, DC. 2018 [07/01/2018]. Available from:
  16. Cochrane Effective Practice and Organization of Care (EPOC). Suggested risk of bias criteria for EPOC reviews. EPOC Resources for review authors. 2017 [updated January 2018]. Available from:
  17. Guyatt G, Oxman AD, Akl EA, Kunz R, Vist G, Brozek J, et al. GRADE guidelines: 1. Introduction-GRADE evidence profiles and summary of findings tables. J Clin Epidemiol. 2011;64(4):383–94. pmid:21195583
  18. Schünemann H, Brożek J, Guyatt G, Oxman A, (editors). Handbook for grading the quality of evidence and the strength of recommendations using the GRADE approach. 2013 [updated October 2013]. Available from:
  19. Harder T, Abu Sin M, Bosch-Capblanch X, Bruno C, de Carvalho Gomes H, Duclos P, et al. Towards a framework for evaluating and grading evidence in public health. Health Policy. 2015;119(6):732–6. pmid:25863647
  20. Schunemann HJ, Cuello C, Akl EA, Mustafa RA, Meerpohl JJ, Thayer K, et al. GRADE guidelines: 18. How ROBINS-I and other tools to assess risk of bias in nonrandomized studies should be used to rate the certainty of a body of evidence. J Clin Epidemiol. 2018.
  21. Ivers N, Jamtvedt G, Flottorp S, Young JM, Odgaard‐Jensen J, French SD, O'Brien MA, Johansen M, Grimshaw J, Oxman AD. Audit and feedback: effects on professional practice and healthcare outcomes. Cochrane Database of Systematic Reviews 2012, Issue 6. Art. No.: CD000259. pmid:22696318
  22. Holloway KA, Ivanovska V, Wagner AK, Vialle-Valentin C, D. R-D. Have we improved use of medicines in developing and transitional countries and do we know how to? Two decades of evidence. Tropical Medicine and International Health. 2013;18:656–64. pmid:23648177
  23. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ (Clinical research ed.). 2003;327(7414):557–60. pmid:12958120
  24. Hulscher ME, Schouten LM, Grol RP, Buchan H. Determinants of success of quality improvement collaboratives: what does the literature show? BMJ Qual Saf. 2013;22(1):19–31. pmid:22879447
  25. Magge H, Chilengi R, Jackson EF, Wagenaar BH, Kante AM. Tackling the hard problems: implementation experience and lessons learned in newborn health from the African Health Initiative. BMC health services research. 2017;17(Suppl 3):829. pmid:29297352
  26. Magge H, Garcia-Elorrio E, Liljestrand J, Hirschhorn L, Twum-Danso N, Roder-DeWan S, et al. From Spread to QI Institutionalization: The Adaptation of Improvement Collaborative Design and Aims in LMICs (in press).
  27. N’Guessan J, Franco LM, Ackah A, Kouassi V, Gondwe T. Effects of collaborative improvement on PMTCT and ART indicators in Cote d’Ivoire: a comparative study. Bethesda, MD: University Research Co., LLC (URC), 2011.
  28. N’Guessan J, Traore V, Boucar M, Ackah A, Dosso Y, Kouassi V, et al. Results from the pilot phase of an ART/PMTCT improvement collaborative in Cote d’Ivoire. Bethesda, MD: University Research Co., LLC (URC), 2011.
  29. Chitashvili T, editor Addressing rational use of medication in pediatric patients with respiratory tract infections (RTI) through improvement collaborative in Georgia. Third Global Symposium on Health Systems Research; 2014 30 September to 3 October 2014; Cape Town, South Africa.
  30. Chitashvili T. Rationale for improving integrated service delivery: reduced cost and improved care in Georgia. International Journal of Integrated Care. 2015;15(8).
  31. Chitashvili T. Scaling Up, sustaining and institutionalizing better health care in Georgia: results and strategic recommendations from USAID support for improving quality of priority clinical conditions during 2012–2015. Technical Report. 2015.
  32. Chitashvili T, Cherkezishvili E. Improving quality of care for respiratory tract infections in children: the role of capacity building and coaching in supporting one multi-facility improvement team in Samtredia district, Georgia. Submitted to a journal for publication to Lancet of Infectious Diseases 2017:22.
  33. Chitashvili T CE, Broughton E, Chkhaidze I, Shengelia N, Hill K, Massoud MR, Ruadze E. Improving antibiotic prescription practices for pediatric respiratory tract infections in Georgia. Forthcoming (submitted in 2017 for publication to Lancet Infectious Diseases). 2017.
  34. USAID. USAID ASSIST Project: Applying Science to Strengthen and Improve Systems (ASSIST) Project. Georgia Country Report FY14. Bethesda, MD: University Research Co., LLC (URC), 2014.
  35. Singh K, Speizer I, Handa S, Boadu RO, Atinbire S, Barker PM, et al. Impact evaluation of a quality improvement intervention on maternal and child health outcomes in Northern Ghana: early assessment of a national scale-up project. Int J Qual Health Care. 2013;25(5):477–87. pmid:23925506; PubMed Central PMCID: PMC3888142.
  36. Twum-Danso NA, Akanlu GB, Osafo E, Sodzi-Tettey S, Boadu RO, Atinbire S, et al. A nationwide quality improvement project to accelerate Ghana's progress toward Millennium Development Goal Four: design and implementation progress. Int J Qual Health Care. 2012;24(6):601–11. pmid:23118097.
  37. Twum-Danso NA, Dasoberi IN, Amenga-Etego IA, Adondiwo A, Kanyoke E, Boadu RO, et al. Using quality improvement methods to test and scale up a new national policy on early post-natal care in Ghana. Health Policy Plan. 2014;29(5):622–32. pmid:23894073.
  38. Cofie LE, Barrington C, Akaligaung A, Reid A, Fried B, Singh K, et al. Integrating community outreach into a quality improvement project to promote maternal and child health in Ghana. Glob Public Health. 2014;9(10):1184–97. pmid:25204848; PubMed Central PMCID: PMC4310571.
  39. Singh K, Brodish P, Speizer I, Barker P, Amenga-Etego I, Dasoberi I, et al. Can a quality improvement project impact maternal and child health outcomes at scale in northern Ghana? Health Res Policy Syst. 2016;14(1):45. pmid:27306769; PubMed Central PMCID: PMC4910198.
  40. Afari H, Hirschhorn LR, Michaelis A, Barker P, Sodzi-Tettey S. Quality improvement in emergency obstetric referrals: qualitative study of provider perspectives in Assin North District, Ghana. BMJ Open. 2014;4(5):e005052. pmid:24833695; PubMed Central PMCID: PMC4025473.
  41. Speizer IS, Story WT, Singh K. Factors associated with institutional delivery in Ghana: the role of decision-making autonomy and community norms. BMC Pregnancy Childbirth. 2014;14:398. pmid:25427853; PubMed Central PMCID: PMC4247879.
  42. Colbourn T, Nambiar B, Bondo A, Makwenda C, Tsetekani E, Makonda-Ridley A, et al. Effects of quality improvement in health facilities and community mobilization through women's groups on maternal, neonatal and perinatal mortality in three districts of Malawi: MaiKhanda, a cluster randomized controlled effectiveness trial. Int Health. 2013;5(3):180–95. pmid:24030269; PubMed Central PMCID: PMC5102328.
  43. Colbourn T, Pulkki-Brannstrom AM, Nambiar B, Kim S, Bondo A, Banda L, et al. Cost-effectiveness and affordability of community mobilisation through women's groups and quality improvement in health facilities (MaiKhanda trial) in Malawi. Cost Eff Resour Alloc. 2015;13(1):1. pmid:25649323; PubMed Central PMCID: PMC4299571.
  44. Colbourn T, Nambiar B, Costello A, MaiKhanda A. Final Evaluation Report. The impact of quality improvement at health facilities and community mobilisation by women’s groups on birth outcomes: an effectiveness study in three districts of Malawi. 2013.
  45. Barcelo A, Cafiero E, de Boer M, Mesa AE, Lopez MG, Jimenez RA, et al. Using collaborative learning to improve diabetes care and outcomes: the VIDA project. Prim Care Diabetes. 2010;4(3):145–53. pmid:20478753.
  46. Crigler L, Boucar M, K. S, Abdou S, Djibrina S, Saley Z. The Human Resources Collaborative: Improving Maternal and Child Care in Niger. Final Report. 2012.
  47. USAID. USAID Health Care Improvement Project. Strengthening human resources for health to improve maternal care in Niger's Tahoua region. Bethesda, MD: University Research Co., LLC (URC), 2011.
  48. Oyeledun B, Phillips A, Oronsaye F, Alo OD, Shaffer N, Osibo B, et al. The Effect of a Continuous Quality Improvement Intervention on Retention-In-Care at 6 Months Postpartum in a PMTCT Program in Northern Nigeria: Results of a Cluster Randomized Controlled Study. J Acquir Immune Defic Syndr. 2017;75 Suppl 2:S156–S64. pmid:28498185.
  49. Osibo B, Oronsaye F, Alo OD, Phillips A, Becquet R, Shaffer N, et al. Using small tests of change to improve PMTCT services in Northern Nigeria: Experiences from implementation of a continuous quality improvement and breakthrough series program. Journal of Acquired Immune Deficiency Syndromes. 2017;75:S165–S72. pmid:28498186
  50. Oyeledun B, Oronsaye F, Oyelade T, Becquet R, Odoh D, Anyaike C, et al. Increasing retention in care of HIV-positive women in PMTCT services through continuous quality improvement-breakthrough (CQI-BTS) series in primary and secondary health care facilities in Nigeria: a cluster randomized controlled trial. The Lafiyan Jikin Mata Study. J Acquir Immune Defic Syndr. 2014;67 Suppl 2:S125–31. pmid:25310118.
  51. USAID Health Care Improvement Project. The Improvement Collaborative: An Approach to Rapidly Improve Health Care and Scale Up Quality Services. Bethesda, MD: University Research Co., LLC (URC), 2008 June 2008. Report No.
  52. Massoud M. Applying modern quality improvement methodology to maternal and child health in Tver Oblast, Russian Federation. QA Brief. 2001;9(2):28–32.
  53. Abdallah H, Chernobrovkinam O, Korotkova A, Massoud R, Burkhalter B. Improving the quality of care for women with pregnancy-induced hypertension reduces costs in Tver, Russia. Operations Research Results 2(4). Bethesda, MD: Agency for International Development (USAID), 2002.
  54. Ethier K. Developing evidence-based standards for pregnancy-induced hypertension in Russia. Quality Assurance Project Case Study. Bethesda, MD: Agency for International Development (USAID), 2001.
  55. Catsambas TT, Franco LM, Gutmann M, Knebel E, Hill P, Lin Y-S, et al. Evaluating health care collaboratives: the experience of the Quality Assurance Project. Bethesda, MD: University Research Co., LLC (URC), 2008.
  56. Franco L, Marquez L, Ethier K, Balsara Z, Isenhower W. Results of collaborative improvement: effects on health outcomes and compliance with evidence-based standards in 27 applications in 12 countries. 2009.
  57. Franco LM, Marquez L. Effectiveness of collaborative improvement: evidence from 27 applications in 12 less-developed and middle-income countries. BMJ Quality & Safety. 2011;20(8):658–65. pmid:21317182.
  58. Furth R, Gass R, Kagubare J. Rwanda human resources assessment for HIV/AIDS services scale up: summary report. Operations Research Results. 2006.
  59. Ngidi W, Reddy J, Luvuno Z, Rollins N, Barker P, Mate KS. Using a campaign approach among health workers to increase access to antiretroviral therapy for pregnant HIV-infected women in South Africa. J Acquir Immune Defic Syndr. 2013;63(4):e133–9. pmid:23514955.
  60. Wittcoff A, Furth R, Nabwire J, Crigler L. Baseline assessment of HIV service provider productivity and efficiency in Uganda. Technical Report. 2010.
  61. Jaribu J, Penfold S, Manzi F, Schellenberg J, Pfeiffer C. Improving institutional childbirth services in rural Southern Tanzania: a qualitative study of healthcare workers' perspective. BMJ Open. 2016;6(9):e010317. Epub 2016/09/24. pmid:27660313; PubMed Central PMCID: PMC5051329.
  62. Jaribu J, Penfold S, Green C, Manzi F, Schellenberg J. Improving Tanzanian childbirth service quality. Int J Health Care Qual Assur. 2018;31(3):190–202. Epub 2018/04/25. pmid:29687759; PubMed Central PMCID: PMC5974692.
  63. Broughton E, Saley Z, Boucar M, Alagane D, Hill K, Marafa A, et al. Cost-effectiveness of a quality improvement collaborative for obstetric and newborn care in Niger. International Journal of Health Care Quality Assurance. 2013;26(3):250–61. pmid:23729128.
  64. Franco L, Webb L. Niger Site Visit Report. Unpublished report prepared for the U.S. Agency for International Development (USAID) by the USAID Health Care Improvement Project and the Quality Assurance Project. 2008.
  65. Westercamp N, Staedke S, Hutchinson E, Naiga S, Nabirye C, Taaka L, et al., editors. Effectiveness and sustainability of a collaborative improvement method to increase the quality of routine malaria surveillance data in Kayunga District, Uganda. 66th Annual Meeting of the American Society of Tropical Medicine and Hygiene; 2017 Nov 5–9; Baltimore, MD.
  66. Hutchinson E, Nayiga S, Nabirye C, Taaka L, Westercamp N, Rowe A, et al. Opening the 'Black Box' of collaborative improvement: a qualitative evaluation of a pilot intervention to improve quality of surveillance data in public health centres in Uganda. 2017.
  67. Fatuma A. The Republic of Uganda, Kayunga District Local Government: 3-year district development plan, 2010/2011. 2010 Apr 282010.
  68. Horwood C, Butler L, Barker P, Phakathi S, Haskins L, Grant M, et al. A continuous quality improvement intervention to improve the effectiveness of community health workers providing care to mothers and children: a cluster randomised controlled trial in South Africa. Hum Resour Health. 2017;15(1):39. pmid:28610590; PubMed Central PMCID: PMC5470211.
  69. Horwood CM, Youngleson MS, Moses E, Stern AF, Barker PM. Using adapted quality-improvement approaches to strengthen community-based health systems and improve care in high HIV-burden sub-Saharan African countries. AIDS. 2015;29 Suppl 2:S155–64. pmid:26102626.
  70. Webster PD, Sibanyoni M, Malekutu D, Mate KS, Venter WDF, Barker PM, et al. Using quality improvement to accelerate highly active antiretroviral treatment coverage in South Africa. BMJ Quality and Safety. 2012;21(4):315–24. pmid:22438327.
  71. Waiswa P, Manzi F, Mbaruku G, Rowe A, Marx M, Tomson G, et al. Effects of collaborative quality improvement on maternal and newborn health care in Tanzania and Uganda. The Expanded Quality Management Using Information Power (EQUIP) quasi-experimental study. Implementation Science. 2017;(1).
  72. Tancred T, Mandu R, Hanson C, Okuga M, Manzi F, Peterson S, et al. How people-centred health systems can reach the grassroots: experiences implementing community-level quality improvement in rural Tanzania and Uganda. Health Policy Plan. 2018;33(1):e1–e13. pmid:29304250.
  73. Marchant T, Schellenberg J, Peterson S, Manzi F, Waiswa P, Hanson C, et al. The use of continuous surveys to generate and continuously report high quality timely maternal and newborn health data at the district level in Tanzania and Uganda. Implement Sci. 2014;9:112. pmid:25149316; PubMed Central PMCID: PMC4160540.
  74. Hanson C, Waiswa P, Marchant T, Marx M, Manzi F, Mbaruku G, et al. Erratum to: Expanded Quality Management Using Information Power (EQUIP): protocol for a quasi-experimental study to improve maternal and newborn health in Tanzania and Uganda. Implement Sci. 2015;10:152. pmid:26515014; PubMed Central PMCID: PMC4627429.
  75. Hanson C, Waiswa P, Marchant T, Marx M, Manzi F, Mbaruku G, et al. Expanded Quality Management Using Information Power (EQUIP): protocol for a quasi-experimental study to improve maternal and newborn health in Tanzania and Uganda. Implement Sci. 2014;9(1):41. pmid:24690284; PubMed Central PMCID: PMC4230245.
  76. Hanson C, Marchant T, Waiswa P, Manzi F, Schellenberg J, Willey B, et al. Overcoming low implementation levels for essential maternal and newborn health interventions: Results from the equip project using systemic quality improvement in Tanzania and Uganda. International Journal of Gynecology and Obstetrics. 2015;131:E331.
  77. Baker U, Hassan F, Hanson C, Manzi F, Marchant T, Swartling Peterson S, et al. Unpredictability dictates quality of maternal and newborn care provision in rural Tanzania-A qualitative study of health workers' perspectives. BMC Pregnancy Childbirth. 2017;17(1):55. pmid:28166745; PubMed Central PMCID: PMC5294891.
  78. Tancred T, Manzi F, Schellenberg J, Marchant T. Facilitators and Barriers of Community-Level Quality Improvement for Maternal and Newborn Health in Tanzania. Qual Health Res. 2017;27(5):738–49. pmid:27022034.
  79. The Health Foundation. Process evaluation of community intervention: baseline report on the experiences of women volunteers as women's group facilitators. In: Centre for International Health & Development, Institute of Child Health. UCL (UK)2008. p. 45 p. Report No.: 20080924 (v1.4 Final) WGF Report.