Reliability and Validity of Instruments for Assessing Perinatal Depression in African Settings: Systematic Review and Meta-Analysis

Background: A major barrier to improving perinatal mental health in Africa is the lack of locally validated tools for identifying probable cases of perinatal depression or for measuring changes in depression symptom severity. We systematically reviewed the evidence on the reliability and validity of instruments to assess perinatal depression in African settings.

Methods and Findings: Of 1,027 records identified through searching 7 electronic databases, we reviewed 126 full-text reports. We included 25 unique studies, which were disseminated in 26 journal articles and 1 doctoral dissertation. These enrolled 12,544 women living in nine different North and sub-Saharan African countries. Only three studies (12%) used instruments developed specifically for use in a given cultural setting. Most studies provided evidence of criterion-related validity (20 [80%]) or reliability (15 [60%]), while fewer studies provided evidence of construct validity, content validity, or internal structure. The Edinburgh postnatal depression scale (EPDS), assessed in 16 studies (64%), was the most frequently used instrument in our sample. Ten studies estimated the internal consistency of the EPDS (median estimated coefficient alpha, 0.84; interquartile range, 0.71-0.87). For the 14 studies that estimated sensitivity and specificity for the EPDS, we constructed 2 x 2 tables for each cut-off score. Using a bivariate random-effects model, we estimated a pooled sensitivity of 0.94 (95% confidence interval [CI], 0.68-0.99) and a pooled specificity of 0.77 (95% CI, 0.59-0.88) at a cut-off score of ≥9, with higher cut-off scores yielding greater specificity at the cost of lower sensitivity.

Conclusions: The EPDS can reliably and validly measure perinatal depression symptom severity or screen for probable postnatal depression in African countries, but more validation studies on other instruments are needed. In addition, more qualitative research is needed to adequately characterize local understandings of perinatal depression-like syndromes in different African contexts.
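The 2 x 2 tables mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a screening score at or above the cut-off counts as screen-positive and that a reference diagnosis is available for every participant. The function names and the ten score/diagnosis pairs are hypothetical and are not taken from any included study; the bivariate random-effects pooling itself is not shown.

```python
def two_by_two(scores, diagnoses, cutoff):
    """Cross-tabulate screen results (score >= cutoff) against the reference
    diagnosis and return the cell counts (tp, fp, fn, tn)."""
    tp = fp = fn = tn = 0
    for score, has_dx in zip(scores, diagnoses):
        screen_positive = score >= cutoff
        if screen_positive and has_dx:
            tp += 1
        elif screen_positive and not has_dx:
            fp += 1
        elif not screen_positive and has_dx:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening scores and reference diagnoses (True = depressed).
scores = [3, 5, 8, 9, 10, 12, 13, 15, 6, 11]
diagnoses = [False, False, False, True, False, True, True, True, False, True]

for cutoff in (9, 12):
    table = two_by_two(scores, diagnoses, cutoff)
    sens, spec = sensitivity_specificity(*table)
    print(f"cut-off >= {cutoff}: table={table}, "
          f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy data set, raising the cut-off from 9 to 12 lowers sensitivity from 1.00 to 0.60 and raises specificity from 0.80 to 1.00, the same direction of trade-off the abstract reports for the EPDS.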

Adewuya and colleagues [6] Consecutive sample of 876 women attending a six-week postnatal appointment at five health centres in the semi-urban town of Ilesa, Nigeria were recruited, along with 900 matched women from the general medical practice. Literate women self-administered English or Yoruba versions of the EPDS and 21-item BDI, while illiterate women were administered these scales by one of the study authors. Two psychiatrists blinded to the EPDS and BDI scores administered the SCID to all women with EPDS ≥9 or BDI ≥10 and a randomly selected sample of 875 women with EPDS <9 and BDI <10 to establish the DSM-III reference criterion diagnoses of major or minor depressive disorder.

EPDS BDI
The EPDS had an internal consistency of 0.89. There was a statistically significant association between the EPDS and BDI scores (Spearman's rho=0.46, P<0.001). EPDS ≥9 had 0.94 sensitivity and 0.97 specificity for detecting major or minor depression (AUC=0.99), and 1.00 sensitivity and 0.89 specificity for detecting major depression only. BDI ≥10 had 0.89 sensitivity and 0.97 specificity for detecting major or minor depression (AUC=0.92).
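Internal consistency in these entries is reported as coefficient (Cronbach's) alpha. As a minimal sketch of how that coefficient is computed, the function below applies the standard formula alpha = k/(k-1) × (1 − Σ item variances / variance of total score). The function name and the five-respondent, three-item response matrix are hypothetical illustrations, not study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for both items and totals.
    """
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose to per-item columns
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 5 respondents x 3 items.
responses = [
    [0, 1, 1],
    [1, 1, 2],
    [2, 3, 2],
    [3, 2, 3],
    [1, 0, 1],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```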
Adewuya [7] Consecutive sample of 478 women in the semi-urban town of Ilesa, Nigeria were recruited from five health centers on the day after delivery. At 5 days postnatally, women self-administered English or Yoruba versions of the MBS and the EPDS. At weeks 4 and 8 postnatally, two trained psychiatrists administered the SADS to establish the reference criterion diagnosis, and participants self-administered the EPDS.
Adewuya and colleagues [8] Consecutive sample of 182 women in the semi-urban town of Ilesa, Nigeria were recruited from five health centers in late pregnancy (greater than 32 weeks' gestation). Literate women self-administered English or Yoruba versions of the EPDS, while illiterate women were administered the EPDS by a trained research assistant. All women with EPDS ≥6, and a random subset of 11 women with EPDS <6, underwent a MINI diagnostic assessment by a psychiatrist blinded to the EPDS scores to establish the reference criteria of major and minor depressive disorder.
Agoub and colleagues [9] Convenience sample of 144 women in Morocco who had given birth in the two months prior. Participants were recruited at first postnatal visit 15-20 days after delivery and reassessed at 6 weeks, 6 months, and 9 months. The Arabic version of the EPDS was self-administered, or verbally administered for illiterate participants. The Moroccan Colloquial Arabic version of the MINI was used to establish the reference criterion diagnosis of major depressive disorder.

EPDS
EPDS ≥12 had 0.92 sensitivity and 0.96 specificity for detecting major depressive disorder.
Baggaley and colleagues [10] 61 women were administered West African French and local-language (Moore and Dioula) versions of the K10/K6 at 3 and 6 months postnatally. A local psychiatrist blinded to the K10/K6 score conducted diagnostic interviews to establish the reference criterion diagnosis of depression consistent with ICD-10 criteria.
Bass and colleagues [11] A convenience sample of 80 women in Kinshasa, DRC who had given birth to a child within the previous two years was asked to free-list problems faced by new mothers in the first postnatal year. A convenience sample of 14 key informants (traditional healers, ministers, marriage counselors, local older women) were interviewed in-depth about postnatal mental health problems. The qualitative data were used to develop a new screening instrument based on the EPDS and HSCL-15; two poorly understood items were dropped (from the EPDS) and seven non-overlapping items identified as being salient symptoms of a local syndrome (malady ya souci) were added. A purposive sample of 133 women who had recently given birth to a child was administered the 23-item total symptom scale. For the reference criterion, 'caseness' was established if there was agreement between the participant and key informant about whether she had malady ya souci. A random sample of women were re-interviewed within 3 days of the first assessment.
De Bruin and colleagues [12] A sample of all 147 women living in a peri-urban settlement near Cape Town, South Africa who had recently given birth were administered the isiXhosa version of the EPDS.

EPDS
The EPDS had an internal consistency of 0.89. Maximum likelihood confirmatory factor analysis was used to test two measurement models: (a) a one-dimensional construct, and (b) two correlated factors of depressive feelings and cognitive anxiety. The root mean square error of approximation values indicated that both the one- and two-factor models provided a satisfactory fit to the data and that the two-factor model provided only a marginally improved fit.
Chibanda and colleagues [13] A random sample of 210 women attending two primary care clinics in peri-urban Zimbabwe were interviewed at 6 weeks postnatally. Trained community counselors administered a Shona version of the EPDS. Two psychiatrists blinded to the EPDS scores assessed the women using DSM-IV criteria to establish the reference criterion diagnosis of major depression.

EPDS
The internal consistency of the EPDS was 0.87. EPDS ≥12 had 0.88 sensitivity and 0.89 specificity for detecting major depression (AUC=0.82).
Hanlon and colleagues [14] A community-based sample of 1285 consecutive women in rural Ethiopia were interviewed during the perinatal period and administered the Amharic 10-item EPDS, 20-item SRQ, and 29-item culturally modified SRQ-F across several studies. The CPRS was administered by psychiatry trainees to establish the reference criterion of clinically significant psychiatric morbidity.
Hartley and colleagues [15] All pregnant women (18+ years) in 24 neighborhoods of a peri-urban settlement near Cape Town, South Africa were recruited for participation in a randomized trial of family health. They were administered the isiXhosa version of the EPDS.

EPDS
The EPDS had an internal consistency of 0.87. EPDS ≥14 was associated with single motherhood, being unemployed, low income, alcohol use, experience of intimate partner violence, and poor financial and/or social support (all P<0.05).
Kaaya and colleagues [16] A sample of 903 HIV-positive pregnant women enrolled in a randomized controlled trial in urban Tanzania were administered the 25-item HSCL. Psychiatrists blinded to the HSCL scores evaluated a subset of 100 participants using the SCID to establish the reference criterion diagnosis of major depressive disorder.
Kaaya and colleagues [17] In-depth interviews were conducted with 10 key informants, including women's group and village leaders, traditional healers, village health workers, and with 10 women identified by key informants as experiencing symptoms of locally defined syndromes. These interviews produced 30 different local idioms of depressive and anxiety symptoms, which were administered in Kiswahili along with 17 of the semantically and conceptually distinct items from the HSCL to a convenience sample of 787 pregnant women attending an antenatal clinic in Tanzania at 28-36 weeks gestation. Intra-class correlation coefficients were used to estimate both inter-rater reliability (using a subsample of 21 participants) and test-retest reliability (using a subsample of 12 participants interviewed 1 week after the initial interview).

DSQ-19
Stepwise forward logistic regression was used to select 19 of the 47 items for inclusion in the new scale, the DSQ-19. The DSQ-19 had an internal consistency of 0.84. The intra-class correlation between interviewers was 0.89, and the intra-class correlation within participants was 0.82. Principal components analysis revealed a single factor. DSQ-19 scores had statistically significant correlations with the SF-36, satisfaction with economic wellbeing, household decision-making power, and marital status (P-values not reported).
Kaaya and colleagues [18] Qualitative interviews were conducted in Kiswahili with a purposive sample of 12 traditional practitioners and 10 women previously affected by depression living in a peri-urban region near Dar es Salaam, Tanzania. Axial coding was used to identify and describe clusters of idioms of distress.
One categorization of pregnancy that emerged from the interviews, the "problematic pregnancy," framed women's recollection of distress. The data suggested existence of a construct with similarities to Western-based psychological understandings of depression.
Lawrie and colleagues [19] A consecutive sample of 103 women attending a 6-week postnatal clinic at a hospital in urban South Africa were administered the EPDS in one of six South African languages. A physician blinded to the EPDS scores conducted structured psychiatric interviews guided by DSM-IV criteria and the MADRS to establish the reference criterion diagnoses of major and minor depression.

EPDS
EPDS ≥12 had 1.00 sensitivity and 0.68 specificity for detecting major depression, and 0.80 sensitivity and 0.77 specificity for detecting major and minor depression combined.
Lee and colleagues [20] The 25-item HSCL was administered in Kiswahili to a convenience sample of 787 pregnant women attending an antenatal clinic in Tanzania at 28-36 weeks gestation. Intra-class correlation coefficients were used to estimate both inter-rater reliability (using a subsample of 21 participants) and test-retest reliability (using a subsample of 12 participants interviewed 1 week after the initial interview).

HSCL-15
Internal consistency was 0.90 for the HSCL-25 and 0.88 for the 15-item depression subscale. The intra-class correlation between interviewers was 0.85, and the intra-class correlation within participants was 0.85. Principal components analysis revealed a single factor. The HSCL-25 and HSCL-15 had statistically significant correlations with 4 dimensions of the SF-36 (P-values not reported).
Nhiwatiwa and colleagues [21] A consecutive sample of 500 pregnant women seen at antenatal clinics or by traditional birth attendants in the community were enrolled at 32 weeks' gestation into a prospective cohort study in peri-urban Zimbabwe. The 14-item SSQ was administered to all participants. For all 95 women with SSQ ≥8, and for a random sample of 105 women with SSQ <8, the Shona version of the CISR was administered by a psychiatrist at 6-8 weeks postnatally to establish the reference criterion of psychiatric 'caseness'.

SSQ
SSQ had statistically significant associations with poor intimate partner relationship quality and worsened health status. SSQ ≥8 at 32 weeks' gestation had 0.82 sensitivity and 0.66 specificity for detecting psychiatric 'caseness' (most commonly postnatal depression).
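Several entries in this table describe a two-phase design in which every screen-positive woman, but only a random subset of screen-negative women, receives the reference interview. One common way to account for such partial verification is to weight each verified woman by the inverse of her sampling fraction. The sketch below illustrates that idea only: the source does not state whether any reported estimate was corrected this way. The screening counts (95 screen-positive, 105 of the remaining 405 screen-negative women verified, assuming all 500 were screened) follow the Nhiwatiwa design as described, while the function name and the case counts are hypothetical.

```python
def weighted_accuracy(tp, fp, fn, tn, n_screen_pos, n_screen_neg):
    """Inverse-probability-weighted sensitivity and specificity.

    tp, fp: cases and non-cases among VERIFIED screen-positive women.
    fn, tn: cases and non-cases among VERIFIED screen-negative women.
    n_screen_pos, n_screen_neg: everyone screened in each stratum.
    """
    w_pos = n_screen_pos / (tp + fp)   # each verified positive represents w_pos women
    w_neg = n_screen_neg / (fn + tn)   # each verified negative represents w_neg women
    sens = tp * w_pos / (tp * w_pos + fn * w_neg)
    spec = tn * w_neg / (tn * w_neg + fp * w_pos)
    return sens, spec


# Design counts from the entry above; the case counts are HYPOTHETICAL.
n_screen_pos, n_screen_neg = 95, 500 - 95
tp, fp = 40, 55      # hypothetical split of the 95 verified screen-positives
fn, tn = 5, 100      # hypothetical split of the 105 verified screen-negatives

naive_sens = tp / (tp + fn)
naive_spec = tn / (tn + fp)
sens, spec = weighted_accuracy(tp, fp, fn, tn, n_screen_pos, n_screen_neg)
print(f"naive:    sensitivity={naive_sens:.2f}, specificity={naive_spec:.2f}")
print(f"weighted: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Because screen-negative women are under-sampled, the unweighted table overstates sensitivity and understates specificity in this hypothetical example; weighting moves both estimates in the opposite direction.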
Rochat [22] This study recruited a consecutive sample of 112 women (16+ years) attending a first (for the current pregnancy) antenatal visit and testing for HIV as part of a PMTCT program in rural KwaZulu-Natal, South Africa. They were administered the EPDS, along with the SCID to establish the reference criterion diagnosis of major depressive episode. A subset of 55 consecutive women (27 HIV-positive and 28 HIV-negative) were recruited to participate in a qualitative substudy. Zulu-speaking research assistants, clinical psychologists, and a psychiatric nurse participated in a one-day translation workshop to guide translation of the EPDS and SCID into Zulu.

EPDS
Focus group participants identified the IsiZulu word ingcindezi ("for something to be pressing down on you or weighing down on you and your emotions") as a word commonly used in this population to describe depression and its social and psychological sequelae. The EPDS had an internal consistency of 0.61. EPDS ≥13 had 0.69 sensitivity and 0.78 specificity for detecting major depression (AUC=0.73). In the qualitative study, interpersonal conflict, unwanted pregnancy, and testing positive for HIV were prominent themes leading to emotional distress.
Spies and colleagues [23] The sample of 129 women with low-risk pregnancies at less than 20 weeks' gestation presenting for a first antenatal visit at midwife obstetric units in peri-urban South Africa was drawn from a larger prospective cohort study. The K10 was self-administered in English or Afrikaans. A researcher administered the SCID to establish the reference criterion diagnosis of current major depressive episode.

K10
K10 ≥21.5 had 0.73 sensitivity and 0.54 specificity for detecting a current major depressive episode (AUC=0.66).

Taiwo and Olayinka [24]
A convenience sample of 256 women who gave birth at a teaching hospital in north central Nigeria self-administered the English or Hausa version of the EPDS at 6 weeks postnatally. All 24 women with EPDS ≥12, and a random sample of 38 women with EPDS <12, underwent a clinical diagnostic interview by a psychiatrist to establish the reference criterion diagnosis of major depression.

EPDS
EPDS ≥7 had 0.72 sensitivity and 0.62 specificity for detecting major depression.
Tesfaye and colleagues [25] The EPDS and K6/K10 were administered to a pilot sample of 30 postnatal women attending vaccination clinics at a primary health care center in Addis Ababa, Ethiopia, and they were further probed to explore potentially unclear items. A convenience sample of 100 postnatal women (18-
Uwakwe and Okonkwo [26] A consecutive sample of 225 women (18-29 years) who were either in the maternity ward of a peri-urban Nigerian teaching hospital on postnatal day 7 or attending its postnatal clinic self-administered the English or Igbo versions of the EPDS and 20-item ZDS. Illiterate patients were administered the scales by resident physicians. A psychiatrist conducted a clinical diagnostic interview, guided by a modified version of the ICD-10 Symptom Checklist, to establish the reference criterion diagnosis of depression.

EPDS ZDS
Internal consistency was 0.83 for the EPDS and 0.79 for the ZDS. EPDS ≥9 had 0.75 sensitivity and 0.97 specificity for detecting depression.
Weobong and colleagues [27] Among women (15-45 years) participating in an ongoing randomized, controlled trial, those from a single district in rural Ghana were interviewed at 5-11 weeks postnatally using the SRQ, EPDS, and PHQ-9. A clinical psychologist blinded to the results of the SRQ, EPDS, and PHQ-9 conducted clinical assessments using the CPRS to establish the reference criterion of psychiatric 'caseness'. The intra-class correlation coefficient was used to estimate test-retest reliability in a subset of 40 women who were reinterviewed 2 weeks after the initial interview.
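Test-retest reliability in entries such as this one is summarized with an intra-class correlation coefficient. The sketch below computes a one-way random-effects, single-measurement ICC from paired scores. That particular ICC form is an assumption, since the table does not say which form each study used, and the function name and the five score pairs are hypothetical.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for single measurements.

    ratings: one row per participant, each row holding k repeated scores
    (for test-retest, k = 2: initial interview and re-interview).
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    # Between-participant mean square.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-participant mean square.
    msw = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical (initial, retest) scores for five participants.
pairs = [(10, 11), (4, 5), (15, 13), (7, 7), (12, 14)]
print(f"ICC = {icc_oneway(pairs):.2f}")
```

Values near 1 indicate that most of the score variance lies between participants rather than between the two occasions.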