Nearly 275 000 maternal deaths occurred globally in 2011, nearly all of which took place in low– and middle–income countries (LMIC) . Most of these countries did not reduce maternal mortality to levels targeted in the Millennium Development Goals (MDG5) . Progress has been hindered, in part, by a lack of reliable maternal health data, especially on maternal deaths . Measurement challenges are particularly significant in LMIC with irregular and incomplete health system reporting.
To measure progress in maternal health, monitoring agencies have relied on tracking indicators proposed as measures of quality of care, such as the proportion of births attended by a skilled birth attendant, that are assumed to be strongly correlated with maternal mortality . Such indicators are routinely assessed in population––based household survey programs, such as the Demographic and Health Surveys (DHS) and Multiple Indicator Cluster Surveys (MICS), in which female respondents report on events surrounding recent births . Despite their widespread use, the majority of proposed quality of care indicators, including skilled birth attendance, have not been validated [1,5,6]. In fact, numerous researchers have noted the lack of correlation between these indicators and maternal mortality levels [5,7–9]. These researchers argue that information on the category of provider at birth is deficient as a measure of quality of care as it relies on assumptions about provider training and competence as well as access to essential supplies and equipment. It is important therefore to identify alternate indicators that describe the actual content of care, can be reported with accuracy, and have the potential to be included in routine data collection programs.
A growing, but still limited, body of research has examined the validity of indicators of the quality of care in the intrapartum and early postpartum period. To our knowledge, however, no study has yet reported on how accurately women can recall the skill level of their provider at birth, although there have been some attempts to look at data quality issues . Furthermore, the few validation studies that have taken place have generally compared maternal self–reports with hospital records, which may be incomplete or inaccurate, or have been conducted in high–income settings, where maternal mortality rates are generally low [11–15].
To address this gap, this study assessed women’s ability to report on a set of quality of maternal and newborn health care indicators that are either currently in use or have the potential to be included in routine survey–based data collection. In spite of its limitations, it seems likely that the “skilled birth attendance” indicator will continue to be used and so we assess how accurately women report on the skill level of their provider during delivery. We compare women’s self–reports of maternal and newborn care received against third party observations during labor and delivery. Finally, we provide suggestions for modifications to data collection procedures that could improve the measurement of maternal and newborn health care.
Validation exercises were conducted in two high volume public hospitals located in Kisumu District and Kiambu District in western and central Kenya, respectively. According to the 2014 Kenya DHS, nationally, 61% of births in the five years preceding the survey were delivered in a health facility; in Kisumu and Kiambu districts the prevalence was 70% and 93%, respectively . Facility–based delivery is less likely among older women, those who have lower education, are poorer, or reside in rural areas . Fertility levels among women in the two districts are lower than the national rate, with the total fertility rate in Kisumu at 3.6 births per woman and in Kiambu at 2.7, compared with 3.9 nationally .
Data collection took place from July to September 2013. All pregnant women aged 15 to 44 who were admitted to a study facility maternity unit and in early labor were invited to participate. Participants included eligible women who underwent labor and delivery and were able to provide consent.
Our reference standard for validity analysis is data collected by trained researchers who observed providers in the maternity admission room and labor and delivery rooms using a structured checklist–type form. Observers were registered Kenyan nurse/midwives with at least three years of experience in a maternal and newborn health unit and previous research experience. Observations were used as the reference standard as they reflected all facets of caregiving including events related to the birth itself as well as interactions between the women and provider, before, during and up to one hour after delivery. In the few cases in which clarification was needed (eg, in the event the mother and infant were taken into separate rooms, the observer remained with the mother) observations were supplemented by checking facility records and by asking providers.
Exit interviews with women took place prior to hospital discharge. Data collectors who were degree holders in a social science interviewed women using a structured questionnaire. Interview questionnaires were translated into Kiswahili, Dholuo and Kikuyu and were administered in the woman’s language of preference.
All data collectors received four days of intensive training on the study procedures, the rationale behind each element of the client questionnaire and observation checklist to ensure full understanding of the instrument components, and how to record responses and observations.
Written informed consent was obtained from all participants and their attending providers prior to participation. All women and providers were provided with a description of the study and procedures, including their right to refuse participation at any time. In Kenya, pregnant adolescents between ages 15–17 years are considered “emancipated minors” and their written informed consent was also obtained [17–19]. Staff who provide labor and delivery care were identified by the hospitals’ obstetrics and gynaecology director, and approached for recruitment and consent. No providers refused participation.
Prior to participant enrollment, the study protocol was approved by the ethical review committees of the Population Council and the Kenya Medical Research Institute (KEMRI).
To identify indicators to be validated, a landscaping scan of published and grey literature was conducted from April to July 2012. The scan focused on indicators of the quality and content of care received during labor and delivery and the health outcomes related to this period . Indicators were included if they were currently in use or proposed for use in household survey programs such as the DHS and MICS or reflected standard practices of maternal and newborn labor and delivery care. The scan yielded a list of 285 indicators. This list was assessed by a group of public health experts specializing in maternal health to select a set of 80 indicators for validity testing. Indicators were selected on the basis of their wide use and/or potential to assess the critical elements of maternal and newborn care during the initial assessment of the woman, the first, second and third stages of labor, and immediate postnatal period.
Sample size was calculated assuming 50% prevalence for all indicators, given that some harmful practices would rarely occur, and some beneficial practices would almost always occur, at 60% sensitivity ±6% precision, 70% specificity ±6% precision, with type 1 error set at α = 0.05 assuming a normal approximation to a binomial distribution. These specifications imply a minimal sample size of 500, which was increased to 600 women to allow for 20% attrition in a separate study to re–interview women approximately one year following delivery.
Statistical analysis was performed using Stata Version 12  to assess indicator accuracy at the individual and population level. For individual–level reporting accuracy, we calculated the sensitivity (ie, true positive rate) and specificity (ie, true negative rate) of indicators by constructing two–by–two tables for each indicator that had at least five counts per cell .
Missing pairwise data were excluded. To summarize the accuracy of each indicator, we quantified the area under the receiver operating curve (AUC), which plots the sensitivity (ie, true positive rate) of each indicator against its false positive rate (1–specificity). To measure uncertainty associated with validity, we estimated 95% confidence intervals (CI), assuming a binomial distribution. In practice, the AUC represents the “average accuracy of a diagnostic test” [23,24]. AUC values range from 1.0 (perfect classification accuracy) to 0 (zero accuracy). An AUC value of 0.5 is the equivalent of a random guess.
To assess the population–based validity of indicators, we estimated the inflation factor (IF), also known as the Test to Actual Positives (TAP) ratio . The IF reflects the prevalence of the indicator as it would be reported by women in a survey after accounting for sensitivity and specificity (Pr) divided by the true prevalence (ie, observer report) (P). By comparing the ratio of the estimated survey–based prevalence to its true prevalence, we calculated the degree to which each indicator would be over or under–estimated by women’s self–report (IF = Pr/P) [25,26].
The prevalence of women’s self–report in a survey (Pr) is calculated by applying each indicator’s estimated sensitivity (SE) and specificity (SP) to its true prevalence (P), using the following equation: Pr = P × (SE+SP−1)+(1−SP) . We caution that the estimated survey–based prevalence is dependent on the observed prevalence of the indicator. Therefore, IF estimates reflect the magnitude of over or under–estimation in the study setting. To illustrate the implications of the IF estimates for other contexts in which the true prevalence is different from our study setting (eg, outside of a hospital facility), we model the estimated survey prevalence for select indicators across all possible coverage levels (ie, true prevalence ranging from 0 to 100%) using the above equation .
We categorized individual–level reporting accuracy as high (AUC>0.70), moderate (0.60<AUC<0.70), and low (AUC<0.60)  and the degree of bias reflected by the IF as low (0.75<IF<1.25), moderate (0.50<IF<1.5) and large (IF<0.50 or IF>1.5) . In order to summarize indicator validity in terms of both individual and population–level accuracy, we considered indicators with high AUC and low IF to have high overall performance .
1039 women admitted to the maternity unit at participating study facilities were recruited to participate. Of those, 676 women were observed (Kiambu = 395, Kisumu = 281). Approximately one–third of women were not observed because they were not in labor but required monitoring on the antenatal ward, did not progress into labor, or they progressed rapidly into labor and full observation was not possible ( Figure 1 ). Fourteen women who were observed did not participate in the exit interview.
Participants’ background characteristics and differences by facility location are presented in Table 1 . The majority of women were under age 25 with fewer than two prior births, married, and with no or primary education. A greater percentage of Kisumu participants were never married, while fewer were married/living together or separated/widowed.
|% total sample (N = 662)||% Kiambu (N = 388)||% Kisumu (N = 274)||P–value*|
|Age in years:||0.504|
|Prior parity (total number of live births):||0.435|
|4 or more||3.3||3.1||3.6|
|Single, never married||14.7||9.8†||21.5†|
|Type of delivery:||0.679|
*Based on χ2 test comparing facility locations, statistically significant at P < 0.05.
†Statistically significant pairwise comparisons using the Holm–Bonferroni correction to adjust for multiple comparisons.
The full list of indicators selected for validity testing is presented in Table 2 . The table provides the prevalence for each indicator as reported by women and observers, which, for some indicators, varied substantially. For example, 73% of women reported that the provider(s) washed his or her hands or used antiseptic before any initial examination, while 27% of observers recorded that this took place. “Don’t know” responses were minimal for most indicators. However, four indicators for which the proportion of women who responded “Don’t know” exceeded 5% are reported in Table 3 . Two of these indicators refer to the immediate postnatal period: whether the newborn was immediately dried after birth and whether the newborn was immediately wrapped in a towel. Having a cesarean section as opposed to a vaginal delivery was significantly associated with responding “Don’t Know” to both of these questions (OR = 15.3, 95% CI = 8.2–28.3, P < 0.001; OR = 2.7, 95% CI = 1.2–5.9, P = 0.016, respectively).
|Indicator||N||Women’s self–report, prevalence (%)||Observer report, prevalence (%)||At least 5 counts/cell|
|Initial client assessment:|
|Type of facility where gave birth (public hospital)||651||98.0||100.0||No|
|Referred to facility because of a problem||655||8.2||7.9||Yes|
|HIV status checked||659||25.2||94.7||No|
|Offered HIV test||660||8.3||1.7||No|
|Receives HIV test||654||8.6||9.3||No|
|Provider washes hands with soap and water or uses antiseptic before any initial examination||467||73.1||26.6||Yes|
|Takes blood pressure||654||93.4||87.0||Yes|
|Takes urine sample||654||5.7||1.4||No|
|Checks fetal heart rate with fetoscope/ultrasound||659||95.8||99.7||No|
|Wears high–level disinfected or sterile gloves for vaginal examination||658||99.9||99.9||No|
|Provider respectful care:|
|Woman allowed to drink liquids/eat||624||66.8||42.0||Yes|
|Encourages/assists woman to ambulate during labor||644||87.0||77.5||Yes|
|Encourages/assists woman to assume different positions in labor||649||14.3||58.2||Yes|
|Woman allowed to have a support person present during labor and delivery||648||8.8||9.1||Yes|
|A support person is present at birth||644||3.7||4.8||Yes|
|First stage of labor:|
|Induces labor by uterotonic (IV, IM, tablet)||630||10.8||4.6||Yes|
|Augments labor with uterotonic (by IV line, IM injection, or tablet)||625||39.2||22.4||Yes|
|Uterotonic received (to induce or augment labor)||619||43.8||27.1||Yes|
|Membranes ruptured (to induce or augment labor)||650||3.1||42.3||Yes|
|Skilled birth attendance–main provider:|
|Main provider labor|
|Skilled main provider labor†||649||89.9||92.6||Yes|
|Main provider labor–doctor or medical resident||649||9.6||0.5||No|
|Main provider labor–doctor (ob–gyn)||649||9.6||0.3||No|
|Main provider labor–medical resident||649||0.0||0.2||No|
|Main provider labor–medical intern||649||0.2||1.9||No|
|Main provider labor–nurse/midwife||649||80.2||92.1||Yes|
|Main provider labor–clinical officer||649||2.2||0.6||No|
|Main provider labor–facility support/ staff aide||649||0.2||0.3||No|
|Main provider labor–student nurse||649||2.3||2.8||No|
|Main provider labor–support companion||649||0.6||1.7||No|
|Main provider labor–no one or other||649||4.6||0.0||No|
|Main provider delivery|
|Skilled main provider delivery†||644||94.3||92.9||Yes|
|Main provider delivery– doctor (ob–gyn) or medical resident||644||19.3||11.8||Yes|
|Main provider delivery– doctor (ob–gyn)||644||19.1||3.0||No|
|Main provider delivery–medical resident||644||0.2||8.9||No|
|Main provider delivery–medical intern||644||0.3||1.1||No|
|Main provider delivery–nurse/midwife||644||75.0||81.1||Yes|
|Main provider delivery–clinical officer||644||2.2||0.8||No|
|Main provider delivery–student nurse||644||2.2||4.8||Yes|
|Main provider delivery–no one or other||644||0.9||0.5||No|
|Second and third stage of labor:|
|Uterotonic administered within few minutes of delivery (via injection, IV medication, or oral/rectal tablets)||562||96.8||98.8||No|
|Uterotonic received 1–3 min after birth||552||96.9||81.5||No|
|Uterotonic received after delivery of placenta||552||59.1||2.4||Yes|
|Applies controlled cord traction||561||97.5||98.9||No|
|Performs uterine massage after delivery of placenta||558||88.4||98.6||No|
|Position of mother at birth–on back||645||94.7||99.8||No|
|Health provider wore gloves during delivery of baby||563||100.0||99.8||No|
|Immediate postnatal newborn care:‡|
|Newborn given to mother immediately after birth||611||59.9||57.6||Yes|
|Newborn immediately dried with towel/cloth||594||96.1||99.5||No|
|Newborn placed immediately skin to skin on mother's chest§||602||78.9||16.3||Yes|
|Newborn immediately skin to skin on mother (2 item indicator)#||596||29.2||16.2||Yes|
|(Of newborns not on skin) Newborn wrapped with towel||86||90.7||91.9||No|
|Breastfeeding initiated within first hour of birth||551||76.4||53.0||Yes|
|Something other than breastmilk given to baby within first hour of delivery||572||1.9||1.1||No|
|Baby bathed within the first hour after birth||604||2.8||0.1||No|
|Low birth–weight baby (<2500g)||579||6.7||7.8||Yes|
|High birth–weight baby (≥4500g)||579||1.0||1.0||No|
|3 elements of newborn care (immed. · dried + on skin + breastfed in first hour)||506||71.5||9.3||Yes|
|3 elements of newborn care (immed. · dried, 2 item skin–to–skin#, breastfed in first hour)||501||25.1||9.2||Yes|
|Immediate postnatal care for mother:|
|Palpates uterus 15 min after delivery of placenta||557||88.3||70.2||Yes|
|Provider did at least one post–delivery health check||649||96.0||94.9||No|
|In first post–delivery exam, provider checks for bleeding||627||62.0||90.6||Yes|
|In first post–delivery exam, provider examines perineum||554||56.1||87.4||Yes|
|In first post–delivery exam, provider takes temperature||638||60.0||40.3||Yes|
|In first post–delivery exam, provider takes blood pressure||642||74.6||48.3||Yes|
|In first post–delivery exam, provider checks for involution||615||64.2||78.2||Yes|
|Woman asked for pain relief medication while at facility||638||32.1||10.5||Yes|
|Woman received pain relief medication||640||59.4||17.5||Yes|
|Maternal and infant outcomes:|
|Cesarean section (C/S) performed||651||13.5||13.4||Yes|
|Decision for C/S taken before labor started||76||9.2||0.0||No|
|(Of women who had a C/S) C/S performed after labor started||76||90.8||100.0||No|
|(Of women who had a C/S) Provider decided C/S would be done||80||82.5||100.0||No|
|(Of women who had a C/S) Reason for C/S–prolonged/obstructed labor||76||32.9||67.1||Yes|
|–Prolonged labor (>12 h)||654||23.7||3.7||Yes|
|(Of women who had complications) Blood products given||72||15.3||18.1||No|
*Table presents descriptive results. The sample size per indicator varied by women’s responses to the question, and uses matched data, excluding ‘Don’t Know’ and missing responses.
†Skilled provider is doctor (obstetrician/gynecologist, ob–gyn), nurse/midwife or medical resident.
‡Asked of mothers whose babies were breathing at birth.
§Newborn was placed against mother’s chest after delivery.
#Indicator constructed from two skin–to–skin items: (1) newborn placed against mother’s chest after delivery and (2) newborn was naked on skin (not wrapped in a towel).
|Respondent question||N||“Don’t know” (%)|
|Did the health provider(s) wash his/her hands with soap and water or use antiseptic before examining you?||662||29.5|
|Was your baby dried off with a towel or cloth immediately after his/her birth, within a few minutes of delivery?||660||8.3|
|(Of women who reported newborn was not placed against her chest immediately after delivery) Was your baby wrapped in a towel or cloth immediately after birth?||170||20.6|
|In your first physical examination after delivery, did a health provider do a perineal exam?||662||9.8|
Of the 41 indicators with at least five cases in each cell of the two–by–two table, four had both high individual reporting accuracy and low population–level bias ( Table 4 ). These were: the main provider during delivery was a nurse/midwife, a support companion was present during the birth, cesarean operation, and low birthweight infant (<2500 g). Receiving an episiotomy was close to meeting both criteria (AUC = 0.87, 95% CI = 0.83–0.89, IF = 1.26).
|Indicator||Total number||Observer prevalence (%)||Sensitivity (95% CI)||Specificity (95% CI)||Self–report survey estimate (%), based on sensitivity and specificity||AUC (95% CI)||IF||High accuracy (AUC>0.70) & low bias (IF 0.75–1.25)|
|Initial client assessment:|
|Woman referred to facility because of a problem||655||7.9||25.0 (14.0–38.9)||93.2 (90.9–95.1)||8.2||0.59 (0.55–0.62)||1.04||IF|
|Provider washes hands with soap and water or uses antiseptic before initial examination||467||26.6||83.9 (76.2–89.9)||32.9 (28.0–38.2)||71.5||0.58 (0.54–0.63)||2.69|
|Takes blood pressure||654||87.0||87.7 (84.9–90.2)||23.3 (11.8–38.6)||86.3||0.55 (0.52–0.59)||0.99||IF|
|Provider respectful care:|
|Woman allowed to drink liquids or eat||624||42.0||72.5 (66.7–77.8)||37.3 (32.3–42.5)||66.8||0.55 (0.51–0.59)||1.59|
|Encourages/assists woman to ambulate during labor||644||77.5||90.2 (87.2–92.6)||24.1 (17.4–31.9)||87.0||0.57 (0.53–0.61)||1.12||IF|
|Encourages/assists woman to assume different positions in labor||649||58.2||19.1 (15.2–23.4)||92.3 (88.4–95.1)||14.3||0.56 (0.52–0.59)||0.25|
|Woman allowed to have a support person during labor and delivery||648||9.1||23.7 (13.6–36.6)||92.7 (90.3–94.7)||8.8||0.58 (0.54–0.62)||0.97||IF|
|Support companion present during birth||644||4.8||48.4 (30.2–66.9)||98.5 (97.2–99.3)||3.7||0.73 (0.70–0.77)||0.77||Both|
|First stage of labor:|
|Induces labor with uterotonic||630||4.6||69.0 (49.2–84.7)||92.0 (89.6–94.1)||10.8||0.80 (0.77–0.84)||2.35||AUC|
|Augments labor with uterotonic||625||22.4||72.9 (64.7–80.0)||70.5 (66.2–74.5)||39.2||0.72 (0.68–0.75)||1.75||AUC|
|Uterotonic received (labor induction or augmentation)||619||27.1||78.0 (70.9–84.0)||69.0 (64.5–73.2)||43.8||0.73 (0.70–0.77)||1.61||AUC|
|Membranes ruptured (labor induction or augmentation)||650||42.3||4.0 (2.0–7.0)||97.6 (95.5–98.9)||3.1||0.51 (0.47–0.55)||0.07|
|Skilled birth attendance:|
|Skilled main provider† labor||649||92.6||90.5 (87.9–92.7)||16.7 (7.5–30.2)||90.0||0.54 (0.50–0.58)||0.97||IF|
|–Main provider labor nurse/midwife||649||92.1||81.1 (77.7–84.2)||27.5 (15.9–41.7)||80.4||0.54 (0.50–0.58)||0.87||IF|
|Skilled main provider† delivery||644||92.9||95.0 (92.9–96.6)||15.2 (6.3–28.9)||94.3||0.55 (0.51–0.59)||1.02||IF|
|–Main provider delivery doctor (ob–gyn)/medical resident||644||11.8||82.9 (72.5–90.6)||89.3 (86.4–91.7)||19.3||0.86 (0.83–0.89)||1.63||AUC|
|–Main provider delivery nurse/midwife||644||81.1||86.2 (82.9–89.0)||73.0 (64.2–80.6)||75.0||0.80 (0.76–0.83)||0.93||Both|
|Main provider delivery student nurse||644||4.8||16.1 (5.5–33.7)||98.5 (97.2–99.3)||2.2||0.57 (0.53–0.61)||0.45|
|Second and third stage labor:|
|Episiotomy performed||545||18.2||82.8 (73.9–89.7)||90.4 (87.2–92.9)||22.9||0.87 (0.83–0.89)||1.26||AUC|
|Uterotonic received following delivery of placenta||552||2.4||53.9 (25.1–80.8)||40.8 (36.6–45.1)||59.1||0.47 (0.43–0.52)||25.1|
|Immediate newborn postnatal care:|
|Baby given to mother immediately after birth||611||57.6||66.5 (61.3–71.4)||49.0 (42.8–55.3)||59.9||0.58 (0.54–0.62)||1.04||IF|
|Baby placed immediately skin to skin on mother||602||16.3||78.6 (69.1–86.2)||21.0 (17.6–24.9)||78.9||0.50 (0.46–0.54)||4.85|
|Baby placed immediately skin to skin on mother (2 item)†||596||16.2||26.8 (18.3–36.8)||70.3 (66.1–74.3)||29.2||0.49 (0.44–0.53)||1.80|
|Breastfeeding within first hour of birth||551||53.0||88.4 (84.1–91.8)||37.1 (31.2–43.3)||76.4||0.63 (0.59–0.67)||1.44|
|3 elements of essential newborn care (immediately dried, on mother’s skin, breastfed within first hour)||506||9.3||70.2 (55.1–82.7)||28.3 (24.2–32.7)||71.5||0.49 (0.45–0.54)||7.70|
|3 elements of essential newborn care (immediately dried, 2 item on mother’s skin‡, breastfed within first hour)||501||9.2||19.6 (9.4–33.9)||74.3 (70.0–78.2)||25.2||0.47 (0.42–0.51)||2.7|
|Low birthweight newborn (<2500g)||579||7.8||71.1 (55.7–83.6)||98.7 (97.3–99.5)||6.7||0.85 (0.82–0.88)||0.87||Both|
|Immediate postnatal care for mother:|
|Palpates uterus 15 min after delivery of placenta||557||70.2||88.8 (85.2–91.7)||12.7 (8.0–18.7)||88.3||0.51 (0.46–0.55)||1.26|
|First post–delivery exam, provider ask/checks for bleeding||627||90.6||59.9 (55.7–63.9)||17.0 (8.4–29.0)||62.0||0.38 (0.35–0.42)||0.68|
|First post–delivery exam, provider examines perineum||554||87.4||57.9 (53.3–62.3)||55.7 (43.3–67.6)||56.1||0.57 (0.53–0.61)||0.64|
|First post–delivery exam, provider takes temperature||638||40.3||88.1 (69.3–80.3)||50.1 (45.0–55.3)||60.0||0.63 (0.59–0.66)||1.49|
|First post–delivery exam, provider takes blood pressure||642||48.3||88.1 (85.1–92.5)||38.0 (32.3–43.5)||74.6||0.63 (0.59–0.67)||1.55|
|First post–delivery exam, provider checks for involution||615||78.2||64.0 (59.6–68.3)||35.1 (27.0–43.8)||64.2||0.50 (0.46–0.54)||0.82||IF|
|Woman asked for pain relief medication during stay||638||10.5||35.8 (24.5–48.5)||68.3 (64.3–72.1)||32.1||0.52 (0.48–0.56)||3.06|
|Woman received pain relief medication||640||17.5||85.7 (77.8–91.6)||46.2 (41.9–50.6)||59.4||0.66 (0.62–0.70)||3.39|
|Cesarean section (C/S) performed||651||1.4||93.1 (85.6–97.4)||98.8 (97.5–99.5)||13.5||0.96 (0.94–0.97)||1.01||Both|
|Reason for C/S– prolonged/obstructed labor||76||67.1||39.2 (25.8–53.9)||80.0 (59.3–93.2)||32.9||0.60 (0.47–0.70)||0.49|
|Complications (any):||654||11.0||62.5 (50.3–73.6)||57.4 (53.3–61.4)||44.8||0.60 (0.56–0.64)||4.07|
|–Hemorrhage||654||4.6||33.3 (17.3–52.8)||89.9 (87.3–92.2)||11.2||0.62 (0.58–0.65)||2.43|
|–Prolonged labor||654||3.7||50.0 (29.1–70.9)||77.3 (73.8–80.5)||23.7||0.64 (0.60–0.67)||6.46|
|–None||654||89.0||53.8 (49.6–57.9)||66.7 (54.6–77.3)||51.5||0.60 (0.56–0.64)||0.58|
CI – confidence interval; IF – inflation factor; AUC – receiver operating curve
Recommended indicators met both AUC and IF validation criteria.
*Validation analysis based on matched data, excluding ‘Don’t Know’ responses. Sensitivity and specificity analysis was not performed for indicators that had fewer than 5 counts per cell.
†Skilled provider includes doctor (ob–gyn), medical resident or nurse/midwife.
‡Indicator constructed from two skin–to–skin items: (1) newborn placed against mother’s chest after delivery and (2) newborn was lying naked against the mother’s chest.
A total of 8 indicators had high individual reporting accuracy (AUC>0.70), 7 had moderate accuracy (AUC>0.60), and 26 had low accuracy (AUC<0.60). Indicators with high AUC results reflected events leading up to (eg, induction or augmentation of labor, episiotomy) and during the birth itself (eg, cesarean section, main provider during delivery was a doctor or medical resident, main provider during delivery was a nurse/midwife, support person present during birth, low birthweight infant). Indicators with low value AUC results tended to be related to events immediately following the birth (eg, uterotonic received following delivery of the placenta) and postnatal health checks for the mother and newborn. For population–level bias, a total of 13 indicators had low bias (0.75<IF<1.25), 7 had moderate bias (0.5<IF<1.5), and 21 had large bias (IF<0.5 or IF>1.5). Indicators with large bias varied with respect to phase of labor and delivery, but those with the largest bias tended to have a low observed prevalence and be those that may require medical knowledge to report accurately (eg, experience of complications).
To assess women’s ability to recall the type of provider who attended them, respondents were asked, “Who was the main provider assisting you during delivery?” There were sufficient cell counts to assess two provider categories with robust analysis: nurse/midwife and student nurse. The nurse/midwife indicator met both the high AUC and low IF criteria while the student nurse indicator had low individual accuracy (AUC = 0.57) and large bias (IF = 0.45). An indicator constructed in analysis that combines responses of “doctor”, “medical resident” and “nurse/midwife” as “skilled attendants” had low individual accuracy (AUC = 0.55), primarily due to low specificity, and low population–level bias (IF = 1.02) . Cross–tabulation results that compare women’s reports to observers’ reports on their main provider during delivery suggest a tendency for medical residents and nurse/midwives to be misclassified by women as doctors ( Table 5 ).
|Self–report (number)||Observer report (number)|
|Doctor (obstetrician/gynecologist)||Medical resident||Medical intern||Nurse or midwife||Clinical officer||Student nurse||Other||Total|
To illustrate the implications of indicator properties established in this study for other contexts, we plot the values of the predicted survey prevalence (Pr) of select indicators across all possible levels of intervention coverage (from 0 to 100%). Figures 2 and 3 Figure 3 compare the predicted prevalence using the sensitivity and specificity calculated in this study (blue line), to perfect reporting accuracy assuming 100% sensitivity and specificity (black line) across all levels of coverage. Using the example of a high sensitivity and low specificity indicator such as “skilled birth attendance”, these data demonstrate that in a high coverage setting the estimated survey–based prevalence from women’s self–report more closely approximates the true prevalence while in low coverage settings, the estimated survey–based prevalence would greatly overestimate the true rate ( Figure 2 ). For example, in a setting where the true prevalence of skilled attendance is 40%, rather than the 93% observed in this study (the red triangle), the estimated survey–based prevalence would exceed the true prevalence by 50 percentage points. In contrast, an indicator with both high sensitivity and high specificity, such as “cesarean operation”, would generate a survey–based estimate that closely approximates the true prevalence across all coverage levels ( Figure 3 ).
This study provides validity results for 41 indicators of the quality of maternal and immediate newborn health care that are either currently in use or have the potential to be included in household surveys. Across phases of labor and delivery, we found indicators related to concrete, observable aspects of care or which reflected pain or concern were reported with higher accuracy. These results are consistent with previous studies which have found particularly memorable events, such as having a cesarean operation [11,22,29] and having a support person present , have high overall validity.
That a small number of the initial list of 82 indicators met both validation criteria is partly due to the fact that many preventative care interventions almost always occurred, and there was insufficient variation (ie, not enough cases in each cell) for robust analysis. For many preventative care indicators we found that most women accurately reported receiving the care (ie, high indicator sensitivity). For example, an indicator that is a proxy for receiving a uterotonic for the prevention of postpartum hemorrhage (ie, if an injection, IV medication or tablets were received in the first few minutes following birth), was accurately reported by nearly all women. Although the near universal implementation of this practice limited robust analysis, these results suggest some aspects of routine care can be accurately reported. However, given that there were few instances in which standard preventative interventions were not received, unless there was almost perfect negative classification by women, these indicators also tended to have low specificity. An alternate interpretation is that the observed pattern of high sensitivity and low specificity for many preventative practices may reflect “facility reporting bias” among women based on the expectation of receiving appropriate care. This finding has also been described in a study of women’s reporting of maternal and child health care in China .
The potential for facility reporting bias may also be relevant for indicators on skilled birth attendance. Indicators measuring the assistance of a skilled provider had high sensitivity and low specificity for both labor and delivery. Women tended to underreport the presence of less skilled providers, such as student nurses, and over–report the presence of a doctor or obstetrician/gynecologist. The positive bias may also be due to differences in how women conceptualized who their “main” provider was. It is possible that women understood their “main” provider to be the attendant with the highest rank and who may have been deemed ‘in–charge’ of her care, while observers identified the primary provider as the attendant who administered the majority of the care to the woman.
Study findings also suggest the validity of some indicators may be dependent on context and question wording. Indicators that performed worse on the validity tests tend to be related to the timing or sequence of events, such as whether the newborn was placed skin–to–skin on the mother’s chest immediately after delivery. A two–item question sequence that clarified the precise meaning of “skin–to–skin” greatly reduced women’s overestimation of the practice compared to a one–item indicator ( Table 4 ). These results are consistent with findings that women had difficulty reporting whether their newborn was placed skin–to–skin in a qualitative study of delivery and newborn care among women in Bangladesh and Malawi , but contrast with findings from a recent validation study in Mozambique . The mixed results may be attributable to a longer recall period in the Mozambique study.
An influential aspect of the birth context was the type of delivery. Women who had a cesarean operation were less likely to be able to report on immediate newborn care than women with normal deliveries. This is reflected in high levels of “Don’t Know” responses. This finding suggests that it may be worth excluding women with cesarean sections from questions about care immediately after birth in routine household surveys.
A number of indicators performed well on the IF test only; individual–level misclassification does not inherently signify that measurement at the aggregate level will be inaccurate . In studies where the goal is to estimate the approximate population–based coverage of an indicator, false positives and false negatives may balance out to produce a close approximation of population level coverage (ie, indicators that meet the IF criteria alone). Knowing if an indicator’s IF is large can inform when corrective methods may be needed to limit false positive reporting (eg, use of a two–item indicator).
Knowledge of whether an indicator is likely to be overestimated can also have significant programmatic implications. For example, where skilled birth attendance is over–reported, progress in scaling up the presence of higher cadre providers may not be as great as expected. It is important to recognize that the presence of a skilled provider is one aspect of receiving quality care, one which relies on the assumption that providers have received the necessary training to administer essential interventions and have acquired the competencies to address complications during childbirth. Additionally, even “skilled” providers may not be able to deliver adequate care if they do not have access to necessary equipment and supplies. Information on skilled attendance as reported by women should be corroborated with indicators on the content of care. When possible, we recommend that users also triangulate self–reported data on quality of care with other data sources such as information on stock–outs of essential medicines .
While a strength of this study is the use of direct observation as the reference standard, there are some limitations. Validation results are based on women seeking delivery in a large public hospital, and may not be generalizable to women who deliver in other types of facilities or at home. The lack of variation in hospital practices also limited the ability to analyze all of the indicators, which may have otherwise proven to be valid if we had collected data in a range of health institutions. Finally, our results inform a ‘best case’ scenario in terms of recall accuracy because women were interviewed shortly following delivery. To inform how recall changes over time, as well as to investigate women’s understanding of concepts such as who their main provider was, a follow–up study is under way to re–interview women one year after delivery.
The measurement of the quality of maternal and newborn health care received in LMIC settings often relies on data from surveys of women. Little research has examined the validity of these indicators. The primary indicator of interest in this study–delivery by a skilled birth attendant–met validation criteria for reporting at the population level only and the results indicate that reporting accuracy may be particularly problematic where skilled birth attendance coverage is low. Indicator properties established here provide insight into contexts where indicator use is appropriate, and where modifications to data collection procedures or question construction may be warranted.