Voluntary Medical Male Circumcision Scale-Up in Nyanza, Kenya: Evaluating Technical Efficiency and Productivity of Service Delivery

Background Voluntary medical male circumcision (VMMC) service delivery is complex and resource-intensive. In Kenya there is still a paucity of information on resource use vis-à-vis outputs as programs scale up. Knowledge of technical efficiency, productivity and potential sources of constraints is desirable to improve decision-making. Objective To evaluate technical efficiency and productivity of VMMC service delivery in Nyanza in 2011/2012 using data envelopment analysis. Design Comparative process evaluation of facilities providing VMMC in Nyanza in 2011/2012 using output-orientated data envelopment analysis. Results Twenty-one facilities were evaluated. Only 1 of 7 variables considered (total elapsed operation time) improved significantly, from 32.8 minutes (SD 8.8) in 2011 to 30 minutes (SD 6.6) in 2012 (95% CI = 0.0350–5.2488; p = 0.047). Mean scale technical efficiency improved significantly from 91% (SD 19.8) in 2011 to 99% (SD 4.0) in 2012, particularly among outreach compared to fixed service delivery facilities (CI -31.47959–4.698508; p = 0.005). The increase in mean VRS technical efficiency from 84% (SD 25.3) in 2011 to 89% (SD 25.1) in 2012 was not statistically significant. Benchmark facilities were #119 and #125 in 2011 and #103 in 2012. The Malmquist Productivity Index (MPI) at fixed facilities declined by 2.5% but gained by 4.9% at outreach ones by 2012. Total factor productivity improved by 83% (p = 0.032) in 2012, largely due to progress in technological efficiency by 79% (p = 0.008). Conclusions Significant improvement in scale technical efficiency among outreach facilities in 2012 was attributable to accelerated activities. However, ongoing pure technical inefficiency requires concerted attention. Technological progress was the key driver of service productivity growth in Nyanza. Incorporating service-quality dimensions and using stepwise multiple criteria in performance evaluation enhances comprehensiveness and validity. These findings highlight site-level resource use and sources of variation in VMMC service productivity, which are important for program planning.


Introduction
Service delivery of VMMC as an HIV intervention was rolled out in Kenya in 2008. Several features of the program motivate this evaluation: (i) it is characterized by a complex and resource-intensive delivery function, which has considerable implications for both technical and functional program outcomes [1,2]; (ii) its efficiency and productivity portend the program's impact on the HIV epidemic and on policy directions [3]; (iii) it requires wide and rapid coverage to realize the intended public health impact [2,4]; and (iv) resource allocation and use require objective information on both institutional and micro-level service delivery performance to enhance decision-making [1]. Hitherto, most studies on VMMC services have focused on program cost-effectiveness [4,5,6,7] and how it works [8,9,10,11]. The current study builds on existing knowledge of how the program works by examining the technical efficiency and productivity dimensions of service delivery in the Nyanza region, Kenya, to determine the extent of resource use by service facilities vis-à-vis selected outputs. The study results are critical to augmenting VMMC service delivery management solutions.
Service delivery is the key function of health systems, and it is defined as 'the way inputs are combined to allow provision of a series of interventions or health actions' to promote, restore or maintain health in an equitable manner. [12] The prevalent perspectives for evaluating service delivery consider: (i) the relationship between inputs (such as manpower and capital) available for service delivery and the outputs (including services, products, or technologies) that result from health care activities (productivity perspective); and (ii) performance of service delivery in terms of the health effects or status change resulting from the outputs (effectiveness perspective). [13] Technical efficiency and productivity of voluntary medical male circumcision (VMMC) services were evaluated based on the first perspective.
Technical efficiency measures the ability of a facility to produce the maximum quantity of program outputs for any given amount of inputs, or to use the minimum input levels for any given amount of outputs. Service productivity identifies 'the change in service output resulting from a unit change in the inputs' over time. [14] Service quality dimensions were considered central to the service delivery function and hence a key variable in identifying benchmark units (ideal performance units set on the basis of a sample of similar facilities and performance over time). [15,16] The conceptual framework for evaluating these measures encompasses: i) inputs (clinicians, nurses, surgical beds, surgical time); ii) process (structure such as tasks performed during circumcision); and iii) outputs (services including number of circumcisions accomplished, proportion of circumcised men receiving an HIV test, service quality). [17]

Efficiency, benchmarking and productivity evaluation of VMMC service delivery
Evaluation of a service delivery plan for VMMC involves several dimensions including inputs used, outputs generated and service quality. [18] Simultaneous consideration of multiple dimensions of service delivery provides a platform to demonstrate how resources are used in diverse contexts (in terms of input-output mix) among different producing units. Evaluation indicators would normally be designed in relation to the one or more dimensions selected. [19,20] When multiple dimensions are observed, composite indicators (defined as a combined metric that incorporates multiple individual measures to provide a single score) are preferred in order to: i) aggregate the input and output data into a single comprehensive measure of performance; ii) determine if the critical aspects of service delivery have been achieved.
Traditionally, program measures have been evaluated against absolute standards estimated as global average values, mainly focusing on controllable input variables such as staff and capital. The analysis may be based on 'best performance frontier' and/or 'central tendency' (average-based) techniques, although the two perspectives can potentially result in different improvement decisions. [21] Furthermore, there exist variants of both the frontier methods and regression analyses. Whether to prefer one method or a combination depends on the study context and objectives, data characteristics and user skills.
The frontier methods include non-parametric data envelopment analysis (DEA) and parametric stochastic frontier analysis (SFA). Both can be used to identify a production frontier for a group of facilities, but they employ different assumptions and methodologies. DEA methods use mathematical programming to obtain the production frontier enveloping all the observed data. Specifically, DEA estimates efficiency scores for each unit by comparing its input mix (normally the resources necessary to complete a task) and volume of services provided against the best performing peers in the set. In models assuming variable returns to scale, comparison is restricted to units of comparable size. The scores obtained depend on model characteristics and on the level of input variables used by the best performing facilities in terms of their output-to-input ratio. They reflect the performance of each facility relative to the best performing ones. The exact interpretation depends on the DEA model orientation used, whether output-maximizing or input-minimizing. Limitations of DEA include sensitivity to outliers, the assumption that the data contain no errors (which may bias results), and the fact that standard models do not permit hypothesis testing for the best model specification. [22] Conversely, stochastic frontier methods are parametric. Typically they accommodate only a single input with multiple outputs; can differentiate errors from inefficiency sources; require specification of a functional form; and permit computation of confidence intervals for efficiency scores and their best predictors for individual facilities. However, because the frontier is based on parameter estimates, it may not envelop all output points, and the method does not identify peers. [22] Regarding regression methods, least squares are used to define functional relationships between one dependent variable and one or more independent ones and to predict sources of variation. These methods estimate a single sample-based global average score and are amenable to hypothesis testing.
Increasingly, data envelopment analysis (DEA) is becoming instrumental in evaluating the efficiency of health service delivery, which is typically complex and multidimensional. Preference for DEA accrues from: (i) its capacity to integrate multiple input and output data of any measurement (both controllable and those beyond a provider's control) and dimension simultaneously [21] to produce a single aggregate relative "efficiency score" for each service unit. These scores, scaled to lie between 0 and 1 (0-100%), are relative measures estimated on the basis of the most favorable combination mix for each unit, in contrast to using an absolute standard; (ii) its ability to construct a 'best practice' frontier and simultaneously compare facilities to classify each unit most favorably; (iii) the absence of any need to include a cost variable or to model functional relationships between inputs and outputs; [23] (iv) its ability to identify the productivity of each unit individually, the sources of inefficiencies, and the benchmark peers ('peers' = units assigned a score of 100%) in the set together with their respective weights, which guide the improvements required for the less efficient units. However, it does not reveal how to accomplish the needed changes. Ideally, the improvement efforts prioritized by a manager for respective facilities should consider their practicality and feasibility. [24,25] In the current study, since efficient resource use and output maximization are the key objectives of VMMC service delivery, DEA-based output-orientated technical efficiency and the Malmquist productivity index (MPI) are used respectively to: i) demonstrate the extent of resource use by facilities to maximize VMMC service outputs; ii) measure total factor productivity change and identify sources of variation by estimating technical change and efficiency change between 2011 (low season and routine services) and 2012 (period of accelerated VMMC activities).
The Malmquist Productivity Index (MPI) is interpreted as a measure of total factor productivity change over time, and its components (efficiency change and technology change) provide insight into the sources of observed variations in VMMC target outputs. Essentially, it distinguishes productivity changes that are due to increased efficiency (catching up with best-practice facilities) from technological changes, e.g. service delivery strategies or techniques adopted. The efficiency change component is a product of scale and pure efficiency and shows the position of a facility relative to the frontier made up by "best practice" units. The technical change component measures how much the frontier shifts relative to comparable units. In either case, index values greater than, equal to, or less than one indicate improvement, stagnation or regress, respectively. Since MPI values are percentiles, they are expressed as geometric means.
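The decomposition described above is commonly written in terms of output distance functions. The block below is a sketch of that widely used form, not a formula given in this paper: the symbols are assumptions, with D_o^t denoting the output distance function measured against the period-t frontier and (x^t, y^t) the inputs and outputs of a facility in period t.

```latex
\[
M_o\!\left(x^{t+1},y^{t+1},x^{t},y^{t}\right)
 = \underbrace{\frac{D_o^{t+1}\!\left(x^{t+1},y^{t+1}\right)}{D_o^{t}\!\left(x^{t},y^{t}\right)}}_{\text{efficiency change}}
 \times
 \underbrace{\left[
   \frac{D_o^{t}\!\left(x^{t+1},y^{t+1}\right)}{D_o^{t+1}\!\left(x^{t+1},y^{t+1}\right)}
   \cdot
   \frac{D_o^{t}\!\left(x^{t},y^{t}\right)}{D_o^{t+1}\!\left(x^{t},y^{t}\right)}
 \right]^{1/2}}_{\text{technical change}}
\]
```

Under this convention a value above one in either factor signals improvement, which matches the interpretation of index values given in the text.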
A key benefit of the MPI is that it does not require information on the prices of inputs and outputs. Furthermore, calculation of this index requires no assumptions regarding the orientation of the organizations under analysis. [26,27] The strategic importance of using DEA techniques to evaluate the efficiency of medical male circumcision services is that multiple dimensions are assessed simultaneously; each unit is ranked according to its most favorable performance relative to similar ones in the set; the DEA-estimated frontier is a good approximation of the true underlying production possibilities; it provides guidelines for objective benchmarking and for setting production objectives for less efficient units; and it enables evaluation of productivity over time. Whereas the DEA outputs provide diagnostic performance information for a set of comparable service delivery units, it is desirable that management decisions further consider broader policy objectives such as service access and coverage, as well as prevailing exogenous factors.

Considerations for constructing DEA model
Variable selection. Although any set of variables may be chosen for the DEA model, the variables preferred for the current study closely reflect the organizational context and their functional relationships. [27,28,29,30,31] An arbitrary approach may exclude important performance variables [27,30]. Variables incorporated included clinicians, nurses, surgical beds and total elapsed circumcision time [32,33] as inputs, while the number of circumcisions, the proportion of circumcised men receiving HIV tests and quality dimensions [34] were outputs. The quality variable was constructed using principal component analysis (PCA) and exploratory factor analysis techniques. Fifteen process items that correlated highly with factor 1 (threshold conventionally set at 0.4) were identified as the critical quality measure items and were used to construct a composite index for scoring service quality per facility. Final index scores were obtained by averaging scores across the items assessed, with higher scores representing higher quality on a percentile scale [34].
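The item-selection and averaging steps can be illustrated as follows. This is a minimal sketch under stated assumptions: the item names, loadings and 0/1 item scoring are hypothetical, and the factor loadings are taken as given rather than estimated by PCA; only the 0.4 threshold and the averaging step come from the text.

```python
from statistics import mean

LOADING_THRESHOLD = 0.4  # cut-off for "high" correlation with factor 1, as quoted in the text


def select_items(loadings, threshold=LOADING_THRESHOLD):
    """Return the items whose absolute loading on factor 1 meets the threshold."""
    return [item for item, loading in loadings.items() if abs(loading) >= threshold]


def quality_index(item_scores, items):
    """Average the item scores over the selected items, expressed on a 0-100 scale."""
    return 100.0 * mean(item_scores[item] for item in items)


# Hypothetical factor-1 loadings for four process items (assumed, not study data).
loadings = {
    "consent_checked": 0.72,
    "hand_hygiene": 0.55,
    "signage": 0.12,
    "post_op_advice": 0.41,
}
# Hypothetical observations for one facility: 1 = item performed, 0 = not performed.
facility_scores = {
    "consent_checked": 1,
    "hand_hygiene": 1,
    "signage": 0,
    "post_op_advice": 0,
}

items = select_items(loadings)
index = quality_index(facility_scores, items)
print(items, round(index, 1))
```

Averaging only the retained items keeps weakly loading items, such as the hypothetical `signage`, from diluting the quality score.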
Model orientation. Clarifying model orientation provides information on how efficiency scores are derived and how they vary. Technical efficiency indicators may be either input-oriented or output-oriented, depending on which variable set the program managers control. Input-orientated technical efficiency focuses on minimizing the inputs used without reducing the output quantity, while output-orientated efficiency focuses on expanding output quantities while maintaining the current level of inputs. Technical efficiency (global TE) is a product of pure technical efficiency (PTE) and scale efficiency (SE). Pure TE is generally associated with the organization of operations of the specific service producing units (input-output mix), while scale efficiency depicts issues related to size and indicates whether the facility is too large or too small considering the inputs used to produce the observed outputs. Sources of inefficiency of a facility may thus be attributable to either or both of the components [35]. Scale efficiency of a service delivery unit may be examined under different model versions which make different assumptions about returns to scale: constant returns to scale (CRS), variable returns to scale (VRS) and non-increasing returns to scale (NIRS). [36] The scale efficiency (SE) is given by the ratio between the CRS and VRS technical efficiency scores. [23,37,38] Scale inefficiency (SE < 100%) may occur if the facility is not operating at its most productive (optimal) size in terms of its output-input mix, due to: i) increasing returns to scale (TE_VRS > TE_CRS); or ii) decreasing returns to scale (TE_VRS = TE_NIRS). [27,39] The VRS model allows the best practice level of outputs to inputs to vary with the size of the facilities assessed, whereas under CRS it is determined by the highest achievable ratio of outputs to inputs for each unit, regardless of size. Efficiency scores are identical when computed using input or output orientation under CRS but may vary under VRS. Scores obtained under VRS are higher than or equal to those under CRS, since they reflect only the technical inefficiency resulting from non-scale factors. [40,41]
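The relationship between the CRS score, the VRS score and scale efficiency can be made concrete with a small linear program. The sketch below is illustrative only and is not the software used in the study: it assumes NumPy and SciPy are available, uses the standard output-oriented envelopment form, and runs on four hypothetical units with one input and one output.

```python
import numpy as np
from scipy.optimize import linprog


def output_oriented_te(X, Y, o, vrs=True):
    """Output-oriented technical efficiency (between 0 and 1) of unit `o`.

    X is an (n_units, n_inputs) array and Y an (n_units, n_outputs) array.
    Solves: max phi  s.t.  sum_j lam_j * x_j <= x_o,  sum_j lam_j * y_j >= phi * y_o,
    lam >= 0, plus sum_j lam_j = 1 when vrs=True. The efficiency score is 1 / phi.
    """
    n, m = X.shape
    s = Y.shape[1]
    # Decision vector z = [phi, lam_1, ..., lam_n]; linprog minimises, so minimise -phi.
    c = np.zeros(n + 1)
    c[0] = -1.0
    input_rows = np.hstack([np.zeros((m, 1)), X.T])      # lam . x_i <= x_io
    output_rows = np.hstack([Y[o].reshape(s, 1), -Y.T])  # phi * y_ro - lam . y_r <= 0
    A_ub = np.vstack([input_rows, output_rows])
    b_ub = np.concatenate([X[o], np.zeros(s)])
    # Convexity constraint (sum of lambdas equals 1) is what distinguishes VRS from CRS.
    A_eq = np.hstack([[0.0], np.ones(n)]).reshape(1, -1) if vrs else None
    b_eq = np.array([1.0]) if vrs else None
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return 1.0 / res.x[0]


# Hypothetical data: four units, one input (e.g. staff) and one output (e.g. procedures).
X = np.array([[2.0], [4.0], [8.0], [4.0]])
Y = np.array([[1.0], [4.0], [6.0], [2.0]])

te_crs = [output_oriented_te(X, Y, o, vrs=False) for o in range(len(X))]
te_vrs = [output_oriented_te(X, Y, o, vrs=True) for o in range(len(X))]
scale_eff = [crs / vrs for crs, vrs in zip(te_crs, te_vrs)]  # SE = TE_CRS / TE_VRS
print(np.round(te_crs, 3), np.round(te_vrs, 3), np.round(scale_eff, 3))
```

In this toy data set the first unit is efficient under VRS but not under CRS, so its inefficiency is entirely a matter of scale, whereas the last unit has the same score under both assumptions and is therefore purely technically inefficient.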

Ethics statement
The project was approved by the Ethics Review Committee of the Kenya Medical Research Institute. Written informed consent was obtained from each participant in the study. Academic approval was obtained from Maseno University.

Study design and setting
Using a comparative process evaluation of voluntary medical male circumcision (VMMC) scale-up in Nyanza, site-level data were collected among randomly sampled facilities providing VMMC services as fixed, outreach and mobile sites (15/12/3) during two rounds of the Systematic Monitoring of Medical Male Circumcision Scale-up (SYMMACS) in 2011 and 2012. [9] The first round was conducted during the low season, while round two occurred during the peak season with accelerated activities. Assessment of service tasks performed, availability of guidelines, supplies and equipment, and continuity of care was conducted using modified national VMMC monitoring instruments.

Sample size for DEA
Of all facilities observed, only 9 fixed and 12 outreach VMMC facilities meeting the model requirements were included in the current study. The following recommendations regarding sample size requirements for performing DEA were considered from the literature [42,43]: i) the sample should be larger than the product of the number of inputs and outputs; ii) it should be at least 3 times the sum of the number of inputs and outputs. Given 4 input and 3 output variables, the minimum sample size would be at least 12 (4 × 3) based on (i) or 21 [3 × (4 + 3)] based on (ii). [42] Given these considerations, 21 VMMC facilities were considered a sufficient sample.
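The two rules of thumb reduce to simple arithmetic, sketched below for the 4 inputs and 3 outputs used here. The function name is hypothetical; the thresholds are the ones quoted in the paragraph above.

```python
def dea_sample_thresholds(n_inputs, n_outputs):
    """Return the thresholds implied by the two rules of thumb quoted in the text."""
    product_rule = n_inputs * n_outputs           # rule (i): inputs x outputs
    triple_sum_rule = 3 * (n_inputs + n_outputs)  # rule (ii): 3 x (inputs + outputs)
    return product_rule, triple_sum_rule


product_rule, triple_sum_rule = dea_sample_thresholds(n_inputs=4, n_outputs=3)
n_facilities = 21
# Rule (i) asks for a sample larger than the product; rule (ii) for at least the triple sum.
sample_ok = n_facilities > product_rule and n_facilities >= triple_sum_rule
print(product_rule, triple_sum_rule, sample_ok)
```

With 21 facilities the sample sits exactly on the stricter threshold, so adding any further input or output variable would breach rule (ii).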

Choice of model input-output variables
Hacer and Ozcan [44] recommend specifying multiple outputs rather than one to reduce measurement error occasioned by varied input requirements, although the effects of within-group homogeneity or between-group heterogeneity should be similarly considered. Bessent and Bessent [45] have proposed criteria for identifying relevant input and output variables to ensure DEA performance remains robust [

Rationale for the performance model
In performing DEA, selecting an appropriate variable set and specifying the model form and orientation are methodological necessities to ensure that results are comprehensive and robust. Variables included in the DEA model (Table 1) were considered most critical to the circumcision process. [46] The number of circumcisions, surgical beds in use and uptake of pre-operative HTC were considered to be outside the control of providers (exogenous factors) since they depend on demand for VMMC. Consequently, the maximum possible increase in outputs by facility was estimated while keeping the inputs and exogenously fixed outputs at their current levels. As demonstrated by Banker and Morey [47], this consideration allows non-discretionary variables to influence the relationship between inputs and outputs, but the "efficiency score" is not affected by them (since they are considered fixed and out of the control of the providers). It also improves comparability of units in the set and enhances opportunities for identifying the target increases in the controllable variables required for the facilities to be efficient.
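One way to write the model described above is the following output-oriented program with variable returns to scale, in which only the discretionary outputs are scaled by the expansion factor. This is a sketch under assumptions, since the paper does not state the formulation explicitly: the symbols are introduced here for illustration, with o the facility under evaluation, x_ij and y_rj the inputs and outputs of facility j, D the set of discretionary outputs and ND the set of exogenously fixed outputs.

```latex
\[
\begin{aligned}
\max_{\phi,\,\lambda}\quad & \phi \\
\text{s.t.}\quad
 & \sum_{j=1}^{n} \lambda_j x_{ij} \le x_{io}, && i = 1,\dots,m \\
 & \sum_{j=1}^{n} \lambda_j y_{rj} \ge \phi\, y_{ro}, && r \in D \\
 & \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{ro}, && r \in ND \\
 & \sum_{j=1}^{n} \lambda_j = 1, \qquad \lambda_j \ge 0, && j = 1,\dots,n
\end{aligned}
\]
```

The efficiency score is then 1/\phi^{*}. Because \phi multiplies only the outputs in D, the fixed outputs constrain which peers are admissible without entering the score itself.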

Model orientation
We assumed an output orientation with variable returns to scale since the program aims to maximize outputs within constrained resources. VMMC facility size (in terms of number of clinical staff and beds used) was deemed relevant to assessing relative efficiency. At the same time, bed space, number of staff, uncertain service demand and other exogenous constraints were likely to cause VMMC facilities to operate at suboptimal capacities. [2] In these circumstances, the VRS assumption ensures that a facility is compared only against others of similar size (based on number of staff and beds). [23,42] Other model versions were computed to elicit the marginal productivity of service units under different assumptions. The efficiency scores obtained indicate the extent of input use for the maximum possible outputs obtained with given unit sizes. [42]

Malmquist productivity index (MPI)
A productivity index measures how output changes with the level of inputs used between two time periods (t, t+1). Values indicate shifts in productivity for each production unit relative to (towards, along or away from) the observed frontier. [27] An index value >1 implies productivity growth, a value <1 shows productivity decline, and a value of 1 indicates stagnation. Thus production quantities and technological best practice can be shown to be improving, deteriorating or unchanging over time. The Malmquist productivity index is one of the methods commonly used to assess productivity changes over time. It identifies sources of productivity change in terms of: i) technical change (associated with variations in the quantity and quality of labor and capital, for example clinical staff skills by cadre and bed space); ii) pure efficiency change (associated with variations in context or organizational approach, largely of labor and capital inputs, including compliance with VMMC treatment protocols and referrals, support supervision, and availability of supplies); and iii) scale efficiency change (which measures productivity changes attributable to variation in unit size, for example staff mix and work responsibilities, work space and logistics). Components ii and iii together constitute the overall efficiency change. If there is improved use of resources, the service unit's position will move towards the frontier, indicating a positive efficiency gain. [48] The Malmquist productivity index was estimated using the Ray and Desli (1997) method as presented in Cooper et al., 2007 [21] to account for scale efficiency change effects as the output mix varied over time with changes in the number of staff and surgical beds used. [49] The average efficiency changes between the two time periods considered are represented by geometric means to normalize values because multiple items with different properties are involved.
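The index arithmetic can be illustrated with a short calculation. This is a sketch under assumptions: it uses the common two-factor decomposition (efficiency change times technical change), not the Ray and Desli variant applied in the study, and the distance-function values and second index are hypothetical numbers chosen for readability.

```python
import math


def malmquist(d_t_t, d_t_t1, d_t1_t, d_t1_t1):
    """Decompose the Malmquist index from four output distance-function values.

    d_a_b is the distance of period-b data measured against the period-a frontier
    (t = first period, t1 = second period).
    """
    efficiency_change = d_t1_t1 / d_t_t
    technical_change = math.sqrt((d_t_t1 / d_t1_t1) * (d_t_t / d_t1_t))
    return efficiency_change, technical_change, efficiency_change * technical_change


def geometric_mean(values):
    """Geometric mean, the appropriate average for ratio-type indices."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical distance-function values for one facility.
ec, tc, mpi = malmquist(d_t_t=0.8, d_t_t1=1.2, d_t1_t=0.6, d_t1_t1=0.9)
# Hypothetical MPI of a second facility, averaged with the first.
average_mpi = geometric_mean([mpi, 0.96])
print(round(ec, 3), round(tc, 3), round(mpi, 3), round(average_mpi, 3))
```

The geometric mean is used because the indices are multiplicative: a gain of 50% followed by a loss of 4% compounds to a factor of 1.44, whose square root is the per-unit average.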

Weighting considerations
No a priori weight restrictions were imposed on the variables.

Identification of peers
Based on model specifications with exogenous factors fixed, conventional DEA efficiency evaluation was performed for all VMMC facilities simultaneously, and a reference set of efficient units (peers) was identified using a two-stage process to ensure that the peers selected combined high quality with high efficiency. The procedure also identified the potential changes required to make each inefficient unit as efficient as the most efficient (best-practice) ones on the frontier [23]. (Table 3).

Technical efficiency scores
Efficiency scores by service delivery type. (Table 4). Table 5 shows the output-oriented VRS results for inefficient facilities and their peers, with the respective combination weights in parentheses. These show projected production options that would enable the inefficient facilities to reach relative efficiency. Facilities identified as peers were #111, 119, 121, 129 and 125 in 2011, and 103 and 101 in 2012. The reference facilities #129 and 111 had high technical efficiency scores but low quality scores (50 and 55 respectively). We repeated the DEA model excluding these 2 low-quality units to enhance the probability of obtaining only high efficiency-high quality peers, following Sherman and Zhu (2006) [50] and Shimshak, Lenard and Klimberg (2009) [51]. The resulting new reference units are shown in Table 6.
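The role of the combination weights can be shown with a small calculation: the projected target for an inefficient facility is the weighted sum of its peers' outputs. The sketch below uses hypothetical peer names, weights and output values, not figures from Table 5.

```python
def projected_target(peer_weights, peer_outputs):
    """Combine peer outputs using the DEA combination weights (lambdas)."""
    output_names = next(iter(peer_outputs.values())).keys()
    return {
        name: sum(weight * peer_outputs[peer][name] for peer, weight in peer_weights.items())
        for name in output_names
    }


# Hypothetical weights for one inefficient facility; under VRS they sum to 1.
peer_weights = {"peer_a": 0.75, "peer_b": 0.25}
# Hypothetical observed outputs of the two peers.
peer_outputs = {
    "peer_a": {"circumcisions": 200, "quality_score": 80},
    "peer_b": {"circumcisions": 400, "quality_score": 90},
}

target = projected_target(peer_weights, peer_outputs)
print(target)
```

A larger weight means the inefficient facility resembles that peer more closely, which is why the weights help indicate which benchmark is the most relevant model for improvement.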

Technical efficiency
The observed technical efficiency results suggest that, given the quantity of inputs they consumed, the facilities could have produced 16% more output in 2011 and 11% more output in 2012. The observed distribution of technical efficiency scores under VRS (which expresses only    On the other hand, the pure technical inefficiency elicited under VRS could be related to largely unsatisfactory performance of tasks, including compliance with standard guidelines for service delivery. Additionally, among fixed facilities, observed inefficiency could be associated with dynamic contexts, inelastic obligatory institutional requirements and personnel factors that adversely affect technical efficiency. [9,52] In previous DEA evaluations of health care delivery at various delivery tiers in Kenya, Kirigia and colleagues [29,31] demonstrated that technical inefficiencies were largely associated with unexploited resources. The present study similarly highlights the critical importance of resource use in VMMC service delivery. Hence, it is recommended that program supervisors include management solutions in planning their routine operations.
Overall, the DEA technique is a particularly useful tool for first-line evaluation to furnish vital diagnostic information on VMMC facility performance. However, since it cannot generally identify the 'causes' of inefficiency with the precision a manager would need in order to take decisive action, additional investigation of the improvement needs identified, using other methodologies, is necessary. [53]

Benchmarking
In DEA-based benchmarking, the performance of each unit is assessed against the efficient frontier or best practice units in the sample, as opposed to an 'average' or 'central tendency' analysis. In the current study, the benchmark facilities were all in the fixed facility category. This could be attributed to unique and diverse experiences among outreach service categories in terms of size and operational dynamics. This implies that when planning improvement efforts based on benchmarking, managers need to consider the contextual needs of facilities and other occult causes of inefficiency unique to them, regardless of their position relative to the frontier.
Stratifying facilities using multiple criteria in a stepwise approach [54] improves the precision of the DEA benchmarking exercise, as observed in this study. Inclusion of a service quality variable in DEA benchmarking in the current study enhanced evaluation comprehensiveness and balance, similar to previous studies [51,55]. However, Sherman and Zhu (2006) observed an efficiency/quality trade-off when benchmarking with quality-adjusted DEA to seek lower-cost, high-quality service in the banking industry. Shimshak et al., (2008) recommend that "the choice of quality output measures be appropriately related to the input measures" [51] to improve compatibility with the objectives of the DEA model.

Productivity measures and sources of variation in VMMC service delivery
The present study has used DEA to identify the scope of technical inefficiency and whether it is pure (context- or organization-related), technology-related, or scale-related. The main driver of the productivity increase was technical change, largely related to accelerated program activities in 2012 that enabled facilities to expand their production possibilities. For example, improved 'speed'/experience in performing circumcisions enabled providers to perform more procedures without additional staff or bed space as inputs. The significant progress in technical efficiency change, especially that related to pure efficiency change among outreach facilities, suggests that the majority were versatile and aptly exploited their production resources and possibilities to expand productivity; hence the importance of flexible program strategies. However, the modest progress in scale efficiency change indicates that facility size was not a major source of the improved productivity observed. In 2011, the majority of facilities did not exhibit optimal productive unit size. The decline in factor productivity among fixed VMMC facilities was attributable mainly to regress in technical change, technical efficiency change and pure efficiency change, which reflects the probable influence of operating environments, staff skills and other institutional management factors. These facilities face challenges in adjusting optimally to variations in service demand due to inelasticity in obligatory resources, especially those related to personnel issues, supplies and theatre space. [56] Consequently, Ministry of Health policies and implementing organizations could seek to emphasize improvement of the operational contexts of fixed facilities through strategic resource allocation and investment in staff skills. This is more critical when considering mainstreaming VMMC for long-term sustainability. However, the outreach service delivery model remains strategic for efficient resource use. [38,57]

Study limitations
Since DEA technical efficiency scores have an unknown statistical distribution, and the CRS, VRS and scale efficiency scores may be skewed, the statistical inferences should be interpreted with caution. DEA assumes all errors are due to inefficiency, and its estimates are sensitive to outliers.