A four-stage DEA-based efficiency evaluation of public hospitals in China after the implementation of new medical reforms

This study applied the non-parametric four-stage data envelopment analysis method (Four-Stage DEA) to measure the relative efficiencies of Chinese public hospitals from 2010 to 2016, and to determine how efficiencies were affected by eight factors. A sample of public hospitals (n = 84) was selected from Chongqing, China, including general hospitals and traditional Chinese medicine hospitals graded level 2 or above. The Four-Stage-DEA method was chosen since it enables the control of the impact of environment factors on efficiency evaluation results. Data on the number of staff, government financial subsidies, the number of beds and fixed assets were used as input whereas the number of out-patients and emergency department patients and visits, the number of discharged patients, medical and health service income and hospital bed utilization rate were chosen as study outputs. As relevant environmental variables, we selected GDP per capita, permanent population, population density, number of hospitals and number of available sickbeds in local medical institutions. The relative efficiencies (i.e. technical, pure technical, scale) of sample hospitals were also calculated to analyze the change between the first stage and fourth stage every year. The study found that Four-Stage-DEA can effectively filter the impact of environmental factors on evaluation results, which sets it apart from other models commonly used in existing studies.


Introduction
In March 2009, the Chinese government formally launched healthcare reform as a long-term goal. China's investment in health infrastructure has increased significantly, resulting in the primary medical and health service systems being strengthened. Accessible medical insurance coverage has been achieved in a relatively short period of time and basic medical care and health services are available to all. In addition, the proportion of total health costs covered by patients has decreased, and the equalization of basic public health services has continued to advance. Subsequently, rates of child and maternal mortality and infectious disease morbidity have been significantly reduced, and health levels and life expectancies of Chinese residents have been significantly improved.
As is the case with many other countries, China faces many challenges in the reform and development of its medical and health services. Chronic diseases such as cancer, diabetes and heart disease have become major health threats as the population ages. Moreover, according to one projection, total health expenditure (THE) in China is expected to increase from 3.53 trillion yuan in 2014 to 15.805 trillion yuan in 2035, with a phased average annual growth rate of 8.4%. More importantly, health expenditure as a proportion of GDP is expected to rise from 5.6% in 2014 to more than 9% in 2035, with more than 60% of the incremental health expenditure attributable to the increase in hospital services [1]. Therefore, efficiency evaluation should pay attention to hospital efficiency, especially that of public hospitals at level 2 or above. China's health institutions are designated as primary, secondary or tertiary institutions based on their ability to provide medical care and medical education and to conduct medical research. A primary hospital is typically a township hospital with fewer than 100 beds, tasked with providing preventive care, minimal health care and rehabilitation services. Secondary hospitals tend to be affiliated with a medium-sized city, county or district and contain more than 100 but fewer than 500 beds. They are responsible for providing comprehensive health services, as well as medical education and research on a regional basis. Tertiary hospitals round out the list as comprehensive or general hospitals at the city, provincial or national level with a bed capacity exceeding 500. They are responsible for providing specialist health services, play a larger role in medical education and scientific research, and serve as medical hubs providing care to multiple regions [2].
According to economic type or sources of funding, Chinese hospitals can also be divided into public hospitals and private hospitals and the former are the most important component of health systems and account for a large share of health expenditure in China.
The evaluation of the effectiveness of public hospital reform is important because it helps the government and hospital managers understand the current state of the reform process and take targeted measures. The process can help measure the effects of public hospital reform in China, and can assist in selecting an appropriate evaluation method and indicators and in determining how to control or eliminate the influence of environmental factors on the outcome. Data envelopment analysis (DEA) is a multiple-input, multiple-output nonparametric evaluation method that has been widely employed, particularly in recent years, to estimate the relative efficiencies of public hospitals, including overall (technical) efficiency, pure technical efficiency and scale efficiency. The Four-Stage-DEA method [3][4][5] is mostly based on the common variants of DEA (e.g., C²R, BC² and Malmquist), which have been used in some previous studies as three-stage DEA [6][7][8]. Most of those studies focused on analyzing the influence of performance metrics on results; external factors were first incorporated by Zhang and Liu (2013) [9]. However, the findings of previous studies were not effectively translated into decision-making practice because environmental variables were not effectively eliminated or controlled. Since the objectivity of the evaluation results can be questioned, it is doubtful that past evaluation results can be adopted by government regulators and hospital managers.

Data source
To assist medical institutions in keeping track of information on health resources, the Chongqing Health and Family Planning Commission collects data from city- and county-level medical institutions and hospitals on an annual basis, compiles the data into the Collection of Major Metrics of Some Medical Institutions in Chongqing and releases these data to the city's medical agencies. In the present study, the input-output data were derived from the collections released during 2010-2016, along with information published on the official websites of the Chongqing Health Information Center and the 84 public hospitals. The data for the environmental variables were acquired from the Chongqing Statistical Yearbooks for 2011-2017. To guarantee homogeneity of the evaluated hospitals, only general (comprehensive) hospitals and traditional Chinese medicine hospitals graded level 2 or above under the Chinese hospital classification were selected. An additional reason behind this selection is that, according to a previous investigation, over 80% of traditional Chinese medicine hospitals now adopt Western medical technologies. Hospitals with incomplete data over the study period were excluded; notable examples were the People's Hospital in Gaoxin district and University-Town Hospital of Chongqing Medical University. In addition, hospitals governed by a different administrative area, such as the Shuangqiao and Wansheng hospitals, were also excluded. A total of 84 hospitals that satisfied all the inclusion criteria were selected and sorted according to the Collection of 2016. These hospitals were sequentially labeled H1 to H84.

Methodology
The input-output metrics and the environmental variables were descriptively analyzed using Excel 2010. A database was constructed in compliance with the requirements of DEAP 2.1, and Four-Stage DEA was used for analysis.

Stage 1: "Coarse" efficiency calculation. With DEAP 2.1, the DEA value of the 84 hospitals and the slack of each input were computed using the input-oriented DEA-BC² model. The means of integrated (technical) efficiency, pure technical efficiency and scale efficiency, along with their trends, were then analyzed. Finally, the frequency distribution of hospital efficiencies was determined (100%, 80-99.9%, 60-79.9%, < 60% [10]).

Stage 2: Tobit regression analysis. With the input slack obtained from Stage 1 as the dependent variable and the environmental factors as the independent variables, I Tobit regression models were constructed, where I denotes the number of input elements. The Tobit model-based analysis was performed in Stata 14. The regression model is:

$$S_{ik} = \alpha_i + \beta_i Z_{ik} + U_{ik}$$

where $S_{ik}$ denotes the total slack of the $i$-th input element of the $k$-th hospital obtained from the classic DEA method, $Z_{ik}$ denotes the vector of exogenous environmental variables, $\alpha_i$ denotes the constant, $\beta_i$ denotes the vector of coefficients to be estimated and $U_{ik}$ denotes the error term.

Stage 3: The original input quantities were adjusted in Excel 2010 using the fitted values $\hat{S}_{ik}$ from the second stage and their maximum. The method can be formulated as:

$$x_{ik}^{adj} = x_{ik} + \left[ \max_{k} \hat{S}_{ik} - \hat{S}_{ik} \right]$$

where $\max_{k} \hat{S}_{ik} - \hat{S}_{ik} = 0$ indicates that the input quantity is not adjusted, and $\max_{k} \hat{S}_{ik} - \hat{S}_{ik} > 0$ means that the hospital operates in a comparatively congenial external environment, so that its original input is increased and its DEA efficiency decreases. This is because the largest fitted value corresponds to the decision-making unit that currently operates under the worst external environment. Adjusting every unit to that environment reduces the high efficiency that is obtained from congenial factors [4,11].

Stage 4: The adjusted relative efficiency was computed from the adjusted input quantities and the original outputs using DEAP 2.1 and the DEA-BC² model. The efficiency and frequency distributions of hospital efficiency were compared with those of the first stage.
After the four stages were completed, the influence of the environmental factors on efficiency could be effectively eliminated or alleviated, making the evaluation result more objective.
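As an illustration of the efficiency calculation used in Stages 1 and 4, the input-oriented envelopment model can be written as a small linear program. The sketch below is not DEAP 2.1: it assumes SciPy's `linprog` as the solver, the function name `dea_input_oriented` is hypothetical, and the four-unit data set is invented for demonstration. It returns only the radial score and does not compute the input slacks that feed Stage 2.

```python
import numpy as np
from scipy.optimize import linprog


def dea_input_oriented(X, Y, vrs=True):
    """Input-oriented DEA scores, one per decision-making unit (DMU).

    X: (n_units, n_inputs) inputs; Y: (n_units, n_outputs) outputs.
    vrs=True  -> BCC model (pure technical efficiency).
    vrs=False -> CCR model (technical efficiency).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    for o in range(n):
        # Decision variables: [theta, lambda_1, ..., lambda_n]; minimise theta.
        c = np.zeros(n + 1)
        c[0] = 1.0
        # Input constraints: sum_j lambda_j * x_ij - theta * x_io <= 0
        A_in = np.hstack([-X[o].reshape(-1, 1), X.T])
        # Output constraints: -sum_j lambda_j * y_rj <= -y_ro
        A_out = np.hstack([np.zeros((s, 1)), -Y.T])
        A_ub = np.vstack([A_in, A_out])
        b_ub = np.concatenate([np.zeros(m), -Y[o]])
        A_eq = b_eq = None
        if vrs:
            # Convexity constraint of the BCC model: sum_j lambda_j = 1
            A_eq = np.concatenate(([0.0], np.ones(n))).reshape(1, -1)
            b_eq = np.array([1.0])
        bounds = [(None, None)] + [(0, None)] * n
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                      bounds=bounds, method="highs")
        scores[o] = res.x[0]
    return scores


# Toy data (hypothetical): four units, one input, one output.
X = [[2.0], [4.0], [8.0], [6.0]]
Y = [[1.0], [2.0], [3.0], [2.0]]
te = dea_input_oriented(X, Y, vrs=False)   # technical efficiency (CCR)
pte = dea_input_oriented(X, Y, vrs=True)   # pure technical efficiency (BCC)
se = te / pte                              # scale efficiency
```

Running the same function once on the original inputs and once on the Stage 3 adjusted inputs would give the "coarse" and "net" scores that the study compares.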

Selection of metrics
Input/output metrics and environmental variables for DEA analysis
Considering that the 84 public hospitals are rated at level 2 or above, the input/output metrics and the eight external environmental variables for the current study were determined through bibliometric analysis and our previous quantitative research [12][13][14][15][16][17]. Their definitions and data sources are given in Table 1.

Descriptive analysis of input/output metrics and environmental variables
Input/output metrics and environmental variables were first descriptively analyzed before evaluating relative efficiency via Deap2.1.
Analysis of input/output metrics: Excluding sickbed utilization, the mean of the remaining seven metrics increased for seven years consecutively. Excluding financial investment and sickbed utilization, the maximum values of the remaining six metrics also increased on an annual basis. The minimum value of five metrics, excluding manpower, financial subsidy and sickbed utilization, increased annually, as shown in Table 2.
Analysis of environmental variables: Excluding hospital density, the mean and maximum values of the remaining seven metrics increased annually, and excluding hospital and population density, the maximum values of the remaining six metrics also increased on an annual basis, whereas the minimum values of the permanent population and population density decreased. This can be attributed to the fact that this city's population is concentrated in the major downtown and economically developed regions, resulting in annual decreases in the populations of remote counties. The minimum value of hospital density varied irregularly, whereas the minimum value of the remaining five metrics increased on an annual basis, as shown in Table 3.

Table 1 (partial). Definitions and data sources of the metrics
Manpower at the end of each year. Data source: the size of manpower at the end of each year = medical service revenue / revenue generated by each worker.
Government financial subsidies (10,000 RMB). Definition: the total financial and business funds (including fixed and targeted grants) obtained by the hospital from its supervising authorities or sponsors. Data source: financial subsidy = total revenue × the proportion of financial subsidy in hospital revenue.
Number of beds. Definition: the actual and fixed number of hospital sickbeds (those not occupied) at year end, including regular sickbeds, bunk beds, monitoring beds, sickbeds under disinfection and repair and out-of-service sickbeds due to expansion or major renovation. Data source: direct extraction.
Fixed assets (10,000 RMB). Definition: tangible assets that have been used for more than one year and whose amount and specific standards have not changed.
Medical and health service income. Definition: revenue obtained by a medical and healthcare organization during its operations, including income from patient occupancy, diagnosis, examination, treatment, surgery, laboratory tests, nursing and other income.
Hospital bed utilization rate (%). Definition: the percentage of sickbeds occupied per day. Data source: direct extraction.
Number of available sickbeds in local medical institutions. Definition: the actual and fixed number of sickbeds (those that have not been occupied) in the area at year end. Data source: direct extraction.
Number of medical staff in local regions. Definition: professional medical staff such as practicing physicians, practicing assistant physicians, registered nurses, pharmacists, laboratory technicians, imaging technicians, health supervisors and trainees (pharmacists, nurses and technicians).
Licensed and assistant doctors in local regions. Definition: doctors who are rated as "practicing physicians" or "practicing assistant physicians" and who are engaged in medical treatment and preventive healthcare work.
Licensed and assistant nurses in local regions. Definition: nurses who have a registered nurse certificate and are engaged in practical nursing work.

Results of DEA analysis for Stage 1
Table 4 shows the relative efficiency and frequency distribution of the 84 public hospitals in Chongqing. It can be seen that although the mean of relative efficiency of the samples was high (> 0.862), the technical, pure technical and scale efficiencies displayed different characteristics.
Technical efficiency. Technical efficiency is a comprehensive measure of how well an evaluated hospital allocates and uses its resources. As shown in Table 4, the maximum value was constantly 1.000 whereas the others varied irregularly. The mean technical efficiency of the sampled hospitals increased during 2010-2013, decreased in 2014 and then rose again in 2016. The minimum value of technical efficiency peaked at 0.535 in 2012. In terms of the frequency distribution of hospital efficiency, relatively efficient hospitals (100%) accounted for about one third of the total sample, whereas over 35% of the sample fell in the range of 80-99.9% and around 20% of the hospitals were in the range of 60-79.9%. H7 showed efficiencies consistently below 60%. These findings contrasted with the previous conclusion of annual increases for most metrics, prompting us to further explore the reasons behind relative inefficiency.
Pure technical efficiency. Pure technical efficiency is influenced by hospital management and technologies. It can be inferred from Table 4 that the pure technical efficiency of the samples was slightly higher than the technical efficiency for seven consecutive years, varying in a way similar to the technical efficiency. Note that for each of those seven years, the number of hospitals with relatively high pure technical efficiencies was consistently higher than the corresponding number for technical efficiency, although it varied irregularly. This finding also contradicts the previous conclusion of annual increases for most metrics.
Scale efficiency. Scale efficiency is influenced by hospital size. As shown in Table 4, the mean of scale efficiency fluctuated within a small range (0.956-0.975), and the minimum value fluctuated in the range of 0.585-0.752. For over 99.95% of the hospitals, the efficiencies were 100% or in the range of 80-99.9%. Only one hospital had an efficiency below 60% in 2013.
Considered together, Stage 1 analysis indicated that the mean technical efficiency of the 84 sampled hospitals was high, but the mean of scale efficiency was obviously higher than the mean of pure technical efficiency. This means that without considering external factors, the scale of sampled hospitals is suitable. Government departments and hospital management personnel should improve the relative efficiencies of hospitals by improving management and technology.
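The relationship between the three efficiencies and the frequency bands used in Table 4 can be sketched in a few lines. The code below relies on the standard DEA decomposition (technical efficiency = pure technical efficiency × scale efficiency) and on the bands named in the Methodology section. The function names are hypothetical, and rounding to three decimals before banding is an assumption that mirrors the three-decimal scores quoted in the text.

```python
import numpy as np


def scale_efficiency(te, pte):
    """Scale efficiency from the decomposition TE = PTE x SE."""
    return np.asarray(te, dtype=float) / np.asarray(pte, dtype=float)


def efficiency_bands(scores):
    """Count units in the bands 100%, 80-99.9%, 60-79.9% and <60%."""
    # Assumption: scores are rounded to three decimals before banding.
    s = np.round(np.asarray(scores, dtype=float), 3)
    return {
        "100%": int(np.sum(s >= 1.0)),
        "80-99.9%": int(np.sum((s >= 0.8) & (s < 1.0))),
        "60-79.9%": int(np.sum((s >= 0.6) & (s < 0.8))),
        "<60%": int(np.sum(s < 0.6)),
    }


# Hypothetical scores for five hospitals.
bands = efficiency_bands([1.0, 0.95, 0.8, 0.7, 0.55])
```

Comparing the band counts of each efficiency measure year by year gives the kind of frequency distribution discussed above.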

Result of Tobit analysis for Stage 2
Using the eight environmental variables as independent variables and the four input slacks as dependent variables, four Tobit models were constructed for regression analysis in each of the seven years. The analysis was interpreted as follows: if an explanatory variable (environmental factor) was directly proportional to the explained variable (input slack), the environmental factor did not contribute to the relative efficiency of the hospital, because it was associated with more slack. Conversely, if the explanatory variable was in inverse proportion to the input slack, that particular environmental factor contributed to the relative efficiency of the hospital. The observations are summarized in Table 5. In every year from 2010 to 2016, manpower at the end of each year was directly proportional to GDP per capita, meaning that increasing the hospital's manpower was not conducive to improving the relative efficiency of hospitals.
The remaining variables were in either direct or reverse proportion to the environmental factors, implying that the contribution of each metric to an environmental factor varied with time.
Statistically significant regression results were only observed for some years. After repeated verification of the original data, computed independently by two researchers, the above phenomenon persisted. On further consideration and inspired by studies in other fields, it was deemed absolutely necessary to alleviate or eliminate the influence of environmental factors on relative efficiency by fitting the original input.
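For readers who wish to reproduce a Stage 2 regression outside Stata, a Tobit model with slacks censored from below at zero can be fitted by maximum likelihood. The sketch below is a minimal illustration under stated assumptions: it uses SciPy's general-purpose `minimize` rather than the Stata routine used in the study, the function name `fit_tobit` is hypothetical, and the data are simulated, not the hospital data.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def fit_tobit(Z, s):
    """Maximum-likelihood Tobit fit for slacks s censored from below at 0.

    Z: (n_units, n_env) environmental variables; s: (n_units,) input slack.
    Returns (alpha, beta, sigma).
    """
    Z = np.asarray(Z, dtype=float)
    s = np.asarray(s, dtype=float)
    D = np.column_stack([np.ones(len(s)), Z])   # design matrix with constant
    censored = s <= 0

    def neg_loglik(params):
        coef, sigma = params[:-1], np.exp(params[-1])
        mu = D @ coef
        # Uncensored observations contribute the normal density.
        ll_obs = norm.logpdf(s[~censored], loc=mu[~censored], scale=sigma)
        # Censored observations contribute P(latent slack <= 0).
        ll_cen = norm.logcdf(-mu[censored] / sigma)
        return -(ll_obs.sum() + ll_cen.sum())

    # Ordinary least squares provides the starting values.
    coef0, *_ = np.linalg.lstsq(D, s, rcond=None)
    resid = s - D @ coef0
    start = np.append(coef0, np.log(resid.std() + 1e-6))
    res = minimize(neg_loglik, start, method="BFGS")
    return res.x[0], res.x[1:-1], float(np.exp(res.x[-1]))


# Simulated example (hypothetical): one environmental variable.
rng = np.random.default_rng(0)
z = rng.normal(size=(2000, 1))
latent = 1.0 + 2.0 * z[:, 0] + rng.normal(size=2000)
slack = np.maximum(latent, 0.0)
alpha, beta, sigma = fit_tobit(z, slack)
```

A positive element of `beta` corresponds to an environmental factor that is directly proportional to the input slack; a negative element corresponds to an inverse relationship.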

Result of fitting using Excel 2010 for Stage 3
Based on the Tobit-based regression analysis result of Stage 2, the largest fitting value was inserted in Table 6 to adjust the original input. This is because the largest fitting value is able to represent the worst external environment of the hospital, effectively alleviating the high efficiency that can be attributed to a congenial external environment. In this way, the hospitals can be assessed under similar circumstances to guarantee as much homogeneity as possible. In other words, the influence of environmental variables on efficiency evaluation can be considerably controlled or even eliminated. The results in Table 6 show that all metrics changed after the fitting process. Considerable increase in manpower, financial subsidy and sickbed availability was however seen only at the end of 2011. After repeatedly confirming that the original data and the calculation process were error-free, the results above were attributed to the statistical significance of the year (e.g., P = 0.00 for the manpower). Table 7 compares the "net" relative efficiency with the "coarse" relative efficiency, which was analyzed from the perspectives of unchanged and changed variables, as described below.
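The adjustment described in this section, in which each original input is raised by the gap between the largest fitted slack and the hospital's own fitted slack, can be sketched as follows. This is an illustration with NumPy rather than the Excel procedure used in the study; the function name `adjust_inputs` and the numbers are hypothetical.

```python
import numpy as np


def adjust_inputs(x, fitted_slack):
    """Adjust inputs to the worst external environment.

    x, fitted_slack: arrays of shape (n_units, n_inputs).
    For each input (column), every unit is raised by the difference between
    the largest fitted slack and its own fitted slack.
    """
    x = np.asarray(x, dtype=float)
    fitted = np.asarray(fitted_slack, dtype=float)
    return x + (fitted.max(axis=0) - fitted)


# Hypothetical data: three hospitals, two inputs.
x = [[10.0, 5.0], [20.0, 8.0], [30.0, 6.0]]
fitted = [[1.0, 0.5], [3.0, 0.2], [2.0, 0.9]]
x_adj = adjust_inputs(x, fitted)
```

The hospital with the largest fitted slack for a given input keeps its original value, and every other hospital's input increases, so the adjusted inputs are never smaller than the original ones.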

Result of DEA analysis for Stage 4
Unchanged parts. The mean of scale efficiency remained larger than the means of technical efficiency and pure technical efficiency. This indicated that the environmental variable did not exert a significant influence on this phenomenon. In other words, scale efficiency was the major contributor to the hospital's relative efficiency, implying a scope for expansion.
The maximal values for technical efficiency, pure technical efficiency and scale efficiency were all invariably 1.000. This indicated that for most hospitals, efficiencies remained optimal even after the environmental variable was introduced. This could be attributed to high levels of internal management and technologies of the evaluated hospitals. This finding can provide relevant government bodies and hospital management with more accurate decision-making support.
The frequency distribution of scale efficiency remained the same for 2013 after adjustment of the input variable. This result is explained not by the "net" and "coarse" relative efficiencies being equal, but by scale efficiency increasing for some hospitals and decreasing for others. This finding also demonstrates the ability of the environmental variable to correct the bias of evaluation results.
... and decreased for the remaining six years. The number of hospitals in the range < 60% increased in three years and remained the same in four years. Therefore, the fitting process was able to yield more objective efficiency and correct the bias of evaluation results.
The mean of scale efficiency changed by between −0.053 and −0.002 in every year except 2014, when it increased. The frequency distribution of hospital efficiency remained within a certain range; that is, the number of hospitals in the 100% range decreased in every year except 2014, whereas the number in the < 79.9% range increased. Hence, the high value of scale efficiency obtained in Stage 1 is susceptible to the environment and these effects need to be removed effectively during the efficiency evaluation.
The above analysis shows considerable variation of efficiency and its frequency distribution after the environmental variable is controlled and eliminated. Hence, we can conclude that the environmental factor has an impact on the evaluation results, and this observation should be considered during relative efficiency evaluation.

Discussion
With increasing efforts for new medical reforms in China, research on efficiency evaluation for public hospitals is of increasing practical significance. This is because objective evaluations of efficiency can provide direct reference for government departments and hospital managers and so facilitate pertinent efficiency measures. However, which approaches and indices to select and how to control the impact factors have remained problematic. DEA is a multi-input and multi-output efficiency evaluation method, suitable for efficiency evaluation of public hospitals. However, past research results using this method have not been effectively applied to management decision making. One important reason for this is that the impact of the external environment on the efficiency evaluation results was not effectively controlled or eliminated. In other words, the precondition that the decision-making units should be as homogeneous as possible when using DEA for analysis cannot be ensured, resulting in questionable objectivity of the assessment results, which decreases their practical applicability.

First, the results of the present study indicate that during the Step 1 "initial" efficiency evaluation, the three relative efficiency measures of sample hospitals, namely technical efficiency (ranging from 0.860 to 0.890), pure technical efficiency (ranging from 0.882 to 0.922) and scale efficiency (ranging from 0.956 to 0.975), showed increasing trends, indicating that new medical reforms have played a significant role in improving the relative efficiencies of public hospitals and demonstrating the feasibility and effectiveness of the public hospital reform policy. However, the technical efficiency values of public hospitals in Chongqing were lower than those of other provinces [19]. In 2012, the average technical efficiency, pure technical efficiency and scale efficiency values of public hospitals in Tianjin were 0.893, 0.909 and 0.979, respectively [20]. These differences in evaluation results might be attributed to the index selection, hospital grade and other external factors.
Second, it can be seen from the Tobit analysis in Step 2 that most of the eight external factors selected had an impact on the input slack variables, and we can draw the conclusion that the analysis of additional impact factors will have an impact on the efficiency evaluation results. Consequently, it is essential to use the Four-Stage-DEA analysis to evaluate the efficiency of public hospitals; otherwise, the homogeneity of the evaluated hospitals cannot be ensured in the DEA evaluation.
Third, it can be observed from the adjustment of original inputs in Step 3 that the adjusted average inputs are greater than the original inputs. In this way, all the evaluated decision-making units are placed in the worst external environment, thereby placing the sample hospitals on the same external platform to allow more efficient, objective and scientific efficiency evaluation results.
Fourth, the results obtained from Step 4 of the Four-Stage-DEA analysis were quite different from those obtained from Step 1. The Four-Stage-DEA analysis is different from other commonly used models such as DEA-CCR, DEA-BCC and Malmquist or models of decreasing indices. One characteristic of the present study is that the number of relatively efficient hospitals changed in almost every year, as did the mean, median and minimum scale efficiencies of the sample hospitals. These phenomena show that the new medical reform plays a significant role in improving the efficiencies of public hospitals; however, it is essential to eliminate and control the impact factors.

Conclusions
The Four-Stage-DEA method was adopted in this paper to quantitatively evaluate the technical, pure technical and scale efficiencies achieved by 84 public hospitals in Chongqing during 2010-2016. To the best of our knowledge, this method has not previously been applied elsewhere in the medical field in China. The influence of environmental factors on the evaluation result was controlled and eliminated by comparing relative efficiencies with and without considering the environmental factors. The adopted Four-Stage-DEA method is different from the Three-Stage-DEA method and the traditional DEA model. Conclusions from this study are summarized as follows.
1. The study on factors is unique: Although Tobit regression analysis has been performed in previous studies where the input/output metrics were compared with the efficiency evaluation results, the unique feature of the present study was the addition of environmental variables to the analysis, and the emphasis on the homogeneity of evaluated hospitals during DEA. We found that the direction and amplitude of the influence that a specific environmental factor has on a specific input slack variable varied across years. Therefore, an appropriate and controllable range has not yet been identified in the context of unbalanced deployment of medical resources, and identifying it should be a focus of future research and management practice.
2. The external environment has an impact on evaluation result of relative efficiency: The mean of scale efficiency was consistently higher than those of the two other efficiencies throughout the seven years, even though it decreased during most years. This implies that traditional studies using conventional models have overestimated relative efficiencies of hospitals, thereby limiting their usefulness for decision makers.
3. The scope for relative efficiency improvement is high: For each of the three relative efficiencies, the mean was high and over 0.800 [21] at Stages 1 and 4. However, there is scope for improvement for two thirds of the hospitals. For some hospitals, the efficiencies were consistently below 60% across the seven years after the addition of environmental variables (e.g., H7). These inefficiencies should therefore be treated very carefully.
4. More attention should be paid to this problem in future studies: Efficiency evaluation is an important component of the new medical reform and is necessary for solving the problem of limited and unbalanced medical resources, because it is needed in the supply-side structural reform in the medical and health fields. The idea and method for evaluation of relative efficiency should be put into practice, and the ultimate aim of the present study is to put our conclusions into practice. To this end, the metrics and their weights need to be appropriately determined and the factors need to be evaluated. The Four-Stage-DEA method, which was adopted in the present study to control environmental variables, can be combined with DEA-Bootstrap to control random error and improve the reliability of the evaluation results.
Supporting information S1