Using social choice theory and acceptability analysis to measure the value of health systems

The Future Health Index (FHI) is developed by Royal Philips to help determine the readiness of countries to address global health challenges and build sustainable, fit-for-purpose national health systems. The FHI 2018 introduces the Value Measure to assess the value of 16 health systems, computed as the arithmetic average of Access, Satisfaction and Efficiency. However, this scheme is not Pareto optimal and loses the association with weights. For these reasons, this paper proposes to apply social choice theory and Stochastic Multicriteria Acceptability Analysis for group decision making (SMAA-2) to measure the value of health systems by reconstructing the Value Measure. Specifically, we begin by considering all possible individual preferences among Access, Satisfaction and Efficiency, each of which is represented mathematically by ranked weights; the pessimistic and optimistic outcomes under each individual preference are derived in closed form, and an interval decision matrix is formulated from them; finally, SMAA-2 is applied to compute the holistic acceptability index, which is taken as a revised Value Measure. An empirical study using the data of 16 health systems is conducted to show the effectiveness of our method. A comparison of Spearman's rank correlation coefficients shows that our method consistently outperforms the Value Measure.


PLOS ONE | https://doi.org/10.1371/journal.pone.0235531 | July 2, 2020

Introduction
The challenges of delivering health care in many countries are receiving increasing attention as costs continue to rise and evidence of uneven quality accumulates [1]. Although most health care reforms have focused on coverage, the far bigger long-term driver of success will be restructuring the health care delivery system into a value-based system [2]. The concept of value-based health care suggests a change of model in which the provision of health services focuses not on the quantity of services provided but on the value they generate, where value is understood as the overall quality of care and health outcomes relative to the costs of achieving those outcomes. In this sense, as a people-centric approach, value-based health care describes a system with the goal of increasing access to care, improving patient outcomes, and delivering satisfaction to both patients and practitioners at optimum cost. In other words, value-based health care is contextual, geared towards providing the right care in the right place, at the right time and at the right level of cost. Therefore, achieving high value for the various stakeholders must become the overarching goal of health care delivery. Rigorous, disciplined measurement and improvement of value are the best way to drive system progress [3]. Nevertheless, value in health care remains largely misunderstood and unmeasured.
The Future Health Index (FHI) is a research-based platform designed by Royal Philips to help determine the readiness of countries to address global health challenges and build sustainable, fit-for-purpose national health systems. The FHI 2016 measures perceptions to produce a snapshot of how health care is experienced on both sides of the patient-professional divide.
The FHI 2017 compares these perceptions to the reality of health systems in each country researched. The FHI 2018 builds on the increasing consensus that, with the rise of chronic diseases and health care costs, the value-based care model is the best approach to address these challenges. In addition, the FHI 2018 identifies key challenges that form a barrier to the large-scale adoption of value-based care and improved population access, and assesses where connected care technology (data collection and analytics, and telehealth) can help speed up the health care transformation process. The FHI 2018 measures and assesses the value delivered by 16 health systems in developed and developing markets by proposing a broadly applicable composite indicator, namely, the Value Measure. The Value Measure combines criteria with respect to value-based health care and access to care, arguably the ultimate goals of modern health care.
The Value Measure consists of three metrics: Access (how universal, and affordable, is access to health care in the designated market?), Satisfaction (to what extent do the general population and practitioners in the designated market see the health care system as trustworthy, and effective?) and Efficiency (does the system in the given market produce outcomes at an optimum cost?). The components of the Value Measure are listed in Table 1. Each metric is composed of several sub-metrics, which are normalized to ensure comparability across countries and are scored to fit onto a 0 to 100 scale. The scores for each sub-metric are arithmetically averaged to calculate each metric score, and those scores are then arithmetically averaged to construct the Value Measure. That is,
$$\text{Value Measure} = \frac{\text{Access} + \text{Satisfaction} + \text{Efficiency}}{3}. \tag{1}$$
The scores for these sub-metrics use a combination of third-party data and survey data. Specifically, the third-party data is sourced from many organizations including the World Health Organization, The Commonwealth Fund, and the World Bank, while the survey data is collected from the countries analyzed using their native language. A combination of face-to-face, online and phone interviewing is employed. The survey sample includes 24,654 adults and 3,244 health care professionals.
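The two-level averaging described above can be illustrated with a minimal Python sketch. The function names and the sub-metric scores are hypothetical and chosen only for illustration; they are not taken from the FHI 2018 data.

```python
from statistics import mean


def metric_score(sub_metric_scores):
    """Average the normalized sub-metric scores (each on a 0 to 100 scale)."""
    return mean(sub_metric_scores)


def value_measure(access, satisfaction, efficiency):
    """Equal-weight arithmetic mean of the three metric scores."""
    return mean([access, satisfaction, efficiency])


# Hypothetical sub-metric scores for one country (not FHI 2018 data).
access = metric_score([70.0, 50.0])         # 60.0
satisfaction = metric_score([40.0, 50.0])   # 45.0
efficiency = metric_score([20.0, 40.0])     # 30.0
print(value_measure(access, satisfaction, efficiency))  # 45.0
```

Because every metric receives the same weight of 1/3, a country cannot be credited for excelling on the metric that a particular decision maker considers most important, which is the limitation the proposed method targets.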
As shown in (1), the Value Measure assigns equal weights to Access, Satisfaction and Efficiency. This seemingly plausible scheme results in substantial information loss [4]. In addition, the arithmetic average is strongly affected by extreme values, is not Pareto optimal, and loses the association with weights [5]. These shortcomings motivate scholars and practitioners to develop new methods for improving the calculation of the Value Measure.
The contribution of this paper is a new method to modify the Value Measure released by Royal Philips for measuring the value of health systems, based upon social choice theory and Stochastic Multicriteria Acceptability Analysis for group decision making (SMAA-2). Social choice theory studies how to design or choose a mechanism that aggregates a set of individual preference orders over the alternatives available to a society into a collective, or social, preference order over those same alternatives [6,7]. Stochastic Multicriteria Acceptability Analysis (SMAA) is a multicriteria decision support method for multiple experts in discrete problems, based on exploring the weight space to describe the valuations that make each alternative the preferred one [8,9]. SMAA-2 extends SMAA by taking into account information about other ranking positions, and therefore identifies good compromise alternatives.
Specifically, we begin by dropping the equal-weight assumption and considering all possible individual preferences among the three metrics. Each individual preference is represented mathematically by a set of ranked weights. It seems reasonable that a decision maker should at least rank the metrics, since rankings are normally easier to provide than precise weight information, which is usually inaccessible [4,10]. Meanwhile, the decision maker may be unable, unavailable, or even unwilling to provide sufficiently precise weights [11]. Moreover, it is difficult to achieve consensus about exact weights in a problem with multiple decision makers [9]. We therefore calculate the worst and best outcomes under each individual preference in closed form, from which an interval-valued decision matrix is formulated with one row per country and one column per individual preference. Lastly, SMAA-2 is applied to obtain the holistic acceptability index for each country, which is regarded as an improved version of the Value Measure. We compute Spearman's rank correlation coefficients to demonstrate the superiority and rationality of the proposed method. This study offers a new and feasible direction for measuring the value of health systems in an appropriate manner, along with some academic, managerial and policy-related implications.
The remainder of the paper is organized as follows. We present the method for improving the evaluation of the Value Measure in Section 2, followed by an empirical study for a panel of 16 countries in Section 3. We conclude in Section 4 by discussing the details of our method and suggestions for future research.

Method
To measure the value of health systems that was previously aggregated using the arithmetic average, this section proposes a method for the general case with m Decision Making Units (DMUs) and n metrics, which can be readily applied to improve the Value Measure with Access, Satisfaction and Efficiency. Let $x_{ij}$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, denote the performance of DMU $i$ under metric $j$. To adjust values measured on different scales to a notionally common scale, we use feature scaling (alternatively known as min-max normalization) to scale the range to [0, 1]:
$$x_{ij} \leftarrow \frac{x_{ij} - \min_{i'} x_{i'j}}{\max_{i'} x_{i'j} - \min_{i'} x_{i'j}}, \quad i = 1, 2, \ldots, m, \; j = 1, 2, \ldots, n. \tag{2}$$
The proposed method has two stages. It begins by investigating all possible individual preferences among the n metrics, under which the pessimistic and optimistic outcomes are derived in closed form; we then employ SMAA-2 to compute the holistic acceptability index, which aggregates the individual preferences into a social choice result.
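As a minimal sketch of the min-max normalization step, the following Python function rescales each metric (column) across DMUs (rows) to [0, 1]. Applying the scaling column by column and mapping a constant column to 0.0 are assumptions made here for illustration; the input numbers are hypothetical.

```python
def min_max_normalize(matrix):
    """Scale each column (metric) of a DMU-by-metric matrix to [0, 1]."""
    columns = list(zip(*matrix))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            # Assumption: a constant column carries no information, so map it to 0.0.
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ]
        for row in matrix
    ]


# Hypothetical raw scores: three DMUs, two metrics.
raw = [[10, 5], [20, 5], [30, 5]]
print(min_max_normalize(raw))  # [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]]
```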

Individual preference
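This paper takes into account all possible individual preferences among the metrics to deal with the drawbacks associated with the arithmetic average. An individual preference is represented by an importance order of the metrics. For ease of exposition, we investigate only one individual preference in this section; the result carries over directly to the other orders. Consider the order $w_1 \ge w_2 \ge \cdots \ge w_n$, where $w_j$, $j = 1, 2, \ldots, n$, is the importance degree of metric $j$. The pessimistic and optimistic results for DMU $i$ are then determined by two linear programs. The optimistic result solves
$$\max \; \sum_{j=1}^{n} w_j x_{ij} \quad \text{s.t.} \quad \sum_{j=1}^{n} w_j = 1, \quad w_1 \ge w_2 \ge \cdots \ge w_n \ge 0, \tag{3}$$
and the pessimistic result solves the same program with $\max$ replaced by $\min$.
For $\alpha_j \ge 0$, $j = 1, 2, \ldots, n$, we define the weights as $w_k = \sum_{j=k}^{n} \alpha_j$. This is consistent with the given individual preference among metrics, since $w_k - w_{k+1} = \alpha_k \ge 0$. Moreover, we define
$$s_{ik} = \frac{1}{k} \sum_{j=1}^{k} x_{ij}, \quad k = 1, 2, \ldots, n.$$
Because $\sum_{j=1}^{n} w_j x_{ij} = \sum_{k=1}^{n} k \alpha_k s_{ik}$ and $\sum_{j=1}^{n} w_j = \sum_{k=1}^{n} k \alpha_k$, the linear program (3) is equivalent to the following model:
$$\max \; \sum_{k=1}^{n} k \alpha_k s_{ik} \quad \text{s.t.} \quad \sum_{k=1}^{n} k \alpha_k = 1, \quad \alpha_k \ge 0, \; k = 1, 2, \ldots, n.$$
Let $\hat{k} \in \{1, 2, \ldots, n\}$ satisfy $s_{i\hat{k}} = \max_k \{s_{ik}\}$; then an optimal solution to this linear program is $\alpha_{\hat{k}} = 1/\hat{k}$ and $\alpha_k = 0$ for $k \ne \hat{k}$. Consequently, the optimistic result for DMU $i$ under the given individual preference has the closed form
$$o_i^{+} = \max_{1 \le k \le n} \frac{1}{k} \sum_{j=1}^{k} x_{ij}.$$
This scheme is easy to understand and simple to implement, and it applies to any other importance order after relabeling the metrics. Similarly, the pessimistic result for DMU $i$ under the given individual preference is
$$o_i^{-} = \min_{1 \le k \le n} \frac{1}{k} \sum_{j=1}^{k} x_{ij}.$$
Taking into account the pessimistic and optimistic outcomes under all $n!$ possible individual preferences, an interval-valued decision matrix $O_{m \times n!}$ is formulated as
$$O_{m \times n!} = \big[\, [o_{il}^{-}, o_{il}^{+}] \,\big], \quad i = 1, 2, \ldots, m, \; l = 1, 2, \ldots, n!,$$
where $l$ indexes the importance orders. As noted in [12], $O_{m \times n!}$ represents a stochastic decision problem. SMAA-2 has been accepted as an effective tool to solve this problem [9].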
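The closed-form bounds discussed in this subsection can be sketched in Python as follows. The sketch assumes that the optimistic and pessimistic results under ranked weights are the largest and smallest running averages of the scores taken in importance order, which is what the linear programs with ordered, normalized, nonnegative weights yield. The function names and sample scores are hypothetical.

```python
from itertools import accumulate, permutations


def ranked_weight_bounds(scores):
    """Return (pessimistic, optimistic) for scores ordered from the most
    to the least important metric, under w_1 >= ... >= w_n >= 0, sum = 1."""
    prefix_means = [
        total / k for k, total in enumerate(accumulate(scores), start=1)
    ]
    return min(prefix_means), max(prefix_means)


def interval_matrix(rows):
    """Build the interval-valued matrix: one row per DMU, one column per
    importance order (n! columns for n metrics)."""
    n = len(rows[0])
    orders = list(permutations(range(n)))
    matrix = [
        [ranked_weight_bounds([row[j] for j in order]) for order in orders]
        for row in rows
    ]
    return orders, matrix


# Hypothetical normalized scores for one DMU on three metrics.
orders, matrix = interval_matrix([[0.75, 0.25, 0.5]])
for order, bounds in zip(orders, matrix[0]):
    print(order, bounds)
```

With three metrics there are six orders, so each DMU contributes six intervals. For the order (0, 1, 2) above, the running averages are 0.75, 0.5 and 0.5, giving the interval (0.5, 0.75).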

SMAA-2
Stochastic multicriteria acceptability analysis (SMAA) is a multicriteria decision support method for multiple experts in discrete problems, based on exploring the weight space to describe the valuations that make each alternative the preferred one [8,9]. SMAA-2 extends SMAA by taking into account information about other ranking positions, therefore identifies good compromise alternatives. This in particular makes sense when some extreme alternatives obtain the best ranking positions through some experts, but reach a very bad ranking position according to others.
We assume that the preference structure of the experts can be represented by a real-valued utility function $u(x_i, \lambda)$, which maps each alternative $x_i$ to a utility value by means of a weight vector $\lambda$ that quantifies a specific preference. Consider a more general environment in which neither the input data nor the weights are exactly known. The uncertain or imprecise input data are represented by stochastic variables $z_{il}$ with an estimated joint probability distribution and density function $f(z)$ in the space $X$, while the unknown or partially known preferences are represented by a weight distribution with density function $f(\lambda)$ in the set of feasible weights $\Lambda$, defined as
$$\Lambda = \Big\{ \lambda \in \mathbb{R}^{p} : \lambda_l \ge 0, \; \sum_{l=1}^{p} \lambda_l = 1 \Big\}.$$
The set of feasible weights is therefore a $(p - 1)$-dimensional simplex. The utility function is then employed to map the stochastic input data and weight distributions into utility distributions $u(z_i, \lambda)$.
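Total lack of knowledge about the weights is represented in a "Bayesian" manner by a uniform weight distribution in $\Lambda$, which has density function
$$f(\lambda) = \frac{1}{\mathrm{vol}(\Lambda)}.$$
In SMAA, the set of favorable weights $\Lambda_i(z)$ for each alternative is then defined as
$$\Lambda_i(z) = \{ \lambda \in \Lambda : u(z_i, \lambda) \ge u(z_k, \lambda), \; k = 1, 2, \ldots, m \}.$$
The ranking position of each alternative is defined as an integer from the best (= 1) to the worst (= m), in terms of a ranking function
$$\mathrm{rank}(z_i, \lambda) = 1 + \sum_{k=1}^{m} \varphi\big( u(z_k, \lambda) > u(z_i, \lambda) \big),$$
in which $\varphi(\text{true}) = 1$ and $\varphi(\text{false}) = 0$. In SMAA-2, the set of favorable rank weights $\Lambda_i^r(z)$ is defined as
$$\Lambda_i^r(z) = \{ \lambda \in \Lambda : \mathrm{rank}(z_i, \lambda) = r \}.$$
A weight $\lambda \in \Lambda_i^r(z)$ assigns utilities to the alternatives such that alternative $x_i$ reaches ranking position $r$.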
The rank acceptability index $b_i^r$ is thereby defined as the expected volume of the set of favorable rank weights, and is regarded as a measure of the variety of different valuations that grant alternative $x_i$ ranking position $r$. The rank acceptability index is calculated as a multidimensional integral over the input data distributions and the favorable rank weights:
$$b_i^r = \int_{X} f(z) \int_{\Lambda_i^r(z)} f(\lambda) \, d\lambda \, dz.$$
The rank acceptabilities can be used directly in the evaluation of alternatives. For large-scale problems, we introduce an iterative process in which the $\kappa$ best ranking positions ($\kappa$br) acceptabilities are analyzed at each iteration $\kappa$:
$$a_i^{\kappa} = \sum_{r=1}^{\kappa} b_i^r.$$
The $\kappa$br acceptability $a_i^{\kappa}$ is a measure of the variety of different valuations that assign alternative $x_i$ any of the $\kappa$ best ranking positions.
The problem of comparing alternatives through rank acceptabilities motivates a complementary method that integrates the rank acceptabilities into a holistic acceptability index $a_i^h$ for each alternative:
$$a_i^h = \sum_{r=1}^{m} \beta_r b_i^r,$$
where $\beta_r$ are surrogate weights. The basic requirements for surrogate weights are that they be nonnegative, normalized and nonincreasing as the rank increases, namely, $\beta_1 \ge \beta_2 \ge \cdots \ge \beta_m \ge 0$. The elicitation of surrogate weights has been extensively studied in the literature [4,10,13].
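The integrals behind the rank acceptability indices are usually approximated by Monte Carlo simulation. The following Python sketch illustrates the idea under stated assumptions: criterion values are drawn uniformly from their intervals, weights are drawn uniformly from the simplex (by normalizing independent exponential draws), and the utility function is taken to be the additive form $u(z_i, \lambda) = \sum_l \lambda_l z_{il}$. The function name, the additive utility and the sample data are illustrative assumptions, not the implementation used in the paper.

```python
import random


def smaa2_uniform(intervals, beta, iterations=10000, seed=0):
    """Monte Carlo estimate of rank acceptabilities b[i][r] and holistic
    acceptability indices for interval data (assumed uniform)."""
    rng = random.Random(seed)
    m, p = len(intervals), len(intervals[0])
    counts = [[0] * m for _ in range(m)]  # counts[i][r]: DMU i at rank r + 1
    for _ in range(iterations):
        # Uniform weights on the simplex: normalized exponential draws.
        raw = [rng.expovariate(1.0) for _ in range(p)]
        total = sum(raw)
        lam = [v / total for v in raw]
        # Assumed additive utility over sampled criterion values.
        utils = [
            sum(w * rng.uniform(lo, hi) for w, (lo, hi) in zip(lam, row))
            for row in intervals
        ]
        for i, u in enumerate(utils):
            # Zero-based rank: number of alternatives with strictly higher utility.
            rank = sum(1 for other in utils if other > u)
            counts[i][rank] += 1
    b = [[c / iterations for c in row] for row in counts]
    holistic = [sum(w * share for w, share in zip(beta, row)) for row in b]
    return b, holistic


# Hypothetical intervals: two alternatives, two criteria.
data = [[(0.8, 0.9), (0.8, 0.9)], [(0.1, 0.2), (0.1, 0.2)]]
b, holistic = smaa2_uniform(data, beta=[0.75, 0.25], iterations=200, seed=1)
print(b, holistic)
```

In this toy case the first alternative dominates the second for every sampled weight and value, so it always takes rank 1 and its holistic index equals the first surrogate weight.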

FHI 2018
Data has been universally regarded as one of the most important resources in modern health care. The collection, sharing and analysis of data can help identify disease earlier, make hospitals faster organizations, and transform the patient experience. Value in health care is tracked, measured and improved through data. The FHI 2018 analyzes data and conducts interviews with leaders who are making value-based health care happen around the world, to produce practical insights that health care leaders can apply to accelerate their path towards that goal. The first chapter of the FHI 2018 outlines how the Value Measure tool can form the basis of a positive platform for change across the countries it surveys, and reports the value delivered by the health systems of 16 countries, which are shown in Table 2 below. We observe that Germany performs best in Access, while Singapore performs best in both Satisfaction and Efficiency. The 16-country average Value Measure is 43.48, and Singapore has the highest Value Measure across the 16 countries surveyed.

Result and analysis
We take into account all possible individual preferences among Access, Satisfaction and Efficiency: ASE: access ≽ satisfaction ≽ efficiency, AES: access ≽ efficiency ≽ satisfaction, SAE: satisfaction ≽ access ≽ efficiency, SEA: satisfaction ≽ efficiency ≽ access, EAS: efficiency ≽ access ≽ satisfaction, and ESA: efficiency ≽ satisfaction ≽ access. Using the closed-form solutions obtained in Section 2.1, the pessimistic and optimistic results are derived to formulate the interval decision problem shown in Table 3. For this stochastic decision problem, we follow [12] and consider both Gaussian and Uniform distributions when implementing SMAA-2. An open-source implementation of SMAA methods in Java is developed in [14] and can be downloaded at http://smaa.fi/jsmaa/.

Gaussian distribution
We first assume that the interval-valued data follow a Gaussian distribution, with the mean and variance set as in [12]. The rank acceptability indices are reported in Table 4 and illustrated in Fig 1 below. In addition, we use the rank-order centroid (ROC) approach to elicit surrogate weights for constructing the holistic acceptability indices:
$$\beta_r = \frac{1}{16} \sum_{i=r}^{16} \frac{1}{i}, \quad r = 1, 2, \ldots, 16.$$
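For reference, rank-order centroid weights for m ranking positions follow the standard formula $\beta_r = \frac{1}{m} \sum_{i=r}^{m} \frac{1}{i}$. A minimal Python sketch for the 16 positions considered here is given below; the function name is hypothetical.

```python
def roc_weights(m):
    """Rank-order centroid weights: beta_r = (1/m) * sum_{i=r}^{m} 1/i."""
    return [sum(1.0 / i for i in range(r, m + 1)) / m for r in range(1, m + 1)]


weights = roc_weights(16)
for rank, weight in enumerate(weights, start=1):
    print(rank, round(weight, 4))
```

The weights are nonincreasing in the rank and sum to one, so the holistic acceptability index built from them rewards alternatives whose acceptability is concentrated on the top ranking positions.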

Uniform distribution
Next, we assume a uniform distribution and apply the open-source decision support software to calculate the rank acceptability indices, which are shown in Table 5 and Fig 2. The aforementioned surrogate weights are employed to build the holistic acceptability indices. As in the Gaussian case, Singapore receives the strongest first-rank support, with a first-rank acceptability of 85.46%, while the last-rank acceptabilities of Brazil and South Africa are 47.38% and 51.16%, respectively.
In what follows, we use the holistic acceptability indices under the Gaussian and Uniform distributions as the revised Value Measure, and compare the resulting ranks with those given by the original Value Measure, as shown in Table 6. Our method generates a robust ranking of the 16 countries: only Australia and Saudi Arabia are ranked differently under the two distributions, and the difference is slight. Meanwhile, the ranks of Brazil (15), Italy (10), Singapore (1), South Africa (16), Sweden (8) and United Kingdom (9) are particularly reliable, because our method and the Value Measure produce identical ranks for them.
In addition, we use Spearman's rank correlation coefficient to verify the feasibility and rationality of the proposed method. In statistics, Spearman's rank correlation coefficient is a nonparametric measure of the association between the rankings of two variables, and evaluates how well the relationship between two variables can be described using a monotonic function. Spearman's rank correlation coefficient is capable of reflecting the conflict between ranking orders [15]. The more discordant the rankings of two variables, the smaller the coefficient [16]. The formula to compute Spearman's rank correlation coefficient is
$$\rho = 1 - \frac{6 \sum_{i=1}^{m} d_i^2}{m (m^2 - 1)},$$
where $d_i$ is the difference between the two ranks of DMU $i$, and $m$ is the number of DMUs [17]. We calculate and compare the average Spearman's rank correlation coefficients in Table 7; these coefficients measure the strength and direction of association between the obtained ranks and the variables, and assess the accuracy of the models [18]. The Spearman's rank correlation coefficients between the Value Measure and Access, Satisfaction and Efficiency are computed as a benchmark for further analysis. Columns 2-4 report the Spearman's rank correlation coefficients between the ranks obtained from our method and those from Access, Satisfaction and Efficiency, respectively. Relative improvements are reported in the last column. Our method consistently outperforms the Value Measure, and the improvements from SMAA-2 with Gaussian and Uniform distributions are 4.15% and 3.25%, respectively. According to the comparison of average Spearman's rank correlation coefficients, the proposed method outperforms the original Value Measure in terms of a stronger association between the modified Value Measure and Access, Satisfaction and Efficiency. This indicates that countries can improve their Value Measure in a more targeted manner.
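The rank comparison described above can be reproduced with a short Python sketch of the standard Spearman formula based on squared rank differences, which is valid when the ranks contain no ties. The function name and the sample rankings are hypothetical and are not taken from Table 6 or Table 7.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation for two tie-free rankings of the same DMUs."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("rankings must have the same length")
    m = len(ranks_a)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * squared_diffs / (m * (m * m - 1))


# Hypothetical rankings of four DMUs under two measures.
print(spearman_rho([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(spearman_rho([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
```

Averaging this coefficient over the comparisons with Access, Satisfaction and Efficiency gives a single number per aggregation method, which is how the two approaches are compared in the text.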

Concluding remarks
The Future Health Index (FHI) 2018 measures and assesses the value delivered by 16 health systems in developed and developing markets by proposing a broadly applicable composite indicator, namely, the Value Measure, which is constructed as the arithmetic average of Access, Satisfaction and Efficiency. However, the individual preferences among these metrics remain largely unexplored in the literature.
This paper proposes to apply social choice theory and Stochastic Multicriteria Acceptability Analysis for group decision making (SMAA-2) to measure the value of health systems by reconstructing the Value Measure. Specifically, we begin by considering all possible individual preferences among Access, Satisfaction and Efficiency, each of which is represented mathematically by ranked weights; the pessimistic and optimistic outcomes under each individual preference are derived in closed form, and an interval decision matrix is formulated from them; SMAA-2 is then applied to compute the holistic acceptability index, which is taken as a revised Value Measure. An empirical study using the data of 16 countries is performed to demonstrate the usefulness of our method, in which both Gaussian and Uniform distributions are taken into account. The results indicate that our method generates sufficiently robust rankings that are superior to those of the Value Measure.
The applicability and feasibility of our method are limited in particular by two aspects of the data set: extreme values and the number of metrics. Specifically, it is more meaningful to investigate the various individual preferences extensively when the values vary only mildly across metrics. Moreover, the application of our method becomes more complicated when there are more metrics to consider, since the number of importance orders grows factorially with the number of metrics. Therefore, the proposed method is applicable and feasible when the number of metrics is small, such as no more than four. For scenarios with more than five metrics, future research should develop statistical techniques, for example, principal component analysis, to select useful orders for implementation. In addition, future research should consider other statistical distributions (e.g., lognormal distribution, gamma distribution) of the stochastic parameters. A wider spectrum of methods should also be explored to select meaningful individual preferences for further analysis.