A synthetic indicator on the impact of COVID-19 on the community’s health

The expansion of COVID-19 has severely hit the community's health all over the world, killing hundreds of thousands of people and subjecting health systems to enormous stress (besides derailing economic activities and altering personal and social behavior). Two elements are essential to monitor the evolution of the pandemic and to analyze the effectiveness of the response measures: reliable data and useful indicators. We present an indicator that helps to assess the impact of COVID-19 on the community's health by combining two components: the extent of the pandemic (i.e., the share of the population affected) and its severity (the intensity of the disease among those affected). The severity measure derives from an evaluation protocol that compares population distributions based on the proportions of those affected with different health conditions. We illustrate the functioning of this indicator with a case study of the Italian regions on March 9, 2020 (the beginning of the confinement) and April 8, 2020, one month later.


Introduction
The speed and spread of transmission of COVID-19 have forced governments all over the world to implement strong defensive measures to control the expansion of the epidemic and avoid the collapse of their health care systems. Assessing the effectiveness of those measures calls for surveillance of their application and continuous monitoring of the disease's evolution. Both aspects require reliable data and adequate evaluation protocols that transform those data into helpful indicators [1]. Several variables measure particular aspects of the pandemic (e.g., reproduction rate, mortality rate, positive rate). Here we focus on the overall evaluation of the impact of COVID-19 on the community's health.
According to the general recommendations of the World Health Organization [2], there are three variables to consider in a pandemic of this nature: how many people are infected (transmissibility), how severely sick the infected individuals get, and how the pandemic affects the health-care system and society. In a similar vein, the US Department of Health and Human Services has developed the Pandemic Severity Assessment Framework (PSAF) [3]. The PSAF proposes two assessment dimensions, transmissibility and clinical severity, and distinguishes how to apply those measurement protocols depending on the stage of the epidemic [4]. The European Centre for Disease Prevention and Control (ECDC) has recently advised to "monitor the intensity, geographic spread and severity of COVID-19 in the population to estimate the burden of disease, assess the direction of recent time trends, and inform appropriate mitigation measures" [5]. The ECDC recommends that countries comprehensively test suspected cases and report the number of confirmed cases, distinguishing between those hospitalized, those in intensive care units (ICU), and those deceased.
There is, therefore, an extensive agreement on the variables that should be considered to track the evolution and the impact of COVID-19 on community's health. There is also consensus on the way of approaching the evaluation, which can be regarded as a conventional way to assess the global impact of a given phenomenon on a population: computing both extent (the share of people involved) and severity (how intensely the event affects that population).
We know that countries have only been recording a fraction of the people infected by the virus, especially at the beginning of the pandemic, depending on detection policies and test availability [6]. Something similar can be said of the reports on the numbers of people dead and cured, as there is evidence that different countries (or even different regions within a country) apply diverse protocols to count those cases. As a result, we do not have an exact picture of the dynamics of the disease [7]. Yet, it is still essential to get an idea of how things are evolving [8], if only to determine the effectiveness of the solutions that are being implemented, to help calculate the needs of sanitary supplies and the pressure on the equipment and human resources of the health systems, and to predict the evolution of the disease and the progressive return to normality. Hence the need to tackle the second challenge: finding adequate indicators of the extent and severity of the pandemic.
Evaluating the severity of the disease requires analyzing the distribution of the affected population over different health conditions (e.g., hospitalized, in intensive care units, recovered, deceased). Ideally, we would like to have a numerical indicator that allows for quantitative comparisons to assess both the direction and the size of the changes in severity. This involves the design of an evaluation formula that, as a general rule, adopts the structure of a weighted average or a generalized mean of the relative frequencies of those health conditions. That is, we need to assign weights to each of those conditions and decide, for instance, how much we value the death of one person relative to the healing of another. Our conclusions on the severity of the pandemic will depend on those judgments, which are extremely difficult to determine for both technical and ethical reasons.
Herrero and Villar [9,10] have developed an evaluation procedure that does not require introducing those judgments and can help to assess the severity of the impact of COVID-19. By comparing the probabilities that members of one community are worse off than members of another, we obtain a cardinal measure of the relative severity with which the pandemic affects different populations. Note that populations here may refer to the people infected in a group of countries, regions within a country, demographic groups, or different points in time. We can thus apply this evaluation procedure to estimate the impact of COVID-19 on the community's health in a variety of ways.
We organize the paper as follows. Section 2 presents the evaluation protocol, which consists of the product of a measure of the extent and a measure of the severity. Section 3 illustrates this evaluation protocol with the situation of Italy and its regions at two points in time: the beginning of the confinement and one month later. Finally, Section 4 contains a short discussion of some critical aspects of this evaluation procedure and its applicability.

The indicator
We propose to measure the impact of COVID-19 by an indicator made out of two components: extent (share of the population affected by the virus) and severity (a measure of the relative health situation of that population). The indicator, denoted by $I^{Co19}$, consists of the product of those two variables. That is, $I^{Co19} = \text{Extent} \times \text{Severity}$.
More formally, if we call $n_h^a$ the population affected by the virus in society $h$, $n_h$ the total population of that society, and $s_h$ the measure of severity, our indicator for population $h$ is given by:
$$I_h^{Co19} = \frac{n_h^a}{n_h} \times s_h$$
The indicator provides an intuitive measure of the degree to which each community is affected by the disease, as it describes how many people are affected, times how severely affected they are, relative to the population size. This is a standard yardstick to estimate the impact of a given phenomenon on a population subgroup; in particular, it is the conventional approach regarding poverty measures [11,12].
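The product structure of the indicator can be sketched in a few lines of Python. This is only an illustration: the function name `impact` and the figures are hypothetical and are not taken from the paper's tables.

```python
def impact(affected: float, population: float, severity: float) -> float:
    """Impact = extent x severity, where extent = affected / population."""
    if population <= 0:
        raise ValueError("population must be positive")
    extent = affected / population  # share of the population affected
    return extent * severity


# Hypothetical figures, for illustration only.
example = impact(affected=2_000, population=1_000_000, severity=1.5)
```

Because severity is a relative measure, the resulting value is only meaningful when compared with the values of other populations expressed in the same units.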
Note that we have used the expression "population affected by the virus" for $n_h^a$, rather than "population infected", and "extent" for $n_h^a / n_h$, rather than "incidence". The reason is that, especially during the initial phase of the contagion, those registered as infected were only those who required some kind of medical treatment or preventive measure (e.g., isolation). Monitoring is nowadays more thorough and present data also capture people with light or no symptoms. Needless to say, the indicator works with whatever reference population we consider, as its internal logic is independent of that aspect. Yet, the interpretation of the results is conditional on that reference population. We shall be precise in this respect in the illustration presented in Section 3 and will say more on this subject in the discussion (Section 4).

The evaluation protocol for severity
Let us now address the question of how to measure severity. The formal problem consists of comparing a collection of populations affected by the virus, $G = \{1, 2, \ldots, g\}$, in terms of the distributions of their members over an ordered set of health conditions, $c = 1, 2, \ldots, C$. We describe the health situation of population $h$ by a vector $a(h) = (a_{h1}, a_{h2}, \ldots, a_{hC})$, where $a_{hc}$ is the fraction of people affected by the virus in population $h$ with health condition $c$. That is, we can write $a_{hc} = n_h^{ac} / n_h^a$, where $n_h^{ac}$ is the number of individuals in population $h$ who are affected by the virus and exhibit health condition $c$. By construction, $a_{hc} \geq 0$ and $\sum_{c=1}^{C} a_{hc} = 1$, for all $h$.
To assess the relative situation of those populations, regarding the intensity of the pandemic, we compare the likelihood of getting a worse health condition for representative members of those societies. To be precise, let $p_{hk}$ denote the probability that a member of society $h$ exhibits a worse health condition than a member of society $k$. Assuming that those categories are ordered from worst to best, such a probability obtains as follows:
$$p_{hk} = a_{h1}(a_{k2} + \cdots + a_{kC}) + a_{h2}(a_{k3} + \cdots + a_{kC}) + \cdots + a_{h,C-1}\, a_{kC}$$
Let $e_{hk} = e_{kh} = \sum_{c=1}^{C} a_{hc}\, a_{kc}$ stand for the probability of a tie and define $q_{hk} = p_{hk} + \frac{1}{2} e_{hk}$ (i.e., we split the probability of a tie evenly), so that $q_{hk} + q_{kh} = 1$.
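The pairwise comparison just described can be sketched in Python as follows. The function name `pairwise_probabilities` and the two distributions are hypothetical; the only assumptions are that categories are listed from worst (first) to best (last) and that the two individuals are drawn independently.

```python
def pairwise_probabilities(a_h, a_k):
    """Return (p_hk, e_hk, q_hk) for two distributions over ordered categories.

    Categories are ordered from worst (index 0) to best (last index).
    """
    if len(a_h) != len(a_k):
        raise ValueError("both distributions need the same categories")
    # p_hk: the member of h is in category c and the member of k is in a
    # strictly better category (index greater than c).
    p_hk = sum(a_h[c] * sum(a_k[c + 1:]) for c in range(len(a_h)))
    # e_hk: both members fall in the same category (a tie).
    e_hk = sum(x * y for x, y in zip(a_h, a_k))
    # q_hk: ties are split evenly between the two societies.
    q_hk = p_hk + 0.5 * e_hk
    return p_hk, e_hk, q_hk


# Hypothetical shares over two conditions (worse, better).
p_hk, e_hk, q_hk = pairwise_probabilities([0.5, 0.5], [0.25, 0.75])
p_kh, e_kh, q_kh = pairwise_probabilities([0.25, 0.75], [0.5, 0.5])
```

In this toy case society h has a larger share in the worse condition, so q_hk exceeds one half, and q_hk and q_kh add up to one.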
To compare the severity of the pandemic in two societies, $h$ and $k$, we apply the following principle: the severity measures of those societies are proportional to the corresponding probabilities of being relatively worse off. That is, if we call $s_h$, $s_k$ the severity measures, we let [9,10]:
$$\frac{s_h}{s_k} = \frac{q_{hk}}{q_{kh}}$$
Note that, by construction, this equation has one degree of freedom. That is, we can freely set the units in which we measure severity. In the two-society case, by letting $s_k = 1$, we can rewrite the former equation as follows:
$$s_h = \frac{q_{hk}}{q_{kh}}$$
That is, $s_h$ tells us how likely it is that an individual from society $h$ is in a worse health condition than an individual from society $k$, relative to the complementary case.
When there are more than two societies involved, this simple formulation has to be adjusted. Yet, we can extend this principle easily by taking expectations over the expression $s_h = \frac{q_{hk}}{q_{kh}} s_k$ (we cannot rely on bilateral comparisons as they are not transitive). That is, we measure severity in society $h$ relative to the rest by the following formula:
$$s_h = \frac{\sum_{k \neq h} q_{hk}\, s_k}{\sum_{k \neq h} q_{kh}}$$
The previous expression has a similar meaning as before, even though now each probability in the numerator is weighted by the corresponding measure of severity. The vector of those $s_h$ values is called the balanced worth [10] and obtains as the dominant eigenvector of a Perron matrix $P$ built as follows. The elements in the diagonal are of the form $R_k = (g-1) - \sum_{h \neq k} q_{hk}$; the off-diagonal elements are just the probability values $q_{hk}$. That is,
$$P = \begin{pmatrix} R_1 & q_{12} & \cdots & q_{1g} \\ q_{21} & R_2 & \cdots & q_{2g} \\ \vdots & \vdots & \ddots & \vdots \\ q_{g1} & q_{g2} & \cdots & R_g \end{pmatrix}$$
The balanced worth provides a relative evaluation of the severity of the pandemic in the different populations considered. The structure of this matrix ensures that the balanced worth vector $s = (s_1, \ldots, s_g)$, which corresponds to the solution to the equation $s = Ps$ once the entries of $P$ are divided by $g-1$ (every column of $P$ adds up to $g-1$), always exists, and it is positive and generically unique, except for the choice of units, as it has one degree of freedom. Thus, we have to normalize those values with respect to some reference level that will define the units in which we measure this variable.
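As a rough illustration of how the balanced worth can be computed, the following Python sketch builds the matrix from the pairwise probabilities and applies power iteration to the matrix divided by g − 1. It is not the algorithm hosted on the Ivie website; the function names, the stopping rule, the mean-one normalization, and the three distributions are all assumptions made for the example, and it presumes every diagonal element is strictly positive.

```python
def q_prob(a_h, a_k):
    """q_hk = p_hk + e_hk / 2, with categories ordered from worst to best."""
    p = sum(a_h[c] * sum(a_k[c + 1:]) for c in range(len(a_h)))
    e = sum(x * y for x, y in zip(a_h, a_k))
    return p + 0.5 * e


def balanced_worth(dists, max_iter=10_000, tol=1e-13):
    """Power iteration on P / (g - 1); returns severities with mean 1."""
    g = len(dists)
    if g < 2:
        raise ValueError("at least two populations are required")
    q = [[q_prob(dists[h], dists[k]) for k in range(g)] for h in range(g)]
    P = [row[:] for row in q]
    for k in range(g):
        # Diagonal element R_k: makes column k add up to g - 1.
        P[k][k] = (g - 1) - sum(q[h][k] for h in range(g) if h != k)
    s = [1.0] * g  # P / (g - 1) is column-stochastic, so the mean stays at 1
    for _ in range(max_iter):
        new = [sum(P[h][k] * s[k] for k in range(g)) / (g - 1) for h in range(g)]
        converged = max(abs(x - y) for x, y in zip(new, s)) < tol
        s = new
        if converged:
            break
    return s


# Hypothetical shares over three conditions, ordered from worst to best.
regions = {
    "A": [0.10, 0.30, 0.60],
    "B": [0.05, 0.25, 0.70],
    "C": [0.20, 0.30, 0.50],
}
worth = dict(zip(regions, balanced_worth(list(regions.values()))))
```

In this toy example region C has the largest shares in the worse conditions and therefore obtains the highest severity, followed by A and B. With only two populations the sketch reduces to the ratio of the two pairwise probabilities.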
Remark. There is a user-friendly and freely accessible algorithm hosted on the Ivie website https://web2011.ivie.es/balanced-worth/balanced-worth-vector.php that instantly performs all the calculations required to obtain this vector. By default, this algorithm uses the mean to normalize the values of the corresponding eigenvector (i.e., we measure the values obtained as percentages of the mean value). This normalization can be easily modified.

A case study: The impact of Covid-19 on the Italian regions
Let us see how this evaluation protocol works in a case study. This section serves the purpose of illustrating the application of the methodological approach we propose, rather than providing an empirical study. We consider two different applications that show the measurement we can obtain in a synchronic and in a diachronic context. We first compare the situation of the Italian regions on April 8, 2020, which is one month after Italy decreed the confinement. Here we measure the impact of COVID-19 in the Italian regions relative to the whole country. With this exercise we capture the diversity of the situations in the Italian regions at a particular point in time. Then we address the change experienced by those regions between March 9, the day on which the confinement started in Italy, and April 8. Here we compare the situation of the Italian regions relative to the initial state (Italy as a whole on March 9), so that we can estimate the evolution of the pandemic within the regions and also compare how diverse the situation was at the beginning of the confinement and one month later.
The data come from the Italian Ministry of Health (Ministero della Salute) and are freely available on its webpage [13]. They referred to the people infected by the virus who had been identified due to the gravity of their symptoms. They included those treated in hospitals, those isolated at home, those cured, and those who had died. There was not, at that early stage, any estimate of those who might be infected but showed no apparent symptoms. We refer to the population registered as infected in that period as infected with worrisome symptoms (IWS, for short). Table 1 describes the cumulative number of the IWS population on April 8. Individuals in this group presented one of the following five health conditions, ordered from worst to best: deceased, in intensive care units (ICU), hospitalized (non-ICU), isolated at home, and cured. Table 2 shows that Italian regions exhibited a large variability regarding the extent of COVID-19 (see the last column of the table), with a coefficient of variation around 0.8. The highest values were in the Northern regions: Lombardia, Emilia-Romagna, Piemonte, Marche, Liguria, Trento, Bolzano, and Valle d'Aosta. We can decompose the extent figure into the product of two components: the ratio of the IWS over the number of tests performed, and the number of tests per 100,000 inhabitants. The first term tells how the IWS relates to the number of tests (a measure of the detection rate). The second term is an index of how intense the search for IWS was in each region. Data show that the more intense the search, the more cases detected (a coefficient of correlation of 0.624). Despite the variability of the ratio between IWS and the number of tests performed, the extent variable is almost orthogonal to that measuring the tests per 100,000 inhabitants (a correlation coefficient of 0.1). That is, data suggest that the differential impact of the disease over the regions was not due to the differential search intensities.
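The decomposition of the extent can be written out as a short Python sketch. The function name `extent_decomposition` and the counts are hypothetical (they are not taken from Table 2); the extent is expressed here per 100,000 inhabitants.

```python
def extent_decomposition(iws, tests, population, per=100_000):
    """Split extent (IWS per `per` inhabitants) into two factors.

    Returns (detection_rate, tests_per_capita, extent), where
    extent = detection_rate * tests_per_capita.
    """
    if tests <= 0 or population <= 0:
        raise ValueError("tests and population must be positive")
    detection_rate = iws / tests               # IWS per test performed
    tests_per_capita = tests / population * per  # tests per `per` inhabitants
    extent = detection_rate * tests_per_capita   # IWS per `per` inhabitants
    return detection_rate, tests_per_capita, extent


# Hypothetical region: 500 IWS, 10,000 tests, one million inhabitants.
rate, intensity, extent = extent_decomposition(500, 10_000, 1_000_000)
```

Two regions with the same extent can thus differ in how that figure arises: a high detection rate with few tests, or a low detection rate with intensive testing.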
Fig 1 displays the proportions of the IWS population in the different health conditions (arranged by increasing number of deceased). The proportions of those deceased and cured presented a large variability (with coefficients of variation of 0.4 and 0.57, respectively). For those isolated at home, the variability was relatively low (CV = 0.18), whereas that of those hospitalized or in the ICU was somewhere in between (CV = 0.3 in both cases). Fig 1 illustrates well the challenge of transforming those data into an indicator of severity and gives a hint of how things can appear depending on the way of attaching values to the health conditions. To obtain the severity measure described in Section 2 (the so-called balanced worth), we just have to plug the data generating this figure into the web page mentioned in the Remark above. This measure tells us about the relative health situation of the IWS in the Italian regions. To facilitate the comparison, we normalize the values by setting Italy to 100. Table 3 reports the evaluation of the severity of COVID-19. There are two features worth commenting on. First, the variability was relatively small, with a coefficient of variation of 0.155. Second, some of the regions with higher severity were in the South, where the extent was much smaller.
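The two summary operations used in this section, the coefficient of variation and the normalization that sets a reference population to 100, can be sketched as follows. The function names and the values are hypothetical, and the sketch assumes the population standard deviation, since the text does not say which variant was used.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation over the mean (population standard deviation assumed)."""
    return statistics.pstdev(values) / statistics.mean(values)


def normalize_to_reference(values, reference, base=100.0):
    """Rescale a name -> value mapping so that `reference` maps to `base`."""
    return {name: base * v / reference for name, v in values.items()}


# Hypothetical severity values, for illustration only.
severity = {"Italy": 0.8, "Region X": 1.0, "Region Y": 0.6}
indexed = normalize_to_reference(severity, reference=severity["Italy"])
cv = coefficient_of_variation([severity["Region X"], severity["Region Y"]])
```

Since severity has one degree of freedom, the normalization changes only the units; the coefficient of variation is unaffected by it.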

A day in the life of Italy with Covid-19
The indicator we propose to measure the impact of COVID-19 on the community's health is simply obtained by weighting extent by severity. The resulting data appear in Table 3. The variability of the impact was extremely high, with a coefficient of variation above 0.8. Lombardia and Valle d'Aosta presented the highest impact, followed by Trento and Emilia-Romagna. The lowest impact corresponded to Sicilia, Calabria, and Basilicata.

Changes after one month of confinement
We now discuss how the situation changed between March 9 and April 8. Table 4 provides the relevant information for those two dates, setting Italy to 100 on March 9, both for severity and impact. There are several features worth commenting on. During this month the impact in the whole country multiplied by a factor of 10, whereas in some regions it multiplied by more than 40: Bolzano (70 times), Trento (60 times), Valle d'Aosta (54 times), Basilicata (49 times), Calabria (47 times), and Sicilia (43 times). All these regions exhibited low impact values on March 9 (especially the last three regions). The regions with a higher impact on March 9 display much smaller factors: Lombardia (6 times), Marche (8 times), Veneto (12 times), Piemonte (17 times), Liguria (18 times). As a consequence, the extreme diversity between the Italian regions, as measured by the coefficient of variation, exhibits a substantial reduction between March and April (from 0.347 to 0.185 for severity and from 1.519 to 0.853 for impact). This fact may indicate that confinement is an effective policy in fighting the disease.
Severity decreased substantially in most of the regions, with an overall reduction of 33%. This reduction was more intense in the regions with worse indicators, so that we observe a sharp decline in its variability, which dropped by almost one half. This suggests that the health system was responding correctly, and did so more intensely in the regions most in need.

Discussion
Many countries provide daily reports on the effects of COVID-19 regarding the spread of the infection and the numbers of people dead, hospitalized, in intensive care units, and cured. Those data evolve differently both within each population (they increase and decrease and do so at different rates) and between populations (e.g., countries, regions, age groups). These complex dynamics make it challenging to get an idea of the global impact of COVID-19 on the community's health. We have presented a protocol intended to address this evaluation problem. It measures impact as the product of extent and severity. Extent is simply the share of those registered as infected in the population, whereas severity is a more sophisticated indicator.

Severity
Severity is measured by comparing distributions across different health conditions of the populations affected by the virus. The type of comparison proposed here (the balanced worth) permits one to get a cardinal measure without having to attach weights to those health conditions. We depart, therefore, from other approaches based on setting ex-ante scores for those states (e.g., the weights used to weigh different health states in an advanced phase of the epidemic in the PSAF) and on the "disability-adjusted life years" metrics used to estimate the "burden of disease." The nature of the available information at that early stage of contagion makes it difficult to apply those evaluation formulae [4].
This severity measure is based on the relative likelihood of getting a worse health condition for a representative member of an affected population. The formula is intuitive and corresponds to a well-known mathematical tool, similar to the one used by Google to order web pages or the principle behind the Eigenfactor [14,15]. The evaluation obtains as the dominant eigenvector of a Perron matrix associated with a Markov chain [16]. Therefore, calculations are conventional, and we know precisely how the evaluation protocol works and what information it conveys, with the added advantages that it provides quantitative estimates and that a specific algorithm is freely available. It is worth mentioning that severity is not a variant of another elementary indicator, such as the lethality rate (i.e., the ratio between deceased and affected).

Population subgroups
Our way of comparing distributions implies that the evaluation is relative. That is, we obtain an assessment of how a population fares relative to others. This fact is essential both to understand the meaning of the evaluation and to think of the different questions this protocol allows us to address. Besides the types of evaluation presented here, regarding the comparison of different populations (Italian regions) at a given point in time and at different dates, we may consider different types of individuals (depending on age, gender, race, wealth, etc.) or particular population subgroups [10].
A population subgroup of particular relevance is that corresponding to those who are positive on the reference day, that is, those registered as infected who are in intensive care units, at the hospital, or isolated at home. Let us call PAP this population subgroup, as a shorthand for Positive At Present. The impact of COVID-19 on the PAP population is a measure of the effort currently required from the health system, as we discount from the population affected those already cured and those deceased. The shares of the PAP population in each of these conditions derive from the data in Table 1. They show that Lazio, Liguria, Lombardia and Piemonte were the regions with higher shares of people in hospitals (including those in ICU). Those with smaller shares corresponded to Friuli V.G., Molise, Sardegna and Veneto. Table 5 provides the evaluation of the Italian regions on April 8, in terms of severity and impact, for the PAP population. As was the case for the IWS population, the impact has much larger variability than the severity (a coefficient of variation of 0.759 versus 0.166). Valle d'Aosta, Lombardia, Trento, Emilia-Romagna, Liguria and Bolzano were the regions with a more substantial impact of COVID-19 on the PAP. Calabria, Campania, Molise and Sicilia were those with a smaller impact.
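Restricting the analysis to the PAP population amounts to dropping the deceased and cured from the registered cases and recomputing the shares over the remaining three conditions. A minimal Python sketch, with hypothetical condition labels and counts (not taken from Table 1):

```python
# Conditions kept in the PAP subgroup, ordered from worst to best.
PAP_CONDITIONS = ("icu", "hospitalized", "home")


def pap_distribution(counts):
    """Return (PAP total, shares over the PAP conditions) for one region."""
    pap_total = sum(counts[c] for c in PAP_CONDITIONS)
    if pap_total <= 0:
        raise ValueError("no positive cases at present")
    shares = [counts[c] / pap_total for c in PAP_CONDITIONS]
    return pap_total, shares


# Hypothetical cumulative counts for one region, for illustration only.
region = {"deceased": 100, "icu": 50, "hospitalized": 250, "home": 700, "cured": 400}
pap_total, shares = pap_distribution(region)
```

The PAP total divided by the region's population gives the extent for this subgroup, and the shares are the distribution that enters the severity comparison.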
It is also interesting to observe how severity changed over this month by comparing our two reference dates and anchoring the evaluation by setting Italy to 100 on March 9, as shown in Table 6. Those data reveal two remarkable features. First, the sharp decline of the severity values in all regions, to almost one half of the initial value for the whole country. Second, the even sharper reduction of the variability between regions (a 60% reduction in the coefficient of variation, which drops from 0.462 to 0.168). That suggests, once more, that the health system reacted in a balanced way, absorbing the shock according to need.

The dynamics of the pandemics
There is a strong suspicion that, in the initial phase of the pandemic, the number of people infected was much larger than reported. The multiplication of the tests and the surveys on seropositivity have started capturing people infected with light or no symptoms, so that nowadays data are richer and more informative [13]. Those data might induce a revision of the extent and severity of the pandemic in that initial phase, resorting to some statistical techniques. Indeed, there is already some statistical analysis on the "excess of deaths", which suggests that the number of people registered as deceased by the virus was also underestimated [17]. As a consequence, some revisions of the data on the evolution of the pandemic are to be expected. The new data available entail a change in the nature of the reference population and have implications for the impact analysis. Measuring severity in this richer scenario will require introducing another health condition, corresponding to infected people with light or no symptoms. As severity is a relative measure, the effect of introducing this new category will depend on the distribution of those infected but asymptomatic between the populations under consideration.
Introducing that new health condition is a trivial change in the analysis presented in Section 3. Note that, in a synchronic analysis, this modification presents no particular problem. Things are different in a diachronic analysis because the change in the population registered as infected in March 2020 and in September 2020, say, involves an implicit change in the criterion that defines those who are infected. It would thus be prudent, for the time being, to analyze the evolution of the impact within periods in which the recording criteria have not changed much, or to keep as the reference population those infected with worrisome symptoms (the IWS population used in our empirical application). Alternatively, one may smooth the effect of the change in the detection policy by carrying out the impact analysis on a very short period basis (e.g., daily) using a moving average approach [18,19].

Supporting information
S1 File. (DOCX)