Income distribution and health: What do we know from Chinese data?

In the last four decades, the problem of income inequality has gradually become one of the most serious social problems in China at both the regional and individual levels. Recently, the central government announced that the main social contradiction is that between people’s growing need for a better life and unbalanced and insufficient economic development. In this study, we analyse the effects of income distribution on individuals’ health using a series of indicators of income distribution and different measures of individuals’ health status. By utilizing data from the China Health and Nutrition Survey (CHNS) from 1989 to 2015, our empirical findings show that self-reported health (SRH), activities of daily living (ADLs), and diabetes mellitus appear to be negatively related to the income share of rich people when average income is equalized among counties, which indicates that individuals’ health will deteriorate as the income share of rich people increases. In addition, our results show that there is an inverted U-shaped relationship between income inequality, as measured by the county-level Gini coefficient, and individuals’ health status. We also find that income inequality affects health through the accessibility of healthcare facilities and public infrastructures and through hazardous health behaviours such as smoking and alcohol use. These findings suggest that reducing income inequality could be an important means of improving the overall health of China’s population.


Introduction
In the last four decades, income inequality has gradually become one of the most serious social problems in China. By 2015, China's Gini coefficient had reached 0.46, whereas international practice treats 0.4 as the "warning line" for the income gap between rich people and poor people [1]. Income inequality affects many aspects of social and economic development. In the field of population health, researchers have conducted numerous studies analysing the impact of income inequality on individuals' health and the possible channels of the effect. Many studies have also examined whether the expansion of income inequality worsens health inequality and whether the effects are heterogeneous among different income groups [2][3][4][5][6][7][8]. However, the previous findings are mixed.
From the 1970s to the 1990s, multinational macro data were mostly used to analyse the relationship between income distribution and overall population health. The Gini coefficient, the Theil index, and rich people's share of total income were commonly used to measure income distribution. Regarding health status, researchers often chose life expectancy, the mortality rate, and low birth weight in their analyses [9][10][11]. The conclusions reached in this period were quite consistent, indicating a negative relationship between income inequality and population health [12][13][14][15]. By the mid-1990s, with growing scepticism regarding the comparability and reliability of cross-border data and the increasing availability of micro-level data, researchers started to utilize individual-level data. Consequently, the focus of research shifted from overall population health to individuals' health, and individual health measures, such as self-reported health (SRH), replaced population health measures [16,17]. After controlling for family and personal characteristics and the mean income of regions, most empirical studies found a negative relationship between the Gini coefficient and SRH [18,19]. Other studies [20,21] showed that the widening gap between rich people and poor people caused serious social polarization, destroyed the social relationship between people, reduced social trust and social cohesion, and affected the health of members of society. Spencer (2004) [22] reached similar conclusions and pointed out that the wide income gap between rich people and poor people made low-income individuals feel depressed, which could exert significant psychological pressure and lead to negative emotions. When people live under such conditions, their risk of suffering from various chronic diseases, such as cardiovascular disease and depression, increases.
In contrast, using American data, Mellor and Milyo (2001) [17] found a significant relationship between the Gini coefficient and SRH when only income level and the Gini coefficient were considered; however, after personal characteristics were controlled for in the regression equations, the impact of income inequality on SRH was no longer significant.
Regarding studies using Chinese data, the findings are also mixed. Using two waves of data (1997 and 2000) from the China Health and Nutrition Survey (CHNS), Feng and Yu (2007) [23] found an inverted U-shaped relationship between county-level income inequality and SRH. Li and Zhu (2008) [16] found a significant association between community-level income inequality, as measured by the Gini coefficient, and SRH. In contrast, Chen and Meltzer (2008) [24] showed that there was no relationship between county-level income inequality, as measured by the Gini coefficient, and individuals' health, as measured by obesity and hypertension, in urban China. Bakkeli (2016) [25] found that the Gini coefficient did not have a significant impact on individuals' risks of having health problems, as measured by blood pressure and mid-arm muscle circumference (MAMC). Additionally, these results were robust to using the Theil index to measure income distribution.
The possible mechanisms of the effects of income inequality on individuals' health are as follows. First, income inequality could be a major source of social distrust and stress, which could have a direct influence on people engaging in behaviours that are harmful to their health (e.g., smoking and drinking) [26]. Second, when average income is equalized among communities, the larger the income inequality in a community is, the greater the likelihood that the low-income population will suffer from insufficient healthcare due to financial constraints. Third, income inequality will affect the accessibility of healthcare facilities and public infrastructures for poor people who are more likely to live in rural areas where the supply of public goods is usually insufficient due to unbalanced economic development [19,27].
This study utilizes a longitudinal dataset to analyse the relationship between income distribution and individuals' health in China. We make several contributions to the existing literature. First, we measure the income distribution in multiple dimensions, including the share of the county's total income held by the richest 5% of the population, the Gini coefficient, and the Theil index. Different measures have different emphases, and by including multidimensional measures of income inequality, we are able to analyse the impact of income distribution on health more comprehensively. Second, most of the previous literature adopted only one subjective health measure, SRH, which may not fully reflect individuals' health status. In addition to SRH, this study analyses several objective health indexes, including body mass index (BMI), activities of daily living (ADLs), and diabetes. Third, previous studies have mainly focused on the impact of income distribution on health status, and very few have expanded the discussion to the channels of the impact of income distribution. This study attempts to analyse whether income inequality affects health through the accessibility of healthcare facilities and public infrastructures and through hazardous health behaviours such as smoking and alcohol consumption. Lastly, we utilize a balanced panel to alleviate the problem of endogeneity and examine the dynamic effect of income distribution on individuals' health. In short, our empirical findings show that SRH, ADLs and diabetes are negatively related to the income share of rich people when average income is equalized among counties. Additionally, there is an inverted U-shaped relationship between the county-level Gini coefficient and individuals' health. Income inequality may also affect health through the accessibility of healthcare facilities and public infrastructures and through hazardous health behaviours.

Data source
The sample used in this study is drawn from ten waves of the CHNS. The CHNS was designed to investigate Chinese residents' health and nutrition status and how this status is affected by the social and economic transformation of Chinese society. Based on a multistage stratified random cluster process, the survey draws 7,200 households with over 30,000 individuals in both urban and rural areas from 15 provinces and municipal cities representing most parts of China's territory. We further restrict our sample to individuals who are at least 16 years old with complete information on their health status and other personal characteristics. As we need to construct income inequality indexes, families with zero or negative income are also excluded. Finally, we obtain two sub-samples for analysis: pooled cross-sectional data consisting of 78,235 observations in total and a balanced panel of 121 observations from each wave.

Dependent variables
Our dependent variables are various measures of individuals' health conditions, including BMI, SRH, ADLs, and diabetes, which are transformed into binary variables. Specifically, BMI is a widely used measurement for obesity and health status. The standard value ranges between 18.5 and 23.9, and a higher or lower value outside of that range indicates an unhealthy status. In this study, we regard a person as healthy if his or her BMI falls between 18.5 and 23.9 and unhealthy otherwise. In CHNS data, SRH includes five different categories ("very good", "good", "fair", "bad", and "very bad"). We use a dummy variable that takes the value of 1 if the survey respondent reports a good or very good health condition and 0 otherwise. ADLs are a series of basic activities necessary for a person's independent living at home or in the community [28,29]. They mainly consist of personal hygiene, dressing, eating, maintaining continence, and mobility. We applied a binary variable that indicates whether the respondents could complete all of these basic activities. Finally, diabetes indicates whether a respondent was diagnosed with the disease.
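The coding rules described above can be illustrated with a short sketch. It is not the authors' code: the function names, the input formats, and the lower-case answer labels are assumptions made for illustration only.

```python
# Hypothetical helpers; names and input formats are illustrative assumptions.

def bmi_healthy(weight_kg: float, height_m: float) -> int:
    """Return 1 if BMI lies in the 18.5-23.9 range used in the text, else 0."""
    bmi = weight_kg / height_m ** 2
    return int(18.5 <= bmi <= 23.9)


def srh_good(answer: str) -> int:
    """Return 1 for 'very good' or 'good' self-reported health, else 0."""
    return int(answer.strip().lower() in {"very good", "good"})


def adl_independent(can_do: dict) -> int:
    """Return 1 only if every listed basic activity can be completed."""
    return int(all(can_do.values()))
```

Each helper returns 1 for the healthy state, so the resulting variables can enter a binary-outcome model directly.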

Independent variables
The key independent variables are the county-level share of rich people, the Gini coefficient, the Theil index, and the percentile expenditure ratio (P90/P10). The county-level share of rich people is the share of the county's total income received by a certain percentage of the population, usually 5%, 10%, etc. Such a share can be specified for people with either the highest or the lowest income. The Gini coefficient is an index (between 0 and 1) used to evaluate the degree of inequality of a distribution based on the Lorenz curve. The more equal the income distribution is, the smaller the arc of the Lorenz curve and the smaller the Gini coefficient. As a measure of inequality, the Theil index has good decomposability, which means that when a sample is divided into multiple groups, the Theil index can measure the contribution of intra-group and inter-group inequality to the total inequality. P90/P10 represents the ratio of income at the 90th percentile to that at the 10th percentile. We use different indexes to reflect the income distribution within county-level administrative regions, which include both counties in rural areas and districts in urban areas. Using county-level administrative regions as the unit of analysis is reasonable because counties are a basic government branch in the fiscal system and are responsible for providing most healthcare services and other public infrastructures. Table 1 lists all individual-level variables, including income proxied by per capita family income, length of education, age, gender, ethnicity, marital status, occupation, hukou (urban or rural resident), and health insurance status [30]. Moreover, we control for several public infrastructure covariates, such as accessibility to tap water and the availability of indoor toilets, and other variables that reflect the supply of healthcare, including the number of medical institutions and the distance to the closest medical institution.
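The four inequality measures defined above can be computed from a list of incomes within one county. The following is a minimal sketch using standard textbook formulas, not the authors' implementation. The function names, the nearest-rank percentile rule, and the choice of the Theil T variant are assumptions.

```python
import math


def top_share(incomes, top=0.05):
    """Share of total income held by the richest `top` fraction of people."""
    ordered = sorted(incomes, reverse=True)
    k = max(1, round(top * len(ordered)))  # at least one person in the top group
    return sum(ordered[:k]) / sum(ordered)


def gini(incomes):
    """Gini coefficient from the sorted-rank formula (0 = perfect equality)."""
    x = sorted(incomes)
    n = len(x)
    weighted = sum((i + 1) * v for i, v in enumerate(x))
    return 2 * weighted / (n * sum(x)) - (n + 1) / n


def theil(incomes):
    """Theil T index: mean of (x/mean) * ln(x/mean); needs positive incomes."""
    mean = sum(incomes) / len(incomes)
    return sum((v / mean) * math.log(v / mean) for v in incomes) / len(incomes)


def p90_p10(incomes):
    """Ratio of the 90th to the 10th percentile (nearest-rank percentiles)."""
    x = sorted(incomes)
    n = len(x)

    def pct(p):
        return x[max(0, math.ceil(p * n) - 1)]

    return pct(0.90) / pct(0.10)
```

The Theil index takes logarithms of income, which is one reason non-positive incomes have to be excluded before the indexes are constructed.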

Empirical models
This study applies a probit model because our dependent variables are all binary. In Eq (1), we first use the share of rich people as the measure of income distribution to examine the influence of income inequality on individuals' health. In Eq (1), the subscripts i, c, and t denote that individual i from county c was interviewed in year t. The dependent variable is a binary variable indicating whether person i was in a good health condition or not. β1 is the coefficient of interest, which shows the association of individuals' health with the share of county income received by rich people after controlling for the average income of the county and other covariates. We use different shares (5%, 10% and 20%) of rich people to estimate the impacts of income inequality in a county. X is a vector of covariates including per capita county income and the other aforementioned covariates. μ is the independently and identically distributed error term. In addition, we use the county-level Gini coefficient or Theil index as the measure of income distribution, and an interaction term between per capita family income and the inequality index is added. This paper estimates the Gini coefficient at the county level. At present, China's counties are not only the basic unit in the financial system but also the basic scope of farmers' economic activities. County governments are fully responsible for public health and basic health services, especially in rural areas [23]. Subramanian and Kawachi (2004) [19] argue that US state-level governments are responsible for implementing policy with regard to residents' public health and education investment as well as other income redistribution policies. Income gap indicators measured at the state level have a significant impact on health, whereas at lower levels the role of the income gap is weak.
In China, the scope of social communication and economic activities has exceeded the scope of communities and towns; thus, it is necessary to investigate the income gap and its influence mechanism at a higher level. Therefore, it is reasonable to estimate the income gap by county-level units in China. Specifically, in the estimation equation, Eq (2), Y_ict and X_ict are defined as in Eq (1). Per capita family income is used as the income variable in this paper. This choice reflects the fact that an individual's health may depend more on family resources than on an individual's salary level because family members often make decisions based on the whole family's economic circumstances. The family is both a production unit and a decision-making unit; thus, per capita family income can more accurately reflect the effect of income and the income gap on health than the income of the individual as a member of the labour force. In addition, the equalization of total family income can better capture the utilization of family resources by all family members. Gini_ct is the Gini coefficient of county c in year t, and β1 is the key coefficient of interest. Gini_ct is replaced by Theil_ct when the Theil index is used as the measure of income distribution. The interaction term of the Gini coefficient or Theil index and per capita family income can help us explore the heterogeneous effects of income inequality on the health of people with different income levels.
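Written out, the two specifications described in this section might take the following latent-variable probit form. This is a sketch inferred from the variable definitions in the text, not the authors' exact notation. The intercept \beta_0, the coefficient vector \gamma, the coefficients \beta_2 and \beta_3, and the variable names RichShare and Income are notational assumptions.

```latex
\begin{align}
Y^{*}_{ict} &= \beta_0 + \beta_1\,\mathrm{RichShare}_{ct} + X_{ict}\gamma + \mu_{ict}, \tag{1}\\
Y^{*}_{ict} &= \beta_0 + \beta_1\,\mathrm{Gini}_{ct} + \beta_2\,\mathrm{Gini}_{ct}^{2}
              + \beta_3\,\mathrm{Gini}_{ct}\times\mathrm{Income}_{ict} + X_{ict}\gamma + \mu_{ict}, \tag{2}\\
Y_{ict} &= \mathbf{1}\left[\,Y^{*}_{ict} > 0\,\right], \qquad \mu_{ict}\sim N(0,1). \nonumber
\end{align}
```

Under this form, the probability of good health is the standard normal distribution function evaluated at the right-hand side without the error term, and Theil_ct replaces Gini_ct when the Theil index is used.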

The share of rich people and individuals' health
The income share of the richest 5% of the population (rich_share5) was calculated by dividing the total income of the richest 5% of the population by total income. Table 2 reports the regression estimates. We find that the income share of the richest 5% of the population has a significant negative effect on all health status measures except for BMI after controlling for the average income of the county and individual and family characteristics. Specifically, for every 1% increase in the share of rich people (rich_share5), the probability of being in very good or good health decreases by 4.81%, the probability of having no difficulties with ADLs declines by 6.16%, and the probability of being diagnosed with diabetes increases by 5.16%. The findings demonstrate that individuals' health will worsen as the income share of rich people increases. The same results are not found when BMI is used as the measure of health status. BMI is a diagnostic tool for obesity that can predict a person's health conditions; however, the cut-off points (lower than 18.5 or higher than 23.9) are ambiguous, and thus, it is not a well-recognized indicator of health [31]. Therefore, it is not surprising that our estimates show that present income inequality and BMI are not related. Regarding the control variables, the higher the per capita family income is, the better the SRH and ADLs, but the worse the BMI and the higher the risk of diabetes. County average income has a significant positive effect on all health status measures except for diabetes. Individuals' health status decreases with age, and men have a better health status than women. Unmarried people have a better BMI and lower rates of diabetes. The greater the number of years of education an individual has, the better his or her SRH, probably because he or she is more health conscious. Insurance status has no significant effect on individuals' health status.
We find that having more family members is beneficial to the health of the individual, especially when elderly individuals have difficulty moving and family members can provide the necessary help. In addition, rural residents have a better BMI and a lower risk of diabetes, which is related to their diet structure and eating habits. We also estimate the relationship between the share of the county's total income received by the richest 10% and 20% of the population and the indicators of individuals' health (Tables 3 and 4). The findings are highly consistent with those reported in Table 2. Again, individuals' health is negatively affected by income inequality when SRH, ADLs, and diabetes are used as health measures. The coefficient on income inequality remains nonsignificant when health is measured by BMI. The sign and significance of the control variables are also basically consistent with those shown in Table 2.

Gini coefficient and individuals' health
In this section, to measure income distribution, we use the Gini coefficient, which is a commonly used index reflecting the degree of income inequality between rich people and poor people [32]. Tables 5 and 6 report the regression results. We first estimate Eq (2) by including only the Gini coefficient, its squared term, and other covariates (Model 1). We then add an interaction term between the Gini coefficient and income level to test the heterogeneous effects of income inequality on populations with different income levels (Model 2). The results indicate that there is an inverted U-shaped correlation between the Gini coefficient and SRH, ADLs, and diabetes, meaning that individuals' health status will first improve with an increase in income inequality; however, health conditions will deteriorate significantly once a threshold is exceeded. The Gini coefficient thresholds for SRH, ADLs, and diabetes are 0.456, 0.442, and 0.425, respectively. According to the National Bureau of Statistics, the Gini coefficient in China in 2015 was 0.462, exceeding all three thresholds. Li and Zhu [16] found that the threshold of the Gini coefficient was 0.40. Again, BMI is not significantly affected by the Gini coefficient, which is similar to the previous findings using the share of rich people as the measure of income inequality. In Model 2, we include the interaction term between the Gini coefficient and income level to check the heterogeneous effects of income inequality. The estimates in Tables 5 and 6 show that the coefficients on the interaction terms are positive when SRH, ADLs, and diabetes are used as the health measures, meaning that an increase in income inequality will amplify the effect of income level on individuals' health. In other words, the expansion of income inequality has caused the health conditions of poor people to deteriorate more over the study period than those of their rich counterparts, further widening the health gap between rich people and poor people.
Regarding the control variables, the results are consistent with those shown in Table 2.
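The thresholds reported in this section are the turning points of the quadratic in the Gini coefficient. As a sketch, assume that the probit index in Model 1 contains the terms \beta_1 Gini + \beta_2 Gini^2, where \beta_2 is an assumed label for the coefficient on the squared term. Setting the derivative of the index to zero gives the turning point:

```latex
\frac{\partial}{\partial\,\mathrm{Gini}_{ct}}
\left(\beta_1\,\mathrm{Gini}_{ct} + \beta_2\,\mathrm{Gini}_{ct}^{2}\right)
= \beta_1 + 2\beta_2\,\mathrm{Gini}_{ct} = 0
\quad\Longrightarrow\quad
\mathrm{Gini}^{*} = -\frac{\beta_1}{2\beta_2}
```

An inverted U requires \beta_1 > 0 and \beta_2 < 0. Because the probit link is monotonically increasing, the turning point of the index is also the turning point of the predicted probability of good health.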

Possible channels
Why might increasing income inequality be correlated with worse health status? In this section, we propose several possible channels through which income distribution affects individuals' health status, including the accessibility of health care facilities and public infrastructures as well as smoking and alcohol use.

The accessibility of health facilities and public infrastructures.
A very high share of income held by rich people may indicate uneven regional development, and poor people living in underdeveloped areas are likely to have limited access to healthcare facilities and public infrastructures. A large number of studies have confirmed the impact of the accessibility of healthcare facilities and public infrastructures on health conditions [33]. In this study, we use the number of medical institutions in a county per capita and the distance to the closest medical institution to measure the accessibility of healthcare facilities. Additionally, we use tap water and indoor toilets to proxy for the availability of public infrastructures. From the regression results in Tables 7 and 8, we find that the impact of the accessibility of healthcare facilities and public infrastructures on health status is mostly positive and significant. For example, the higher the number of hospitals per capita in a county is, the better the health status of residents will be. More importantly, the absolute values of the coefficients on income inequality (Lnrich_share5) are smaller than those reported in Table 2, which suggests that part of the effect of income distribution on individuals' health status operates through the accessibility of healthcare facilities and public infrastructure investment. Next, we directly estimate whether income inequality affects the accessibility of healthcare and public infrastructures (Table 9). The coefficients on the share of rich people and the Gini coefficient are significantly negative, which means that a greater income gap is associated with fewer medical institutions and public infrastructures. These findings support our hypothesis that income inequality affects individual health through the accessibility of healthcare and public infrastructure. Increasing the supply of medical facilities and public infrastructures may help alleviate the adverse impact of the income gap on individual health.
Previous research [34] found that low-income people with lower social positions are more likely to be burdened with more pressure and to engage in unhealthy behaviours. The previous literature has shown that income disparity affects personal health by increasing hazardous health behaviours. For example, Ramos et al. [35] found that self-reported violent victimization is associated with an increased chance of engagement in hazardous health behaviours in all Brazilian state capitals, especially in areas with serious income inequality. We verify this potential channel by analysing whether income inequality affects the probability of individuals engaging in hazardous health behaviours, such as smoking and alcohol use. Table 10 shows that there is a significant and positive correlation between income inequality and both the amount of daily smoking and the frequency of drinking, which indicates that greater income inequality is associated with people engaging in more hazardous health behaviours.

Robustness check
Theil index and the percentile expenditure ratio (P90/P10).
When measuring income distribution, different income inequality indexes have different emphases. In this section, we use two different measures of income distribution to check for the robustness of our earlier findings. We re-estimate Eqs (1) and (2) by replacing the share of rich people and the Gini coefficient with the Theil index and P90/P10. The Theil index uses the concept of entropy in information theory to calculate income inequality. The advantage of the Theil index is that it can be decomposed to measure the contribution of both intra-group and inter-group income inequality to total income inequality. P90/P10 represents the ratio of income at the 90th percentile to that at the 10th percentile. The regression results in Tables 11 and 12 show that after replacing the share of rich people and the Gini coefficient with the Theil index and P90/P10, the coefficients of interest barely change in both sign and level of significance. We also find an inverted U-shaped relationship between the Theil index and P90/P10, on the one hand, and SRH, ADLs, and diabetes, on the other hand. The average Theil index and P90/P10 are on the right side of the inverted U-shaped curve, which also indicates that China's current income distribution has a negative impact on individuals' health, which is similar to the findings reported in section 3.1. In addition, the coefficients on the interaction terms between income level and the Theil index and P90/P10 are significantly positive, which is also consistent with our earlier findings.

Simultaneous causality.
In this area of research, the most common problem is simultaneous causality, which directly affects the reliability and accuracy of our estimates. On the one hand, income inequality will have an effect on individuals' health behaviours and, thus, their health outcomes; on the other hand, health status influences individuals' labour force participation and, thus, their personal income. However, current health status does not affect either income inequality or income in the past. Therefore, we substitute the current Gini coefficient and share of rich people with their lagged terms. The results in Table 13 show that the inverted U-shaped relationship between the lagged Gini coefficient and health outcomes still exists when SRH, ADLs, and diabetes are used as health measures. The coefficients of the lagged share of rich people are significantly negative, which is consistent with the results shown in Tables 2-4.
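The lagging step can be illustrated as follows. This is a minimal sketch rather than the authors' procedure: the function name and the data layout are hypothetical, and the lag is taken to be the previous available survey wave for the same county, which is an assumption because the text does not state the lag length.

```python
def lag_by_wave(values: dict) -> dict:
    """Map each (county, wave) to the same county's value in the previous wave.

    `values` maps (county, wave) -> inequality index. The first wave observed
    for a county has no predecessor and is dropped from the result.
    """
    by_county = {}
    for (county, wave), v in values.items():
        by_county.setdefault(county, []).append((wave, v))

    lagged = {}
    for county, series in by_county.items():
        series.sort()  # order waves chronologically
        for (_, prev_v), (wave, _) in zip(series, series[1:]):
            lagged[(county, wave)] = prev_v
    return lagged
```

Merging the lagged index onto individual records by (county, wave) then pairs current health with inequality measured one wave earlier.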

Balanced panel data.
In this part, we utilize a balanced panel dataset from the ten waves of data from 1989 to 2015, and the sample includes 121 observations from each wave. The balanced panel can alleviate the problem of endogeneity and examine the dynamic effect of income distribution on individual health status. Table 14 presents the regression results. Although an inverted U-shaped relationship between income inequality and health status still exists, the inflection points when SRH, ADLs, and diabetes are used as health measures are 0.274, 0.387, and 0.256, respectively, which are lower than those found in Tables 5 and 6. The coefficients on the share of rich people are still negative and significant. In general, the estimates using balanced panel data are consistent with those reported in previous sections.
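Constructing a balanced panel amounts to keeping only individuals who are observed in every wave. The sketch below shows one way to do this. The function name and the record layout, with a person identifier and a wave as the first two fields, are assumptions for illustration.

```python
def balanced_panel(records, waves):
    """Keep only records of individuals observed in every wave in `waves`.

    `records` is an iterable of tuples whose first two fields are
    (person_id, wave); any further fields are carried along unchanged.
    """
    records = list(records)
    required = set(waves)

    # Collect the set of waves in which each person appears.
    seen = {}
    for rec in records:
        seen.setdefault(rec[0], set()).add(rec[1])

    keep = {pid for pid, observed in seen.items() if required <= observed}
    return [rec for rec in records if rec[0] in keep and rec[1] in required]
```

With ten waves, a person who misses even one interview is excluded, which is why a balanced panel is typically much smaller than the pooled sample.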

Discussion and conclusion
This study utilizes CHNS data to estimate the impact of income distribution on individuals' health, and it discusses the possible channels of the effect. The main findings are as follows. First, increasing income inequality has a negative impact on SRH, ADLs, and diabetes when income inequality is measured by the share of rich people. However, BMI is not significantly affected by the change in income inequality. Second, when measuring income distribution by the Gini coefficient, we find an inverted U-shaped relationship between the Gini coefficient and all health measures except for BMI, which means that income inequality has a positive effect on health status when it is low, but once a certain threshold is exceeded, it has a negative effect on individuals' health. In addition, the results show that the widening of the income gap amplifies the effect of personal income on health, which implies that the increasing level of income inequality has caused the health conditions of poor people to deteriorate more over the study period than those of their rich counterparts. We also use different measures of income distribution, the Theil index and P90/P10, and the findings are mostly consistent. Lastly, we attempt to analyse the possible channels of the effects by analysing the accessibility of healthcare facilities and public infrastructures. The results indicate that income distribution affects individuals' health by affecting the number of healthcare facilities and public infrastructures. The expansion of income inequality also increases the possibility and frequency of people engaging in hazardous health behaviours such as smoking and alcohol consumption.
In conclusion, China's current income inequality is significantly higher than the international warning line, and this study shows that it is taking a toll on people's health. To alleviate this problem, the Chinese government should use all means to reduce income inequality and speed up investment in healthcare facilities and public infrastructures, especially in rural areas.