Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Distribution of self-reported health in India: The role of income and geography

  • Ila Patnaik ,

    Contributed equally to this work with: Ila Patnaik, Renuka Sane, Ajay Shah, S. V. Subramanian

    Roles Conceptualization, Funding acquisition, Methodology, Supervision

    Affiliation National Institute of Public Finance and Policy, Delhi, India

  • Renuka Sane ,

    Contributed equally to this work with: Ila Patnaik, Renuka Sane, Ajay Shah, S. V. Subramanian

    Roles Formal analysis, Methodology, Writing – original draft, Writing – review & editing

    Affiliation National Institute of Public Finance and Policy, Delhi, India

  • Ajay Shah ,

    Contributed equally to this work with: Ila Patnaik, Renuka Sane, Ajay Shah, S. V. Subramanian

    Roles Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Software, Writing – original draft, Writing – review & editing

    ajayshah@mayin.org

    Affiliation xKDR Forum, Mumbai, Maharashtra, India

  • S. V. Subramanian

    Contributed equally to this work with: Ila Patnaik, Renuka Sane, Ajay Shah, S. V. Subramanian

    Roles Conceptualization, Methodology, Validation, Writing – review & editing

    Affiliation Harvard University, Cambridge, MA, United States of America

Abstract

An important new large-scale survey database is brought to bear on measuring and analysing self-reported health in India. The most important correlates are age, income and location. There is substantial variation of health across the 102 ‘homogeneous regions’ within the country, after controlling for household and individual characteristics. Higher income is correlated with better health in only 40% of India. We create novel maps showing regions with poor health, that is attributable to the location, that diverge from the conventional wisdom. These results suggest the need for epidemiological studies in the hotspots of ill-health and in regions where higher income does not correlate with improved health.

Introduction

Health policy is ultimately about creating conditions in which people are healthy. The wellness of the people is the outcome of interest. Many plausible health outcome measures reflect different dimensions of wellness. In the Indian health literature, infant and maternal mortality are the dominant measures which have been employed. For example, [1] find that improved prenatal care impacts neonatal and infant mortality, while [2] find that despite noticeable improvements in maternal health care utilisation since the implementation of National Rural Health Mission (NRHM) in 2005, India did not achieve the target of millennium development goal to reduced maternal mortality ratio by 2015. However, maternal and child mortality represent a subset of the population, and narrow government programs have influenced these metrics without reshaping the broader health of the population.

Medical researchers are able to use objective metrics from medical tests, such as blood pressure, hypertension, anemia, diabetes etc. as a measure of the health of an individual [35]. Aggregate measures of disease burden also includes measurement of Disability Adjusted Life Years (DALYs) which captures the reduction in life expectancy and the diminished quality of life [6]. However, such measurement, done consistently across time and space, involves the construction of complex datasets which is infeasible in less developed countries.

Death can be accurately measured, which suggests pathways for health outcomes measurement through death rates and longevity. For example, [7] calculate a M-cure for India, while [8] study the mortality from cervical cancer in Indian states, while [9] use survey data to study links between household economic status and adult mortality. Measures based on mortality in the overall population are impeded by the limitations of official mortality-related statistics. [10] provides a description of the limitations of official statistics.

One path to measuring the health status of an individual is to to directly ask individuals to assess their own health. This is termed Self-Report Health (SRH). For example, the WHO has used the question: In general, how would you rate your health today?. Responses to such a survey question can be binary, which naturally suggests an ill-health rate which is the fraction of a given set of people who report that their self reported health is not good. Alternatively, a five-point scale can be used, e.g. In general, would you say that you health is excellent, very good, good, fair, or poor?.

When compared with objective measures based on medical tests, SRH measurement is inexpensive and less intrusive. From the viewpoint of the foundations of human welfare, asking a person if they are feeling well is of essence [11]. At the same time, SRH suffers from four limitations:

  1. SRH inevitably introduces a psychological filter in determining whether there is ill health. For example, [12] find that nearly 40% of respondents change their answers between two interviews done just one month apart, consistency in answers was not influenced by health events, and reliability was worse for disadvantaged groups. There also appears to be a gender gap in reporting of health [1315]. The under-reporting seems to hold even after controlling for objective health measures [16]. These psychological factors might be correlated with wealth: the tooth ache or fever that makes an upper class person report ill-health might not trigger the same response from a poor person.
  2. The precise phrasing of the SRH question in a household survey matters in interpreting the results. Many different designs of the question can be applied, e.g. Are you feeling well today? or Have you been healthy in the last month? or In the last week, were you unwell for at least one day?. As a consequence, the numerical values of the SRH rate are not comparable across survey datasets.
  3. The SRH measure also suffers from limitations on comparability between age-groups. For example infants partly cannot communicate their health status to the person responding the survey, and forms of behaviour can be misinterpreted. Further, these other persons (e.g. parents) can be more anxious when their children are very young in contrast to only young.
  4. The SRH approach should not be used for specific morbidities. Disorders like diabetes are not discerned by the person for many years. For example, [17] find that the self reported incidence of hypertension and lung disease underestimate the disease burden in India. Other studies find that the prevalence of non communicable diseases is under-reported when viewed through SRH, especially among the poor [18, 19].
  5. Access to sound health care, and particularly testing, is likely to influence perceived health. This could generate more accurate answers in households with more access to health care.

While SRH has flaws, it contains useful information about the health of an individual. A successful empirical literature has utilised SRH in health research. It correlates with objective health outcome measures [20, 21]. It does reasonably well on predicting mortality, especially among elderly populations in developed countries [2225]. In India, [21] use the 2002 WHS data and find that SRH is a useful measure. Similarly [26] use the NFHS as well as the NSS data and find that the use of self-rated ill health has validity in relationship to socioeconomic status. [20] finds a correlation between SRH and disease burden in China.

As a consequence, over the years, SRH measurement has emerged in many survey datasets, such as World Health Survey 2002 and SAGE 2007 conducted by WHO, the Health and Retirement study (HRS) and the National Health and Nutrition Examination Survey (NHANES) in the US, the Survey of Health, Ageing and Retirement in Europe (SHARE), the English Longitudinal Study of Ageing (ELSA) in the UK. The HRS largely samples the elderly, while the NHANES samples adults and children, and they have used a five-point scale with the categories excellent, very good, good, fair and poor. The ELSA uses a slightly different five point measure: very good, good, fair, bad and very bad. Another example is the SF-36, a short-form 36-item questionnaire developed by the Rand Corporation for medical outcomes measurement in in the US. It is a patient-reported survey of health as measured along eight multi-item categories. Each of these datasets has resulted in evidence about SRH which has fed into a substantial downstream literature. For example, see the literature on understanding the link between income inequality and self-reported health [2730].

In this paper, we exploit the Consumer Pyramids Household Survey (CPHS), a longitudinal dataset which measures about 170,000 households, three times a year. There is one other panel data set in India, the Indian Human Development Survey (IHDS), where households are observed 10 years apart. While a great deal of the health literature in India is based on the National Sample Survey Organisation (NSSO) and the National Family Health Survey (NFHS) data sets, these are repeated cross-sections, and the existing literature on SRH with these data in India is relatively limited.

We obtain foundational facts about ill health for individuals in India at one point in time (calendar years 2018 and 2019). We study the variation of ill health with age, and the impact of a variety of socioeconomic factors such as income, gender, caste, religion and education.

We explore the variation of SRH by location. The dataset divides the country into 102 ‘homogeneous regions’ (HRs). In myriad other contexts, the evidence shows substantial heterogeneity within the country, across the HRs. We explore geographical heterogeneity in ill-health, and in the impact of income upon health.

Ultimately, multiple health outcome measures—medical tests, SRH, mortality—need to come together into a rich literature on the causes and consequences of ill health, which can inform the decision making of individuals, health care providers, and public health. The CPHS longitudinal data makes possible a new literature where the causes and consequences of health can be explored. This paper constitutes a first building block for that research program.

Methods

Ethics statement

The data used in the paper is sourced from Consumer Pyramids Household Survey database which is collected by the Centre for Monitoring the Indian Economy (CMIE). We did not seek IRB approval since the process of data collection was not undertaken by us. CMIE takes the consent of every household that participates in the household survey.

Data sources

We use data from the Consumer Pyramids Household Survey (CPHS) carried out by the Centre for Monitoring Indian Economy (CMIE). This is available to all researchers upon payment of a subscription fee. The data can be sourced from https://consumerpyramidsdx.cmie.com. (Queries related to subscription may be sent to Mr. Prabhakar Singh, Head, Institutional Business, CMIE. sprabhakar@cmie.com.) The CPHS collects high-quality data through face-to-face interviews with households. Answers provided by respondents are captured in a mobile phone through a specially developed app. CPHS measures a panel of households at three points in each year. Households are met with in three waves a year; Wave 1 runs from January—April, Wave 2 from May—August, and Wave 3 from September—December.

The sample is nationally representative and selected through a multistage stratified design. The count of sample households increased from 166,744 households in the January—April, 2014 wave to 174,405 households in the September—December, 2019 wave. Over the six years a total of 65,892 households were added to the original sample of 166,744 households and 58,231 households were dropped for various reasons. The broadest level of stratification is the “Homogeneous Region (HR)”, a set of neighbouring districts that are similar in three dimensions: agro-climatic conditions, urbanisation levels and female literacy. India is partitioned into 102 such HRs in the CPHS. The HR is further divided into stratum—which is either rural (all villages in the HR) or urban (towns which are further classified into four different stratum based on their size) region within a HR. Towns with more than 200,000 households are classified as Very Large stratum, between 60,000–200,000 are classified as Large stratum, between 20,000–60,000 households are medium stratum and below 20,000 households are small stratum. Thus, each HR is further disaggregated into 5 stratums, 1 rural and 4 urban. The primary sampling units (PSUs) in the survey are villages and towns from the 2011 Census of India. The ultimate sampling units are the households from these primary sampling units. CPHS questions are of two broad categories: those that are asked at the household level (such as expenditure, income and asset ownership) and those that are asked of each member of the household (such as demographics, religion, caste, education, and health).

In any narrow period of time, there can be special problems like a local epidemic or a natural disaster, which can generate higher ill-health in one particular region. Some aspects of the disease burden can be seasonal; e.g. reduced water quality in summer or respiratory ailments in North India in the winter. In the year 2017, there is the possibility of an adverse impact upon health of the Demonetisation event of November 2016. On 8 November, 2016 the government of India banned currency denominations of Rs.500 and Rs.1000, wiping out 86% of the currency overnight. These notes were replaced by newer notes of denominations of Rs.500 and Rs.2000. In the year 2020, there was the pandemic. Hence, in this paper, we use data from the six waves of 2018 and 2019. Averaging over six waves helps remove isolated episodes of ill health and seasonal factors. The results in this paper can be viewed as a characterisation of baseline conditions in India, against which special periods such as the pandemic can be compared in future research.

The CMIE measurement process thus involves one member—the respondent—who responds on behalf of all members in the household. The text of the question is “Does the member feel healthy as of today?”. To that extent, this measure is not self-reported health, but the state of health of an individual, as observed by the survey respondent in the same household. This may help ensure that relatively minor conditions, which are known to the person but not the respondent, do not influence the answer. Similarly, mental health issues could influence SRH as reported by an individual but not the information obtained from an observer in the household.

Variables of interest

  1. Age We form six discrete age bins: 0–4, 5–9, 10–34, 35–49, 50–59 and 60+. We expect a U-shaped curve where the very young and the very old are more unwell. This is because infants have a weak immune system and suffer from a high incidence of infectious disease. The elderly are gradually brought down by the process of aging.
  2. Social and demographic characteristics Aggregate facts about the incidence of ill health will be shaped by social as well as demographic characteristics. Towards this, we examine the variation in the age-specific SRH rate across gender, education, caste, and religion.
  3. Income Income could influence health through four pathways: nutrition, housing quality (which would influence hygiene and physical access for disease vectors), knowledge and health care. In this paper, we seek to measure the overall correlation between income and ill-health, summing across these four factors. There are always concerns about adequately capturing the income of a household through a survey. For example, [31] presents a discussion on the challenges of measurement through a household survey. However, it is the best available measure of cash-flows of households, and their ability to seek better preventive health as well as health-care. In order to avoid endogeneity bias (where ill-health triggers off reduced income), income in month t is expressed as an average of income over the previous 12 months.
    We use the total income of the household reported in the CPHS. It is the summation of total income of every earning member and the income of the household collectively, which cannot be attributed to any individual member. This includes income received from all sources such as rent, income earned from self production, private transfers, wages, overtime, bonus, etc. Household income as reported in CPHS is converted into real rupees using the Consumer Price Index. The base year of CPI is 2012–13.
  4. Location We examine how ill health varies by location. This is interesting, in and of itself, as it shows where the unwell persons are. This shows the spatial distribution of health care requirements. In addition, this can potentially yield insights on public health interventions that can improve health.

Odds ratios of ill health

We estimate a pooled logit model explaining ill-health. Standard errors are clustered at the level of the “primary sampling unit (PSU)”. These are the towns for the urban sample and villages for the rural sample. We estimate three models:

  1. We first model the log odds of reporting poor health using binary regression with a logit link function and robust error variance, given as: where is the odds that self reported poor health for individual i = 1, 0 otherwise. β0 represents the log odds of reporting poor health for the reference category. βXi represents the change in the log odds of reporting poor health for a one unit change in a vector of independent variables (age, sex and education, income quintiles). Here, the odds of an event is the ratio of the probability of the event happening to the probability of not happening (i.e ).
    In this model, we have time fixed effects, T, to permit systematic change of the overall ill-health rate by wave.
  2. The location may influence ill health. This could derive from differences in state performance on public health, e.g. on problems such as pollution control. To measure the variation of ill-health by location, after controlling for individual characteristics, we allow the intercept to vary by location (k). We estimate:
    This HR fixed effects estimate effectively has a distinct intercept k for each homogeneous region. This yields estimates for 102 coefficients for the HRs of India, which can be viewed as properties of these locations. A sorted list of these coefficients represents a sorted list of the places in India where certain features of these locations are associated with the best or worst health, after controlling for household characteristics.
  3. Finally, we also allow the slope in income to vary by location. We estimate:
    With this in hand, there is an overall model that applies for the whole country, but the slope and intercept for log income are measured on a per-HR basis.

With these estimates in hand, we perform certain counter-factual calculations for one canonical individual. We focus on a Hindu male, who is of the SC/ST caste, who is in the 35–49 age group, and is educated upto Class12/Diploma level. The person is in a rural household which has a real income of Rs.13,000 per month (about USD 266 per month). This is approximately the modal individual in the dataset. Holding these individual and household characteristics constant, we explore the extent to which the predicted ill health probability changes when this person is moved across each HR. This gives us a way to visualise the impact of location on SRH. It shows how ill health varies within India, after removing non-comparability owing to differences in age and income.

Results

The data contains information on 736,945 unique individuals from 170,804 unique households across the six waves from Wave 1 2018 till Wave 3 2019. The overall sample size is 3.5 million observations: on average, each unique individual is measured 4.5 times. S1 Table in S1 File shows the sample size across each wave used in the dataset. The panel is unbalanced: some households are present in less than three waves.

S2 Table in S1 File presents summary statistics about the dataset. There are 47.18% females. About 69.5% of the sample is in the 10–49 age group. While 84% of the sample is Hindu, 11% is Muslim. The education measure that we focus on in this paper is the education level of the most educated person in the household: this person may be expected to process information and help make health-related decisions for everyone in the household.

The overall estimate of self-reported ill health rate in the dataset is 3.25%. This overall average SRH rate implies an estimated 44.4 million individuals in India were reported as unhealthy on any one given date. It also implies that, on an average, individuals are unwell for about 12 days a year.

Fig 1 presents the share of unhealthy people in every HR. The map shows many intriguing facts that invite greater exploration. A few regions stand out as having a greater ill-health rate: Uttarakhand, West Haryana, East Uttar Pradesh, Bengal, Assam, Telangana and Andhra Pradesh, and Kerala. Both Himachal Pradesh and Uttarakhand are mountainous regions, but ill health in Uttarakhand is much worse.

One concern about SRH measurement is the extent to which psychological biases are systematically present in certain cultures. When we look at the map, there are many states within which we may expect a certain degree of cultural homogeneity, but the SRH values are heterogeneous. Haryana, Kerala, Uttar Pradesh and West Bengal show such features. This helps increase our confidence in the extent to which the psychological aspects of SRH measurement are not primarily shaped by culture.

Fig 1 shows the rates of self-reported ill health across the country, regardless of the reason why this might be happening. This map directly illuminates health care requirements. From the viewpoint of understanding the health of the people, of course, we must recognise that many factors are at work in generating this heterogeneity. As an example, high levels of ill health in Kerala may reflect the greater share of the elderly in Kerala.

We turn to examining the variation of the ill-health rate by individual and household characteristics. Fig 2 shows the age variation of the ill-health rate. It is useful to recall that an ill health rate of 1% corresponds to 3.65 unhealthy days per year. The graph, where the y axis is in log scale, shows a U shaped pattern. A little over 2% of infants (0–4 age group) are reported as unhealthy. This declines to the lowest ill-health at the 20–24 age group, and then degrades with age. For the entire age range from 10 to 39 years, the ill health rate is relatively low, with a peak value that is slightly above 1%. As we move to the elderly, there are large jumps in the share of unhealthy people.

Table 1 shows the fraction of people who reported being unhealthy in each age group, interacted with socio-economic characteristics. The first row—Overall—represents a useful comparison point for the other rows.

In the 35–49 age range, there is greater ill-health with the rural population, for women and for Muslims. The upper caste shows the highest ill-health rate in the 35–49 age range. When the most educated person in the household is in the lowest education category (none or primary), the ill-health rate is much higher, at 2.08%, in the 35–49 age range and 1.23% in the 10–34 age range.

Higher household income is generally associated with lower ill health rates. For all the age ranges from 10 to 59, there is a monotonic relationship: richer households have reduced ill health. But this is not the case below age 10 and above age 59.

A survivorship bias may be at work. If poor people are more prone to die, when prosperity arrives, mortality may improve, so that persons are more likely to be alive and report they are unwell. The ill-health rate for age 0–4 is at 2.21% in the lowest income quintile and actually worsens to 2.31% at the top quintile. Similarly, when it comes to the elderly, the ill health rate falls to 19.46% in the second income quintile, but rises to 20.88% for the middle and then the best value of 17.65% is obtained at the top income quintile.

Finally, there are strong location effects visible in this table. In the age 35–49 range, we get a large variation by region. Two regions are well above the overall average of 1.66% (4.13% (North-East), 3.34% (East)) and three are well below (1.04% (West), 0.73% (Central) and 0.51% (South)).

While the patterns in Table 1 are quite revealing, all these covariates are correlated with each other. It is, therefore, useful to look at the adjusted odds ratios from the logit regression where these covariates are all present at once in the model. These are presented in Table 2. Columns (1) presents the odds-ratio and the 95% confidence intervals and clustered standard errors for a single model that covers all households, with only wave (time) fixed effects. Column (2) presents the same, with HR fixed effects. The single intercept for the whole country is broken out into 102 intercepts for the 102 HRs. Finally, Column (3) presents the results with HR controls and HR and income interaction effects. Instead of a single coefficient for log income for all households, this coefficient is permitted to vary by HR.

thumbnail
Table 2. Odds ratios from logistic regressions that predict self reported (ill)Health.

https://doi.org/10.1371/journal.pone.0279999.t002

The variation of the odds-ratio in Model (2), by age, is consistent with the variation seen in the simple summary statistics of Fig 2. The coefficient of log income is statistically and economically significant. A striking feature of Model (2) when compared with Model (1) is that once we control for age, income and location, the importance of religion and caste subsides. The most important sources of variation of SRH, in Model (2) and (3) are age, income and location. In Model (3), there are small enhanced ill health rates for women (OR = 1.048) and one education categories (OR = 1.070 for Class 10).

When we compare Model (1) versus Model (2) there is a large gain in the Bayesian Information Criterion (from 804,737 to 731,282). This suggests that the HR fixed effect (variation of the intercept of the model by HR) is particularly important. In comparison, there is a reduced gain in going from Model (2) to Model (3) (from 731,282 to 729,763).

Fig 1 had shown us the unconditional geographical variation of ill health. With the models of Table 2 in hand, we can control for some explanatory variables and focus on geographical variation. In order to do this, we make predictions for the ill-health rate for the canonical individual: a Hindu, male, in the 35–49 age group, of the SC/ST caste category, with a Class 12/Diploma education, in a rural household with a real income of Rs.13,000 per month. This is approximately the modal person in the dataset. Within each homogeneous region, we compute the predicted probability of ill health, using Model (3). These predicted probabilities are mapped in Fig 3.

thumbnail
Fig 3. Model-based predictions of the probability of ill health for a fixed individual.

https://doi.org/10.1371/journal.pone.0279999.g003

The comparison between the unconditional geographical variation of ill health (Fig 1) and the geographical variation of ill-health after controlling for individual characteristics (Fig 3) is revealing. The range of values, in the unconditional map, is much higher, with values ranging from 0 to 15%. Once we narrow down to the canonical person, important sources of variation (age and income) are removed from the picture. Persons in the 35–49 age range are healthier than the overall population. Hence, the range of values seen in Fig 3 is smaller: from 0 to 4%.

When a hot spot like south Kerala is visible in Fig 1, this could be associated with age structure, income or location. When it shows up (more weakly) in Fig 3, this suggests there is some feature of the location which is associated with greater ill health, which is not merely induced by age structure and income.

A striking feature of the two figures is the extent to which they look similar (though of course the scale is quite different). This is consistent with the statistical findings in Table 2, that location has a powerful impact upon ill health.

The overall result in Model 2 and Model 3 is that health improves with log income. Model (3) permits the slope in income to also vary by location, thus yielding 102 coefficients, for the slope on log income in each HR. Fig 4 helps us visualise the estimated slopes. Regions that are coloured blue are those where higher income is associated with reduced ill health, where the null hypothesis of a slope of 0 is rejected at a 95% level of significance. This property is found in only 49% of the HRs of India. With 35% of the HRs, H0 is not rejected. In 16% of the locations, shown in red, ill health is greater for households with higher income. We also test for the non-linearity between income and ill-health through a cubic spline based regression. The predicted probability of ill health is seen to linearly decline with income as presented in S1 Fig in S1 File.

Discussion

The paper finds that there is a U-shaped relationship between age and self-reported ill health. This is similar to previous research that finds that a sizable number of elderly who report a poor self-rated health [32]. Women have worse self-related health status, also consistent with other literature [33]. The conventional wisdom of health economics expects that higher income generates better health, through a combination of improved nutrition, housing (which impacts on hygiene and disease vectors), knowledge and health care [34]. In the aggregate, in our results, an increase in log income is indeed associated with reduced ill health. While this paper makes no causal claims, ultimately, if there is a causal and positive link between income and health, then mere economic growth would help improve health (while recognising that some of this impact flows through higher public and private expenses on health).

However, the interesting finding of the paper is the role of location in determining health. In fact, after location, several of the social variables such as caste and religion lose their statistical significance. We find that there is a large variation in the self-reported health of the identical person depending on the location of the person. Table 3 shows the names of the 10 HRs of India with the highest and lowest ill health rates. We may offer some conjectures about the epidemiological phenomena at work in Fig 3. The region of eastern Uttar Pradesh, Assam and Bengal is known to face high arsenic contamination of water [35, 36]. The adverse health outcomes of arsenic contamination in the form of skin lesions, neurological effects, obstetric problems, cardiovascular effects and cancers typically involving the skin, lung, and bladder [3739]. Our reported measures of poor health in Eastern India may reflect this phenomenon.

thumbnail
Table 3. The HRs with the highest and lowest ill health rates.

https://doi.org/10.1371/journal.pone.0279999.t003

Similarly, the high ill health in Uttarakhand, as opposed to the neighbouring hilly state of Himachal Pradesh, might be associated with religious pilgrimage and festivals [40, 41] where pilgrims from all across India bring novel pathogens to Uttarakhand, and unhygienic crowding of pilgrims within Uttarakhand create conditions where more virulent strains succeed. There is a need for further research in understanding the sources of this variation, particularly the variation seen in the predicted ill health rates for the modal person which reflect characteristics of these locations that can potentially be influenced by modified strategies in public health.

The second surprising result is that there is considerable geographical heterogeneity in this slope. In about 40% of India, the relationship is negative (the rich have less ill health). But in the rest of India, this association is not observed. There is a significant area where health is worse for higher income households. The two maps (Figs 3 and 4) also do not fit within a simple North India vs. South India stereotype. While the population centres of Uttar Pradesh and Bihar are considered to be poverty traps, the canonical person is only particularly unhealthy in Eastern Uttar Pradesh. There are some parts of UP and Bihar where the rich are healthier than the poor. The HRs which have greater difficulties are present in many parts of the country and call for a revision of our priors about where ill health in India is found. The phenomena seen in the two maps (Figs 3 and 4) are not the same. The regions where ill health worsens for high income people are not the same as the regions with high ill health. These are distinct phenomena that require distinct explanations.

The impact of increased income upon health through improved nutrition and housing is likely to operate all across the country in relatively uniform ways. If health care is completely absent, there should be a negative slope in income as higher income is likely to generate better nutrition, housing and knowledge. If government-supplied health care works well, all income groups would get access to comparable health care, and a negative slope in income would still be generated through the impact of nutrition, housing and knowledge. If private health care works well, higher income would generate better purchases of health care services, and a negative slope in income would arise. It is thus a puzzle, to explain the white areas (rich and poor are similarly healthy) and red areas (the rich have higher ill-health).

We conjecture that this is related to health care and survivorship bias. High income households in certain regions may get low quality care, to a point where it overshadows the other channels of impact (improved nutrition, housing, knowledge). If there is a large income gap in mortality, for the young and the old, then high income may generate more survivors who are in a state of ill health, thus generating a zero or negative slope for health in income. Another explanation can be found in the work by [42] who finds that health improvements in poor countries are not driven by income, or knowledge or technology, but by social factors that improve health delivery, and political factors that make population health a priority.

There are two limitations of this work. The standard difficulties of SRH measurement are present here: where health is observed through an intermediating psychological filter which introduces a certain degree of imprecision. To the extent that these psychological characteristics are correlated with the variables of interest, the statistical estimates are biased. The second limitation of this study is that the logistic regressions shown here lack a causal interpretation. These calculations should be viewed as merely describing features in the data. They do not guide interventions where certain features are changed in order to impact upon SRH.

Conclusions

In this paper, we have undertaken a novel examination of a health outcome measure in India in a large-scale household survey database pertaining to calendar years 2018 and 2019, drawing on the health status seen in 3.2 million records for individuals. In the existing health literature, there is a substantial analysis of the health of infants, children and mothers. In this paper, we analyse the entire population.

The overall aggregate ill-health rate is about 3.25%, which maps to about 44 million persons being unwell on any one day. The average individual is unwell for about 12 days a year. There is a U-shaped curve, with a long zone of reduced ill health rates from age 10 to age 39.

The important correlates of SRH are age, income and location. Once these are controlled for, other individual characteristics that were explored here (education, caste, religion, gender) are relatively unimportant.

Location is remarkably important. Health researchers, and health policy makers, need to look beyond all-India averages, and recognise the immense variation within the country. The three maps constructed in the paper—the raw ill-health rate, the ill-health rate of the modal person and the regions where higher income is correlated with improved health—are novel and go beyond the simple preconceptions of North India vs. South India. The geographical variation shown here merits further exploration.

References

  1. 1. Upadhyay AK, Singh A, Srivastava S. New evidence on the impact of the quality of prenatal care on neonatal and infant mortality in India. Journal of Biosocial Science. 2019;52(3):439–451. pmid:31496456
  2. 2. Ali B, Chauhan S. Inequalities in the utilisation of maternal health Care in Rural India: Evidences from National Family Health Survey III & IV. BMC Public Health. 2020;20(1):(1–13).
  3. 3. Puri P, Singh SK, Srivastava S. Reporting heterogeneity in the measurement of hypertension and diabetes in India. Journal of Public Health. 2020; 28(1):p. 23–30.
  4. 4. Charan J, Tank N, Reljic T, Singh S, Bhardwaj P, Kaur R, et al. Prevalence of multidrug resistance tuberculosis in adult patients in India: A systematic review and meta-analysis. J Family Med Prim Care. 2019;8(10):3191–3201. pmid:31742141
  5. 5. Vart P, Jaglan A, Shafique K. Caste-based social inequalities and childhood anemia in India: results from the National Family Health Survey (NFHS) 2005–2006. BMC Public Health. 2015;15(1):1–8. pmid:26044618
  6. 6. Krishnamoorthy K, Harichandrakumar KT, Kumari AK, Das LK. Burden of chikungunya in India: estimates of disability adjusted life years (DALY) lost in 2006 epidemic. J Vector Borne Dis. 2009;46(1):26–35. pmid:19326705
  7. 7. Creedy J, Subramanian S. Mortality Comparisons’At a Glance’: A Mortality Concentration Curve and Decomposition Analysis for India. Sankhya B. 2022; 84(2):873–894.
  8. 8. Singh M, Jha RP, Shri N, Bhattacharyya K, Patel P, Dhamnetiya D. Secular trends in incidence and mortality of cervical cancer in India and its states, 1990–2019: data from the Global Burden of Disease 2019. BMC Cancer. 2022;22(1):1–12. pmid:35130853
  9. 9. Barik D, Desai S, Vanneman R. Economic Status and Adult Mortality in India: Is the Relationship Sensitive to Choice of Indicators? World Development. 2018; 103:176–187. pmid:29503495
  10. 10. Jha P, Gajalakshmi V, Gupta PC, Kumar R, Mony P, Dhingra N, et al. Prospective Study of One Million Deaths in India: Rationale, Design, and Validation Results. PLOS Medicine. 2006;3(2: e18). pmid:16354108
  11. 11. Singh B. Anthropological investigations of vitality: Life force as a dimension distinct from space and time HAU: Journal of Ethnographic Theory. 2018;8(3):550–565.
  12. 12. Zajacova A, Dowd JB. Reliability of Self-rated Health in US Adults. American Journal of Epidemiology. 2011;174(8):977–983. pmid:21890836
  13. 13. Hirve S, Juvekar S, Lele P, Agarwal D. Social gradients in self-reported health and well-being among adults aged 50 years and over in Pune District, India. Global Health Action. 2010;3(1):2128.
  14. 14. Singh L, Arokiasamy P, Singh PK, Rai RK. Determinants of Gender Differences in Self-Rated Health Among Older Population: Evidence from India. SAGE Open. 2013;3(2):2158244013487914.
  15. 15. Boerma T, Hosseinpoor AR, Verdes E, Chatterji S. A global assessment of the gender gap in self-reported health with survey data from 59 countries. BMC public health. 2016;16(1):1–9. pmid:27475755
  16. 16. Dasgupta A. Systematic measurement error in self-reported health: is anchoring vignettes the way out? IZA Journal of Development and Migration. 2018;8(1):1–30.
  17. 17. Onur I, Velamuri M. The gap between self-reported and objective measures of disease status in India. PloS one. 2018;13(8):e0202786. pmid:30148894
  18. 18. Banerjee A, Deaton A, Duflo E. Health, health care, and economic development. 2004;94(2):326–330.
  19. 19. Vellakkal S, Subramanian SV, Millett C, Basu S, Stuckler D, Ebrahim S. Socioeconomic Inequalities in Non-Communicable Diseases Prevalence in India: Disparities between Self-Reported Diagnoses and Standardized Measure. PLoS ONE. 2013;8((7): e68219). pmid:23869213
  20. 20. Wu S, Wang R, Zhao Y, Ma X, Wu M, Yan X, et al. The relationship between self-rated health and objective health status: a population-based study. BMC public health. 2013;13(1):1–9. pmid:23570559
  21. 21. Cullati S, Mukhopadhyay S, Sieber S, Chakraborty A, Burton-Jeangros C. Is the single self-rated health item reliable in India?A construct validity study. BMJ global health. 2018;3(6):e000856. pmid:30483411
  22. 22. Idler EL, Benyamini Y. Self-Rated Health and Mortality: A Review of Twenty-Seven Community Studies. Journal of Health and Social Behavior. 1997;38(1):21–37. pmid:9097506
  23. 23. Miilunpalo S, Vuori I, Oja P, Pasanen M, Urponen H. Self-rated health status as a health measure: the predictive value of self-reported health status on the use of physician services and on mortality in the working-age population. Journal of clinical epidemiology. 1997;50(5):517–528. pmid:9180644
  24. 24. Benyamini Y, Idler EL. Community studies reporting association between self-rated health and mortality: additional studies, 1995 to 1998. Research on aging. 1999;21(3):392–401.
  25. 25. Heistaro S, Jousilahti P, Lahelma E, Vartiainen E, Puska P. Self rated health and mortality: a long term prospective study in eastern Finland. Journal of Epidemiology & Community Health. 2001;55(4):227–232. pmid:11238576
  26. 26. Subramanian S, Subramanyam MA, Selvaraj S, Kawachi I. Are self-reports of health and morbidities in developing countries misleading? Evidence from India. Social science & medicine. 2009;68(2):260–265. pmid:19019521
  27. 27. Góngora-Salazar P, Casabianca MS, Rodríguez-Lesmes P. Income inequality and self-rated health status in Colombia. International Journal for Equity in Health. 2022;21(1):1–14. pmid:35578287
  28. 28. Rajan K, Kennedy J, King L. Is wealthier always healthier in poor countries? The health implications of income, inequality, poverty, and literacy in India. Social Science & Medicine. 2013;88:98–107. pmid:23702215
  29. 29. Kondo N, Sembajwe G, Kawachi I, van Dam RM, Subramanian SV, Yamagata Z. Income inequality, mortality, and self rated health: meta-analysis of multilevel studies. BMJ. 2009;339(b4471). pmid:19903981
  30. 30. Weich S, Lewis G, Jenkins SP. Income inequality and self rated health in Britain. Journal of Epidemiology & Community Health. 2002;56(6):436–441. pmid:12011200
  31. 31. Deaton A. Household Surveys, Consumption, and the Measurement of Poverty. Economic Systems Research. 2003;15(2):135–159.
  32. 32. Kumar S, Pradhan MR. Self-rated health status and its correlates among the elderly in India. Journal of Public Health. 2019;27(3):291–299.
  33. 33. Case A, Deaton A. Health and Wealth among the Poor: India and South Africa Compared. The American Economic Review, Papers and Proceedings of the One Hundred Seventeenth Annual Meeting of the American Economic Association. 2005;95(2):229–233. pmid:29120590
  34. 34. Marmot M. The Influence Of Income On Health: Views Of An Epidemiologist. Health Affairs. 2002;21(2):31–46. pmid:11900185
  35. 35. Sengupta et al. Groundwater Arsenic Contamination in the Ganga-Padma-Meghna-Brahmaputra Plain of India and Bangladesh. Archives of Environmental Health: An International Journal 2003;58(11):701–702. pmid:15702895
  36. 36. Chakraborti et al. Groundwater Arsenic Contamination in the Ganga River Basin: A Future Health Danger. International Journal of Environmental Research and Public Health 2018;15(2):180. pmid:29360747
  37. 37. Ahamed et al. Arsenic groundwater contamination and its health effects in the state of Uttar Pradesh (UP) in upper and middle Ganga plain, India: a severe danger. Science of the Total Environment 2006;370(2-3):310–322. pmid:16899281
  38. 38. Chakrabarty S, Sarma HP. Heavy metal contamination of drinking water in Kamrup district, Assam, India. Environmental monitoring and assessment 2011;179(1-4):479–486. pmid:20976545
  39. 39. Mahanta R, Chowdhury J, Nath HK. Health costs of arsenic contamination of drinking water in Assam, India. Economic Analysis and Policy 2016;49:30–42.
  40. 40. David S, Roy N. Public health perspectives from the biggest human mass gathering on earth: Kumbh Mela, India. International Journal of Infectious Diseases. 2016;47:42–45. pmid:26827807
  41. 41. Sridhar S, Gautret P, Brouqui P. A comprehensive review of the Kumbh Mela: identifying risks for spread of infectious diseases. Clinical Microbiology and Infection. 2015;21(2):128–133. pmid:25682278
  42. 42. Deaton A. Global Patterns of Income and Health: Facts, Interpretations, and Policies. NBER Working Paper 12735. 2006.