Reporting biases in self-assessed physical and cognitive health status of older Europeans

This paper explores which demographic characteristics substantially bias self-reported physical and cognitive health status of older Europeans. The analysis utilises micro-data for 19 European countries from the Survey of Health, Ageing and Retirement in Europe to compare performance-tested outcomes of mobility and memory with their self-reported equivalents. Relative importance analysis based on multinomial logistic regressions shows that the bias in self-reported health is mostly due to reporting heterogeneities between countries and age groups, whereas gender contributes little to the discrepancy. Concordance of mobility and cognition measures is highly related; however, differences in reporting behaviour due to education and cultural background have a larger impact on self-assessed memory than on self-assessed mobility. Southern as well as Central and Eastern Europeans are much more likely to misreport their physical and cognitive abilities than Northern and Western Europeans. Overall, our results suggest that comparisons of self-reported health between countries and age groups are prone to significant biases, whereas comparisons between genders are credible for most European countries. These findings are crucial given that self-assessed data are often the only information available to researchers and policymakers when asking health-related questions.

displays the regression results for Models 1 and 2 when using the new specification of cognitive impairment. The magnitude of the coefficients changes, yet the findings remain the same as in the main analysis. The patterns of age effects and of differences between countries are almost identical to the main findings.
The only difference is that the level of overestimating is lower and the level of underestimating is higher with the new specification. In conclusion, the threshold of impairment impacts the level of overestimating and underestimating, but not the overall trends in concordance between tested and self-reported cognition.
In our main analysis, objective cognition was based on immediate word recall. However, the self-assessment of memory might also refer to delayed word recall. Thus, we also provide an additional analysis of objective cognitive impairment based on delayed word recall. During the interview, survey participants are first asked to repeat a list of ten words, which is the basis for the immediate word recall measure. Following that, the participants perform some additional tests, for example on numeracy. After these additional tests, which take approximately 5 minutes, the interviewer asks "A little while ago, I read you a list of words and you repeated the ones you could remember. Please tell me any of the words that you can remember now?", which is the basis for the delayed word recall measure. While survey participants recall on average 5.2 words immediately, they only recall 3.9 words in the delayed test. As a consequence, concordance is lower when objective cognition is based on delayed word recall, because, by construction, more individuals overestimate their cognition when the new definition is applied. Table E presents regression results for the case in which objective cognition is based on delayed word recall. While the trend in age is similar to that of immediate word recall, the decrease in concordance with age appears less steep. Furthermore, differences between educational attainment groups are smaller when the new specification is applied. In contrast, the difference between the genders increases. In line with these findings, the results based on the relative importance analysis show that age and education appear slightly less important in explaining the variance in response behaviour, whereas gender appears more relevant.
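The mechanism described above, in which a stricter objective measure mechanically raises the share of overestimators, can be illustrated with a minimal sketch. The function name, the binary self-report flag and the cutoff of 4 words are hypothetical and chosen only for illustration; the thresholds used in the analysis are not restated in this section.

```python
def classify_concordance(words_recalled, reports_poor_memory, cutoff):
    """Compare tested and self-reported memory for one respondent.

    words_recalled: number of words recalled in the test (0 to 10).
    reports_poor_memory: True if the respondent rates their memory as poor.
    cutoff: hypothetical threshold; fewer words than this counts as impaired.
    """
    tested_impaired = words_recalled < cutoff
    if tested_impaired == reports_poor_memory:
        return "concordance"
    if tested_impaired and not reports_poor_memory:
        # Tested as impaired, but self-report indicates no problem.
        return "overestimate"
    # Tested as unimpaired, but self-report indicates a problem.
    return "underestimate"


# Same respondent, same self-report, two objective measures.
immediate = classify_concordance(5, reports_poor_memory=False, cutoff=4)
delayed = classify_concordance(3, reports_poor_memory=False, cutoff=4)
```

With an unchanged cutoff, a respondent who recalls fewer words in the delayed test can move from concordance to overestimation even though the self-report is identical.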

Additional sample compositions
We also analyse whether the results are sensitive to different sample compositions. For example, frail individuals might be more likely to live in institutions in some countries than in others and are consequently not always covered by our target population of non-institutionalised individuals. This could be relevant for the results since the survey respondent's overall level of health might affect concordance, especially when they suffer from very poor health. Thus, we exclude frail individuals from the sample and analyse whether they influence the outcomes. To measure frailty, we rely on a well-established indicator introduced by [1], for which individuals are considered frail if they show three or more of the following components: exhaustion, weakness, slowness, shrinking and low activity levels. We follow exactly the operationalisation by [2], who adapted the indicator for SHARE data. According to the frailty measure, 8% of the survey participants are considered frail in our mobility sample (Waves 2 and 5), and 9% in our cognition sample (Waves 4 and 5). Consequently, 6,335 observations are dropped for the robustness analysis of mobility, and 9,996 observations for cognition.
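The frailty rule described above (three or more of five components) can be sketched as follows. This is only an illustration of the counting rule: the component keys, the record layout and the treatment of missing components as absent are assumptions, and the sketch does not reproduce how [2] derive each component from SHARE items.

```python
FRAILTY_COMPONENTS = (
    "exhaustion", "weakness", "slowness", "shrinking", "low_activity",
)


def is_frail(indicators, threshold=3):
    """Return True if at least `threshold` frailty components are present.

    indicators: dict mapping component name to True/False.
    Missing components are counted as absent (an assumption of this sketch).
    """
    count = sum(bool(indicators.get(name, False)) for name in FRAILTY_COMPONENTS)
    return count >= threshold


def drop_frail(respondents):
    """Keep only non-frail respondents (hypothetical record layout)."""
    return [r for r in respondents if not is_frail(r["frailty"])]


sample = [
    {"id": 1, "frailty": {"exhaustion": True, "weakness": True, "slowness": True}},
    {"id": 2, "frailty": {"exhaustion": True, "shrinking": True}},
    {"id": 3, "frailty": {}},
]
robust_sample = drop_frail(sample)
```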
The results for mobility are presented in Table F. Country coefficients change marginally in magnitude when frail individuals are excluded, while all other coefficients remain almost identical. Similarly, results based on relative importance analysis hardly change when frail survey participants are dropped. In the model with (without) frail individuals, country differences contribute 35% (39%) to the explained variance, age differences contribute 29% (32%), education differences contribute 17% (15%), gender differences contribute 11% (11%) and time effects contribute 5% (6%). Thus, the only difference is that age and education contribute marginally less to the explained variance in concordance, which appears plausible since frailty is highly correlated with age and education. Consequently, all other determinants explain relatively more of the variation once frailty is accounted for. The results for cognition hardly change when frail individuals are dropped from the sample (Table G). Country coefficients change slightly in magnitude, but not in sign. All other coefficients are virtually identical to those of the main regression analysis. Similarly, results based on relative importance analysis remain unaffected. In summary, the results appear robust to different compositions of frail individuals and their reporting behaviour.
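The percentage contributions reported above can be read as shares of a model fit statistic attributed to each group of predictors. The sketch below uses a Shapley-style averaging of marginal gains over all orderings of predictor groups; whether this matches the exact relative importance procedure used in the analysis is an assumption. The `fit` callable is hypothetical: in practice it would re-estimate the multinomial logistic regression on the given subset of predictor groups and return a fit statistic, whereas here it is a toy additive function with made-up values.

```python
from itertools import combinations
from math import factorial


def shapley_shares(groups, fit):
    """Share of fit(all groups) attributed to each predictor group.

    groups: list of group names.
    fit: callable taking a frozenset of group names and returning a
         fit statistic (hypothetical; must return 0 for the empty set).
    """
    n = len(groups)
    contributions = {}
    for g in groups:
        others = [x for x in groups if x != g]
        total = 0.0
        for k in range(n):
            # Weight of subsets of size k in the Shapley average.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (fit(s | {g}) - fit(s))
        contributions[g] = total
    full = fit(frozenset(groups))
    return {g: c / full for g, c in contributions.items()}


# Toy example with made-up, purely additive fit contributions.
toy_gain = {"country": 0.04, "age": 0.03, "education": 0.02, "gender": 0.01}
shares = shapley_shares(list(toy_gain), lambda s: sum(toy_gain[g] for g in s))
```

By construction the shares sum to one, which is why adding or removing a predictor group (such as the wave dummy) changes the percentages of all remaining groups.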
In the main analysis, we describe differences in reporting behaviour between physical and cognitive impairment. Physical impairment is taken from Wave 2 and Wave 5, cognitive impairment from Wave 4 and Wave 5. Since the results for the two health dimensions are not based on the same sample, these differences could stem from differences in the sample rather than differences in reporting behaviour. Thus, we run additional analyses based on Wave 5 only, in which information on concordance of physical as well as cognitive health care measures is provided, i.e. we can estimate the relationship between demographic characteristics and the probability of overestimating or underestimating physical and cognitive health based on the exact same group of individuals. The regression results are provided in Table H and Table I. Since wave dummies are not needed for this specification, they are excluded from the model. Although some of the coefficients change slightly in magnitude and significance, the main results appear robust. Results from the relative importance analysis cannot be directly compared with the main model, since the wave dummy is now missing. In Wave 5, the explained variation in concordance of mobility measures can be decomposed as follows: country differences 29%, age differences 43%, educational differences 19%, gender differences 10%; thus, the main difference from the estimations based on both waves is that age appears more relevant now than when both waves are combined. The variation in concordance of cognition measures can be decomposed as follows: country differences 50%, age differences 27%, educational differences 21% and gender 1%. Thus, the results are very similar to the main computations.
Results for each country individually can be found in Figs A and B.

Additional model specifications
In addition to demographic characteristics, other factors might have an impact on concordance and/or further explain the effect of demographic characteristics on reporting behaviour. In particular, we analyse whether the results change after we account for employment status, marital status and whether a person has children (Tables J-O). Furthermore, Tables P to S provide regression results including learning effects and an interaction term between gender and education.
Whether an individual works or not is likely to influence health perception. First, persons working regularly might be more aware of their mobility impairments. Further, during their working tasks they might face limitations of their memory abilities, which might be particularly relevant for individuals working in analytical jobs. Since age is highly correlated with an individual's employment status, parts of the strong effect of age on concordance might be explained by younger survey participants that are still in employment.
Furthermore, employment might be an important mediator for the effect of educational attainment on concordance between measures of cognitive health, since highly educated individuals are more likely to work in jobs that demand strong cognitive skills. To test the employment channel, we add a dummy variable to the models that indicates if an individual is employed, as opposed to retired, unemployed, permanently sick or a homemaker.
In the mobility sample, 27% of the survey participants are employed and in the cognition sample, it is 26%.
In both samples, employment has a strong negative correlation with age and a strong positive correlation with educational attainment. Furthermore, summary statistics show that employed individuals are more likely to achieve concordance. Tables J and K present regression results for mobility and cognition respectively. As expected, employed individuals are less likely to overestimate or underestimate their physical and cognitive health. Furthermore, the age gradient in concordance appears less pronounced. In addition, the education gradient in concordance appears less pronounced for mobility once employment is accounted for but does not change for cognition.
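The employment dummy described above can be sketched as a simple recoding. The status labels are hypothetical stand-ins for the survey's job-situation categories, and raising an error for unknown labels is a design choice of this sketch rather than something stated in the text.

```python
# Categories treated as "not employed" in the text above.
NOT_EMPLOYED = {"retired", "unemployed", "permanently sick", "homemaker"}


def employed_dummy(status):
    """Return 1 for employed respondents and 0 for all listed alternatives."""
    label = status.strip().lower()
    if label == "employed":
        return 1
    if label in NOT_EMPLOYED:
        return 0
    # Unknown labels are rejected instead of being silently coded as 0.
    raise ValueError(f"unexpected employment status: {status!r}")


statuses = ["Employed", "retired", "homemaker", "employed"]
dummies = [employed_dummy(s) for s in statuses]
employed_share = sum(dummies) / len(dummies)
```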
In addition to employment, having children or being in a relationship might influence health perception, for example if family members comment on the survey participant's health status or if the health of other family members serves as a reference point. Thus, we provide results for two more models, in which we control for whether the survey participant has children (Tables L and M) and for whether the survey participant is married or in a registered partnership (Tables N and O). The coefficients for children and marriage either have the expected sign or are insignificant. What is more relevant for the work at hand, however, is that the inclusion of these variables has almost no impact on any of the other coefficients.
Relative importance analysis confirms that the employment channel explains part of the strong age effect, at least for reporting behaviour related to mobility. When employment status, marital status and a dummy for children are added to the model for mobility, country differences still contribute 32% to the explained variation, but age differences drop to 20%, probably because differences in employment status explain 17%. Likely for the same reason, the contribution of educational differences decreases slightly to 13%. Gender remains at 9% and wave at 4%. Being married (3%) and having children (1%) explain only little of the variation. Similar results are found for cognition, although employment seems relatively less important in explaining concordance. Country differences contribute 44% to the explained variation, age differences 22%, differences in employment status 11%, educational differences 20%, gender 2% and wave less than one per cent. Again, the contribution of having children and being married is negligible.
Including additional mediators in the model identified potential pathways, but more detailed analyses are required to draw concrete conclusions. For instance, the effect of labour market participation should be investigated more thoroughly considering factors such as the number of working hours, part-time retirement and type of occupation; however, this goes beyond the scope of this study.
The dependent variable is a three-category variable that indicates if an individual achieved concordance (reference category), overestimated or underestimated his or her health. Coefficients are given in log odds, standard errors are clustered at the individual level, *p<0.05, **p<0.01, ***p<0.001
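For readers interpreting the coefficients in the tables, the standard multinomial logit form with concordance as the reference category can be written as below. The notation ($Y_i$ for the three-category outcome, $x_i$ for the covariate vector, $\beta_j$ for the outcome-specific coefficients) is introduced here for illustration and is an assumption, not taken from the text.

```latex
\ln \frac{P(Y_i = j \mid x_i)}{P(Y_i = \text{concordance} \mid x_i)} = x_i' \beta_j ,
\qquad j \in \{\text{overestimation}, \text{underestimation}\}
```

Under this form, a coefficient is the change in the log odds of outcome $j$ relative to concordance for a one-unit change in the covariate, and $\exp(\beta_{jk})$ gives the corresponding relative risk ratio.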