Weight Bias Internalization Scale: Psychometric Properties and Population Norms

Objective: Internalizing the pervasive weight bias commonly directed towards individuals with overweight and obesity co-occurs with increased psychopathology and impaired quality of life. This study sought to establish population norms and psychometric properties of the most widely used self-report questionnaire, the Weight Bias Internalization Scale (WBIS), in a representative community sample. Design and Methods: In a survey of the German population, N = 1158 individuals with overweight and obesity were assessed with the WBIS and self-report measures for convergent validation. Results: Item analysis revealed favorable item-total correlations for all but one WBIS item. With this item removed, item homogeneity and internal consistency were excellent. The one-factor structure of the WBIS was confirmed using confirmatory factor analysis. Convergent validity was shown through significant associations with measures of depressive and somatoform symptoms. The WBIS contributed to the explanation of variance in depressive and somatoform symptoms over and above body mass index. Higher WBIS scores were found in women than in men, in individuals with obesity than in individuals with overweight, and in those with lower education or income than in those with higher education or income. Sex-specific norms were provided. Conclusions: The results showed good psychometric properties of the WBIS after removal of one item. Future research is warranted on further indicators of reliability and validity, for example, retest reliability, sensitivity to change, and prognostic validity.


Introduction
Weight bias includes pervasive negative stereotypes and prejudice regarding an individual's overweight, such as attributions of responsibility or incompetence, and can extend to actual discrimination in multiple domains of life [1,2]. Stigmatized individuals with overweight and obesity often have the tendency to internalize this weight bias, leading to feelings of incompetence, self-hate, or devaluation. Consequently, weight bias internalization has significant associations with depressive symptoms, anxiety, lower self-esteem, eating disorder psychopathology, social and behavioral problems, lower quality of life and health status, and greater health care utilization [3-8]. Weight bias internalization has been shown to have greater explanatory power of psychopathology over and above stigmatizing attitudes, experiences of discrimination, and body mass index (BMI, kg/m²) [9-12].
Despite the psychopathological relevance of weight bias internalization, only two self-report questionnaires are available for assessment [5,9]. The most commonly used instrument, the Weight Bias Internalization Scale (WBIS) [9], measures the degree to which a respondent believes that negative stereotypes and self-statements about persons with overweight and obesity apply to herself or himself (11 items, e.g., "I hate myself for being overweight"; 1 = strongly disagree to 7 = strongly agree). Psychometric analyses in adult samples documented good internal consistency of the total mean score (.71 ≤ Cronbach's α ≤ .94), corrected item-total correlations in the middle to upper range, a unidimensional factor structure, and convergent validity in the explanation of psychopathology as described above [8-12]. A re-analysis of the measure in adolescents seeking surgical treatment for obesity suggested the elimination of one item to yield unidimensionality, and Cronbach's α and corrected item-total correlations were slightly enhanced [4]. Overall, the WBIS was developed and evaluated in non-representative Internet-based community samples [9,10,12] and in smaller-sized clinical samples (n < 200) [3,4,8]. Thus, population norms and additional psychometric properties (including factorial, convergent, and discriminant validity, and item statistics) need to be established in a representative sample. This study addressed these aspects for the German version of the WBIS.

Recruitment and Sample
In June and July 2012, a representative sample of the German population was recruited with the assistance of an independent agency specializing in market, opinion, and social research (USUMA; Berlin, Germany). In a three-stage random sampling procedure, sample point regions were selected from 320 regions, based on representative data; target households within these sample point regions were determined using a random route procedure; and target persons within these target households were selected using a Kish selection grid. Inclusion criteria were age ≥ 14 years and fluent German.
Following this procedure, 4436 target households were randomly selected from all German states. Of these, N = 2515 individuals participated in the assessment, corresponding to a response rate of 56.7% (573 [...]). All participants were visited in person. A maximum of four attempts were made to contact a target person. Participants were informed about the study by a trained research assistant and told that the purpose of the study was to investigate health and general behavior. Participants provided their oral informed consent prior to assessment (for minor participants, oral informed consent was also obtained from one parent). Oral consent is common in survey research in Germany. The ethical guidelines of the International Code of Marketing and Social Research Practise by the International Chamber of Commerce and the European Society for Opinion and Marketing Research were followed. The Ethics Committee of the University of Leipzig approved the methodological concept for the conduct of this study, including the consent procedure according to which participants were included and assessed only if they had given their verbal consent (Approval No. 092-12-05032012). During the self-report assessment, the research assistant was present in order to assist with completion if he or she was asked for help. In accordance with this procedure, social desirability effects were limited, but cannot be excluded.
Body mass index (BMI, kg/m²) was calculated from self-reported weight and height. As the WBIS was originally designed to measure weight bias internalization, only overweight and obese participants with BMI ≥ 25.0 kg/m² were selected for the analyses (N = 1164 [46.9%]). Additionally, participants with incomplete data on the WBIS (i.e., one or more missing items) were excluded from the analyses. Thus, N = 1092 participants were retained for the final study sample.
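The selection step can be illustrated with a minimal sketch. The BMI formula and the 25.0 kg/m² inclusion cutoff follow the text; the 30.0 kg/m² threshold separating overweight from obesity is the conventional WHO cutoff and is an assumption here, as are the function names.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2 from (self-reported) weight and height."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def weight_status(bmi_value: float) -> str:
    """Classify a BMI value for inclusion in the analyses."""
    # 30.0 is the conventional WHO obesity cutoff (assumed, not stated in the text).
    if bmi_value >= 30.0:
        return "obesity"
    # 25.0 is the inclusion criterion given in the text.
    if bmi_value >= 25.0:
        return "overweight"
    return "excluded"
```

Under these assumptions, a participant reporting 80 kg and 1.75 m would be classified as overweight and retained.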
Sample characteristics are shown in

Measures
Weight Bias Internalization Scale (WBIS). The English version of the WBIS was translated into German and controlled by a back-translation procedure through a licensed translator (for psychometric properties, see Introduction).
Beck Depression Inventory for Primary Care (BDI-PC). The BDI-PC [13] is a widely used screening questionnaire for major depression that consists of seven items (e.g., symptoms of sadness, pessimism, and loss of pleasure). The items are rated on four-point scales, each of which consists of four different statements that reflect varying degrees of depressive symptom severity. A total sum score is computed with higher scores indicating greater severity of depressive symptoms. The scale shows good internal consistency (Cronbach's α = .86) and convergent validity. It was hypothesized that WBIS scores would be positively correlated with greater symptom severity.
Somatic Symptom Scale-8 (SSS-8). The SSS-8 [14] is the short form of the PHQ-15 [15] and was used to assess the severity of somatic symptoms such as stomachaches or headaches. The eight items are scored from 1 = not bothered at all to 5 = bothered very strongly, and a total sum score is computed, with higher scores indicating greater somatic symptom severity. The SSS-8 shows good internal consistency (Cronbach's α = .81) and convergent validity. A positive correlation between WBIS scores and greater severity of somatic symptoms was expected.

Data Analytic Plan
Primary analyses. For psychometric analyses, item distributions were tested for normality using the Shapiro-Wilk normality test. Pearson's r was calculated for corrected item-total correlations and the average inter-item correlation. Item difficulty was estimated as p_m = sum of item scores / (N × maximal item score). Cronbach's α was computed as a measure of internal consistency. All psychometric analyses were performed for men and women separately.
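The item statistics named above can be sketched as follows. This is an illustrative NumPy implementation, not the SPSS procedure used in the study; the function names are hypothetical, and `items` is assumed to be a respondents × items array of complete responses.

```python
import numpy as np


def item_difficulty(items, max_score):
    """p_m = sum of item scores / (N * maximal item score), one value per item."""
    items = np.asarray(items, dtype=float)
    return items.sum(axis=0) / (items.shape[0] * max_score)


def corrected_item_total(items):
    """Pearson r between each item and the sum of the remaining items."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])


def cronbach_alpha(items):
    """Cronbach's alpha from item variances and the variance of the sum score."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var_sum / total_var)


def mean_inter_item_r(items):
    """Average of the off-diagonal Pearson correlations between items."""
    r = np.corrcoef(np.asarray(items, dtype=float), rowvar=False)
    return r[np.triu_indices(r.shape[0], k=1)].mean()
```

Removing the item itself from the total before correlating is what makes the item-total correlation "corrected"; otherwise each item would be correlated partly with itself.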
Secondary analyses. Distributions of WBIS mean scores were analyzed using univariate General Linear Model analyses including Sex × Age × Weight status, with an additional inclusion of the factors Education, Household income, Marital status, or Nationality, respectively, in separate steps (for categories of sociodemographic variables see Table 1). Univariate and post hoc test results were only interpreted when significant higher-order effects were found. To estimate effect sizes, partial η² was reported when appropriate and interpreted according to Cohen [16] (η²: small: .01, medium: .06, large: .14). Based on the results of the Sex × Age × Weight status analysis, percentiles were determined for the total mean score of the WBIS.
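For reference, the partial eta squared of an effect is conventionally defined from the effect and error sums of squares. This is the standard textbook definition, given here as an aid and not quoted from the study:

```latex
\eta_p^2 = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}
```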
Confirmatory factor analyses (CFAs), using maximum likelihood estimation, were conducted to evaluate the hypothesized one-factorial structure of the WBIS [9]. First, a CFA was conducted for the final sample with BMI ≥ 25.0 kg/m². The data were examined for normality using the Mardia test. In the case of multivariate non-normality, the Bollen-Stine bootstrap method was utilized [17]. The adequacy of fit was assessed using the following statistics: χ² test, Tucker-Lewis Index (TLI), Comparative Fit Index (CFI), Root Mean Square Error of Approximation (RMSEA), and Standardized Root Mean Square Residual (SRMR).
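Three of these indices can be derived from the χ² statistics of the fitted model and of the baseline (independence) model. The sketch below uses the commonly cited formulas; it is not the AMOS computation itself, the function names are hypothetical, and some programs use N instead of N − 1 in the RMSEA denominator.

```python
import math


def rmsea(chi2, df, n):
    """Root Mean Square Error of Approximation (N - 1 variant)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative Fit Index from model and baseline noncentrality."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    return 1.0 if denom == 0 else 1.0 - d_model / denom


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis Index from the chi-square/df ratios."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)
```

Because RMSEA divides the excess χ² by the sample size, it is less dominated by N than the raw χ² test, which is one reason several indices are reported side by side.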
To test the WBIS's convergent validity, Spearman's rank correlation coefficients were calculated between WBIS scores and measures for convergent validation. Hierarchical multiple regression analyses were run to assess the impact of sex, age, BMI (block 1), and WBIS mean score (block 2) on depression and severity of somatic symptoms, respectively. In an additional step, an interaction term between sex × WBIS scores was included in both regression analyses. Effect size of prediction was evaluated according to Cohen [16] (R²: small: .01, medium: .09, large: .25).
A two-tailed α < .05 was applied for all statistical tests. Statistical analyses were performed using IBM SPSS Statistics and AMOS version 20.0. Data are available upon request.
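The logic of a two-block hierarchical regression can be sketched with ordinary least squares: fit block 1 alone, then blocks 1 and 2 together, and compare the explained variance. This is an illustrative NumPy sketch with hypothetical function names, not the SPSS procedure used in the study.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an OLS fit of y on X (an intercept is added automatically)."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


def hierarchical_delta_r2(block1, block2, y):
    """Return (R^2 step 1, R^2 step 2, increment) for two nested blocks."""
    r2_step1 = r_squared(block1, y)
    r2_step2 = r_squared(np.column_stack([block1, block2]), y)
    return r2_step1, r2_step2, r2_step2 - r2_step1
```

Here `block1` would hold sex, age, and BMI as columns and `block2` the WBIS mean score; the increment in R² is the variance explained by the WBIS over and above block 1.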

Primary Analyses
Item analysis. The percentage of missing item responses was low (M = 4.83%, SD = 0.50). Item distributions deviated significantly from normality (all p < .01). All items were positively skewed, except item 4 ("I wish I could drastically change my weight"), which was negatively skewed for women (Table 2). Most items had a low kurtosis. The difficulty indices were of medium size (.27 ≤ p_m ≤ .53). Corrected item-total correlations were in the middle to upper range (.47 ≤ r_it ≤ .78), except for item 1 ("As an overweight person, I feel that I am just as competent as anyone," reverse scored). This item showed a negative correlation (r_it = −.04) and was therefore removed from the WBIS, in accordance with Roberto et al. [4]. Thus, 10 items were retained as the final scale and used in all secondary analyses. Item homogeneity was optimal (mean inter-item correlation: r = .50; for the original 11-item scale: r = .40). Item statistics of the final scale are displayed in Table 2.
Reliability. The WBIS mean score showed excellent internal consistency (Cronbach's α = .91; for the original 11-item scale, Cronbach's α = .87). Overall, excluding item 1 led to an improvement of item homogeneity and reliability.

Secondary Analyses
Distributions of WBIS mean scores. A univariate analysis of WBIS mean scores by sex, age, and weight status showed significant main effects for sex and weight status with small effect sizes. Based on these findings, sex-specific percentile ranks were calculated and are displayed in Table 3.
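A sex-specific percentile rank can be sketched as the share of the same-sex norm sample scoring at or below a given WBIS mean score. The "at or below" convention is an assumption (the text does not state which definition was used), and the function names are hypothetical.

```python
import numpy as np


def percentile_rank(norm_scores, score):
    """Percentage of the norm sample scoring at or below `score`."""
    norm_scores = np.asarray(norm_scores, dtype=float)
    return 100.0 * np.mean(norm_scores <= score)


def sex_specific_percentile_rank(scores, sexes, score, sex):
    """Percentile rank of `score` within the norm subsample of the given sex."""
    scores = np.asarray(scores, dtype=float)
    sexes = np.asarray(sexes)
    return percentile_rank(scores[sexes == sex], score)
```

Because women scored higher than men on average, the same raw score maps to a lower percentile rank in the female norm group than in the male one.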
In further univariate analyses, additional inclusion of single sociodemographic variables showed significant main effects for education and income with small effect sizes.
Factorial validity. Results of the CFAs for a one-factor model are provided in Table 4. For the first CFA (N = 1092), the one-factor model did not provide a good fit to the data as indicated by the significant χ² test statistic. However, the χ² statistic is sensitive to sample size, so it is unclear whether this statistical significance was because of poor model fit or large sample size.
Regarding the indices of model fit, CFI and SRMR indicated a good model fit, while the other indices were slightly lower or higher than recommended, but still in an acceptable range [17]. Factor loadings were medium to high for all items (range .47-.85).
Prediction of depression and somatic symptom severity. In order to determine whether and to what extent internalized weight bias predicted depression and somatic symptom severity, the BDI-PC and SSS-8 total scores were each regressed on sex, age, BMI, and WBIS mean scores. Regarding both depression and somatic symptom severity, sex, age, and BMI explained a significant amount of variance in Step 1, while internalized weight bias significantly contributed to the explained variance in Step 2, both with small-to-medium effect sizes (see Tables 5 and 6). Regarding gender differences in WBIS scores, in both regression analyses the additional inclusion of an interaction term between gender and WBIS scores yielded no further increase of explained variance (data available upon request).

Discussion
This study provides the first comprehensive analysis of the WBIS psychometric properties in a representative population sample. The results confirmed good psychometric properties of the WBIS after removing one item and extended the evidence regarding item statistics, and discriminant and convergent validity.
In addition, sex-specific population norms were provided, allowing rapid classifications of individual WBIS scores using population percentiles. Item analysis showed that the WBIS leads to a low number of missing data (< 5%). As the participants with overweight and obesity mostly endorsed item values indicating low to moderate weight bias internalization, all items deviated from normality and showed mostly flat distributions (low kurtosis) with a long tail to the right (positive skew). Relatedly, item difficulties were of medium size. Item-total correlations were favorable, with the exception of item 1 ("As an overweight person, I feel that I am just as competent as anyone"; reverse scored), which yielded an insufficient, negative score. Because of this negative score, item 1 was removed from the WBIS, consistent with Roberto et al.'s results in adolescents with obesity [4]. For the 10-item WBIS, corrected item-total correlations and item homogeneity were good and improved when compared to the 11-item WBIS. Internal consistency of the 10-item WBIS was excellent, which is consistent with previous literature [4].
Regarding validity, the CFA of the shortened WBIS yielded a one-factorial structure as postulated, and extended Roberto et al.'s results in adolescents with obesity [4] to adults with overweight and obesity. Regarding discriminant validity, weight bias internalization was greater in women than in men, and in women of middle age than in those of higher age, while for men, no age differences were found. These results are inconsistent with previous research in a small sample suggesting an absence of sex or age effects [4], but consistent with a current study using a modified version of the WBIS [12]. In addition, there is evidence that women are more often exposed to diverse forms of weight-related stigmatization and discrimination than men [20,21] so that they may present with greater weight bias internalization. The difference between middle-aged versus higher-aged women could be related to increased obesity rates and pronounced body dissatisfaction in this lower age group, while at higher ages, body dissatisfaction tends to decrease [22,23]. Weight bias internalization of young women who are at risk of a negative body image [23] may not have been higher than that of older women because of lower obesity rates at younger ages [24].
Further, individuals with obesity, identified on the basis of their self-reported weight and height, showed greater weight bias internalization than individuals with overweight. Previous research in small samples had mostly provided inconsistent associations with BMI [3,9,11], while two studies using another questionnaire or a modified WBIS version to assess weight bias internalization documented similar variations by weight status [5,12]. As a novel aspect, we studied associations of weight bias internalization with education and household income. Individuals with lower education or lower income showed greater weight bias internalization than those with higher education or higher income. A low socioeconomic position including low education has significant associations with obesity in Western industrialized countries [25,26], and greater weight bias internalization is plausible in individuals with obesity who are pervasively exposed to weight bias. In contrast, income has previously shown more inconsistent associations with measures of obesity than education, presumably because income is less clearly associated with behaviors affecting energy balance (e.g., physical activity) [24]. Both low education and low income are also related to less positive self-beliefs [27], likely contributing to greater weight bias internalization [7]. No further variation was found by marital status, residence, or nationality.
Regarding convergent validity, the WBIS was associated with greater depressive symptoms, as expected [4,11]. In addition, for the first time the WBIS was shown to be associated with greater somatoform symptoms. Taken together with prior results demonstrating a link with a range of mental disturbances, weight bias internalization appears to be a common factor that may increase vulnerability to psychopathology in overweight and obesity. In addition, weight bias internalization independently contributed to depression and somatoform symptoms over and above BMI, in accordance with previous literature [8,9]. Longitudinal studies are needed in order to clarify causal associations with psychopathology and weight management. Thus far, one small-scale prospective clinical study suggested that weight bias internalization does not predict weight loss [3], and may therefore not represent a barrier to successful weight management.
Table 4. Goodness-of-fit indices for a one-factor model of the Weight Bias Internalization Scale in two samples.
Table 5. Prediction of depression by sex, age, and body mass index (Step 1) and Weight Bias Internalization Scale mean score (Step 2).
Table 6. Prediction of somatic symptom severity by sex, age and body mass index (Step 1) and Weight Bias Internalization Scale mean score (Step 2).
A strength of this study is the large sample, drawn from a survey representative of the German population regarding age and sex [28]. Presumably because fluent German was an inclusion criterion, participants with a nationality other than German were, however, underrepresented. More research is desirable on self-stigma in migrant groups. A further limitation is that the definition of overweight and obesity was based on self-reported height and weight, commonly leading to an underestimation of these aspects, and thus to an underestimation of prevalence rates of obesity [29]. Because of the self-reported nature of body weight and height, norms were not given for overweight and obesity separately. Of note, because the WBIS addresses weight bias in the overweight spectrum, persons of normal weight or underweight cannot answer most WBIS items and were therefore not included in this report. Future research should consider reformulating the WBIS items so that they apply to all weight groups, enabling comparisons among them [12].
Overall, this study established good psychometric properties of the shortened WBIS. The provision of norms is essential to identify individuals at increased risk of psychopathology and in need of interventions to reduce weight bias internalization [30,31]. Future research is warranted on additional indicators of reliability and validity, for example, retest reliability, sensitivity to change, and prognostic validity.