Psychometric Evaluation of the HIV Stigma Scale in a Swedish Context

Background HIV-related stigma has negative consequences for infected people's lives and is a barrier to HIV prevention. Therefore valid and reliable instruments to measure stigma are needed to enable mapping of HIV stigma. This study aimed to evaluate the psychometric properties of the HIV stigma scale in a Swedish context with regard to construct validity, data quality, and reliability. Methods The HIV stigma scale, developed by Berger, Ferrans, and Lashley (2001), was distributed to a cross-sectional sample of people living with HIV in Sweden (n = 194). The psychometric evaluation included exploratory factor analysis together with an analysis of the distribution of scores, convergent validity by correlations between the HIV stigma scale and measures of emotional well-being, and an analysis of missing items and floor and ceiling effects. Reliability was assessed using Cronbach's α. Results The exploratory factor analysis suggested a four-factor solution, similar to the original scale, with the dimensions personalised stigma, disclosure concerns, negative self-image, and concerns with public attitudes. One item had unacceptably low loadings and was excluded. Correlations between stigma dimensions and emotional well-being were all in the expected direction and ranged between −0.494 and −0.210. The instrument generated data of acceptable quality except for participants who had not disclosed their HIV status to anybody. In line with the original scale, all subscales demonstrated acceptable internal consistency with Cronbach's α 0.87–0.96. Conclusion A 39-item version of the HIV stigma scale used in a Swedish context showed satisfactory construct validity and reliability. Response alternatives are suggested to be slightly revised for items assuming the disclosure of diagnosis to another person. We recommend that people that have not disclosed should skip all questions belonging to the dimension personalised stigma. 
Our analysis confirmed construct validity of the instrument even without this dimension.

Background
About 35.3 million people were living with HIV worldwide in 2012, according to UNAIDS [1]. Currently, there are 6,469 known people living with HIV in Sweden (63% male, 37% female) [2]. Since the introduction of combined antiretroviral treatment (cART), HIV has changed from being a potentially deadly disease to a chronic disease in countries where cART is generally available to those in need. In Sweden, cART is available free of cost, and 93% of the population diagnosed with HIV is undergoing treatment. Patients without treatment often have a CD4+ T-cell count above 500 × 10^6 cells/ml and are therefore currently not considered in need of treatment [2].
HIV-related stigma has been a barrier to HIV prevention since the beginning of the pandemic and has been shown to have negative effects on care and treatment, i.e. lower rates of HIV testing and lower adherence to medication [3,4,5]. Stigma, as defined by Goffman, appears when an attribute becomes deeply discrediting within certain relations and contexts [6]. HIV stigma is considered a social phenomenon, grounded on the labelling and stereotyping of people living with HIV, leading to loss of status and discrimination [7]. HIV stigma experienced by people living with HIV can be enacted, anticipated or internalised. Enacted stigma involves experiences of discrimination, stereotyping and or prejudice from others due to one's HIV infection. Anticipated stigma includes expectations of enacted stigma. Internalised stigma refers to a situation when stereotyping and or prejudice involving negative feelings and beliefs about people living with HIV have been internalised by people living with HIV [8].
Valid and reliable instruments for measuring stigma are needed to map HIV stigma in affected populations, both as a basis for developing interventions against stigma and to evaluate the effects of stigma-reducing interventions [9]. According to Earnshaw and Chaudoir [8], it is important that such an instrument can differentiate between stigma mechanisms to identify the mechanism(s) that should be targeted in a potential intervention. Several instruments are designed to measure HIV-related stigma (see e.g. [8] for an extensive review), for example the HIV stigma scale by Sowell et al [10], the Internalized stigma scale by Sayles et al [11], the Measures of stigma and social impact of disease by Fife and Wright [12], the Enacted, vicarious, felt normative and internalized HIV stigma scales by Steward et al [13], the Stigma mechanisms of the HIV stigma framework by Earnshaw et al [14] and the HIV stigma scale by Berger et al [15]. However, only the HIV stigma scale designed by Berger et al [15] both differentiates between the three stigma mechanisms proposed by Earnshaw and Chaudoir [8] in one single instrument and produces an overall HIV stigma score in addition to the different stigma dimension scores. It has been used to measure stigma in various populations including African American women [16], men who have sex with men (MSM) [17] and adults 50 years and older [18] in the US and in men and women in Kenya and Puerto Rico [19]. Short versions of the scale have been developed in English and Swedish to measure HIV-related stigma among children and adolescents [20,21]. As no instruments for the measurement of HIV-related stigma among adults are available for a Swedish context, this study set out to evaluate the psychometric properties of the HIV stigma scale in a Swedish context with regard to construct validity, data quality and reliability.

The HIV Stigma Scale
The HIV stigma scale consists of 40 items that form four subscales and an overall scale [15]. The development of the original HIV stigma scale was based on an extensive literature review regarding HIV-related stigma and psychosocial aspects of living with HIV as well as the involvement of experts and HIV-related organisations across the United States. Exploratory factor analyses of the original English version resulted in four factors representing four dimensions of stigma: (1) personalised stigma, (2) disclosure concerns, (3) negative self-image and (4) concerns with public attitudes, each composing a subscale of the instrument. The personalised stigma dimension is proposed to represent the enacted stigma mechanism, concerns with public attitudes and disclosure concerns are proposed to represent the anticipated stigma mechanism and negative self-image is proposed to represent the internalised stigma mechanism [8]. The 40 items are statements that a person living with HIV can agree or disagree with on four-point Likert-type response alternatives (completely disagree, disagree, agree and completely agree). Seventeen of the items in the instrument are statements that include an assumption of the disclosure of one's HIV status, at least to some extent; in a written instruction prior to this section, the participant is asked to imagine the situation if no one else knows that he or she has HIV. Subscale scores are calculated by summing the scores for the items belonging to each subscale, and an overall stigma score is calculated by summing the ratings for all 40 items. The instrument was originally tested in a sample of 318 persons (81% men) living with HIV in the US and showed satisfactory internal consistency for the subscales and overall scores with coefficient α ranging from 0.90 to 0.93. The test-retest reliability ranged from 0.89 to 0.92 with 2-3 weeks between tests.
HIV-related stigma is hypothesised to be negatively related to self-esteem and positively related to depression, and in line with these assumptions, moderate to strong correlations between the HIV stigma scale and measures of self-esteem, depression and aspects of social support and conflict have been found, supporting the construct validity of the instrument [15].
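The summation rule described above can be sketched in a few lines of Python. This is only an illustration: the item-to-subscale mapping below is hypothetical (the actual assignment is the one reported in Table 2), and returning no score when any item is unanswered is an assumption, since the text does not state how incomplete subscales are handled.

```python
from typing import Dict, List, Optional


def subscale_scores(
    responses: Dict[int, Optional[int]],
    subscales: Dict[str, List[int]],
) -> Dict[str, Optional[int]]:
    """Sum the 1-4 ratings of the items in each subscale.

    Assumption: a subscale with any unanswered item gets no score (None).
    """
    scores: Dict[str, Optional[int]] = {}
    for name, items in subscales.items():
        ratings = [responses.get(item) for item in items]
        if any(r is None for r in ratings):
            scores[name] = None
        else:
            scores[name] = sum(ratings)
    return scores


# Hypothetical mapping and answers, for illustration only.
EXAMPLE_SUBSCALES = {
    "disclosure_concerns": [1, 2, 3],
    "negative_self_image": [4, 5],
}
example_responses = {1: 4, 2: 3, 3: 4, 4: 2, 5: None}
example_scores = subscale_scores(example_responses, EXAMPLE_SUBSCALES)
print(example_scores)
```

In this toy case the first subscale sums to 11 and the second is left unscored because item 5 is missing.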

Translation of the HIV Stigma Scale to Swedish
The 40 items were translated independently from English into Swedish by three members of the research group, all well experienced within the area of infectious diseases. The three translated versions were compared and merged into one Swedish version that was reviewed by a bilingual consultant (Swedish-English). This was followed by minor changes before the items were translated back into English by a professional translator and compared to the original scale. Additional small changes were made to ensure that the Swedish version did not differ from the original instrument.

Feasibility of the Swedish HIV Stigma Scale
The feasibility of the items was assessed through think-aloud interviews with a purposeful sample of people living with HIV (7 men, 2 women; 3 born in Sweden, 6 born in other countries) who completed the Swedish version of the HIV stigma scale whilst sharing their thoughts aloud [22,23]. The analysis showed that the participants overall found the items relevant and comprehensive.

Sample and Procedure
Data were collected from March through September 2013. Participants were recruited from the Department of Infectious Diseases at the Karolinska University Hospital in Stockholm, Sweden; the sample of patients listed at the clinic was judged representative regarding gender distribution and immigration status for people living with HIV in Sweden. The inclusion criteria were: (1) diagnosis with HIV and (2) 18 years of age and older. Patients who were newly diagnosed (<6 months) or had their first appointment at the clinic were excluded. A member of the research team approached eligible participants when they came to the clinic for scheduled appointments, and patients who accepted participation responded to the instrument either at the clinic or at home. Participants with insufficient knowledge in Swedish or English were offered the opportunity to fill out the questionnaire with a professional translator or with a member of the research team. The assistance from the research team was individualized and included explanations of the statements and response alternatives. At the end of the inclusion period, a shortage of men was noticed, so efforts were made to reach this group through purposive recruitment.

Additional Instruments
In addition to the HIV stigma scale, all participants were asked to complete the Swedish Health-related Quality of Life Survey (Swed-Qual) [24]. Swed-Qual was derived from the Medical Outcome Study, MOS, consisting of 63 items and forming 13 scales covering physical, mental, social and general health. Swed-Qual has previously been used to measure quality of life among people living with HIV [25,26]. Two of the eleven multi-item scales from Swed-Qual, emotional well-being, negative effect and emotional well-being, positive effect, were hypothesised to be associated with stigma mechanisms and were used to investigate the construct validity of the HIV stigma scale. The two scales each consist of six statements (e.g. I have felt down and I have felt harmony). Each statement is rated on a four-point Likert scale ranging from "completely agree" to "completely disagree". The answers are transformed to a 0-100 scale where 0 indicates worst possible and 100 best possible health-related quality of life; the two scales are presented as mean scores from the answers of the items belonging to the respective scale.
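The 0-100 transformation is not spelled out in the text. A minimal sketch, assuming a simple linear rescaling of the 1-4 ratings and a per-scale reverse-scoring flag (both are assumptions made for illustration, not the documented Swed-Qual scoring algorithm), could look like this:

```python
from statistics import mean
from typing import List


def rescale_item(rating: int, reverse: bool = False, low: int = 1, high: int = 4) -> float:
    """Map a rating on low..high linearly onto 0-100 (assumed transformation).

    With reverse=True the direction is flipped, so that 100 always means
    the best possible health-related quality of life.
    """
    if not low <= rating <= high:
        raise ValueError(f"rating {rating} is outside {low}-{high}")
    scaled = (rating - low) / (high - low) * 100
    return 100 - scaled if reverse else scaled


def scale_score(ratings: List[int], reverse: bool = False) -> float:
    """Mean of the rescaled item scores belonging to one scale."""
    return mean(rescale_item(r, reverse) for r in ratings)


# Hypothetical answers to one six-item scale.
print(scale_score([1, 4, 1, 4, 1, 4]))
```

Which items need reversing depends on the wording of each statement, so the `reverse` flag would have to be set per scale from the instrument's own scoring manual.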

Data Analysis
Statistical analyses with the exception of parallel analysis were conducted in IBM SPSS Statistics 22. Randomised eigenvalues for parallel analysis were derived using the package nFactors in R Statistics [27,28].

Construct Validity
To explore the latent structure of the data set, an exploratory factor analysis was performed. The adequacy of the data for factor analysis was investigated with the Kaiser-Meyer-Olkin measure of sampling adequacy (KMO) and Bartlett's test of sphericity [29]. Alpha factoring with oblimin rotation was used as the extraction method to simulate the analysis performed by Berger et al. [15]. The number of factors extracted was determined through parallel analysis [30] and a screeplot [29]. The pattern matrix was analysed regarding loadings; only items with loadings of >0.32 [29] were included in the final version. An item with two or more loadings >0.32 was considered a cross-loading item [29] and assigned to the single factor with the highest loading. In the same way, an additional factor analysis was performed without the 16 items that loaded on the dimension personalised stigma to secure construct validity without this dimension.
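The idea behind parallel analysis can be sketched as follows. This is a simplified NumPy illustration of Horn's procedure on correlation-matrix eigenvalues, not the nFactors code used in the study; the number of simulations, the 95th-percentile benchmark and the synthetic two-factor data are all assumptions chosen for the example.

```python
import numpy as np


def parallel_analysis(data: np.ndarray, n_sims: int = 200,
                      percentile: float = 95.0, seed: int = 0) -> int:
    """Count leading factors whose eigenvalue exceeds the random benchmark."""
    n_obs, n_items = data.shape
    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

    # Eigenvalues of correlation matrices of pure noise of the same shape.
    rng = np.random.default_rng(seed)
    simulated = np.empty((n_sims, n_items))
    for s in range(n_sims):
        noise = rng.standard_normal((n_obs, n_items))
        simulated[s] = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)

    # Retain factors until the first one that fails to beat the benchmark.
    n_factors = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        n_factors += 1
    return n_factors


# Synthetic data with two latent factors, three items each (illustration only).
rng = np.random.default_rng(1)
latent = rng.standard_normal((300, 2))
loadings = np.zeros((2, 6))
loadings[0, :3] = 0.8
loadings[1, 3:] = 0.8
items = latent @ loadings + 0.6 * rng.standard_normal((300, 6))
n_retained = parallel_analysis(items)
print(n_retained)
```

The retained count is then compared with the elbow of the screeplot, as described above, before settling on the number of factors to extract.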
The distribution of scores within the subscales was evaluated through means and standard deviations on the subscale and item levels. Item means and standard deviations were expected to be roughly equivalent within the subscale to justify the summation of item scores into subscale scores [31]. Corrected item-total correlation coefficients were examined and expected to exceed 0.4. Convergent validity was assessed by Spearman's rank correlation coefficient between the HIV stigma scale and selected subscales from Swed-Qual: emotional well-being, negative effect and emotional well-being, positive effect. In both Swed-Qual subscales, lower scores reflect worse emotional well-being; it was hypothesised that the selected Swed-Qual scales would have moderate correlations with the stigma subscales. Correlation coefficients of 0.10-0.29, 0.30-0.49 and 0.50 and above were interpreted as small, moderate and large, respectively [32].
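As an illustration of the convergent-validity check, the sketch below computes Spearman's rank correlation with the standard library only and labels its magnitude. The cut-offs follow the conventional small/moderate/large bands, with 0.50 taken as the lower bound for "large"; the "negligible" label for values below 0.10 is an addition made here for completeness.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def magnitude(rho: float) -> str:
    size = abs(rho)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "small"
    return "negligible"  # label not used in the text; added for completeness
```

For example, `magnitude(-0.494)` and `magnitude(-0.210)`, the two extremes reported in the abstract, fall in the moderate and small bands respectively.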

Data Quality
Data quality was evaluated through analysis of missing values for each item. Items that more than 5% of the participants had not answered, i.e. missing values, were further examined to see whether there was reason to believe that the item had been misunderstood or whether there were other explanations for the missing values [33].
The four-point Likert scale of the HIV stigma scale was evaluated through analysis of whether all response alternatives, 1-4, were used for all items. Floor and ceiling effects were calculated and considered acceptable if they did not exceed 15% [34].
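These data-quality checks are simple proportions. A minimal sketch follows, assuming that missing answers are stored as `None` and that floor and ceiling effects are defined as the share of respondents at the lowest and highest possible sum score of a subscale (the usual definition, though the text does not spell it out):

```python
from typing import Dict, Optional, Sequence


def missing_rate(item_responses: Sequence[Optional[int]]) -> float:
    """Percentage of respondents who left one item unanswered."""
    return 100 * sum(r is None for r in item_responses) / len(item_responses)


def floor_ceiling(scores: Sequence[int], n_items: int,
                  low: int = 1, high: int = 4) -> Dict[str, float]:
    """Percentage of respondents at the minimum and maximum possible sum score."""
    floor_score = n_items * low
    ceiling_score = n_items * high
    n = len(scores)
    return {
        "floor": 100 * sum(s == floor_score for s in scores) / n,
        "ceiling": 100 * sum(s == ceiling_score for s in scores) / n,
    }


# Hypothetical data: one item's answers and four sum scores on a 3-item subscale.
print(missing_rate([1, None, 3, 4]))
print(floor_ceiling([3, 12, 7, 8], n_items=3))
```

An item would be flagged for closer inspection when `missing_rate` exceeds 5, and a subscale when either percentage exceeds 15.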

Reliability
Cronbach's α was calculated for the subscales and for the overall scale to investigate the internal consistency of the scale and was considered acceptable if it exceeded 0.7 [29].
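For reference, Cronbach's α for k items is k/(k−1) × (1 − sum of item variances / variance of the sum score). A small standard-library sketch, assuming complete cases only and using made-up ratings:

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha from one list of item ratings per respondent.

    Assumes complete cases (no missing ratings) and at least two items.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings from five respondents on a three-item subscale.
ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 3, 4],
    [4, 4, 4],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Sample variances (n−1 denominator) are used for both the items and the sum score; the ratio is the same as long as both use the same denominator.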

Ethical considerations
The study was approved by the Regional Ethical Review Board of Stockholm [Regionala etikprövningsnämnden i Stockholm], FE 289, SE-171 77 Stockholm, Sweden (record no 2013/335-32) and has been performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. Written informed consent was collected from all participants. Oral and written information about the study was given in Swedish or English to all potential participants. The information included the aims of the study, voluntariness and the possibility to withdraw at any point without any effects on current or future care, the fact that answers would be treated with confidentiality and that data would be presented only on a group level, with individuals kept anonymous. Participants who could not read Swedish or English had the written information read out loud by a member of the research team or a professional translator.

Results
One hundred and ninety-four people living with HIV agreed to participate in the study (85 women and 109 men in the ages 19-83 years; mean 48.8, SD 11.7, response rate 53%). Further socio-demographic data of the sample are shown in Table 1. One hundred and sixty-seven of the participants completed the Swedish version and 27 completed the English version of the questionnaire. Eight participants completed the questionnaire with assistance from a professional translator and 34 received assistance from the research team. The χ² test showed that the sample was representative for people living with HIV in Sweden regarding gender and origin [2]. A significant difference was found regarding path of transmission, where an overrepresentation of heterosexual transmission was seen in the sample (60% vs. expected 51%, χ² 8.02, df 3, p<0.05).
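The representativeness check is a χ² goodness-of-fit test of sample counts against population proportions. The sketch below shows the computation with invented counts and proportions (the study's category counts are not given in the text); the critical values are the standard 5% points of the χ² distribution.

```python
from typing import Sequence

# Standard 5% critical values of the chi-square distribution, by degrees of freedom.
CRITICAL_005 = {1: 3.841, 2: 5.991, 3: 7.815}


def chi_square_gof(observed: Sequence[int], expected_props: Sequence[float]) -> float:
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over all categories."""
    if abs(sum(expected_props) - 1.0) > 1e-9:
        raise ValueError("expected proportions must sum to 1")
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_props))


# Hypothetical sample counts in four categories and population proportions.
observed_counts = [60, 25, 10, 5]
population_props = [0.5, 0.3, 0.1, 0.1]
statistic = chi_square_gof(observed_counts, population_props)
df = len(observed_counts) - 1
print(round(statistic, 2), statistic > CRITICAL_005[df])
```

By the same rule, the reported statistic of 8.02 with 3 degrees of freedom exceeds 7.815, which is consistent with the stated p<0.05.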

Construct Validity
The answers from the participants who responded to all items in the HIV stigma scale (n = 132) were used for the exploratory factor analysis. The dataset proved suitable for exploratory factor analysis with a KMO of 0.910 and statistical significance for Bartlett's test of sphericity (p<0.001). Parallel analysis and the screeplot indicated a four-factor solution, with all items loading on the same factor as in the original HIV stigma scale and five items cross-loading (Table 2). The four factors accounted for 62.2% of the total variance. One item, number 11 (It is easier to avoid new friendships than to worry about telling someone that I have HIV), had no loadings >0.32 (Table 2) and was not included in further analyses. Each of the remaining 39 items was assigned to a single factor according to its highest loading, as presented in Table 2.
Item means within subscales were roughly equivalent, and the standard deviations were close to one (Table 3). However, the item means differed between subscales, where personalised stigma and negative self-image had the lowest mean scores of 2.16 and 2.17, respectively, and disclosure concerns had the highest mean score of 3.07. Corrected item-total correlations coefficients exceeded 0.4 for all items. Descriptive statistics for the four subscales are shown in Table 4.
Negative correlations were found between all subscales of the HIV stigma scale and the emotional well-being scales of Swed-Qual (Table 5). The correlation coefficients were of moderate size for the dimensions of personalised stigma, negative self-image and concerns with public attitudes. The correlation coefficients for the disclosure concerns dimension were small, although both exceeded 0.20 in magnitude.

Data Quality
The percentage of missing values for each item is presented in Table 3. The level of missing responses exceeded 5% for 18 of the items (5.2-10.8%). A majority of these items (n = 14) include an assumption that at least some people would know about the respondent's HIV status. Most of these items (n = 12) belonged to factor 1, personalised stigma. In the margins of the forms, some participants wrote "nobody knows" and either skipped items or marked the response alternative "completely disagree" for these items. Participants with missing values either skipped all the items for the personalised stigma dimension or skipped some items and chose the response alternative "completely disagree" on others. To verify that the instrument remains valid and reliable without this dimension, we performed an exploratory factor analysis without the questions belonging to the dimension personalised stigma. Parallel analysis indicated a three-factor solution with all items except item 37 loading on the same factors as in the four-factor solution, as presented in Table 6. All response alternatives in the four-point Likert scale were used for all items. Floor effects ranged between 0.5 and 7.2% and ceiling effects between 1.5 and 9.3% (Table 4).

Reliability
Cronbach's α for the subscales ranged from 0.871 to 0.958 (Table 4), and the total scale including all 39 items generated an α of 0.958.

Discussion
In this study, the HIV stigma scale has been evaluated in a Swedish context with regard to psychometric properties. After excluding one item due to low factor loadings, the instrument, measuring four dimensions of stigma, was shown to have satisfactory construct validity and reliability. The instrument generated data of good quality, with the exception of items that assume that the participant has disclosed her or his HIV status to another person. The exploratory factor analysis supported the notion that the HIV stigma scale measures four dimensions of stigma in a Swedish context, as previously presented by Berger et al [15] in an American context; the content of the factors was interpreted to represent well the dimensions of personalised stigma, disclosure concerns, negative self-image and concerns with public attitudes. One item (item 11), It is easier to avoid new friendships than to worry about telling someone I have HIV, had no factor loadings >0.32, indicating that the item had a weak connection to all four stigma dimensions. Based on this, we recommend that the item be excluded when the HIV stigma scale is used in Sweden. When the original English version of the HIV stigma scale was developed, 16 items cross-loaded with high loadings on several factors. When designing the subscales, Berger et al [15] suggested that these items be assigned to several subscales and that subscale scores should be computed by summing the responses of all items belonging to each factor. In the factor structure obtained in the current study, only five items cross-loaded. To avoid an item's appearance in more than one subscale, we recommend assigning each item to the one factor on which it had the highest loading. The summation of item scores into subscale scores is justified if the item scores are roughly equivalent across the subscale and the corrected item-total correlation exceeds 0.4 for all items [31], which is supported by our analysis on a subscale level.
We therefore recommend that subscale scores be computed by summing the responses for the items belonging to each subscale. In the original HIV stigma scale, it is also suggested that all item scores be summed into a total stigma score. Since the number of items varies across the subscales and the item means differed between subscales, the use of a total score can be questioned.
The correlation coefficients between the HIV stigma scale and measures of emotional well-being indicated that people reporting more stigma experienced fewer positive emotions and more negative emotions. The moderate correlation coefficients between the personalised stigma, negative self-image and concerns with public attitudes subscales and emotional well-being support convergent validity for these subscales. The correlations for the disclosure concerns subscale were also in the expected negative direction but of small magnitude, indicating a somewhat weaker relationship than for the other three dimensions. All four response alternatives were used for all items, which provides evidence that a four-point Likert scale is sufficient. Our analysis of floor and ceiling effects met the standards [34] for all subscales, which indicates that the scale measures an accurate depth of the concept.
We found a high rate of missing answers among items that assume at least some extent of disclosure of the diagnosis, mostly belonging to the dimension of personalised stigma. In many of the cases with missing answers in the dimension of personalised stigma, the participant had skipped several of the items for this particular dimension. Based on written responses in the margins of the questionnaires, we conclude that these participants had not disclosed their HIV status to anyone and were thus not able or did not find it meaningful to imagine the situations as requested in the instructions for the items assuming that the participant's HIV status is known to other people. We suggest that the instructions requesting the participant to imagine the situation should be removed, since some participants skipped these items rather than giving an answer based on imagination. We instead recommend that the instrument include a fifth response alternative, "not applicable". We also recommend an additional item where the participant can state how many people, apart from healthcare providers, know about their HIV infection. If no one except healthcare providers knows about the participant's HIV infection, they should skip all questions belonging to the dimension personalised stigma. The three-factor solution presented in Table 6 indicates that the instrument has construct validity even when the dimension personalised stigma is excluded. Item 37 has low loadings in this solution, but we recommend keeping it in the dimension disclosure concerns due to its theoretical suitability for this dimension.
The item focusing on the risk of losing employment due to having HIV (item 5) had the highest non-response rate and may reflect the strong legal protection against this type of discrimination in Sweden. Based on written responses in the form ('Does this happen in Sweden?'), it can be assumed that some participants found this item irrelevant and therefore did not answer. However, 38% of the participants agreed or totally agreed, which indicates relevance for some participants. Item 9, People with HIV are treated like outcasts, had a rate of missing answers that exceeded the limit set at 5%. This statement can evoke negative reactions among participants, but so can the subject of stigma as a whole. We do not see the missing values for these items as reason to exclude the items from the scale, but we suggest that someone be available to answer questions when the instrument is distributed. The HIV stigma scale is not recommended to be distributed by mail.

Methodological Considerations
When evaluating exploratory factor analysis, MacCallum et al [35] argue that the structure of the model and the communalities are of greater importance than the ratio of sample size to number of items. Further, in their Monte Carlo study, MacCallum et al [35] show that exploratory factor analysis can yield reliable solutions for sample sizes below 100 when communalities are high (overall >0.6) or wide (ranging from 0.2 to 0.8) and factors are over-determined (simple structure and high loadings on at least 3-4 items per factor). Our sample with 132 complete ratings for the full HIV stigma scale generates a solution with over-determined factors and wide communalities ranging from 0.35 to 0.80, which is why we believe that it is possible to draw firm conclusions based on the sample.
Studies that use patient-reported data from people living with HIV generally face a problem of representativeness. Many studies, including the original paper about the HIV stigma scale [15], had an overrepresentation of MSM and an underrepresentation of people of younger age, people with intravenous drug use and people from ethnic minorities [36]. The representativeness of the sample in this paper when compared to people living with HIV in Sweden [2] was evaluated using the χ² goodness-of-fit test, showing that no significant difference could be found regarding gender and origin. The slight overrepresentation of participants with heterosexual path of transmission reflects the flow of patients at the clinic where the data collection was conducted.

Conclusions
The HIV stigma scale, reduced by one item to 39 items, was shown to be a valid and reliable measure of four dimensions of stigma in a Swedish context. Response alternatives are suggested to be slightly revised for items assuming the disclosure of diagnosis to another person. We recommend that the items belonging to the personalised stigma dimension should be skipped for people that have not disclosed their HIV infection to anyone except their healthcare provider. Our analysis confirmed construct validity of the instrument even without this dimension.