## Abstract

### Background

Published validation studies have reported different factor structures for the Self-Compassion Scale (SCS). The objective of this study was to assess the factor structure of the SCS in a large general population sample representative of the German population.

### Methods

A German population sample completed the SCS and other self-report measures. Confirmatory factor analysis (CFA) in Mplus was used to test six models previously found in factor analytic studies (unifactorial model, two-factor model, three-factor model, six-factor model, a hierarchical (second order) model with six first-order factors and two second-order factors, and a model with items arbitrarily assigned to six factors). In addition, three bifactor models were tested: bifactor model #1 with two group factors (SCS positive items, called SCS positive, and SCS negative items, called SCS negative) and one general factor (overall SCS); bifactor model #2, a two-tier model with six group factors, three (SCS positive subscales) corresponding to one general dimension (SCS positive) and three (SCS negative subscales) corresponding to the second general dimension (SCS negative); and bifactor model #3 with six group factors (six SCS subscales) and one general factor (overall SCS).

### Results

The two-factor model, the six-factor model, and the hierarchical model showed less than ideal, but acceptable fit. The model fit indices for these models were comparable, with no apparent advantage of the six-factor model over the two-factor model. The one-factor model, the three-factor model, and bifactor model #3 showed poor fit. The other two bifactor models showed strong support for two factors: SCS positive and SCS negative.

### Conclusion

The main results of this study are that, among the German general population, six SCS factors and two SCS factors fit the data reasonably well. While six factors can be modelled, the three negative factors and the three positive factors, respectively, did not reflect reliable or meaningful variance beyond the two summative positive and negative item factors. As such, we recommend the use of two subscale scores to capture a positive factor and a negative factor when administering the German SCS to general population samples and we strongly advise against the use of a total score across all SCS items.

**Citation:** Coroiu A, Kwakkenbos L, Moran C, Thombs B, Albani C, Bourkas S, et al. (2018) Structural validation of the Self-Compassion Scale with a German general population sample. PLoS ONE 13(2): e0190771. https://doi.org/10.1371/journal.pone.0190771

**Editor:** Ulrich S. Tran, Universitat Wien, AUSTRIA

**Received:** March 17, 2017; **Accepted:** December 14, 2017; **Published:** February 6, 2018

**Copyright:** © 2018 Coroiu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability:** Data is available online on a data sharing platform. The data can be accessed here: URL: https://osf.io/rav8k/?view_only=895737c523b2405189670bc086751635. DOI 10.17605/OSF.IO/RAV8K | ARK c7605/osf.io/rav8k.

**Funding:** The authors received no specific funding for this work.

**Competing interests:** The authors have declared that no competing interests exist.

## Background

Compassion is a concept rooted in Buddhist philosophy, which represents a positive attitude whereby one can deeply care for one’s suffering (i.e., self-compassion) or be deeply moved by the suffering of another (i.e., compassion for others). In recent years, self-compassion has become a popular topic in psychology with numerous studies assessing the associations between self-compassion and other psychological variables. Meta-analyses on correlates of self-compassion, for instance, found positive associations with cognitive, psychological and affective well-being [1] and negative associations between self-compassion and physiological or psychological distress (i.e., anxiety and depression) [2]. The argument driving these studies is that self-compassion is a construct highly amenable to change, which can be used in psychotherapy to improve mental health outcomes. Interventions with a self-compassion component (e.g., compassionate mind training; mindful self-compassion program) have indeed been shown to increase life satisfaction in the general population [3] and to decrease depression, anxiety, and stress in community and clinical samples [3–7]. Gilbert [8] suggested self-criticism as the mechanism through which self-compassion acts to lower the emotional response to perceived threats. Self-criticism is a risk factor for various psychological difficulties, including mood disorders, anxiety, self-harm, suicide, and alcoholism [9]. In addition, studies have shown that self-compassion based interventions can be effective at improving self-compassion in various populations [3, 5, 10, 11]. Most studies to date assessed the construct of self-compassion using the Self-Compassion Scale (SCS) [12], with the majority of cross-sectional [2] and intervention studies [3, 10, 11] using a total summed SCS score as the measure of effect. Some studies have reported analyses involving SCS subscale scores [13, 14] or used both summed and subscale scores [5]. 
However, current methodological issues related to the measurement of self-compassion via the SCS make it difficult to contextualize the findings about the psychotherapeutic potential of self-compassion.

The Self-Compassion Scale was developed by Neff [12]. As per Neff’s definition, self-compassion encompasses three bipolar dimensions demarcated into six factors: a) the presence of self-kindness in contrast to self-judgment; b) a sense of common humanity as opposed to a sense of isolation; and c) mindfulness as opposed to over-identification [12, 15]. *Self-kindness* refers to being caring, understanding and accepting of oneself, becoming aware of, and being moved by one’s own suffering whereas *self-judgment* involves being harsh and extremely self-critical. *Common humanity* refers to the recognition that failure and hardship are shared human experiences, which foster connectedness to others rather than leaving one feeling *isolated* when faced with suffering. *Mindfulness* represents the acceptance of the present moment experience and involves taking an objective stance on one’s experience in order to gain perspective and to avoid *over-identification* with negative thoughts and emotions [15]. Neff [16] updated the definition of self-compassion to an interplay of positive and negative constructs that work as a dynamic system and include “different ways that individuals emotionally respond to pain and failure (with kindness or judgment), cognitively understand their predicament (as part of the human experience or as isolating), and pay attention to suffering (in a mindful or over-identified manner)”.

The factorial structure of the original English SCS and its translations has not been consistently replicated and studies to date have yielded conflicting results. For a summary of factor analytical findings presented in this paragraph, please refer to S1 Table. Our extensive, yet not exhaustive review of the literature found that the six-factor structure of the SCS has been investigated in eleven studies and was confirmed in eight, which included college students, clinical and non-clinical samples, community samples, and the general population responding to the original SCS or translated versions of the SCS [12, 17–23]. The six-factor structure, however, was not confirmed in three studies using community, clinical, and online recruited samples [23–25]. Neff [12] proposed the existence of three conceptual SCS dimensions, but could not demonstrate them empirically. The three SCS factors corresponding to the three bipolar dimensions have also been investigated in two studies conducted with “psychologically-minded” individuals (e.g., college students) [18, 26], but the three-factor structure could not be demonstrated. A two-factor SCS structure corresponding to the positively phrased items (from self-kindness, common humanity, mindfulness subscales) and the negatively phrased items (from self-judgment, isolation, over-identification subscales), respectively, was investigated in seven studies, and was replicated in four studies using college students, community and mixed samples [24, 26–28]. The two-factor SCS structure was not replicated in three studies with the general population, community members, psychologically-minded adults (e.g., university students and graduates), and a clinical sample [18, 19, 23]. In line with a possible two-factor structure of SCS, Neff [29] proposed that self-kindness, common humanity, and mindfulness represent a “self-compassionate frame-of-mind”. 
Similarly, Gilbert, McEwan [30] used the terms “*self-compassion*” to refer to a composite of the three positive subscales of SCS and “*self-coldness*” for the composite of the three negative subscales. A unifactorial structure of SCS has been investigated in at least 10 studies, none of which confirmed the presence of a single factor [12, 19, 20, 22, 23, 25, 26, 28, 31]. A higher order factor model with six first order factors and one second/higher order factor was confirmed in some studies [12, 21, 32, 33], but not in others [19, 20, 22–24, 27, 31]. A higher order factor model with six first order factors and two higher order factors was investigated by two studies; the model converged in one study [31], but not the other [22]. Bifactor models with six group factors and one or two general factors have been investigated in a few studies: one study provided support for one general factor in three separate samples (college students, an MTurk-recruited sample, and meditators; MTurk, or Mechanical Turk, is an Amazon-owned platform used to collect data from participants online), but not among a clinical sample [23]; a second study conducted with college students found support for two general factors, self-compassion and self-coldness [26]; a third study conducted with an online recruited sample reported inconclusive findings [25]. Among the 16 validation studies summarized, five used the English version of the SCS [12, 20, 23, 26, 28] and 11 used SCS translations in German [18], Italian [19], Spanish [17, 31, 33], Dutch [27], Portuguese [21, 24, 31], Hungarian [25], French [32], and Japanese [22]. It is unclear whether the inconsistency of factorial results for the SCS could be attributed to translation, whether the self-compassion construct is expressed with some variability from one culture to another, or if there are other reasons for the inconsistency.

Despite a lack of clarity on the factorial validity of SCS, there is extensive interest from the research community in studying self-compassion. For example, meta-analyses exploring the relationship between a total SCS score and psychopathology [2, 34] and SCS and well-being [1, 2] found large effect sizes (range |.47| to |.54|). In addition, Muris and Petrocchi [34] assessed associations between the SCS subscale scores and measures of psychopathology and found stronger effects for the negative SCS subscales (r range = .47 to .50) than for the positive SCS subscales (r range = -.27 to -.34). These meta-analyses indicate that the use of a total SCS score could lead to inflated associations between the SCS total scale (which includes reverse-coded items tapping psychopathology features, such as isolation, self-judgment, and over-identification with negative emotions) and psychopathology measures; the use of a total score could also obscure important information, such as the unique contribution of the positive versus negative items of the SCS [34]. The use of a total summed SCS score as a measure of effect, when in fact SCS might not be unidimensional, could also negatively impact the validity of results from empirical and intervention studies. It is especially important to examine the factor structure in different translated versions of the SCS to determine whether cultural factors should be taken into consideration when assessing self-compassion. Establishing the factor structure of the SCS in different languages is an important step in improving research using this scale.

## Study objective

The main objective of this study was to investigate the factorial structure of the SCS in a large general population sample representative of the German population. We tested several models previously found in empirical research involving the SCS, but we did not have an a priori hypothesis as to which model would be a better fit for our data. The secondary objective of this study was to assess the convergent and divergent validity of the SCS. It was hypothesized that a) the positively phrased factor(s) of the SCS found to be valid would be positively associated with measures assessing positive beliefs about the self and negatively associated with measures assessing negative beliefs about the self and distress; and b) the negatively phrased factor(s) of the SCS found to be valid would be negatively associated with measures assessing positive beliefs about the self and positively associated with measures assessing negative beliefs about the self and distress.

## Methods

### Participants and procedures

Data included in this study [35, 36] were extracted from a general population survey conducted in Germany in 2012. As part of this survey, a sample of the German general population (age 14 or above), representative with respect to age and gender, was selected by a specialized population survey service in Germany (USUMA, Berlin). Ethics approval for the survey was obtained from the University of Leipzig Research Ethics Board (REB; Protocol # 092-12-05032012) prior to data collection. Informed consent was obtained verbally and documented by study representatives. The study also adhered to the ethical principles of the *International Code of Marketing and Social Research Practice* by the International Chamber of Commerce and the European Society for Opinion and Marketing Research. Socio-demographic information and other self-report measures were collected by trained interviewers via face-to-face interviews conducted in participants’ homes. The only criteria for inclusion in the current analysis were being at least 18 years old and having complete responses on the study measures.

### Measures

#### Self-Compassion Scale (SCS).

The SCS [12] is a 26-item self-report measure assessing self-evaluations via six proposed subscales. Three of the subscales are phrased in a positive direction: self-kindness (e.g., “*When I’m going through a very hard time*, *I give myself the caring and tenderness I need*”), common humanity (e.g., “*When I feel inadequate in some way*, *I try to remind myself that feelings of inadequacy are shared by most people*”), and mindfulness (e.g., “*When something upsets me*, *I try to keep my emotions in balance*”). The remaining three subscales are phrased in a negative direction: self-judgment (e.g., “*I’m disapproving and judgmental about my own flaws and inadequacies*”), isolation (e.g., “*When I fail at something that’s important to me*, *I tend to feel alone in my failure*”), and over-identification (e.g., “*When I’m feeling down I tend to obsess and fixate on everything that’s wrong*”). The items are scored on a 5-point Likert scale ranging from 1 (*very rarely*) to 5 (*very often*), with higher scores indicating higher levels of the construct measured. In the development sample, which consisted of undergraduate students, the internal consistency for the total scale was α = .92 and ranged from .75 to .81 for the subscales [12]. Three-week test-retest reliability for the total scale was .92 and ranged from .80 to .88 for the subscales. Validity analyses revealed positive associations with measures of social connectedness, life satisfaction, self-esteem, and emotional processing and negative associations with measures of psychological distress, self-criticism, neurotic perfectionism, and rumination [12]. The current study used the German version of the SCS [18]. 
Confirmatory factor analyses with the German validation sample confirmed the six subscales, which correlated as theoretically expected (in terms of directionality) with measures of self-esteem, perfectionism, neuroticism, anxiety, and depression and had good internal consistency reliability for the subscales (Cronbach’s alpha (α) ranging from 0.66 to 0.80) and test-retest reliability (coefficients ranging from 0.72 to 0.80) [18]. In the current sample, Cronbach’s α for the subscales ranged from .71 (over-identification) to .79 (self-kindness).

#### Patient Health Questionnaire-9 (PHQ-9).

The PHQ-9 [37] is a brief self-administered screening tool assessing depressive symptomatology as per the DSM-IV criteria. It asks respondents to indicate how often during the previous two weeks they were bothered by problems such as “feeling down, depressed, or hopeless”. The items are rated on a 4-point Likert scale ranging from 0 (*not at all*) to 3 (*nearly every day*), with higher scores indicating more symptoms and/or higher severity. The PHQ-9 has shown good construct and criterion validity [37–39]. While initially developed for clinical populations, the PHQ-9 can also be used as a depression screening tool among the general population [39]. In the current study, a total score computed across all nine items (α = .86) was used in analyses.

#### Generalized Anxiety Disorder screener (GAD-2).

The GAD-2 [40] is a two-item measure assessing the frequency of anxiety symptoms experienced over the previous two weeks. Two core items (“*Feeling nervous*, *anxious*, *or on edge*” and “*Not being able to stop or control worrying*”) of the GAD-7 tool [40] were retained to create an ultra-brief screening tool. Response options were rated on a 4-point Likert scale ranging from 0 (*not at all*) to 3 (*nearly every day*). The current study used the German version of the GAD-2, which was found to have comparable validity and reliability to the GAD-7 [41].

#### Core Self-Evaluations Scale (CSES).

The CSES [42] is a 12-item scale that measures four types of core self-evaluations: locus of control, neuroticism, self-efficacy, and self-esteem. Half of the items are positively worded (e.g., “I complete tasks successfully”) while the other half are phrased negatively (e.g., “Sometimes when I fail I feel worthless”) and reverse-scored for the purpose of deriving a total scale score. Answers are rated on a 5-point Likert scale ranging from 1 (*strongly disagree*) to 5 (*strongly agree*). The CSES is scored by averaging the responses, with higher scores indicating more positive core self-evaluations. The German version of the Core Self-evaluation Scale was used in the current study [43]. Confirmatory factor analysis (CFA) conducted with a German general population sample found that the positively and negatively worded items loaded into two separate factors [35]. In the current study, two composite scores corresponding to the positive (α = .85) and negative (α = .84) factors were used in analyses.

### Data analysis

#### Assessment of the factor structure of the SCS.

Descriptive statistics (mean, standard deviation, range, percentage) of demographic characteristics were computed. Based on previous reports on the factor structure of the SCS, six initial models were fitted using CFA in Mplus 7 [44]: (1) a unifactorial model encompassing all 26 items; (2) a two-factor model encompassing the positively phrased items and the negatively phrased items, respectively; (3) a three-factor model with one factor incorporating items from the self-kindness and self-judgment subscales, a second factor encompassing items from the common humanity and isolation subscales, and a third factor with items from mindfulness and over-identification subscales; (4) a six-factor model corresponding to the six subscales hypothesized by Neff [12]; (5) a second order model testing the hierarchical relationship between six first-order factors (six subscales) and two second-order factors, i.e., SCS positive (encompassing items belonging to the self-kindness, common humanity, mindfulness subscales) and SCS negative (encompassing items belonging to the self-judgment, isolation, over-identification subscales); and (6) a six-factor model in which we arbitrarily assigned the positively phrased items into three subscales and the negatively phrased items into three separate subscales. None of the arbitrarily created scales contained more than two items from the same original SCS subscale. The item assignment procedure for the last model was included in S2 Table. The rationale for this model was a) to compare it against the six-factor model hypothesized by Neff, as we suspected that the six subscales hypothesized by Neff were not distinct from each other, based on previously reported high correlations between the positive (r > .7) and negative (r > .8) SCS subscales [23], and b) to help us distinguish our best fitting model from one that is a mere artefact of our modelling procedures. 
It is known that model fit indices tend to improve when more factors are being modelled [45, 46].
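The item-to-factor layout of models (1) to (4), and the mixing rule applied to model (6), can be written down as plain data. The following is a minimal sketch, not the authors' specification: the item numbers are assumed from the commonly published SCS scoring key (this section does not list them, and numbering in the German version should be checked against [18]), and the example split of positive items is hypothetical rather than the assignment given in S2 Table.

```python
from collections import Counter

# Assumed item key for the six SCS subscales (not stated in this section).
SUBSCALES = {
    "self_kindness": [5, 12, 19, 23, 26],
    "self_judgment": [1, 8, 11, 16, 21],
    "common_humanity": [3, 7, 10, 15],
    "isolation": [4, 13, 18, 25],
    "mindfulness": [9, 14, 17, 22],
    "over_identification": [2, 6, 20, 24],
}
POSITIVE = ("self_kindness", "common_humanity", "mindfulness")
NEGATIVE = ("self_judgment", "isolation", "over_identification")


def merge(names):
    """Pool the items of several subscales into one sorted factor."""
    return sorted(item for name in names for item in SUBSCALES[name])


# Models (1) to (4) expressed as factor name -> list of items.
MODELS = {
    "one_factor": {"scs": merge(SUBSCALES)},
    "two_factor": {"positive": merge(POSITIVE), "negative": merge(NEGATIVE)},
    "three_factor": {
        "kindness_vs_judgment": merge(["self_kindness", "self_judgment"]),
        "humanity_vs_isolation": merge(["common_humanity", "isolation"]),
        "mindfulness_vs_overid": merge(["mindfulness", "over_identification"]),
    },
    "six_factor": {name: sorted(items) for name, items in SUBSCALES.items()},
}

ITEM_TO_SUBSCALE = {i: name for name, items in SUBSCALES.items() for i in items}


def respects_mixing_rule(arbitrary_scales, max_per_subscale=2):
    """True if no arbitrary scale holds more than `max_per_subscale`
    items drawn from the same original subscale."""
    for items in arbitrary_scales.values():
        counts = Counter(ITEM_TO_SUBSCALE[i] for i in items)
        if max(counts.values()) > max_per_subscale:
            return False
    return True


# Hypothetical split of the 13 positive items (NOT the S2 Table assignment).
EXAMPLE_ARBITRARY_POSITIVE = {
    "arb_pos_1": [5, 12, 3, 7, 9],
    "arb_pos_2": [19, 23, 10, 14, 17],
    "arb_pos_3": [26, 15, 22],
}
```

Calling `respects_mixing_rule(EXAMPLE_ARBITRARY_POSITIVE)` checks the constraint described above; a scale built from three self-kindness items would fail it.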

In addition, three bifactor models were tested. In a bifactor model, the items are modeled to concurrently load onto group factors (SCS subscales) and the target factor(s) (latent overall SCS factor[s]). Bifactor models evaluate the degree to which covariance among a set of item responses can be accounted for by a single general factor that reflects common variance among all scale items, as well as specific factors reflecting additional common variance among clusters of items with similar content [47, 48]. The bifactor model can provide a useful alternative to standard correlated factor models for assessing aspects of multidimensionality and can be useful for providing possible explanations when there is a lack of clarity about dimensionality. The bifactor model can also be useful for evaluating whether a unit-weighted composite score for a single latent trait can be reasonably interpreted, versus creating subscales, in the context of identifiable multidimensionality [47–49]. In the bifactor model, the general (target) factor(s) represents the broad overarching construct that is being measured, and the group factors represent more narrowly defined subdomains [47]. In bifactor model #1, the general (target) factor represented the construct of self-compassion encompassing all 26 items of the SCS, and the group factors represented the two key subdomains, the negative and positive constructs of the SCS, respectively. In bifactor model #2 (a two-tier bifactor model), there were two general (target) factors, the positive and negative SCS constructs, and six group factors representing the SCS subscales, three corresponding to the negative SCS construct and three corresponding to the positive SCS construct. In bifactor model #3, there was one general factor (overall SCS) and six group factors (the six SCS subscales). 
In the bifactor models, all items were specified to load on the general factor (or factors) plus their designated group factor, and the general and specific factors were specified to be orthogonal (uncorrelated), as per assumptions of the bifactor analysis. In order to assess the contribution of the general factor and the specific factors to explaining item covariance, we calculated explained common variance (ECV), which is the variance explained by the general factor divided by the variance explained by the general plus the specific factors [47]. In addition, coefficient omega, which represents the proportion of total score variance attributed to all common factors, i.e., general and group factors, was computed. Omega is a model-based reliability estimate of the multidimensional composite total score, which is analogous to coefficient alpha [50]. Coefficient omega hierarchical (omega-h), which represents the proportion of total score variance that can be attributed to a single common factor, was also calculated for the general factor/factors. Omega-h assesses the degree to which the total scores reflect variation on the target (general) dimension [50]. Coefficient omega-h subscale (omega-hs) was also calculated for the specific group factors in order to evaluate the degree to which the subscales provided reliable information that is unique, i.e., beyond that of the general factor. Omega-hs for the group factors represents the proportion of unique variance that can be attributed to each of the group factors (subscales) after controlling for the variance of the general factor [50]. In order to assess the unidimensionality of the SCS, we used recommended cut offs of ECV > .70 to .80 and omega-h (and omega-h subscale) > .80 to identify strong general (and group) factors [47, 51].
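The indices defined above can be computed directly from standardized loadings. The sketch below uses the standard formulas for a bifactor model with one general factor and orthogonal group factors; the function name, the data layout, and the four-item loadings in the usage note are illustrative assumptions, not values or code from this study.

```python
def bifactor_indices(general, groups):
    """Compute ECV, omega, omega-h and omega-hs from standardized loadings.

    general: dict mapping item -> loading on the general factor
    groups:  dict mapping group name -> dict of item -> loading on that group
    Assumes every item loads on the general factor and exactly one group
    factor, and that all factors are orthogonal.
    """
    group_of = {item: g for g, loads in groups.items() for item in loads}

    # Item uniqueness: 1 minus communality (sum of squared loadings).
    unique = {
        item: 1.0 - general[item] ** 2 - groups[group_of[item]][item] ** 2
        for item in general
    }

    # ECV: general-factor share of the common variance.
    sum_sq_general = sum(l ** 2 for l in general.values())
    sum_sq_groups = sum(l ** 2 for loads in groups.values() for l in loads.values())
    ecv = sum_sq_general / (sum_sq_general + sum_sq_groups)

    # Variance of the unit-weighted total score, split by source.
    general_var = sum(general.values()) ** 2
    group_var = {g: sum(loads.values()) ** 2 for g, loads in groups.items()}
    total_var = general_var + sum(group_var.values()) + sum(unique.values())

    omega = (general_var + sum(group_var.values())) / total_var
    omega_h = general_var / total_var

    # Omega-hs: group-factor share of each subscale score's variance.
    omega_hs = {}
    for g, loads in groups.items():
        sub_general_var = sum(general[item] for item in loads) ** 2
        sub_unique = sum(unique[item] for item in loads)
        omega_hs[g] = group_var[g] / (sub_general_var + group_var[g] + sub_unique)

    return {"ecv": ecv, "omega": omega, "omega_h": omega_h, "omega_hs": omega_hs}
```

For example, with four hypothetical items that each load .60 on the general factor and .50 on one of two group factors, `bifactor_indices({"i1": 0.6, "i2": 0.6, "i3": 0.6, "i4": 0.6}, {"a": {"i1": 0.5, "i2": 0.5}, "b": {"i3": 0.5, "i4": 0.5}})` returns an omega well above omega-h, which is the pattern that signals a total score driven partly by the group factors.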

Item responses were ordinal Likert data and were therefore modeled using the weighted least squares (WLSMV) estimator with a diagonal weight matrix, robust standard errors, and a mean- and variance-adjusted chi-square statistic with delta parameterization [44]. The negatively phrased items were reverse coded for analyses. For the two-, three-, and six-factor models and the second order model, correlations among the SCS factors were specified. In the bifactor models the factors are assumed to be orthogonal, so correlations between the factors were not allowed, as per the assumptions of the bifactor analysis. In addition to the chi-square test, which is highly sensitive to sample size (i.e., it can lead to erroneous rejection of the model fit) [52], a combination of other indices, i.e., the Tucker-Lewis Index (TLI) [53], the Comparative Fit Index (CFI) [54] and the Root Mean Square Error of Approximation (RMSEA) [55], was used to assess the model fit. Good fitting models are indicated by a TLI and CFI ≥ 0.95 and RMSEA ≤ 0.06 [56], although a CFI and TLI of .90 or above [57] and a RMSEA of .08 or less [58] are often regarded as indicators of an adequate model fit. Modification indices were used to identify pairs of items for which the model fit would improve if the error estimates were freed to covary and for which there appeared to be theoretically justifiable shared method effects (e.g., similar content) [59]. Further, guidelines developed by Cheung and Rensvold [60] were used to determine whether there were substantive differences in model fit between the models we tested. A difference of ≤ 0.01 in the CFI between any two models was considered indicative of similar models. To select the best fitting model, we considered overall model fit indices and differences in CFI between each pair of models, factor loadings, inter-scale correlations, and additionally, for the bifactor models, omega coefficients.
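The cut-offs above can be expressed as two small helper functions. This is a sketch under one simplifying assumption: it requires all three indices to meet a threshold at once, whereas the study weighs the indices together with loadings and correlations rather than applying a mechanical rule. The function names are hypothetical.

```python
def classify_fit(cfi, tli, rmsea):
    """Label model fit using the cut-offs cited in the text.

    Assumption: all three indices must pass a tier's thresholds together.
    """
    if cfi >= 0.95 and tli >= 0.95 and rmsea <= 0.06:
        return "good"
    if cfi >= 0.90 and tli >= 0.90 and rmsea <= 0.08:
        return "adequate"
    return "poor"


def similar_fit(cfi_a, cfi_b, threshold=0.01):
    """True if two models differ in CFI by no more than the threshold.

    The difference is rounded to three decimals so that floating-point
    noise in values such as 0.89 - 0.88 does not flip the comparison.
    """
    return round(abs(cfi_a - cfi_b), 3) <= threshold
```

Applied to the indices reported for bifactor model #3 (CFI = .64, TLI = .57, RMSEA = .15), `classify_fit` returns `"poor"`, in line with the description of that model in the Results.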

#### Assessment of the reliability and validity of the SCS.

Cronbach’s alpha was computed for the SCS factors. Convergent and divergent validity of the SCS factors were assessed through Pearson’s correlations with the PHQ-9, GAD-2, and the two subscales of the CSES. The magnitude of the bivariate correlations was interpreted following Cohen’s effect sizes, with *r* = 0.10 indicating small, *r* = 0.30 indicating moderate, and *r* = 0.50 indicating large associations [61–63]. These analyses were conducted using SPSS, version 20.
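The reliability and validity statistics named above follow textbook formulas, sketched here with the Python standard library rather than SPSS. The function names are hypothetical, and treating Cohen's benchmarks as lower bounds of bands (with values under .10 labelled negligible) is an assumption made for illustration.

```python
import math
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cohen_label(r):
    """Map |r| onto Cohen's benchmarks, read as lower bounds of bands."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "small"
    return "negligible"
```

Before computing alpha for a mixed scale, negatively phrased items would need to be reverse coded (for a 1 to 5 response format, `6 - score`), as was done for the factor analyses.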

## Results

### Sample description

Data were collected from 2510 people (56.5% response rate), of whom 2448 (54% female, M age = 50.23, SD = 17.39, Range = 18–91) met the age restriction (≥ 18) for the current study. Socio-demographic characteristics of the sample are displayed in Table 1.

### Assessment of the factor structure of the SCS

Table 2 shows fit indices for all factor models. The unifactorial model and the three-factor model both showed poor fit. The two-factor model showed better, though less than ideal, fit, χ2 (298) = 6220.94, p < .001; CFI = .87; TLI = .85; RMSEA = .09. The correlation between the two latent factors was | r | = .23, p < .001 (negative items reverse coded; see Table 3). The fit indices for the six-factor model were similar to the two-factor model, χ2 (284) = 5265.15, p < .001; CFI = .89; TLI = .87; RMSEA = .09. The correlations between the three negatively phrased latent factors ranged from .79 - .93 and between the three positively phrased latent factors ranged from .76 - .81 (see Table 3). The six-factor CFA with arbitrarily assigned items revealed similar fit indices to those of the six-factor solution presented above, χ2 (284) = 5632.23, p < .001; CFI = .88; TLI = .86; RMSEA = .09. The second order model also showed less than ideal fit, χ2 (293) = 5167.83, p < .001; CFI = .89; TLI = .88; RMSEA = .08.
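The degrees of freedom reported for the two-factor and six-factor models can be checked by counting parameters. The sketch below assumes the usual accounting for CFA with ordinal indicators: item thresholds are saturated, so the information available is the set of correlations among the 26 items, and the free parameters are one loading per item plus the correlations among factors. It covers only correlated-factor models with simple structure and no correlated residuals; the function name is hypothetical.

```python
def cfa_df(n_items, n_factors):
    """Degrees of freedom for a simple-structure, correlated-factor CFA
    with ordinal indicators and no correlated residuals."""
    n_item_correlations = n_items * (n_items - 1) // 2
    n_loadings = n_items  # one loading per item
    n_factor_correlations = n_factors * (n_factors - 1) // 2
    return n_item_correlations - n_loadings - n_factor_correlations


print(cfa_df(26, 2))  # two-factor model
print(cfa_df(26, 6))  # six-factor model
```

Both six-factor models (Neff's assignment and the arbitrary assignment) have the same number of free parameters, which is why they share the same degrees of freedom and differ only in how well the item groupings reproduce the observed correlations.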

Upon inspecting the Modification Indices of the six-factor solution, three pairs of item errors were allowed to freely covary: items 10 and 7 (common humanity); items 18 and 13 (isolation); and items 19 and 5 (self-kindness), which resulted in minor improvements of the model fit for the two-factor model and the six-factor model (Table 2). The second order model did not further improve by freeing these error terms. Given the differences in CFI close to 0.01, the two-factor model, the six factor model, and the second order model were considered as having comparable model fit. For comparison purposes, the results of the CFAs with and without freeing covariances were included in Table 2.

The bifactor model #1, with one general (target) factor and two group factors, showed less than ideal fit, χ2 (274) = 5052, p < .001; CFI = .89; TLI = .87; RMSEA = .08. The omega coefficient (ω) was .94 and omega-hierarchical (ω_{h}) for the general factor was .42. The explained common variance for the common factor (ECV) was .33, and omega-h subscale (ω_{hs}) was .83 for the SCS positive group factor and .43 for the SCS negative group factor.

The bifactor model #2, the two-tier bifactor model with two general (target) factors and six group factors, showed adequate fit, χ2 (275) = 4501.37, p < .001; CFI = .90; TLI = .89; RMSEA = .08. For the SCS positive general factor ω = .91, ω_{h} for the positive factor = .83, and ECV = .72; ω_{hs} = .18 for self-kindness, ω_{hs} = .23 for common humanity, and ω_{hs} = .13 for mindfulness positive group factors. For the SCS negative general factor ω = .92, ω_{h} for the negative factor = .86, and ECV = .77; ω_{hs} = .23 for self-judgment, ω_{hs} = .16 for isolation, and ω_{hs} = .01 for over-identification negative group factors. The omega coefficients and the ECV computations for the bifactor models (1 and 2) were included in Table 3.
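To make the comparison with the stated cut-offs explicit, the reported coefficients for bifactor models #1 and #2 can be screened as follows. This is an illustrative sketch: the values are copied from the two paragraphs above, while the function name and the choice of the lower bound of the ECV range (.70) as the threshold are assumptions.

```python
# Coefficients reported in the text for the general factors.
GENERAL_FACTORS = {
    "bifactor_1_overall": {"omega_h": 0.42, "ecv": 0.33},
    "bifactor_2_positive": {"omega_h": 0.83, "ecv": 0.72},
    "bifactor_2_negative": {"omega_h": 0.86, "ecv": 0.77},
}

# Omega-hs values reported for the six group factors of bifactor model #2.
GROUP_OMEGA_HS = {
    "self_kindness": 0.18,
    "common_humanity": 0.23,
    "mindfulness": 0.13,
    "self_judgment": 0.23,
    "isolation": 0.16,
    "over_identification": 0.01,
}


def is_strong_general_factor(omega_h, ecv, omega_cut=0.80, ecv_cut=0.70):
    """True if both omega-h and ECV exceed their cut-offs."""
    return omega_h > omega_cut and ecv > ecv_cut


strong = {name: is_strong_general_factor(**vals)
          for name, vals in GENERAL_FACTORS.items()}
reliable_groups = [name for name, w in GROUP_OMEGA_HS.items() if w > 0.80]
```

Under these assumptions, only the two general factors of bifactor model #2 pass the screen, and `reliable_groups` is empty because none of the six subscale-level coefficients approaches .80.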

The bifactor model #3, with one general factor and six group factors, yielded a poor fit, χ2 (274) = 16194.59, p < .001; CFI = .64; TLI = .57; RMSEA = .15. Given the poor model fit, omega coefficients were not computed for this model.

### Validity and reliability of the SCS factors

#### SCS latent factor scores computed in MPlus.

In Mplus, we modelled latent factors. Correlations among the latent factors pertaining to the six-factor and two-factor solutions, computed in Mplus, were included in Table 4. Factor loadings for the two-factor CFA were included in Table 5 and the loadings for the six-factor CFA were included in S3 Table. In the two-factor solution, there was a moderate positive correlation between the SCS positive factor and the SCS negative factor. In the six-factor solution, there were large positive correlations among the three positive latent factors and among the three negative latent factors, respectively.

#### SCS subscale scores computed in SPSS.

In SPSS, we computed summed subscale scores. Correlations between the study measures and the six original SCS subscales and the two SCS subscales (derived from all of the positive items and all of the negative items, respectively), computed in SPSS, were included in Table 3. There was a moderate positive correlation between the SCS positive and SCS negative subscales. There were large positive correlations (.60 < *r* < .63) among the three positively phrased subscales (self-kindness, common humanity, and mindfulness). There were also large positive correlations (.56 < *r* < .68) among the three negatively phrased subscales (self-judgment, isolation, and over-identification). Further, there were large positive correlations (*r* > .85) between the three positively phrased SCS subscales. Similarly, there were large positive correlations between the negatively phrased SCS subscales (*r* > .80). The associations between the positive SCS subscales and the PHQ-9, the GAD-2, and the CSES negative were negligible, but there was a small positive association with the CSES positive. The negative SCS factors showed moderate-to-large correlations with the PHQ-9, the GAD-2, and the CSES negative, and small to moderate negative correlations with the CSES positive.

#### Reliability assessment.

In the six-factor model, Cronbach’s alphas for the SCS subscales were .79 for self-kindness, .73 for common humanity, .77 for mindfulness, .73 for self-judgment, .77 for isolation and .71 for over-identification. In the two-factor SCS model, alpha was .88 for the SCS positive subscale and .87 for the SCS negative subscale.
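
For readers who want to reproduce this kind of reliability estimate outside SPSS, the following is a minimal sketch of the standard Cronbach's alpha formula. The function name `cronbach_alpha` and the toy response matrix are hypothetical and serve only as illustration; they are not the study data, and this is not the authors' analysis code.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    sum_item_var = x.var(axis=0, ddof=1).sum()   # sum of the item variances
    total_var = x.sum(axis=1).var(ddof=1)        # variance of the summed score
    return k / (k - 1) * (1 - sum_item_var / total_var)

# Hypothetical toy data: 5 respondents x 4 items rated 1-5 (not study data)
toy = np.array([
    [2, 3, 2, 3],
    [4, 4, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
])
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

Applied to the item responses of one subscale (one column per item), the function returns a single coefficient comparable to the values reported above.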

## Discussion

Initial CFA showed acceptable and comparable model fit indices for the two-factor, six-factor, and second-order models. A six-factor CFA with arbitrarily assigned items yielded a model fit almost identical to that of the six-factor model using Neff’s original item assignment, which calls into question the meaningfulness of computing six subscale scores. The second-order and six-factor models did not substantively improve fit to the data over the two-factor model [60]. Further, the bifactor models supported the presence of two latent SCS factors. The first bifactor model that we tested, with one general factor (SCS) and two group factors (SCS positive and SCS negative), had a large omega value but much lower omega-h and ECV values. This shows that little of the variance in SCS total scores is due to the general factor and, conversely, that a considerable proportion of the variance in total scores is due to the two group factors, i.e., SCS positive and SCS negative. Further, omega-h group coefficients showed a high value for the SCS positive factor and a much lower value for the SCS negative factor. This suggests that the SCS positive factor captures most of the variance in the SCS construct, raising questions about the relevance of the negative factor to the overall SCS construct. The second bifactor model we tested, with two general factors and six group factors, yielded high omega and omega-h coefficients and high ECV values for the general factors, and virtually nil omega-h group coefficients for the six group factors. The omega-h group coefficients, which indicate the reliability of additional variance (i.e., beyond the variance explained by the two general factors), provided evidence against the six factors, suggesting that they do not reliably explain additional variance beyond the two general factors, SCS positive and SCS negative.
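
To make the bifactor indices discussed above more concrete, the sketch below computes omega, omega-h, ECV, and omega-h group coefficients from standardized loadings, using the commonly cited formulas (see [51]). It assumes orthogonal factors, one group factor per item, and uniquenesses equal to one minus the sum of squared loadings. The function name `bifactor_indices` and all loading values are hypothetical; they are chosen only to mimic the pattern of a high omega with low omega-h and ECV, and they are not the estimates from this study.

```python
import numpy as np

def bifactor_indices(general, group, membership):
    """Return (omega, omega_h, ecv, omega_h_group) from standardized loadings.

    general    : loading of each item on the general factor
    group      : loading of each item on its single group factor
    membership : group-factor label for each item
    """
    g = np.asarray(general, dtype=float)
    s = np.asarray(group, dtype=float)
    m = np.asarray(membership)
    unique = 1.0 - g**2 - s**2                  # item uniquenesses (assumed)
    labels = np.unique(m)

    # Variance in the total score attributable to each source
    general_var = g.sum() ** 2
    group_var = sum(s[m == lab].sum() ** 2 for lab in labels)
    total_var = general_var + group_var + unique.sum()

    omega = (general_var + group_var) / total_var
    omega_h = general_var / total_var
    ecv = (g**2).sum() / ((g**2).sum() + (s**2).sum())

    # Reliable subscale variance that is independent of the general factor
    omega_h_group = {}
    for lab in labels:
        idx = m == lab
        sub_total = g[idx].sum() ** 2 + s[idx].sum() ** 2 + unique[idx].sum()
        omega_h_group[str(lab)] = s[idx].sum() ** 2 / sub_total
    return omega, omega_h, ecv, omega_h_group

# Hypothetical loadings: weak general factor, strong group factors
general = [0.3] * 6
group = [0.6] * 6
membership = ["pos", "pos", "pos", "neg", "neg", "neg"]
omega, omega_h, ecv, omega_h_group = bifactor_indices(general, group, membership)
print(round(omega, 2), round(omega_h, 2), round(ecv, 2))
```

With these illustrative loadings, omega is considerably larger than omega-h, which is the pattern that argues against interpreting a total score as a measure of a single general construct.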

In the six-factor model, the magnitude of the correlations among the positive latent factors and among the negative latent factors suggested substantial redundancy within each set. We reported both correlations between SCS latent factors, computed as part of the CFA in Mplus (range .76–.93), and correlations between SCS subscale scores, computed in SPSS (range .60–.68). Measurement error is, by definition, unrelated to the construct being measured, and it therefore attenuates the observed correlation between variables. This is why the SPSS correlations are lower than the Mplus correlations; the latter describe the relationship between the constructs more accurately. Given the magnitude of the correlations within each set, positive and negative, it seems reasonable to suggest that the sub-constructs within each set overlap so much that it is difficult to distinguish between them [64]. Further, the correlation matrix showed that the positively phrased SCS subscales (self-kindness, common humanity, and mindfulness) and the SCS positive subscale showed a similar pattern of associations with the other study measures. In the same vein, the three negatively phrased SCS subscales (self-judgment, isolation, and over-identification) and the SCS negative subscale also showed a similar pattern of associations with the study measures. These patterns of associations did not suggest an advantage of using the six SCS subscales over the two SCS subscales, i.e., SCS positive and SCS negative.
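
The attenuation argument can be illustrated with the classical correction for attenuation, which divides an observed correlation by the square root of the product of the two reliabilities. The sketch below is an illustration under stated assumptions, not the procedure used in the CFA: it treats Cronbach's alpha as the reliability estimate, and it pairs the lower bound of the reported subscale correlations (.60) with the alphas reported for self-kindness (.79) and common humanity (.73) purely as an example. The function name `disattenuate` is hypothetical.

```python
import math

def disattenuate(r_observed, rel_x, rel_y):
    """Classical correction for attenuation due to measurement error."""
    return r_observed / math.sqrt(rel_x * rel_y)

# Illustrative inputs only: observed r = .60, reliabilities .79 and .73
r_corrected = disattenuate(0.60, 0.79, 0.73)
print(round(r_corrected, 2))  # 0.79
```

The corrected value falls within the range of the latent factor correlations, which is consistent with the explanation that the summed-score correlations are lower mainly because they contain measurement error.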

Given all of these considerations, the two-factor solution, with SCS positive and SCS negative latent factors, was deemed to be the best fit for our observed data. The two-factor solution we identified in this study is consistent with previous factor-analytic studies, which also found a two-factor structure for the SCS [24, 26–28], and with theoretical stances that the positive subscale and the negative subscale of the SCS are inherently different from each other [28–30]. While the two SCS factors that we identified seem robust, the convergent and divergent analyses paint an unclear picture. Contrary to our hypothesis, which was informed by literature reviews on the relationship between self-compassion and well-being as well as psychopathology, we found that the SCS positive factor was unrelated to distress and only minimally related to positive self-evaluations. Future research is needed to determine whether this unexpected finding replicates and to generate an explanation for it. In agreement with our hypothesis, the SCS negative factor was consistently associated with measures of distress and negative self-evaluations. While it is premature to make definitive claims about the positive SCS factor given the limited number of measures included in the study, it seems reasonable to suggest that the SCS negative factor assesses tendencies towards self-criticism, over-identification with negative self-statements, and an overall negative cognitive style, all of which represent symptoms of depression [24, 26, 27, 65] and/or maladaptive interpersonal styles, such as neuroticism [28, 34].

Strengths of the present study were the large, representative sample of the general population and the comprehensive modelling approach. There are some limitations to be considered when interpreting the results of this study. First, this study used a German translation of the SCS; it is therefore unclear whether the results generalize to other languages or cultures. Second, this study used a general population sample, which arguably differs from the more psychologically minded samples, such as social science students or users of psychology-related websites, on which the English SCS was first developed and tested and which were used to initially validate the German SCS. The language and population considerations could have affected the model fit indices we found in the current study, which were less than ideal, but overall they provided a reasonable (adequate) fit for our data. Furthermore, given the limited number of measures for convergent and divergent validity testing included in this study, particularly the lack of measures assessing positive constructs (e.g., optimism, hopefulness, quality of life), it is difficult to demonstrate the construct validity of the SCS positive and negative factors. Future research is needed to elucidate whether the negative items of the SCS capture aspects that are unique to self-compassion or whether they tap vulnerabilities to depression (e.g., self-criticism) and/or other psychopathology [34]. Future research is also needed to cross-validate our results in more culturally and linguistically heterogeneous samples. Future research could also use techniques such as exploratory structural equation modeling (ESEM) to further elucidate the structure of the SCS across various populations.

In sum, this study, using data collected in Germany, found that three models of the SCS produced similar fit indices, with no substantive difference: a two-factor model, a six-factor model, and a hierarchical model with six first-order factors and two second-order (higher-order) factors. Notably, the six-factor model hypothesized by Neff produced fit indices similar to those of a six-factor model with positive and negative items arbitrarily assigned to three positive and three negative factors [12]. The bifactor models supported a two-factor solution. The one-factor and three-factor models showed poor fit. While six factors can be modeled, the three negative factors and the three positive factors, respectively, do not seem to reflect reliable or meaningful variance beyond one positive and one negative item factor. As such, we recommend the use of two subscale scores to capture a positive factor and a negative factor when administering the German SCS to general population samples and we strongly advise against the use of a total score across all SCS items.

## Supporting information

### S1 Table. Summary of results from factor analytic studies assessing the structure of the Self-Compassion Scale.

https://doi.org/10.1371/journal.pone.0190771.s001

(PDF)

### S2 Table. Distribution of the Self-Compassion Scale items to three positive arbitrary and three negative arbitrary factors.

https://doi.org/10.1371/journal.pone.0190771.s002

(PDF)

### S3 Table. Factor loadings and Confidence Intervals (CI) for the six-factor structure of the Self-Compassion Scale.

https://doi.org/10.1371/journal.pone.0190771.s003

(PDF)

## Acknowledgments

We would like to thank Ms. Momoka Sunohara, MSc, PhD Candidate at Concordia University, Montreal, Canada, for her help with the translation of relevant literature on the validation of the Japanese version of the Self-Compassion Scale.

## References

- 1. Zessin U, Dickhäuser O, Garbade S. The relationship between self-compassion and well-being: A meta-analysis. Appl Psychol. 2015;7(3):340–64. pmid:26311196
- 2. MacBeth A, Gumley A. Exploring compassion: A meta-analysis of the association between self-compassion and psychopathology. Clin Psychol Rev. 2012;32(6):545–52. pmid:22796446
- 3. Neff KD, Germer CK. A pilot study and randomized controlled trial of the Mindful Self-Compassion Program. J Clin Psych. 2013;69(1):28–44. pmid:23070875
- 4. Beaumont EA, Galpin AJ, Jenkins PE. 'Being kinder to myself': A prospective comparative study, exploring post-trauma therapy outcome measures, for two groups of clients, receiving either cognitive behaviour therapy or cognitive behaviour therapy and compassionate mind training. Couns Psych Rev. 2012;27(1):31–43.
- 5. Birnie K, Speca M, Carlson LE. Exploring self-compassion and empathy in the context of mindfulness based stress reduction (MBSR). Stress Health. 2010;26(5):359–71.
- 6. Shapira LB, Mongrain M. The benefits of self-compassion and optimism exercises for individuals vulnerable to depression. J Posit Psychol. 2010;5(5):377–89.
- 7. Gilbert P, Procter S. Compassionate mind training for people with high shame and self-criticism: Overview and pilot study of a group therapy approach. Clin Psychol Psychother. 2006;13(6):353–79.
- 8. Gilbert P. Introducing compassion-focused therapy. Adv Psychiatr Treat. 2009;15(3):199–208.
- 9. Gilbert P, Irons C. Focused therapies and compassionate mind training for shame and self-attacking. In: Gilbert P, editor. Compassion: Conceptualisations, research and use in psychotherapy. New York, NY: Routledge; 2005. p. 263–325.
- 10. Jazaieri H, Jinpa GT, McGonigal K, Rosenberg EL, Finkelstein J, Simon-Thomas E, et al. Enhancing compassion: A randomized controlled trial of a compassion cultivation training program. J Happiness Stud. 2013;14(4):1113–26.
- 11. Keng SL, Smoski MJ, Robins CJ, Ekblad AG, Brantley JG. Mechanisms of change in mindfulness-based stress reduction: Self-compassion and mindfulness as mediators of intervention outcomes. J Cogn Psychother. 2012;26(3):270–80.
- 12. Neff KD. The development and validation of a scale to measure self-compassion. Self Identity. 2003;2(3):223–50.
- 13. Mills A, Gilbert P, Bellew R, McEwan K, Gale C. Paranoid beliefs and self-criticism in students. Clin Psychol Psychother. 2007;14(5):358–64.
- 14. Ying YW. Contribution of self-compassion to competence and mental health in social work students. J Soc Work Educ. 2009;45(2):309–23.
- 15. Neff KD. Self-compassion: An alternative conceptualization of a healthy attitude toward oneself. Self Identity. 2003;2(2):85–101.
- 16. Neff KD. The Self-Compassion Scale is a valid and theoretically coherent measure of self-compassion. Mindfulness. 2016;7(1):264–74.
- 17. Garcia-Campayo J, Navarro-Gil M, Andrés E, Montero-Marin J, López-Artal L, Demarzo MMP. Validation of the Spanish versions of the long (26 items) and short (12 items) forms of the Self-Compassion Scale (SCS). Health Qual Life Outcomes. 2014;12(4):1–9. pmid:24410742
- 18. Hupfeld J, Ruffieux N. Validation of a german version of the self-compassion scale (SCS-D) [Validierung einer deutschen version der self-compassion scale (SCS-D)]. Z Klin Psychol Psychother. 2011;40(2):115–23.
- 19. Petrocchi N, Ottaviani C, Couyoumdjian A. Dimensionality of self-compassion: Translation and construct validation of the Self-Compassion Scale in an Italian sample. J Ment Health. 2014;23(2):72–7. pmid:24328923
- 20. Williams MJ, Dalgleish T, Karl A, Kuyken W. Examining the factor structures of the five facet mindfulness questionnaire and the Self-Compassion Scale. Psychol Assess. 2014;26(2):407–18. pmid:24490681
- 21. Castilho P, Pinto-Gouveia J, Duarte J. Evaluating the multifactor structure of the long and short versions of the Self-Compassion Scale in a clinical sample. J Clin Psych. 2015;71(9):856–70. pmid:25907562
- 22. Arimitsu K. [Development and validation of the Japanese version of the Self-Compassion Scale]. Shinrigaku kenkyu. 2014;85(1):50–9. pmid:24804430
- 23. Neff KD, Whittaker TA, Karl A. Examining the factor structure of the Self-Compassion Scale in four distinct populations: Is the use of a total scale score justified? J Pers Assess. 2017:1–12. pmid:28140679
- 24. Costa J, Marôco J, Pinto-Gouveia J, Ferreira C, Castilho P. Validation of the psychometric properties of the Self-Compassion Scale: Testing the factorial validity and factorial invariance of the measure among borderline personality disorder, anxiety disorder, eating disorder and general populations. Clin Psychol Psychother. 2015;22(1):75–82.
- 25. Tóth-Király I, Bőthe B, Orosz G. Exploratory structural equation modeling analysis of the Self-Compassion Scale. Mindfulness. 2016:1–12.
- 26. Brenner RE, Heath PJ, Vogel DL, Credé M. Two is more valid than one: Examining the factor structure of the Self-Compassion Scale (SCS). J Couns Psych. 2017:1–12. pmid:28358523
- 27. López A, Sanderman R, Smink A, Zhang Y, van Sonderen E, Ranchor A, et al. A reconsideration of the Self-Compassion Scale’s total score: Self-compassion versus self-criticism. PloS One. 2015;10(7):e0132940. pmid:26193654
- 28. Pfattheicher S, Geiger M, Hartung J, Weiss S, Schindler S. Old wine in new bottles? The case of self-compassion and neuroticism. Eur J Pers. 2017;31(2):160–9.
- 29. Neff KD. Self-compassion. In: Leary MR, Hoyle RH, editors. Handbook of Individual Differences in Social Behavior. New York: Guilford Press; 2009. p. 561–73.
- 30. Gilbert P, McEwan K, Matos M, Rivis A. Fears of compassion: Development of three self-report measures. Psychol Psychother. 2011;84(3):239–55. pmid:22903867
- 31. Montero-Marín J, Gaete J, Demarzo M, Rodero B, Lopez LCS, García-Campayo J. Self-criticism: A measure of uncompassionate behaviors toward the self, based on the negative components of the self-compassion scale. Front Psychol. 2016;7. pmid:27625618
- 32. Kotsou I, Leys C. Self-Compassion Scale (SCS): Psychometric properties of the French translation and its relations with psychological well-being, affect and depression. PloS One. 2016;11(4):e0152880. pmid:27078886
- 33. de Souza LK, Hutz CS. Adaptation of the self-compassion scale for use in Brazil: Evidences of construct validity. Trends in Psychology [Temas em Psicologia]. 2016;24(1):159–72.
- 34. Muris P, Petrocchi N. Protection or vulnerability? A meta-analysis of the relations between the positive and negative components of self-compassion and psychopathology. Clin Psychol Psychother. 2016. pmid:26891943
- 35. Zenger M, Körner A, Maier GW, Hinz A, Stöbel-Richter Y, Brähler E, et al. The Core Self-Evaluation Scale: Psychometric properties of the German version in a representative sample. J Pers Assess. 2015;97(3):310–8. pmid:25531806
- 36. Körner A, Coroiu A, Copeland L, Gomez-Garibello C, Albani C, Zenger M, et al. The role of self-compassion in buffering symptoms of depression in the general population. PloS One. 2015. pmid:26430893
- 37. Kroenke K, Spitzer RL, Williams JBW. The PHQ-9: Validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606–13. pmid:11556941
- 38. Kocalevent RD, Hinz A, Brähler E. Standardization of the depression screener Patient Health Questionnaire (PHQ-9) in the general population. Gen Hosp Psychiatry. 2013;35(5):551–5. pmid:23664569
- 39. Martin A, Rief W, Klaiberg A, Braehler E. Validity of the Brief Patient Health Questionnaire Mood Scale (PHQ-9) in the general population. Gen Hosp Psychiatry. 2006;28(1):71–7. http://dx.doi.org/10.1016/j.genhosppsych.2005.07.003. pmid:16377369
- 40. Spitzer R, Kroenke K, Williams J, Löwe B. A brief measure for assessing generalized anxiety disorder: The GAD-7. Arch Intern Med. 2006;166(10):1092–7. pmid:16717171
- 41. Kroenke K, Spitzer RL, Williams JBW, Monahan PO, Löwe B. Anxiety disorders in primary care: Prevalence, impairment, comorbidity, and detection. Ann Intern Med. 2007;146(5):317–25. pmid:17339617
- 42. Judge TA, Erez A, Bono JE, Thoresen CJ. The Core Self-Evaluations Scale: Development of a measure. Pers Psychol. 2003;56(2):303–31.
- 43. Stumpp T, Muck PM, Hülsheger UR, Judge TA, Maier GW. Core Self-Evaluations in Germany: Validation of a German measure and its relationships with career success. Appl Psychol. 2010;59(4):674–700.
- 44. Muthén LK, Muthén BO. Mplus user's guide. 6th ed. Los Angeles: Muthén & Muthén; 2010.
- 45. Breivik E, Olsson UH. Adding variables to improve fit: the effect of model size on fit assessment in LISREL. In: Cudeck R, Jöreskog KG, Sörbom D, editors. Structural equation modeling: Present and future. A festschrift in honor of Karl Jöreskog. Lincolnwood, IL: Scientific Software International; 2001. p. 169–94.
- 46. Meade AW. Power of AFI’s to Detect CFA Model Misfit. 23rd Annual Conference of the Society for Industrial and Organizational Psychology; San Francisco, CA; 2008.
- 47. Reise SP. The rediscovery of bifactor measurement models. Multivariate Behav Res. 2012;47(5):667–96. pmid:24049214
- 48. Reise SP, Moore TM, Haviland MG. Bifactor models and rotations: Exploring the extent to which multidimensional data yield univocal scale scores. J Pers Assess. 2010;92(6):544–59. pmid:20954056
- 49. Cook KF, Kallen MA, Amtmann D. Having a fit: Impact of number of items and distribution of data on traditional criteria for assessing IRT’s unidimensionality assumption. Qual Life Res. 2009;18(4):447–60. pmid:19294529
- 50. Reise SP, Bonifay WE, Haviland MG. Scoring and modeling psychological measures in the presence of multidimensionality. J Pers Assess. 2013;95(2):129–40. pmid:23030794
- 51. Rodriguez A, Reise SP, Haviland MG. Applying bifactor statistical indices in the evaluation of psychological measures. J Pers Assess. 2016;98(3):223–37. pmid:26514921
- 52. Reise SP, Widaman KF, Pugh RH. Confirmatory factor analysis and item response theory: Two approaches for exploring measurement invariance. Psychol Bull. 1993;114(3):552–66. pmid:8272470
- 53. Tucker LR, Lewis C. A reliability coefficient for maximum likelihood factor analysis. Psychometrika. 1973;38(1):1–10.
- 54. Bentler PM. Comparative fit indexes in structural models. Psychol Bull. 1990;107(2):238–46. pmid:2320703
- 55. Steiger JH. Structural model evaluation and modification: An interval estimation approach. Multivariate Behav Res. 1990;25(2):173–80. pmid:26794479
- 56. Hu L, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Struct Equ Modeling. 1999;6(1):1–55.
- 57. Kline RB. Principles and practice of structural equation modeling. 2nd ed. New York: Guilford Press; 2005.
- 58. Browne MW, Cudeck R. Alternative ways of assessing model fit. In: Bollen KA, Long JS, editors. Testing structural equation models. Newbury Park: Sage; 1993. p. 136–62.
- 59. McDonald RP, Ho MHR. Principles and practice in reporting structural equation analyses. Psychol Methods. 2002;7(1):64–82. pmid:11928891
- 60. Cheung GW, Rensvold RB. Evaluating goodness-of-fit indexes for testing measurement invariance. Struct Equ Modeling. 2002;9(2):233–55.
- 61. Bjorner JB, Rose M, Gandek B, Stone AA, Junghaenel DU, Ware JE. Method of administration of PROMIS scales did not significantly impact score level, reliability, or validity. J Clin Epidemiol. 2014;67(1):108–13. pmid:24262772
- 62. Cohen J. Statistical power analysis for the behavioural sciences. 2nd ed. Hillsdale: Lawrence Erlbaum Associates; 1988.
- 63. Zwick R, Thayer DT, Mazzeo J. Descriptive and inferential procedures for assessing differential item functioning in polytomous items. Appl Meas Educ. 1997;10(4):321–44.
- 64. Nunnally JC, Bernstein IH. The assessment of reliability. New York: McGraw-Hill; 1994.
- 65. Gilbert P, McEwan K, Catarino F, Baião R. Fears of compassion in a depressed population: Implication for psychotherapy. J Depress Anxiety. 2014;S2(003):1–8.