An empirically derived recommendation for the classification of body dysmorphic disorder: Findings from structural equation modeling

Body dysmorphic disorder (BDD), together with its subtype muscle dysmorphia (MD), has been relocated from the Somatoform Disorders category in the DSM-IV to the newly created Obsessive-Compulsive and Related Disorders category in the DSM-5. Both categorizations have been criticized, and an empirically derived classification of BDD is lacking. A community sample of N = 736 participants completed an online survey assessing different psychopathologies. Using a structural equation modeling approach, six theoretically derived models, which differed in their allocation of BDD symptoms to various factors (i.e. general psychopathology, somatoform, obsessive-compulsive and related disorders, affective, body image, and BDD model) were tested in the full sample and in a restricted sample (n = 465) which indicated primary concerns other than shape and weight. Furthermore, measurement invariance across gender was examined. Of the six models, only the body image model showed a good fit (CFI = 0.972, RMSEA = 0.049, SRMR = 0.027, TLI = 0.959), and yielded better AIC and BIC indices than the competing models. Analyses in the restricted sample replicated these findings. Analyses of measurement invariance of the body image model showed partial metric invariance across gender. The findings suggest that a body image model provides the best fit for the classification of BDD and MD. This is in line with previous studies showing strong similarities between eating disorders and BDD, including MD. Measurement invariance across gender indicates a comparable presentation and comorbid structure of BDD in males and females, which also corresponds to the equal prevalence rates of BDD across gender.

Introduction
Body dysmorphic disorder (BDD) is characterized by excessive concerns about perceived flaws in one's appearance (e.g. a crooked nose, skin blemishes, or not being sufficiently muscular in the case of the muscle dysmorphia [MD] subtype) and associated behavioral or mental rituals to hide, improve, or control these flaws [1]. Despite being a debilitating disorder [2,3] with high rates of suicidality [4], research interest in this area has only begun to grow in the last few years. Recently, the diagnostic entity of BDD has been reassigned: While it was classified as a subtype of hypochondriasis in the Somatoform Disorders category in the fourth version of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV), it now represents a stand-alone diagnosis within the Obsessive-Compulsive and Related Disorders (OCRD) category in the DSM-5 [5,6]. It is also set to be replaced accordingly in the upcoming with depression across important domains including comorbidity, family history, course of the disorder, and cognitive biases [27,28]. Thus, conceptualizing BDD as a disorder within a broader affective spectrum alongside anxieties, OCD, and depression might be another option (affective model).
Moreover, there is a fair amount of research hinting at a comparability of BDD and eating disorders. These disorders not only share the hallmark feature of a disturbed body image [29], but also show resemblances regarding onset and course as well as cognitive biases [30][31][32], with a particular similarity between anorexia nervosa and BDD [31]. Therefore, it has been proposed that BDD and the eating disorders might form a body image spectrum of disorders [33]. Such a categorization could also have the advantage of including muscle dysmorphia (MD), a subtype of BDD. The classification of MD has been a topic of discussion ever since it was coined as "reverse anorexia" by Pope et al. [34], given its large symptom overlap with the eating disorders [35]. Dos Santos Filho et al. [36] concluded from a systematic review that there is not sufficient scientific evidence to support the inclusion of MD in any existing category of psychological disorders. However, a conceptualization of BDD with MD as a subtype, in a body image disorders spectrum within a so-called body image model that also includes eating disorders, might pave the way for future discussions [37].
Finally, BDD still shows some unique features with respect to symptomatology, which actually complicates its identification and treatment [38]. This indicates that BDD might be seen as a unique factor which is separate from somatoform, affective, and eating disorders, as has already been suggested in an adolescent sample [39]. A major benefit of such a BDD model may lie in greater diagnostic accuracy and better treatment, by prompting health care workers to check for the unique characteristics of BDD [40].
Despite the aforementioned evidence to support the various classification suggestions, no investigation has been conducted using a bottom-up approach, with the exception of the aforementioned study in adolescents by Schneider and colleagues [39], which supported the BDD-only model. That is, no study has used real data across various symptoms and fitted these data to stipulated classification models. While Schneider et al. [39] provided valuable insights into the uniqueness of BDD symptoms in the classification in adolescents, their sample had a narrow age range (around the age of onset of BDD). Moreover, other relevant psychopathology (e.g. depressive symptoms, eating disorders; [41,42]) has not yet been comprehensively examined. Furthermore, Schneider et al.'s study failed to include the BDD subtype MD, which may have influenced the proximity to eating disorders in terms of classification. As the study also excluded somatoform disorders, the authors were unable to provide information about the usefulness of the DSM-IV/ICD-10 classification of BDD. Finally, the study did not include OC-related disorders other than OCD itself, such as skin picking and hair pulling, thus reducing the informative value of the OCRD model. Therefore, the aim of the present study was to provide empirically derived recommendations for the classification of BDD and MD in adults. We thus analyzed the best classification of both BDD and MD symptoms based on shared phenomenological and comorbidity aspects, taking into account various aspects of related psychopathology such as those of OC and related disorders (OCD, skin picking, and hair pulling), eating disorders, SAD and panic disorders, depression, illness anxiety disorder, and somatoform disorder. To assess these aspects of psychopathology, a large community and student sample completed an online survey.
Based on the reviewed literature, mainly of transdiagnostic studies comparing BDD mostly with OCD, eating disorders, anxiety, and depression as well as the study by Schneider and colleagues [39] analyzing psychopathological data in adolescents, but also past and current classification of BDD, six potential classification models were specified. Models differed according to their classification of BDD symptoms within different categories and consisted of a general psychopathology (in which one factor loaded on all indicators), somatoform, an OCRD, an affective, a body image, and a BDD model. The best fitting model was also tested with respect to equivalence in male and female subsamples due to differences in prevalence rates of disorders included as well as gender-specific subtypes of specific disorders, in particular BDD with the subtype MD. Furthermore, the models were retested in a subgroup that indicated primary concerns other than weight and shape on the questionnaire assessing BDD symptoms. This should rule out an inadvertent overestimation of the association between eating disorder and BDD pathology due to high scores on the BDD measures originally stemming from high shape and weight concerns.

Participants and recruitment
The ethics committee of Osnabrück University approved the study (no approval number available). Written informed consent was obtained from the participants. Recruitment was conducted through university press releases at the institutions of most authors, advertisements on social media, and flyers, with the aim of recruiting a non-clinical community sample of participants aged 18 years and older. We chose a community-based rather than a clinical sample in order to ensure that the comorbidity structures were representative of the general population [43]. Of N = 1166 persons who began the survey, n = 743 completed it (i.e., filled out all questionnaires included in this main manuscript; results of two further questionnaires will be reported elsewhere). Three participants were excluded as they were below the age of 18 years and four participants were excluded because they did not clearly identify as male or female. Thus, 736 participants were included in the analyses. As reimbursement for study participation, all participants who completed the survey were given the opportunity to enter a raffle to win one of 20 online shopping vouchers worth 20 Euros each, or received student credit if they were students at one of the participating institutions.

Procedure
The questionnaires (see description below) were programmed in the online survey software Unipark (QuestBack GmbH, Cologne, Germany). The landing page of the survey informed potential participants about the study aim (assessment of various symptoms in order to improve assessment and consequently treatment options) and duration (approximately 45 minutes) as well as privacy and confidentiality aspects of the study. After providing informed consent and completing the survey, participants who indicated their interest in taking part in the raffle or in receiving student credits were redirected to a separate page on which they provided their contact information.

Measures
In the following, the employed measures are described in order of their appearance in the survey. Unless otherwise indicated, the total scores of the instruments were employed. Instruments were chosen based on their brevity, representativeness of symptomatology, and frequency of use. Internal consistencies of these scores are presented in Table 1 and are good to excellent.
Demographic questionnaire. Assessed demographic and clinical variables included gender, age, sexual orientation, educational attainment, psychotropic medication, and current or lifetime diagnosis of a mental disorder.
The German-language Body Dysmorphic Symptoms Inventory (Fragebogen Körperdysmorpher Symptome, FKS; [44]) consists of 18 items assessing body dysmorphic disorder symptoms. The items can be summed into two subscales: "Specific BDD Symptoms" (items 1, 4-15) and "Associated Features" (items 16-18). Items are scaled from 0 "not at all, never, do not think about it" to 4 "very strongly so, over 5-times a week, over 8 hours a day".
The Muscle Dysmorphia Disorder Inventory (MDDI; [45]; German version; [46]) is a 13-item measure assessing symptoms associated with MD on three subscales (Drive for Size, Appearance Intolerance, Functional Impairment). Each item is rated on a 5-point Likert scale ranging from 1 "never" to 5 "always".
The Eating Disorder Examination-Questionnaire (EDE-Q; German-language version; [47]) assesses eating disorder psychopathology referring to the past 28 days. It consists of 22 items which are allocated to four subscales (Dietary Restraint, Eating Concern, Weight Concern, and Shape Concern), and scaled on a seven-point Likert scale from 0 "no day, not at all" to 6 "each day, markedly". Six additional items assess the frequencies of eating disorder behaviors, but were not used in the present study.
The Obsessive-Compulsive Inventory-Revised (OCI-R; German-language version; [48]) is an 18-item questionnaire assessing OCD symptom severity within the last month. Each item is rated on a 5-point Likert scale ranging from 0 "not at all" to 4 "extremely". The OCI-R comprises six subscales (Washing, Obsessing, Hoarding, Ordering, Checking, and Neutralizing).
The Liebowitz Social Anxiety Scale (LSAS; German-language version; [49]) assesses avoidance and fear in 24 situations that are likely to elicit social anxiety. Thirteen of the 24 items refer to performance situations and the remaining eleven items assess social interaction situations. For each of the 24 situations, the clinician derives ratings of avoidance and fear experienced by the respondent in the past week on a 4-point Likert scale. The fear scale ratings range from 0 "no fear" to 3 "severe fear" while the avoidance ratings also range from 0 to 3 (based on the percentage of time spent avoiding the particular situation, from 0 "never" to 3 "usually" [67-100%]).
The Patient Health Questionnaire depression module (PHQ-9; German-language version; [50]) is a nine-item measure of depression severity. The items are rated on a 4-point Likert scale ranging from 0 "not at all" to 3 "nearly every day".
The Short Health Anxiety Inventory, modified version (mSHAI; German-language version; [51]) comprises 14 items assessing the severity of health anxiety. Items are rated on a 5-point Likert scale ranging from 1 "strongly disagree" to 5 "strongly agree".
The Massachusetts General Hospital Hairpulling Scale (MGHHPS; [52]; unpublished German translation) assesses the severity of repetitive hair pulling. It contains seven items, which are rated on a 5-point Likert scale from 0 "no urges, none, always in control" to 4 "near constant, extreme, never able to distract".
The Skin Picking Scale-Revised (SPS-R; German-language version; [53]) measures skin picking disorder severity and impairment using eight items rated on a 5-point Likert scale from 0 "none" to 4 "extreme".
From the Hamburg Modules for the Assessment of Psychosocial Health, short version (HEALTH-49 Kurzform; Hamburger Module zur Erfassung allgemeiner Aspekte psychosozialer Gesundheit für die therapeutische Praxis; [54]), we used the subscale Physical Complaints for the present study. This subscale contains seven items asking about physical pain or complaints in the last two weeks, rated on a 5-point Likert scale from 1 "not at all" to 5 "extremely".
The German-language Questionnaire on Body-Related Fears, Cognitions and Avoidance (AKV; Fragebogen zu körperbezogenen Ängsten, Kognitionen und Vermeidung; [55]) measures the severity of panic disorder symptoms. It encompasses the German versions of three questionnaires: the Body Sensation Questionnaire (BSQ; [56]), the Agoraphobic Cognitions Questionnaire (ACQ; [56]) and the Mobility Inventory (MI; [57]). In the present study, only the ACQ was used. The ACQ assesses the frequency of frightening or maladaptive thoughts about the consequences of panic and anxiety. It contains 14 items rated on a 5-point Likert scale ranging from 1 "thought never occurs" to 5 "thought always occurs when I am nervous".

Data analysis
Structural equation modeling (SEM) analyses were conducted with R v3.5.0 using the package lavaan v0.6-3 [58]. For all other analyses, we used SPSS v25 (IBM; Armonk, New York, USA). SEM analyses were performed by analyzing the covariance matrix of the symptom measures using maximum likelihood estimation. Factor intercorrelations were freely estimated in each model. As is customary, the loading of the first indicator of every factor (i.e. the uppermost indicator in Fig 1) was fixed at 1. Errors were not allowed to correlate. No post hoc modifications of the models were carried out.
As the data do not follow a multivariate normal distribution, we applied Satorra-Bentler scaled χ² statistics (SB-χ²; [59]) and related robust fit indices. With a larger sample size, the (scaled) χ² becomes increasingly sensitive to very small model-data discrepancies. Therefore, we focused on additional fit indices, which are less influenced by sample size. The following indices with cutoff values following the recommendations of Hu and Bentler [60] and Schermelleh-Engel, Moosbrugger, and Müller [61] are reported: (a) comparative fit index (CFI ≥ .95 acceptable, ≥ .97 good), (b) Tucker-Lewis index (TLI ≥ .95 acceptable, ≥ .97 good), (c) root mean square error of approximation (RMSEA ≤ .08 acceptable, ≤ .05 good), (d) standardized root mean square residual (SRMR < .10 acceptable, < .05 good), and (e) Akaike Information Criterion, AIC, and Bayes Information Criterion, BIC (no cutoff values, smaller values indicate better fit). For RMSEA, the 90% confidence interval is also reported. If the lower bound of this confidence interval is 0, the hypothesis of "exact-fit" (i.e. the null hypothesis that RMSEA in the population is 0) is retained (α = 0.05). In addition, the less stringent "close-fit hypothesis" (i.e. the null hypothesis that RMSEA in the population is less than or equal to .05) is tested. A p-value of this test (denoted as p_close) greater than α (i.e. statistically non-significant) supports the model. In addition to the global fit measures, we present the fully standardized solution of the best-fitting model. Significance tests of parameter estimates are based on robust standard errors (i.e. the MLM estimator in lavaan). These analyses were repeated with a restricted sample (n = 465) which did not indicate primary shape and weight concerns in the FKS (scores ≤ 2 [scaled 0 "not at all" to 4 "very strongly so"] on item 3 "Is your main concern about your appearance that you are not thin enough or could become fat?").
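The cutoff rules listed above can be written down as a small lookup, which makes the "good" versus "acceptable" distinction explicit. The following is a minimal sketch, not part of the original analysis (which was run in R with lavaan); the function name `classify_fit` and the table `CUTOFFS` are hypothetical, and the inequality directions assume that higher CFI and TLI and lower RMSEA and SRMR indicate better fit. The input values are the global fit indices reported for the body image model.

```python
# Cutoffs as described in the text (after Hu & Bentler and Schermelleh-Engel et al.).
# "strict" marks indices whose cutoffs are stated with "<" rather than "<=".
CUTOFFS = {
    "CFI":   {"higher_is_better": True,  "acceptable": 0.95, "good": 0.97, "strict": False},
    "TLI":   {"higher_is_better": True,  "acceptable": 0.95, "good": 0.97, "strict": False},
    "RMSEA": {"higher_is_better": False, "acceptable": 0.08, "good": 0.05, "strict": False},
    "SRMR":  {"higher_is_better": False, "acceptable": 0.10, "good": 0.05, "strict": True},
}


def classify_fit(index, value):
    """Label a fit index value as 'good', 'acceptable', or 'poor'."""
    rule = CUTOFFS[index]

    def meets(cutoff):
        if rule["higher_is_better"]:
            return value >= cutoff
        return value < cutoff if rule["strict"] else value <= cutoff

    if meets(rule["good"]):
        return "good"
    if meets(rule["acceptable"]):
        return "acceptable"
    return "poor"


# Global fit indices of the body image model as reported in this article.
bim = {"CFI": 0.972, "TLI": 0.959, "RMSEA": 0.049, "SRMR": 0.027}
ratings = {name: classify_fit(name, value) for name, value in bim.items()}
print(ratings)
```

Applied to these values, the sketch labels CFI, RMSEA, and SRMR as good and TLI as acceptable.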
If the best-fitting model shows an acceptable fit and the estimated parameters are as expected, the equivalence of the model in the male and female subsamples can be tested using multi-group SEM. In a first step, the model is tested for men and women separately. If the model fits well in both subsamples, one proceeds with a series of hierarchically nested models which allow increasingly stronger forms of invariance to be examined [62].
The first multi-group model is the baseline model, in which all parameters in both groups are freely estimated. A good fit of this baseline model speaks in favor of configural invariance (Model CI). In this weakest form of invariance, the number of factors and their correspondence to the indicators is the same across both groups. If configural invariance has been established, metric invariance (Model MI) can be examined. In this step, all factor loadings are constrained to be equal in both groups. Compared to the baseline model, this nested restricted model should not be statistically worse, i.e. the ΔSB-χ² test [63] should be non-significant. Furthermore, global fit should not decline too much in practical terms. Following Chen [64], the cutoffs for a practically significant decrease in fit employed here are ΔCFI ≤ -0.01, supplemented by ΔRMSEA ≥ 0.015 or ΔSRMR ≥ 0.03 (where each Δ is the fit index of the restricted model minus the fit index of the unrestricted model). If metric invariance is supported, all indicators contribute to their factor with similar magnitude and one can proceed to test for scalar invariance (Model SI). For this purpose, the intercepts are additionally constrained to be equal across the two groups. Again, the restricted model (Model SI) is then compared to the less restricted model (Model MI) and the cutoff values proposed by Chen [64] are applied, which are identical to those above with the exception of ΔSRMR, which requires ΔSRMR ≥ 0.01. If scalar invariance can be established, it is possible to compare means on the latent factors between men and women.
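The practical-fit comparison between two nested invariance models can be illustrated as follows. This is a minimal sketch under stated assumptions: the function `flag_fit_changes` is hypothetical, differences are computed as restricted minus unrestricted (so a drop in CFI is negative and a rise in RMSEA or SRMR is positive), and the fit values in the example are invented for illustration and are not taken from the study's tables. Each index is flagged separately so that mixed results remain visible.

```python
# SRMR cutoff depends on the invariance step (values after Chen, as given in the text).
SRMR_CUTOFF = {"metric": 0.03, "scalar": 0.01}


def flag_fit_changes(unrestricted, restricted, step):
    """Flag, per index, whether fit worsened beyond the cutoff.

    Differences are restricted minus unrestricted (assumption of this sketch).
    """
    d_cfi = restricted["CFI"] - unrestricted["CFI"]
    d_rmsea = restricted["RMSEA"] - unrestricted["RMSEA"]
    d_srmr = restricted["SRMR"] - unrestricted["SRMR"]
    return {
        "CFI": d_cfi <= -0.01,
        "RMSEA": d_rmsea >= 0.015,
        "SRMR": d_srmr >= SRMR_CUTOFF[step],
    }


# Hypothetical fit indices, for illustration only.
less_restricted = {"CFI": 0.970, "RMSEA": 0.050, "SRMR": 0.040}
more_restricted = {"CFI": 0.955, "RMSEA": 0.053, "SRMR": 0.060}

flags_metric = flag_fit_changes(less_restricted, more_restricted, "metric")
flags_scalar = flag_fit_changes(less_restricted, more_restricted, "scalar")
print(flags_metric)
print(flags_scalar)
```

With these invented numbers, the same SRMR increase is tolerated at the metric step but flagged at the scalar step, because the scalar step uses the stricter SRMR cutoff.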

Model fits
For all symptom scales, the respective items were averaged over non-missing data. None of the symptom scales had more than three missing values (<0.03%), with only 34 missing values across all items and scales in total (<0.03%). Table 1 displays means, standard deviations, and reliability information of the symptom scales. The bivariate product moment correlations r between the symptom scales are presented in Table 2. Since data were not bivariate normally distributed we employed bootstrapping instead of the standard parametric significance tests for r (against 0, percentile method, 1000 samples).
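Averaging items over non-missing data, as described above, can be sketched in a few lines. This is an illustrative example only: the helper `scale_score` is hypothetical, `None` is assumed to mark a missing response, and the response vector is invented rather than taken from the data set.

```python
def scale_score(item_responses):
    """Return the mean of the answered items; None marks a missing response."""
    answered = [r for r in item_responses if r is not None]
    if not answered:
        return None  # no usable items, so no score can be computed
    return sum(answered) / len(answered)


# Hypothetical responses to a nine-item scale with one missing item.
example_responses = [1, 0, 2, None, 1, 0, 3, 1, 0]
score = scale_score(example_responses)
print(score)
```

The missing item is simply left out of both the numerator and the denominator, so the score stays on the original response scale.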
Global SEM fit indices of the six classification models (as presented in Fig 1) are shown in Table 3. The results reveal a very clear picture, with the 4-factor Body Image model (BIM) outperforming the other competing models. All global fit indices of the BIM indicated a good fit (CFI = 0.972, RMSEA = 0.049, SRMR = 0.027) or an acceptable fit (TLI = 0.959). According to the cutoffs reported above, none of the competing models yielded an acceptable fit on any of the fit measures with the exception of SRMR. Furthermore, the BIM showed a statistically better fit to the data than the general psychopathology model, ΔSB-χ²(6) = 277.24, p < .001. Since they are not nested, the BIM and the remaining models could only be compared by means of AIC and BIC. The BIM produced the lowest AIC and BIC values of all models, again indicating the best relative fit.
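The p-value attached to the reported difference statistic can be checked by referring the value 277.24 to a χ² distribution with 6 degrees of freedom. The sketch below is not the scaled difference test itself (that was computed in lavaan); it only evaluates the upper-tail probability of the reported statistic, using the closed form that holds for even degrees of freedom. The function name `chi2_sf_even_df` is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)**i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form applies to positive even df only")
    half = x / 2.0
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series


# Reported difference statistic and degrees of freedom from the text.
p_value = chi2_sf_even_df(277.24, 6)
print(p_value < 0.001)
```

The resulting probability is far below .001, in line with the reported p < .001.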
The BIM with completely standardized parameter estimates is shown in Fig 2. As expected, all factor loadings were positive and statistically significant (p < .01). The smallest loading was exhibited by the symptom "Hair pulling" on the factor "Impulse-control". All latent intercorrelations between the four factors were positive and statistically significant (p < .01). Notably, the correlation between the factors "Affective" and "Somatoform" was very high (ρ = 0.93, p < .001).

Gender invariance of the body image model
Building on the preceding results, the invariance analyses refer to the best-fitting BIM. Testing this model for both groups separately revealed good to acceptable fit for men and women (see Table 4). With one exception, all factor loadings were higher than 0.56 (completely standardized loadings, λ) and statistically significant, p < .001. The exception was the loading of the symptom "Hair pulling" on the factor "Impulse-control" (women: λ = 0.20, p < .05; men: λ = 0.45, p < .01), which was also the lowest-loading symptom in the total sample.
In the next step, parameters in both groups were estimated simultaneously. This baseline model yielded an acceptable fit, which speaks in favor of configural invariance and justifies the evaluation of more restrictive invariance models. The constraint of equal factor loadings in both groups produced a statistically significant increase in misfit (Model MI compared to Model CI, see Table 5). This speaks against metric invariance. On the other hand, the decrease in fit of the other indices was small and below the cutoff values proposed by Chen [64]. When examining the freely estimated loadings of the baseline model in detail, it became evident that, in particular, the estimate of the path from the factor "Affective syndrome" to the "Obsessive-compulsive" symptoms diverged in the two subsamples. Therefore, we tested a model for partial metric invariance (Model PMI) with this path being freely estimated in both groups [65]. Freeing this loading resulted in a stronger standardized path for men (λ = 0.710) than for women (λ = 0.574). Compared to the baseline model, no statistically and no practically significant increase in misfit occurred.
As we obtained at least partial support for metric invariance, we also tested for scalar invariance (Model SI). Using the metric and the partial metric invariance model as bases, we constrained the intercepts to be equal across both groups. For both constrained models (SI and PSI, see Table 5), the scaled SB-χ² difference tests indicated significant increases in misfit. For model PSI, ΔCFI slightly exceeded the cutoff value. The change in the other model fit indices for models SI and PSI remained below the cutoffs. Although the results are somewhat inconsistent, there was not sufficient evidence to assume scalar invariance of the BIM in male and female subsamples. Thus, the BIM demonstrated configural invariance and partial metric invariance. This means that the basic correspondence of symptoms to the underlying four factors is similar in men and women.

Replication of model fits in a sample without primary shape and weight concerns
For the restricted sample of participants who indicated that their main appearance concern did not refer to weight and/or shape, the correlations between the symptom scales are displayed above the diagonal in Table 2. We compared the six competing models in the same way as for the complete sample. From Table 6, it is apparent that the results of the SEM analyses showed the same pattern. The BIM fitted the data well, and better than all other models. Invariance analyses were not replicated, since the sample size of men in the restricted sample was only n = 112. Furthermore, for some models, improper SEM solutions emerged.
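The selection of the restricted sample amounts to a simple filter on one questionnaire item. The following sketch assumes that participants are retained when their score on FKS item 3 is 2 or lower on the 0 to 4 scale; the field name `fks_item3`, the function name, and the four example records are hypothetical.

```python
def without_primary_weight_concern(rows, cutoff=2):
    """Keep participants whose FKS item 3 score is at or below the cutoff."""
    return [row for row in rows if row["fks_item3"] <= cutoff]


# Hypothetical records, for illustration only.
participants = [
    {"id": 1, "fks_item3": 0},
    {"id": 2, "fks_item3": 3},
    {"id": 3, "fks_item3": 2},
    {"id": 4, "fks_item3": 4},
]

restricted = without_primary_weight_concern(participants)
restricted_ids = [row["id"] for row in restricted]
print(restricted_ids)
```

In this invented example, participants 2 and 4 are dropped because their main appearance concern relates to weight, while participants 1 and 3 remain.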

Discussion
The present study sought to provide an empirically based recommendation for a future classification of body dysmorphic disorder. To this aim, different classification models, i.e. with BDD symptoms being allocated to different diagnostic categories (somatoform, OCRD, affective, body image and BDD model), were tested using a structural equation modeling approach in a community-based sample of adults. Additionally, gender differences in model fits were examined.
Interestingly, neither the model that conceptualized BDD symptoms within an OCRD category nor the model with BDD symptoms as a stand-alone factor showed a satisfactory fit to the data. For the OCRD model, this strongly contradicts the current classification of BDD as a disorder in the DSM-5 [6] and the proposed classification in the ICD-11 [7] as well as studies highlighting similarities between OCD and BDD [66]. However, other empirical studies have also revealed large differences in phenomenology as well as in clinical and personality characteristics between BDD and other diagnoses of the category such as skin picking, hair pulling, or hoarding, and their combination within one category has received a great deal of criticism [23]. With regard to the model that proposes BDD symptoms to be a stand-alone factor, the findings of the present study are in contrast to those of Schneider and colleagues [39], who found this to be the best-fitting model. Schneider and colleagues [39] argued that there might be symptoms that stand out as unique in BDD in adolescents, e.g. the extent of delusionality with which beliefs are held, which are not mirrored in other disorders. While the same symptoms might still be unique in adulthood, further comorbidities might have led to a more inclusive picture of phenomenology. This assumption is supported by the finding that comorbid diagnoses (e.g. depression, SAD, OCD, and certain eating disorders such as bulimia nervosa) most often develop only after the onset of BDD [67][68][69][70][71]. Thus, while the main symptoms do not seem to differ between adolescents and adults [72], comorbidity patterns might change, impacting the recommendations for an ideal classification.
Furthermore, both models including BDD symptoms in an affective or a somatoform model did not show an adequate fit, despite previous evidence of similarities between BDD and anxiety disorders (particularly SAD) and depression, as well as high comorbidity rates [28,73]. At first glance, this is in contrast to previous findings which confirmed higher-order dimensions consisting of a whole range of internalizing (primarily emotional) disorders and externalizing syndromes, as proposed and first identified by Krueger in youth [74] and in adults [75]. However, a later study showed that this model is not developmentally stable and robust against the addition of further disorders, including eating disorders and somatoform disorders [76], which supports the current findings. Regarding the somatoform model, our findings are in accordance with empirical evidence showing little similarity between BDD and other disorders in the category, despite sharing a focus on the body [15]. Moreover, they further support the move of the diagnosis out of the DSM-IV [5] category Somatoform Disorders with the introduction of the DSM-5 [6]. Surprisingly, the model in which BDD symptoms are part of a body image disorders spectrum showed the best fit. This finding remained when the sample was reduced to those participants who indicated primary concerns other than shape and weight in the FKS. These participants were analyzed separately in order to prevent an overestimation of the correlation between ED and BDD symptom levels due to shared symptoms such as checking, avoidance, and behaviors to change the body such as dieting. This corroborates previous findings on the comparability of EDs and BDD on a whole range of clinical, personality, and cognitive variables [30]. Regarding non-clinical samples, Samad and colleagues [77] found that while ED and BDD symptoms appear to track together, they seem to identify different sets of psychopathological features.
This supports our results that the two symptomatologies might best be captured under one category. Furthermore, such a classification would also accommodate the discussion around the BDD subtype MD, which was conceptualized as a variant of eating disorders in earlier research [33] and which shows great phenomenological similarities with eating disorders (e.g. dieting [78]). As a consequence, several authors have suggested a body image model incorporating the eating disorders with a clear reference to body image (e.g. anorexia and bulimia nervosa and potentially binge eating disorder) together with BDD and MD (e.g. [36,79]). Disorders falling in such a category would be characterized by a body image disturbance, involving an affective component such as body dissatisfaction, a misperception of the body, cognitive distortions such as attention and interpretation biases, and behaviors such as checking, avoidance, and appearance fixing [80,81]. Thus, these disorders might present with symptoms also present in other disorders such as sadness or OCD-like checking, but they are all exclusively related to the body. Disorders within this category might be most easily differentiated by examining the body parts individuals are most dissatisfied with, i.e., body zones typically linked to weight concerns in EDs, particularly anorexia nervosa, versus facial features, hair, and skin in BDD [82].
The body image model also achieved partial metric invariance, but not scalar invariance, across gender when the path from the factor "affective syndrome" to "OCD symptoms" was freed. The findings suggest that "affective syndrome" influences OCD symptoms more strongly in men than in women. This is in line with the analysis of the comorbidity structure of OCD in the two genders, which hints at a stronger comorbid presentation with affective disorders in men, and a stronger association of the religious obsessive-compulsive dimension (which is more typical in men) with depression (e.g. [83]). In sum, the basic correspondence of symptoms to the underlying four factors is similar in men and women. While this is in contrast to the findings of comparable studies for eating disorders (e.g. [84]), these findings might be due to the more equally distributed prevalence of BDD across gender compared to EDs (e.g. [4,85]). Furthermore, this finding is also in contrast to the study by Schneider and colleagues [39], who did not find metric invariance for their best-fitting model with the standalone BDD factor in adolescents. Developmental tasks in adolescence (e.g. finding one's own gender identity; [86]) might be more gender-specific than tasks in early or middle adulthood, thus leading to gender-specific differences in the development of comorbidities.
The present findings need to be interpreted in the light of the strengths and limitations of the study. Limitations pertaining to the sample might lie in the recruitment of a community-based, non-clinical sample, as in particular, the structures of pathology and comorbidity may differ in clinical samples [87]. Along the same lines, we have only examined BDD symptoms independent of those warranting a diagnosis of BDD. Therefore, it might be advisable to replicate these findings in a clinical sample with subthreshold levels of symptoms looking into the comorbidity structure at the disorder level. Furthermore, participants were not recruited at random, which might limit generalizability of findings. Additionally, the sizes of the female and male samples differed substantially, meaning that gender differences between groups should be interpreted with caution. Moreover, the high level of education of the sample limits the ability to generalize the findings to persons with a more diverse educational background. Finally, from a design point of view, while the data-driven bottom-up approach of the present study can be criticized, it did allow for the identification of symptom clusters and the examination of empirically derived models. Also, as can be seen from Fig 1, some factors were assessed only by few indicator variables. In general, simulations show that more indicators per factor result in fewer nonconverged solutions, fewer improper solutions and more accurate parameter estimates (MBR; [88]). However, simulations also demonstrate that a low number of indicators can be compensated by a larger sample size, e.g. N ≥ 400 [61]. Moreover, there were no signs of nonconvergent or improper solutions in any of our models. Taken together, we found no evidence that the relatively low number of indicators has caused problems.
The strengths of the study include the large sample size, the broad age range and the use of validated self-report instruments covering a broad spectrum of psychopathology. This is the first study to examine the empirical evidence for different classification models of BDD in adults, revealing a clear superiority of a body image model of BDD, MD, and eating disorders. Despite this highly important finding, it is of note that, besides this rather categorical approach to classification, a dimensional approach has recently been proposed and extensively researched. In line with the Research Domain Criteria (RDoC) initiative of the National Institute of Mental Health (e.g., [89]), researchers aim to organize mental disorders not only on the basis of the symptoms and behaviors presented, but also along domains of human functioning, such as negative valence or cognition, that can be measured at different levels (e.g., by self-report vs. on a physiological or a genetic level). Thus, future research might also focus on finding psychopathological organizers that BDD shares with other disorders, and on examining where BDD lies on these dimensions when they are measured with different instruments.
Our findings have several clinical and research implications. From a clinical perspective, the results highlight the relatedness of EDs and BDD. Although BDD and eating disorders are often comorbid [90-92], BDD might still go unnoticed in practice because eating disorders, particularly anorexia nervosa, might be more evident at first glance. As such, the present findings might be useful for clinicians and might help to reduce the under-diagnosis of this severe mental disorder [93]. The better fit of a body image model that includes eating disorders may suggest that treatment techniques which bring about change in EDs might also be used in BDD. This transfer may lead to promising new avenues to tackle BDD. One such technique, mirror exposure, has already been successfully applied in BDD (e.g. [94]). As a next step, the treatment of each disorder could be improved by drawing on the treatment of the respective other disorder. Furthermore, discussions on the development of the next generation of classification systems should take the present findings into account. With regard to non-clinical samples, the high correlation of symptoms in the present sample highlights the need for prevention programs to address the other symptom cluster, which may also be present. In terms of research implications, the findings need to be replicated in a clinical sample of patients with BDD. Finally, our findings should foster research comparing BDD with other disorders, in particular eating disorders, in order to establish a clear picture of similarities and differences.