Cross-cultural examination of the Big Five Personality Trait Short Questionnaire: Measurement invariance testing and associations with mental health

  • Laura Mezquita ,

    Roles Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Software, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliations Department of Basic and Clinical Psychology and Psychobiology, Universitat Jaume I, Castelló de la Plana, Castelló, Spain, Centre for Biomedical Research Network on Mental Health (CIBERSAM), Instituto de Salud Carlos III, Castelló de la Plana, Castellón, Spain

  • Adrian J. Bravo,

    Roles Data curation, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Department of Psychological Sciences, William & Mary, Williamsburg, Virginia, United States of America

  • Julien Morizot,

    Roles Methodology, Supervision, Writing – original draft, Writing – review & editing

    Affiliation School of Psychoeducation, University of Montreal, Montreal, Quebec, Canada

  • Angelina Pilatti,

    Roles Funding acquisition, Investigation, Writing – original draft, Writing – review & editing

    Affiliation Facultad de Psicología, Universidad Nacional de Córdoba, Instituto de Investigaciones Psicológicas (IIPsi-UNC-CONICET), Córdoba, Córdoba, Argentina

  • Matthew R. Pearson,

    Roles Writing – original draft, Writing – review & editing

    Affiliation Center on Alcoholism, Substance Abuse, and Addictions, University of New Mexico, Albuquerque, New Mexico, United States of America

  • Manuel I. Ibáñez,

    Roles Conceptualization, Funding acquisition, Investigation, Supervision, Writing – original draft, Writing – review & editing

    Affiliations Department of Basic and Clinical Psychology and Psychobiology, Universitat Jaume I, Castelló de la Plana, Castelló, Spain, Centre for Biomedical Research Network on Mental Health (CIBERSAM), Instituto de Salud Carlos III, Castelló de la Plana, Castellón, Spain

  • Generós Ortet,

    Roles Conceptualization, Funding acquisition, Investigation, Supervision, Writing – original draft, Writing – review & editing

    Affiliations Department of Basic and Clinical Psychology and Psychobiology, Universitat Jaume I, Castelló de la Plana, Castelló, Spain, Centre for Biomedical Research Network on Mental Health (CIBERSAM), Instituto de Salud Carlos III, Castelló de la Plana, Castellón, Spain

  • Cross-Cultural Addictions Study Team

    Roles Conceptualization, Funding acquisition, Investigation

    Membership of the Cross-Cultural Addictions Study Team is provided in the Acknowledgments.


Abstract

The present study examined the measurement invariance of the Big Five Personality Trait Short Questionnaire (BFPTSQ) across language (Spanish and English), Spanish-speaking country of origin (Argentina and Spain) and gender groups (female and male). Evidence of criterion-related validity was examined via associations (i.e., correlations) between the BFPTSQ domains and a wide variety of mental health outcomes. College students (n = 2158) from the USA (n = 1117 [63.21% female]), Argentina (n = 353 [65.72% female]) and Spain (n = 688 [66.86% female]) completed an online survey. Of the tested models, an Exploratory Structural Equation Model (ESEM) fit the data best. Multigroup ESEM and ESEM-within-CFA generally supported the measurement invariance of the questionnaire across groups. Internalizing symptomatology, rumination and low happiness were related mainly to low emotional stability across countries, while low agreeableness and low conscientiousness were related chiefly to externalizing symptomatology (i.e., antisocial behavior and drug outcomes). Some correlational differences arose across countries and are discussed. Our findings generally support the BFPTSQ as an adequate measure to assess the Big Five personality domains in Spanish- and English-speaking young adults.


Introduction

According to a biopsychosocial model of psychopathology, several biological, psychological and social variables influence health outcomes [1]. One of the nonspecific distal psychological variables that influences the development of psychopathology is personality [2]. Personality traits have also been associated with other outcomes, such as happiness [3], academic and job performance, antisocial and criminal conduct [4], and a broad spectrum of health-related behaviors [5,6].

The Five-Factor Model (FFM; a.k.a. Big Five) is one of the most widely accepted structural personality models [7]. The FFM proposes five broad personality traits: openness to experience, extraversion, agreeableness, conscientiousness, and neuroticism (or its positive pole, emotional stability). Openness represents individual differences in curiosity, fantasy, appreciation of art and beauty, and social attitudes. Extraversion reflects individual differences in sociability, social ascendency, activity, excitement seeking, and positive emotionality. Agreeableness reflects individual differences in compliance, empathy, collaboration, and altruism. Conscientiousness represents individual differences in being methodical, planning, impulse control, and respecting and abiding by conventional social norms and rules. Neuroticism refers to individual differences in the tendency to experience negative emotions, such as anxiety, fear, depression and irritability, frequently and intensely, as well as having low self-esteem [8].

Traditionally, measures to assess personality traits encompass many items and are, therefore, time-consuming. Several shorter alternatives, including the Big Five Questionnaire-Children version [BFQ-C; 9], the Big Five Inventory [BFI; 7,10], the Ten-Item Personality Inventory [TIPI; 11], the Mini-International Personality Item Pool Big Five Measure [Mini-IPIP; 12], Mini Modular Markers [3M40; 13], the NEO Five-Factor Inventory-3 [NEO-FFI-3; 14], the short form of the Junior Spanish version of the NEO-PI-R [JS NEO-S; 15], and the Big Five Personality Trait Short Questionnaire [BFPTSQ; 8], have been developed. These brief versions are particularly useful when there is limited administration time and/or when the target population’s characteristics (e.g., adolescents, elderly) impede the use of full versions.

Among these short personality measures, the BFPTSQ [8] has a number of advantages. First, it has wider conceptual breadth (i.e., content validity) than most available measures, particularly very short ones. Indeed, many available short personality measures suffer from limited conceptual breadth, which essentially means that these measures do not represent a number of important lower-order or primary traits [see 16]. When developing the BFPTSQ, Morizot built it from the initial pool of the English BFI items [7,10], but added 8 new items to tap into important primary traits that were missing in the original short measures (e.g., sensation seeking, impulsiveness, openness to cultural differences, etc.). Further, 2 items that could generate confusion were deleted (“prefer work that is routine” and “generates a lot of enthusiasm”). Second, 36 items were reworded to be more easily understood than in the original BFI, so they may be utilized in assessments with both adolescents and adults (S1 Table presents the item correspondence between the BFI and the BFPTSQ). This is particularly important for long-term longitudinal studies because they often employ different measures for adolescents and adults depending on the participants’ ages at the time points when assessments are conducted. In such cases, determining whether differences are due to real changes in traits or to the measure taken to assess personality during different developmental periods is no easy task. Finally, this instrument is in the public domain, so it can be used freely by researchers for applied purposes.

Originally, the psychometric properties of the BFPTSQ were examined in French-speaking adolescents from Quebec [8]. The French BFPTSQ scores showed adequate psychometric properties, including evidence for content and structural validity and adequate internal consistency [8]. The BFPTSQ scores also correlated with the NEO-PI-3 [14], supporting its convergent validity. Finally, criterion validity was demonstrated by predicting psychopathology symptoms (i.e., conduct disorder, major depressive disorder, attention deficit hyperactivity disorder, bipolar disorder, oppositional defiant disorder, social phobia, substance use and generalized anxiety disorder) and academic achievement (i.e., grade point average). Moreover, this version was found to be invariant across gender groups [8].

The Spanish BFPTSQ was adapted and validated in a sample of Spanish adults [17]. Findings supported not only the structure reported by Morizot [8], but also the criterion-related validity (e.g., correlations of emotional stability and extraversion with happiness, and of low conscientiousness and extraversion with alcohol consumption). Notably, these are the only two studies that have examined the psychometric properties of the BFPTSQ. To our knowledge, no previous work has examined whether the English BFPTSQ version adequately assesses personality traits in English-speaking populations. Interest in understanding human psychology and behaviors outside traditionally studied cultures [i.e., Western populations; 18,19] has grown, particularly through cross-cultural research. However, a key first step in conducting cross-cultural research is to demonstrate that a questionnaire works in similar ways (i.e., measurement invariance) across countries, languages or other groups (e.g., gender). Only when measurement invariance holds is it legitimate to compare results across groups. Lack of measurement equivalence can lead to biased conclusions being drawn about potential cross-cultural differences [20].

In addition, although there is currently an increasing demand for short scales, their construction is not free of difficulties [21]. Following the recommendations proposed by Ziegler et al. [21], and taking into account that our research purpose is to assess the Big Five in college/university students from different countries, we explored the factorial validity of the BFPTSQ with rigorous statistical strategies: 1) a structural equation modeling (i.e., ESEM) approach, 2) the calculation of different internal consistency indices (rather than just Cronbach’s alpha), and 3) correlational analyses across groups examining the empirical evidence supporting the interpretation of the test scores (i.e., criterion validity).

Specifically, in the present study we: (a) test the BFPTSQ structure in two different Spanish-speaking countries (Argentina and Spain); (b) test the structure of the English BFPTSQ version; (c) test the measurement invariance across countries (Argentina vs. Spain), languages (English vs. Spanish) and gender groups; (d) explore the internal consistency of the scales among groups; and (e) examine the associations among the five BFPTSQ domains and a large number of psychological constructs (i.e., psychopathology, antisocial behavior, marijuana use and negative marijuana-related consequences, rumination and happiness) in college students from the USA, Argentina and Spain (i.e., criterion-related validity). We focused on this set of variables because substance use [22,23] and mental health problems [24–26] are particularly insidious among college students. Therefore, a valid measure like the BFPTSQ will facilitate the cross-cultural examination of personality traits and their associations with a large set of outcomes in college students from different cultures/countries. It will also be useful for identifying college students at greater risk of developing substance-related and mental health problems.


Method

Participants and procedure

College students from one university in Spain, one university in Argentina, and two universities in the USA completed an online survey on personality traits, personal mental health and marijuana use behaviors [for more information see 27]. Although 2192 college students completed the BFPTSQ, only cases with less than 5% of missing values were retained. After excluding the cases that did not meet this criterion (n = 29) and the five cases who failed to report their gender, the final sample included 2158 undergraduate students. Table 1 presents the descriptive statistics of the three samples.

Before the participants were assessed, the ethics committees of the Universidad de Córdoba and Universitat Jaume I approved the study, as did the Collaborative Institutional Training Initiative (CITI program) at the US universities (ID: 21636999 and 21637000).


Measures

At all the university sites, the participants were administered the questionnaires below.

Big Five personality traits.

Personality traits were assessed with the 50-item Big Five Personality Trait Short Questionnaire [BFPTSQ; 8] at the US universities, and the Spanish version [17] at the sites in Argentina and Spain. This measure assesses the FFM personality traits on a 5-point Likert-type scale (0 = Strongly Disagree, 4 = Strongly Agree): openness, extraversion, agreeableness, conscientiousness and emotional stability. In the present study, all the reversed items were indicated with an r after the item number (e.g., 31r). Responses were summed on all five scales and divided by the number of their items [10]. Thus, the scale scores in the present study ranged from 0 to 4.
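The scoring rule described above (sum the responses of a scale, then divide by the number of items) can be sketched as follows. This is a minimal illustration, not the authors' scoring script: the function name `score_scale` and the example responses are hypothetical, and the sketch assumes that an item marked with `r` is reversed as 4 minus the raw response on the 0 to 4 scale.

```python
from statistics import mean

SCALE_MAX = 4  # responses run from 0 (Strongly Disagree) to 4 (Strongly Agree)


def score_scale(responses, items):
    """Return the mean item score (0-4) for one scale.

    responses: dict mapping item number -> raw response (0-4)
    items: item labels such as "10" or "15r"; a trailing "r" marks a
           reversed item (assumed here to be reversed as 4 - raw response)
    """
    values = []
    for label in items:
        is_reversed = label.endswith("r")
        raw = responses[int(label.rstrip("r"))]
        values.append(SCALE_MAX - raw if is_reversed else raw)
    return mean(values)


# Hypothetical responses to three emotional stability items, one reversed.
example = {10: 3, 15: 1, 35: 4}
print(score_scale(example, ["10", "15r", "35"]))
```

Averaging rather than summing keeps every scale on the same 0 to 4 metric regardless of how many items it contains.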

Mental health.

Past 2-week psychopathology was assessed using the 23-item DSM-5 Self-Rated Level 1 Cross-Cutting Symptoms Measure-Adult [29]. For Spanish-speaking students, the Spanish version was administered [30]. Participants were asked, “During the past two weeks, how much (or how often) have you been bothered by the following problems?” and responded on a 5-point response scale (0 = none, not at all; 1 = slightly or rarely, less than a day or two; 2 = mild, several days; 3 = moderately, more than half the days; 4 = severely, nearly every day). A score of 2 or higher in most domains, except substance use (score of 1 or higher), is suggestive of clinically relevant mental health problems [31]. The measure has been validated with both clinical [31] and college student [26] samples.

The Cronbach’s alpha of the total scale in the current total sample was .92 (US: .94, Arg: .89, Sp: .88). In the case of the subscales with more than one item the Cronbach’s alphas were: depression .78 (US: .82, Arg: .77, Sp: .67), mania .55 (US: .65, Arg: .40, Sp: .45), anxiety .80 (US: .86, Arg: .74, Sp: .70), somatic distress .65 (US: .72, Arg: .62, Sp: .55), psychosis .83 (US: .86, Arg: .67, Sp: .78), repetitive thoughts and behaviors .76 (US: .77, Arg: .75, Sp: .75), and personality functioning .79 (US: .82, Arg: .75, Sp: .76).

The prevalence rates of potential symptom presentation for the 13 domains (for those domains with multiple items, percentages were averaged) of the DSM-5 Cross-Cutting Symptoms Measure in the whole sample were as follows: depression (34.60%), anger (37.17%), mania (25.40%), anxiety (24.06%), somatic distress (17.51%), suicidal ideation (10.50%), psychosis (4.60%), sleep disturbance (29.55%), memory problems (15.79%), repetitive thoughts and behaviors (12.22%), dissociation (16.60%), and personality functioning (22.48%). For substance use, the rates for specific substances are presented: alcohol use (31.72%), tobacco (24.14%), and illicit drug use (19.20%).

Significant differences in the prevalence rates of the following symptoms between countries were found: depression (US: 29.59%, Arg: 38.53%, Sp: 40.70%, χ2(2) = 26.06, p < .001), anger (US: 34.47%, Arg: 39.94%, Sp: 40.12%, χ2(2) = 7.18, p < .05), mania (US: 22.21%, Arg: 23.51%, Sp: 31.40%, χ2(2) = 19.70, p < .001), anxiety (US: 26.35%, Arg: 24.65%, Sp: 20.06%, χ2(2) = 19.70, p < .001), suicidal ideation (US: 12.15%, Arg: 12.15%, Sp: 7.99%, χ2(2) = 7.85, p < .05), psychosis (US: 6.56%, Arg: 1.98%, Sp: 2.76%, χ2(2) = 20.60, p < .001), memory problems (US: 17.90%, Arg: 14.45%, Sp: 13.08%, χ2(2) = 7.98, p < .05), alcohol use (US: 36.35%, Arg: 32.86%, Sp: 23.69%, χ2(2) = 31.62, p < .001), and illicit drug use (US: 20.00%, Arg: 22.66%, Sp: 16.13%, χ2(2) = 7.36, p < .05).

Antisocial behavior.

Antisocial behavior was assessed with the Antisocial Behavior Scale [ABS; 32]. The ABS contains 35 items that describe various antisocial behaviors (e.g., “I have broken, ripped, or damaged public properties” or “I have used knives or sticks in fights”) rated on a 4-point response scale (1 = Never or Almost Never, 4 = Very Frequently or Very Often). Summing the responses to all the items provides a total score. In a previous project, the research team translated the ABS into English and adapted it. The preliminary results revealed that the scores for the Spanish and English ABS versions displayed good internal consistency, and the differential item functioning analyses indicated that the items generally operate similarly across the three participating countries [33]. The Cronbach’s alpha of the scale in the current total sample was .93, and the alphas by country were .95 (US), .87 (Argentina) and .92 (Spain).

Negative marijuana-related consequences.

The 21-item B-MACQ was employed to assess negative marijuana-related consequences [34]. All the items were scored dichotomously to reflect the absence/presence of each marijuana-related problem in the last month (0 = no, 1 = yes). The total score reflects the number of consequences that individuals experienced in the last 30 days. Previous research supports the test-retest reliability, as well as the discriminant and convergent validity, of the B-MACQ [34], and has also supported its measurement invariance and criterion validity across cultures and languages [see 27]. The Cronbach’s alpha of the scale in the current total sample was .87, and the alphas by country were .89 (US), .81 (Argentina) and .86 (Spain).

Marijuana use.

Frequency of marijuana use was assessed with the question: “How many days in the last 30 days have you used marijuana?” If the participants responded 1 or higher, they completed the marijuana quantity measure. To help them report the amount of marijuana consumed, the participants were shown a visual guide indicating several amounts of marijuana in grams. Their typical weekly marijuana use in the last 30 days was assessed with the Marijuana Use Grid [MUG; 35]. The participants were asked to estimate the amount of marijuana, in grams, that they used during each 4-hour period of each day of a typical week. Adding all the values gave an estimate of the total grams of marijuana they used in a typical week.
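As a small illustration of the grid arithmetic described above, the following sketch sums a 7-day grid of 4-hour periods (six periods per day). The layout of the grid as a list of lists, the function name `weekly_grams`, and the example amounts are assumptions made for this example; they are not taken from the MUG itself.

```python
DAYS_PER_WEEK = 7
PERIODS_PER_DAY = 24 // 4  # six 4-hour periods in a day


def weekly_grams(grid):
    """Sum the grams reported in every 4-hour period of a typical week.

    grid: list of 7 days, each a list of 6 gram amounts (hypothetical layout)
    """
    if len(grid) != DAYS_PER_WEEK or any(len(day) != PERIODS_PER_DAY for day in grid):
        raise ValueError("expected a 7 x 6 grid of gram amounts")
    return sum(sum(day) for day in grid)


# Hypothetical respondent: use only on two evenings of a typical week.
grid = [[0.0] * PERIODS_PER_DAY for _ in range(DAYS_PER_WEEK)]
grid[4][5] = 0.5   # fifth day, last 4-hour period
grid[5][5] = 0.25  # sixth day, last 4-hour period
print(weekly_grams(grid))
```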


Rumination.

Rumination was measured with the Ruminative Thought Style Questionnaire [RTSQ; 36]. The participants were asked to express how well each item described them on a 7-point response scale (1 = Not at all, 7 = Very Well). In Argentina and Spain, the Spanish RTSQ version was utilized [see 37 for the translation and adaptation procedures]. Based on previous findings obtained with US, Argentinian and Spanish samples, a 15-item version of this measure was employed, which proved invariant across genders and countries [37]. The Cronbach’s alpha of the scale in the current total sample was .94, and the alphas by country were .95 (US), .94 (Argentina) and .94 (Spain).


Happiness.

General happiness was assessed with a single question. The participants rated how happy they felt in general that day (while attempting to ignore any feelings they had the day before) on a 10-point scale (1 = Completely Unhappy to 10 = Completely Happy).

Statistical analysis

All the analyses were done with version 25 of SPSS and version 7.4 of Mplus [38]. The robust maximum likelihood estimator (MLR) was used in each analysis conducted in Mplus. The MLR provides adjusted standard errors and statistical fit tests that are robust to data non-normality. The 99% confidence intervals (CI) of the relevant estimates were calculated and reported. Two model types were employed to assess factor validity: the ICM-CFA (independent clusters model confirmatory factor analysis) and the ESEM (exploratory structural equation modeling) with target loading rotation.

In line with Marsh et al. [39] and Morizot [8], all the factor models were estimated both with and without a priori correlated uniquenesses, employed to reflect that some items relate to the same primary trait or subdomain, and they share either a similar content, but reversed scores, or contain the same word. Twenty-seven a priori correlated uniquenesses were posited. Specifically, the correlated uniquenesses introduced for openness were: 1 with 21, 11 with 36, 16 with 21, 26 with 41r, 26 with 46, 1 with 16, 41r with 46; for extraversion: 7r with 32r, 2 with 22r, 12 with 42, 2 with 27, 17 with 27; for agreeableness: 18 with 23, 8 with 33, 23 with 33, 23 with 43, 18 with 43; for conscientiousness: 29 with 39, 19r with 24r, 19r with 39, 9r with 19r, 4 with 14; and for emotional stability: 10 with 35, 10 with 15r, 5r with 25, 5r with 45r, 30r with 50r. A detailed description of the conducted ESEM and ICM-CFA models can be found in Morizot [8].
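For readers who want to check or reuse the list above, the 27 a priori correlated uniquenesses can be written down as plain data. The sketch below also turns each pair into a `WITH` statement, which is how Mplus specifies a residual covariance; the variable prefix `item` is a hypothetical naming choice, since the actual variable names in the authors' input files are not given here.

```python
# Item pairs transcribed from the text; "r" marks a reversed item.
CORRELATED_UNIQUENESSES = {
    "openness": [("1", "21"), ("11", "36"), ("16", "21"), ("26", "41r"),
                 ("26", "46"), ("1", "16"), ("41r", "46")],
    "extraversion": [("7r", "32r"), ("2", "22r"), ("12", "42"),
                     ("2", "27"), ("17", "27")],
    "agreeableness": [("18", "23"), ("8", "33"), ("23", "33"),
                      ("23", "43"), ("18", "43")],
    "conscientiousness": [("29", "39"), ("19r", "24r"), ("19r", "39"),
                          ("9r", "19r"), ("4", "14")],
    "emotional stability": [("10", "35"), ("10", "15r"), ("5r", "25"),
                            ("5r", "45r"), ("30r", "50r")],
}


def mplus_with_statements(pairs, prefix="item"):
    """Build one 'a WITH b;' line per pair (prefix is a hypothetical name)."""
    return [f"{prefix}{a.rstrip('r')} WITH {prefix}{b.rstrip('r')};"
            for a, b in pairs]


total = sum(len(pairs) for pairs in CORRELATED_UNIQUENESSES.values())
print(total)
for line in mplus_with_statements(CORRELATED_UNIQUENESSES["openness"]):
    print(line)
```

Counting the pairs is a quick consistency check against the number of correlated uniquenesses stated in the text.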

The model fit assessment was made according to various indices [40]. The chi-square test was run for all the models. Although a nonsignificant chi-square indicates a good fitting model, this test is generally too sensitive with large sample sizes. Thus, other fit indices were calculated. Values of .08 or lower for the root mean square error of approximation (RMSEA), values of .90 or more for the comparative fit index (CFI) and Tucker–Lewis index (TLI) and values of .10 or less for the standardized root mean square residual (SRMR) suggest acceptable model fit [41,42]. For the RMSEA 90% CI values, those under .05 for the lower bound and under .08 for the upper bound indicate acceptable fit [43].
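The cutoffs listed above can be collected into a small helper that reports which criteria a model meets. This is a sketch of the stated thresholds only; the function name `fit_checks` and the example values are hypothetical and do not reproduce any row of Table 2.

```python
def fit_checks(rmsea, cfi, tli, srmr, rmsea_ci=None):
    """Compare fit indices with the acceptability cutoffs stated in the text.

    rmsea_ci: optional (lower, upper) bounds of the RMSEA 90% CI
    Returns a dict mapping each index name to True (acceptable) or False.
    """
    checks = {
        "RMSEA": rmsea <= 0.08,
        "CFI": cfi >= 0.90,
        "TLI": tli >= 0.90,
        "SRMR": srmr <= 0.10,
    }
    if rmsea_ci is not None:
        lower, upper = rmsea_ci
        checks["RMSEA 90% CI"] = lower < 0.05 and upper < 0.08
    return checks


# Hypothetical model: every index is acceptable except the TLI.
result = fit_checks(rmsea=0.040, cfi=0.92, tli=0.89, srmr=0.03,
                    rmsea_ci=(0.038, 0.042))
for name, ok in result.items():
    print(f"{name}: {'acceptable' if ok else 'below cutoff'}")
```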

After identifying the best factor model, the factor structure was tested in each country, and measurement invariance was tested between the Spanish-speaking groups (Argentina and Spain), between the Spanish and English versions, and across gender groups using multi-group ESEM. These models were assessed with a series of increasingly stringent multiple-group models (see [8]): configural invariance (MG1; all the loadings, intercepts and uniquenesses are freely estimated, with latent variances constrained to 1 and latent means to 0), metric invariance (MG2; loadings constrained to invariance, which allows the factor variances to be freely estimated in one group), scalar invariance (MG3; intercepts constrained to invariance, which allows the factor means to be freely estimated in one group), strict invariance (MG4; uniquenesses constrained to equality), correlated uniquenesses invariance (MG5), variance/covariance invariance (MG6; in ESEM, the variances and covariances must all be constrained simultaneously), and latent means invariance (MG7). For all the models in this sequence, the imposed constraints are additive and the preceding model acts as the reference.

If there was evidence of noninvariance of the factor loadings across groups, an ESEM-within-CFA (ES-W-C) multi-group model was utilized because partial factor loading invariance cannot be tested in ESEM [44,45]. For the ES-W-C model, all parameter estimates from the ESEM solution were used as starting values. In addition, a total of 25 constraints (the square of the number of factors) were added to the ES-W-C model so that it was identified. Specifically, the 5 factor variances for the first group of the multiple-group solution and the 20 cross-loadings of the “anchor items” (5 anchor items × 4 non-target factors) were fixed. The anchor item, or referent indicator, for each factor is an item that has a large loading on the factor that it is designed to measure and small cross-loadings on the other factors. These small cross-loadings were fixed to their values from the ESEM solution, which allowed a higher level of convergence with the ESEM solution. For all other parameter estimates, the patterns of the fixed and free estimates were the same as in the selected ESEM solution [44]. It is noteworthy that, in ES-W-C, the factor variances were fixed to one in the first group to identify the model. Consequently, the invariance of the covariances across groups was tested, rather than the variance/covariance invariance.

To assess changes in model fit, the Satorra-Bentler scaled chi-square difference test [46] was computed. However, the chi-square difference test is sensitive to sample size [47]. For this reason, changes in other fit indices were also compared between less and more constrained models. A more constrained model was considered invariant when the decrease in CFI (ΔCFI) was ≤ .010 and the increase in RMSEA (ΔRMSEA) was ≤ .015 [48,49].
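The decision rule can be expressed as a short function. The sketch below assumes that both changes are computed as the more constrained model minus the reference model, so a loss of fit appears as a negative ΔCFI and a positive ΔRMSEA; the function name `supports_invariance` is hypothetical. The first call uses the ΔCFI and ΔRMSEA values reported in the Results for the metric model comparing the English and Spanish versions (MG2); the second uses made-up values.

```python
MAX_CFI_DROP = 0.010    # largest tolerated decrease in CFI
MAX_RMSEA_RISE = 0.015  # largest tolerated increase in RMSEA


def supports_invariance(delta_cfi, delta_rmsea):
    """Apply the ΔCFI / ΔRMSEA criteria to one model comparison.

    Both deltas are (more constrained model) minus (reference model),
    so worse fit shows up as a negative delta_cfi and a positive delta_rmsea.
    """
    return -delta_cfi <= MAX_CFI_DROP and delta_rmsea <= MAX_RMSEA_RISE


# Reported comparison: loadings constrained across English and Spanish speakers.
print(supports_invariance(delta_cfi=-0.016, delta_rmsea=0.001))

# Hypothetical comparison with a small loss of fit.
print(supports_invariance(delta_cfi=-0.004, delta_rmsea=0.002))
```

When a step fails the rule, the text describes freeing selected parameters (partial invariance) and then testing the next constraint against that partially invariant model.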

In both the Spanish and English questionnaire versions, sources of reliability were explored by resorting to Cronbach’s alphas and ordinal omegas [50]. The sources of evidence for criterion validity were explored with Pearson correlations among all the personality dimensions and psychopathology, antisocial behavior, marijuana outcomes, rumination and happiness in all three countries.
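As an illustration of the first reliability index, Cronbach's alpha can be computed from raw item responses with the standard formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of the total score). The sketch below uses only the standard library; the function name and the toy data are hypothetical. Ordinal omega is not shown because it requires polychoric correlations and a fitted factor model, which are beyond a short example.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    rows: list of equal-length lists, one list of item scores per respondent
          (reversed items are assumed to be recoded already)
    """
    k = len(rows[0])
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five people to a three-item scale (0-4 metric).
toy = [
    [4, 3, 4],
    [2, 2, 3],
    [1, 0, 1],
    [3, 3, 2],
    [0, 1, 1],
]
print(round(cronbach_alpha(toy), 3))
```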


Descriptive analysis

Table 1 provides the descriptive statistics (means/standard deviations) for all the personality dimensions, criterion variables and the participants’ ages for the whole sample and per country. A comparison of the magnitude of the mean differences across countries indicated that, despite medium (USA and Spain; Spain and Argentina) and large (USA and Argentina) differences in the participants’ ages, all the differences in personality and the criterion variables were small (all the ds were below .50).
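The effect sizes mentioned above can be obtained from group means, standard deviations and sample sizes. The sketch below assumes the common pooled-standard-deviation form of Cohen's d; the text does not state which variant was used, and the means and standard deviations in the example are invented (only the two sample sizes come from the study).

```python
from math import sqrt


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd


# Hypothetical scale means and SDs for the US (n = 1117) and Argentinian (n = 353) samples.
d = cohens_d(2.60, 0.65, 1117, 2.45, 0.70, 353)
print(round(d, 2))
```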

Factor structure

When studying the BFPTSQ structure in the total sample, the best fitting model was the ESEM model in which correlated uniquenesses were allowed (M2b). Table 2 presents the fit indices of all the tested models, and Table 3 reports the standardized factor loadings for the whole sample. All the items had significant factor loadings on their hypothesized factor, except for items 31r (“Is not really interested in different cultures, their customs and values”) and 41r (“has few artistic interests”) on the openness factor. All the items for the conscientiousness and emotional stability factors presented the highest factor loading on their intended factor. Eight extraversion items showed the highest factor loadings on their hypothesized factor, while items 42 (“likes exciting activities that provide thrills”) and 47 (“has a tendency to laugh and have fun easily”) showed the highest factor loadings on the openness factor. Five agreeableness items had the highest factor loadings on their intended factor (items 3r, 13r, 28r, 38r, 48r), and five items showed similar cross-loadings between the agreeableness and openness factors (items 8, 18, 23, 33, 43). Table 4 shows the latent factor correlations from the final ESEM and the ICM-CFA.

Table 2. Goodness-of-fit statistics from the confirmatory factor analytic, Exploratory Structural Equation Models (ESEM) and ESEM-Within-CFA (EWC) Models.

Table 3. Standardized factor loadings from the exploratory structural equation model of the BFPTSQ in the whole sample (M2b) and in the English and Spanish versions of the questionnaire (MG1).

Upon finding the best factor solution, an ESEM was performed in each country. Fit indices were acceptable for the Spanish sample. In the Argentinian sample, the CFI and TLI were close to, but lower than, .90. However, the RMSEA, RMSEA 90% CI values and SRMR were adequate (≤.05). In both samples, factor loadings were salient and significant on their hypothesized factor, except for items 31r and 41r on the openness to experience factor (see S2 Table). All the items for conscientiousness, agreeableness and emotional stability presented the highest factor loading on their intended factor, while items 27 (“show self-confidence, is able to assert himself/herself”) and 42 (“likes exciting activities that provide thrills”) from the extraversion factor showed similar cross-loadings on the extraversion and openness to experience factors.

The fit indices of the English version were adequate. The factor loadings of the English version are presented in Table 3 (they are the same as those obtained in the configural invariance model across the English and Spanish versions) and were very similar to those found in the whole sample.

Measurement invariance

A few minor differences emerged across groups when studying the invariance of the Spanish questionnaire version between the Argentinian and Spanish participants. Constraining the intercepts across Spanish speakers resulted in a ΔRMSEA below .015, but a ΔCFI of -.017 (MG3), which exceeded the .010 criterion. Hence a model with partial invariance of intercepts (MG3b) was estimated. Based on the modification indices, the intercepts of four items were freed across groups: 13r (“provokes quarrels or arguments with others”), 38r (“can sometimes be rude or mean to others”); 43 (“likes to cooperate with others”) (Arg > Sp) from the agreeableness factor; 49r (“can do things impulsively without thinking about the consequences”) (Arg > Sp) from the conscientiousness factor. This model gave a better fit than the model with the fully invariant intercepts and yielded ΔCFI ≤ .01. When further constraints were included (MG4 to MG7), ΔCFI was ≤ .01 and ΔRMSEA was ≤ .015, which suggested reasonable invariance across groups. Considering that the structure of the Spanish BFPTSQ had been previously studied [17], and that only small differences across the Spanish-speaking samples had been found, the Argentinian and Spanish samples were combined when the structure of the Spanish version was compared with the English version.

To test the measurement invariance of the English and Spanish (Spanish and Argentinian combined samples) versions, a configural invariance model was estimated (MG1). This model showed acceptable fit indices, as can be seen in Table 2. Its factor loadings are presented in Table 3. When the factor loadings were constrained across Spanish and English speakers, ΔRMSEA was .001 and ΔCFI was -.016 (MG2). Therefore, an ES-W-C was run to test the partial metric invariance (MG2b). According to the modification indices, six factor loadings (6 of 250) were freely estimated across groups. One was a difference in the nonstandardized factor loading of one item on its target factor: 20r (“worries a lot about many things”) on the emotional stability factor (Spanish = .454 [.361 .546]; English = .772 [.668 .876]). The others were differences in cross-loadings. Adding constraints on the intercepts across groups also indicated differences (MG3, ΔCFI = -.025). Thus, a model with partial invariance of intercepts (MG3b) was estimated. According to the modification indices, the intercepts of eight items were freed across groups: 4 (“works conscientiously, does the things he/she has to do well”) (Eng < Sp); 9r (“can be a little careless and negligent”) (Eng > Sp); 11 (“Is ingenious, reflects a lot”) (Eng < Sp); 20r (“worries a lot about many things”) (Eng > Sp); 22r (“is rather quiet, does not talk much”) (Eng > Sp); 28r (“can be distant and cold with others”) (Eng > Sp); 32r (“is timid, shy”) (Eng > Sp); 36 (“likes to reflect, tries to understand complex things”) (Eng < Sp). This model showed a better fit than the model with the fully invariant intercepts and gave ΔCFI ≤ .01. Including additional constraints (MG4-MG7) gave ΔRMSEA ≤ .015 and ΔCFI ≤ .01, which suggests invariance across groups. Note that, owing to convergence problems, in the comparison of the English and Spanish versions the correlated uniquenesses invariance (MG4) was tested first, followed by the measurement errors invariance (MG5), rather than in the reverse order.

The results of the invariance analyses across gender are also presented in Table 2. They indicated that the model was completely invariant (all ΔCFI ≤ .01 and all ΔRMSEA ≤ .015) when constraints were placed on the factor loadings (MG2), intercepts (MG3), measurement errors (MG4), correlated uniquenesses (MG5), variances and covariances (MG6) and factor means (MG7) across males and females.
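
The decision rule applied in these invariance tests can be illustrated with a minimal Python sketch. It assumes the cut-offs stated above (a drop in CFI of no more than .01 and a rise in RMSEA of no more than .015). The function name `compare_nested_models` and the fit values in the example are hypothetical and are not taken from Table 2.

```python
def compare_nested_models(cfi_free, cfi_constrained,
                          rmsea_free, rmsea_constrained,
                          max_cfi_drop=0.01, max_rmsea_rise=0.015):
    """Compare a more constrained invariance model with the less constrained one.

    Fit indices are usually reported to three decimals, so the differences
    are rounded to three decimals before being checked against the cut-offs.
    """
    delta_cfi = round(cfi_constrained - cfi_free, 3)        # negative = worse fit
    delta_rmsea = round(rmsea_constrained - rmsea_free, 3)  # positive = worse fit
    return {
        "delta_cfi": delta_cfi,
        "delta_rmsea": delta_rmsea,
        "cfi_ok": delta_cfi >= -max_cfi_drop,
        "rmsea_ok": delta_rmsea <= max_rmsea_rise,
    }


# Hypothetical fit values, chosen so that the CFI drops by .017.
result = compare_nested_models(0.930, 0.913, 0.030, 0.031)
print(result)
```

With these illustrative values the RMSEA criterion is met but the CFI criterion is not, which is the kind of situation in which a partially invariant model (e.g., MG3b) was estimated.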

Internal consistency

Table 5 shows the internal consistency indices. Cronbach’s alpha and the ordinal omega of all the scales were .70 or higher, except for the ordinal omega of the agreeableness scale in the English version, which came close to the recommended cut-off of .70 (i.e., .689 [.617 .754]).
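
For readers less familiar with these indices, the following is a minimal sketch of how Cronbach’s alpha can be computed from raw item scores. It is not the procedure used to produce Table 5, it does not cover the ordinal omega (which requires a factor model fitted to polychoric correlations), and the function name and toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Transpose so that each element holds all responses to one item.
    per_item = list(zip(*item_scores))
    item_variances = [variance(scores) for scores in per_item]
    # Variance of the total (summed) scale score across respondents.
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data: four respondents answering three items on a 1-5 scale.
toy_scores = [
    [4, 5, 4],
    [2, 3, 2],
    [3, 3, 4],
    [5, 4, 5],
]
print(round(cronbach_alpha(toy_scores), 3))
```

Alpha approaches 1 as the items covary more strongly relative to their individual variances, which is why values of .70 or higher are conventionally read as adequate internal consistency.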

Criterion-related validity

The correlations between personality domains and criterion variables in all three countries are presented in Table 6. The results showed that mental health outcomes were related to low emotional stability, low agreeableness, low conscientiousness and low extraversion. Internalizing symptomatology (i.e., depression, anxiety and somatic distress) showed the closest associations with low emotional stability in all three countries. Some externalizing behaviors were related to low agreeableness and low conscientiousness in all three countries (e.g., antisocial behavior), and others only in the USA and Spain (e.g., alcohol, tobacco and illicit drug use). The correlations between the personality dimensions and the marijuana-related variables were low, but most of the significant associations were found with low conscientiousness. Finally, rumination in the three countries was related mainly to low emotional stability and, to a lesser extent, to low conscientiousness, low agreeableness and low extraversion. Happiness correlated mainly with emotional stability, followed by extraversion.

Table 6. Correlations between the five dimensions of the BFPTSQ and the criterion variables.

In order to determine whether the personality dimensions were differentially related to the criterion variables across countries, the absolute value of the difference in the magnitude of the correlations was computed for each pair of countries and is presented in Table 7. Because statistical tests of these differences can be oversensitive to small differences, particularly when sample sizes differ across countries, attention was paid to the magnitude of the differences. The average difference in correlations was .070 (SD = .055) across 330 possible comparisons. The results were interpreted using the following criteria: differences <1 SD were small, differences between 1 SD and 2 SD were medium, those between 2 SD and 3 SD were large, and any over 3 SD were substantial. The results presented in Table 7 showed that large or medium-size correlation differences across countries were found between conscientiousness and some health outcomes (i.e., anger, mania, anxiety, somatic distress, suicidal ideation and memory) (higher correlations in the USA than in Argentina or Spain), and also between low agreeableness and low conscientiousness and drug outcomes (higher correlations in the USA or Spain than in Argentina).
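
The banding of the correlation differences can be sketched as follows. This is an illustrative sketch only: it assumes that the bands are counted in SD units above the average difference (.070), in line with the wording used in the Discussion, and that a value falling exactly on a boundary belongs to the higher band. The function name is hypothetical.

```python
MEAN_DIFF = 0.070  # average absolute difference in correlations
SD_DIFF = 0.055    # standard deviation of those differences


def classify_difference(abs_diff, mean=MEAN_DIFF, sd=SD_DIFF):
    """Label an absolute correlation difference by its distance from the mean in SD units."""
    z = (abs_diff - mean) / sd
    if z < 1:
        return "small"
    if z < 2:
        return "medium"
    if z < 3:
        return "large"
    return "substantial"


# Illustrative values, not taken from Table 7.
for diff in (0.05, 0.15, 0.20, 0.30):
    print(diff, classify_difference(diff))
```

Under these assumptions the cut-offs fall at absolute differences of .125, .180 and .235.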

Table 7. Absolute value of the correlation differences across countries between the five dimensions of the BFPTSQ and the criterion variables.

Discussion


The present study examined different sources of validity evidence for the English and Spanish versions of the BFPTSQ [8,17] in college students from the USA, Argentina and Spain. Specifically, we examined whether the BFPTSQ was invariant across two Spanish-speaking populations (Spain and Argentina), across languages (Spanish and English) and across gender. Criterion-related validity was examined via the associations between the five BFPTSQ domains and a large set of psychological constructs (i.e., psychopathology, antisocial behavior, marijuana use and negative marijuana-related consequences, rumination and happiness) in the full sample as well as within each country.

Evidence for internal structure validity

Not surprisingly, given previous work that has examined complex structures such as the Big Five [8,17,39], the factor analysis results for the whole sample suggested that ESEM provided a better fit to the data than the ICM-CFA. Thus, allowing all possible factor loadings to be estimated, as ESEM does, appears to approximate the true model better than the ICM-CFA. In the present study, 125 of the 250 possible factor loadings (i.e., 50%) were statistically significant cross-loadings. The inclusion of cross-loadings affected the intercorrelations among the personality dimensions, as shown in Table 4. In the ICM-CFA, cross-loadings were fixed at 0 and the factor correlations were vastly inflated, because inflated factor correlations are the only way in which the omitted cross-loadings can be expressed [8,39]. ESEM, by contrast, not only provides factor correlations that probably come closer to the true population parameters, but also supports the discriminant validity of the Big Five traits as measured by the BFPTSQ [8].

Our findings also indicated an improved model fit when correlated uniquenesses were allowed. Although including correlated uniquenesses between items that were reverse-coded within the same factor, or that shared the same words, is conceptually defensible and increased the model’s fit, it also inevitably reduced the size of the factor loadings, as the factors had less variance left to explain. This was particularly salient for the openness factor, for which seven correlated uniquenesses were allowed and which showed lower factor loadings. It is noteworthy, however, that the fit of the ESEM model was acceptable but, even with correlated uniquenesses allowed, still far from excellent according to the typical criteria suggested for practical fit indices [51]. Morin et al. [45] noted that the adequacy of these typical criteria has yet to be demonstrated with ESEM.

All the items presented significant factor loadings on their intended target factor, except items 31 (“is not really interested in different cultures, their customs and values”) and 41 (“has few artistic interests”) of the openness factor (in both the whole sample and the Spanish- and English-speaking samples). A previous study conducted with a general Spanish population sample reported primary factor loadings of .38 and .58 for items 31 and 41, respectively, on the openness factor. The factor loadings for French-speaking [8] and Spanish adolescents [52] were also low. Taken together, and considering that the wording of the items is simple (which arguably implies fewer translation/adaptation problems), these findings suggest that these items may be less suitable for specific populations (i.e., adolescents and young adults) than for general or adult populations. The rewording, or even the elimination, of these items should be considered in future research, chiefly because the BFPTSQ was developed to provide a useful and valid measure for longitudinally assessing personality dimensions across development (i.e., from adolescence to adulthood).

For the extraversion dimension, all the items showed salient factor loadings on their intended factor (i.e., > .30), except for item 42 (“likes exciting activities that provide thrills”). This sensation seeking-related item had a factor loading of .27 on extraversion and a factor loading of .38 on openness. Previous studies have indicated that sensation seeking tends to be openness-related [53,54].

All the emotional stability and conscientiousness items showed salient factor loadings on their intended factor, but some items in the agreeableness dimension also cross-loaded on the openness factor. As expected, the reverse-coded items that indicated antagonism or low agreeableness loaded on the agreeableness factor, while the positively worded items of this domain cross-loaded on the openness factor. Future revisions of the scale should consider these findings, and the fact that the use of positively worded items and reversed forms in the same scale (e.g., agreeableness) to reduce response bias has been questioned. Suárez-Alvarez et al. [55] illustrated this point and found that this common practice jeopardizes a measure’s unidimensionality by adding secondary sources of variance, and also reduces its reliability.

Sources of structure validity across groups

In line with previous research, our findings supported the measurement invariance of the BFPTSQ across gender groups [8]. The present findings extend these results by suggesting that the BFPTSQ is also invariant across Spanish-speaking countries. This is a key milestone in cross-cultural research as comparisons between cultures/countries are not valid unless measurement invariance is met [56,57]. Of all the possible comparisons based on CFI changes (in intercepts, factor loadings, uniquenesses, factor variances/covariances, factor latent means and correlated uniquenesses among groups), only four differences were found in the intercepts of the Spanish and Argentinian students. Compared to the Spanish students, the Argentinian ones scored higher for three agreeableness items, which cover the facets of compliance (13r and 38r) and cooperation (43), and also for one conscientiousness item, which covers the facet of deliberation (49r).

Our measurement invariance results across languages (i.e., Spanish and English) revealed that all the agreeableness items loaded primarily on their intended factor in the Spanish-speaking sample. However, in the English-speaking sample, the positively worded agreeableness items had similar factor loadings on the agreeableness and openness factors, as shown in Table 3. When the magnitude of these differences was tested empirically, only one difference was found in a primary factor loading: item 20r (“I see myself as someone who worries a lot about many things”). Although the factor loading of this item was higher in the English-speaking sample than in the Spanish-speaking one, it was salient and significant in both samples. This finding indicated that the item adequately represented its dimension in both groups.

The addition of constraints between intercepts only led to a few noninvariant intercepts across languages. Spanish speakers tended to obtain higher scores than English speakers for one conscientiousness item tapping the self-discipline facet (4) and for two openness items tapping the intellectual inquisitiveness facet (11 and 36). Compared to the Spanish-speaking participants, English speakers scored higher for: a) two extraversion items tapping the expressiveness (22r) and sociability (32r) facets; b) one agreeableness item tapping the compassion facet (28r); c) one conscientiousness item tapping the order facet (9r); and d) one emotional stability item tapping the worry facet (20r). Although a few noninvariant intercepts and loadings were observed across groups based on the CFI differences, the RMSEA differences still suggested that the model was completely invariant across languages. The noninvariance of some intercepts and loadings was based mainly on the typical criteria proposed for changes in the fit indices. Nevertheless, these criteria are rough guidelines [42] and some researchers have questioned their validity [58,59], especially for ESEM [45]. Hence, rejecting the hypothesis of invariance for complex models with many items based simply on these typical criteria might not be constructive. Taken together, the results obtained here suggest that it can be reasonably assumed that the BFPTSQ factor structure offers acceptable measurement invariance across languages.

Criterion-related validity

The present study aimed to examine the associations of the BFPTSQ scores with a wide range of outcome variables, focusing particularly on substance use (or substance-related variables) and poor mental health outcomes. As already noted, these behaviors are highly prevalent in college students around the world, and a valid yet brief measure will most likely facilitate both cross-cultural studies and routine interventions to detect students at high risk of developing substance use and/or mental health problems. As in previous studies, low emotional stability, low agreeableness, low conscientiousness and low extraversion were related to mental health outcomes [2,60]. Internalizing symptomatology (i.e., depression, anxiety and somatic distress) was related mainly to low emotional stability [2,60], while antisocial behavior was related to low agreeableness and low conscientiousness in all three countries [33,61].

Other externalizing behaviors, such as drug use, also showed significant correlations with low agreeableness and low conscientiousness, as in previous meta-analyses [2,61], at least in the USA and Spain. The association of the disinhibition domains with drug outcomes was less consistent in Argentina and, consequently, some differences in the magnitude of the correlations arose across countries (i.e., medium-size differences in the correlations of low conscientiousness and low agreeableness with alcohol use and tobacco use, respectively, in the USA and Spain compared to Argentina). The correlations previously reported between marijuana-related outcomes and conscientiousness/agreeableness [62–65] were only replicated clearly in the USA sample. When we calculated the absolute value of the correlation differences across countries, the only large difference found (i.e., one that was between 2 SD and 3 SD above the average difference in the magnitude of the correlations) was between conscientiousness and marijuana-related problems in the USA and Argentina. The lack of a significant negative association between marijuana-related problems and conscientiousness in the Argentinian sample was somewhat unexpected, as were the less consistent associations of low conscientiousness and low agreeableness with alcohol, tobacco and illicit drug use. Future research should attempt to replicate this finding to determine whether the disinhibition-related domains assessed within the FFM framework influence drug outcomes in Argentinian college students.

In line with previous studies conducted using the BFPTSQ and other measures to assess the FFM, happiness was mainly related to both emotional stability (or low neuroticism) and extraversion [17,66,67], while rumination was related chiefly to conscientiousness, emotional stability, agreeableness and extraversion [37].

Limitations and conclusions

Our research is not without its limitations. First, even though the BFPTSQ’s psychometric properties have been previously explored in a Spanish population [17], this is the first time that the structure of the English version has been tested. Based on some of our results (i.e., the nonsignificant factor loadings of items 31 and 41 on the openness factor, or the cross-loadings between the openness and agreeableness items), replication studies are needed before the questionnaire is modified (e.g., by removing or substituting items). Second, our sample comprised university students, and future research should examine the reliability and construct validity of the English questionnaire version in both adolescent and general adult populations. Finally, our work explored evidence for criterion validity with a limited number of outcomes (i.e., psychopathology, antisocial behavior, marijuana-related outcomes, rumination, and happiness). Previous work has found associations between personality and a wide range of other outcomes (e.g., work and educational outcomes [5,68]). Future research that employs the BFPTSQ could benefit from including more criterion variables.

Despite its limitations, the present research supports the BFPTSQ’s factorial validity and the reasonable invariance of the measure across genders, across two Spanish-speaking populations, and between Spanish and English speakers. It also provides evidence of the scales’ reliability and criterion validity (associations with distinct health outcomes). Taken together, our results suggest that the BFPTSQ is a useful short measure for assessing the broad FFM domains in English and Spanish speakers, at least for young adults from the USA, Argentina and Spain.

Supporting information

S1 Table. Correspondence of the English / Spanish version of the BFI with the items of the BFPTSQ.

A) English, B) Spanish. In bold are marked the items that are identical in both questionnaires. The items can also be found in: Benet-Martinez, V., & John, O. P. (1998). Los Cinco Grandes across cultures and ethnic groups: Multitrait multimethod analyses of the Big Five in Spanish and English. Journal of Personality and Social Psychology, 75, 729–750; and Morizot, J. (2014). Construct validity of adolescents’ self-reported big five personality traits: importance of conceptual breadth and initial validation of a short measure. Assessment, 21, 580–606.


S2 Table. Standardized Factor Loadings from the Exploratory Structural Equation Model of the BFPTSQ in Argentinian and Spanish Samples (M2b).

The columns “Argentina” and “Spain” correspond to the factor loadings of the M2b model (see Table 2). Bold denotes all the significant factor loadings (the 99% CI does not cross zero). λ = factor loadings; δ = uniquenesses.



This project was completed by the Cross-cultural Addictions Study Team (CAST), which includes the following investigators (in alphabetical order): Adrian J. Bravo, William & Mary, USA (Coordinating PI); James M. Henson, Old Dominion University, USA; Manuel I. Ibáñez, Universitat Jaume I de Castelló, Spain; Laura Mezquita, Universitat Jaume I de Castelló, Spain; Generós Ortet, Universitat Jaume I de Castelló, Spain; Matthew R. Pearson, University of New Mexico, USA; Angelina Pilatti, National University of Cordoba, Argentina; Mark A. Prince, Colorado State University, USA; Jennifer P. Read, University of Buffalo, USA; Hendrik G. Roozen, University of New Mexico, USA.


  1. Engel GL. The need for a new medical model: A challenge for biomedicine. Science. 1977;196:129–36. pmid:847460
  2. Kotov R, Gamez W, Schmidt F, Watson D. Linking “Big” personality traits to anxiety, depressive, and substance use disorders: A meta-analysis. Psychol Bull. 2010;136:768–821. pmid:20804236
  3. DeNeve KM, Cooper H. The happy personality: A meta-analysis of 137 personality traits and subjective well-being. Psychol Bull. 1998;124:197–229. pmid:9747186
  4. Morizot J. The contribution of temperament and personality traits to antisocial behavior development and desistance. In: The development of criminal and antisocial behavior: Theory, research and practical applications. New York: Springer; 2015. p. 137–65.
  5. Kuncel NR, Ones DS, Sackett PR. Individual differences as predictors of work, educational, and broad life outcomes. Pers Individ Dif. 2010;49:331–6.
  6. Soto CJ. How replicable are links between personality traits and consequential life outcomes? The life outcomes of personality replication project. Psychol Sci. 2019.
  7. John OP, Naumann LP, Soto CJ. Paradigm shift to the integrative big-five trait taxonomy: History, measurement, and conceptual issues. In: John OP, Robins RW, Pervin LA, editors. Handbook of personality: Theory and research. 3rd ed. New York: Guilford Press; 2008. p. 114–53.
  8. Morizot J. Construct validity of adolescents’ self-reported big five personality traits: importance of conceptual breadth and initial validation of a short measure. Assessment. 2014;21:580–606. pmid:24619971
  9. Barbaranelli C, Caprara GV, Rabasca A, Pastorelli C. A questionnaire for measuring the Big Five in late childhood. Pers Individ Dif. 2003;34:645–64.
  10. John OP, Donahue EM, Kentle RL. The Big Five Inventory-Versions 4a and 54. Berkeley, CA: University of California, Berkeley, Institute of Personality and Social Research; 1991.
  11. Gosling SD, Rentfrow PJ, Swann Jr. WB. A very brief measure of the Big-Five personality domains. J Res Pers. 2003;37:504–28.
  12. Donnellan MB, Oswald FL, Baird BM, Lucas RE. The Mini-IPIP scales: Tiny-yet-effective measures of the Big Five Factors of personality. Psychol Assess. 2006;18:192–203. pmid:16768595
  13. Saucier G. Orthogonal markers for orthogonal factors: The case of the Big Five. J Res Pers. 2002;36:1–31.
  14. McCrae RR, Costa Jr. PT. NEO Inventories for the NEO Personality Inventory-3 (NEO-PI-3), NEO Five-Factor Inventory-3 (NEO-FFI-3), NEO Personality Inventory-Revised (NEO-PI-R): Professional manual. Lutz, FL: Psychological Assessment Resources; 2010.
  15. Ortet G, Ibáñez MI, Moya J, Villa H, Viruela A, Mezquita L. Assessing the five factors of personality in adolescents: the junior version of the Spanish NEO-PI-R. Assessment. 2012;19:114–30. pmid:21622482
  16. Saucier G, Goldberg LR. Assessing the Big Five: Applications of 10 psychometric criteria to the development of marker scales. In: De Raad B, Perugini M, editors. Big five assessment. Ashland, OH US: Hogrefe & Huber Publishers; 2002. p. 30–54.
  17. Ortet G, Martínez T, Mezquita L, Morizot J, Ibáñez MI. Big Five Personality Trait Short Questionnaire: Preliminary validation with Spanish adults. Span J Psychol. 2017;20:E7. pmid:28181474
  18. D’Amico EJ, Tucker JS, Shih RA, Miles JNV. Does diversity matter? The need for longitudinal research on adolescent alcohol and drug use trajectories. Subst Use Misuse. 2014;49:1069–73. pmid:24779507
  19. Henrich J, Heine SJ, Norenzayan A. Most people are not WEIRD. Nature. 2010;466:29. pmid:20595995
  20. Spector PE, Liu C, Sanchez JI. Methodological and substantive issues in conducting multinational and cross-cultural research. Annu Rev Organ Psychol Organ Behav. 2015;2:101–31.
  21. Ziegler M, Kemper CJ, Kruyen P. Short scales–Five misunderstandings and ways to overcome them. J Individ Differ. 2014;35:185–9.
  22. Miech RA, Patrick ME, O’Malley PM, Johnston LD. The Influence of College Attendance on Risk for Marijuana Initiation in the United States: 1977 to 2015. Am J Public Health. 2017;107:996–1002. pmid:28426314
  23. Schulenberg JE, Johnston LD, O’Malley PM, Bachman JG, Miech RA, Patrick ME. Monitoring the Future national survey results on drug use, 1975–2017: Volume II, college students and adults ages 19–55. Ann Arbor: Institute for Social Research, The University of Michigan; 2018.
  24. January J, Madhombiro M, Chipamaunga S, Ray S, Chingono A, Abas M. Prevalence of depression and anxiety among undergraduate university students in low- and middle-income countries: a systematic review protocol. Syst Rev. 2018;7:57. pmid:29636088
  25. Bruffaerts R, Mortier P, Kiekens G, Auerbach RP, Cuijpers P, Demyttenaere K, et al. Mental health problems in college freshmen: Prevalence and academic functioning. J Affect Disord. 2018;225:97–103. pmid:28802728
  26. Bravo AJ, Villarosa-Hurlocker MC, Pearson MR, Protective Strategies Study Team. College student mental health: An evaluation of the DSM-5 Self-Rated Level 1 Cross-Cutting Symptom Measure. Psychol Assess. 2018;30:1382–9. pmid:30070557
  27. Bravo AJ, Pearson MR, Pilatti A, Mezquita L, Cross-Cultural Addictions Study Team. Negative marijuana-related consequences among college students in five countries: Measurement invariance of the Brief Marijuana Consequences Questionnaire. Addiction. 2019.
  28. Cohen J. A power primer. Psychol Bull. 1992;112:155–9.
  29. American Psychiatric Association. The DSM-5 Self-Rated Level 1 Cross-Cutting Symptom Measure–Adult. 2013.
  30. American Psychiatric Association. Manual diagnóstico y estadístico de los trastornos mentales. 5th ed. Madrid: Editorial Médica Panamericana; 2014.
  31. Narrow WE, Clarke DE, Kuramoto SJ, Kraemer HC, Kupfer DJ, Greiner L, et al. DSM-5 Field Trials in the United States and Canada, Part III: Development and Reliability Testing of a Cross-Cutting Symptom Assessment for DSM-5. Am J Psychiatry. 2013;170:71–82. pmid:23111499
  32. Mezquita L, Ibáñez MI, Moya J, Villa H, Ortet G. A longitudinal examination of different etiological pathways to alcohol use and misuse. Alcohol Clin Exp Res. 2014;38:1770–9. pmid:24797208
  33. Mezquita L, Bravo AJ, Ortet G, Pilatti A, Pearson MR, Ibáñez MI. Cross-cultural examination of different personality pathways to alcohol use and misuse in emerging adulthood. Drug Alcohol Depend. 2018;192:193–200. pmid:30268069
  34. Simons JS, Dvorak RD, Merrill JE, Read JP. Dimensions and severity of marijuana consequences: Development and validation of the Marijuana Consequences Questionnaire (MACQ). Addict Behav. 2012;37:613–21. pmid:22305645
  35. Pearson MR, Marijuana Outcomes Study Team. Marijuana Use Grid: A brief, comprehensive measure of marijuana use. Manuscript submitted for publication. 2019.
  36. Brinker JK, Dozois DJA. Ruminative thought style and depressed mood. J Clin Psychol. 2009;65:1–19. pmid:19048597
  37. Bravo AJ, Pearson MR, Pilatti A, Mezquita L, Ibáñez MI, Ortet G. Ruminating in English, Ruminating in Spanish: Psychometric Evaluation and Validation of the Ruminative Thought Style Questionnaire in Spain, Argentina, and USA. Eur J Psychol Assess. 2018.
  38. Muthén LK, Muthén BO. Mplus user’s guide. Eighth Edition. Muthén LK, Muthén BO, editors. Los Angeles, CA; 2018.
  39. Marsh HW, Ludtke O, Muthén B, Asparouhov T, Morin AJS, Trautwein U. A new look at the big-five factor structure through exploratory factor structural equation modeling. Psychol Assess. 2010;22:471–91. pmid:20822261
  40. West SG, Taylor AB, Wu W. Model fit and model selection in structural equation modeling. In: Hoyle RH, editor. Handbook of structural equation modeling. New York: Guilford Press; 2012. p. 209–31.
  41. Bentler PM. Comparative fit indexes in structural models. Psychol Bull. 1990;107:238–46. pmid:2320703
  42. Marsh HW, Hau K, Wen Z. In search of golden rules: Comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler’s (1999) findings. Struct Equ Model. 2004;11:320–41.
  43. MacCallum RC, Browne MW, Sugawara HM. Power analysis and determination of sample size for covariance structure modeling. Psychol Methods. 1996;1:130–49.
  44. Marsh HW, Nagengast B, Morin AJS. Measurement invariance of big-five factors over the life span: ESEM tests of gender, age, plasticity, maturity, and la dolce vita effects. Dev Psychol. 2013;49:1194–218. pmid:22250996
  45. Morin AJS, Marsh HW, Nagengast B. Exploratory Structural Equation Modeling. In: Hancock GR, Mueller RO, editors. Structural equation modeling: A second course (2nd ed) [Internet]. Charlotte, NC: Information Age Publishing, Inc; 2013. p. 269–314.
  46. Satorra A, Bentler PM. A scaled difference chi-square test statistic for moment structure analysis. Psychometrika. 2001;66:507–14.
  47. Brown TA. Confirmatory factor analysis for applied research. 2nd ed. New York: Guilford Press; 2015.
  48. Chen FF. Sensitivity of goodness of fit indexes to lack of measurement invariance. Struct Equ Model. 2007;14:464–504.
  49. Cheung G, Rensvold R. Evaluating goodness-of-fit indexes for testing measurement invariance. Struct Equ Model. 2002;9:233–55.
  50. Dunn TJ, Baguley T, Brunsden V. From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. Br J Psychol. 2014;105:399–412. pmid:24844115
  51. Hu LT, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Struct Equ Model. 1999;6:1–55.
  52. Ortet G, Mezquita L, Morizot J, Ibáñez MI. Assessment of the “Little” Big Five: The Spanish version of the Big Five Personality Traits Short Questionnaire in adolescents. Manuscript submitted for publication. 2019.
  53. Aluja A, García O, García LF. Relationships among extraversion, openness to experience, and sensation seeking. Pers Individ Dif. 2003;35:671–80.
  54. Roberti JW. A review of behavioral and biological correlates of sensation seeking. J Res Pers. 2004;38:256–79.
  55. Suárez-Alvarez J, Pedrosa I, Lozano LM, García-Cueto E, Cuesta M, Muñiz J. Using reversed items in likert scales: A questionable practice. Psicothema. 2018;30:149–58. pmid:29694314
  56. Millsap RE, Olivera-Aguilar M. Investigating measurement invariance using confirmatory factor analysis. In: Hoyle RH, editor. Handbook of Structural Equation Modeling. New York: Guilford; 2012. p. 209–31.
  57. Spector PE, Liu C, Sanchez JI. Methodological and substantive issues in conducting multinational and cross-cultural research. Annu Rev Organ Psychol Organ Behav. 2015;2:101–31.
  58. Fan X, Sivo SA. Sensitivity of fit indices to model misspecification and model types. Multivariate Behav Res. 2007;42:509–29.
  59. Fan X, Sivo SA. Using goodness-of-fit indexes in assessing mean structure invariance. Struct Equ Model. 2009;16:54–69.
  60. Malouff JM, Thorsteinsson EB, Schutte NS. The relationship between the Five-Factor Model of Personality and symptoms of clinical disorders: A meta-analysis. J Psychopathol Behav Assess. 2005;27:101–14.
  61. Ruiz MA, Pincus AL, Schinka JA. Externalizing pathology and the Five-Factor Model: A meta-analysis of personality traits associated with antisocial personality disorder, substance use disorder, and their co-occurrence. J Pers Disord. 2008;22:365–88. pmid:18684050
  62. Bogg T, Roberts BW. Conscientiousness and health-related behaviors: a meta-analysis of the leading behavioral contributors to mortality. Psychol Bull. 2004;130:887–919. pmid:15535742
  63. Allen J, Holder MD. Marijuana use and well-being in university students. J Happiness Stud. 2014;15:301–21.
  64. Flory K, Lynam D, Milich R, Leukefeld C, Clayton R. The relations among personality, symptoms of alcohol and marijuana abuse, and symptoms of comorbid psychopathology: Results from a community sample. Exp Clin Psychopharmacol. 2002;10:425–34. pmid:12498340
  65. Terracciano A, Löckenhoff CE, Crum RM, Bienvenu OJ, Costa PT Jr. Five-factor model personality profiles of drug users. BMC Psychiatry. 2008;8:22. pmid:18405382
  66. Gale CR, Booth T, Mõttus R, Kuh D, Deary IJ. Neuroticism and Extraversion in youth predict mental wellbeing and life satisfaction 40 years later. J Res Pers. 2013;47:687–97. pmid:24563560
  67. Steel P, Schmidt J, Shultz J. Refining the relationship between personality and subjective well-being. Psychol Bull. 2008;134:138–61. pmid:18193998
  68. Ozer DJ, Benet-Martínez V. Personality and the prediction of consequential outcomes. Annu Rev Psychol. 2006;57:401–21. pmid:16318601