A genome-wide association study (GWAS) of the personality constructs in CPAI-2 in Taiwanese Hakka populations

  • Pei-Ying Kao,

    Roles Formal analysis, Investigation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

  • Ming-Hui Chen,

    Roles Data curation

    Affiliation Department of Hakka Language and Social Science, National Central University, Taoyuan, Taiwan

  • Wei-An Chang,

    Roles Funding acquisition, Project administration, Supervision, Writing – original draft

    Affiliations Department of Humanities and Social Sciences, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, Research Center for Humanities and Social Sciences, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

  • Mei-Lin Pan,

    Roles Resources, Writing – original draft

    Affiliation Department of Humanities and Social Sciences, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

  • Wei-Der Shu,

    Roles Resources

    Affiliation Department of Humanities and Social Sciences, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

  • Yuh-Jyh Jong,

    Roles Conceptualization, Funding acquisition

    Affiliations Department of Biological Science and Technology, College of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan, Graduate Institute of Clinical Medicine, College of Medicine, Kaohsiung Medical University (KMU), Kaohsiung, Taiwan, Departments of Pediatrics and Laboratory Medicine, KMU Hospital, Kaohsiung, Taiwan

  • Hsien-Da Huang,

    Roles Conceptualization

    Affiliations Department of Biological Science and Technology, College of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan

  • Cheng-Yan Wang,

    Roles Data curation, Validation

    Affiliation Institute of Education, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

  • Hong-Yan Chu,

    Roles Resources

    Affiliation Research Center for Humanities and Social Sciences, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

  • Cheng-Tsung Pan,

    Roles Resources

    Affiliation Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan

  • Yih-Lan Liu,

    Roles Data curation, Methodology, Resources, Supervision, Validation, Writing – original draft

    Affiliation Institute of Education, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

  • Yeong-Shin Lin

    Roles Investigation, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliations Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, Department of Biological Science and Technology, College of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan


In this study, we used a genome-wide association study (GWAS) to investigate the genetic components of the personality constructs in the Chinese Personality Assessment Inventory 2 (CPAI-2) in Taiwanese Hakka populations, who are likely the descendants of a recent admixture between a group of Chinese immigrants with high emigration intention and a group of the Taiwanese aboriginal population generally without it. A total of 279 qualified participants were assessed and genotyped on an Illumina array with 547,644 SNPs to perform the GWAS. Although our small sample size unavoidably limits our statistical power (increasing Type 2 error but not Type 1 error), we still found three genomic regions showing strong association with Enterprise, Diversity, and Logical vs. Affective Orientation, respectively. Multiple genes around the identified regions have been reported to be related to the nervous system, which suggests that genetic variants underlying these personality traits are likely to lie in the nearby areas. It is likely that the recent immigration and admixture history of the Taiwanese Hakka people created strong linkage disequilibrium between the emigration intention-related genetic variants and their neighboring genetic markers, so that we could identify them despite our limited statistical power.


Genome-wide association studies (GWAS) have been commonly used for identifying disease-related single nucleotide polymorphisms (SNPs) in human populations (e.g., Manolio [1]; Pearson & Manolio [2]). Compared with the bi-parental quantitative trait loci (QTL) linkage mapping method, linkage disequilibrium (LD) mapping with GWAS provides more insight into the molecular genetic basis of traits [3]. Recently, the approach has also been extended to studies of psychology and personality (e.g., Montag et al. [4]). With these advanced techniques, geneticists can study not only the association between one personality trait and one gene, as was done decades ago (e.g., Ebstein et al. [5]; Schinka et al. [6]), but also the associations among various personality traits and various SNPs across the human genome (e.g., de Moor et al. [7]; Kim et al. [8]; Koshimizu et al. [9]; Luciano et al. [10]). Moreover, van den Berg et al. [11], within the Genetics of Personality Consortium, applied Item-Response Theory (IRT) to harmonize Neuroticism and Extraversion measures from different inventories. The method and the enlarged datasets were then used in GWAS analyses by several subsequent studies (e.g., de Moor et al. [12]; Lo et al. [13]; van den Berg et al. [14]). Going beyond the Big Five personality traits (Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism; also known as the Five-Factor Model, FFM), Kim et al. [15] further studied six distinct facets of Neuroticism to reveal their pathway-based associations.

The questionnaires used in the above GWAS studies include some traditional ones such as the Eysenck Personality Questionnaire [16], and mostly the FFM-related NEO Personality Inventories such as NEO-PI-3 [17] or NEO-PI-R [18]. Although some of the aforementioned GWAS studies included Eastern samples in their analyses (e.g., Kim et al. [8]; Kim et al. [15]), the personality questionnaires they adopted were developed from Western theories of personality with Western samples. When these measures were originally developed, it was assumed that personality traits are stable, biologically based dispositions and are therefore applicable across cultures. However, personality traits and the environment or culture are interdependent and mutually constitutive [19]. Although dispositional traits have biological bases, and cultures do not change an individual's genetic make-up, cultures may influence the ways dispositional traits are elaborated or reinforced during development and articulated or manifested across settings. Therefore, the assessment of personality across cultures should incorporate the measurement of both universal and culture-specific traits [20].

To deal with this problem, Cheung et al. [21] argued that combining emic and etic methods is a powerful way to integrate the universal and culture-specific aspects of constructs or theories with a balanced treatment. They believed that cross-cultural comparison would be more meaningful if the instruments are designed to capture similar phenomena in different cultures as well as the characteristics culturally relevant to the local settings. The Chinese Personality Assessment Inventory (CPAI) [21] was developed by adopting the combined etic and emic approach in the Chinese context. CPAI consists of both etic personality constructs, similar to those covered by the main Western personality theories, and emic personality constructs that are essential in the Chinese context. Because CPAI lacks an Openness construct, CPAI-2 was developed as a revision of CPAI that incorporates the Chinese indigenous openness concept [22]. A systematic cross-cultural comparison of the factor structures from Chinese samples (e.g., Hong Kong, Mainland China, and Taiwan), from other Asian samples (e.g., Japan, South Korea, Singapore, and Vietnam), and from Western samples (e.g., American, Dutch, and Romanian) consistently revealed that three of the CPAI factors (i.e., Social Potency, Dependability, and Accommodation) converged into the five factors of the NEO Five-Factor Inventory, whereas the Interpersonal Relatedness factor was independent of the FFM dimensions [23–25]. The Interpersonal Relatedness factor contains the emic personality constructs that complement the etic constructs of CPAI-2. Previous studies using CPAI/CPAI-2 have shown its practicality as a culture-relevant personality measure in the Chinese context [26]. In this study, we perform a GWAS to investigate personality in the current Hakka people in Taiwan. We adopted CPAI-2 with the aim of better describing the personality of these Hakka people.

Among the Chinese, the Hakka are an ethnic group well known for their history of large-scale migration. Migration and Hakka identity are two sides of the same coin. The name "Hakka" was given by the local Cantonese and means "guest". Leong [27] indicated the importance of migration and interaction with their neighbors to the formation of the Hakka. Similarly, Hoerder [28] suggested that internal Hakka migration in China up to the 1500s was one of the "large scale and long-distance migrations", comparable to the Mongol expansion or Manchu penetration of China. On the other hand, Wang [29] suggested that the Hakka's readiness to move is "characteristic of the Hakka migration pattern". In his view, the Hakka's adventurous and pioneering spirit is one of the features that propel them to move more readily than all the other dialect groups. He added that the Hakka "were accustomed to working in remote areas throughout their history in China" [29] and hence simply migrated when the need arose. Leo [30] summarized these two studies and claimed that "migration" should be considered a Hakka cultural marker, the driving force that shaped and reshaped Hakka identity.

Since the seventeenth century, and particularly during the eighteenth and nineteenth centuries, Hakka who came mainly from Guangdong agricultural communities emigrated to Taiwan, Malaya, and other regions of Southeast Asia, and as far as South Asia, Africa, Oceania, Europe, the Caribbean, and North and South America [31]. Among them, the Hakka who immigrated to Taiwan were mainly men. As the saying goes, only the father is from mainland China, and the mother is not. Marriage between the Chinese immigrants and the Taiwanese aboriginal population has thus been quite common since the invasion of Chinese, and the consequence is that a great number of Pepohoans (Plain’s barbarians) have become assimilated in the manner of living to the immigrated Chinese peasants [32]. Strictly speaking, the current Hakka people in Taiwan are likely the descendants of a recent admixture between a group of immigrants with high emigration intention and a group of the aboriginal population generally without it.

Boneva and Frieze [33] proposed the term “migrant personality” to refer to the psychological characteristics of individuals prone to emigration. They showed that people who (1) are willing to do something challenging and unique (high achievement motive), (2) are willing to take risks and endure dangers in reaching their goals (high power motive), and (3) show fewer concerns about emotional relationships or social networks with family and friends (low affiliative motive) tend to have emigration intentions or behaviors. If personality, especially the aspects related to emigration intention, is indeed affected by genetic factors, we might find some of these phenotypic polymorphisms and the corresponding genetic polymorphisms in the Taiwanese Hakka people because of their special and recent immigration and admixture history. The associations between the phenotypic polymorphisms and the genetic polymorphisms of nearby markers should have largely been maintained, because the time elapsed since admixture is too short for genetic recombination to break down their genetic linkages. Here we use these Hakka people to perform the GWAS. Since culture cannot change the genetic make-up of personality [34], we would like to know whether we can identify the genetic components of some personality constructs in CPAI-2 that were not previously observed using the traditional Big Five personality measures. To the best of our knowledge, this is the first study to investigate the biological basis of CPAI-2.



The current study is a sub-project of a large-scale cross-disciplinary research project, which predominantly aimed to investigate the genetic origins of Taiwanese Hakka populations. For that purpose, the participants we recruited were restricted to individuals whose parents, maternal grandparents, and paternal grandparents all speak the same Hakka dialect. The dialects included are North Sixian, South Sixian, Hailu, Dabu, and Zhao’an, which are the most common Hakka dialects in Taiwan. We used this stringent criterion to ensure that the collected samples are sufficiently representative and not disturbed by the frequent inter-population marriages of recent decades. Once an eligible participant was recruited, his or her relatives (based on the family genealogy provided by the participant) were excluded from further recruitment. Saliva samples and personality questionnaire data were collected from the 288 recruited participants (159 males and 129 females). Our study protocols were approved by the Research Ethics Committee for Human Subject Protection of National Chiao Tung University/IRB. All participants signed informed consent documents. Institutional and government regulations on research ethics were strictly followed in this study.

Personality measurements

The Chinese Personality Assessment Inventory 2 (CPAI-2) [22] was used as the personality measure in this study. This inventory consists of 22 normal personality scales and 3 validity scales. For this study, we used 19 personality scales including the Social Potency factor (i.e., Novelty, Diversity, Divergent Thinking, Leadership, Logical vs. Affective Orientation, Extraversion vs. Introversion, and Enterprise), Dependability factor (i.e., Responsibility, Emotionality, Interiority vs. Self-Acceptance, Optimism vs. Pessimism, Meticulousness, and Family Orientation), Accommodation factor (i.e., Veraciousness vs. Slickness), and Interpersonal Relatedness factor (i.e., Traditionalism vs. Modernity, Ren Qing, Social Sensitivity, Discipline, and Thrift vs. Extravagance). Five samples with incomplete questionnaire data were removed. CPAI-2 is a self-report measure in which participants respond to descriptions of behavior in a true-false format. The sum scores for each scale were subsequently included in the GWAS analyses. We conducted a pilot study to examine its validity and reliability with 349 samples (mean age = 37.73, SD = 18.74). The items with no discriminant validity were deleted. The Cronbach’s α values of the 19 personality scales ranged between 0.49 and 0.79 (the median Cronbach’s α was 0.65) in the current study.
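As an illustration of the reliability statistic reported above, the following minimal sketch computes Cronbach's α for one scale from true-false item responses, using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of sum scores). The function name and the toy responses are hypothetical and are not taken from the study data.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: one list per participant, each holding that participant's
    item scores (1 = "true", 0 = "false") for the k items of the scale.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across participants.
    item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
    # Variance of the per-participant sum scores.
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("sum scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 5 participants, 4 true-false items.
toy_responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
alpha = cronbach_alpha(toy_responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

The same variance estimator is used for the items and for the sum scores, so the ratio inside the formula does not depend on whether population or sample variance is chosen.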


A total of 547,644 SNPs were genotyped on the Illumina HumanCoreExome-24 v1.0 array from 288 samples in this study. Both the standard cluster and the chromosome X cluster analyses were conducted using GenomeStudio 2.0 [35] to genotype these SNPs. Four samples with insufficient genotyped SNPs (< 95%) were removed. A two-step quality control with six filters was performed. The first step, carried out in R, removes (1) insertion/deletion polymorphisms (12,524 SNPs removed); (2) genotyped variants with the Chr. marker labeled as “0” (without chromosome information; 803 SNPs removed), as “X”, “Y”, or “XY” (sex chromosomes; 15,085 SNPs removed), or as “MT” (mitochondrial genome; 369 SNPs removed); (3) variants with missing rate > 5% (too many samples were not genotyped); and (4) positional duplicates (array SNPs with the same genomic position). A total of 11,364 SNPs were removed in (3) and (4). The second step, carried out in PLINK [36], removes (5) variants with Hardy-Weinberg equilibrium (HWE) p value < 1 × 10⁻⁵ (a deviation from HWE that is probably due to genotyping errors; 74 SNPs removed); and (6) variants with minor allele frequency (MAF) < 1% (without sufficient statistical power; 246,868 SNPs removed). A total of 260,557 SNPs were thus retained for further analyses.
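The filter counts above can be checked with a short tally. The sketch below also shows, on hypothetical genotype calls, how a per-SNP call rate and minor allele frequency could be computed under an assumed 0/1/2 allele-count coding with `None` for a missing call. The helper name and the toy calls are illustrative assumptions and do not reproduce the authors' R or PLINK commands.

```python
# Counts reported in the text for each quality-control filter.
genotyped = 547_644
removed = {
    "indels": 12_524,
    "no chromosome information (Chr. 0)": 803,
    "sex chromosomes (X, Y, XY)": 15_085,
    "mitochondrial genome (MT)": 369,
    "missing rate > 5% or positional duplicate": 11_364,
    "HWE p value < 1e-5": 74,
    "MAF < 1%": 246_868,
}
retained = genotyped - sum(removed.values())
print(f"SNPs retained: {retained:,}")


def call_rate_and_maf(calls):
    """Call rate and minor allele frequency for one SNP.

    calls: per-sample allele counts (0, 1, or 2); None marks a missing call.
    This coding is an assumption made for the illustration.
    """
    observed = [c for c in calls if c is not None]
    if not observed:
        raise ValueError("no genotyped samples for this SNP")
    call_rate = len(observed) / len(calls)
    allele_freq = sum(observed) / (2 * len(observed))
    return call_rate, min(allele_freq, 1 - allele_freq)


# Hypothetical SNP: four samples, one missing call.
toy_call_rate, toy_maf = call_rate_and_maf([0, 0, 1, None])
# Filters (3) and (6) from the text: missing rate > 5% or MAF < 1% fails.
passes = (1 - toy_call_rate) <= 0.05 and toy_maf >= 0.01
```

The tally reproduces the 260,557 retained SNPs stated in the text.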

Genome-wide association testing

For each personality trait at each SNP locus, we used PLINK [36] to perform a linear regression analysis and calculate its p value, in order to test whether the personality scores of the 279 participants are related to their genotypes. The standard genome-wide significance threshold of p = 5 × 10⁻⁸ was used. Quantitative trait association was conducted for the 19 questionnaire scores, and the results were plotted as Manhattan and quantile-quantile plots with R. We also used LocusZoom [37] to draw the regional association plots. The power and sample size calculations were performed using genpwr [38] with linear and additive models.
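The per-SNP test described above can be sketched as an ordinary least-squares regression of a scale score on the allele count. The sketch assumes additive 0/1/2 coding and uses hypothetical toy data; it returns the slope, Pearson correlation, and t statistic but deliberately stops short of the p value, which requires the t distribution and is what PLINK reports. It is not the authors' pipeline.

```python
import math


def simple_linear_regression(x, y):
    """Least-squares fit of y on x; returns (slope, intercept, r, t_stat)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype or score has no variation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    # t statistic for the slope, with n - 2 degrees of freedom.
    t_stat = r * math.sqrt((n - 2) / (1 - r * r))
    return slope, intercept, r, t_stat


# Hypothetical toy data: allele counts and scale sum scores for 6 people.
genotypes = [0, 0, 1, 1, 2, 2]
scores = [8, 10, 7, 7, 4, 6]
slope, intercept, r, t_stat = simple_linear_regression(genotypes, scores)
print(f"beta = {slope:.3f}, r = {r:.3f}, t = {t_stat:.3f}")

# Height of the 5e-8 significance line on a Manhattan plot (-log10 scale).
threshold_height = -math.log10(5e-8)
print(f"-log10(5e-8) = {threshold_height:.2f}")
```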

Results and discussion

The identified genomic regions

The distributions of the 19 personality scales for the 279 qualified participants are shown in S1 Fig, and the Manhattan plots for the 19 personality scales are shown in Fig 1. The first significant result is a region flanked by UTRN and EPM2A on chromosome 6 that is strongly associated with the personality scale Enterprise (Fig 2A and Table 1). The protein encoded by UTRN is utrophin, which may play roles at neuromuscular junctions [39] and provide structural support for the neuronal membrane [40]. On the other hand, mutations of EPM2A may cause the malfunction of its encoded phosphatase, Laforin, leading to the accumulation of aberrant glycogen or polyglucosans, which become neurotoxic [41]. This neurodegenerative disorder is referred to as Lafora disease. EPM2A knockout mice were shown to have abnormal motor coordination, hind limb clasping, and even episodic memory deficits [41].

Fig 1. The Manhattan plots of the probability values in linear regression analysis for the 19 personalities.

The horizontal axis in each plot represents the SNP location from chromosome 1 to chromosome 22, while the vertical axis indicates the significance level of that SNP (expressed as −log10 of the p value).

Fig 2. Regional association plots around the three most significant SNPs.

(A) SNP rs566661 on chromosome 6 with Enterprise; (B) SNP rs1267992 on chromosome 6 with Diversity; (C) SNP rs12503435 on chromosome 4 with Logical vs. Affective Orientation. SNPs were plotted according to their p values in the linear regression analysis (see Fig 3 for examples) with the corresponding personality scales. The purple diamonds labeled with SNP names are the most significant SNPs. The colors of the other circles represent their linkage disequilibrium (r²) with the named top SNP. The genomic information shown below each plot was derived from human genome assembly GRCh37 (hg19) from the Genome Reference Consortium.

Table 1. SNPs showing significant associations with one of the examined personality traits.

The second significant result is a region located in the gene NKAIN2 on chromosome 6 (Fig 2B and Table 1) that is strongly associated with Diversity. NKAIN2 was previously reported to be related to nervous system development [42]. One previous GWAS study showed that this gene is associated with Neuroticism [43], while another proposed that it is associated with Extraversion [8]. These results suggest that NKAIN2 may have various effects on the development of personality.

The final significant result we found is a region flanked by RAPGEF2 and FSTL5 located on chromosome 4 that is strongly associated with Logical vs. Affective Orientation (Fig 2C and Table 1). The protein encoded by RAPGEF2 is a neural-specific activator of Rap proteins [44, 45]. It is up-regulated in Alzheimer’s disease patients’ hippocampus and may play roles in Aβ oligomer-induced synaptic and cognitive degeneration [46]. FSTL5 is expressed in cortical neurons [47], specifically in the hippocampus CA3 region and the cerebellum granular cell layer [48], and is important in synaptic transmission and plasticity [49]. It may also be related to obsessive-compulsive personality disorder [50].

The interpretation of our GWAS findings

The classical QTL mapping methods usually involve a hybridization process between individuals with distinct phenotypes for the feature of interest. As long as the frequencies of both the phenotype-causing variant and its neighboring genetic markers differ substantially between the two intermixing groups, linkage disequilibrium will be created even though each group was originally in linkage equilibrium. The joint segregation of phenotypic values and genetic markers distributed across the genome can thus be examined. In our study, the recent interethnic marriage between a group of immigrants (with high emigration intention) and a group of the aboriginal population (generally without the emigration intention) might coincidentally represent such a hybridization process, creating a situation similar to artificial inter-cross populations and providing a superior opportunity to identify SNPs related to emigration intentions and behaviors. Success in mapping QTL in natural hybrid crosses is mainly determined by two factors: the sample size [51–53] and the time elapsed since hybridization [54]. A larger sample size provides greater statistical power, while a longer duration increases the chance that the genetic linkage between the phenotype-causing variants and the genetic markers has been broken by recombination events.
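The two points above, that admixture creates linkage disequilibrium and that recombination erodes it over time, can be illustrated with a standard two-locus population-genetic calculation. The sketch assumes a single admixture event between two populations that are each in linkage equilibrium, followed by random mating; all frequencies, the admixture proportion, and the recombination fractions are hypothetical and are not estimates for the Taiwanese Hakka.

```python
def admixture_ld(m, pa1, pa2, pb1, pb2):
    """LD coefficient D right after mixing two populations.

    m: proportion of the mixed population drawn from population 1.
    pa1, pa2: frequency of allele A (causal locus) in populations 1 and 2.
    pb1, pb2: frequency of allele B (marker locus) in populations 1 and 2.
    Each source population is assumed to be in linkage equilibrium.
    """
    hap_ab = m * pa1 * pb1 + (1 - m) * pa2 * pb2  # frequency of haplotype AB
    pa = m * pa1 + (1 - m) * pa2
    pb = m * pb1 + (1 - m) * pb2
    return hap_ab - pa * pb  # equals m(1-m)(pa1-pa2)(pb1-pb2)


def ld_after(d0, recomb_fraction, generations):
    """D after a number of generations of random mating."""
    return d0 * (1 - recomb_fraction) ** generations


# Hypothetical example: equal mixing, strongly differentiated frequencies.
d0 = admixture_ld(m=0.5, pa1=0.8, pa2=0.2, pb1=0.7, pb2=0.1)
tight = ld_after(d0, recomb_fraction=0.001, generations=15)  # closely linked
loose = ld_after(d0, recomb_fraction=0.5, generations=15)    # unlinked
print(f"D0 = {d0:.3f}, linked = {tight:.4f}, unlinked = {loose:.2e}")
```

In this toy setting, D between closely linked loci is almost unchanged after 15 generations, whereas D between unlinked loci has essentially vanished, which is the pattern the argument about recent admixture relies on.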

In recent years, most GWAS studies have put their efforts into enlarging the sample size to increase the statistical power and the chance of successfully identifying SNPs associated with traits of interest. For example, thousands to several hundred thousand individuals were usually included in the previously reviewed GWAS studies [7–10, 12–15], compared with only 279 qualified participants in our study. Thus, some may argue that the sample size in our study is too limited. However, a small sample size increases only Type 2 error, and thus reduces the statistical power, but has nothing to do with Type 1 error [55]. In other words, a limited sample size only reduces the chance of identifying the character-related SNPs, but does not cause a SNP that is actually unrelated to the character to be falsely recognized. When a statistical test is performed, the sample size has already been taken into account. As long as the test is significant, the result should be reliable and not falsely generated; moreover, the impact of the identified variant must be strong to compensate for the small sample size, and thus the identified variants should be relatively more important than those identified from a large sample set [55].

In our study, despite the small sample size that results from our stringent recruitment criterion and unavoidably limits our statistical power, we still found three genomic regions showing strong association with personality scales that might be related to emigration intentions and behaviors (Table 1), thanks to the special and recent immigration and admixture history of the Taiwanese Hakka people [32]. Our results suggest that selecting an appropriate target population is also a crucial task. We used the R² values and genpwr [38] with linear and additive models to calculate the power for identifying the three most significant SNPs, rs566661, rs1267992, and rs12503435 (Fig 3). The calculated values are 0.597, 0.571, and 0.049, respectively (S2 Fig). These values are not high and imply that there might be many other personality-related variants remaining unidentified due to our limited statistical power. The heritability contributed by each of these single variants can also be estimated from its allele frequency and its effect size (defined as the regression coefficient) [56]. The estimated heritability values are 0.113, 0.105, and 0.050, which supports the strong impacts of the identified variants on the analyzed Taiwanese Hakka populations.
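As a rough cross-check of these magnitudes, the sketch below uses a common approximation for the proportion of trait variance explained by a single additive SNP under Hardy-Weinberg equilibrium, 2p(1 − p)β² / Var(y). This is an assumption for illustration: the exact estimator of [56] is not reproduced in this section, and the allele frequency and trait variance in the example call are hypothetical. The second part simply squares the Pearson correlations reported in the Fig 3 caption, since R² of a single-predictor regression is the share of variance explained.

```python
def variance_explained(beta, maf, trait_variance):
    """Approximate share of trait variance explained by one additive SNP.

    Assumes Hardy-Weinberg equilibrium, so that the variance of the
    0/1/2 allele count is 2 * maf * (1 - maf).
    """
    return 2 * maf * (1 - maf) * beta ** 2 / trait_variance


# Hypothetical inputs (not taken from the study).
example = variance_explained(beta=-1.0, maf=0.25, trait_variance=3.0)
print(f"hypothetical variance explained = {example:.3f}")

# Pearson correlations as reported in the Fig 3 caption.
reported_r = {"rs566661": 0.331, "rs1267992": 0.327, "rs12503435": 0.225}
r_squared = {snp: r ** 2 for snp, r in reported_r.items()}
for snp, value in r_squared.items():
    print(f"{snp}: R^2 = {value:.3f}")
```

The squared correlations come out near 0.110, 0.107, and 0.051, close to but not identical to the reported heritability values of 0.113, 0.105, and 0.050; the small differences presumably reflect details of the estimator that are not given here.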

Fig 3. The linear regressions between the personality score and genotype for the three most significant SNPs in Fig 2.

(A) SNP rs566661 on chromosome 6 with Enterprise. The Pearson correlation coefficient (R) is 0.331 and the regression coefficient (β, the slope) is -1.127; (B) SNP rs1267992 on chromosome 6 with Diversity (R = 0.327 and β = -0.675); (C) SNP rs12503435 on chromosome 4 with Logical vs. Affective Orientation (R = 0.225 and β = 1.078). The personality score and genotype of each participant were used to draw the swarm plots and obtain the regression lines.

It should be noted that the SNPs identified by GWAS are not necessarily the phenotype-causing variants themselves. In most cases, the association between the SNP and the feature is due to linkage disequilibrium. Moreover, the phenotype-causing variants may be located in regulatory regions rather than coding regions, and those regulatory regions may still be unknown. Therefore, we can only suggest that the phenotype-causing variant lies somewhere in the neighboring area, not in a specific gene. Since multiple genes around the identified regions were reported to be related to the nervous system or expressed in neural tissues, it is very likely that the personality scales we examined are indeed influenced by some variants in these regions.

The personalities identified in this study might correlate with the intention to emigrate

Among our identified results, Enterprise, Diversity, and Logical vs. Affective Orientation all belong to the Social Potency factor in CPAI-2 [22]. Enterprise is defined as the degree to which individuals are ready to take risks; Diversity is defined as the degree to which individuals have diverse interests; and Logical vs. Affective Orientation is defined as the degree to which individuals are objective or subjective in their thinking and behavior [22]. When these three personality scales were jointly factor analyzed with the NEO-PI-R facets [23], Diversity and Logical vs. Affective Orientation loaded on the Openness factor, and Enterprise loaded on both the Neuroticism and Openness factors. In fact, Diversity and Logical vs. Affective Orientation are also the two personality scales that deviate most in the quantile-quantile plots (S3 Fig). Considering the special and recent immigration and admixture history of the Taiwanese Hakka people [32], we speculate that the successful identification of these three personality traits is probably related to the emigration intentions and behaviors of their ancestors.

Among the Big Five personality traits (Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism), several previous studies have suggested that openness to experience is the most consistent predictor of emigration intention [57–61]. McCrae and Costa [62] described open people as “are adventurous, bored by familiar sights, and stifled by routine”, and these characteristics are critical for self-selected migration. In contrast, conscientiousness was found to be stably associated with a lower desire to migrate [57, 58, 61, 63]. Conscientious people, who tend to control impulses, comply with societal norms, and feel responsible for the home country, have fewer intentions to leave [63]. Moreover, highly agreeable people were less likely to migrate because they have formed a stronger attachment to their homeland [64, 65]. On the other hand, individuals who are extraverted and more emotionally stable displayed a stronger intention to migrate or work abroad [57–59, 61, 66]. Migration is fundamentally a bold move, and extraverted individuals are also active and have positive emotions [62]. They are more likely to interact socially with strangers and adjust better to the new land [59]. Beyond the Big Five personality traits, Tabor et al. [61] further indicated that persistence and patience are also important characteristics of migrants, because only the most persistent people would be able to make it through the migration process.

Our result is generally consistent with these studies [57–61] in that openness to experience is likely the most critical personality trait for emigration intentions and behaviors. We further suggest that the Social Potency factor should be the one most correlated with migrant personality in CPAI-2. However, it should be emphasized that, although the personality scales mentioned above all belong to the Openness and Social Potency factors, our results indicate that different facets of the same factor are controlled by different genetic variants (Table 1). For example, each identified SNP is significantly associated with one of the personality traits but not with the other two (Table 1). This implies that these genetic variants do not control the whole personality factor. Instead, each independently controls only one of the facets within a personality factor defined in CPAI-2.

The reliability of our GWAS findings

Independent statistical replication is frequently used to explore the reproducibility and reliability of GWAS studies [67]. Nevertheless, splitting the obtained genetic data into discovery and replication sets is not recommended, because it may decrease the statistical power and, moreover, the replication set has been argued to be not truly independent; therefore, most GWAS studies use data from mega-biobanks or meta-analyses of many other studies to perform the replication [67]. However, most GWAS studies and mega-biobanks are population-specific, and the examined traits are mostly restricted to common phenotypes, which may leave studies of non-European ancestry with seldom investigated traits without available replication data and, in consequence, lead to GWAS publication bias [67]. As expected, because ours is the first study to investigate the genetic basis of CPAI-2, and given the special immigration and admixture history of the Taiwanese Hakka people, no suitable external replication cohort is available. This is inevitable for pioneer studies. In this circumstance, we instead pursued other biological lines of evidence to assess the reliability of our identified results, since such evidence may be considered in the same vein as replication [67].

The first line of evidence comes from the finding that all three identified regions are either located in a neural gene or flanked by neural genes. It was recently reported that many adjacent neural genes are co-expressed and that this co-expression is mainly controlled by distinct enhancers in the shared extended intergenic regions between these neighboring neural genes [68]. These extremely long intergenic regions of neural genes may contain numerous cell- and tissue-specific cis-regulatory elements, which could provide an enormous amount of regulatory information to orchestrate the complex and diverse gene expression patterns required in the mammalian nervous system [68]. If the personality scales we examined indeed have a genetic basis, their corresponding human behaviors should be controlled by the nervous system, and thus these cis-regulatory elements should play important roles and have the potential to serve as the phenotype-causing variants.

Additional support comes from the fact that NKAIN2, the gene associated with Diversity in this study, was also reported to be associated with Neuroticism [43] and Extraversion [8] in previous GWAS. This can be regarded as a different kind of replication: all three studies independently point to an important role of NKAIN2 in the development of personality, although they identified different personality traits and different genetic markers, which suggests that NKAIN2 has diverse functions in human behavior. Taken together, these observations make it unlikely that our GWAS findings arose by chance alone. The identified associations should thus provide valuable insights into the genetic mechanisms underlying human behavior and personality.

The mainlanders, islanders, and migrants

Camperio Ciani and colleagues [57, 58] compared personality between island and mainland populations and found that mainlanders and immigrants from the mainland are significantly more open to experience than original islanders. They proposed that the difference was generated by a strong, non-random emigration flow from the islands: open people tend to emigrate, so the sedentary islanders who remain are comparatively less open. In our study, we did not directly compare the Chinese population with the Taiwanese aboriginal population (a comparison made infeasible by frequent inter-population marriage in recent decades), but analyzed only their admixed descendants. We can therefore only speculate, not conclude, that the traits and alleles related to emigration intentions and behaviors were carried by the early Chinese immigrants. Nevertheless, we identified three genomic regions significantly associated with potential migrant personalities. This result is consistent with the expectation derived from Camperio Ciani and Capiluppi [58]: if the personality differences between immigrants and original islanders have a genetic basis, their admixture would generate a group of individuals with varied personality tendencies, which is especially well suited to GWAS.


It should be noted that the methods and strategies we used are not specific to the analysis of migrant personality. The key point is that the Taiwanese Hakka sample has a distinctive history of immigration and admixture over the past 400 years [32]. Any genetic variants, especially those that differ between the immigrant and aboriginal populations, have the potential to be identified through our GWAS analyses. Among them, features related to emigration intentions and behaviors should show the largest discrepancy and therefore have the highest chance of being detected, regardless of whether the difference derives from the immigrants’ high tendency or from the sedentary islanders’ low tendency. With a larger sample and correspondingly greater statistical power, we may find additional SNPs associated with personality traits unrelated to migration. For SNPs with effect sizes similar to those of the three SNPs identified in our study, power would reach 0.9 once the sample sizes were enlarged to 400, 400, and 900, respectively (S2 Fig). It would be interesting to further investigate the genetic components of CPAI-2-specific factors such as Interpersonal Relatedness in Taiwanese populations.
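
The relation between sample size and power can be illustrated with a minimal sketch. It is not the calculation behind S2 Fig. It assumes a single-SNP additive linear model tested with one degree of freedom, a normal approximation to the test statistic, and the conventional genome-wide significance threshold of 5e-8. The function names and the variance-explained values (`r2`) are hypothetical placeholders, not estimates from this study.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def gwas_power(n, r2, alpha=5e-8):
    """Approximate power of a two-sided 1-df association test for one SNP.

    n     : sample size
    r2    : fraction of trait variance explained by the SNP (hypothetical input)
    alpha : significance threshold (5e-8 is an assumed genome-wide cutoff)
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    # Mean of the test statistic Z under the alternative hypothesis.
    delta = sqrt(n * r2 / (1.0 - r2))
    # Critical value of |Z| for a two-sided test at level alpha.
    z_crit = -_STD_NORMAL.inv_cdf(alpha / 2.0)
    # Probability that |Z| exceeds the critical value when Z ~ N(delta, 1).
    return _STD_NORMAL.cdf(delta - z_crit) + _STD_NORMAL.cdf(-delta - z_crit)


def min_sample_size(r2, target_power=0.9, alpha=5e-8, n_max=10_000_000):
    """Smallest n whose approximate power reaches target_power."""
    if r2 <= 0.0:
        raise ValueError("r2 must be positive to reach any target power")
    lo, hi = 1, 2
    # Double the upper bound until it is large enough.
    while gwas_power(hi, r2, alpha) < target_power:
        lo, hi = hi, hi * 2
        if hi > n_max:
            raise ValueError("target power not reached below n_max")
    # Bisection: power is monotonically increasing in n.
    while lo < hi:
        mid = (lo + hi) // 2
        if gwas_power(mid, r2, alpha) >= target_power:
            hi = mid
        else:
            lo = mid + 1
    return hi


if __name__ == "__main__":
    # Hypothetical effect sizes, chosen only to show the trend.
    for r2 in (0.10, 0.05, 0.01):
        print(f"r2={r2:.2f}: n for power 0.9 = {min_sample_size(r2)}")
```

Under these assumptions, the required sample size grows roughly in inverse proportion to the variance explained, which is why SNPs with smaller effects need substantially larger cohorts to reach the same power.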

Supporting information

S1 Fig. The distribution patterns of the 19 personality traits.


S2 Fig. The plots of power by sample sizes for the three most significant SNPs shown in Fig 2.


S3 Fig. The quantile-quantile plots for the 19 personality traits.

The horizontal and vertical axes in each plot represent the theoretical and observed p-values on a -log10 scale, respectively.
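
The axes described for S3 Fig can be computed as in the following sketch. The plotting position (rank - 0.5)/n is an assumption, since the article does not state which convention was used, and `qq_points` is a hypothetical helper name.

```python
from math import log10


def qq_points(p_values):
    """Return (theoretical, observed) -log10 p-value pairs for a QQ plot.

    Theoretical quantiles assume p-values are uniform on (0, 1] under the null.
    """
    observed = sorted(p_values)
    n = len(observed)
    if any(not 0.0 < p <= 1.0 for p in observed):
        raise ValueError("p-values must lie in (0, 1]")
    points = []
    for rank, p in enumerate(observed, start=1):
        expected_p = (rank - 0.5) / n  # assumed plotting position
        points.append((-log10(expected_p), -log10(p)))
    return points
```

Points that follow the diagonal indicate agreement with the null distribution, while an upward departure at the right-hand end corresponds to SNPs with smaller p-values than expected by chance.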



Acknowledgments

We are deeply grateful to all the study participants and to the numerous colleagues and friends who assisted with the recruitment process. We thank Dr. Jenn-Kang Hwang for inspiring and initiating the project and Dr. Yan-Hwa Wu Lee for continued support. We also thank Dr. Tsun-Tsao Huang for hardware support and the Research Center for Humanities and Social Sciences, National Yang Ming Chiao Tung University, for administrative support.


References

  1. Manolio TA. Genomewide association studies and assessment of the risk of disease. New England Journal of Medicine. 2010; 363(2):166–176. pmid:20647212
  2. Pearson TA, Manolio TA. How to interpret a genome-wide association study. JAMA: Journal of the American Medical Association. 2008; 299(11):1335–1344. pmid:18349094
  3. Alqudah AM, Sallam A, Baenziger PS, Börner A. GWAS: Fast-forwarding gene identification and characterization in temperate Cereals: lessons from Barley–A review. Journal of Advanced Research. 2020; 22:119–135. pmid:31956447
  4. Montag C, Ebstein RP, Jawinski P, Markett S. Molecular genetics in psychology and personality neuroscience: On candidate genes, genome wide scans, and new research strategies. Neuroscience & Biobehavioral Reviews. 2020; 118:163–174. pmid:32681937
  5. Ebstein RP, Novick O, Umansky R, Priel B, Osher Y, Blaine D, et al. Dopamine D4 receptor (D4DR) exon III polymorphism associated with the human personality trait of novelty seeking. Nature Genetics. 1996; 12(1):78–80. pmid:8528256
  6. Schinka JA, Letsch EA, Crawford FC. DRD4 and novelty seeking: Results of meta-analyses. American Journal of Medical Genetics. 2002; 114(6):643–648. pmid:12210280
  7. de Moor MHM, Costa PT Jr, Terracciano A, Krueger RF, de Geus EJC, Toshiko T, et al. Meta-analysis of genome-wide association studies for personality. Molecular Psychiatry. 2012; 17(3):337–349. pmid:21173776
  8. Kim HN, Roh SJ, Sung YA, Chung HW, Lee JY, Cho J, et al. Genome-wide association study of the five-factor model of personality in young Korean women. Journal of Human Genetics. 2013; 58(10):667–674. pmid:23903073
  9. Koshimizu H, Nogawa S, Asano S, Ikeda M, Iwata N, Takahashi S, et al. Genome-wide association study identifies a novel locus associated with psychological distress in the Japanese population. Translational Psychiatry. 2019; 9(1):52. pmid:30705256
  10. Luciano M, Hagenaars SP, Davies G, Hill WD, Clarke TK, Shirali M, et al. Association analysis in over 329,000 individuals identifies 116 independent variants influencing neuroticism. Nature Genetics. 2018; 50(1):6–11. pmid:29255261
  11. van den Berg SM, de Moor MHM, McGue M, Pettersson E, Terracciano A, Verweij KJH, et al. Harmonization of Neuroticism and Extraversion phenotypes across inventories and cohorts in the Genetics of Personality Consortium: an application of Item Response Theory. Behavior Genetics. 2014; 44(4):295–313. pmid:24828478
  12. de Moor MHM, van den Berg SM, Verweij KJH, Krueger RF, Luciano M, Arias Vasquez A, et al. Meta-analysis of genome-wide association studies for neuroticism, and the polygenic association with major depressive disorder. JAMA psychiatry. 2015; 72(7):642–650. pmid:25993607
  13. Lo MT, Hinds DA, Tung JY, Franz C, Fan CC, Wang Y, et al. Genome-wide analyses for personality traits identify six genomic loci and show correlations with psychiatric disorders. Nature Genetics. 2017; 49(1):152–156. pmid:27918536
  14. van den Berg SM, de Moor MHM, Verweij KJH, Krueger RF, Luciano M, Arias Vasquez A, et al. Meta-analysis of genome-wide association studies for Extraversion: Findings from the genetics of personality consortium. Behavior Genetics. 2016; 46(2):170–182. pmid:26362575
  15. Kim SE, Kim HN, Yun YJ, Heo SG, Cho J, Kwon MJ, et al. Meta-analysis of genome-wide SNP-and pathway-based associations for facets of neuroticism. Journal of Human Genetics. 2017; 62(10):903–909. pmid:28615674
  16. Eysenck HJ, Eysenck SBG. Manual of the Eysenck Personality Questionnaire (junior and adult). Hodder and Stoughton Educational; 1975.
  17. McCrae RR, Costa PT Jr. Personality in Adulthood. Guilford Press; 1990.
  18. Costa PT Jr, McCrae RR. The Revised NEO personality inventory (NEO-PI-R) and NEO five-factor inventory (NEO-FFI). Odessa, FL: Psychological Assessment Resources; 1992.
  19. Shweder RA, Sullivan MA. Cultural psychology: Who needs it? Annual Review of Psychology. 1993; 44(1):497–523.
  20. Church AT. Personality measurement in cross-cultural perspective. Journal of Personality. 2001; 69(6):979–1006. pmid:11767826
  21. Cheung FM, Leung K, Fan RM, Song WZ, Zhang JX, Zhang JP. Development of the Chinese personality assessment inventory. Journal of Cross-Cultural Psychology. 1996; 27(2):181–199.
  22. Cheung FM, Leung K, Song WZ, Zhang JX. The cross-cultural (Chinese) personality assessment inventory-2 (CPAI-2). FM Cheung, Department of Psychology, The Chinese University of Hong Kong, Hong Kong SAR; 2001.
  23. Cheung FM, Cheung SF, Zhang J, Leung K, Leong F, Huiyeh K. Relevance of openness as a personality dimension in Chinese culture: Aspects of its cultural relevance. Journal of Cross-Cultural Psychology. 2008; 39(1):81–108.
  24. Cheung SF, Cheung FM, Howard R, Lim YH. Personality across the ethnic divide in Singapore: Are “Chinese Traits” uniquely Chinese? Personality and Individual Differences. 2006; 41(3):467–477.
  25. Fan W, Cheung FM, Zhang JX, Cheung SF. A combined emic-etic approach to personality: CPAI and cross-cultural applications. Acta Psychologica Sinica. 2011; 43(12):1418–1429.
  26. Cheung FM, Zhang J, Cheung SF. From indigenous to cross-cultural personality: The case of the Chinese Personality Assessment Inventory. In: Bond MH, editor. The Oxford handbook of Chinese psychology. Oxford University Press; 2010. pp. 295–308.
  27. Leong ST. Migration and Ethnicity in Chinese History: Hakkas, Pengmin, and Their Neighbors. Stanford University Press; 1997.
  28. Hoerder D. Cultures in contact. Duke University Press; 2002.
  29. Wang G. The Hakka in migration history. In: Don’t Leave Home: Migration and the Chinese. Singapore: Eastern Universities Press; 2003. pp. 217–238.
  30. Leo J. Global Hakka: Hakka Identity in the Remaking. Brill; 2015.
  31. Martin HJ. The Hakka Ethnic Movement in Taiwan, 1986–1991. In: Constable N, editor. Guest People: Hakka Identity in China and Abroad. University of Washington Press; 1996. pp. 176–195.
  32. Office of the Chief of Naval Operations. Civil Affairs Handbook: Taiwan (Formosa): June 15, 1944. Washington: Navy Department; 1944.
  33. Boneva BS, Frieze IH. Toward a concept of a migrant personality. Journal of Social Issues. 2001; 57(3):477–491.
  34. Church AT. Current perspectives in the study of personality across cultures. Perspectives on Psychological Science. 2010; 5(4):441–449. pmid:26162190
  35. Illumina. GenomeStudio Genotyping Module v2.0 User Guide; 2016.
  36. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics. 2007; 81(3):559–575. pmid:17701901
  37. Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics. 2011; 26(18):2336–2337. pmid:20634204
  38. Moore CM, Jacobson SA, Fingerlin TE. Power and sample size calculations for genetic association studies in the presence of genetic model misspecification. Human Heredity. 2019; 84(6):256–271. pmid:32721961
  39. Deconinck AE, Potter AC, Tinsley JM, Wood SJ, Vater R, Young C, et al. Postsynaptic abnormalities at the neuromuscular junctions of utrophin-deficient mice. The Journal of Cell Biology. 1997; 136(4):883–894. pmid:9049253
  40. Knuesel I, Zuellig RA, Schaub MC, Fritschy JM. Alterations in dystrophin and utrophin expression parallel the reorganization of GABAergic synapses in a mouse model of temporal lobe epilepsy. European Journal of Neuroscience. 2001; 13(6):1113–1124. pmid:11285009
  41. García-Cabrero AM, Marinas A, Guerrero R, de Córdoba SR, Serratosa JM, Sánchez MP. Laforin and malin deletions in mice produce similar neurologic impairments. Journal of Neuropathology and Experimental Neurology. 2012; 71(5):413–421. pmid:22487859
  42. Bocciardi R, Giorda R, Marigo V, Zordan P, Montanaro D, Gimelli S, et al. Molecular characterization of a t(2; 6) balanced translocation that is associated with a complex phenotype and leads to truncation of the TCBA1 gene. Human Mutation. 2005; 26(5):426–436. pmid:16145689
  43. Calboli FCF, Tozzi F, Galwey NW, Antoniades A, Mooser V, Preisig M, et al. A genome-wide association study of neuroticism in a population-based sample. PLoS ONE. 2010; 5(7):e11504. pmid:20634892
  44. de Rooij J, Boenink NM, van Triest M, Cool RH, Wittinghofer A, Bos JL. PDZ-GEF1, a guanine nucleotide exchange factor specific for Rap1 and Rap2. Journal of Biological Chemistry. 1999; 274(53):38125–38130. pmid:10608883
  45. Liao Y, Kariya KI, Hu CD, Shibatohge M, Goshima M, Okada T, et al. RA-GEF, a novel Rap1A guanine nucleotide exchange factor containing a Ras/Rap1A-associating domain, is conserved between nematode and humans. Journal of Biological Chemistry. 1999; 274(53):37815–37820. pmid:10608844
  46. Jang YN, Jang HC, Kim GH, Noh JE, Chang KA, Lee KJ. RAPGEF2 mediates oligomeric Aβ-induced synaptic loss and cognitive dysfunction in the 3xTg-AD mouse model of Alzheimer’s disease. Neuropathology and Applied Neurobiology. 2021; 47(5):625–639. pmid:33345400
  47. Zhang D, Ma X, Sun W, Cui P, Lu Z. Down-regulated FSTL5 promotes cell proliferation and survival by affecting Wnt/β-catenin signaling in hepatocellular carcinoma. International Journal of Clinical and Experimental Pathology. 2015; 8(3):3386–3394.
  48. Masuda T, Sakuma C, Nagaoka A, Yamagishi T, Ueda S, Nagase T, et al. Follistatin-like 5 is expressed in restricted areas of the adult mouse brain: Implications for its function in the olfactory system. Congenital Anomalies. 2014; 54(1):63–66. pmid:24588779
  49. Maguschak KA, Ressler KJ. A role for WNT/β-catenin signaling in the neural mechanisms of behavior. Journal of Neuroimmune Pharmacology. 2012; 7(4):763–773. pmid:22415718
  50. Lisboa BCG, Oliveira KC, Tahira AC, Barbosa AR, Feltrin ASA, Gouveia G, et al. Initial findings of striatum tripartite model in OCD brain samples based on transcriptome analysis. Scientific Reports. 2019; 9(1):1–12. pmid:30816141
  51. Beavis WD. The power and deceit of QTL experiments: lessons from comparative QTL studies. In: Proceedings of the Forty-Ninth Annual Corn and Sorghum Industry Research Conference. Chicago, IL; 1994. pp. 250–266.
  52. Beavis WD. QTL analyses: power, precision, and accuracy. In: Paterson AH, editor. Molecular Dissection of Complex Traits. CRC Press; 1998. pp. 145–162.
  53. Orr HA. The genetics of species differences. Trends in Ecology and Evolution. 2001; 16(7):343–350. pmid:11403866
  54. Boopathi NM. Genetic mapping and marker assisted selection. Springer; 2020.
  55. Wang M, Xu S. Statistical power in genome-wide association studies and quantitative trait locus mapping. Heredity. 2019; 123:287–306. pmid:30858595
  56. Zuk O, Hechter E, Sunyaev SR, Lander ES. The mystery of missing heritability: Genetic interactions create phantom heritability. Proceedings of the National Academy of Sciences. 2012; 109(4):1193–1198. pmid:22223662
  57. Camperio Ciani AS, Capiluppi C, Veronese A, Sartori G. The adaptive value of personality differences revealed by small island population dynamics. European Journal of Personality. 2007; 21(1):3–22.
  58. Camperio Ciani AS, Capiluppi C. Gene flow by selective emigration as a possible cause for personality differences between small islands and mainland populations. European Journal of Personality. 2011; 25(1):53–64.
  59. Canache D, Hayes M, Mondak JJ, Wals SC. Openness, extraversion and the intention to emigrate. Journal of Research in Personality. 2013; 47(4):351–355.
  60. Otto K, Dalbert C. Individual differences in job-related relocation readiness: The impact of personality dispositions and social orientations. The Career Development International. 2012; 17(2):168–186.
  61. Tabor AS, Milfont TL, Ward C. The migrant personality revisited: Individual differences and international mobility intentions. New Zealand Journal of Psychology (Online). 2015; 44(2):89–95.
  62. McCrae RR, Costa PT Jr. Chapter 31: Conceptions and Correlates of Openness to Experience. In: Hogan R, Johnson J, Briggs S, editors. Handbook of Personality Psychology. Academic Press; 1997. pp. 825–847.
  63. Paulauskaitė E, Šeibokaitė L, Endriulaitienė A. Big five personality traits linked with migratory intentions in Lithuanian student sample. International Journal of Psychology: A Biopsychosocial Approach / Tarptautinis psichologijos žurnalas: biopsichosocialinis požiūris. 2010; 7:41–58.
  64. Marušić I, Kamenov Ž, Jelić M. Personality and attachment to friends. Društvena istraživanja. 2011; 20(4):1119–1137.
  65. Polek E, Van Oudenhoven JP, ten Berge JMF. Evidence for a “migrant personality”: Attachment styles of Poles in Poland and Polish immigrants in the Netherlands. Journal of Immigrant & Refugee Studies. 2011; 9(4):311–326.
  66. Furnham A. Personality differences in managers who have, and have not, worked abroad. European Management Journal. 2017; 35(1):39–45.
  67. Huffman JE. Examining the current standards for genetic discovery and replication in the era of mega-biobanks. Nature Communications. 2018; 9:5054. pmid:30498205
  68. Jaura R, Yeh S-Y, Montanera KN, Ialongo A, Anwar Z, Lu Y, et al. Extended intergenic DNA contributes to neuron-specific expression of neighboring genes in the mammalian nervous system. Nature Communications. 2022; 13:2733. pmid:35585070