Temperament Clusters in a Normal Population: Implications for Health and Disease

Background: The object of this study was to identify temperament patterns in the Finnish population, and to determine the relationship between these profiles and life habits, socioeconomic status, and health.
Methods/Principal Findings: A cluster analysis of the Temperament and Character Inventory subscales was performed on 3,761 individuals from the Northern Finland Birth Cohort 1966 and replicated on 2,097 individuals from the Cardiovascular Risk in Young Finns study. Clusters were formed using the k-means method and their relationship with 115 variables from the areas of life habits, socioeconomic status and health was examined.
Results: Four clusters were identified for both genders. Individuals from Cluster I are characterized by high persistence, low extravagance and disorderliness. They have healthy life habits, and lowest scores in most of the measures for psychiatric disorders. Cluster II individuals are characterized by low harm avoidance and high novelty seeking. They report the best physical capacity and highest level of income, but also high rate of divorce, smoking, and alcohol consumption. Individuals from Cluster III are not characterized by any extreme characteristic. Individuals from Cluster IV are characterized by high levels of harm avoidance, low levels of exploratory excitability and attachment, and score the lowest in most measures of health and well-being.
Conclusions: This study shows that the temperament subscales do not distribute randomly but have an endogenous structure, and that these patterns have strong associations to health, life events, and well-being.


Introduction
Temperament refers to early-appearing individual differences in emotional responding and central to its definition is the notion that temperament is innate [1]. There is now substantial data supporting the biologically based nature of these individual differences in emotional experience and regulation, with heritability estimates ranging from 50 to 65% [2,3,4,5]. These stable and organized patterns of behavioral responses across a range of contexts are thus believed to form the basis of more complex psychological structures.
Several models have been proposed for classifying temperament, including Cloninger's psychobiological model of four higher-order temperament dimensions distinguished by their stimulus-response characteristics [6,7]. The widely used Temperament and Character Inventory (TCI) [7] measures individual differences along four main temperament dimensions: novelty seeking (NS), harm avoidance (HA), reward dependence (RD) and persistence (P). NS is a tendency to respond with intense excitement to novel stimuli, or cues for potential rewards or potential relief of punishment and thereby activating behavior. HA is a tendency to respond intensively to signals of aversive stimuli, thereby inhibiting behavior. RD is a tendency to respond intensely to signals of reward, especially social rewards, thereby maintaining and continuing particular behaviors. P is a tendency to persevere in behaviors that have been associated with reward or relief from punishment. Scores measured by the TCI distribute normally in the population with sex-dependent differences [8].
The importance of temperament to mental health [7] and to some extent to somatic health [9,10] has been previously established. In particular, high levels of HA are associated with a number of psychiatric disorders [11,12,13,14,15]. A more complete understanding of the influence of temperament on health outcome will require dissection of the specific pathways between temperament and outcome. However, despite the original hypothesis that temperament dimensions are independent behavioral systems [6], accumulating evidence suggests correlations between dimensions [16]. Organizing personality traits as profiles or clusters (person-oriented approach) provides an opportunity to examine the context of an individual's traits, as opposed to considering individual traits in isolation (variable-centered approach), and does not require the assumption that personality dimensions operate independently (Crockett et al., 2006). While there is growing support that such personality clusters predict a number of health outcomes, much of this work has been focused on clusters defined in childhood or adolescence (e.g., Crockett et al., 2006) or has been focused on a single diagnostic outcome (e.g., Muthen & Muthen, 2000).
In order to examine the inter-relationships between dimensions of temperament, as defined by Cloninger's model, and relationships between temperament and adult outcome, we conducted a series of analyses in the Northern Finland Birth Cohort 1966. This longitudinal birth-cohort provided the opportunity to assess the relationship between profiles of temperament and overt expression of psychiatric illness, health outcome, and lifestyle in a large, relatively genetically homogeneous population.
Specifically, we hypothesized that individuals from this large birth cohort could be stratified into functionally meaningful groups according to their temperament profiles. As temperament is an important determinant for affective regulation and behavior, we hypothesized that individuals from these separate temperament groups would also differ in their life habits, socioeconomic status, and psychiatric and somatic health. To test these hypotheses, we performed a cluster-based analysis of responses on the TCI in 3,761 men and women from the Northern Finland Birth Cohort 1966, and tested for differences between the resulting clusters on a wide range of life domains. In order to test the stability of these clusters in a separate sample, we also replicated the results of the cluster structure analysis among 2,097 participants of the Cardiovascular Risk in Young Finns study.

Participants and Measures
The Northern Finland Birth Cohort 1966 (NFBC 1966) is a longitudinal one-year birth cohort originally including all males and females whose expected year of birth was in 1966 in Finland's two northernmost provinces, Oulu and Lapland (N = 12,058 liveborn individuals) [17]. The cohort members have been carefully monitored prospectively from the prenatal period onwards. The current study sample is based on cohort members who lived in Finland at the age of 16 years (N = 10,526: 5,365 male, 5,161 female), as validated psychiatric diagnoses from the Finnish Hospital Discharge Register are available for these subjects [18].
In the 31-year follow-up of the cohort, all subjects alive at that time with a known address were sent a postal questionnaire (N = 10,526). For subjects then living in the Oulu or Lapland provinces or in the Helsinki area (N = 8,463), the questionnaire included an invitation to take part in a clinical examination. The subjects who participated (N = 5,960 (70%): 2,863 male (65%), 3,097 female (76%)) were also asked to fill in another questionnaire on temperament, health, and occupation [19].
This study was limited to those individuals for whom a complete personality questionnaire had been returned and who had not been diagnosed with mental retardation (N = 3,761 out of the 5,084 who returned the questionnaire). Of all subjects who were provided the temperament questionnaire, 63% (60% of the males, 66% of the females) participated.
Health-related quality of life was assessed with the 15D measure [20]. The HSCL-25-depression questionnaire, which is a 25-item shortened version of the original 90-item questionnaire [21], was used to measure symptoms of depression and anxiety as described by Veijola et al. [22], and the Twenty Item Toronto Alexithymia Scale (TAS-20), which has been translated into Finnish [23], was used to measure the three facets of alexithymia [24]. Subjects were asked whether they had ever been diagnosed by a physician as having depression (yes/no), while a diagnosis of schizophrenia was based on validated diagnosis data [17] from the hospital discharge register up until the end of 1997.
The second questionnaire included a Finnish translation of the 107-item Temperament and Character Inventory (TCI) for measurement of four dimensions of temperament (NS, HA, RD and P) and their respective subscales (HA1: anticipatory worry, HA2: fear of uncertainty, HA3: shyness, HA4: fatigability; NS1: exploratory excitability, NS2: impulsiveness, NS3: extravagance, NS4: disorderliness; RD1: sentimentality, RD3: attachment, RD4: dependence, and P: persistence). Specifically, this was the Tridimensional Personality Questionnaire (TPQ) subset of the TCI version 9, with 107 binary items. Although the TPQ originally measured three dimensions, these original items were rearranged to calculate summary scores for the revised four dimensions of the TCI (with five items originally contributing to RD now analyzed separately as Persistence, and one of the original RD items now contributing to NS). In addition, the questionnaire included the following: the Perceptual Aberration Scale (PER), Physical Anhedonia Scale (PAS) and Social Anhedonia Scale (SAS), to measure traits which indicate a predisposition toward psychosis [25]; the Schizoid Scale (SCHD), extracted from the Minnesota Multiphasic Personality Inventory, to measure psychotic traits [26]; the Hypomanic Personality Scale (HPS) [27]; and a scale referred to here as "Bipolar II scale", to identify those depressed subjects who are at risk for later conversion to bipolar disorders [28]. For reference of collection and application of these scales, see Miettunen et al. [29,30].
All subjects included in the present study gave written consent for their data to be used. Permission to gather registry data was obtained from the Finnish Ministry of Social Welfare and Health. The study was approved by the Ethics Committee of the Faculty of Medicine, University of Oulu.
An additional sample was examined in order to replicate the cluster structure identified in the NFBC1966 cohort. In the Cardiovascular Risk in Young Finns study (YF), which began in 1980, subjects for the original sample (N = 3,596) were selected randomly from six different age cohorts (3 to 18 years) in the population register of the Social Insurance Institution, a database covering the whole population of Finland. The design of the study and the selection of the sample have been described in detail by Raitakari et al. [31]. Measurements for the present study were carried out in 2001, when 2,105 participants (59% of the 1980 cohort) completed the TCI-version 9 questionnaire. As there was no significant association of the TCI scales to age group in this cohort, we used the TCI variables without any further adjustment for age.

Data Analysis
In cluster analysis, a set of individuals is divided into groups such that individuals designated to the same group are as similar to each other as possible while being as different from individuals in other groups as possible. Clustering methods applied to psychological data have led to biologically meaningful results in previous work on schizophrenia by members of our group [32].
A large number of different clustering methods exist, often tailored to specific data types or applications. We performed initial tests with the probabilistic Gaussian mixture model clustering [33], a density-based clustering method [34], and the k-means method [35]. We chose k-means for further analysis, as this method is well established in the literature and, compared to the other two methods, produces very robust and stable clusterings of temperament data and delivers easily interpretable prototypical descriptions of the clusters in the form of cluster centers.
Before clustering, all the scales were normalized to mean 0 and SD 1. The Euclidean distance between the 12-dimensional temperament subscale vectors was used as the similarity measure. As the distributions of the subscales in the two genders differ significantly [7,8], clustering and subsequent analyses were performed separately for each gender. The k-means clustering algorithm requires the user to select k, the number of clusters. We computed clusterings for 2 to 12 clusters and selected the best model from among those by the Bayesian information criterion [36].
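The procedure described above can be illustrated with a minimal sketch, given here under several assumptions rather than as the implementation used in the study. The function names (zscore_columns, kmeans, bic, select_k) are hypothetical, the farthest-point initialisation is chosen only to make the example deterministic, and the BIC formula is one common spherical-Gaussian approximation for k-means that may differ from the formulation in [36].

```python
import math

def zscore_columns(rows):
    """Scale every column to mean 0 and SD 1 (sample SD, n - 1 denominator)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(d)]
    return [[(r[j] - means[j]) / sds[j] for j in range(d)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centers, residual sum of squares)."""
    # Assumption: deterministic farthest-point initialisation, for reproducibility.
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(r, centers[c]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    rss = sum(sq_dist(r, centers[lab]) for r, lab in zip(rows, labels))
    return labels, centers, rss

def bic(rss, n, d, k):
    """Assumed spherical-Gaussian BIC approximation; lower is better."""
    return n * d * math.log(max(rss, 1e-12) / (n * d)) + k * d * math.log(n)

def select_k(rows, ks):
    """Fit k-means for each candidate k and return the k with the lowest BIC."""
    n, d = len(rows), len(rows[0])
    scores = {k: bic(kmeans(rows, k)[2], n, d, k) for k in ks}
    return min(scores, key=scores.get), scores
```

For data shaped like those in the text, rows would hold one 12-element subscale vector per individual of one gender, and a call such as select_k(zscore_columns(rows), range(2, 13)) would scan 2 to 12 clusters.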

Replication Analysis
To further analyze the validity of the clusters, we performed a cluster analysis on the TCI temperament subscales in the YF sample. Assigning a cluster to the individuals in the replication sample based on the model obtained on original NFBC1966 data, and vice versa, enabled us to analyze whether these two models represent the same underlying population structure. Further details of this method and results are presented in (Materials S1), including Figure S1 and Tables S1, S2, S3.
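As a rough illustration of this cross-assignment idea, the sketch below assigns individuals to the nearest center of a model fitted on the other sample and measures agreement with Cohen's kappa. It is a simplified sketch, not the procedure detailed in Materials S1: the function names are hypothetical, and it assumes that cluster labels in the two models have already been matched so that label c means the same cluster in both.

```python
from collections import Counter

def assign_to_nearest(rows, centers):
    """Label each row with the index of its nearest center (Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(centers)), key=lambda c: sq_dist(r, centers[c]))
            for r in rows]

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two labelings of the same individuals.

    Assumes both labelings use the same label names; undefined (division by
    zero) when chance agreement equals 1.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

In this setting, labels_a would be the labels from the model fitted on a sample itself and labels_b the labels obtained by assign_to_nearest with the centers of the other sample's model.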

Association Analyses
In order to test differences between clusters on outcome variables representing a range of life domains, we conducted one-way analyses of variance for continuous variables and chi-square tests of independence for discrete variables. A total of 115 variables representing occupation, lifestyle, socioeconomic status, and mental and physical health were examined. The collection of data in the NFBC1966 is extensive; in order to span the range of adult health and outcome while including those variables with adequate data, we selected variables if 1) sufficient information was available about the nature of the variable (i.e., how information was collected and measured), 2) more than 50% of cohort members had data available for that variable, and 3) we considered that variable to be meaningful for psychological and somatic health. An un-weighted Bonferroni correction [37] for p-values was applied to compensate for multiple hypothesis testing.
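The two building blocks for discrete variables can be sketched as follows: the chi-square statistic for a cluster-by-category contingency table, and an un-weighted Bonferroni adjustment of p-values. This is an illustrative sketch with hypothetical function names; converting the statistic to a p-value requires the chi-square distribution, which is left to a statistics package, and the number of tests used for the correction (for example the 115 variables) is an assumption to be set by the analyst.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    table is a list of rows of observed counts (e.g. clusters x categories);
    every row and column total must be positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def bonferroni(p_values, n_tests=None):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = n_tests if n_tests is not None else len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With 115 tests, an adjusted p-value below 0.05 corresponds to a raw p-value below roughly 0.05 / 115, or about 0.00043.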
In order to compare results between temperament clusters and original subscale scores, we also tested differences between each individual subscale and the same variables representing adult outcome across a range of life domains. All individuals were ordered based on their score in the subscale in question and split into four groups of sizes equal to the clusters. For these groups, we performed the same statistical test as was used for the clusters, corrected again with the Bonferroni correction, and compared the strength of the resulting association with that obtained from analyses using the clusters.
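The grouping step for this comparison can be sketched as below: individuals are ordered by one subscale score and cut into consecutive groups whose sizes match the cluster sizes. The function name rank_groups is hypothetical, and two details are assumptions not specified in the text: tied scores keep their original order, and the group sizes are applied in the order given, from lowest to highest scores.

```python
def rank_groups(scores, sizes):
    """Split individuals, ordered by score, into consecutive groups of given sizes.

    Returns one group index per individual, in the original order of scores.
    """
    if sum(sizes) != len(scores):
        raise ValueError("group sizes must add up to the number of individuals")
    # sorted() is stable, so ties keep their original relative order (assumption).
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [None] * len(scores)
    start = 0
    for group, size in enumerate(sizes):
        for i in order[start:start + size]:
            labels[i] = group
        start += size
    return labels
```

The resulting group labels can then be passed to the same ANOVA or chi-square test that was applied to the cluster labels, so that both analyses compare four groups of identical sizes.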

Cluster Analyses
Clustering of the NFBC1966 gender-specific samples according to the TCI main scales did not yield evidence for a cluster structure. In contrast, clustering of the 12 TCI subscales showed stable results with an optimum of four clusters in both genders (Figure 1). In the replication of the TCI subscale clustering in the YF sample, Cohen's kappa values were between 0.7 and 0.9, indicating a strong agreement between the models based on the separate samples (see Materials S1).
Despite genders having been clustered separately, we found similar clusters in the gender-specific models. Figure 2 shows these clusters as star plots (with 0 as sample mean and 1 as sample standard deviation). For females, Clusters I, II, III and IV include 26%, 25%, 28% and 21% of the subjects, whereas for males these numbers are 26%, 22%, 30% and 22%.

Association Analyses
The number and proportion of variables that are significantly associated with cluster membership and TCI scales are presented in Table 1. We examined the associations of temperament and the health-related quality of life as measured with the 15D measure [20] (Tables 2-3). Cluster IV consistently reported the most problems across the 15 dimensions measured, while Clusters I and II reported the least problems both in males and females. Of the subscales, HA-1 and HA-4 had power equal to the clusters to "predict" these variables, with RD-1, RD-3, NS-1, NS-3 and the other HA scales also having some associations, although with less power than the clusters.
We tested for differences in education, work and socioeconomic status between the four clusters, and found significant differences in education (Tables S4-S5). For example, only 41% and 26% of women and men in Cluster IV had finished secondary school, while 56% and 35% in Cluster II had finished secondary school. These clusters also represent the two extremes in higher level of education. In concordance, Cluster IV reported lowest working capacity in both genders. Marital status differed between clusters, with the highest rate of marriage in Cluster I for both females (62%) and males (50%), and the highest rate of divorce in Cluster II females (7%) and males (5%). For males, the TCI subscales most strongly associated to these variables were HA-3 and HA-4. For females, there were very few strong associations of the individual scales, with the exception of a strong association of rate of marriage to NS-3 (in females only).
Individuals from Cluster II reported the best, while those from Cluster IV the worst, physical functional capacity both in females and males. Over 10% of Cluster II females, but only 4% of Cluster IV females, reported no trouble running 2 or 5 kilometers, with a similar pattern observed for males. Self-reported physical activity also followed the pattern of self-reported physical functional capacity.
As indicators of taking care of oneself, Cluster IV individuals tended to report brushing their teeth less often than other clusters, while members from Cluster II reported using more alcohol and more members reported smoking regularly (52% of females and 55% males). Almost no statistically significant differences could be observed in the physical measurements of the individuals, including height, weight, BMI, blood pressure, and levels of fasting sugar, insulin, and cholesterol. Triglyceride levels were slightly higher in Cluster IV (1.35 mmol/dL) and lower in Cluster I (1.20 mmol/dL) males. Females, particularly those in Cluster II, had a tendency to underestimate their weight in the postal questionnaire compared to the measurements at the physician's office.
In terms of the individual temperament dimensions, for females, smoking and alcohol were associated to NS-3 almost exclusively, while the HA scales were associated to self-reported physical capacity. NS-3 and the HA scales also dominated the pattern in males, but the performance of clusters vs. scales was the opposite of that seen in females. The lack of associations indicates that using only subscales without clustering would miss the associations to physical activity, frequency of brushing teeth, lifetime abstinence from alcohol, calcium intake, and difference in self-reported and measured BMI in females. In males, while using the scales instead of clusters would miss associations to the brushing of teeth, using the clusters alone would miss associations to certain physical variables, all strongly associated to NS-3.
In terms of confirmed diagnoses, we found minor differences in physical health, with a considerable difference between males and females (Table 4). For females, individuals in Cluster IV reported less allergic rhinitis and eczemas, and more rheumatoid arthritis than individuals in other clusters. Among males, Cluster II individuals had almost double the prevalence of asthma compared to the other clusters, while individuals in Cluster II and IV had hypertension more often. For both genders, self-reported lifetime depression and register-based diagnosis of schizophrenia were over twice as common in Cluster IV, while Cluster I clearly had the lowest prevalence. In terms of the individual subscales, depression was associated to HA-4 in both genders, while there were no additional associations between other diagnoses and individual subscales.
We also analyzed associations to certain validated psychological scales completed by the participants (Tables 5-6). Individuals from Cluster IV scored the highest and individuals from Cluster I or II the lowest on all scales measuring traits predisposing to psychosis, including the PER, PAS, SAS and SCHD. Similar findings were obtained with SCL and TAS-20 factors, with individuals in Clusters I and II scoring consistently lowest and individuals from Cluster IV highest on symptoms related to anxiety, depression and alexithymia. Interestingly, individuals from Cluster II scored highest on the HPS (mean in females 16.2 and in males 16.4) while individuals from Cluster III scored lowest (mean in females 9.5 and in males 8.5).
These psychological scales were also associated to many of the TCI subscales (Tables 5-6). For example, in females, we found strong associations of PAS to RD-1 and HA-1, and SCL symptoms to HA-1 and HA-4. It is interesting to note that while for males the HA scales dominate the association picture, there are two scales that were not associated to any temperament dimension: the schizoid scale, and the social anhedonia scale. In addition, in females, the cluster association to the HPS would be missed by an analysis using the subscales alone.

Discussion
We present a stable and robust clustering of domains of temperament in a population-based sample. Our results on >2,000 females and >1,700 males from a longitudinal birth cohort from Northern Finland demonstrate that the analyzed temperament dimensions of the TCI do not distribute randomly among individuals but have a consistent, endogenous pattern, and this structure is supported by a replication analysis in a separate, representative population sample of >2,000 Finnish individuals. In addition, our results provide further evidence for the importance of temperament to health and well-being, with statistically significant differences between these temperament clusters across a number of life domains.

Properties of the Temperament Clusters
A stable and robust clustering was found with the TCI subscales with an optimum of four clusters (I-IV) for both males and females. The results proved to be quite similar for both genders, despite the separate analyses, and in agreement in a replication sample, providing further support for the stability of these temperament profiles. We did not, however, find evidence for a stable clustering pattern based on the four TCI scales alone; this is likely because the use of subscales provided more information that the clustering algorithm could use to partition the individuals into stable and robust clusters. Individuals from Cluster I can be described as stable, persistent and not very impulsive. They report a high quality of life and self-reported working capacity, and a relatively high level of education. Both females and males are more often married than individuals from the other clusters. Their life habits are healthy: they brush their teeth, do not drink very much alcohol and only rarely smoke. They score lowest in most of the scales for psychosis proneness and symptoms for depression and anxiety, and this cluster has a lower prevalence of depression and schizophrenia than other clusters. Consequently, our results suggest that this temperament profile, which is characterized by remarkably average levels on most of the temperament traits except particularly low levels of impulsiveness (NS2) and disorderliness (NS4), may possess features enabling mental stability and psychological adaptability, leading to practice of healthy life habits, stable life features, and decreased risk for mental disorders.
Individuals from Cluster II can be characterized as outgoing, energetic people who tend to be impulsive. Like individuals from Cluster I, they have a high quality of life and self-reported working capacity. They have the highest level of education, their annual income is on average higher than other clusters, and they also report the best physical functional capacity. Divorces are more common and they have a tendency for higher consumption of alcohol and more smoking, particularly in females. Cluster II is characterized by relatively low levels of depression and schizophrenia, supported by low scores for all psychological scales that measure traits predisposing toward psychosis, anxiety or depression, except on the hypomania personality scale, on which Cluster II members score the highest. It is noteworthy that we observed here a tendency of Cluster II individuals to embellish reality. This is consistent with this cluster's temperament profile, as well as the high scores on the HPS [38]. Individuals with hypomanic personality have been reported to provide high estimates of their future academic and occupational performance [27,38], which leads to a need for caution when interpreting the self-reports of positive lifestyle and health-related variables in Cluster II.
In terms of quality of life, socioeconomic status, mental and physical health, individuals from Cluster III do not show any extreme characteristics. The education, working capacity and physical functional capacity are higher in Cluster III than in Cluster IV but generally lower than for individuals from Clusters I or II. They score low in scales for psychosis proneness as well as for the hypomania personality scale, the latter being likely to reflect the low energy level of these individuals.
Individuals from Cluster IV could be described as shy and pessimistic persons who prefer routine and privacy. They score the lowest in most fields of health and well-being and are more often unemployed. They also report the lowest working capacity scores, and their annual income is the lowest among the four clusters. Members of Cluster IV are the least physically fit. Virtually all indicators of psychological health show signs of increased mental health problems, both in levels of symptoms and in manifest disorders. Thus, based on the particular traits measured in the NFBC1966, our results suggest that a profile characterized by excessively high HA and low NS, RD and P (representing approximately 20% of cohort members) may capture a profile of increased physical and mental health risk.
Overall, our results are in line with the previously published findings. In a previous study on NFBC1966, all four domains of temperament were found to be associated to socioeconomic status, alcohol consumption and smoking behavior in varying configurations [10]. There was a negative gradient between HA and level of education and a tendency towards higher RD and P with increasing socioeconomic status. Previous studies have also demonstrated relationships between high HA (and other related personality traits, such as high Neuroticism and dysthymic temperament) with depression and anxiety [39,40,41,42,43]. In our current analyses, we also found that the clusters do as well as, and in many cases better than, the individual dimensions to find associations with outcome variables. A further advantage of using these clusters to investigate the relationship between temperament and health outcomes is that clusters have an additional value of simpler data structure (as each individual has only one definition), meaning also that fewer statistical tests are needed.

Strengths and Limitations
The primary strength of our analyses stems from the prospective design of the study and follow-up of a large birth cohort, which allows for control of recall bias. Furthermore, a previous analysis has demonstrated that participation in this cohort does not vary across specific disorders, nor do gender or education explain the association of psychiatric disorders with participation [19]. An additional strength of the present study is the relative homogeneity of its population: all subjects were of the same age and ethnicity. The implication of this is that the TCI scores in young Finns are likely not biased by cross-cultural issues associated with temperament measurement [28,29].
One potential limitation is that temperament in the NFBC1966 was assessed only at one time point. However, although absolute scores in temperament may change over time, inter-individual differences typically remain relatively stable [42]. Nevertheless, application of repeated measures of temperament would likely add to the accuracy of the results.
A final limitation is that clustering itself cannot answer the question of whether the clusters found reflect real clusters in the data or are artifacts of the method. However, to begin to address this limitation, we replicated the clustering analysis in a separate sample that represents the Finnish population well, thereby providing additional support for this pattern of temperament profiles.

Implications
To our knowledge, this is the first study to investigate temperament patterns using cluster analysis tools in population-based samples of both females and males. One previous study reported an association between temperament profiles and a high level of physiological CHD risk factors; however, this study included only men and was comprised of a relatively small sample of 190 individuals [44].
Our results further question the assumption that temperament domains are independent. The heritability of each temperament factor has been estimated to range from 50 to 65% [2,3,4,5]. Although these factors have been suggested to be independent of each other, contradictory results have also been reported [45]. For example, a recent meta-analysis supports temperament dimension inter-relatedness, particularly given the consistent negative correlation between NS and HA in a number of studies [3]. These results also lend support to a person-centered approach. While the majority of studies to combine independent dimensions and examine the ability of resulting temperament profiles to predict mental health have focused on samples of children and adolescents [46,47,48,49,50], our results support the argument that additional information can be obtained by considering cluster profiles. Indeed, in a manuscript submitted concurrently, we demonstrate significant relationships between these same cluster profiles and life course measures (e.g., early environment, neurobehavioral development, and adolescent behavior) (Congdon et al., submitted concurrently).
Obtaining a better understanding of these relationships will be critical for understanding the underlying genetic architecture and other possible etiological processes predisposing individuals to particular temperament profiles, as well as the relationship between genetic factors, patterns of temperament, and ultimately psychiatric and somatic health outcomes.

Supporting Information
Figure S1 Histograms of chi-square values. Chi-square values of 100 experiments using generated data and of results based on cross-tabulation of 4-cluster solutions (presented in Table S2), with green for females and red for males. (TIF)
Materials S1 (DOC)