Observer Bias: An Interaction of Temperament Traits with Biases in the Semantic Perception of Lexical Material

The lexical approach is a method in differential psychology that uses people's estimations of verbal descriptors of human behavior in order to derive the structure of human individuality. The validity of the assumptions of this method about the objectivity of people's estimations is rarely questioned. Meanwhile the social nature of language and the presence of emotionality biases in cognition are well-recognized in psychology. A question remains, however, as to whether such an emotionality-capacities bias is strong enough to affect semantic perception of verbal material. For the lexical approach to be valid as a method of scientific investigations, such biases should not exist in semantic perception of the verbal material that is used by this approach. This article reports on two studies investigating differences between groups contrasted by 12 temperament traits (i.e. by energetic and other capacities, as well as emotionality) in the semantic perception of very general verbal material. Both studies contrasted the groups by a variety of capacities: endurance, lability and emotionality separately in physical, social-verbal and mental aspects of activities. Hypotheses of “background emotionality” and a “projection through capacities” were supported. Non-evaluative criteria for categorization (related to complexity, organization, stability and probability of occurrence of objects) followed the polarity of evaluative criteria, and did not show independence from this polarity. Participants with stronger physical or social endurance gave significantly more positive ratings to a variety of concepts, and participants with faster physical tempo gave more positive ratings to timing-related concepts. The results suggest that people's estimations of lexical material related to human behavior have emotionality, language- and dynamical capacities-related biases and therefore are unreliable. 
This questions the validity of the lexical approach as a method for the objective study of stable individual differences.


Introduction
Language, emotionality and energetic capacities likely impose biases in perception of lexical material
Social bias of language. Language is an invention of group dynamics, which functions to facilitate socialization, an exchange of information and to synchronize group activity. The social function of language therefore serves the needs of the society more than the needs of an individual. This explains why there are several language-related biases in our lexicon. For example:
- there are more behavioral descriptors of people in active states than in relaxed states;
- there are more words related to social than physical aspects of behavior;
- there are more words in our language describing negative emotions than positive emotions [1], and negative descriptors receive more attention [2][3][4];
- there are more words describing energetic aspects of behavior than plasticity or tempo of actions.
Contrary to the bias of lexical descriptors in human language, healthy people are in a positive or balanced emotional state no less often than in a negative one, even though there are fewer words related to the positive states. People also spend on average at least half a day sleeping, napping, sitting while eating, driving, waiting, watching, or otherwise not engaging in intense activities. Unless we are talking about teenagers, people are involved in communication no more than in physical or intellectual activities. Furthermore, every action has a certain plasticity of construction and a tempo, i.e. not just the energetic but also the lability aspects of actions are present in every performance. These differences between the ratios in language descriptors and the ratios of actual occurrences of certain behaviors demonstrate a language bias, compromising the validity of the public's evaluations which use lexical descriptors.
In addition to the social bias of language, people tend to conflate properties and traits that are similar in appearance, and this makes their opinions unscientific and unreliable [2], [4][5][6][7][8]. People's perceptions of ''psychological types'', for example, are used by the lexical approach to derive a model of personality dimensions. In this example, the conflation of the properties of the object of analysis (the structure of human individuality) emerges when characteristics either enhance or diminish each other's visibility when they play as an ensemble.
For example, high sociability (social endurance) is more noticeable in people with a high tempo of speech, or with high empathy, or with high impulsivity and adventure seeking. That does not mean that people with a low tempo of speech, or well-regulated people, or people avoiding risks cannot have high sociability. At the other extreme, well-regulated, intelligent and secure people, who have a low need for social approval and therefore low socialization, were thrown together by the lexical approach models into one (''introvert'') group with unempathic (psychopathic or autistic) people and/or people with low physical energy. Research into flat affect shows that patients with this symptom may be more affectively responsive than has usually been assumed. According to patients' reports as well as electrodermal measures of arousal, flat-affect patients actually have the same or even a higher affective response than do normal individuals [9][10]. It has been suggested that the discrepancy between the experience and expression of emotion (''affective incongruence'') may result from neuromuscular abnormalities that prevent normal (or even abnormally intense) emotions from being expressed in normal ways [11].
Anxious people were inferred to have a strong inhibitory system, and impulsive people were inferred to have a strong activation system. In clinical practice, however, anxious patients, especially children, show higher impulsivity and inability to focus than non-anxious people. Another confusion is the attribution of impulsivity to highly active people with approach and sensation seeking tendencies, and of low risk-taking and low impulsivity to highly intellectual people. In reality, intellectual people are endowed with all kinds of ''temperament packages'', including high or low sociability, with or without autistic tendencies, as well as with or without impulsivity, sensation seeking or high verbal-social tempo. Indeed, in self-confident and/or energetic people, impulsivity is more visible when present, but depressed people, especially people with comorbid depression and anxiety, who report fatigue and slowing down, also have impulsivity symptoms [12]. Such attribution errors are common, because a group of traits with similar behavioral appearances is more noticeable and can be attributed to one factor more readily than in cases where the same traits have ''conflicting'' directions.
Emotionality bias of cognition. Another factor that compromises the objectivity of people's evaluations is the impact of emotionality on cognition. Findings in neuropsychology showed that emotional evaluation always comes prior to the detailed cognitive assessment of events, due to the lead of subcortical brain structures in information processing. Both subcortical and cortical structures are involved in emotional processing via multiple routes [13][14][15][16]. Vuilleumier and colleagues in their fMRI studies of the amygdala and related brain structures found a key role of these structures not only in emotional processing, but also in attention and perception [17][18]. Their results showed that emotional processes can modify perception, an effect that they called ''emotional attention'' [19]. Similar interlocking between emotionality and cognition was described by Pessoa [20][21]. Adolphs, Tranel and Buchanan [22] reported that enhanced memory for the visual details of a negative item within the affective-attentional network does not lead to the successful encoding of all details of an item's presentation. Activity in these affective processing regions showed no correspondence (in the case of the amygdala) or a negative relation (in the case of the orbitofrontal cortex, striatum, and anterior cingulate gyrus) to the successful encoding of the task performed with an item. This means that even though emotional processing dominates the first stages of evaluation and creates emotionality bias, it misses important details of the perceived objects.
These neuropsychological findings, which showed how unconscious emotional processing can affect human cognition, were reflected in theories of emotionality, offering the concepts of ''primary appraisals'' [23], ''basic emotions'' [24][25], ''core affects'' [26], ''somatic markers'' [27] as pre-cognitive emotional dispositions [28][29]. This was in line with common observations from clinical psychology and psychiatry on biased perceptions of patients with mood disorders, based on the same neurochemical factors as temperament. The presence of a background emotionality seems to be a built-in primary biological component of semantic processing (facilitating approach-withdrawal behavioral reaction) not only in humans, but also in other animals, including the most primitive ones [30]. More recently direct studies of the impact of emotionality on the semantic perception of amodal lexical material reported a negative bias in the semantic perception of socially anxious and neurotic people [31][32][33].
With such a strong and direct impact on cognition, emotionality can collapse the structure of perceived phenomena into just two dimensions related to emotional valence. There are indeed two valence-based biases noted in neurophysiological studies of emotionality.
First, emotional arousal has a well-documented negativity bias [1][2][3][4] [34][35][36]. There are more cognitively distinct negative emotions than positive ones [37][38], and people appeared more likely to feel multiple positive emotions at one time than multiple negative ones [39]. A negative default reaction has been observed in human affective judgments [40][41], judgements of objects, events, or choices [42][43], and event-related brain potentials to affective stimuli [44]. Meta-analysis of autonomic activity in response to all negative emotions combined compared with all positive emotions combined also showed a negativity bias. The components of autonomic response (diastolic blood pressure, blood volume, cardiac output, left ventricular ejection time, preejection period, pulse transit time, heart rate) had significantly greater activation during negative than positive emotions, and no autonomic responses showed the opposite pattern [44].
Second, socialization and perceived social support were found to have a positive emotionality bias. Mu-opioid receptors (MOP) inducing positive emotionality were found to be important players in the perception of social support [45] and in motivation by possible engagements in activities [46]. The impact of MOP and oxytocin on perceived social support and affiliative behavior was found to include dopamine (DA) release: both MOP and oxytocin induce this release during pro-social behavior. DA release in projections between NA, VTA and medial PFC also plays an important role in affiliative behaviour [24], [47], and an activation of the 5-HT2A receptor in the hypothalamus reportedly increases hormonal levels of oxytocin, prolactin, ACTH, corticosterone, and renin [48]. There are numerous reports on the role of oxytocin in perceived social support [47], [49][50], in affiliative behavior, and in the formation of social memories [47]. Moreover, gonadal steroids and vasopressin were found to have modulating effects on sensory, perceptual, and attentional processing of affiliative stimuli [49]. There are obvious evolutionary benefits for the development of a positive reinforcement system for socialization in humans, which we do not discuss here.
The bottom line is that these two regulatory systems (a system of emotional arousal and a system of affiliative behavior) are just two among many systems regulating human behavior. They differ from other systems, however, by having the power to impose a negative or positive emotional bias on cognitive evaluations, due to the above-noted interlocking between emotionality and cognition. As a result, their biases are more likely to appear in human estimations and even in the lexicon. The unconscious contribution of emotionality to cognitive processes raises concerns about the scientific validity of models that are based on people's evaluations, and not on experimental or clinical work.
Capacities-related bias in estimations of lexical material. Physical and social capacities were found to be interlocked with the way people evaluate stimuli. For example, physical exercise was found to shift perception toward more positive evaluations and even to produce an endorphin-based improvement of depressive symptoms [51][52][53][54]. People with higher endurance in communication were reported to perceive neutral abstract concepts significantly more positively [31] and to perform better on recall or categorization of emotionally positive words [32][33] than other temperament groups in these studies. A series of theories of embodiment, which echoed the James-Lange theory, emerged in cognitive psychology during the past 20 years [55][56][57][58][59][60][61][62][63].
Trofimova [31], in her study of the impact of sex and temperament on the perception of common words, described the phenomenon of «projection through capacities». Such a phenomenon emerges when a person's evaluations depend on their capacity to handle the events, and such evaluations therefore have bipolar outcomes (''can handle'' or ''cannot'') with corresponding positive or negative emotional dispositions towards the subject of evaluation. In projection through capacities, individual information processing therefore has a capacity-related bias, as the individual registers mostly those aspects of objects or of a situation that they can properly react to and deal with according to their inherent capacities (including a capacity to avoid harm). The interlocking of energetic capacities with emotional valence is in line with the idea of embodiment in cognition (supporting James-Lange theory) and the findings that the same neurochemical agents (neuropeptides and monoamine neurotransmitters) are involved in both energetic and emotional regulation of behavior. Such interlocking between energetic status and emotionality suggested that the largest dimensions in lexical approach models, i.e. Extraversion and Neuroticism, are not independent and essentially represent one dimension, and that executive and evaluative systems regulating human behavior might have a much closer interaction than was previously thought. This questions the benefits of a dimensionality-oriented approach in differential psychology, including the use of factor analysis, which relies on the independence of dimensions.
Lexical approach models might not be applicable to investigations in biology of individuality, due to language, capacities and emotionality biases
The lexical approach in differential psychology involves people estimating very long lists of words for describing human traits and characteristics. A factor analysis is then applied in order to classify the traits into factors presented as dimensions of personality. The lexical approach was used by Cattell to derive his 16-factor model [64] and by a number of researchers proposing 5-factor models [64][65][66][67][68][69][70][71]. The modern 5-factor model was named the ''Big Five'' by Goldberg, and was intensively promoted around the world by McCrae and Costa [71].
Since the phenomenon of personality is to a large extent a product of an individual's socialization, it would be natural to investigate how linguistic processes reflect the human perception of individual differences. This is a subject of investigation in cognitive psychology and, to a degree, personality psychology. The problem with the lexical approach arises when researchers manipulating the lists of verbal descriptors of personality present their models not as an interesting artifact of human verbal cognition or socialization processes, but as revealing a real core of biologically based individual differences in humans.
For example, researchers of the Big Five model claimed that it is capable of reflecting the structure of the biological systems of individuality (called either ''first order'' or ''second order personality traits'' within the lexical approach, and called ''temperament'' here), just by application of factor analysis to the lists of personality descriptors. (Note that temperament has been a topic of research for over 100 years, especially in the European tradition in the work of Kant, Pavlov, Heymans, Wundt, Stern, Lasursky, Jung, Adler, Kretschmer, Spranger, Teplov, Nebylizyn, Eysenck, Thayer, Gray, Tellegen, Rusalov, Netter, Watson and Tellegen, Strelau, etc. Historically the North American tradition of temperament research was scattered within three different disciplines: developmental psychology, psychiatry, and personality theory looking for biologically based traits using a lexical approach. This might explain the frequent misconception in North American psychology that ''temperament is the same as personality''.) In other words, the conclusions on the structure of biologically based traits were made using cognitive psychology methods.
The main assumption of the lexical approach is that correlations between common verbal descriptors of personality correspond to correlations between biologically based systems of human individuality. Only if this is true can the models of the lexical approach claim that they found a structure of the biologically based systems of individuality, and not just the nature of the verbal descriptors themselves. To sort out whether or not the relationships between verbal descriptors can be considered a representation of relationships within real biological systems of individuality, we should look at: 1) the nature of the descriptors (their source, scientific validation, possible biases), and 2) the completeness of the list and any asymmetries in the distribution of these descriptors that might lead to biases in the resulting factors (keep in mind that the size of factors and the pattern of loadings are a direct reflection of such relationships).
1) In terms of the nature of lexical descriptors or their perception, the arguments above showed that they have at least three well-recognized types of biases: the socialization bias of the descriptors themselves, a capacities-related (embodiment) bias and an emotionality bias in cognition. There are too many words related to socialization in comparison to other words (socialization-language bias); there is a negativity bias in words related to emotionality; and there are more words related to energetic rather than plasticity aspects of activities (capacities bias). Due to social and emotionality biases in our lexicon, if we collect all words related to individual differences and apply either factor analysis or just common perception in order to group these descriptors into conflating characteristics, the largest factors (groups) will be a) the words separated by the criteria of pro-social and socially pragmatic values (Extraversion or Activity are the candidate names) and b) the words contrasted by emotional valence, with a prevalence of negative emotionality, due to the prevalence of such words in the human lexicon.
That is what happened to early temperament models and to the personality model of the lexical approach (often generalized as a general model of human individuality). Practically all early temperament models (starting from Kant's presentation of Hippocrates's four temperaments along the dimensions of Activity and Emotionality), which were based on people's observations (Kant, Wundt, Heymans, Adler, Kretschmer [72][73][74][75]), came up with two dimensions: those describing emotionality and energetic aspects of behavior. As language and emotionality biases are real and very strong, it is not a surprise that two factors of the Big Five model (Extraversion and Neuroticism) consistently appeared in personality research performed in various cultures, whereas three other factors (Openness to Experience, Conscientiousness and Agreeableness) failed to show the same consistency across cultures or even within the same culture for different age groups [76][77]. The outcomes of the lexical approach in its contribution to differential psychology were, therefore, rather modest. Testing thousands of people just to confirm the relative independence of the two major dimensions of temperament would seem a rather expensive enterprise, especially since these two dimensions had already been described in differential psychology for at least a century.
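The argument that the size of a factor follows the number of descriptors available for a trait can be illustrated with a small numerical sketch. The numbers below are hypothetical and are not taken from any lexical study: two independent traits are assumed, with 12 descriptors for trait A and 4 for trait B, and every pair of descriptors of the same trait is assumed to correlate 0.4. For such a block of k descriptors the leading eigenvalue of the correlation matrix is 1 + (k - 1)r, so the first extracted component belongs to whichever trait the lexicon describes with more words, regardless of the trait's biological importance.

```python
def block_correlations(block_sizes, r):
    """Correlation matrix in which descriptors of the same trait correlate r
    and descriptors of different traits are uncorrelated (an assumption)."""
    labels = [t for t, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    return [[1.0 if i == j else (r if labels[i] == labels[j] else 0.0)
             for j in range(n)] for i in range(n)]


def first_component(matrix, iterations=200):
    """Power iteration: returns (largest eigenvalue, unit eigenvector)."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    mv = [sum(a * b for a, b in zip(row, v)) for row in matrix]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v


# Hypothetical lexicon: 12 descriptors of trait A, 4 descriptors of trait B.
sizes = [12, 4]
corr = block_correlations(sizes, r=0.4)
eigenvalue, loadings = first_component(corr)
share = eigenvalue / sum(sizes)        # proportion of total variance
trait_a_loading = abs(loadings[0])     # a descriptor of the over-represented trait
trait_b_loading = abs(loadings[-1])    # a descriptor of the under-represented trait
```

Under these assumptions the first component has an eigenvalue of 5.4, accounts for about 34% of the total variance and loads only on the descriptors of trait A, while the loadings of trait B descriptors are effectively zero. The sketch uses a plain power iteration rather than a full factor analysis, so it shows the direction of the effect only.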
The emotionality bias in cognition implies that more emotional people might have a greater bias and less ability to distinguish between non-evaluative criteria of assessment than less emotional people. If so, there is a possibility that relying on the estimations of verbal material by less emotional people will diminish the emotionality bias (even though the language bias will remain). On the other hand, the estimations of less emotional people might bring more contextual diversity, producing so much variance that too large a portion of it would be left unexplained by the resulting factors. The first possibility would improve the validity of the lexical approach, and the second would not. The lexical approach does not differentiate between high-emotional and low-emotional people, and it is not clear whether using it with low-emotional people would make any difference. Based on the reported emotionality bias in cognition, there is a high chance that this bias affects the results of factor analysis, gluing together the estimation criteria of highly emotional people along evaluative lines.
The existence of language and emotionality biases in verbal descriptors of human behavior has serious implications for the use of the lexical approach in differential psychology. If these biases indeed exist, then the models of the lexical approach reflect only relationships between the lexical descriptors of human individuality, and not biologically based systems of this individuality. For this reason science usually does not rely on inexpert evaluations of words describing natural phenomena; experiments and natural observations of these phenomena are used instead, especially when it comes to claims regarding biology and physiology. This paper investigates if and how these biases show up in different temperament groups, but before we focus on the details of the study, let us briefly comment here on the second key flaw in the claim of the lexical approach related to a discovery of biological systems of individuality.
2) As noted, in order for lexical approach models to reflect the nature of phenomena under study (biological systems of individuality) their lists of descriptors should be complete and should arise within the scientific, not the general, community. Eschewing experimental and physiological techniques for investigation of what biological regulatory systems indeed exist, lexical approach researchers also insist that their concept of personality is practically equal to the concept of temperament. In American psychology every time that a study on temperament is performed, it is common for reviewers to demand the inclusion of references to findings from some lexical approach study.
Meanwhile temperament is defined as the most consistent, biologically based dynamical aspects of behavioral regulation, which are relatively independent from educational and sociocultural impact [78][79][80][81][82][83][84]. Similarly to other factors which depend on neurochemistry and neurophysiology of the body (such as sex and age), temperament should be treated as comprising the biological systems of human individuality, and not personality.
On this criterion of completeness the lexical approach also shows significant flaws: it completely misses the regulatory characteristics related to lability of behaviour (plasticity-rigidity, tempo of verbal or physical activities). Yet these characteristics are well known to be based on biological systems. Pathologically low plasticity (perseveration) has been described in clinical cases of frontal lobe damage for over 70 years. Contrary to the unification of endurance and tempo under one dimension of Extraversion in the Big Five model, these characteristics are based on different biological systems: just consider the difference between the capacities used by marathoners and sprinters.
More specifically, it was suggested that temperament appeared to regulate three dynamical aspects of activities (energetic, programming-lability and sensitivity-orientational), multiplied by two probabilistic levels of regulation (stereotypical, learned vs. new or complex activities). There are also different traits regulating three distinct functional aspects of activities (motor vs. social vs. mental) and there are, of course, at least three types of emotionality traits (neuroticism, confidence and impulsivity), which do not substitute for energetic or plasticity components of behavioral regulation [28][29], [83], [94], [95]. These dynamic properties, described by the concept of temperament, traditionally were considered separately from content-related aspects of behaviour, such as values, beliefs, knowledge, motives and preferences, i.e. the key components of personality. After all, content-related aspects were thought to be mainly based on socio-cultural factors, and temperament is based on biological factors. Ever since it was described by the physicians Hippocrates and Galen 2500 years ago, European researchers of temperament linked it to the properties of nervous systems, neurochemistry and endocrinal regulation of the body.
This work showed much higher, nonlinear, dimensionality and complexity in biological regulation of behavior, even though these models also included emotionality and energetic dimensions. After all it is hard to believe that complex human behavior would be regulated just by two biological systems. Robbins and Everitt [92] pointed out in regards to the concept of Extraversion that ''various indices of arousal do not intercorrelate to a high degree, as would be expected of a unitary construct (Eysenck, 1995), and putative manipulations of arousal, whether pharmacological or psychological, do not interact in a manner suggestive of an underlying unidimensional continuum'' (p.703). Instead they described not one, but rather four reticular arousal systems, related to four types of neurotransmitters.
There have been practically no studies (except the three cited [31][32][33]) that directly investigated all of the named biases in the semantic perception of verbal material. One of the main reasons for this is the methodological challenges involved in studying the perception of meaning, i.e. semantic perception.

Methodological considerations of studies investigating the semantic perception of lexical material
The main challenge in studying the semantic perception of words is the diversity of meanings and associations that people attribute to the words. Meaning appears to be individually unique, differing not only between people from different cultures and social and family backgrounds, but also between all individuals. Many parametric methods appear to be useless in measuring meaning, as what psychosemantics does looks like comparing apples and oranges.
A common way to extract biases in meaning attribution is to apply projective methods, using material of a very general nature, which is open to individual interpretation. Projective Semantics [31], [101], based on Osgood's [97] Semantic Differential method (SD), asks people to estimate well-known general concepts using common adjectives in the form of bipolar scales. These scales are then grouped into a small number of factors to facilitate the analysis. The difference between the Semantic Differential and the Projective Semantic methods is that the SD studies conducted by Osgood or his followers used various numbers (between 15 and thousands) of diverse adjectives and concepts to be assessed. The Projective Semantic method uses primarily concepts of a very specific nature: these nouns correspond to the 7 groups (factors) of adjective scales that were most consistently found in cross-cultural studies, as described in the previous paragraphs. Seven types of adjective scales and seven types of concepts describing the same aspects of reality as the scales establish the object-scale symmetry (OSS) between scales and objects, which improves the sensitivity of the method to any bias in responses.
For example, if there are no differences between participants in the perception of the common words, then the concepts ''Reality'', ''Present'' are expected to be assessed unequivocally as ''very real'' on adjectives such as ''real-imagined'' (i.e. scales of the Reality factor); the concepts ''Complexity'', ''Chaos'' as ''very complex'' along the scales of the Complexity factor; the concepts ''Beauty'', ''Freedom'' are expected to be on the positive pole of the Evaluation scales; ''Time'', ''Development'' on the negative pole of the scales of the Stability factor, etc. It is expected that a deformation of this symmetric and very basic matrix would reveal underlying biases of two types: in the use of certain scales and/or in the assessment of certain concepts. The concepts are chosen to correspond not only to the groups (factors) of scales, but also to temperament traits, i.e. to the dynamical characteristics of behaviour (Effort, Work, Relaxation correspond to scales measuring endurance; Speed, Motion to scales related to Tempo and Plasticity of activity; Prestige, Reputation, Beauty to Emotionality scales) and to the scales related to different areas of activity, such as physical (Motion, Work, Speed), social (Society, Person, My contemporary) and intellectual (Complexity, Chaos, Time, History).
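The idea of detecting bias as a deformation of the object-scale symmetry matrix can be sketched in a few lines. This is only an illustration under stated assumptions, not the scoring procedure of the Projective Semantic method: the 7-point coding from -3 to +3, the four concepts chosen, the function names and all ratings are hypothetical. The first function measures how far each concept falls from its expected pole on its matching factor; the second measures how much evaluative colouring is given to concepts whose matching factor is non-evaluative.

```python
SCALE_MAX = 3  # assumed 7-point bipolar scales coded from -3 to +3

# concept -> (matching factor, expected pole), following the examples in the text
EXPECTED = {
    "Reality": ("Reality", +1),
    "Chaos": ("Complexity", +1),
    "Beauty": ("Evaluation", +1),
    "Time": ("Stability", -1),
}


def symmetry_deviation(mean_ratings):
    """Distance of each concept from its expected pole on its matching factor."""
    deviations = {}
    for concept, (factor, pole) in EXPECTED.items():
        observed = mean_ratings[concept][factor]
        deviations[concept] = abs(pole * SCALE_MAX - observed)
    return deviations


def evaluative_spillover(mean_ratings):
    """Mean Evaluation rating of concepts whose matching factor is not Evaluation."""
    values = [mean_ratings[concept]["Evaluation"]
              for concept, (factor, _) in EXPECTED.items()
              if factor != "Evaluation"]
    return sum(values) / len(values)


# Hypothetical mean ratings of one temperament group (factor scores per concept).
group_ratings = {
    "Reality": {"Reality": 2.5, "Evaluation": 1.0},
    "Chaos": {"Complexity": 2.0, "Evaluation": -1.5},
    "Beauty": {"Evaluation": 2.8},
    "Time": {"Stability": -1.0, "Evaluation": 0.5},
}

deviations = symmetry_deviation(group_ratings)
spillover = evaluative_spillover(group_ratings)
```

Comparing these two quantities between contrast groups would show whether a group departs from the expected matrix on particular concepts, on the Evaluation scales in general, or on both.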
Before we summarize the hypotheses and methods, the following requirements should be mentioned, which narrowed the focus of our study:
a) A concern about an impact of culture and education on the semantic perception of lexical material. Since Luria's experiments in the 1930s within low-educated cultures it has been shown that culture is an important determinant of which kinds of non-evaluative criteria people use for classification and evaluation (see [102] for review). One component of the educational impact is a previous functional experience related to the objects of assessment. We agree with Cree and McRae's [103] differentiation between amodal semantics, sensory/functional semantics and domain-specific semantics (which classifies objects according to concrete, knowledge-based features). The assumptions of the lexical approach relate to amodal semantic perception, as this approach was applied to the concepts of personality and human character, and searched for cross-cultural universality. Our study therefore should, similarly to the lexical approach, focus on amodal (and not so much on sensory/functional or feature-specific) semantic perception using the most common lexical material, accessible for all cultures and educational levels.
b) A concern about a situational embodiment effect. The idea that the somato-visceral feedback from the body affects the emotional perception of an individual was suggested independently by James and Lange at the end of the 19th century. It was echoed in the work on action construction by Bernstein [104][105], who concluded that the initial stage of our attribution of meaning to objects is related to a body-object interaction (p. 126). Several constructivist theories suggested that evaluative and motor systems of behaviour do not function in parallel, but rather work in an ensemble, simultaneously constructing an action every time anew based on the state, need and resources of the body [27], [31], [55], [104][105][106][107][108].
The embodiment theories of cognition suggested that neurophysiological systems involved in physical action contribute to the representation and comprehension of language stimuli [56][57][58][59][60][61][62][63], [109][110][111][112][113]. Fast implicit motor activation during language processing was found to be an important component of disposition for semantic retrievals [59][60][61][62][63]. Moreover, it was shown that the brain appears to process meaning attribution to words in a modality-specific manner when it comes to concrete activities and objects [61], [109][110][111][112][113][114]. To control for these effects, in addition to using amodal lexical material, setting up physically identical conditions for all subjects would help to minimize a possible situational embodiment-related bias (i.e. bias based on the specifics of body-object interaction at a given moment). Unlike the majority of embodiment studies, which use tasks of specific modalities, this study contrasted its experimental groups by more diffuse (unrelated to specific actions) dynamical aspects of behavioral regulation: endurance, changeability, directionality of behavior. In contrast to the well-defined anatomy of motor and sensory systems, temperament is based on the neurochemistry of the body (neurotransmitters, neuropeptides, opioid receptors and hormones), and therefore this study presents a different aspect of embodiment research.

Goals, hypothesis and methodological considerations of the present studies
One of the assumptions behind the lexical approach, which this study challenged, is that existing relations between features of human behavior will be adequately reflected in correlations between lexical descriptors of behavior without any language or emotionality bias. Therefore, one of the goals of our study was to investigate whether temperament groups contrasted by energetic and sensory capacities perceive lexical material differently. More generally, we investigated whether semantic perception of lexical material differed between these contrast groups for verbal non-evaluative criteria of organization, probability, complexity and stability. The first hypothesis, the projection through capacities hypothesis, was based on embodiment theories and on reports describing the interlocking between capacities and emotionality, and between emotionality and cognition. According to this hypothesis, people with higher endurance or tempo in either physical or social aspects of behaviour perceive objects, even amodal verbal material, in more positive terms than their opposite temperament groups.
It is not at all clear, however, whether people with higher social-verbal endurance and tempo of speech have more adequate semantic perception of lexical material than people with higher intellectual or physical endurance. These people might either have a more detailed perception of social concepts, or, alternatively, the social nature of language might create a positive bias in their perception in favor of socialization-related concepts. Mental (intellectual) endurance is expressed in individual attentive abilities. It is possible that people with higher scores on this trait might make more adequate use of criteria in the assessment of abstract concepts.
Moreover, the logic of embodiment theory suggests that lability traits of temperament (ability to restructure actions and tempo of performance) might have an impact on the perception of time-related lexical constructs. To date there has been practically no research focusing on the semantic perception of time concepts by people with different speeds of performance. The only study with such a focus was conducted by Trofimova [31], which used a small number of scales and concepts. This study reported that participants with higher physical tempo perceived abstract concepts as faster, more energetic and more acute, whereas participants with higher verbal tempo perceived the same objects as more constant. As an indirect indication of a bias related to the body's tempo in the perception of action verbs, several studies showed motor activation during the processing of these verbs [112][113], [115][116]. Based on these results we hypothesized that people with a stronger dynamical (temperamental) system generating a tempo of performance will project more energetic and speedy properties onto abstract concepts than people with slower tempo.
Our second goal of the study was related to the assessment of a possible impact of language (socialization) bias, especially in the perception of lexical material by people with high social-verbal capacities. The second, language bias hypothesis, suggested that people with stronger endurance and tempo in verbal-social activities would differ from people with stronger motor-physical or intellectual endurance in their semantic perception of lexical material. Moreover, as a part of this hypothesis we investigated if temperament traits which are not related to socialization or emotionality (such as plasticity and tempo) will have any interaction with semantic perception of lexical material.
The third goal of this study was to investigate whether there is a conflating impact of emotionality bias in the use of non-emotional criteria of estimation. If there is no such emotionality bias, then the non-evaluative (probabilistic or structural) criteria would be used independently from evaluative criteria in semantic processing. If there is such an emotionality bias, the polarity of non-evaluative criteria would be conflated with the valence of evaluative criteria. Our emotionality bias hypothesis suggests that emotional dispositions create a universal, object-independent bias in meaning attribution, and that this bias is especially strong in amodal semantic perception (i.e. when no concrete feature-specific objects are given, and a person has to assess abstract lexical material). We hypothesize that this bias might be so strong that it forces non-evaluative criteria to follow the valence of emotional evaluative criteria. In this case the background emotional component of semantic perception would correlate (or “glue”) parameters of assessment into the factors regardless of the objective features of these elements.
At this point we have three hypotheses, which relate to capacities-, language- and emotionality-related biases in semantic perception. In order to separate these biases in our investigation we had to use experimental groups that were contrasted by various aspects of physical, social and intellectual capacities and emotionality. Temperament traits are hard to monitor in short-term experiments, as they emerge only as consistent behavioral patterns in a variety of situations over long periods of time. Such monitoring is especially challenging if we need to study all traits at once. Neurophysiological methods for separating contrast temperament groups are also not yet developed, as temperament traits are based more on neurochemical processes than on brain morphology visible in neuroimaging studies. For these reasons the main instruments to diagnose temperament are validated temperament tests.
The present study used the activity-specific tests of temperament, which make the most detailed differentiation between all main aspects of temperament (i.e. biologically based systems regulating behavior). Both tests were developed within the longest-running (Pavlovian) experimental tradition investigating properties and types of nervous systems. These tests are called “activity-specific” because they differentiate between temperament traits regulating physical, verbal-social and mental aspects of actions [39], [94][95], [117][118][119][120]. Both tests consist of 12 temperament traits: 3 traits of Emotionality and 9 executive traits (related to energetic, lability and orientation aspects of actions, each considered in 3 types of activity: physical, social and mental).
Such an integrated approach (admittedly complicating the readability of the results) was necessary to analyze for the possibility of interactions between the three described biases in the perception of the participants of the study.
In sum, to test our hypotheses we used 12 independent variables, corresponding to 12 temperament traits. Dependent variables were not single measures, but patterns of relationships between multiple semantic measures (scales). The analysis was focused on both quantitative effects (statistically significant differences) and qualitative effects (the way that these differences were grouped along the poles and features of the scales). We expected the following: (1) If the “projection through capacities” hypothesis is true, the results would show: a) that people with higher endurance attribute more positive meaning to neutral common words than people with low endurance; b) that people with higher tempo would give universally higher estimations of timing-related concepts (Time, Motion, Development, Speed) than other temperament groups. (2) If the “language bias” hypothesis is true, a temperament group with higher verbal-social endurance will give more positive estimations of concepts than other temperament groups, due to their better verbal capacities. (3) If the “emotionality bias” hypothesis is true, then a) the differences in estimations on non-evaluative scales (describing probability, complexity and organization aspects of concepts) would follow the same polarity as on evaluative scales (which belong to the Evaluation or Stimulation factors); b) emotional reactivity, as an expectation of failure, will affect the estimations of people with higher emotionality or neuroticism, emerging as a universal negative evaluative bias in their estimations.

Procedure
The study received approval from the McMaster University Ethics Committee for all procedures. All subjects received debriefing, signed an informed consent form, then completed the Extended Structure of Temperament Questionnaire (STQ-150) and participated in the experiment. University students received a practicum credit for their participation.
Protocols having scores of 18-24 on the Validity scale were considered invalid as the respondents were likely to demonstrate a positive impression bias in their responses.
The Semantic Task experiment used 60 six-point bipolar scales to estimate the 29 general concepts (Table 1) chosen according to the Projective Semantic method as described above. Each concept was presented by the program “Expan” at the top of a computer screen, together with one of the bipolar evaluating scales placed horizontally in the middle of the screen (i.e. 60 × 29 = 1740 screens were presented for estimation). Each pole of a scale had three gradations (“very much”, “somewhat”, “weakly”). The order of scales and concepts was changed for each protocol to avoid the consecutive use of several scales related to one factor.
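The presentation scheme described above can be sketched in a few lines of Python. This is only an illustration and not the algorithm of the “Expan” program, which is not described here: the scale names, the assignment of scales to factors (round-robin rather than the real Table 1), the grouping of screens by concept and the greedy interleaving rule are all assumptions made for the example.

```python
import random
from collections import defaultdict

# Factor labels follow the text; the scale-to-factor mapping below is hypothetical.
FACTORS = ["Evaluation", "Stimulation", "Power", "Probability",
           "Complexity", "Organization", "Stability"]


def make_hypothetical_scales(n_scales=60):
    """Placeholder scales assigned to factors round-robin (NOT the real Table 1)."""
    return {f"scale_{i:02d}": FACTORS[i % len(FACTORS)] for i in range(n_scales)}


def interleave_by_factor(scale_to_factor, rng):
    """Order scales so that two neighbouring scales never share a factor.

    Greedy rule (an assumption): always draw from the fullest remaining factor
    that differs from the factor used last; ties are broken at random.
    """
    pools = defaultdict(list)
    for scale, factor in scale_to_factor.items():
        pools[factor].append(scale)
    for scales in pools.values():
        rng.shuffle(scales)
    order, last = [], None
    while any(pools.values()):
        candidates = [f for f, s in pools.items() if s and f != last]
        if not candidates:
            raise ValueError("one factor has too many scales to interleave")
        fullest = max(len(pools[f]) for f in candidates)
        factor = rng.choice([f for f in candidates if len(pools[f]) == fullest])
        order.append(pools[factor].pop())
        last = factor
    return order


def build_schedule(concepts, scale_to_factor, seed):
    """Return a per-protocol list of (concept, scale) screens."""
    rng = random.Random(seed)  # a different seed gives a different protocol order
    concepts = list(concepts)
    rng.shuffle(concepts)
    schedule = []
    for concept in concepts:
        for scale in interleave_by_factor(scale_to_factor, rng):
            schedule.append((concept, scale))
    return schedule


if __name__ == "__main__":
    concepts = [f"concept_{i:02d}" for i in range(29)]
    schedule = build_schedule(concepts, make_hypothetical_scales(), seed=1)
    print(len(schedule))  # 29 concepts x 60 scales = 1740 screens
```

The sketch only guarantees that neighbouring scales within one concept come from different factors; whether the original program also randomized across concept boundaries is not stated in the text.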

Statistical processing
To avoid the impact of a social desirability or negative impression bias, the contrast temperament groups were selected not on the basis of T-scores on the STQ, but on the basis of the relative rank of a score on a given scale in relation to the scores on the other scales within an individual's temperament profile. The rationale was that participants might over- or underestimate the expression of their dynamical traits under the influence of social expectations, but they could be more objective about which temperament characteristic is stronger or weaker in comparison to their other characteristics. For each protocol, the scores on the 12 scales were transformed into 12 ranks. The protocols were sorted into three groups with the lowest, middle and highest ranks, and the contrast groups (i.e. lowest (1-4) and highest (9-12)) were processed further. Table 2 shows the sizes of the contrast groups, and Table 3 shows the statistical details related to temperament scores. The differences in estimations were assessed with the Mann-Whitney U test, separately for men's and women's contrast temperament groups (based on the 12 scales of the STQ-150). To control for multiple comparisons with the Bonferroni correction, the significance threshold was set to p<0.0063.
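The selection and testing steps can be illustrated with a minimal Python sketch. It is not the software used in the study. Several details are assumptions made for the example: tied scores within a profile receive average ranks (so a tied rank such as 4.5 falls into the middle group), the Mann-Whitney p-value uses the normal approximation with a tie correction and no continuity correction, and the profile values in the usage part are invented. Only the rank cut-offs (1-4 and 9-12) and the threshold 0.0063 come from the text.

```python
import math
from collections import Counter

ALPHA = 0.0063  # Bonferroni-corrected threshold reported in the text


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def contrast_group(rank):
    """Ranks 1-4 -> 'low', 9-12 -> 'high', everything else -> 'middle'."""
    if rank <= 4:
        return "low"
    if rank >= 9:
        return "high"
    return "middle"


def profile_groups(profile):
    """Map each scale of one respondent's 12-scale profile to its group."""
    names = list(profile)
    ranks = average_ranks([profile[name] for name in names])
    return {name: contrast_group(rank) for name, rank in zip(names, ranks)}


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # all observations identical
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Invented profile; the scale codes are the STQ-150 abbreviations used in the text.
    profile = dict(zip(
        ["ERM", "ERS", "ERI", "PLM", "PLS", "PLI",
         "TMM", "TMS", "TMI", "EMM", "EMS", "EMI"],
        [31, 44, 28, 35, 40, 30, 37, 42, 29, 25, 33, 27]))
    print(profile_groups(profile))
    # Invented ratings of one concept on one 6-point scale by two contrast groups.
    u, p = mann_whitney_u([5, 6, 5, 4, 6, 5], [3, 2, 4, 3, 2, 3])
    print(u, p, p < ALPHA)
```

Ranking within a profile means that a respondent who inflates or deflates all of their answers still lands in the same contrast groups, which is the property the procedure above relies on.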

Results, Study 1
In order to summarize the results of 1740 estimations given by each participant in 24 contrast temperament groups and 2 gender groups, we grouped the objects according to their cluster analysis and grouped the scales according to factor analysis. Factor analysis confirmed the affiliation of the scales to seven factors, as presented in Table 1: Stimulation, Evaluation, Power, Reality-Probability (typicality), Complexity, Organization, and Stability-limitation. In more detail, Figure 1 assigns a specific color pattern to each of the 7 factors (connotative groups), to facilitate the perception of the spectrum of significant differences. The height of the pattern is proportional to the number of scales related to the particular factor that showed significant differences, with greater heights representing more scales (see Table 4, Table S1 for statistical details). The position of the patterns with respect to the zero line indicates the polarity of estimations preferred by the group with the higher scores on a given trait, i.e. to which side of the scale this group's means were closer. A position above 0 indicates that the means were closer to the positive pole of the scale; a position below 0, to the negative pole (see Table 1 for the assignment of poles). For example, for the concepts of social attractors, Figure 1, Table 4 (and Table S1) show that in total 10 scales had significant differences in the male group contrasted by the Social-verbal Endurance trait. From the height of each pattern one deduces that there were significant differences on the scales of the Stimulation (2 scales), Probability (2 scales), Organization (3 scales), Stability (1 scale) and Complexity (1 scale) factors. This stacked column shows that the estimations of socially energetic men are significantly closer to the positive poles on Stimulation, Probability and Organization descriptors and to the negative pole on Stability and Complexity adjectives than those given by less energetic men.
Similarly, for the same concepts and the same temperament group (high Social Endurance), but in the female sub-sample, there were 9 scales, with the height of the patterns indicating significant differences on 4 Probability scales, 3 Organization scales, and also 1 Stability and 1 Complexity scale, with highly social women preferring the positive poles of the eight scales.
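The way a stacked column of Figure 1 is read can be expressed as a small tally. The following Python sketch is an illustration only: the input list is hypothetical rather than taken from Table 4, and the helper `polarity` assumes that higher numeric ratings lie toward the positive pole of a scale, which depends on the coding in Table 1.

```python
from collections import Counter


def polarity(high_group_mean, low_group_mean):
    """+1 if the high-trait group is closer to the positive pole, -1 if closer to
    the negative pole, 0 if equal (assumes higher rating = positive pole)."""
    if high_group_mean > low_group_mean:
        return 1
    if high_group_mean < low_group_mean:
        return -1
    return 0


def stacked_column(significant_scales):
    """Tally significant scales per factor for one column of the figure.

    significant_scales: iterable of (factor, polarity) pairs, one per scale
    that showed a significant difference, with polarity +1 or -1.
    Returns {factor: (height_above_zero, height_below_zero)}, where the
    second value is zero or negative.
    """
    above, below = Counter(), Counter()
    for factor, sign in significant_scales:
        if sign > 0:
            above[factor] += 1
        elif sign < 0:
            below[factor] += 1
        else:
            raise ValueError("a significant difference needs a direction")
    factors = sorted(set(above) | set(below))
    return {factor: (above[factor], -below[factor]) for factor in factors}


if __name__ == "__main__":
    # Hypothetical significant differences for one temperament group and one
    # group of concepts.
    example = [("Evaluation", +1), ("Evaluation", +1),
               ("Organization", +1), ("Stability", -1)]
    for factor, heights in stacked_column(example).items():
        print(factor, heights)
```

Segments above zero then correspond to scales on which the high-trait group preferred the positive pole, and segments below zero to scales on which it preferred the negative pole.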
The results show that in terms of capacity-bias effects, a high number of statistically significant differences were found for male groups contrasted by physical (Motor) Endurance. Men with reported stronger physical endurance estimated work- and time-related concepts significantly more positively than men with reported weaker endurance. The same positive bias, but with a smaller number of statistically significant differences, was found in male estimations of people- and reality-related concepts and social attractors. Temperament groups contrasted by Social-verbal Endurance (ERS) showed a positive bias in estimations of socially energetic participants, which was consistent across the two sex groups and all groups of concepts.
In terms of lability-related temperamental traits, males with higher scores on the scale of Motor Tempo had significantly more positive estimations of timing-related concepts than the opposite temperament group, but this was not the case for female groups. At the same time males with a faster Tempo of Social-verbal activities had significantly more negative estimations of Power and Timing-related concepts than males with a slower Social Tempo. Men with higher Social Plasticity gave significantly more negative estimations to social attractors and work-related concepts than men with lower Social Plasticity scores.
The differences between social, physical and intellectual endurance were that 1) higher social-verbal endurance was associated with a more general positive bias in estimations, and that 2) intellectual endurance (ERI), i.e. the ability to stay focused on a mental task, was associated in females with a bias that was opposite to that of the social and physical types of endurance. Women with higher ERI estimated social attractors, work-related and timing-related concepts with a more negative bias than women with lower ERI scores. Men with higher ERI scores, however, gave more positive estimations of reality-related concepts than males with lower ERI.
In terms of emotionality effects, non-evaluative scales followed the polarity of evaluative scales in a very consistent manner (Fig. 1, Table 4). For example, whenever a temperament group assessed the concepts as more “good” and “interesting” (i.e. with more positive estimations on the scales of the Evaluation and Stimulation factors than the contrast group), it would also assess these concepts as more “organized”, “probable”, “real” and “stable” (i.e. also with positive estimations on the scales of the Organization, Probability-Reality and Stability factors). The impact of Emotionality traits was not universal across objects (concepts). It was specific to possible areas of failures and was gender-specific. The largest number of significant differences was observed in temperament groups contrasted by Social Emotionality and Social Endurance. Socially emotional men saw people-related concepts, reality (life)-related concepts and social attractors as more draining, uninteresting, irritating, severe, dark, dirty, cold, false, unknown, improbable, irregular, irrational, senseless, and unstable than socially low-emotional men.
Females with higher Social Emotionality estimated people-related concepts as less interesting (p = 0.0017), and life/reality-related concepts as more negative along four scales of the Evaluation and Stimulation factors than low-emotional females (p = 0.0000-0.0048). A differential impact of Emotionality related to physical-Motor activities (EMM) in male contrast temperament groups was observed for people-related concepts: men with higher scores in this type of emotionality estimated the concepts Person, Unknown person, Society and My contemporary as more simple and overall more positive than low-emotional men. There were also sex differences in contrast EMM groups in estimations of social attractors and reality-related concepts: while female contrast groups did not show many differences (one in each group of concepts), men with higher EMM scores estimated social attractors as significantly more imaginary, impossible and unusual, and reality-related concepts as more chaotic, irregular, rare, imagined and unstable than low-emotional men.
Overall, male groups had a much higher number of statistically significant differences between contrast temperament groups than females. Males also had more diverse scores on temperament scales than women: their standard deviations were higher than female SDs on 9 out of 12 scales (the exceptions were the Social Plasticity, Motor and Intellectual Emotionality scales) (Tables 2 and 3). An ANOVA comparison of the means showed that males scored higher on six temperament scales: Motor-physical Endurance (ERM) (at p<0.0000), Motor Plasticity (PLM), Motor Tempo (TMM) (at p<0.0025-0.0029), Intellectual Endurance (ERI) (at p<0.0042), Intellectual Plasticity (PLI) and Intellectual Tempo (TMI) (at p<0.0000). Females scored higher on Social Endurance (ERS) (at p<0.003), Social-verbal Tempo (TMS) (at p<0.0000), and Emotionality in Intellectual (EMI) and Social (EMS) aspects of activities (at p<0.0000). The scales of Social Plasticity (PLS) and Emotionality in motor-physical (EMM) activities did not show significant sex differences (Table 3).
The data are deposited on the server of McMaster University, Faculty of Health Sciences at: http://fhs.mcmaster.ca/cilab/DataPLOS1.xls.

Method, Study 2
This study investigated the same hypotheses as Study 1, using a partially different version of the temperament measure.

Procedure
All subjects received debriefing, signed an informed consent form, then completed the Compact Structure of Temperament Questionnaire (STQ-77) and participated in the Semantic Task experiment.
The Compact Structure of Temperament Questionnaire (STQ-77) [83], [94], [119][120] has 77 statements, assigned to 12 temperamental scales (6 items each) and the validity scale (5 items). The Semantic Task experiment and statistical procedures were similar to Study 1. The only difference was that participants assessed 24, and not 29 concepts, with the concepts Task, Effort, Unknown person, My contemporary and Reputation excluded.

Results, Study 2
In regards to the “projection through capacities” hypothesis, males with stronger Motor Endurance (ERM) estimated people-, work/reality- and time-related concepts in more positive terms than males with weaker endurance. Females with stronger ERM estimated social attractors in more positive terms than females with weaker ERM. Both male and female temperament groups with stronger Social-verbal Endurance showed a universal positive bias in their estimations, especially for social and work/reality-related concepts, in comparison to participants with lower sociability. A trait of self-confidence created a positive evaluative bias only in estimations of social attractors, and only in men.
In terms of tempo-related scales, the most significant positive bias in estimations of men with higher Motor Tempo was found in their evaluation of time-related concepts.
Similar to the findings of Study 1, social, physical and intellectual endurance were associated with different biases in estimations. When significant differences were found between the temperament groups contrasted by physical and social endurance, the people with stronger endurance had a more positive evaluative bias than people with weaker endurance. Intellectual endurance, however, was associated with a negative evaluative bias. Women with higher Intellectual Endurance estimated social attractors as more unstable, disorganized, negative and non-stimulating compared to women with the lower ERI. No differences were found in the male groups.
In terms of emotionality, the differences on non-evaluative scales (i.e. those included in the factors of Probability-Reality, Complexity, Stability and Organization) followed the polarity of differences on evaluative scales (included in the factors Evaluation, Stimulation, Power). Men with higher Neuroticism scores on the STQ-77 estimated people-related concepts in more negative terms than low-neurotic men, but almost no differences were found for social attractors in either male or female groups contrasted by neuroticism. Both men and women with high Neuroticism and high Empathy estimated work- and reality-related concepts with a negative bias in comparison to low-neurotic and low-empathic groups. Highly neurotic women also gave more negative evaluations of timing-related concepts and of the Past and Future than their contrast temperament group.
Temperament traits related to different types of sensitivity (sensitivity to sensations, sensitivity to probabilities, empathy and neuroticism) all had a polarity of bias that was opposite to that of the physical and social endurance traits (in the cases when significant differences were found). Both men and women with higher scores on the Sensitivity to Sensation (SS) scale of the STQ-77 gave more negative estimations of work/reality-related concepts than participants with lower sensation seeking. The Sensitivity to Probabilities scale, which measures a person's ability to learn and to derive causal relationships, showed effects in the male contrast group: men with high scores on this scale estimated social attractors and time-related concepts as significantly more bad, cold, insignificant, imaginary, unusual, blurred, irregular, unreliable and slow than men with lower scores. Men with higher ranks on the Sensation Seeking scale estimated the concepts Simplicity, Order, Relaxation and Faith as significantly less stable, organized (!), and real than low sensation-seekers. At the same time participants with high Social Endurance (sociability) in both sex groups gave more positive evaluations than their contrast group.
Table 5. Means (M), standard deviations (SD) and ANOVA effects (F, p) of sex differences in means on STQ-77 scales (Study 2).
Overall, male contrast temperament groups had a higher number of statistically significant differences between them than did the female groups (Table 6, Figure 2, Table S2). An ANOVA comparison of the means showed that males scored higher on seven temperament scales: Motor-physical Endurance, Motor Tempo, Sensitivity to Probabilities (at p<0.0000), Plasticity, Self-Confidence (at p<0.0003), (EMP) (at p<0.009) and Intellectual Endurance (at p<0.01). Females scored higher on Social-verbal Tempo (at p<0.0001).

Discussion
In theory, and from the lexical approach perspective, there should not be any temperament-related differences in the assessment of amodal common concepts using very common adjectives, especially in people with at least high school education in a developed Western country. The experimental material in our studies had a very general and non-biased nature. Yet, a complex pattern of significant temperament-related differences was found in semantic perception even for words with a very high level of generality. The results were more profound for the STQ-150, in comparison to the STQ-77, but this might be due to the size of the samples.
Our projection through capacities hypothesis was supported in comparisons of estimations of temperament groups contrasted by two (physical and social-verbal) types of endurance and tempo. When significant differences were found, participants with stronger physical or social endurance in both studies gave more positive ratings to concepts than participants with weaker endurance of these types (Figures 1-2). Men with weaker motor endurance but with faster social-verbal tempo had even more negative estimations of work- and reality-related concepts than women with the same traits (Figure S1, Table S3). This was consistent with the positive evaluation bias observed in extraverts in other studies [31][32][33]. Significantly fewer differences were found between the temperament groups contrasted by intellectual endurance, and when such differences were found they had a pattern opposite to the groups contrasted by the other two types of endurance (physical and social). In line with our hypothesis, men with higher Motor Tempo and Endurance gave more positive ratings to timing-related concepts than did their opposite groups. It was interesting that in both studies significant differences were found for the concepts Order and Simplicity (i.e. the concepts requiring the opposite of lability) between female temperament groups contrasted by lability traits (Motor Tempo in Study 1 and Impulsivity in Study 2). Faster and more impulsive females saw these objects in more negative terms. In regards to our “language bias” hypothesis, we found a strong pro-social bias in estimations of general concepts. The scales of Social-verbal Endurance and Social Emotionality/Neuroticism were associated with many more significant differences in estimations than all other temperament scales, and this was observed in both studies.
Social endurance had the largest number of significant differences, and people with high social endurance had a tendency toward more cheerful estimations, even when it came to non-social concepts (such as the Work-Reality, Simplicity-Order or Timing groups). Social emotionality, i.e. sensitivity to failures in social activities, produced a much stronger negative affective bias in meaning attribution than sensitivity to failures in physical activities.
Moreover, the concepts related to social attractors and “people” had the highest number of significant differences between temperament groups. If the strongest effects, i.e. the largest variance in data, are produced by the difference between estimations of social vs. non-social people, and to a lesser degree by any other temperament types, then the factors resulting from public assessments of individuality would reflect mostly socialization aspects. The pro-social bias of language skews the frequency of common lexical descriptors related to socialization vs. other aspects of behavior and therefore makes these descriptors an unreliable source of information, especially in regards to psychological phenomena that have a strong social component. Such a pro-social bias of lexical material in semantic processing supports our arguments about flaws (namely observer's bias) in the lexical approach as a method of investigation of the structure of some objects. If the strongest effects in lexical approach studies are induced by the estimations of either socially emotional or socially active people, this makes modeling within the lexical approach “a science of extraverts”, with limited benefits for general differential psychology.
Our third hypothesis was also supported, and this finding was in line with the neuropsychological reports of the interlocking of emotional processing with attention and perception [20][21][22]. Overall the pattern of our results showed the existence of an initial evaluative stage in human semantic perception, which uses two emotional poles even in estimations of abstract neutral concepts. A strong effect of emotionality bias was found in a universal tendency across objects for non-evaluative scales to follow the same polarity as the scales of the Evaluation and Stimulation factors (for example, whenever a contrast temperament group assessed a concept as more “good” and “interesting”, it would also assess this word as more “organized”, “probable”, and “real”). Such grouping (“gluing”) of non-evaluative scales with evaluative criteria was described in Kelly's theory and was likely the reason for Osgood's factors having a strong evaluative content. It is likely that in the perception of words related to more concrete objects, this primary stage of meaning attribution is followed by other stages of detailed, knowledge-, education-, experience- and intelligence-driven meaning attribution.
More importantly, the inherent bipolarity and evaluative nature of amodal semantic perception means that this bipolarity can be projected onto the properties of objects of estimation. It is likely that when the lexical approach asks people to estimate individual characteristics of other people using just verbal descriptors, bipolar evaluative bias will dominate over other criteria of assessment. This emotionality bias, combined with the social nature of language, would divide all non-evaluative features of an object into groups of features related to the interests of the society. In the example of the models of individuality offered by the lexical approach, these categories would relate to social approach and withdrawal behavior. Such a division would present the perspective of a socialized and emotional observer, judging the object's (i.e. personality's) features primarily from the point of view of socialization, but omitting other important (non-evaluative) features of the object. As a result, when the lexical approach or parental verbal observations are used to derive a model of personality or temperament “in the way how people see it”, it is natural to expect that the biggest dimensions of the model would be Extraversion and Neuroticism, or Approach and Withdrawal, or Positive/Negative emotional dispositions.
Even more problematic are the claims of the lexical approach that this method found the structure of all biologically based individual differences. In spite of the intensive promotion of the Big Five model, it does not correspond to the findings in experimental (i.e. more objective) and neurophysiological studies in differential psychophysiology. These findings indicated that there are important biologically based characteristics that are unrelated to socialization or emotional evaluation, and therefore went unnoticed by a human observer of personality or temperament structure. For example, the lability of behavior (mobility-rigidity of generation of an action, impulsivity, preferred tempo of performance) and traits related to the types of preferred reinforcers (sensation seeking, empathy, causal thinking) were differentiated from endurance traits in experiments on the properties of nervous systems and in several temperament models [78][79][80][81][82][83][84][85][86][87][88][89][90][91], [94][95], [117][118], but were missed in lexical approach models of individuality. Moreover, extraversion, described as a dimension related to the energetic aspects of activity, missed a differentiation between several types of endurance: social-verbal (ability to sustain prolonged conversations), mental (ability to stay focused on mental tasks) and physical. Yet, the differentiation between these three types of endurance is in line with the functional specialization of temporal, frontal and sensory-motor cortex [83]. (Note that recently, to accommodate these findings, Big Five researchers had to use additional techniques in factor analysis, presenting sociability, impulsivity, positive affect, empathy, self-confidence and (in some models) sensation seeking as “second order” traits, i.e. components of extraversion. This did not help to overcome a bipolar emotional division of traits: positive affect was identified as a part of Extraversion, and negative affect as a part of Neuroticism, even though affective systems are based on different neurophysiological systems than other components of these traits).
There was an aspect of emotionality bias, which we did not expect in our hypothesis, but which suggests an interaction between emotionality and capacities-related biases. This aspect relates to a specificity of emotional sensitivity to lexical material, even when the most amodal and abstract material is used. In line with observations in clinical psychology, our results showed that people with higher sensitivity to failure and neuroticism had more negative estimations of neutral abstract concepts, i.e. emotional negativity bias. Such effects were, however, far from being universal across both gender groups and the objects of estimation. Negative affect had a tendency to color the perception of emotional people in relation to specific possible areas of their failures, and not blindly to all objects. For example, socially emotional and neurotic men in both studies perceived people-related and reality-related concepts more negatively than their contrast groups, but men who were sensitive just to their failures in physical activities saw people-related concepts significantly more positively than low-emotional men. Interestingly, when temperament was not considered, males estimated social attractors (Beauty, Prestige, Reputation, Power) more positively than women while few sex differences in estimations were found for the concepts describing ''people'' [101].
Further evidence of the impact of emotionality in an object-specific manner was that socially emotional and neurotic women perceived time-related concepts in more negative terms than other concepts. This is consistent with the more negative estimations of women on timing concepts when temperament differences were not taken into account [76], [121] and significantly lower scores on the temperament scales of Motor Tempo in women, in comparison to men in both studies (Tables 3-4). To integrate the results from Trofimova's previous [101] and the present studies it can be proposed that, judging by lower emotionality scores on temperament scales in men, it almost looks like men usually care less about their social failures than do women, but those who do care (i.e. men with Social Emotionality) have really big issues, namely, with people, and much less so with social values. Women's emotionality, however, simply affects their meaning attribution when it relates to the perception of concepts of speed and timing, in which they feel inferior.
Similarly to the results of the studies of sex differences in the semantic perception of lexical material [101], [121], men had more negative evaluations for work- and reality-related concepts than women (Figure S1 and Table S3). This coincides with the findings of Study 2 that both men and women with high sensation seeking evaluated these concepts more negatively than participants with lower sensation seeking, and that men with higher sensation seeking estimated the concepts Simplicity, Order, Relaxation and Faith as significantly less stable, organized, and real than low sensation-seekers. Men are reported to have higher risk- and sensation-seeking behavior, especially in youth [91], and it is natural to see that concepts related to routines are perceived more negatively by men than by women [101], [121]. This study did not find significant differences between men and women on sensation seeking per se, but it is possible that it is extreme temperament traits that induce the biases in semantic perception. For example, men with both lower and higher social endurance gave significantly lower evaluations for work- and reality-related concepts than women with such traits (Figure S1). Different types of sensitivity (sensitivity to probabilities, to other people's state (empathy), to physical sensations (sensation seeking), neuroticism) were associated in Study 2 with either no differences or negative estimation biases by people with high scores on such sensitivities. Here we see how easy it is for humans to be confused about the structure of individual differences when different traits ''look the same'' on a bipolar dimension related to emotional or social behavior. 
People with high sensitivity of very different types (sensation seeking, empathy, probabilistic thinking), people with high attentive abilities and people with lower physical and social endurance are likely all to have negative evaluative biases and could be classified as one group (previously known as ''introverts''). People with high endurance and tempo, but of different kinds, as well as people with poor attention (intellectual endurance) are likely to exhibit positive evaluative biases, and from the emotionality and socialization point of view they would all be in a group of ''extraverts''. Such emotional biases in semantic perception, which endorse socialization-based categories over other structural analysis, were likely the reason why early differential psychologists came up with two-dimensional models based on either two poles of emotionality or dimensions of Energy-Strength-Activity-Arousal-Extraversion and Emotionality-Neuroticism.
These findings and arguments question the validity of the lexical approach in differential psychology, which derives the structure of human individuality based upon people's estimations of verbal material (see Discussion of the Controversial Issues for further discussion). The coupling of several types of biases in the semantic perception of lexical material likely affects the way that scales group into factors in the factor-analytic models of the lexical approach, and masks important objective features of the assessed objects. Repeating the application of this approach in dozens of languages brings consistent results because it doesn't change the social-evaluative nature of lexical material and doesn't remove the flaws of this approach. Besides, the use of linear factor analysis, which looks for independent dimensions, is hardly appropriate in psychological investigations due to nonlinearity, feedback and contingent relationships (i.e. interdependence) between psychological characteristics. Similar to deriving a structure of an object by measuring its shadows on the walls, deriving the structure of biologically based individual differences from people's lexical appraisals of observable behavior is likely not a very informative scientific method.
In summary, our studies showed that capacities-related, language-related and emotionality biases in semantic perception make people very unreliable observers, especially when lexical material is used and when it comes to assessment of social or people-related concepts.
The limitation of the study was related to the use of self-evaluation tests to assess the temperament traits. Considering that 12 temperament traits had to be measured in the same sample, only one method could be realistically implemented: the use of a self-report test, with calculation of the rank of the extent to which an individual feels a given trait is developed. The use of a rank-based instead of a value-based system for classification of subjects into the contrast temperament groups (as described in the Method section) hopefully addressed this limitation.

Conclusions
Our studies investigated an impact of biases related to biologically-based capacities, language and emotionality on the semantic perception of lexical material. All three types of biases were found, even though the lexical material was of the most neutral, abstract and amodal nature.
In line with the ''projection through capacities'' hypothesis and previous findings, we found that people with higher physical and social endurance gave more positive evaluations to neutral concepts than people who felt that their endurance was rather weak. Moreover, participants with faster physical tempo gave more positive estimations to time-related concepts than participants with slower tempo. These findings reflect on another aspect of embodiment in cognition: such embodiment emerges not only as an impact of the situational physical state of a body, but also as a contribution from the consistent dynamical (potential rather than situational) capacities of the body in the semantic perception of lexical material.
A language bias in lexical material was identified when experimental groups contrasted by social emotionality, social endurance and social tempo showed a higher number of effects than the groups contrasted by other abilities. The concepts related to social objects had more significant differences between these contrast temperament groups than the non-social concepts.
A strong impact of emotionality appeared in findings that non-evaluative criteria for categorization (related to complexity, organization, stability and probability of occurrence of objects) followed the polarity of evaluative criteria, and did not show even a weak independence from this polarity. Moreover, neurotic people did not have a universal negative bias in their perception, and negativity bias was rather specific to words describing potential areas of failure in various contrast groups. This was an indication that emotionality bias overpowers semantic perception and collapses differentiation of all other possible descriptors to bipolar emotional criteria.
Overall these findings suggest that people's estimations of lexical material related to human behavior are 1) capacities-biased; 2) unreliable, judging by the opposite patterns in estimation of different temperament groups; 3) influenced by the social nature of language, designed to improve processes of socialization and social interaction, and 4) driven by emotionality, which shrinks the dimensionality of possible criteria into bipolar evaluative constructs. The described biases in estimations of lexical material lead to a collapse of the complexity of perceived objects into two or even one dimension based on emotional valence. The study shows that such a collapse happens even for the most neutral and abstract lexical constructs.
This questions the validity of the lexical approach as a method for the objective study of psychological phenomena (including the example of biologically-based individual differences, which was discussed in this article).

Discussion of the Controversial Issues
This paper suggests that the lexical approach is not an appropriate tool for the investigation of biologically-based traits (called in differential psychology ''temperament''). The lexical approach might still be a valid tool for the investigation of social-verbal phenomena, including the influence of lexical processing on the perception of personality differences. The limitations of the lexical approach discussed here relate to its weakness in representation of biologically-based systems of individuality and not to the way in which socialization shapes our perception of personality types. These comments will likely meet objections from personality psychologists using the Big Five model. During the process of review and revision of this article several issues were discussed that are relevant in addressing such objections. The author is grateful to the reviewers and to the editor for suggestions for clarifying the author's position on the following issues.
1. Lexical approach is an analysis of relationships between lexical descriptors of behavior, and not actual behaviour. For those who defend lexical approach models it is useful to keep in mind the nature of the material that these models are based on. The dates of publication are given here to underline the time line of this research. This approach started when Allport (1937) suggested that since most relevant personality characteristics are encoded in natural language we can derive all aspects of individuality just using language descriptors. It was assumed that even biologically based characteristics will be fairly reflected in language. Now we know that this is not true, and that language is by nature a social invention, designed to reflect primarily socialization aspects of human life, and not biological factors of individuality. Moreover, there is a strong positive emotionality bias associated with socialization processes and a negative emotionality bias in people with high emotional arousal. A more detailed review of lexical approach research can be found in John and Srivastava (2001), but in brief the Allport-Odbert collection consisted of about 18000 personality descriptors, which several American psychologists tried to sort out, including Cattell (1945 Goldberg (1990) used as scales in his FA. Goldberg had a series of studies using 1710, 475 or 435 trait adjectives with various groupings into clusters and factors. In several studies he used self and peer ratings using the selected adjectives, and then conducted FA on his data. He consistently received a 5-factor solution, similar to the 5-factor solution received by Tupes and Christal (1961), Norman (1963), Borgatta (1964), Digman and Takemoto-Chock (1981) and then promoted by McCrae and Costa (1992). Moreover, in cross-cultural studies the same approach was used: it started from the collecting of lexical descriptors in other languages, with application of FA to group these descriptors. 
After difficulties replicating the same 5-factor structure in several languages, cross-cultural studies shifted to simple adaptation of the NEO-FFI to other languages and verification of the psychometric properties of new versions of the test.
In other words, the Big Five was developed based on research that used subjective selection of lexical descriptors, and self- and peer assessment of correspondence between (only these) descriptors and observable behavior. And that is what the Big Five represents: a consistent model of how humans reflect individuality using language, no more. There were no considerations of findings in neuroanatomy, neurochemistry, experimental psychology, observations of behavior of people or animals in real situations; none of this was used at the research stage leading to the development of the Big Five. In this sense we can say that the Big Five does not represent the structure of temperament or the structure of biologically based traits, even though lexical perception reflects some elements of it.

2. Why the results of this study do not confirm the validity of the Big Five model of personality but rather show its deficits. This article suggests that models of the lexical approach reflect only relationships between lexical personality descriptors that are affected by three types of biases, and do not present biologically based systems of human individuality. We noted that there are at least three biases that are present in the human lexicon and that can compromise the results of factor analysis in the lexical approach: too many words related to socialization, in comparison to other words (socialization-language bias); negativity bias in words related to emotionality; and more words related to energetic rather than plasticity aspects of activities (capacities bias). In addition, the descriptors of individual behaviour that the lexical approach uses are borrowed from common language and not from scientific language (experimental studies, clinical observations, modeling, theoretical research). As a result, what the lexical approach found is not the structure of biologically-based regulatory systems (temperament), but the socialization and emotionality biases in perception of lexical personality descriptors. Our study investigated whether temperament traits describing socialization and emotionality will show any interaction, or show different interactions with the perception of verbal material than traits that do not relate to socialization and emotionality. After all, as was described in the article, the impact of emotionality on cognition was shown to be strong and not under the control of the individual. The study showed that in spite of multiple components within temperament structure, temperament traits influence semantic cognition in concert with the described biases, at least with the negativity bias of emotionality and positivity bias of socialization. 
The way that they interact (more emotional people indeed have more negative estimations and more social or energetic people have indeed more positive estimations) confirms the interlocking between emotionality, capacities and sociability, on the one hand, and cognition, on the other hand. Suggestions of emotionality- and sociability-related biases in the assessment of lexical material are bad news for the lexical approach, and having confirmation of such biases using contrast temperament groups is even worse news.
These biases overshadow temperament traits (neurophysiological systems of regulation) which are not related to socialization or emotionality: plasticity, tempo, impulsivity, differentiation between regulatory systems of mental, socio-verbal and physical aspects of behavior. This means that estimations of lexical material collapse the complexity of perceived objects into two or even one dimension based on emotional valence. The study shows that such a collapse happens even for the most neutral and abstract lexical constructs.
For example, if the descriptors referred to elementary particles and only reflected positive-negative charge-related observations (but not spin, mass, strangeness etc.) we would not be able to differentiate between the particles comprising matter and fields (fermions vs. bosons), or stable and unstable particles. Moreover, neutral particles would not be described at all. Similarly, emotionality bias divides perceived properties into positive and negative valence, grouping very different traits into one category and leaving emotionally neutral properties out of the analysis. The lexical approach completely misses the regulatory characteristics related to lability of behaviour (plasticity-rigidity, tempo of verbal or physical activities). Yet, these characteristics are well-known for being based on biological systems. Pathologically low plasticity (perseverance) has been described in clinical cases of frontal lobe damage for over 70 years. Contrary to unification of endurance and tempo under one dimension of Extraversion in the Big Five model, these characteristics are based on different biological systems: just consider the difference between capacities used by marathoners and sprinters.
In summary, the study showed how the emotional cognition of lexical material collapses the dimensionality of perceived phenomena, and this is not necessarily good news for the Big Five theory. Findings within the lexical approach are useful for investigations of verbal processes within cognitive psychology, but we should not mix common human beliefs with scientific findings in neurophysiology and differential psychology.
3. What the lexical approach missed. As noted above, the three biases in the perception of lexical material collapse the complexity of regulatory systems into two dimensions based on emotional valence. Let us briefly list the aspects of biological systems of regulation of human behavior that are being hidden during such a collapse. It is almost impossible to summarize all important findings in temperament research, neuroanatomy and neurochemistry which relate to these aspects, but here are just several examples:
- The findings in neurochemistry indicate that our behaviour is regulated by systems related to the functional aspects of construction of an action: orientation, programming and energetic maintenance (endurance) of an action [92][93], [106], [122][123][124][125][126][127][128][129]. These functional aspects have leading neurotransmitters regulating them (in concert with other neurotransmitters): NE-based orientation system, DA-based programming and integration system and 5-HT maintenance/performance system [28][29], [92], [130].
- These three aspects are regulated differently during routines, habit/skills formation and habit use; such a habit management system is based on GABA/Glutamate exchange with the three other neurotransmitter systems. Moreover, there are several additional levels (hypothalamic hormones, neuropeptides including opioid receptors) regulating the same functional aspects of behaviour, especially in the deterministic aspects of behaviour [24], [28][29]. Neuroanatomically cortical vs. striatum integration of an action relates to the differences between regulation of novel/complex vs. routine aspects of the action (since [104][105] literature is extensive). In this sense a separation between temperament traits related to probabilistic vs. deterministic aspects of behaviour was useful in temperament research.
- There are multiple reports indicating that nonlinearity, contingency and feedback processes are essential properties of literally every single neurochemical system of behavioral regulation: serotonin [130][131], monoamines [92], [132][133][134][135], prolactin [136], hypocretins [137][138][139][140], and HPA hormones [141]. What nonlinearity and feedback properties mean is that a linear increase of one parameter (call it a factor if you wish) does not give a proportional response in behaviour (increase or decrease on some observable trait), but instead can have several, often opposite responses in observations. What contingency means is the presence of ''if this… then..'' in 2-, 3- and multi-way relationships between chemical agents regulating our behaviour. Examine any decent handbook of Neurochemistry (for example, [135]) to appreciate the complexity of our regulatory system. There is no way that common adjectives related to human behaviour would carefully reflect this complexity if common people, and even a majority of psychologists, do not know how their behaviour is regulated (neurochemically) inside their own body.
- In spite of the contributions of oxytocin and dopamine systems to the regulation of behaviour they are only a small part of the neurochemistry of behavioural regulation. Moreover, numerous neuropeptides and hormones likely contribute more to biological systems of behavioral regulation (i.e. temperament) than monoamine neurotransmitters.
- In clinical psychology and psychiatry there were several observations as well as analyses of the effect of neurotransmitters that resulted in multiple models of temperament (or character) [81][82], [85].
- Developmental psychologists were using parent observations of their children in the study of temperament for more than half of the last century. Even though parents are not the most reliable raters, the fact that these studies used standardized observations of behavioral elements of real individuals (and not verbal descriptors of such elements) makes these studies more valuable contributors to differential psychology than the lexical approach [78][79][80], [90].
4. ''A cat knows whose meat she ate.'' Temperament research has continued for more than a century, considering that experiments on properties and types of nervous systems started in Europe in 1906, and a set of temperament models were developed within European psychiatry and American developmental psychology since then. It is a rather parochial position of personality theorists to avoid citation of the findings of these traditions, especially considering that the first models including Emotionality and Activity/Energy dimensions of individuality were developed long before the appearance of the lexical approach (look at the dates: Kant, 1798; Heymanns, 1910-1923). It is worth noting that those first models describing these two dimensions targeted an explanation of the nature of Hippocrates' four temperaments. Pavlov's research on temperament, and clinical models of several European psychiatrists were also an attempt to explain the four ancient temperament types. In this sense it would be at least fair to acknowledge that the two main dimensions in the Big Five model (and the only two that showed more or less stability across cultures) were previously described in temperament research. This acknowledgment would of course devalue the novelty of the Big Five model and would suggest that it is in fact temperament and not personality structure that this model describes.
To avoid such devaluation, Big Five papers a) never mention the early two-dimensional models describing four temperament types; b) Eysenck and Gray models are named as models of personality, and the word ''temperament'' in personality papers seems to be prohibited; c) there are no citations of temperament research in personality journals, whether past or present; d) the word ''temperament'' is never used for their factors describing biologically based traits; e) whenever an article on temperament comes to a journal of personality it is rejected without review, especially if it does not cite studies using Big Five. You can contact the author for examples.
Personality researchers working within the Big Five seem to have difficulties admitting that they started digging in somebody else's garden and abandoned their own: their subject is the socialization and acculturation processes shaping human individuality, and not biologically based systems of behavioral regulation (temperament). Temperament, similar to sex and age, has nothing to do with socialization, even though, like any other psychological property, temperament interacts with social factors. Also similar to sex and age, temperament relates to neurochemical systems, and social-cultural factors have a rather minimal influence on it. It is wrong therefore to believe that ''temperament is almost the same as personality'', or that ''temperament is under the umbrella of personality''. We don't put concepts of sex and age under the umbrella of personality, even though studies show personality changes with age, and differences in extraversion between men and women. If we don't mix sex and age with personality, we should not do it with temperament. The differences between personality and temperament lie in the differences between biological factors of individuality and products of socialization. Besides, if we want to pick an umbrella, it is temperament that is the stronger and more consistent factor determining the behaviour of an individual, with personality-related socialization developing on the basis of temperament during the lifetime (i.e. it is personality that should be under the umbrella of temperament). The expression about the ''umbrella'', however, is scientifically counterproductive. It mixes two different concepts in one pile, and leads to more confusion. It would be more productive if personality theorists would stop downplaying the concept of temperament and the findings within temperament research and not project their lexical models onto biological systems of behavioral regulation.
5. We usually do not replace theoretical physics with the maintenance reports of technicians servicing the equipment. For the same reason we should not judge the validity of a model of individual differences based on the reports of psychometric properties of self-report tests. After all, biological sciences have more weight in determining what biologically-based traits of individuality exist than psychometric measures of self-reports with regard to these traits. If we want to measure biologically-based systems that make people consistently different, a traditional scientific approach consists of two stages. The first stage is to identify the biological systems of individuality, i.e. to partition all the behavioral variance into dimensions that later should determine the scales of our ''individuality'' test. That is what differential psychology and differential neurophysiology do. At this stage of research it would be natural to derive our partitions based on: a) neuroanatomic studies; b) studies of neurochemical systems regulating human behaviour; c) studies of similar biologically-based traits in other mammals, and d) studies of consistent differences between young children (after all, if we study biologically based characteristics of individuality we need to be sure that they are not a product of social expectations and cultural training).
Only when we have an idea of what can be found in biology that makes people different, can we switch to the second, psychometric, stage in the development of our Individuality test. Luckily the findings in research of types and properties of nervous systems, neurochemistry, neuroanatomy and psychiatry have a lot to offer. The psychometric stage has very little to do with biological investigations and should not substitute for the first stage. Psychometrics is a small area of applied psychology, developing techniques for monitoring the quality of psychological tests. The psychometrist's job is to verify that the items in our test reflect whatever researchers have identified as an important regulatory system; that all items along one scale measure the same trait (without asking the same question all the time); that scales are relatively independent and do not influence each other. For whatever properties we identify at the Research stage we can ask test developers to design a device measuring these properties. If we didn't order the test developers to have items reflecting some important property identified at the Research stage -a scale measuring this property would not appear by itself, and existing items belonging to other scales would reflect the void. If we did put too many items related to the same property -we will receive the strongest and most consistent factor, but it might have nothing to do with how important this property is in real biological systems of individuality. In this sense, if the Research stage is not done properly, the resulting test will measure our object only partially and likely from the wrong angle. For this reason the devices used in medicine are based on principles discovered in biology, clinical and chemical studies of actual life systems, and not on the technical notes of the staff tuning these devices.
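The point about item imbalance can be illustrated with a small simulation. The sketch below is not part of the reported studies: it assumes two hypothetical, independent latent traits of equal strength, measured by 10 and 3 noisy items respectively, and uses the eigenvalues of the item correlation matrix (principal components) as a simple stand-in for factor extraction. All names and parameter values are illustrative assumptions.

```python
import numpy as np


def simulate_item_imbalance(n_people=2000, n_items_a=10, n_items_b=3,
                            noise=1.0, seed=0):
    """Two equally strong, independent traits measured by unequal item counts.

    Returns the eigenvalues (descending) and matching eigenvectors of the
    item correlation matrix. Hypothetical illustration only.
    """
    rng = np.random.default_rng(seed)
    # Both latent traits have the same variance, i.e. the same "importance".
    trait_a = rng.standard_normal(n_people)
    trait_b = rng.standard_normal(n_people)
    # Each item is the latent trait plus independent noise.
    items_a = trait_a[:, None] + noise * rng.standard_normal((n_people, n_items_a))
    items_b = trait_b[:, None] + noise * rng.standard_normal((n_people, n_items_b))
    items = np.hstack([items_a, items_b])
    corr = np.corrcoef(items, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    return eigvals[order], eigvecs[:, order]


eigvals, eigvecs = simulate_item_imbalance()
# Share of total item variance carried by the two largest components.
share = eigvals[:2] / eigvals.sum()
# Fraction of the first component's loadings that falls on the 10 items of trait A.
mass_on_a = np.sum(eigvecs[:10, 0] ** 2)
print("variance share of first two components:", np.round(share, 2))
print("loading mass of component 1 on trait-A items:", round(float(mass_on_a), 2))
```

Under these assumptions the over-represented trait produces by far the largest component, although both traits contribute equally to the simulated individual differences; the size of a factor here reflects the number of items, not the importance of the trait.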
In summary, the Research stage results in the description of an object to be measured and the nature of this object, which is relevant to obtain for our goals. The Psychometric stage works on the development and perfection of a measurement device to measure only those aspects of our object that the Research stage pointed to. Yet, I often hear arguments in favour of lexical approach models of biologically based traits based on psychometric studies, including CFA and EFA of their tests. It is the same difference as the difference between theoretical/experimental physics and engineering. Engineers are the wrong people to ask about what the Universe is made from. Similarly, psychometrists are the wrong people to ask about what fundamental, biologically-based regulatory systems humans have: they cannot answer questions on the nature of the object that we measure. They can only answer questions about the measurement device.
6. There is a fundamental conflict of interest between psychometrics and differential psychology: dimensionality vs. functionality. This conflict of interest emerges when conclusions are drawn as to the structure of individuality. Psychometrics seeks out independence of a test's dimensions (scales) and (in the best case scenario) correspondence of test items to some actual elements of behaviour. Psychometrics does not concern itself with the nature of these behavioral elements, only about the correspondence between what is observed and the reading on the measuring device. That's where psychometrics and all techniques based on factor analysis (FA) really lose out: they are incapable of dealing with feedback, and nonlinear and contingent relationships between systems. Yet all life systems (including human individuality) are based on such relationships between their components. These relationships mean interdependency between dimensions, and feedback, nonlinear and contingent mechanisms are described in numerous reports within neuroanatomy and neurochemistry. For this reason functionality (the functional role of regulatory systems) rather than dimensionality should provide the main criteria for describing the structure of natural systems.
If we apply a method designed to search for independent dimensions to systems that have strong interdependency, by definition we will have the wrong picture, even though we are guaranteed, as in fortune telling, to have some sort of picture. For this reason factor analysis is never considered to be applicable in neurochemical or neuroanatomic models and research. Even outside of psychometrics and the lexical approach, FA is almost useless in the derivation of the taxonomy of objects whose elements have contingent and feedback relationships. By my estimation, I have conducted about 2000 FA protocols over the past 10 years alone, within my research on the dimensionality of semantic spaces. Thus I am not as afraid as other psychologists might be to criticise FA and to say that it is a very weak investigative tool for research on the structural relationships within natural systems. FA is very dependent on what items you put into its centrifuge, and the more degrees of freedom (diversity, variability) our research object has, the weaker are the resulting factors. This is the position of a differential psychologist. For psychometrists, however, it is very important that the scales are independent and do not overlap. The quality of the test is measured by the independence of the scales. This is problematic, as NONE of the traits or properties ever measured by any psychological test are completely independent from any other psychological property. For this reason there is no such thing as a perfect test, and psychometrists (i.e. people developing tests) try very hard to brush out this interdependence by creating substructures, smaller facets, hierarchies of factors, etc. It is hard work, but it does not change the fact that FA is completely incapable of reflecting feedback and contingent relationships within the systems under study.
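The point about nonlinear relationships can be illustrated with a minimal sketch. It is not an analysis from the two studies reported here: the variables `a`, `b` and `c` and their values are hypothetical, chosen only for illustration. FA operates on a matrix of linear (Pearson) correlations, so a variable that is fully determined by another through a symmetric nonlinear rule can enter that matrix with a correlation close to zero, that is, as if the two were independent.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical activity levels of "system A", symmetric around zero.
a = [i / 10 for i in range(-50, 51)]

# Hypothetical "system B": completely determined by A, but nonlinearly.
b = [x ** 2 for x in a]

# For comparison, a variable that depends on A linearly.
c = [2 * x + 1 for x in a]

r_nonlinear = pearson(a, b)  # close to 0 despite full dependence
r_linear = pearson(a, c)     # close to 1

print("r(A, B), nonlinear dependence:", r_nonlinear)
print("r(A, C), linear dependence:", r_linear)
```

In this toy case any method that starts from the correlation matrix would treat A and B as unrelated, although B is a deterministic function of A. Real regulatory systems are of course far more complex; the sketch only shows the direction of the problem.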
Figure S1 The number of statistically significant sex differences in estimation of given concepts within specific temperament groups: low and high Motor Endurance (ERM), low and high Social Endurance (ERS), and high Social Tempo (TMS). The stacked columns represent the total number of significant differences in the groups of concepts. The colours represent the spectrum of these differences along seven factors to which the scales are associated. The sign indicates the pole of the scales chosen by the male group with the higher scores on a given temperament trait for the given concepts (for example, a positive pole of the scales of Complexity factor is ''complex'' and a negative pole is ''simple''). Female groups with these traits had therefore the opposite patterns of estimations. ERM: Motor Endurance, ERS: Social Endurance, TMS: Social Tempo. (TIF)
Table S1 The complete list of the significant differences in estimations of contrast temperament groups, Study 1.