The authors have read the journal's policy and have the following conflict: Dr. R. Goekoop is a psychiatrist working for Parnassia Bavo Group, which is strictly a nonprofit, nonacademic (“peripheral”) mental health care institution. Like many other nonacademic centers in The Netherlands, it has turned into a company due to government policies that promote a “market-oriented” view of health care. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials. Parnassia Bavo Group has no interest whatsoever in commercial exploitation of the findings reported in the present paper.
Conceived and designed the experiments: RG JGG. Performed the experiments: RG. Analyzed the data: RG. Contributed reagents/materials/analysis tools: RG HSS. Wrote the paper: RG. Reviewed the manuscript: RG JGG HSS.
Human personality is described preferentially in terms of factors (dimensions) found using factor analysis. An alternative and highly related method is network analysis, which may have several advantages over factor analytic methods.
To directly compare the ability of network community detection (NCD) and principal component factor analysis (PCA) to examine modularity in multidimensional datasets such as the neuroticism-extraversion-openness personality inventory revised (NEO-PI-R).
A total of 434 healthy subjects were tested on the NEO-PI-R. PCA was performed to extract factor structures (FS) of the current dataset using both item scores and facet scores. Correlational network graphs were constructed from univariate correlation matrices of interactions between both items and facets. These networks were pruned in a link-by-link fashion while calculating the network community structure (NCS) of each resulting network using the Wakita-Tsurumi clustering algorithm. NCSs were matched against FS and networks of best matches were kept for further analysis.
At facet level, NCS showed a best match (96.2%) with a ‘confirmatory’ 5FS. At item level, NCS showed a best match (80%) with the standard 5FS and involved a total of 6 network clusters. Lesser matches were found with ‘confirmatory’ 5FS and ‘exploratory’ 6FS of the current dataset. Network analysis did not identify facets as a separate level of organization in between items and clusters. A small-world network structure was found in both item-level and facet-level networks.
We present the first optimized network graph of personality traits according to the NEO-PI-R: a ‘Personality Web’. Such a web may represent the possible routes that subjects can take during personality development. NCD outperforms PCA by producing plausible modularity at item level in nonstandard datasets, and can identify the key roles of individual items and clusters in the network.
Currently, the most influential way of looking at human personality is the multidimensional trait approach
In personality research, PCA is the preferred type of factor analysis. A single component score (or factor score) can be calculated for each principal component by averaging the scores on the covarying subvariables that define the component. Thus, a limited set of factor scores (a factor profile) can be used to provide a compact description of individual subjects, hence its attraction in personality research. Factor profiles can be used to define the personality of individual subjects or groups. Such profiles allow for predictions of specific human behaviors (e.g. cigarette smoking, obesity, divorce) under certain circumstances (e.g. stress)
Currently, the most influential multidimensional descriptive model of human personality is the fivefactor model
Although the NEO-PI-R has an impressive record of empirical research behind it, its factors are based on PCA, which has several limitations. Most importantly, the mutual relationships between the items that make up the factors are not explicitly modeled, and are hence disregarded. This may be unfortunate, since some of these interactions may have disproportionate importance when compared to others (e.g. some items may be correlated with more or fewer other items, show stronger or weaker correlations, explain more variance in factor scores, or have causal dominance over others). As a result, items, facets and factors of the NEO-PI-R are given the same weighting and diagnostic value. This may not be desirable, given the possibility that certain personality traits (such as neuroticism or agreeableness) have a disproportionate importance in mediating healthy personality development. Additionally, PCA is known to produce erroneous results when performed at item level in smaller-than-standard personality datasets. This is because PCA requires large numbers of subjects (typically 6 times the number of items) to produce reliable factor structures
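The sample-size argument above can be made concrete with a small calculation. The sketch below applies the "6 subjects per variable" rule of thumb quoted in the text to the 30 facets and 240 items of the questionnaire and to the 434 subjects of the present study; the function name `min_subjects` is a hypothetical helper introduced only for illustration.

```python
def min_subjects(n_variables: int, ratio: int = 6) -> int:
    """Subjects needed under a 'ratio subjects per variable' rule of thumb."""
    return ratio * n_variables


N_SUBJECTS = 434  # sample size reported in this study

for level, n_vars in (("facet", 30), ("item", 240)):
    needed = min_subjects(n_vars)
    verdict = "met" if N_SUBJECTS >= needed else "not met"
    print(f"{level} level: need {needed}, have {N_SUBJECTS} -> {verdict}")
```

Under this rule the facet-level analysis (180 subjects needed) is covered by the present sample, whereas the item-level analysis (1440 subjects needed) is not, which is why the dataset is called nonstandard at item level.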
To address these issues, we examined whether a novel method to identify modularity in datasets (Network Community Detection, NCD) could compensate for some of the limitations of PCA. Interactions between variables in a dataset (such as correlations between item scores of the NEO-PI-R) can be viewed as a network of nodes (items) that communicate with each other through links (e.g. significant correlations). NCD involves the identification of dense cliques of interacting nodes within network graphs. In a recent study, it has been shown that network clusters produced by some NCD algorithms are both practically and theoretically very similar to those identified using PCA
To examine the results of NCD when compared to PCA, we performed both types of analyses on data of 434 healthy subjects who completed the NEO-PI-R. Network graphs were created from correlation matrices representing the interactions between item scores and facet scores of the NEO-PI-R, in which nodes represent items or facets, links between the nodes represent significant correlations, and weights along the links are correlation coefficients (r). Given the theoretical similarity between PCA and NCD, we hypothesized that the FS of the NEO-PI-R dataset would closely match its NCS. NCSs and FSs were first compared at ‘facet level’, i.e. between FSs and NCSs based on facet scores. Since PCA performed at facet level produces results that are comparable between standard and nonstandard datasets such as the present dataset, NCSs were expected to resemble both the FS of the current dataset and the norm structure. Next, we examined NCSs in networks representing correlations between item scores. This was done to examine whether NCD, as opposed to PCA, would produce plausible modularity at item level in the current (nonstandard) dataset. If NCD, in contrast to PCA, found meaningful modules, this would indicate that NCD can outperform factor analysis at this level. If plausible modules occurred (i.e. modules resembling the results of PCA at item level in standard datasets), we expected the emergence of large-scale network clusters without an intermediate facet level, as is the case when using PCA at item level in standard datasets. Finally, we expected to obtain a richer view of the singular nature of individual items, facets and network clusters than provided by PCA.
To allow for a direct comparison between NCD and PCA, a matching procedure was used in which the match of NCS to FS was optimized. In order to find the best match, the global threshold of the personality network (i.e. the threshold for the significance of a link) was raised incrementally in the order of increasing levels of significance of links (‘incremental pruning’) until its NCS showed an optimal match with FS. Three different FSs were used as templates for matching: the FS of the standard (norm) dataset that contains 5 principal components (the ‘standard 5FS’), as well as the FSs derived from the present dataset using both a 5-factor PCA (the ‘confirmatory 5FS’) and an exploratory PCA, which produced a six-factor structure (the ‘exploratory 6FS’). Thus, it was possible to examine whether NCD in nonstandard datasets such as the present dataset would produce NCSs that follow the standard FS rather than the local (exploratory or confirmatory) FSs. If that were the case, NCD would show greater generalizability of its modularity than PCA. As a null hypothesis, we expected the generalizability of NCD to be the same as or worse than that of PCA. Thus, we expected NCSs to show a better match with exploratory and confirmatory FSs from the present dataset than with the standard FS, since the latter is derived from a different (norm) dataset. Network graphs of the ‘winning’ network cluster decompositions were produced at both item and facet levels. This provided the first description of human personality structure in terms of an optimized network of mutually dependent personality traits. Such a ‘Personality Web’ shows certain paths (sequences of traits) that represent an array of routes (“highways”) that subjects may take during the course of personality development. We identified network nodes and modules that are potentially important for personality development in terms of their singular attributes and position within the personality network.
All subjects in this study provided both verbal and written informed consent and were aware that their personality rating scores were to be used for research purposes. This procedure was approved by the Ethics Committee of the Department of Psychology of the University of Amsterdam, under project number 2008PN427. No research was conducted outside our country of residence (The Netherlands) or outside of the context of the institutions that contributed to this study (see affiliations).
A group of 434 healthy Dutch psychology students was selected for this study. The only inclusion criterion was the ability to sustain an interview of about 40 minutes. Exclusion criteria were signs of psychopathology as defined by DSM-IV-TR (as assessed by SCL-90) and a native language other than Dutch. Male to female ratio was 28.4% versus 71.6%, mean age was 20.6 years (SD 5.39, range 17–61).
All subjects completed the NEO-PI-R self-rating scale
At facet level, exploratory and confirmatory PCA produced a 6-factor and a 5-factor decomposition, respectively (see
As expected, exploratory PCA on item scores in the present (nonstandard) dataset produced implausible results: a 10-factor solution was found that, on visual inspection, showed no resemblance to either the standard 5-factor structure or any of the 30 facets of the NEO-PI-R (
Table: Factor structure at facet level. Columns: Standard (N, E, O, A, C), Confirmatory (Factor 1 to Factor 5), Exploratory (Factor 1 to Factor 6). Rows: facets n1–n6, e1–e6, o1–o6, a1–a6, c1–c6. Loadings listed per facet (column assignment not recoverable): n2: −.570, −.524; n5: −.435, .404; e5: −.410; e6: .483; o4: .502; a1: .463; a3: .514.
Network graphs were constructed at both item and facet levels from symmetrical univariate correlation matrices, with row and column names referring to items or facets. These matrices were filled with the corresponding correlation coefficients (r) and transformed into undirected and weighted network graphs by means of NodeXL
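As a rough illustration of this construction step, the following sketch turns a symmetric correlation matrix into an undirected, weighted edge list. It is not the NodeXL workflow used in the study; the function name `correlation_network`, the cutoff parameter `r_min`, and the toy correlation values are all assumptions made for the example.

```python
from itertools import combinations


def correlation_network(names, corr, r_min=0.0):
    """Return {(a, b): r} for every unordered pair with |r| >= r_min.

    names: node labels (items or facets), in matrix order.
    corr:  symmetric correlation matrix as a list of lists.
    """
    edges = {}
    for i, j in combinations(range(len(names)), 2):
        r = corr[i][j]
        if abs(r) >= r_min:
            edges[(names[i], names[j])] = r  # weight = correlation coefficient
    return edges


# Toy values, invented for illustration only (not data from the study).
names = ["n1", "n3", "e1", "e6"]
corr = [
    [1.00, 0.60, -0.10, -0.05],
    [0.60, 1.00, -0.30, -0.20],
    [-0.10, -0.30, 1.00, 0.55],
    [-0.05, -0.20, 0.55, 1.00],
]
edges = correlation_network(names, corr, r_min=0.25)
for (a, b), r in edges.items():
    print(a, b, r)
```

Only the upper triangle of the matrix is visited, so each link appears once and the diagonal (self-correlations) never produces a self-loop.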
In order to identify network clusters, we used the Wakita-Tsurumi NCD algorithm integrated within NodeXL
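To give a feel for how modularity-based community detection builds clusters, here is a minimal sketch of greedy agglomerative modularity optimization on an unweighted graph. It is an assumption-laden stand-in, not the Wakita-Tsurumi implementation in NodeXL: that algorithm is generally described as a greedy modularity method with additional heuristics for balancing community sizes, which are omitted here, as are link weights. The function names are hypothetical.

```python
def modularity(edges, community_of):
    """Newman modularity Q of a partition of an unweighted, undirected graph."""
    m = len(edges)
    degree_sum, inside = {}, {}
    for a, b in edges:
        for node in (a, b):
            c = community_of[node]
            degree_sum[c] = degree_sum.get(c, 0) + 1
        if community_of[a] == community_of[b]:
            c = community_of[a]
            inside[c] = inside.get(c, 0) + 1
    return sum(inside.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def greedy_communities(edges):
    """Start from singletons; repeatedly apply the merge that raises Q most."""
    community_of = {n: n for e in edges for n in e}
    best_q = modularity(edges, community_of)
    while True:
        best_merge = None
        # Only communities joined by at least one link are merge candidates.
        pairs = {tuple(sorted((community_of[a], community_of[b])))
                 for a, b in edges if community_of[a] != community_of[b]}
        for keep, drop in sorted(pairs):
            trial = {n: (keep if c == drop else c)
                     for n, c in community_of.items()}
            q = modularity(edges, trial)
            if q > best_q:
                best_q, best_merge = q, trial
        if best_merge is None:  # no merge improves Q any further
            return community_of, best_q
        community_of = best_merge


# Two triangles joined by a single bridge (toy graph).
toy = [("a", "b"), ("b", "c"), ("a", "c"),
       ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
labels, q = greedy_communities(toy)
print(labels, round(q, 3))
```

Recomputing Q from scratch for every candidate merge is wasteful and only acceptable for toy graphs; practical implementations update Q incrementally.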
In contrast to computer networks or the internet, links in correlational network graphs are present with a certain probability, or ‘significance’. The p-value of a correlation (or network link) expresses the probability that the correlation is unjustified (i.e. the link is not there). Hence, the smaller p, the higher the chance of a connection being present. The identification of an optimal NCS in correlational network graphs (such as a graph of the NEO-PI-R dataset) therefore involves the identification of a level of probability p for the significance of a link at which the NCS of the network is optimal. Until now, a definitive way of defining a p-value at which an optimal NCS is obtained has been lacking from the international literature. Here, we describe a procedure by which the global threshold of the facet-level and item-level personality networks is gradually raised (i.e. the networks are pruned in a link-by-link fashion), in the order of increasing r (correlation coefficient, which is directly linked to p), until an optimal match is found between NCS and different template FSs. This “incremental pruning” technique gradually removed the least significant links from the network. After removal of each link, NCD was applied and the resulting subgraph and corresponding network cluster decomposition were saved for further analysis. At facet level, this procedure resulted in a total of 420 network cluster decompositions ((30*30)/2 − 30 links). At item level, a total of 28560 network cluster decompositions ((240*240)/2 − 240 links) were produced. These NCSs were matched against the factor contents of the three alternative FSs produced at item and facet levels (standard 5FS, confirmatory 5FS and exploratory 6FS), which served as matching ‘templates’. The matching of network cluster composition with factor structures involved the comparison of the ‘factor membership’ and ‘network cluster membership’ of each individual facet and item of the NEO-PI-R rating scale.
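The incremental pruning loop described above can be sketched as follows. This is a minimal illustration under stated assumptions: links are ordered by the absolute value of r (the text says only "increasing r"), the community detector is passed in as a callable, and `connected_components` is merely a placeholder detector so that the example runs on its own. All names and the toy weights are hypothetical.

```python
def connected_components(pairs):
    """Placeholder 'detector': groups of nodes that are still linked."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)
    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(g) for g in groups.values())


def incremental_pruning(edges, detect):
    """Remove the weakest remaining link, then re-run detection; repeat.

    edges:  {(a, b): r}
    detect: callable taking a list of (a, b) pairs.
    Yields (threshold, remaining_edges, communities) after each removal.
    """
    remaining = dict(edges)
    for pair in sorted(edges, key=lambda e: abs(edges[e])):
        threshold = abs(remaining.pop(pair))
        if not remaining:
            break
        yield threshold, dict(remaining), detect(list(remaining))


# Toy weights, invented for illustration only.
toy_edges = {("n1", "n3"): 0.60, ("n3", "e1"): -0.30,
             ("e1", "e6"): 0.55, ("n1", "e6"): 0.10}
for threshold, kept, communities in incremental_pruning(toy_edges, connected_components):
    print(threshold, len(kept), communities)
```

Every yielded decomposition would then be scored against the template factor structures, and the threshold with the lowest mismatch kept. Note that nodes whose last link has been pruned drop out of the edge list in this sketch.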
Factor membership was represented as a two-dimensional binary matrix (i.e. factor number × facet number, and factor number × item number), which was filled with ones (1) for membership and zeroes (0) for nonmembership. A similar matrix was made for network cluster membership (cluster number × facet number, or cluster number × item number). This was done for all individual subgraphs derived from the incremental pruning stage. Mismatches between factor and cluster structures were identified by subtraction of the binary membership matrices of factor structures and network cluster structures for all subgraphs and corresponding network clusters derived from incremental pruning, resulting in a cluster-to-factor mismatch (dissimilarity) measure for each subgraph and corresponding set of clusters. Since factors and clusters could differ in their respective sizes, the level of mismatch could differ between comparisons on this account. To prevent unreliable mismatch scores as a result of differences in factor or cluster sizes, we normalized the mismatch scores with respect to these size differences, by taking the absolute mismatch score for each cluster-to-factor comparison and dividing it by the maximum possible mismatch score for that comparison (i.e. factor size + cluster size).
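One way to read this normalization is sketched below. It assumes that the absolute mismatch equals the number of variables that belong to exactly one of the two groups (the nonzero entries after subtracting the two binary membership vectors); the function name `normalized_mismatch` is hypothetical. The example uses the facet-level cluster 4 reported in the node-metrics table (a1 to a6 plus n2) against the standard agreeableness factor.

```python
def normalized_mismatch(factor, cluster):
    """Mismatch between one factor and one cluster, scaled to the range 0..1.

    Absolute mismatch: members found in exactly one of the two sets.
    Maximum possible mismatch: factor size + cluster size (disjoint sets).
    """
    factor, cluster = set(factor), set(cluster)
    return len(factor ^ cluster) / (len(factor) + len(cluster))


agreeableness = {"a1", "a2", "a3", "a4", "a5", "a6"}
cluster4 = agreeableness | {"n2"}  # facet-level cluster 4 also contains n2

print(f"{normalized_mismatch(agreeableness, cluster4):.1%}")  # 7.7%
```

Under this reading a single misplaced facet gives 1/13, i.e. the 7.7% mismatch listed for agreeableness in the facet-level matching table, while identical sets score 0 and disjoint sets score 1.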
Factor solutions allow the same items or facets to load on multiple factors (
In some cases, the matching procedure could result in more than one solution with equally low cluster-to-factor mismatch scores at different thresholds for the significance of a link (e.g. Facet Level_SOLUTION1 and Facet Level_SOLUTION2). If multiple community structures were found that showed an equal lowest level of mismatch, a winning NCS was identified by selecting the NCS that explained the greatest amount of variance in the corresponding factor scores. To this end, network cluster scores were calculated by summing facet or item scores and dividing the result by the total number of facets in the network cluster. Next, the correlation coefficients of significant correlations (p<0.01) between cluster and factor scores were squared (r²) and summed to produce a measure of the total amount of variance in factor scores, as explained by network cluster scores.
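The tie-breaking measure can be sketched as below. This is an illustration under assumptions, not the authors' code: the significance test (p<0.01) is replaced by a simple cutoff `r_min` on the absolute correlation, which the caller would have to choose to correspond to that p-value for the sample size at hand; all function names and the toy scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equally long score lists (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cluster_scores(subject_rows, members):
    """Per subject: sum of the member scores divided by the cluster size."""
    return [sum(row[m] for m in members) / len(members) for row in subject_rows]


def explained_variance(cluster_to_scores, factor_to_scores, r_min):
    """Sum of r^2 over all cluster-factor pairs with |r| >= r_min."""
    total = 0.0
    for c in cluster_to_scores.values():
        for f in factor_to_scores.values():
            r = pearson_r(c, f)
            if abs(r) >= r_min:  # stand-in for the p<0.01 criterion
                total += r * r
    return total


# Toy scores for four subjects, invented for illustration only.
rows = [{"n1": 0, "n3": 2, "e1": 1}, {"n1": 2, "n3": 2, "e1": -1},
        {"n1": 3, "n3": 3, "e1": 1}, {"n1": 5, "n3": 3, "e1": -1}]
clusters = {"c1": cluster_scores(rows, ["n1", "n3"]),
            "c2": cluster_scores(rows, ["e1"])}
factors = {"f1": [2, 4, 6, 8]}
print(explained_variance(clusters, factors, r_min=0.5))
```

Among solutions tied on mismatch, the one with the largest summed r² would be selected as the winning NCS.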
For item, facet and cluster level networks, the following network metrics were calculated for each node
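For readers unfamiliar with these metrics, the sketch below computes three of the simpler ones (degree, local clustering coefficient and graph density) from an edge list using their textbook definitions. It is not the NodeXL implementation, and the helper names are hypothetical; NodeXL's centrality measures may use different normalizations.

```python
from itertools import combinations


def adjacency(pairs):
    """Neighbor sets of an undirected graph given as (a, b) pairs."""
    adj = {}
    for a, b in pairs:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


def density(n_nodes, n_edges):
    """Observed links divided by the number of possible undirected links."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


# Toy graph: a triangle (a, b, c) with one pendant node d attached to c.
adj = adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
for node in sorted(adj):
    print(node, len(adj[node]), round(clustering_coefficient(adj, node), 3))

# Densities from the vertex and edge counts in the network-metrics table.
print(round(density(30, 125), 4), round(density(240, 5738), 4), round(density(6, 13), 4))
```

With the vertex and edge counts reported for the facet, item and cluster level networks, the density formula gives 0.2874, 0.2001 and 0.8667, matching the Graph Density row of the table.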
At facet level, exploratory factor analysis showed a 6-factor structure that deviated from the standard 5-factor structure, although the standard structure could still be largely recognized (
As expected, PCA at item level produced erroneous or weak results (see Introduction and Materials and Methods). Exploratory PCA showed a 10-factor structure without any resemblance to either a 5-factor structure or a 30-facet structure (
Cluster-to-factor matching, facet level
Factor structure  Factor  Best overall match: r  p  % mismatch  Best match per factor: r  p  % mismatch
STANDARD (5)  N  0.271  4.89E-09  20.0%  0.215  3.07E-06  9.1%
STANDARD (5)  E  0.271  4.89E-09  0.0%  0.271  4.89E-09  0.0%
STANDARD (5)  O  0.271  4.89E-09  0.0%  0.271  4.89E-09  0.0%
STANDARD (5)  A  0.271  4.89E-09  7.7%  0.265  1.10E-08  0.0%
STANDARD (5)  C  0.271  4.89E-09  7.7%  0.299  1.04E-10  0.0%
CONFIRMATORY (5)  F1  0.271  4.89E-09  11.1%  0.215  3.07E-06  0.0%
CONFIRMATORY (5)  F2  0.271  4.89E-09  0.0%  0.271  4.89E-09  0.0%
CONFIRMATORY (5)  F3  0.271  4.89E-09  0.0%  0.271  4.89E-09  0.0%
CONFIRMATORY (5)  F4  0.271  4.89E-09  7.7%  −0.265  1.10E-08  0.0%
CONFIRMATORY (5)  F5  0.271  4.89E-09  0.0%  0.271  4.89E-09  0.0%
EXPLORATORY (6)  F1  0.282  1.18E-09  9.1%  0.303  5.53E-11  0.0%
EXPLORATORY (6)  F2  0.282  1.18E-09  16.7%  0.282  1.18E-09  16.7%
EXPLORATORY (6)  F3  0.282  1.18E-09  9.1%  0.306  3.67E-11  0.0%
EXPLORATORY (6)  F4  0.282  1.18E-09  0.0%  0.282  1.18E-09  0.0%
EXPLORATORY (6)  F5  0.282  1.18E-09  0.0%  0.282  1.18E-09  0.0%
EXPLORATORY (6)  F6  0.282  1.18E-09  75.0%  0.266  9.05E-09  33.3%
The community structure of this graph has an overall best fit with the confirmatory 5FS, occurring at r>0.271, p<4.89E-09. See
When the requirement for an overall match between NCS and FS was dropped, some individual factors showed a better match with individual clusters at various thresholds (see
Cluster-to-factor matching, item level
Factor structure  Factor  Best overall match: r  p  % mismatch  Best match per factor: r  p  % mismatch
STANDARD (5)  N  0.164  3.08E-04  27.1%  0.190  3.38E-05  16.7%
STANDARD (5)  E  0.164  3.08E-04  16.7%  0.164  3.04E-04  16.7%
STANDARD (5)  O  0.164  3.08E-04  16.3%  0.166  2.57E-04  14.9%
STANDARD (5)  A  0.164  3.08E-04  20.4%  0.158  4.64E-04  17.9%
STANDARD (5)  C  0.164  3.08E-04  18.4%  0.170  1.86E-04  10.4%
CONFIRMATORY (5)  F1  −0.169  2.03E-04  28.4%  0.220  1.92E-06  17.5%
CONFIRMATORY (5)  F2  −0.169  2.03E-04  17.9%  0.164  3.04E-04  16.7%
CONFIRMATORY (5)  F3  −0.169  2.03E-04  19.5%  0.166  2.57E-04  14.9%
CONFIRMATORY (5)  F4  −0.169  2.03E-04  20.8%  0.158  4.64E-04  17.9%
CONFIRMATORY (5)  F5  −0.169  2.03E-04  16.4%  −0.140  1.72E-03  13.3%
EXPLORATORY (6)  F1  −0.169  2.03E-04  28.4%  0.220  1.92E-06  17.5%
EXPLORATORY (6)  F2  −0.169  2.03E-04  22.1%  0.164  3.04E-04  20.8%
EXPLORATORY (6)  F3  −0.169  2.03E-04  13.5%  0.164  3.04E-04  13.2%
EXPLORATORY (6)  F4  −0.169  2.03E-04  27.3%  0.159  4.59E-04  22.4%
EXPLORATORY (6)  F5  −0.169  2.03E-04  16.4%  0.140  1.72E-03  13.3%
EXPLORATORY (6)  F6  0.221  1.71E-06  78.9%  0.148  9.73E-04  54.5%
The community structure of this graph has an overall best fit with the standard 5FS, occurring at r = 0.164, p = 3.08E-04. Node = item, link = significant correlation. Red links: positive correlations. Blue links: negative correlations. Node size = degree (larger nodes are bigger hubs, scale = 1 to 10). For further information, see supporting information (
When the requirement for an overall match between NCS and FS was dropped, better matches were found for individual cluster-to-factor comparisons (as low as 10.43% mismatch for network cluster 5 with conscientiousness of the standard factor solution), see
Network clusters at item level immediately produced large-scale clusters showing good correspondence with standard factors. No evidence was found for an intermediate facet level at either higher or lower thresholds. The NCS that showed an optimal match with the standard 5FS involved a six-cluster network structure. The sixth cluster contained a total of 9 items. These involved one extraversion item, two neuroticism items (with negative correlations with the other items within the cluster), three openness items and three agreeableness items (
CLUSTER1_N, CLUSTER2_E, CLUSTER3_O, CLUSTER4_A, CLUSTER5_C: clusters showing maximum correspondence with standard clusters of Neuroticism, Extraversion, Openness, Agreeableness and Conscientiousness, respectively. CLUSTER6: newly found sixth factor (see
Network Cluster  Node  Facets  Degree  Betweenness Centrality  Closeness Centrality  Eigenvector Centrality  PageRank  Clustering Coefficient 
1  CLUSTER1_N  See supp. Inf.  3  0.000  0.143  0.125  0.730  1.000 
2  CLUSTER2_E  See supp. Inf.  5  0.667  0.200  0.184  1.138  0.800 
3  CLUSTER3_O  See supp. Inf.  4  0.000  0.167  0.161  0.927  1.000 
4  CLUSTER4_A  See supp. Inf.  4  0.000  0.167  0.161  0.927  1.000 
5  CLUSTER5_C  See supp. Inf.  5  0.667  0.200  0.184  1.138  0.800 
6  CLUSTER6  See supp. Inf.  5  0.667  0.200  0.184  1.138  0.800 
A. 
Network Cluster  Node  Facet  Degree  Betweenness Centrality  Closeness Centrality  Eigenvector Centrality  PageRank  Clustering Coefficient 
1  n1  n1  7  2.090  0.019  0.034  0.825  0.714 
1  n3  n3  13  15.507  0.022  0.060  1.419  0.526 
1  n4  n4  8  7.685  0.019  0.037  0.942  0.571 
1  n6  n6  10  13.668  0.020  0.043  1.147  0.511 
2  e1  e1  13  25.286  0.022  0.055  1.464  0.449 
2  e2  e2  8  6.439  0.019  0.038  0.951  0.607 
2  e3  e3  12  28.507  0.021  0.050  1.350  0.409 
2  e4  e4  8  8.127  0.020  0.037  0.946  0.571 
2  e5  e5  3  0.983  0.016  0.013  0.450  0.333 
2  e6  e6  14  40.807  0.023  0.057  1.584  0.396 
3  o1  o1  4  4.023  0.015  0.008  0.615  0.500 
3  o2  o2  5  5.130  0.016  0.011  0.734  0.400 
3  o3  o3  7  16.556  0.018  0.020  0.952  0.333 
3  o4  o4  5  5.775  0.016  0.015  0.708  0.200 
3  o5  o5  5  6.744  0.017  0.013  0.715  0.300 
3  o6  o6  3  1.452  0.015  0.009  0.482  0.000 
4  a1  a1  13  35.185  0.021  0.053  1.475  0.397 
4  a2  a2  7  5.856  0.018  0.026  0.869  0.667 
4  a3  a3  12  24.946  0.022  0.049  1.363  0.455 
4  a4  a4  7  3.867  0.017  0.027  0.867  0.571 
4  a5  a5  4  0.963  0.015  0.013  0.565  0.667 
4  a6  a6  6  2.883  0.016  0.022  0.770  0.667 
4  n2  n2  13  22.977  0.022  0.057  1.432  0.449 
5  c1  c1  9  6.755  0.018  0.036  1.048  0.583 
5  c2  c2  5  0.000  0.016  0.020  0.644  1.000 
5  c3  c3  10  17.680  0.020  0.040  1.150  0.511 
5  c4  c4  12  32.751  0.021  0.049  1.356  0.409 
5  c5  c5  12  19.192  0.021  0.050  1.338  0.470 
5  c6  c6  6  0.850  0.016  0.023  0.750  0.867 
5  n5  n5  9  22.318  0.019  0.033  1.090  0.417 
B. 
Metric  A.  B.  C. 
Level of detail  FACET LEVEL  ITEM LEVEL  CLUSTER LEVEL 
Vertices  30.0000  240.0000  6.0000 
Unique Edges  125.0000  5738.0000  13.0000 
Edges With Duplicates  0.0000  0.0000  0.0000 
Total Edges  125.0000  5738.0000  13.0000 
Self-Loops  0.0000  0.0000  0.0000 
Connected Components  1.0000  1.0000  1.0000 
Single-Vertex Connected Components  0.0000  0.0000  0.0000 
Maximum Vertices in a Connected Component  30.0000  240.0000  6.0000 
Maximum Edges in a Connected Component  125.0000  5738.0000  13.0000 
Maximum Geodesic Distance (Diameter)  4.0000  4.0000  2.0000 
Average Geodesic Distance  1.8222  1.8789  0.9444 
Graph Density  0.2874  0.2001  0.8667 
Minimum Degree  3.0000  1.0000  3.0000 
Maximum Degree  14.0000  106.0000  5.0000 
Average Degree  8.3333  45.4667  4.3333 
Median Degree  8.0000  41.5000  4.5000 
Minimum Betweenness Centrality  0.0000  0.0000  0.0000 
Maximum Betweenness Centrality  40.8066  590.6442  0.6667 
Average Betweenness Centrality  12.8333  109.4292  0.3333 
Median Betweenness Centrality  7.2200  79.6211  0.3333 
Minimum Closeness Centrality  0.0152  0.0014  0.1429 
Maximum Closeness Centrality  0.0227  0.0027  0.2000 
Average Closeness Centrality  0.0186  0.0022  0.1794 
Median Closeness Centrality  0.0185  0.0022  0.1833 
Minimum Eigenvector Centrality  0.0079  0.0001  0.1248 
Maximum Eigenvector Centrality  0.0598  0.0109  0.1842 
Average Eigenvector Centrality  0.0333  0.0042  0.1667 
Median Eigenvector Centrality  0.0349  0.0037  0.1727 
Minimum PageRank  0.4502  0.1684  0.7304 
Maximum PageRank  1.5840  2.0419  1.1380 
Average PageRank  1.0000  1.0000  0.9999 
Median PageRank  0.9494  0.9502  1.0328 
Minimum Clustering Coefficient  0.0000  0.0000  0.8000 
Maximum Clustering Coefficient  1.0000  1.0000  1.0000 
Average Clustering Coefficient  0.4983  0.4818  0.9000 
Median Clustering Coefficient  0.4848  0.4763  0.9000 
Small-worldness  0.2735  0.2564  N/A 
The current study directly compared the results of network community detection (NCD) and principal component analysis (PCA) in a dataset of personality scores (NEO-PI-R). Network community structure (NCS) was matched to factor structure (FS) while gradually raising the threshold for the significance of network links until NCS showed an optimal match with FS. Our analyses show that PCA and NCD generate highly similar results, confirming the theoretical similarity between the two techniques
At facet level, NCD showed a best match with the confirmatory 5FS (96.2%). A similarly tight match was found with the standard FS (92.0%), which differed little from the confirmatory structure. These findings confirm our expectation that NCS would show a tight match with (standard) FSs at facet level. This match was not a matter of coincidence: the mismatch curves declined steeply and clearly converged onto an optimal solution that was stable across 8 consecutive pruning actions (
In both PCA and NCD, facets of the neuroticism dimension deviated from the standard solution and were partly redistributed across the agreeableness and conscientiousness clusters. Hence, it is possible that the neuroticism dimension in our sample of young psychology students deviated from the standard (norm) population. The redistribution of n2 and n5 facets caused an exaggerated amount of mismatch (3.8%) between NCD and PCA findings, since mismatch scores were not only found for neuroticism, but also for agreeableness and conscientiousness clusters, although these latter clusters were perfectly reproduced apart from the incorporation of neuroticism facets. NCD is therefore expected to behave even more similarly to PCA (i.e. >96.2%) when applied in larger (standard) datasets.
In contrast to PCA, NCD at item level produced a limited set of plausible modules that showed good correspondence with the FS of the norm dataset (80%). This match was not a matter of coincidence: the mismatch curves showed a clear dip and converged onto an optimal solution that was stable across 19 consecutively pruned links (
NCD showed a closer global fit with confirmatory 5FSs, confirming our expectations that NCS shows a better match with FSs derived from the same datasets. However, the winning fit was found for the standard (norm) FS, which involved a local fit. This was contrary to our expectations and introduces the possibility that NCD, when performed at item level, is able to utilize additional information that facilitates the extraction of the “true” (standard) modular structure of human personality from nonstandard datasets. One explanation why NCD outperforms PCA at item level is the forced-choice nature of the Wakita-Tsurumi NCD algorithm, which dichotomizes network cluster membership. Whereas this may be problematic in smaller clusters (e.g. at facet level, see above), this has the potential of diminishing the sensitivity to chance deviations in larger clusters (e.g. at item level), since these are averaged out. However, this cannot entirely explain the better performance, since a similar forced-choice filter was applied to factor loadings to avoid differences between the results of NCD and PCA precisely for this reason. Hence, some attribute specific to NCD may be responsible for the better performance of NCD when compared to PCA. PCA first identifies the factor that explains most variance in the data, after which its effect is linearly subtracted from the data and the process repeats. Instead, NCD greedily builds modules in a bottom-up fashion, increasingly considering the global picture of the dataset. Thus, NCD may have access to a larger pool of information. Further studies are needed to examine whether NCD indeed produces results in smaller datasets that can still be generalized to the population at large.
At item level, NCD immediately identified large clusters, i.e. no intermediate (facet) level of aggregation was found. This finding is in accordance with previous factor analytic studies that found no evidence for facets as an intermediate level of aggregation in between items and factors, when adopting a bottom-up approach
At item level, the NCS that showed an optimal likeness with (standard) 5FS was found at a global threshold that introduced a small sixth factor next to the other five (
The idea that personality develops towards maturity along certain paths or sequences of personality traits that are attained in the course of life has extensive support from studies of healthy personality (e.g.
We have shown that the network (community) structure of data derived from multidimensional questionnaires can be optimized with respect to the FS of such datasets. However, factor analysis is prone to its own inaccuracies and mistakes. Thus, the network cluster decomposition may be biased by factor analytic results. Factor analysis is generally considered to be an “objective” technique, which examines observed covariance in datasets. The cutoff points used for the significance of factor loadings and the inspection of a scree plot may, however, be considered rather arbitrary
The current study employed the Wakita-Tsurumi NCD algorithm because this technique has a strong theoretical match with principal component analysis and works for networks with large as well as relatively small numbers of nodes
Some remarks need to be made with respect to studies of the network structure of phenotypical data (questionnaires). In brain data or genetics, it is important to distinguish between nodes (genes, neurons or voxels) that show all positive (excitatory) or all negative (inhibitory) interrelations. In phenotypical data, however, such divisions are not straightforward. For instance, one item may ask whether a subject likes bungee jumping, whereas another item may ask whether a person dislikes taking risks. These items will have item scores that are likely to be negatively correlated, although they both attempt to measure the same underlying global trait (e.g. openness to experience). If such items were clustered into different clusters (e.g. using signed cluster analyses), that would add little to the knowledge of the cluster structure of personality and would more likely reveal peculiarities in the way the various questions are phrased. Hence, the sign of the correlations is of lesser importance in phenotypical studies than in biological studies. A similar language problem may bias the detection of hierarchies, which may turn out to represent either more general or more specific phrasings while testing for the same basic trait.
In summary, some level of caution is advised when interpreting the results of network cluster algorithms at the phenomenological level. It is important to attribute a correct amount of value to the information given by the singularity of certain nodes of the Personality Web. However, the NEO-PI-R is a very thoroughly studied questionnaire in which redundant questions that explain little additional variance in factor scores have been removed. Hence, it seems acceptable to regard hub items in the NEO-PI-R as genuine high-degree connectors, and not as the product of badly phrased questions that correlate with many other item scores. The presence of a small-world structure in the NEO-PI-R network seems to point in the direction of a biologically plausible network
Network analysis of phenotypical personality data can be used to construct a Personality Web. Such webs are powerful tools for studies of normal personality development and personality disorders. A translation can be made from previous factor analytic findings to network-based descriptions of human personality. This is an exciting new avenue that has the potential to change our view of both healthy human functioning and disease. Network science provides a solid theoretical framework for studies of human personality. Since the human brain has a clear multimodular hierarchic network structure
We would like to thank the management team of the Outpatient Clinic of the Department of Mood disorders, PsyQ, for providing the required amount of research time. Also, we would like to thank the students of the University of Amsterdam for kindly providing their personality scores.