Using passive sensor data to probe associations of social structure with changes in personality: A synthesis of network analysis and machine learning

Social network analysis (SNA) is an increasingly popular and effective tool for modeling psychological phenomena. Through application to the personality literature, social networks, in conjunction with passive, non-invasive sensing technologies, have begun to offer powerful insight into personality state variability. Resultant constructions of social networks can be utilized alongside machine learning-based frameworks to uniquely model personality states. Accordingly, this work leverages data from a previously published study to combine passively collected wearable sensor information on face-to-face, workplace social interactions with ecological momentary assessments of personality state. Data from 54 individuals across six weeks was used to explore the relative importance of 26 unique structural and nodal social network features in predicting individual changes in each of the Big Five (5F) personality states. Changes in personality state were operationalized by calculating the weekly root mean square of successive differences (RMSSD) in 5F state scores measured daily via self-report. Using only SNA-derived features from wearable sensor data, boosted tree-based machine learning models explained, on average, approximately 28–30% of the variance in individual personality state change. Model introspection implicated egocentric features as the most influential predictors across 5F-specific models, with network efficiency, constraint, and effective size measures among the most important. Feature importance profiles for each 5F model partially echoed previous empirical findings. Results support future efforts focusing on egocentric components of SNA and suggest particular investment in exploring efficiency measures to model personality fluctuations within the workplace setting.


Introduction
Personality can be broadly defined as an individual's recurring behaviors and dispositions [1]; however, more nuanced perspectives exist in the psychological literature. One classical view frames personality as static representations of the individual. Indeed, this person-based perspective defines personality by traits that represent stable dispositions, highlighting interindividual rather than intraindividual variation. A popular model that evolved from this framework is the Five Factor (5F) model [2,3]. This model operationalizes personality in terms of five primary traits: extraversion/introversion, agreeableness/antagonism, conscientiousness/disinhibition, stability/neuroticism, and openness/close-mindedness. The 5F Model has been applied to a variety of psychological constructs and contexts, including anxiety [4], depression [5], eating disorders [6,7], mental health treatment outcomes [8], academic performance [9], and organizational dynamics [10]. Moreover, research that has applied the 5F model within the mental health domain has frequently shown phenomenologically informative interactions among the five trait dimensions [5,8,11].
Despite the prevalence of the 5F model and its trait-based perspective of personality, it is noteworthy that this model is not designed to consider momentary, within-person fluctuations of personality [12]. Accordingly, a competing view within the literature frames personality as a state and emphasizes a situation-based perspective to personality expression. Under this paradigm, personality states represent dynamic dispositions that vary by social and environmental context and thus emphasize analysis of intraindividual differences [13]. Overall, the dichotomy between trait- and state-based personality stems principally from biological (for the former) and social-cognitive (for the latter) theoretical foundations that differentially emphasize the impact of nature and nurture on behavioral phenotypes.
Although the person- and situation-based perspectives of personality have traditionally been in opposition, recent literature has suggested that they may be integrated to good effect. In one study, researchers leveraged thousands of ecological momentary assessments (EMAs) to reveal that individuals may visit certain environments more frequently in accordance with their stable 5F personality traits [14]. The results suggested that an individual's environment may positively reinforce (i.e., lower the variability of) their dominant personality attributes over time and result in an increased stability of specific personality dimensions. In fact, more recent personality theories have begun to reconcile the person and situation perspectives [15,16]. This reconciliation falls under the umbrella of Whole Trait Theory (WTT). WTT posits that individuals will experience each of the 5F traits throughout their lives, but to varying degrees from day to day [17]. As WTT has become an increasingly popular approach to operationalize personality in the recent literature [18], there is strong empirical precedent to build upon existing efforts in this direction. One way to accomplish this is through creative methods development, specifically the application of quantitative techniques that have proven useful in related psychological domains and that provide an intuitive framework to model and explore state-trait personality duality in uniquely insightful ways.
One such suite of analytic techniques, social network analysis (SNA), has become an increasingly popular and effective tool for modeling and investigating psychological phenomena [19], notably personality and agency, which oftentimes fall within the purview of organizational psychology [20,21]. A social network is a representation of the relationships between social actors or nodes, which can be individuals, groups, and organizations [22]. Analyses of social networks focus on the structure of relationships (represented as edges, or ties) in the network, and particularly on what might facilitate or restrict the exchange of information. Importantly, social networks allow for the examination of social processes over time in order to reveal how network ties evolve and actors are influenced [23,24]. Given that individuals' personalities likely influence (and are influenced by) those with whom they interact [14], it is reasonable to apply social networks (models of interpersonal relationships) to study personality dynamics. While the connection between social networks and personality has been established in the literature, the majority of works that examine personality within a social network context do so by modeling personality as a node attribute to examine its influence on tie formation and other network processes [25,26]. Operationalizing personality in this manner is inherently a trait-based framing: information on static measures of personality complements the social processes made observable and quantifiable through SNA. In light of WTT and other theories that view personality as fluid and determinable via intrapersonal and interpersonal factors, changes to personality state can also be interrogated as an outcome of the network structure itself. Therefore, the interest of the present analysis is to leverage SNA as a means of exploring social structural attributes that are most predictive of personality state change.
Along with SNA, machine learning presents itself as a powerful analytical tool within the personality literature and has shown success in predicting personality types and traits [27-30]. Importantly, studies in social psychology have applied machine learning alongside SNA in an independent and complementary fashion [31,32], yet few have utilized these methods in direct analytical combination, where attributes of a network are used as input predictors for a machine learning model. With a few exceptions, such direct use of these methods has been largely limited to the domain of cognitive neuroscience [33-35]. One unique study within psychology utilized data from 200 Twitter accounts to first construct a social network based on users' followers, extract features related to connectivity and engagement, and then apply these features to a support vector machine classifier in the prediction of self-report anxiety [36]. The results indicated that a model trained on social network features was predictive (AUC = 0.84) of anxiety disorder status [36]. Two additional studies focused on the predictive utility of egocentric network structural features using passively collected smartphone interaction data from N = 53 participants [37] and N = 130 participants [38] across 8 weeks to predict classification of low/high 5F-defined personality states. Egocentric networks focus on an individual (ego) and their direct connections with others (alters) to emphasize social standing from the perspective of the individual. Through comparison of egocentric networks across individuals within the broader social network, potentially informative structural differences can emerge. Broadly, the results of [37,38] indicated strong and uniquely elucidative contributions of various egocentric network structural features to the prediction of personality.
Bolstered by these studies and the documented, individual strengths of both machine learning and SNA, the present work sought to further apply the promise of this methodological marriage within the personality domain.
As a resource toward this goal, the current study extended the novel work of Gundogdu et al. (2017b), who implemented a social network analytic approach to interrogate the dynamics of personality [39]. In their study, N = 54 participants from an Italian research center wore a sociometric badge to record daily social interactions (counts of face-to-face contacts) over six weeks. Importantly, this collection method was continuous and passive in nature, providing uninterrupted monitoring of social behavior throughout the work day and eliminating the reliance on participants to actively log their interactions. The resulting data were used to create person-specific dyad, triad, and tetrad induced subgraph representations of interactions across different time intervals. To characterize the nodes in these subgraphs, participants were asked to complete EMA personality prompts three times a day, which asked them to reflect on recent interactions/behaviors with their coworkers. Each EMA item corresponded to a specific 5F personality trait. All responses ranged from "strongly disagree" to "strongly agree" on a 7-point Likert scale [39]. Logistic linear mixed models were used to predict personality state transitions of an individual (e.g., "high" to "low" extraversion) as a function of the personality states of those with whom an individual interacted over a period of time. The results indicated that within-person variability in 5F personality traits was associated with variation in daily face-to-face interactions, with associations differing across traits [39]. For instance, participants were more likely to transition from a state of "low" agreeableness to a state of "high" agreeableness after interacting with two individuals with "high" agreeableness, whereas participants were more likely to transition to "high" openness when interacting with individuals with "low" openness [39].

Motivation
The work of Gundogdu et al. (2017b) is a valuable and creative first step in utilizing SNA to informatively blend trait-based measures of personality with the dynamics of situational context. To expand upon these efforts, the current exploratory work aimed to re-analyze their published dataset with the following key changes: (i) operationalize personality state change on a continuous scale instead of a binary one to more closely align outcome quantification with how the 5F (and WTT) literature conceptualizes personality manifestation, (ii) focus more holistically on the utility of social network structural features to predict personality change instead of on the phenomenological insights gleaned from isolated graphlets of interactions, and (iii) embed SNA within a machine learning paradigm instead of employing a more traditional statistical modeling approach. In regard to this last point, few works [37,38] have operationalized SNA in conjunction with machine learning within the personality literature. Under this relatively novel paradigm, the present study thus aimed to explore theoretical and practical extensions of these works and analytically complement the groundwork laid by Gundogdu et al. (2017b). Their work thus served both as a practical basis for model implementation and as a valuable opportunity to further our understanding of personality dynamics with ecologically valid data. To this end, the current work sought to incorporate a broader suite of SNA-derived features as well as focus on the ability to predict change in personality state rather than predicting the states themselves.
For the purposes of this study, utilizing SNA in direct interface with machine learning can provide insight into the relative importance of a wide array of social network structural features in the prediction of 5F personality state trajectories. As structural attributes of social networks capture different aspects of social processes, those attributes that are found to be most influential in a predictive model of personality state change may reflect prominent sociological contexts underlying or driving that change, ultimately highlighting foci for future hypothesis-driven research. Importantly, the analytic framework put forth in this study is intended to be used for hypothesis generation, with the goal of providing a means to investigate WTT-defined personality against the backdrop of evolving social environments. To this end, this endeavor was guided by the following aims:
i. Using network representations of week-long social interactions alongside machine learning, quantify and summarize the idiographic utility of SNA-derived structural features for the prediction of 5F personality-specific change trajectories.
ii. Explore and summarize the relative importance of SNA-derived structural features within and across the 5F personality traits.
iii. Draw insights from resulting personality-specific profiles of relative SNA-derived structural feature importance to provide specific network attributes for further consideration and research.
iv. Present a transparent, repeatable, and accessible quantitative framework for future application and refinement within the personality research domain.

Overview of study population and dataset
The current work utilized a previously published and publicly available dataset deposited online through Dryad [40]. The previous study was interested in utilizing wearable infrared passive sensing devices, specifically sociometric badges [41], in conjunction with structured EMA self-report, to interrogate the association between social interaction and personality state [39]. Accordingly, the dataset consisted of two separate but related timestamped logs across six weeks (January 30, 2012-March 9, 2012) for N = 54 Italian office employees (87% male; 84% Italian nationality). Participants varied in age from 23 to 53 years old (mean = 36.88, s.d. = 8.54). The first log listed device-detected instances of one-on-one, reciprocal employee interactions throughout the normal hours of the work week (Monday through Friday only). The second log regarded individual self-report EMA responses to reflections on recent (within the past half hour) personality-related behaviors. EMA prompts were given three times a day at 11:00 AM, 2:00 PM, and 5:00 PM, with domains and questions modeled after the Big Five Marker Scale (BFMS) [42] and the Ten Item Personality Inventory (TIPI) [43], and with responses ranging on a 7-point Likert scale from "strongly disagree" (1) to "strongly agree" (7). Under this framework, each EMA entry was associated with five scores, one for each of the Big Five personality states of extraversion, agreeableness, conscientiousness, emotional stability, and openness to experience. Moreover, each of these reported scores was an average of two TIPI item-level response scores that represented the respective personality state. Ultimately, this resulted in a dataset with 3,220 unique EMA responses quantified to represent an ecologically valid self-report summary of an individual's personality state and further contextualized with 248,749 contemporaneously recorded social interactions.

Data preprocessing and outcome operationalization
Using the provided EMA response logs, the data were first split based on the five-day work week. This resulted in six separate weekly logs which spanned the entirety of the data collection period. The EMA data in each week-based log were then processed independently to arrive at participant-specific operationalizations of personality state change for each respective week. To quantify the average magnitude of moment-to-moment changes in personality, the root mean square of successive differences (RMSSD) [44] was calculated for responses to extraversion, agreeableness, conscientiousness, emotional stability, and openness to experience. RMSSD is calculated as:
$$\mathrm{RMSSD} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N-1}\left(x_{i+1}-x_{i}\right)^{2}} \qquad (1)$$
As illustrated in Formula (1), i is the measurement occasion, N is the total number of measurements, and x is the measurement value. This measure has been frequently used in the psychological literature to summarize dynamic change in affect [45]. Per-week, personality-specific RMSSD values for each participant thus represented the five independent modeling outcomes of interest. Table 1 summarizes the start and end dates, as well as the median, minimum, and maximum RMSSD values for each response across participants for each weekly log. This step is summarized in panel 1A of Fig 1.
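As a minimal sketch of Formula (1), the following Python function computes RMSSD for one participant's ordered scores within a single week. The function name and the example scores are hypothetical and are not taken from the study's scripts (S2 File).

```python
import math


def rmssd(values):
    """Root mean square of successive differences for an ordered series."""
    if len(values) < 2:
        raise ValueError("RMSSD needs at least two measurements")
    # Squared difference between each measurement and the one before it.
    squared_diffs = [(b - a) ** 2 for a, b in zip(values, values[1:])]
    # N measurements give N - 1 successive differences.
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))


# Hypothetical extraversion state scores (7-point scale) for one week.
week_scores = [4.0, 5.0, 3.0, 3.0, 6.0]
weekly_rmssd = rmssd(week_scores)
```

A flat series yields an RMSSD of zero, so larger values indicate more moment-to-moment fluctuation in the reported state.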

Social network construction and visualization
Similar to the EMA data in 2.2, the available infrared passive sensing logs of participant interactions were split based on the five-day work week. Following this, the networkX (v2.4) package [46] in the Python programming language (v3.8.3) was leveraged. The networkX package allows for the in-depth study of networks by providing a broad toolkit to create, manipulate, and probe relationship structure, dynamics, and function. Specifically, this study used networkX to build and visualize six undirected, weighted graphs representing the network of social interactions of all participants within a given work week. A node in these graphs represented a participant, while each edge represented a logged reciprocated interaction between two participants. Edge weights between nodes were equivalent to the total count of interactions between participants within the designated week. It is important to note that the authors of the original data reported issues of detector reciprocity in a subset of the logged interactions: one detector in a pair would log a specific interaction while the other would not. To address this issue, the current study chose to model as edge weights the minimum possible number of interactions between each pair of participants, determined by taking the lower of the two summed interaction counts logged by the associated detectors. This ensured consistency in handling discrepancies between detectors. The resulting networks were visualized using the Fruchterman-Reingold algorithm [47]. This step is summarized in panel 1B-1 of Fig 1.
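The reciprocity rule described above can be sketched in plain Python as follows. The log format (one `(detector, detected)` pair per logged interaction) and the choice to drop pairs whose lower count is zero are assumptions made for illustration; the study's own implementation is in S2 File.

```python
from collections import Counter


def weekly_edge_weights(detections):
    """Map each unordered pair to the lower of its two directional counts.

    detections: iterable of (detector_id, detected_id) pairs for one week
    (a hypothetical log format).
    """
    directed = Counter(detections)
    weights = {}
    for a, b in directed:
        pair = frozenset((a, b))
        if a == b or pair in weights:
            continue
        # Counter returns 0 for a direction that was never logged.
        weight = min(directed[(a, b)], directed[(b, a)])
        if weight > 0:  # assumption: unreciprocated pairs form no edge
            weights[pair] = weight
    return weights


detections = [
    ("A", "B"), ("A", "B"), ("B", "A"),  # A logged B twice, B logged A once
    ("A", "C"),                          # never reciprocated by C
    ("C", "B"), ("B", "C"), ("B", "C"),
]
edges = weekly_edge_weights(detections)
```

Each resulting pair and weight could then be added to an undirected networkX graph, for example with `Graph.add_edge(u, v, weight=w)`.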

Network feature extraction
To holistically operationalize the context of social interaction, 26 structural and nodal features from the week-based social networks (see 2.3) were quantified using the networkX package. Table 2 provides an exhaustive list of each feature along with its associated scope, operational definition, and general contextual meaning. In an effort to maximize practical utility, interpretability, and accessibility, the theoretical intricacies of some of the more complex features exceed the scope of the current work. However, interested readers are highly encouraged to consult an excellent online textbook on the topic of social networks from which many of this study's contextual network interpretations were derived [48]. In summary, features were selected to capture unique aspects of (i) the overall social network structure of interactions, (ii) the individual's (node's) positioning within the broader network of interactions, and (iii) the local structural properties of the network from the egocentric perspective of the individual. Moreover, these features can be interpreted relative to the environmental context of this study's data: the workplace. As one example, reach efficiency quantifies the unique social value of an interaction in the network; in other words, the degree to which one employee interacts with another employee who has connections that the first does not have. Thus, an employee with high reach efficiency could be thought to interact with a co-worker who interacts with several other co-workers that are more similar to each other (e.g., same project team, workspace proximity, personal interests), but for one reason or another do not interact directly with the employee. In this manner, the employee "extends" his or her work network (and the ability to transmit information) via a valuable interaction with a single co-worker.
In total, this network approach thus sought to probe an array of social processes and phenomena and ultimately relate their presence and magnitude to individual, self-report conceptualizations of personality state.
To fully accomplish this goal, network features were derived from each complete (global) weekly network and from the construction of egocentric (ego) networks for each individual across the six work weeks of the data collection period. An ego network is a subnetwork that includes a focal node (the ego) and all of the nodes to whom the ego has a connection (one step removed). Whereas the global network structural features are, by definition, consistent across individuals within a week, yet uniformly different across the cohort from one week to the next, global nodal and ego structural features differ among individuals both within and across weeks. The final prediction space for subsequent modeling (see 2.5) thus consisted of these 26 features across six weeks for N = 54 individuals and resulted in 307 unique data points with which to predict associated within-week RMSSD values that represented dynamics of personality state change. This step is summarized in panel 1B-2 of Fig 1. Please see S1 File for the entirety of this derived dataset along with feature-specific distributional statistics.
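To make the egocentric perspective concrete, the sketch below derives three ego-level quantities from a plain adjacency mapping: ego network size, effective size, and efficiency. It assumes an unweighted, undirected network and uses the simplified effective size formula n - 2t/n (n alters, t ties among those alters); whether this matches the exact operational definitions in Table 2 is an assumption, and the function and variable names are hypothetical.

```python
def ego_features(adjacency, ego):
    """Size, effective size, and efficiency of one node's ego network.

    adjacency: dict mapping each node to the set of its neighbours
    (undirected, so every tie appears in both directions).
    """
    alters = adjacency[ego]
    n = len(alters)
    if n == 0:
        return {"size": 0, "effective_size": 0.0, "efficiency": 0.0}
    # Ties among alters; each undirected tie is seen twice, so halve the count.
    ties = sum(1 for a in alters for b in adjacency[a] if b in alters) // 2
    effective = n - 2 * ties / n  # redundancy among alters lowers effective size
    return {"size": n, "effective_size": effective, "efficiency": effective / n}


# Hypothetical four-person network: a and b also interact with each other.
adjacency = {
    "ego": {"a", "b", "c"},
    "a": {"ego", "b"},
    "b": {"ego", "a"},
    "c": {"ego"},
}
features = ego_features(adjacency, "ego")
```

In this toy network the tie between a and b makes them partly redundant contacts, so the ego's effective size (7/3) is smaller than the raw size of three.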

Machine learning modeling and analysis
A machine learning approach was employed to model a suite of extracted network features and explore the relative predictive merit of each within the context of personality state change. In this application, the dynamics of the model's learning process serve as a means with which to highlight potentially relevant aspects of social networks (and their associated phenomenology). Features found to be most important in informing the decision process of a well-performing model thus present as signals that can inform and direct further research efforts and hypotheses.
To achieve this, all predictive modeling and analyses were conducted using the R programming language (v4.0.2). Five separate and parallel eXtreme Gradient Boosting tree (xgbtree) models [50] were constructed, validated, and assessed with the caret library [51] to predict weekly person-specific RMSSD of (i) extraversion, (ii) agreeableness, (iii) conscientiousness, (iv) emotional stability, and (v) openness to new experience personality state self-report scores as a function of contextual social network structural features (see 2.4). Briefly, the xgbtree model operates by constructing decision trees in a sequential manner, where each subsequent tree in the sequence learns from the mistakes of its predecessor and updates the residual errors accordingly. This process, known as "boosting", converts what would normally be a set of weak learners into a single strong learner. For context, the model representation and inference of xgbtree is identical to that of other tree-based learners such as the popular Random Forest model [52]; however, the underlying algorithm is distinct.

Fig 1. (1A) Derivation of outcome data for modeling. Raw EMA data is separated into six weeks and the RMSSD for each personality state self-report response within each week is calculated for each participant. (1B-1) Weekly cross-sectional social networks are constructed from the raw infrared passive sensing device log data. (1B-2) Features from the constructed networks are calculated (9 global structural, 3 global nodal, and 13 egocentric network features) to serve as predictors for the machine learning models. (2) The machine learning modeling framework is parallelized to independently predict each of the five personality states' RMSSD values. The model is trained on N-1 participants' network feature and outcome data and validated on a held-out participant's data. This is repeated 54 times such that each participant is held out once while the model is trained on all other participants' data (LOSO cross-validation). A uniquely tuned model is validated for each fold and each model's predictions are saved. (3A) Individual model (fold) performance is assessed using variance explained (R²), and average R² is calculated to assess overall performance of the machine learning framework across participants for each personality state outcome. (3B) Introspection of the models is performed via quantification of feature importance. https://doi.org/10.1371/journal.pone.0277516.g001

Table 2 (excerpt).
Scope: Global Network Node
Operational definition: From the Louvain modularity algorithm [49], the number of nodes that belong to the assigned cluster of the target node
Contextual meaning: decoupling potential of the target node; how many individuals comprise the subcommunity to which the target node belongs?

Feature: Number of Subcommunities
Scope: Global Network Structure
Operational definition: total number of clusters found via the Louvain modularity algorithm [49]
Contextual meaning: compartmentalization; how divided are individuals within the work community?

Feature: Number of Ordered Pairs
Scope: Global Network Structure
Operational definition: total number of possible ties among all nodes
Contextual meaning: opportunity within the global network; how many potential interactions are there within the work community?

Each model was trained with leave-one-subject-out (LOSO) cross-validation. Under this scheme, all rows of data (1-6 rows; see 2.4) corresponding to a target individual were held out while the remaining data across N-1 individuals (one fold) were used to train and hyperparameter tune the model, which was then tested on the held-out individual's data. The following seven model hyperparameters were tuned using the default grid search algorithm in caret: (i) the number of boosting iterations to perform (nrounds), (ii) the percent of training data to subsample for a given boosting iteration (subsample), (iii) the number of features to randomly subsample for each tree (colsample_bytree), (iv) the maximum depth allowed for each tree (max_depth), (v) the minimum weight required for each leaf node (min_child_weight), (vi) the minimum loss reduction required to further partition a leaf node (gamma), and (vii) the learning rate (eta). This was repeated N = 54 times to assess the model's performance both specifically within a fold (on a per individual basis) and holistically across all folds. This step is summarized in panel 2 of Fig 1. Each of the five LOSO cross-validated models was assessed using R² (variance explained) and root mean square error (RMSE) at two levels of organization. The first level considers overall average performance of the model across all LOSO folds, while the second level considers performance of the model in predicting each individual's outcome as a function of all other individuals' network structural features. Performance results are summarized and presented in tabular, histogram, and linear graphical format (S1 File). This step is summarized in panel 3A of Fig 1.
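The leave-one-subject-out scheme can be illustrated with a short, dependency-free Python sketch. The row layout (`pid`, `week`, `y`) and the mean-predicting placeholder model are assumptions made purely for illustration; the study itself fit tuned xgbtree models through caret in R (S3 File).

```python
import math
from collections import defaultdict


def loso_folds(rows):
    """Yield (held_out_id, train_rows, test_rows), one fold per participant."""
    by_pid = defaultdict(list)
    for row in rows:
        by_pid[row["pid"]].append(row)
    for pid in sorted(by_pid):
        # Every row of the held-out participant is excluded from training.
        train = [r for r in rows if r["pid"] != pid]
        yield pid, train, by_pid[pid]


def fit_mean_model(train_rows):
    """Placeholder learner: always predicts the mean training outcome."""
    mean_y = sum(r["y"] for r in train_rows) / len(train_rows)
    return lambda row: mean_y


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical weekly RMSSD outcomes (y) for three participants.
rows = [
    {"pid": 1, "week": 1, "y": 1.0},
    {"pid": 1, "week": 2, "y": 1.0},
    {"pid": 2, "week": 1, "y": 2.0},
    {"pid": 3, "week": 1, "y": 3.0},
]
fold_rmse = {}
for pid, train, test in loso_folds(rows):
    model = fit_mean_model(train)
    fold_rmse[pid] = rmse([r["y"] for r in test], [model(r) for r in test])
```

Grouping by participant rather than by row is the key design choice: it prevents one person's other weeks from leaking into the data used to predict that same person.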
At the request of a reviewer, the authors additionally compared the overall average performance of each xgbtree model to two more simplistic and algorithmically distinct models: (i) a regularized generalized linear model and (ii) a k-nearest neighbors, clustering-based model using the same cross-validation approach and with default parameters in caret.

Table 2 note. Each of the listed 26 network features was used as a predictor in five separate machine learning modeling pipelines (one for each 5F state change outcome). Selected network features vary in scope, including features that quantify the overall structure of the summative weekly workplace social network (9 features

To introspect the resulting performance of the five models, the scaled feature importance in each model was calculated using the varImp function in caret. Intuitively, varImp operates by calculating differences in model error as a consequence of variable/feature permutation. Decreases in error represent improvements to the model and thus contribute to the overall magnitude of a feature's importance. This importance can then be scaled in relation to the importance of all other features for direct comparison purposes. This study reported both the scaled feature importance for each of the five personality-specific outcome models as well as the average overall feature importance across models. All features were ranked in order of descending average overall importance. This step is summarized in panel 3B of Fig 1. The data preprocessing and network building Python script, as well as the R script for machine learning modeling and analysis, are available in S2 and S3 Files, respectively.
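The permutation intuition described above can be sketched in Python as follows. This is an illustration of the general idea only, not a reproduction of caret's varImp internals; the feature names, toy model, and the rotation used as a deterministic stand-in for random shuffling are all hypothetical.

```python
import random


def mse(model, X, y):
    """Mean squared error of a prediction function over rows X."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(model, X, y, permute=None):
    """Error increase when each feature is permuted, scaled to 0-100."""
    if permute is None:
        rng = random.Random(0)

        def permute(column):
            shuffled = list(column)
            rng.shuffle(shuffled)
            return shuffled

    baseline = mse(model, X, y)
    raw = {}
    for name in X[0]:
        permuted = permute([row[name] for row in X])
        # Copy each row with only this feature's value replaced.
        X_perm = [dict(row, **{name: value}) for row, value in zip(X, permuted)]
        raw[name] = max(mse(model, X_perm, y) - baseline, 0.0)
    top = max(raw.values())
    # The most important feature is scaled to 100, all others relative to it.
    return {name: (100.0 * (v / top) if top > 0 else 0.0) for name, v in raw.items()}


def rotate(column):
    """Deterministic permutation: shift every value one position."""
    return column[1:] + column[:1]


# Toy model that depends only on efficiency; ego_size is ignored.
model = lambda row: 2 * row["efficiency"]
X = [
    {"efficiency": 0.2, "ego_size": 3},
    {"efficiency": 0.5, "ego_size": 5},
    {"efficiency": 0.9, "ego_size": 2},
]
y = [model(row) for row in X]
scaled = permutation_importance(model, X, y, permute=rotate)
```

Because the toy model never reads `ego_size`, permuting that column leaves the error unchanged and its scaled importance is zero, while `efficiency` receives the maximum of 100.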

Social networks
The graphs for each week-based social network are presented in Fig 2. As reported, the global network statistics of density, transitivity, and (especially) centralization change from week to week, thus indicating that the broader summative social context of the participants' work environment was not consistent across time. Against this shifting backdrop, individuals were not all interacting with the same co-workers or groups of co-workers, nor were they engaging in social interactions with the same frequency over time. The weekly networks express qualitatively appreciable variation in social engagement through time at both the individual and community level.

Model performance
Across individuals, Fig 3 illustrates that the predictive models built solely on social network structural features explained, on average, 28% of the variance in extraversion (A), 28% of the variance in agreeableness (B), 29% of the variance in conscientiousness (C), 30% of the variance in stability (D), and 29% of the variance in openness to experience (E), indicating that there were, on average, large predictive associations for each of the 5F constructs [53]. While promising, it is important to note that there was a large degree of interspecific variability in each of the personality-specific models. Moreover, Table 3 illustrates the high intraspecific variability (both in terms of variance explained and error) across personality models. For the majority of participants (37/55), models for some personalities performed very poorly while others performed very well (e.g., participant 509 with minimum R² = 0.04 and maximum R² = 0.72). Idiographically, the models were consistently informative across personality outcomes for approximately one-third (18/55) of the participants (see gray cells in Table 3). Models were defined as consistently informative if the worst performing personality model explained at least 5% of the variance (R² Min ≥ 0.05) and the average normalized RMSE across models did not exceed 25% (RMSE Avg ≤ 0.25) of the model outcome's range of observed values (i.e., RMSSD ranging from 0.00 to 3.48; see Table 1). Under this operationalization, performance was consistently poor only in one instance (participant 538 with minimum R² = 0.00 and maximum R² = 0.04). Furthermore, in consideration of model error independently of R², predictions exhibited relatively small deviances (RMSE ≤ 0.25) from the actual outcome values for 52/54 participants. These results holistically suggest the predictive capability of social network structural features in modeling individual personality state change.
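The "consistently informative" rule can be written as a small check. The sketch below assumes that RMSE is normalized by dividing by the observed RMSSD range (3.48 - 0.00) before averaging across the five models; the function name and the example values are hypothetical and are not drawn from Table 3.

```python
def is_consistently_informative(r2_by_state, rmse_by_state,
                                outcome_range=3.48,
                                r2_floor=0.05, nrmse_ceiling=0.25):
    """Apply the two-part rule: minimum R2 and average normalized RMSE."""
    r2_min = min(r2_by_state.values())
    # Assumption: normalize each RMSE by the observed outcome range.
    normalized = [v / outcome_range for v in rmse_by_state.values()]
    nrmse_avg = sum(normalized) / len(normalized)
    return r2_min >= r2_floor and nrmse_avg <= nrmse_ceiling


# Hypothetical per-state results for two participants.
states = ["extraversion", "agreeableness", "conscientiousness",
          "stability", "openness"]
rmse_values = dict.fromkeys(states, 0.5)
good_r2 = dict(zip(states, [0.31, 0.12, 0.40, 0.08, 0.22]))
poor_r2 = dict(zip(states, [0.31, 0.12, 0.40, 0.04, 0.22]))

informative = is_consistently_informative(good_r2, rmse_values)
not_informative = is_consistently_informative(poor_r2, rmse_values)
```

The second participant fails only because one state model falls below the 5% variance floor, which shows how the rule penalizes a single poorly performing model.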
For a comprehensive account of all in-fold (idiographic) predictions for each of the 5F personality state models, S4 File provides performance plots of observed versus predicted RMSSD values along with respective R 2 calculations.
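The "consistently informative" criterion described above can be expressed as a short check. The following is a minimal sketch rather than the study's code: the function name, the dictionary layout, and the example participant values are hypothetical, and it assumes that RMSE is normalized by dividing by the observed RMSSD range (3.48).

```python
from statistics import mean

# Observed RMSSD range reported in the text (0.00 to 3.48).
OUTCOME_RANGE = 3.48 - 0.00


def consistently_informative(r2_by_model, rmse_by_model,
                             outcome_range=OUTCOME_RANGE,
                             r2_floor=0.05, nrmse_ceiling=0.25):
    """Return True if the worst R2 is at least r2_floor and the average
    range-normalized RMSE across models does not exceed nrmse_ceiling."""
    r2_min = min(r2_by_model.values())
    nrmse_avg = mean(rmse / outcome_range for rmse in rmse_by_model.values())
    return r2_min >= r2_floor and nrmse_avg <= nrmse_ceiling


# Hypothetical per-model results for one participant (illustration only).
example_r2 = {"extraversion": 0.31, "agreeableness": 0.12,
              "conscientiousness": 0.45, "stability": 0.08, "openness": 0.27}
example_rmse = {"extraversion": 0.40, "agreeableness": 0.55,
                "conscientiousness": 0.35, "stability": 0.60, "openness": 0.50}

flag = consistently_informative(example_r2, example_rmse)
```

Both conditions must hold at once, so a participant with one very poorly explained personality state fails the criterion even when the remaining four models perform well.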
Post hoc comparison in overall average performance for each of the above 5F xgbtree models along with their respective generalized linear model and k-nearest neighbor model implementations indicated that the xgbtree models consistently explained a larger percentage of the variance in personality state RMSSD relative to the generalized linear models, while variance explained in xgbtree models was comparable or greater in relation to all corresponding k-nearest neighbors models. Despite generally superior R², overall average RMSE was consistently highest (however marginally in most cases) among the xgbtree models. S5 File provides a table which details comparative performance among the models.
3.3.1 Extraversion models. As shown in Table 4, efficiency (100.00) and constraint of the ego within the ego network (80.30) were among the most important features in predicting change in self-report extraversion over time. The size of the ego network (58.47) was considerably more important for extraversion state change predictions relative to all other personality states (17.87 on average). No features were of much lower importance when compared to all other personality models.
3.3.2 Agreeableness models. As shown in Table 4, effective size of the ego network (100.00) and closeness centrality of the ego within the global workplace network (93.77) were the most important features predicting change in self-report agreeableness over time. Effective size was uniquely highly important relative to other personality states (45.94 on average), while closeness centrality was considerably more important for agreeableness state change predictions compared to all other personality state models (55.98 on average). Average similarity of the ego within the ego network (64.69) was also highlighted as more uniquely important relative to the other personality states (28.55 on average). However, no features were of much lower importance when compared to all other personality models.
3.3.3 Conscientiousness models. As shown in Table 4, efficiency (100.00), reach efficiency (95.65), and constraint (79.67) of the ego within the ego network were among the most important features in predicting change in self-report conscientiousness over time. Effective size of the ego network was considerably lower in importance (29.50) for conscientiousness models in relation to all other personality state models (63.56 on average). No features were of considerably higher importance when compared to all other personality models.
3.3.4 Stability models. As shown in Table 4, reach efficiency (100.00) and efficiency (85.03) of the ego within the ego network were among the most important features in predicting change in self-report stability over time. Density of the egocentric network was considerably lower in importance (18.71) for stability models in relation to all other personality state models (44.59 on average). No features stood out as being considerably higher in importance when compared to all other personality state models.
3.3.5 Openness to experience models. As shown in Table 4, efficiency of the ego within the ego network was the only structural network attribute among the most important features (importance > 75.00) predicting change in self-report openness to experience over time. Constraint (22.17) and betweenness centrality (22.25) of the ego within the ego network as well as closeness centrality of the ego within the global workplace network (33.48) were of uniquely low importance relative to other personality models (74.40, 51.49, and 71.05 on average, respectively). No features were of uniquely high importance relative to other personality models.
3.3.6 Average across personality-specific models. When considering all personality models together, efficiency (91.31), reach efficiency (75.02), and constraint (63.95) of the ego within the ego network, along with both closeness centrality of the ego in the global workplace network (63.54) and the effective size of the ego network (56.75) were the top five most important structural social network attributes influencing the prediction of personality state change over time.
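The comparisons reported in this section (a feature's importance in one model versus its average in the other models, and the average across all models) can be sketched as follows. This is an illustrative sketch only: it assumes importances are rescaled so that the top feature in each model scores 100, which is consistent with the values of 100.00 reported above but may differ from the exact scaling used in the study, and all names and raw values are hypothetical.

```python
from statistics import mean


def scale_to_100(raw):
    """Rescale raw importances so the top feature scores 100 (assumed scaling)."""
    top = max(raw.values())
    return {feature: 100.0 * value / top for feature, value in raw.items()}


def average_across_models(scaled):
    """Mean scaled importance of each feature over all outcome models."""
    features = next(iter(scaled.values())).keys()
    return {f: mean(model[f] for model in scaled.values()) for f in features}


def average_of_others(scaled, target, feature):
    """Mean scaled importance of `feature` in every model except `target`."""
    return mean(m[feature] for name, m in scaled.items() if name != target)


# Hypothetical raw importances for illustration only (not the study's values).
raw = {
    "extraversion": {"efficiency": 0.40, "ego_size": 0.20},
    "agreeableness": {"efficiency": 0.25, "ego_size": 0.05},
    "openness": {"efficiency": 0.50, "ego_size": 0.05},
}
scaled = {name: scale_to_100(imp) for name, imp in raw.items()}
overall = average_across_models(scaled)
ego_size_elsewhere = average_of_others(scaled, "extraversion", "ego_size")
```

Because each model is rescaled separately, the resulting values describe a feature's rank-like standing within a model rather than an absolute effect size, which is why the text compares them across models only in relative terms.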

Discussion
This study leveraged passively collected logs of workplace interaction alongside EMA data on personality-related behaviors from a previously published study on N = 54 office workers to characterize and compare the utility of social network structural features in the prediction of personality state change within and among 5F personality constructs. Through a combination of both SNA and machine learning, this research aimed to present a relatively uncommon exploratory workflow and demonstrate its ability to operationalize and highlight potentially significant social processes within the context of the personality literature. Model performance (in-fold) was highly heterogeneous within and across individuals (Table 3); however, results at the cohort level (out-of-fold) reflect an average of above-chance predictive performance (Fig 3). From these models, efficiency (the proportion of non-redundant ties, signifying an individual's diversity of social interactions in an SNA framework) was found to be consistently important in predicting change across all 5F personality states, while several other features such as effective size, closeness centrality, and constraint exercised a uniquely high or low influence within specific personality state predictions. Most broadly, these results bolster past findings of predictive merit for egocentric network structural features in personality modeling [37,38]. Moreover, the results specifically highlighted efficiency, reach efficiency, and constraint within workplace egocentric networks, indicating that shifting social contexts which particularly describe the diversity and shared interactions of people may be important indicators of personality fluctuation. Furthermore, this has implications for future research, which may benefit from exploring these features to profile individual personality constructs and characterize how they change over time.
Table 3 note. Individuals are ranked highest to lowest based on minimum R² model performance (R²_Min) and then by average normalized RMSE (RMSE_Avg) from lowest to highest. Highlighted cells represent individuals for whom each of the five models explained at least 5% of the variance (R²_Min ≥ 0.05) with an accompanying RMSE_Avg across models that does not exceed 25% (RMSE_Avg ≤ 0.25) of the model outcome's range of observed values (0.00-3.48). https://doi.org/10.1371/journal.pone.0277516.t003
Each of the 5F models was capable of accounting for 28–30% of the variance in individual personality state change on average across the cohort (Fig 3). Given the complexity of the phenomena in question (i.e., personality change), the ability to obtain a predictive signal using only network structural measures was promising. While model performance at the level of the individual was highly variable (Table 3), the modeling framework only performed poorly across all personality state outcomes for one participant (ID 538). Additionally, in over 50% of the cohort, at least one dimension of personality state change was substantively (R² > 0.5) explained by social network structural features. The results simultaneously speak to the challenges of modeling constructs that are innately heterogeneous and suggest utility in employing SNA structural operationalizations alongside machine learning to model personality. Taken together, the cohort-level performance of the five machine learning models justified a closer inspection of which network features were being utilized to predict personality state change.
In this study, model introspection specifically concerned the relative predictive importance of each network structural feature. As each of these features captures different social situations or processes, those features found to be more important in predicting change in personality state would potentially suggest a phenomenological linkage between the operationalized social dynamic and the specific personality state modeled as an outcome. Ultimately, highlighting these social "determinants" through definition of network feature profiles may direct the efforts of future research endeavors. Feature importance results across the five models implicated (i) efficiency, (ii) reach efficiency, (iii) constraint, (iv) closeness centrality, and (v) effective size of the egocentric network as the most important network features in the prediction of personality state change on average (Table 4). First, the ranking broadly indicates that the majority of the most influential structural attributes are related to the egocentric network, rather than the global workplace network or the individual embedded within this global scope. The importance pattern suggests that understanding how individuals are forming direct ties in their local neighborhood of interactions offers more predictive value into personality state change than how the community is forming ties as a whole. Second, the specific prominence of efficiency and closeness centrality echoes previous work where measures of centrality and efficiency were found to outperform measures of transitivity [38]. In the current study, transitivity within the global and egocentric networks was comparatively low in importance.
Third, and in particular consideration of the office setting, the importance of efficiency and closeness centrality, measures that quantify the influence/value of ties and the ability to spread and receive information, respectively, speaks to the potential preference or need to disseminate information quickly to well-connected people. For example, one research study found that individuals concerned with how they were perceived by others within the workplace (i.e., those who were highly conscientious self-monitors) tended to occupy more central positions within the workplace network [54].
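To make the egocentric measures highlighted above more concrete, the following minimal sketch computes effective size, efficiency, constraint, and closeness centrality for one ego in a small toy network. It is not the study's implementation: it assumes an undirected, unweighted graph, uses Borgatti's simplified form of effective size and Burt's constraint with ties split equally across each node's neighbors, and the function names and the toy graph are hypothetical. Reach efficiency is omitted because its exact definition is not given here.

```python
from collections import deque


def ego_measures(adj, ego):
    """Egocentric measures for `ego` in an undirected, unweighted graph.

    `adj` maps each node to the set of its neighbours (assumed toy format).
    """
    alters = adj[ego]
    n = len(alters)
    if n == 0:
        raise ValueError("ego has no alters")

    # Ties among the alters themselves; each undirected tie is seen twice.
    t = sum(len(adj[a] & alters) for a in alters) // 2

    # Effective size: number of alters minus the average number of ties
    # each alter has to the other alters (redundancy).
    effective_size = n - 2 * t / n
    # Efficiency: share of the ego's ties that are non-redundant.
    efficiency = effective_size / n

    def p(i, j):
        # Proportion of i's ties invested in j, using i's degree in `adj`.
        return 1 / len(adj[i]) if j in adj[i] else 0.0

    # Constraint: squared direct plus indirect investment, summed over alters.
    constraint = 0.0
    for j in alters:
        indirect = sum(p(ego, q) * p(q, j) for q in alters if q != j)
        constraint += (p(ego, j) + indirect) ** 2

    return {"effective_size": effective_size,
            "efficiency": efficiency,
            "constraint": constraint}


def closeness(adj, source):
    """Closeness centrality: reachable nodes divided by summed path lengths."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Hypothetical toy network: ego "A" has alters B, C, D; B and C also interact.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}
measures = ego_measures(toy, "A")
closeness_a = closeness(toy, "A")
```

In this toy network the single tie between B and C makes part of the ego's neighborhood redundant, which lowers efficiency below 1 and raises constraint relative to a neighborhood in which no alters interact with one another.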
Focusing more specifically on a few of the 5F personality state-specific results, the size of the egocentric network was of a noticeably higher relative importance (58.7; rank 4) in predicting extraversion state change compared with all other personality states (17.9; rank 15 on average). Previous work has specifically demonstrated a positive correlation between extraversion and egocentric network size [55]. Relatedly, the importance of closeness centrality (71.78; rank 3), while not the highest among the personality models, was still highly influential in predicting extraversion state change. Like egocentric network size, several measures of centrality, including closeness centrality, have previously been found to positively correlate with extraversion [37,56]. The importance of constraint (the extent to which the ego's connections are to others who are themselves connected to each other) signifies that the ego's interactions with others who have either a higher number of shared (high constraint) or a higher number of distinct (low constraint) interactions relative to the ego are predictive of extraversion. While this study does not assess the directionality of the feature's influence on the model's prediction, the purported association of extraversion with the "diversity" of ties (similar to what is captured by efficiency) seems reasonable whether it is positive or negative. Furthermore, the intuitive hypothesis can be made that more highly extraverted behaviors may invite lesser constraint through willingness to interact with a broader array of individuals; however, further research would be needed to test whether this is the case.
In consideration of agreeableness, previous research has found that more agreeable people tend to form "small worlds" or short chains of interactions connecting individuals [38]. Structurally, this phenomenon is partly characterized by short distances (larger number of interactions) between nodes (individuals). In line with this, current analyses found that closeness centrality was particularly important (93.77; rank 2) in predicting agreeableness state change. Closeness centrality was considerably higher in importance when compared to all other personality states (55.98; rank 5 on average). Moreover, the connection between small world networks, closeness centrality, and agreeableness has been documented specifically within the workplace domain. One study examined small world networks among CEOs and their employees and found a positive, significant association between customer satisfaction and the interaction term for CEO agreeableness and closeness centrality [57]. The current work also found a uniquely higher relative importance for average similarity within the egocentric network (64.69; rank 5), defined as the degree to which alters interact with the same individuals as the ego. Compared with all other personality states (28.55; rank 11.25 on average), the importance of egocentric similarity for agreeableness speaks to the findings of a Facebook study where people, particularly males, with similarity in agreeableness had stronger connectedness relative to others [58]. The unique implications of agreeableness similarity for the resulting structure of egocentric networks in males may partially explain the heightened predictive utility of this feature in the current study's cohort.
Turning lastly to openness, the majority of the centrality measures for both the global workplace and egocentric networks had relatively low importance in the openness models relative to all other personality constructs. This pattern makes some intuitive sense given that those who are open to new experiences may be more likely to interact with a wider array of people regardless of other socially relevant factors. Accordingly, the degree to which one is well-connected or central within the social network may carry less significance. However, the fact that efficiency solely dominates the prediction dynamics of this personality state could reflect both the most consequential manifestation of openness behaviors within the workplace environment (e.g., openness to interact with individuals regardless of research group/department, thus influencing non-redundancy of ties) and this potentially less informative signal in centrality. Importantly, an individual can be efficient without being central (in the strict sense of degree, betweenness, and closeness, but see [59]); thus, this combination may be a unique signal for the openness personality state within the work setting. Differing from the current results, one study found that individuals with a higher openness to experience are more likely to act as intermediaries between previously unconnected individuals [25]. This suggests that betweenness centrality should have some predictive merit; however, for both the global and egocentric networks, it was found to be of low importance (22.25; rank 9 and 33.87; rank 6) relative to all other personality constructs. Future research may benefit from focusing on the relative efficiency and centrality profile of openness (and personality more broadly) to see if the current results generalize to other cohorts both inside and outside the workplace setting.
This exploratory work has several important limitations. First, feature importance is a scalar measure; thus, the current analysis was unable to probe the directionality of a feature's influence on a model's prediction. More complex model introspection techniques such as SHapley Additive exPlanations (SHAP) [60,61] or LIME [62] may be employed in future efforts on larger datasets to more reliably ascertain potentially meaningful magnitudes of predictive influence. Relatedly, there is an inherent tradeoff between the complexity and interpretability of any "black box", machine learning-based approach. While the ability to peer inside these models (as mentioned in the first point above) has partially mitigated this tradeoff and has allowed researchers to contextualize and detail model performance within the purview of real-world phenomena, any interpretation outside the model's demonstration of predictive merit should be treated as hypothesis-generating and exploratory rather than hypothesis-testing and confirmatory. Indeed, the current exploratory work performed introspection on a model that, despite being easy to implement in practice, is algorithmically complex and thereby unable to provide the transparency of a more traditional statistical model.
Third, it is important to recognize that this approach highlights potentially fruitful patterns of network structure that inform personality state change; however, it cannot be used to ascertain causal relationships. Model importance does not necessarily relate to phenomenological importance; thus, results should be used as a guide to inform future research inquiries rather than as a direct translation of real-world phenomenology. Fourth, the cohort was largely homogeneous, consisting mostly of Italian male researchers working in an office environment. Accordingly, the findings may not reflect processes that occur in the general population or across disparate contexts. Nevertheless, this work offers specialized insight into the network features that drive personality dynamics in an office setting. Fifth, a commonly cited limitation in passive sensing social network studies [37] is that the data did not account for interactions involving individuals who did not participate in the study. Lastly, behavioral reports during non-working hours were not collected and could therefore not be analyzed to provide a continuous account of personality state change from one day to the next.
Despite these limitations, the results are valuable in that they contribute additional proof of promise in the application of machine learning-based network modeling to personality, a literature that remains relatively small and underdeveloped. The exploratory modeling paradigm presented in this work is easily scalable to larger datasets and can be reproduced handily with only a few software packages in Python and R (see S2 and S3 Files). The current research has bolstered previous findings in support of the especially informative nature of egocentric social network features and has particularly implicated measures of efficiency, constraint, and closeness centrality as potentially fruitful markers of workplace personality dynamic change. Further exploration into specific structural network features of personality may yield unique insights that practically influence workplace organization and efficiency as well as theoretically inform broader facets of mental health and the human social condition.