Fig 1.
Word clouds showing the most/least correlated words for each factor (with rotation).
Word clouds showing the most and least correlated words for each FA factor (with rotation), as obtained using Differential Language Analysis [11]. The larger the word, the more strongly it correlates with the factor. For all word clouds shown, FDR correction has been applied so that only significant words appear; spatial location does not encode anything. Color indicates frequency (grey = low use, blue = moderate use, red = frequent use).
Fig 2.
Correlations between the learned factors and the Big5 factors.
Fig 3.
Individual factor correlations with outcomes.
Note that F4, which captures the use of swear words, correlates negatively with Satisfaction with Life (SWL).
Fig 4.
Questions (left of each factor) and Likes (right of each factor) with the highest (green) and lowest (pink) correlations for each of our five behavioral-linguistic trait factors.
Fig 5.
Word clouds showing the effect of a rotation.
A rotation yields markedly distinct factors. Note that, in the rotated version, words like “paste this” no longer appear across multiple factors, whereas in the unrotated version multiple factors are characterized by words like “paste this” and “status update”. The larger the word, the more strongly it correlates with the factor. Color indicates frequency (grey = low use, blue = moderate use, red = frequent use) [11].
Table 1.
Predictive performance for behavioral/economic outcomes.
Table 2.
Predictive performance for questionnaire-based outcomes.
Table 3.
Questions with the best and worst predictive performance.
Table 4.
Likes with the best and worst predictive performance.
Fig 6.
Test-retest validity of our learned factors.