6 Jan 2014: Masuda N, Kurahashi I, Onari H (2014) Correction: Suicide Ideation of Individuals in Online Social Networks. doi: info:doi/10.1371/annotation/d589857d-b3c6-4a16-acfe-423f9bf529f1
Suicide accounts for the largest number of deaths among Japanese people in their twenties and thirties. Suicide is also a major cause of death for adolescents in many other countries. Although social isolation has been implicated in the tendency toward suicidal behavior, the impact of social isolation on suicide in the context of explicit social networks of individuals has scarcely been explored. To address this question, we examined a large data set obtained from a social networking service dominant in Japan. The social network is composed of a set of friendship ties between pairs of users created by mutual endorsement. We carried out logistic regression to identify users’ characteristics, both related and unrelated to social networks, that contribute to suicide ideation. We defined suicide ideation of a user as membership in at least one active user-defined community related to suicide. We found that the number of communities to which a user belongs, the intransitivity (i.e., paucity of triangles including the user), and the fraction of suicidal neighbors in the social network contributed the most to suicide ideation, in this order. Other characteristics, including age and gender, contributed little to suicide ideation. We also found qualitatively the same results for depressive symptoms.
Citation: Masuda N, Kurahashi I, Onari H (2013) Suicide Ideation of Individuals in Online Social Networks. PLoS ONE 8(4): e62262. doi:10.1371/journal.pone.0062262
Editor: Attila Szolnoki, Hungarian Academy of Sciences, Hungary
Received: January 3, 2013; Accepted: March 18, 2013; Published: April 26, 2013
Copyright: © 2013 Masuda et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The authors acknowledge financial supports provided through Grants-in-Aid for Scientific Research (No. 23681033). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have read the journal's policy and have the following interest: Authors Issei Kurahashi and Hiroko Onari are affiliated with iAnalysis LLC, Japan. There are no patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.
Suicide is a major cause of death in many countries. Japan had the highest suicide rate among the OECD countries in 2009. In fact, suicide accounts for the largest number of deaths among Japanese people in their twenties and thirties. Suicide is also a major cause of death for youths in other countries, including the United States.
Since the seminal sociological study by Durkheim in the late nineteenth century [3], suicide has been studied for both sociological interest and public health reasons. In particular, Durkheim and later scholars pointed out that social isolation, also referred to as the lack of social integration, is a significant contributor to suicidal behavior. The roles of social isolation in inducing other physical and mental illnesses have also been examined. Conceptual models that inherit Durkheim’s idea also claim that social networks affect general health conditions, including the tendency to suicide.
Social network analysis provides a pragmatic method to quantify social isolation. In their seminal work, Bearman and Moody explicitly studied the relationship between suicidal behavior and egocentric social networks for American adolescents using data obtained from a national survey (National Longitudinal Study of Adolescent Health) [14]. They showed that, among many independent variables including those unrelated to social networks, a small number of friends and a small fraction of triangles to which an individual belongs significantly contribute to suicide ideation and attempts. A small number of friends is an intuitive indicator of social isolation. Another study, based on self-reports from Chinese adolescents, also supports this idea quantitatively [15]. The paucity of triangles, or intransitivity, also characterizes social isolation. Individuals without triangles are considered to lack membership in social groups even if they have many friends; social groups are often approximated by overlapping triangles.
Nevertheless, the structure of the Bearman–Moody study [14] implies that our understanding of relationships between social networks and suicide is still limited. First, in the survey, a respondent was allowed to list up to five best friends of each gender. However, many respondents would have more friends. The imposed upper limit may distort network-related personal quantities such as the number of friends and triangles. Second, their study was confined to each school in the sense that only in-school names were matched. If a respondent X named two out-of-school friends who were actually friends of each other, the triangle composed of these three individuals was excluded from the analysis. Therefore, the accuracy of the triangle counts in their study may be limited, such that the relationship between intransitivity and suicidal behavior remains elusive.
In the present study, we examine the relationship between social networks and suicide ideation using a data set obtained from a dominant social networking service (SNS) in Japan, named mixi. Our approach addresses limitations in the previous study [14]. First, an entire social network of users is available, where a link between two users represents explicit bidirectional friendship endorsed by both users. Some users have quite a large number of friends, as in general social networks. Second, for the same reason, we can accurately calculate the number of triangles for each user. An additional feature of the present data set is that the sample is relatively diverse because anybody can register for free. In contrast, the respondents in the Bearman–Moody study were 7th to 12th graders in schools.
Multivariate Logistic Regression
We defined the group of users with suicide ideation and the control group of users, as described in Methods. Table 1 indicates that the difference in the mean of each independent variable (see Methods for the definition of the independent variables) between the suicide and control groups is significant (Student’s t-test). We also verified that the distribution of each independent variable is significantly different between the two groups (Kolmogorov-Smirnov test).
The results obtained from the multivariate logistic regression are summarized in Table 2. The VIF values (see Methods) are much less than 5 for all the independent variables. The three types of correlation coefficients between pairs of the independent variables are also sufficiently small (Table 3). On these bases, we justify the application of the multivariate logistic regression to our data.
The odds ratio (OR) values shown in Table 2 suggest the following. Each additional year of age multiplies the odds of belonging to the suicide group rather than the control group by 1.00463 on average. Likewise, being female, membership in one additional community, having one additional friend, and one additional day of the registration period multiply the odds by 0.821, 1.00733, 0.99790, and 0.999383, respectively; the ORs for an increase in the local clustering coefficient $C_i$ by 0.01 and for an increase in the fraction of friends in the suicide group (i.e., homophily variable) by 0.01 are listed in Table 2. For all the independent variables, the 95% confidence intervals of the ORs do not contain unity, and the p-values are small. Therefore, all the independent variables significantly contribute to the regression. In addition, because the AUC (see Methods) is large (i.e., 0.873), the estimated multivariate logistic model captures much of the variation in the user’s behavior, i.e., whether to belong to the suicide group or not.
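Because a logistic model is linear in the log-odds, a per-unit OR compounds multiplicatively over larger differences. The following minimal sketch illustrates this with two of the per-unit ORs quoted above; the function name `scaled_odds_ratio` and the chosen differences (10 years, 100 communities) are illustrative assumptions, not part of the original analysis.

```python
import math

def scaled_odds_ratio(or_per_unit, delta):
    """Odds ratio implied by a change of `delta` units, given the per-unit OR.

    In a logistic model the log-odds change by delta * ln(OR) for a change
    of delta units, so the OR over delta units is OR ** delta.
    """
    return math.exp(delta * math.log(or_per_unit))

# Per-unit ORs quoted in the text (multivariate regression, Table 2).
OR_AGE_PER_YEAR = 1.00463
OR_PER_COMMUNITY = 1.00733

# Hypothetical differences chosen only for illustration.
or_ten_years = scaled_odds_ratio(OR_AGE_PER_YEAR, 10)
or_hundred_communities = scaled_odds_ratio(OR_PER_COMMUNITY, 100)
print(f"10 years older: odds multiplied by {or_ten_years:.3f}")
print(f"100 more communities: odds multiplied by {or_hundred_communities:.3f}")
```

Under this reading, an OR that looks negligible per unit can still correspond to a sizable change in odds across the range of a widely distributed variable such as the community number.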
Univariate Logistic Regression
All the independent variables significantly contribute to the multivariate regression probably because of the large sample size of our data set. Therefore, we carried out the univariate logistic regression between the dependent variable (i.e., membership to the suicide versus control group) and each independent variable to better clarify the contribution of each independent variable.
The results obtained from the univariate logistic regression are shown in Table 4. Although the p-value for each independent variable is small, the AUC value varies considerably between independent variables. The ORs for the community number, local clustering coefficient, homophily, and registration period are consistent between the multivariate and univariate regressions. For example, both regressions indicate that a user with a large community number tends to belong to the suicide group. These independent variables also yield large AUC values under the univariate regression.
The community number makes by far the largest contribution among the seven independent variables. The AUC value obtained from the univariate regression (0.867) is close to that obtained by the multivariate regression (0.873).
The independent variable with the second largest explanatory power is the local clustering coefficient (AUC 0.690). This result is consistent with previous results. We stress that we reach this conclusion using a data set whose full social network is available.
The homophily variable makes the third largest contribution (AUC 0.643). Although we refer to this independent variable as homophily (see Methods), the effect of this variable can in fact be interpreted as either homophily or contagion [20], [21]. Nevertheless, the result is consistent with previous claims that suicide is contagious (but see [27] for a critical review) and that other related states such as depressive symptoms are contagious [28], [29] (but see [30], [31]).
The effect of the age, gender, and degree (i.e., number of friends), on suicide ideation is small, yielding small AUC values, close to the minimum value (Table 4). In addition, the ORs for these variables are inconsistent between the multivariate and univariate regressions. For example, a female user is more likely to belong to the suicide group according to the univariate regression and vice versa according to the multivariate regression. Therefore, we conclude that these three independent variables do not explain suicide ideation.
The registration period also yields a small AUC value (i.e., 0.545). Therefore, the dependence of suicide ideation on the community number, local clustering coefficient, and homophily variable is not explained by a common dependence of these variables on the registration period.
Our data set allows us to investigate correlates between users’ other characteristics and the independent variables if the characteristics have corresponding user-defined communities in the SNS. We repeated the same series of analyses for depressive symptoms, which are suggested to be implicated in suicidal behavior. A user is defined to have depressive symptoms when the user belongs to at least one of the seven depression-related communities (Methods).
The statistics of the independent variables for the depression group are compared with those for the control group in Figures 1, 2, 3, and Table 5. Each independent variable in the depression and control groups is significantly different in terms of the mean (Student’s t-test; see Table 5) and distribution (Kolmogorov-Smirnov test).
Figure 1 (distribution of the community number): We set the bin width for generating the histogram to 50. The abrupt increase in the distribution at 1000 communities for the suicide and depression groups is owing to the restriction that a user can belong to at most 1000 communities.
Figure 3 (local clustering coefficient as a function of degree): Each data point for degree $k$ is obtained by averaging $C_i$ over the users in a group with degree $k$. Large fluctuations of the average at large $k$ values are caused by the paucity of users having large $k$.
We applied the multivariate and univariate logistic regressions to identify independent variables that contribute to depressive symptoms (i.e., membership in the depression group). The control group is the same as that used for the analysis of suicide ideation. The results are shown in Tables 6 and 7. The VIF values shown in Table 6 and the correlation coefficient values shown in Table 3 justify the use of the multivariate logistic regression. The results are qualitatively the same as those for the suicide case.
We investigated relationships between suicide ideation and personal characteristics including social network variables using the data obtained from a major SNS in Japan. We found that an increase in the community number (i.e., the number of user-defined communities to which a user belongs), decrease in the local clustering coefficient (i.e., local density of triangles, or transitivity), and increase in the homophily variable (i.e., fraction of neighboring users with suicide ideation) contribute to suicide ideation by the largest amounts in this order. In addition, the results are qualitatively the same when we replaced suicide ideation by depressive symptoms. Remarkably, the most significant three variables represent online social behavior of users rather than demographic properties such as the age and gender.
Our result that age and gender have little influence on suicide ideation is inconsistent with previous findings. The weak age effect in our result may be because the majority of registered users are young; the mean age of the users in the control group is 27.7 years (Table 1). Nevertheless, we stress that suicide is a problem particularly among the young generations to which a majority of the users belong.
We concluded that the node degree explains little of suicide ideation. In contrast, previous studies showed that suicidal behavior is less common among individuals with more friends. It has also been a long-standing claim that social isolation elicits suicidal behavior. Compared to typical users, some users may spend a lot of time online to gain many ties with other users and belong to many communities on the SNS. Such a user may be active exclusively online and, for example, feel lonely and thus be prone to suicide ideation. Although this is a mere conjecture, such a mechanism would also explain the strong contribution of the community number to suicide ideation revealed in our analysis. On the other hand, many people nowadays, especially the young, regularly devote much time to online activities including SNSs. Therefore, the data obtained from SNSs may capture a significant part of users’ real lives.
Because mixi enjoys a large number of users and implements the user-defined community as a main function, its user-defined communities cover virtually all major topics. Therefore, applying the present methods to other psychiatric illnesses and symptoms, such as schizophrenia, bipolar disorder, and alcohol abuse, as well as to positive symptoms, may be profitable.
Our study is limited in some respects. First, we identified suicide ideation with membership in a relevant community, not with suicide attempts or completed suicides. Second, membership in a relevant community may not even imply suicide ideation. Users may enter the suicide group because they have encountered suicide among their friends or family. Third, our data are a specific sample of individuals from a general population. This criticism applies to any work that relies on SNS data. However, it is particularly pertinent when one focuses on individuals’ characteristics (e.g., personality and attitudes) rather than collective phenomena online (e.g., contagion on SNSs). Although it is beyond the scope of the current study, quantifying the extent to which our sample accurately represents general populations remains a future challenge.
Mixi is a major SNS in Japan. It started to operate in March 2004 and had a large number of registered users as of March 2012. Similar to other known SNSs, users of mixi can participate in various activities such as forming friendships with other users, writing microblogs, sending instant messages to others, uploading photos, and playing online games. Registration is free. See [34] for a previous study of the mixi social network.
In mixi, there were numerous user-defined communities on various topics as of April 2012. Users can join a user-defined community if the owner personally permits it or if the owner allows anybody to join.
We identified suicide ideation with the membership of a user in at least one suicidal community. To define a suicidal community that is sufficiently active, we first selected communities satisfying the following five criteria: (1) the name included the word “suicide” (“jisatsu” in Japanese), (2) there were at least 1000 members on November 2, 2011, (3) there were at least 100 comments posted in October 2011 that were directed to other comments or topics, (4) there were at least three independent topics on which comments were made in October 2011, and (5) admission was open to the public. Seven communities met these criteria. Then, we excluded one community whose name indicated that it concentrated on methods of committing suicide and two communities whose names indicated that they encouraged members to live with hope (one contained the words “want to live”, and the other contained the words “have a fun” in their names; translations by the authors).
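The five selection criteria amount to a simple filter over community records. The sketch below shows one way to express it; the `Community` record and its field names are hypothetical and do not reflect the actual mixi data schema. The subsequent exclusion of communities by inspecting their names was a manual step and is not modeled.

```python
from dataclasses import dataclass

@dataclass
class Community:
    # Hypothetical record layout, assumed for illustration only.
    name: str
    members: int            # number of members on the reference date
    comments_in_month: int  # comments directed to other comments or topics
    topics_in_month: int    # independent topics that received comments
    open_admission: bool    # True if anybody can join

def is_candidate(community, keyword):
    """Apply the five activity criteria described in the text."""
    return (
        keyword in community.name
        and community.members >= 1000
        and community.comments_in_month >= 100
        and community.topics_in_month >= 3
        and community.open_admission
    )

communities = [
    Community("jisatsu talk", 1500, 240, 5, True),
    Community("jisatsu small group", 300, 500, 9, True),   # too few members
    Community("cooking", 20000, 900, 40, True),            # keyword absent
    Community("jisatsu closed", 5000, 300, 6, False),      # admission not open
]
candidates = [c.name for c in communities if is_candidate(c, "jisatsu")]
print(candidates)
```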
As a result, four communities qualified as suicidal communities. The user statistics of these communities are shown in Table 8. A user who belongs to at least one suicidal community is defined to possess suicide ideation. To exclude inactive users, we restricted ourselves to the set of active users. An active user was defined as a user who existed as of January 23, 2012 and logged on to mixi on more than 20 days per month on average from August through December 2011. A similar definition was used in a previous study of the Facebook social network [35]. We also discarded users with zero or one friend on mixi because the triangle count described below was undefined for such users. Despite this exclusion, the remaining data allowed us to examine the effect of social isolation in terms of the degree, i.e., number of neighbors, because the degree was widely distributed between 2 and 1000. There were 9990 active users with suicide ideation (suicide group).
We statistically compared the users in the suicide group with users without suicide ideation. Because the number of users was huge, we randomly selected 228949 active users who possessed at least two friends and belonged to none of the seven candidate suicidal communities defined above or the ten candidate depression-related communities defined below. We call this set of users the control group.
The employees of mixi deleted private information irrelevant to the present study and encrypted the relevant private information before we analyzed the data. In addition, we conducted all the analyses in the central office of mixi located in Tokyo using a computer that was not connected to the Internet.
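The eligibility rule for users can be sketched as a small predicate. The input format (one count of distinct login days per month for August through December 2011) and the function name are assumptions made for illustration; the text does not describe how login activity was stored.

```python
def is_eligible_user(login_days_by_month, degree):
    """Return True for an active user with at least two friends.

    login_days_by_month: assumed list of distinct login-day counts,
    one entry per month from August through December 2011.
    degree: number of friends on the SNS.
    """
    average_days = sum(login_days_by_month) / len(login_days_by_month)
    is_active = average_days > 20      # "more than 20 days per month on average"
    has_defined_triangles = degree >= 2  # triangle count undefined for degree 0 or 1
    return is_active and has_defined_triangles

print(is_eligible_user([25, 22, 21, 20, 19], degree=5))   # average 21.4 days
print(is_eligible_user([20, 20, 20, 20, 20], degree=5))   # average exactly 20
print(is_eligible_user([30, 30, 30, 30, 30], degree=1))   # too few friends
```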
The dependent variable that represents the level of suicide ideation is binary, i.e., whether a user belongs to a suicidal community or not. Therefore, we used univariate and multivariate logistic regressions. To check for multicollinearity between independent variables and thereby justify the use of the multivariate logistic regression, we carried out two subsidiary analyses. First, we measured the variance inflation factor (VIF) for each independent variable (see [36], [37] and references therein). The VIF is the reciprocal of the fraction of the variance of the independent variable that is not explained by linear combinations of the other independent variables. It is recommended that the VIF value for each independent variable be smaller than 10 (preferably smaller than 5) for the multivariate logistic regression to be valid. Second, we measured the Pearson, Spearman, and Kendall correlation coefficients between the independent variables.
To quantify the explanatory power of the logistic model, we measured the area under the receiver operating characteristic curve (AUC) for each fit. The receiver operating characteristic curve is the trajectory of the false positive rate (i.e., fraction of users in the control group that are mistakenly classified into the suicide group on the basis of the linear combination of the independent variables) and the true positive rate (i.e., fraction of users in the suicide group correctly classified into the suicide group) when the threshold for classification is varied. The AUC value falls between 0.5 and 1. A large AUC value indicates that the logistic regression fits the data well in the sense that users are accurately classified into the suicide and control groups.
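The VIF definition above can be made concrete with a short computation. The sketch below is a pure-Python illustration rather than the software used in the study: it relies on the standard identity that the VIF of variable $j$ equals the $j$-th diagonal entry of the inverse of the correlation matrix of the independent variables. The toy data and all function names are assumptions.

```python
from statistics import mean

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    centered = [[v - mean(col) for v in col] for col in columns]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    p = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(p)]
        for i in range(p)
    ]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def variance_inflation_factors(columns):
    """VIF_j = 1 / (1 - R_j^2), read off the diagonal of the inverse correlation matrix."""
    corr = correlation_matrix(columns)
    p = len(columns)
    vifs = []
    for j in range(p):
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        vifs.append(solve(corr, unit)[j])
    return vifs

# Toy data: two variables with Pearson correlation 0.8, so R^2 = 0.64.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 3.0, 2.0, 4.0]
vifs = variance_inflation_factors([x1, x2])
print(vifs)
```

With only two independent variables, $R_j^2$ is the squared Pearson correlation, so both VIF values in the toy example equal $1/(1-0.64)$, well below the recommended bound of 5.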
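The AUC has an equivalent pairwise interpretation: it is the probability that a randomly chosen user from the suicide group receives a higher model score than a randomly chosen user from the control group, with ties counted as one half. The sketch below computes the AUC this way; it is an illustration with made-up scores, not the procedure used in the study, and the quadratic loop would be too slow for data sets of the size analyzed here.

```python
def auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison of scores.

    scores_positive: model scores of users in the positive (e.g., suicide) group.
    scores_negative: model scores of users in the control group.
    """
    wins = 0.0
    for sp in scores_positive:
        for sn in scores_negative:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5   # ties count as half a correct ordering
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical scores for illustration.
print(auc([0.9, 0.8], [0.1, 0.2]))   # perfectly separated groups
print(auc([0.8, 0.3], [0.5, 0.1]))   # three of four pairs ordered correctly
```

A score that carries no information gives a value near 0.5. A value below 0.5 would mean that the score ranks the two groups in reverse, in which case negating the score yields one minus that value.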
We considered seven independent variables. Their univariate statistics for the suicide and control groups are shown in Table 1.
Demographic independent variables include age and gender. Our analysis does not include ethnic components because most users are Japanese-speaking Japanese; mixi provides services in Japanese. Other demographic, socioeconomic, and personal characteristic variables such as residence area, occupation, company/school, and hobby, were not used because they were unreliable. In fact, many users leave them blank or do not fill them consistently, probably because they do not want to disclose them.
The number of user-defined communities that a user belongs to was adopted as an independent variable. We refer to this quantity as community number. The community number obeys a long tailed distribution for both suicide and control groups (Figure 1). The mean is quite different between the two groups (Table 1).
When a user sends a request to another user and the recipient accepts the request, the pair of users forms an undirected social tie, called Friends. A web of Friends defines the social network of mixi. We adopted degree as the most basic network-related independent variable. The degree is the number of neighbors (i.e., Friends) and is denoted by $k_i$ for user $i$. The system of mixi allows a user to have a degree of at most 1000. Consistent with the previous analysis of a much smaller data set of mixi [34], the degree distributions for both groups are long tailed (Figure 2). A small degree is an indicator of social isolation.
Local clustering coefficient.
We quantified transitivity, or the density of triangles around a user, by the local clustering coefficient, denoted by $C_i$ for user $i$. A directed-link version of the same quantity was used in the Bearman–Moody study. For user $i$ having degree $k_i$, there can be at most $k_i(k_i-1)/2$ triangles that include user $i$. We defined $C_i$ as the actual number of triangles that included $i$ divided by $k_i(k_i-1)/2$. Examples are shown in Figure 4. By definition, $0 \le C_i \le 1$. We discarded the users with $k_i \le 1$ because $C_i$ was defined only for users with $k_i \ge 2$. $C_i$ quantifies the extent to which neighbors of user $i$ are adjacent to each other. If $C_i$ is large, the user is probably embedded in close-knit social groups. A small $C_i$ value is an indicator of social isolation. As in many networks, $C_i$ decreases with $k_i$ in both suicide and control groups (Figure 3). The results are consistent with those in the previous study, in which the average $C_i$ obtained without categorizing users also decreases with $k_i$. Therefore, we carefully distinguished the influence of $k_i$ and $C_i$ on suicide ideation by combining univariate and multivariate regressions.
Figure 4: The values of $k_i$ and $C_i$ shown are for the nodes indicated by the filled circles.
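The local clustering coefficient of a user is the number of triangles that include the user divided by the maximum possible number, which is $k(k-1)/2$ for a user with $k$ friends. A minimal sketch on a toy undirected network follows; the adjacency representation (a dictionary mapping each user to a set of friends) and the user labels are assumptions for illustration.

```python
from itertools import combinations

def local_clustering(adjacency, user):
    """Local clustering coefficient of `user` in an undirected network.

    adjacency: dict mapping each user to the set of that user's friends.
    Returns None when the user has fewer than two friends, for which the
    quantity is undefined.
    """
    neighbors = adjacency[user]
    k = len(neighbors)
    if k < 2:
        return None
    # A triangle through `user` exists for each pair of friends who are
    # themselves friends of each other.
    triangles = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    return triangles / (k * (k - 1) / 2)

# Toy network: a, b, c form a triangle; d is attached to a only.
network = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
for user in sorted(network):
    print(user, local_clustering(network, user))
```

In the toy network, user a has three friends and one triangle out of three possible, users b and c each have their single possible triangle, and user d has only one friend, so the coefficient is undefined.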
Suicide may be a contagious phenomenon. If so, a user is inclined to suicide ideation when a neighbor in the social network is. Therefore, we adopted the fraction of neighbors with suicide ideation as an independent variable. It should be noted that, even if a user with suicide ideation has relatively many friends with suicide ideation, it does not necessarily imply that suicide is contagious. Homophily may be a cause of such assortativity. In this study, we did not attempt to distinguish between the effects of imitation and homophily. The differentiation would require analysis of temporal data. Nevertheless, for notational convenience, we refer to the fraction of neighbors with suicide ideation as the homophily variable.
A user who registered with mixi a long time ago may be more active and own more resources in mixi than new users. Such an experienced user may tend to simultaneously have, for example, a large community number, a large degree, and perhaps high activity in various communities including suicidal ones. To control for this factor, we measured the registration period, defined as the number of days between the registration date and January 23, 2012.
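The homophily variable of a user is the fraction of the user's neighbors who belong to the suicide group. A minimal sketch follows; the adjacency dictionary, the set of group members, and the user labels are hypothetical.

```python
def homophily_variable(adjacency, group_members, user):
    """Fraction of the neighbors of `user` that belong to `group_members`.

    adjacency: dict mapping each user to the set of that user's friends.
    group_members: set of users in the group of interest (e.g., suicide group).
    Assumes the user has at least one friend; the study retains only users
    with two or more friends.
    """
    neighbors = adjacency[user]
    in_group = sum(1 for v in neighbors if v in group_members)
    return in_group / len(neighbors)

# Toy network with hypothetical group membership.
friends = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": {"a"},
}
group = {"b", "d"}
print(homophily_variable(friends, group, "a"))   # two of four friends are in the group
print(homophily_variable(friends, group, "c"))   # one of two friends is in the group
```

As the text notes, a large value of this fraction does not by itself separate contagion from homophily; the computation is the same under either interpretation.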
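The registration period is a plain difference of calendar dates. The sketch below uses the reference date given in the text; the example registration dates are hypothetical.

```python
from datetime import date

REFERENCE_DATE = date(2012, 1, 23)  # reference date stated in the text

def registration_period(registered_on):
    """Number of days between the registration date and the reference date."""
    return (REFERENCE_DATE - registered_on).days

# Hypothetical registration dates for illustration.
print(registration_period(date(2012, 1, 22)))
print(registration_period(date(2011, 1, 23)))
```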
Analysis of Depressive Symptoms
To define a depression-related community, we identified the communities satisfying the same five criteria as in the case of the suicidal community, but with the term suicide in the community name replaced by depression (“utsu” in Japanese). There were ten such communities. We excluded three of them because their names include positive words (let’s overcome, resume one’s place in society, cure; translations by the authors). We defined the remaining seven communities, summarized in Table 9, to represent depressive symptoms of users. The depression group is the set of active users who belong to at least one depression-related community listed in Table 9. The depression group contains 24410 users.
Mixi approved the provision of the data.
We thank mixi, Inc. for providing us with their data and Taro Takaguchi for careful reading of the manuscript.
Conceived and designed the experiments: NM. Performed the experiments: HO. Analyzed the data: HO. Contributed reagents/materials/analysis tools: IK NM. Wrote the paper: NM.
- 1. Chambers A (2010) Japan: ending the culture of the ‘honourable’ suicide. The Guardian (3 August 2010).
- 2. US Bureau of the Census (2012). Statistical abstract of the United States.
- 3. Durkheim E (1951) Suicide. New York: Free Press.
- 4. Trout DL (1980) The role of social isolation in suicide. Suicide Life-Threatening Behav 10: 10–23.
- 5. Joiner Jr TE, Brown JS, Wingate LR (2005) The psychology and neurobiology of suicidal behavior. Annu Rev Psychol 56: 287–314. doi: 10.1146/annurev.psych.56.091103.070320
- 6. Wray M, Colen C, Pescosolido B (2011) The sociology of suicide. Annu Rev Sociol 37: 505–528. doi: 10.1146/annurev-soc-081309-150058
- 7. Putnam RD (2000) Bowling Alone. New York: Simon & Schuster.
- 8. Pescosolido BA, Georgianna S (1989) Durkheim, suicide, and religion: toward a network theory of suicide. Amer Sociol Rev 54: 33–48. doi: 10.2307/2095660
- 9. Bearman PS (1991) The social structure of suicide. Sociol Forum 6: 501–524. doi: 10.1007/bf01114474
- 10. Berkman LF, Glass T, Brissette I, Seeman TE (2000) From social integration to health: Durkheim in the new millennium. Soc Sci Med 51: 843–857. doi: 10.1016/s0277-9536(00)00065-4
- 11. Kawachi I, Berkman LF (2001) Social ties and mental health. J Urban Health 78: 458–467. doi: 10.1093/jurban/78.3.458
- 12. Wasserman S, Faust K (1994) Social Network Analysis. New York: Cambridge University Press.
- 13. Newman MEJ (2010) Networks – An introduction. Oxford: Oxford University Press.
- 14. Bearman PS, Moody J (2004) Suicide and friendships among American adolescents.
- 15. Cui S, Cheng Y, Xu Z, Chen D, Wang Y (2010) Peer relationships and suicide ideation and attempts among Chinese adolescents. Child Care Health Dev 37: 692–702. doi: 10.1111/j.1365-2214.2010.01181.x
- 16. Krackhardt D (1999) The ties that torture: Simmelian tie analysis in organizations. Research in the Sociology of Organizations 16: 183–210.
- 17. Palla G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435: 814–818. doi: 10.1038/nature03607
- 18. Onnela JP, Saramäki J, Hyvönen J, Szabó G, Lazer D, et al. (2007) Structure and tie strengths in mobile communication networks. Proc Natl Acad Sci USA 104: 7332–7336. doi: 10.1073/pnas.0610245104
- 19. Backstrom L, Huttenlocher D, Kleinberg J, Lan X (2006) Group formation in large social networks: membership, growth, and evolution. Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining : 44–54.
- 20. Aral S, Muchnik L, Sundararajan A (2009) Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proc Natl Acad Sci USA 106: 21544–21549. doi: 10.1073/pnas.0908800106
- 21. Shalizi CR, Thomas AC (2011) Homophily and contagion are generically confounded in observational social network studies. Sociol Methods Res 40: 211–239. doi: 10.1177/0049124111404820
- 22. Mann JJ (2002) A current perspective of suicide and attempted suicide. Ann Intern Med 136: 302–311. doi: 10.7326/0003-4819-136-4-200202190-00010
- 23. Baller RD, Richardson KK (2002) Social integration, imitation, and the geographic patterning of suicide. Amer Soc Rev 67: 873–888. doi: 10.2307/3088974
- 24. Romer D, Jamieson PE, Jamieson KH (2006) Are news reports of suicide contagious? A stringent test in six U. S. cities. J Communication 56: 253–270. doi: 10.1111/j.1460-2466.2006.00018.x
- 25. Hedström P, Liu KY, Nordvik MK (2008) Interaction domains and suicide: a population-based panel study of suicides in Stockholm, 1991–1999. Soc Forces 87: 713–740. doi: 10.1353/sof.0.0130
- 26. Baller RD, Richardson KK (2009) The “dark side” of the strength of weak ties: the diffusion of suicidal thoughts. J Health Soc Behav 50: 261–276. doi: 10.1177/002214650905000302
- 27. Gould MS, Wallenstein S, Davidson L (1989) Suicide clusters: a critical review. Suicide Life-Threatening Behav 19: 17–29.
- 28. Christakis NA, Fowler JH (2009) Connected. New York: Little, Brown and Company.
- 29. Rosenquist JN, Fowler JH, Christakis NA (2011) Social network determinants of depression. Mol Psychiatry 16: 273–281. doi: 10.1038/mp.2010.13
- 30. Lyons R (2011) The spread of evidence-poor medicine via flawed social-network analysis. Stat Politics Policy 2: Article 2.
- 31. VanderWeele TJ (2011) Sensitivity analysis for contagion effects in social networks. Sociol Methods Res 40: 240–255. doi: 10.1177/0049124111404821
- 32. Brezo J, Paris J, Turecki G (2006) Personality traits as correlates of suicidal ideation, suicide attempts, and suicide completions: a systematic review. Acta Psychiatr Scand 113: 180–206. doi: 10.1111/j.1600-0447.2005.00702.x
- 33. Martin D (2010) What Americans do online: social media and games dominate activity. Nielsen News, Online (2 August 2010).
- 34. Yuta K, Ono N, Fujiwara Y (2007). A gap in the community-size distribution of a large-scale social networking site.
- 35. Ugander J, Karrer B, Backstrom L, Marlow C (2011). The anatomy of the Facebook social graph.
- 36. Stine RA (1995) Graphical interpretation of variance inflation factors. Am Stat 49: 53–56. doi: 10.1080/00031305.1995.10476113
- 37. Tufféry S (2011) Data Mining and Statistics for Decision Making (2nd edition). Chichester: Wiley.
- 38. Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393: 440–442. doi: 10.1038/30918