Identifying the phonological backbone in the mental lexicon

Michael S. Vitevitch; Mary Sale

doi:10.1371/journal.pone.0287197

Abstract

Previous studies used techniques from network science to identify individual nodes and a set of nodes that were “important” in a network of phonological word-forms from English. In the present study we used a network simplification process, known as backbone extraction, that removed redundant edges to extract a subnetwork of “important” words from the network of phonological word-forms. The backbone procedure removed 68.5% of the edges in the original network to extract a backbone with a giant component containing 6,211 words. We compared psycholinguistic and network measures of the words in the backbone to the words that did not survive the backbone extraction procedure. Words in the backbone occurred more frequently in the language, were shorter in length, were similar to more phonological neighbors, and were closer to other words than words that did not survive the backbone extraction procedure. Words in the backbone of the phonological network might form a “kernel lexicon” (a small but essential set of words that allows one to communicate in a wide range of situations) and may provide guidance to clinicians and researchers on which words to focus on to facilitate typical development, or to accelerate rehabilitation efforts. The backbone extraction method may also prove useful in other applications of network science to the speech, language, hearing and cognitive sciences.

Citation: Vitevitch MS, Sale M (2023) Identifying the phonological backbone in the mental lexicon. PLoS ONE 18(6): e0287197. https://doi.org/10.1371/journal.pone.0287197

Editor: Yiu-Kei Tsang, Hong Kong Baptist University, HONG KONG

Received: March 9, 2023; Accepted: June 1, 2023; Published: June 23, 2023

Copyright: © 2023 Vitevitch, Sale. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data are available from the Open Science Framework DOI 10.17605/OSF.IO/962S4 https://osf.io/962s4/.

Funding: The authors received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

The mathematical tools of network science are being used increasingly in the speech, language, hearing, and cognitive sciences to better understand typical processing as well as various disorders that affect speech, language, and hearing (e.g., [1–5]). An early application of network science to language mapped the phonological similarity that existed among 19,340 words believed to be stored in the mental lexicon of a typical adult [6, 7]. Nodes represented phonological word forms, and edges connected words that were phonologically similar based on the addition, deletion, or substitution of a phoneme to form a web-like network, a portion of which is shown in Fig 1 (see [8, 9] for other ways to define phonological similarity).

Fig 1. Nodes represent words, and edges are placed between words that sound similar to each other.

In this network, phonological similarity is defined by a simple computational metric (add, delete, or substitute a phoneme in a word to form another word), but phonological similarity can be defined in other ways (e.g., [8, 9]).

https://doi.org/10.1371/journal.pone.0287197.g001

A central tenet of network science is that the structure of the network influences processing in that system [10–12]. Computational analyses of several other languages by [13] found the same structural features in the networks of those languages that were previously observed in the phonological network of English [6], suggesting that the structure of phonological networks is not unique to English. Subsequent psycholinguistic experiments in English found that various measures of the phonological network influenced spoken word recognition [14], speech production [15], word-learning [16], long- and short-term memory [17], and perception of the speech-to-song illusion [18] in typically-developing language users as well as in people who stutter [19] and in people with aphasia [20].

Among the various measures one can make of individual nodes in a network (i.e., the micro-level), of a subset of nodes in the network (i.e., the meso-level), or of the whole network (i.e., the macro-level), are metrics that allow one to identify nodes that are “important” in the network in some way. Two previous studies used different methods to identify individual nodes and a subset of nodes in the phonological network that were “important” in some way.

In [21] the network science measure known as closeness centrality was used to identify “important” nodes in the network. Closeness centrality measures the average distance between a node and all other nodes in the network, and is therefore considered a characteristic of an individual node [22]. In [21] it was found that words like can with high closeness centrality (i.e., it is close to many other words in the lexicon) were responded to more quickly in several psycholinguistic tasks than words like cure that were similar in several important psycholinguistic variables (e.g., frequency of occurrence, word-length, etc.), but had low closeness centrality (i.e., it is far from other words in the lexicon), demonstrating that a micro-level measure of “importance” can influence processing.

In a different study [23], an algorithm developed by [24] was used to identify a set of “important” nodes in the network called keyplayers. Keyplayers are considered “important” because the removal of this set of nodes results in the maximal fracturing of the network. It was found in [23] that the set of words identified as keyplayers were responded to more quickly in several psycholinguistic tasks than words that were similar to the keyplayer words in several important psycholinguistic variables (e.g., frequency of occurrence, word-length, etc.), but which were not in the set of keyplayers. Importantly, being a keyplayer is a characteristic of a set of nodes, not of individual nodes as in the closeness centrality measure used in [21]. Thus, the findings in [23] demonstrate that a meso-level measure of “importance” can influence processing.

In the present study we used another approach—extracting the backbone—to identify “important” nodes in the phonological network. In contrast to identifying individual nodes [21] or a set of nodes [23] that are “important” in the network, the backbone approach can be thought of as a whole-network/macro-level approach to identify “important” nodes. In the backbone approach, the essence of a larger, complex network is distilled into a smaller, simplified subnetwork that maintains the basic and crucial features of the original network [25]. The smaller, simplified subnetwork is obtained by discarding redundant or unnecessary edges. Thus, the nodes and edges that appear in the subnetwork that is extracted via the backbone method can be considered a way to identify “important” nodes/edges in the network at the whole-network/macro-level.

One thing that makes the backbone method appealing to use is that often, the smaller, simplified subnetwork reveals relationships that may have been hidden in the larger, more complex network. For example, backbone extraction was used on a network of US Senators who co-sponsored bills to reveal a smaller, simplified subnetwork that provided evidence for partisan polarization in the Senate, which was not evident using other network measures or in the larger, complex network [25]. In the context of a network of phonologically related word-forms, the backbone may capture the distinctive (i.e., marked) phonological features that must be retained to differentiate between phonemes in words, and the redundant edges removed by the backbone procedure may reflect the “default” features of phonemes in words that are easily predicted by phonological rules (and can therefore be discarded) as proposed by various theories of phonological underspecification [26–29].

In the present study we used the backbone package in R [25] to extract the backbone of the phonological network first examined in [6], and then compared the lexical and network characteristics of the words in the backbone to the words that were not in the backbone. The “important” nodes and edges that remain in the backbone might point to a set of essential words and phonological relationships (i.e., a kernel vocabulary [30]) that may prove useful to researchers and clinicians working in various areas including language development, second language learning, and aphasia, and which might not have been revealed using other measures from network science or using more traditional measures from psycholinguistics.

Methods

The phonological network in [6] was a unipartite network that contained 19,340 nodes representing words, and 31,267 undirected edges. Edges connected words if the addition, deletion, or substitution of a single phoneme changed one word into the other. Additional details about the structure of the original phonological network can be found in the results section in the comparisons between the original network and the extracted backbone.
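The edge rule described above can be illustrated with a minimal Python sketch. It is not the procedure used to build the network in [6]; the toy lexicon, the phoneme symbols, and the function names `one_phoneme_apart` and `build_network` are assumptions introduced here for illustration, and words with identical transcriptions are left unconnected by assumption.

```python
from itertools import combinations

def one_phoneme_apart(a, b):
    """True if phoneme tuples a and b differ by exactly one
    substitution, addition, or deletion of a phoneme."""
    if a == b:
        return False  # assumption: identical transcriptions are not linked
    if len(a) == len(b):
        # Substitution: same length, exactly one mismatching position.
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    # Addition/deletion: dropping one phoneme from the longer form
    # must yield the shorter form.
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))

def build_network(lexicon):
    """lexicon maps word -> tuple of phonemes; returns word -> set of neighbors."""
    graph = {word: set() for word in lexicon}
    for w1, w2 in combinations(lexicon, 2):
        if one_phoneme_apart(lexicon[w1], lexicon[w2]):
            graph[w1].add(w2)
            graph[w2].add(w1)
    return graph

# Hypothetical five-word lexicon with made-up phoneme labels.
toy_lexicon = {
    "cat": ("k", "ae", "t"),
    "bat": ("b", "ae", "t"),
    "at": ("ae", "t"),
    "cast": ("k", "ae", "s", "t"),
    "dog": ("d", "ao", "g"),
}
toy_network = build_network(toy_lexicon)
```

In this toy lexicon, cat is linked to bat, at, and cast, whereas dog has no neighbors. The all-pairs comparison is quadratic in the number of words, which is acceptable for a sketch but would likely need indexing for a lexicon of 19,340 words.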

The backbone package for R (v2.1.1; [25]) was used to extract the unweighted backbone from the whole network of 19,340 nodes. The backbone of a network is essentially a simplified and smaller subnetwork that is obtained by removing redundant or unnecessary edges (see [25] for a more technical account of the procedure). There are a variety of backbone models that can be used depending on several factors, including whether the edges are weighted or unweighted, whether one is interested in preserving a hidden hub-and-spoke structure or in revealing a hidden community structure, etc. (for guidance see [25, 31, 32]). Because previous work demonstrated the importance of community structure in the intact phonological network of English [33], we wished to maintain and examine further the simplified community structure that might be revealed in the backbone. Therefore, we used the local graph sparsification model (L-spar; [34]) with the following R command and parameter settings: sparsify(escore = "jaccard", normalize = "rank", filter = "degree", umst = FALSE).

The escore parameter determines how to score the importance of the edges with the jaccard coefficient being used to assess the similarity between the neighborhoods of the endpoints of each edge (from 0, no overlap, to 1, complete overlap). The normalize parameter determines the method to normalize the edge scores (from 0 to 1) with the rank setting being used to assign the value of 1 to the strongest edge. The filter parameter determines which edges are retained, with degree indicating that the d^s most important edges are retained (s = sparsification parameter, ranging from 0 to 1, with 0 leading to the sparsest backbone where only the strongest edge of each node is retained). In order to obtain the sparsest network possible, we selected s = 0 as the sparsification parameter, which resulted in 68.5% of the edges being removed (and 0% reduction in the number of connected nodes). We used Gephi (0.9.2; [35]) to measure various structural features of the original and backbone networks. Additional analyses were performed with JASP (Version 0.16.3 [36]).
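The edge-retention rule described above (Jaccard scoring, per-node ranking, and keeping each node's d^s strongest edges) can be sketched in Python. This is an illustration under stated assumptions and not the implementation in the backbone package: the function names are hypothetical, neighborhoods are taken to exclude the node itself, ties are broken alphabetically, an edge is kept if either endpoint selects it, and the rank normalization is folded into the sort because it does not change the order of a node's edges.

```python
import math

def jaccard(graph, u, v):
    """Overlap between the neighborhoods of u and v
    (0 = no shared neighbors, 1 = identical neighborhoods)."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0

def lspar_sketch(graph, s=0.0):
    """For each node of degree d, keep its ceil(d**s) highest-scoring edges.
    Returns the set of retained edges as frozensets of two nodes."""
    kept = set()
    for u, nbrs in graph.items():
        if not nbrs:
            continue  # isolates have no edges to rank
        # Rank this node's edges from strongest to weakest (ties by name).
        ranked = sorted(nbrs, key=lambda v: (-jaccard(graph, u, v), v))
        quota = math.ceil(len(nbrs) ** s)  # s = 0 gives a quota of one edge
        for v in ranked[:quota]:
            kept.add(frozenset((u, v)))  # assumption: either endpoint may keep it
    return kept

# Toy graph: a triangle (a, b, c) with a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
backbone_edges = lspar_sketch(toy, s=0.0)
```

With s = 0 every connected node keeps at least its single strongest edge, so edges are removed without any node losing all of its connections. In the toy graph only the b–c edge is dropped.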

Results

The original phonological network from [6] was a unipartite network that contained 19,340 nodes representing words, and 31,267 undirected edges. It had a giant component (i.e., the largest cluster of interconnected nodes in the network) of 6,508 nodes and 29,627 edges. There were 10,256 nodes, such as the words obtuse or spinach, that were not connected to any other word in the network. Unconnected nodes are called isolates in the network science literature; however, in the context of the phonological network they were referred to as “lexical hermits” [6]. The remaining 2,567 words were connected to each other in 1,019 smaller components that were not connected to other smaller components or to the giant component. These components ranged in size from 2 to 53 nodes, and in the context of the phonological network they were referred to as “lexical islands” [6].
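The three-way division described above (giant component, lexical islands, lexical hermits) can be sketched with a breadth-first search over connected components. The Python below is an illustration on a toy graph, not the Gephi analysis reported here; the function names `connected_components` and `classify` are assumptions.

```python
from collections import deque

def connected_components(graph):
    """Return the components of an undirected graph as sets, largest first."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), {start}
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    comp.add(nbr)
                    queue.append(nbr)
        components.append(comp)
    return sorted(components, key=len, reverse=True)

def classify(graph):
    """Split components into the giant component, islands (2+ nodes),
    and hermits (single unconnected nodes)."""
    comps = connected_components(graph)
    giant = comps[0]
    islands = [c for c in comps[1:] if len(c) > 1]
    hermits = [c for c in comps[1:] if len(c) == 1]
    return giant, islands, hermits

# Toy graph: a four-node chain, a two-node island, and one hermit.
toy = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"},
    "e": {"f"}, "f": {"e"},
    "g": set(),
}
giant, islands, hermits = classify(toy)
```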

After the extraction of the backbone, the 19,340 nodes were connected via 9,843 edges. Table 1 shows various network values for the original (intact) network, and for the network after the backbone had been extracted. The values reported in Table 1 confirm that the phonological network has been significantly “simplified” by the backbone extraction procedure.

Table 1. Comparison of network characteristics in the original network and the extracted backbone network.

https://doi.org/10.1371/journal.pone.0287197.t001

To determine what enabled some words to “survive” the extraction process and remain in the simplified giant component after the backbone sparsification process, we compared several psycholinguistic characteristics and several network science measures (that have previously been shown to influence language-related processes) of (1) the words in the giant component of the original network (Original GC), (2) the words that remained in the giant component after the backbone was extracted (GC of Backbone), and (3) words that were previously in the giant component of the original network, but that did not make it into the giant component of the backbone (Orig. GC/Not Bb). For the network science measures, the values of each measure are for the words before the backbone was extracted. To adjust for the unequal sample sizes and unequal variances, the Welch correction for independent sample ANOVA (with adjusted degrees of freedom) was used for all comparisons. The Tukey correction was used to adjust for multiple post-hoc comparisons. Table 2 shows the mean (and standard deviation) values for the analyses reported in this section.
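The Welch correction weights each group by its sample size divided by its variance, which is why the denominator degrees of freedom reported below are not whole numbers. The following Python sketch computes the Welch F statistic and its degrees of freedom from raw samples. It is an illustration only (the analyses here were run in JASP); the function name `welch_anova` and the toy samples are assumptions, and the p-value is omitted because it would require an F distribution from outside the standard library.

```python
from statistics import mean, variance

def welch_anova(*groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA.
    Each group needs at least two observations and non-zero variance."""
    k = len(groups)
    n = [len(g) for g in groups]
    m = [mean(g) for g in groups]
    # Precision weights: larger, less variable groups count for more.
    w = [ni / variance(g) for ni, g in zip(n, groups)]
    total_w = sum(w)
    grand = sum(wi * mi for wi, mi in zip(w, m)) / total_w
    between = sum(wi * (mi - grand) ** 2 for wi, mi in zip(w, m)) / (k - 1)
    lam = sum((1 - wi / total_w) ** 2 / (ni - 1) for wi, ni in zip(w, n))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df2 = (k ** 2 - 1) / (3 * lam)  # adjusted, generally non-integer
    return f_stat, k - 1, df2

# Hypothetical word-length samples for three conditions.
short = [3, 4, 3, 5, 4]
long_ = [6, 7, 5, 8, 6, 7]
mixed = [4, 5, 6, 4, 5]
f_stat, df1, df2 = welch_anova(short, long_, mixed)
```

With only two groups the statistic reduces to the square of Welch's t, which offers a quick sanity check on the formula.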

Table 2. Psycholinguistic and network science characteristics of words that remained in the giant component after the backbone process and of words outside of the giant component.

https://doi.org/10.1371/journal.pone.0287197.t002

Familiarity was measured on a seven-point scale, with 1 = don’t know the word to 7 = know the word [37]. There was no difference in familiarity ratings among the three different conditions of words (F (2, 801.71) = 1.82, p = .16).

Word frequency refers to the average occurrence of a word (per million words) in the language [38]. Because word frequency counts are not normally distributed, a log₁₀ transformation was used. A significant difference overall was observed among the three conditions of words (F (2, 822.15) = 9.10, p < .001). Post hoc comparisons revealed that there was no difference between the words in the original GC and the words in the GC of the backbone (t (1) = 0.55, p = .85). However, the words that were originally in the GC but ended up not in the backbone were significantly different from the words in the original GC (t (1) = -3.42, p = .002) and the words in the GC of the backbone (t (1) = 3.58, p = .001), suggesting that words that typically occur less often in the language did not “survive” the backbone extraction process.

Word length was measured as the number of phonemes in the word. A significant difference overall was observed among the three conditions of words (F (2, 798.36) = 84.77, p < .001). Post hoc comparisons revealed that there was no difference between the words in the original GC and the words in the GC of the backbone (t (1) = -2.23, p = .06). However, the words that were originally in the GC but ended up not in the backbone were significantly different from the words in the original GC (t (1) = 13.96, p < .001) and the words in the GC of the backbone (t (1) = -14.61, p = .001), suggesting that longer words did not “survive” the backbone extraction process.

Degree refers in network science to the number of nodes that are directly connected to a given node. In psycholinguistic terms this measure in the phonological network is equivalent to phonological neighborhood density, or the number of words that are similar to a given word based on the substitution, deletion, or addition of a single phoneme in any position of the target item [8]. For a review of how degree/neighborhood density influences speech perception and production see [39]. A significant difference overall was observed among the three conditions of words (F (2, 846.63) = 92.88, p < .001). Post hoc comparisons revealed that there was no difference between the words in the original GC and the words in the GC of the backbone (t (1) = 1.46, p = .31). However, the words that were originally in the GC but ended up not in the backbone were significantly different from the words in the original GC (t (1) = -9.11, p < .001) and the words in the GC of the backbone (t (1) = 9.54, p < .001), suggesting that words with fewer phonological neighbors did not “survive” the backbone extraction process.

Clustering Coefficient in the phonological network measures the extent to which phonological neighbors are also neighbors of each other. More precisely, the clustering coefficient (C) is the ratio of the actual number of edges existing among neighbors of a given word to the number of all possible edges among neighbors if every neighbor was connected. C has a range from 0 to 1. When C = 0, none of the neighbors of a given node are neighbors of each other. When C = 1, the neighbors are fully interconnected, meaning every neighbor is also a neighbor of all the other neighbors of a given word. This variable has been shown to influence spoken word recognition [14], speech production [15], word-learning [16], long- and short-term memory [17], and perception of the speech to song illusion [18]. There was no difference in the clustering coefficient values among the three different conditions of words (F (2, 791.84) = 0.13, p = .88).
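The ratio that defines C can be made concrete with a short Python sketch on a toy graph. The function name and the convention of returning 0 for nodes with fewer than two neighbors are assumptions made here, and the values reported in this section were obtained with Gephi, not with this code.

```python
from itertools import combinations

def clustering_coefficient(graph, node):
    """Edges that exist among a node's neighbors, divided by the
    number of edges that could exist among them."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention: undefined cases are scored as 0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
    return links / (k * (k - 1) / 2)

# Toy graph: x has neighbors a, b, c; only a and b are neighbors of each other.
toy = {
    "x": {"a", "b", "c"},
    "a": {"x", "b"},
    "b": {"x", "a"},
    "c": {"x"},
}
c_x = clustering_coefficient(toy, "x")  # 1 of 3 possible neighbor pairs is linked
```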

Closeness Centrality measures the average distance from one node to all other nodes in the network (following the shortest path between any two nodes being considered). This variable has been shown to influence language processing in healthy young adults [21], adults who stutter [19], and adults with aphasia [20]. A significant difference overall was observed among the three conditions of words (F (2, 789.61) = 117.62, p < .001). Post hoc comparisons revealed that each condition was significantly different from the others: Original GC and the words in the GC of the backbone (t (1) = 3.14, p = .005); Original GC and the words not in the GC of the backbone (t (1) = -19.62, p < .001); words in the GC of the backbone and words not in the GC of the backbone (t (1) = 20.53, p < .001). These results suggest that words that (on average) are farther away from other words (as indicated by the lower normalized inverse measure of the average distance to all other nodes) do not “survive” the backbone extraction process.
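A normalized inverse of the average shortest-path distance can be computed with a breadth-first search, as in the Python sketch below. It assumes that distances are taken only to nodes reachable from the source (i.e., within the same component) and that a node with no reachable neighbors scores 0; these conventions and the function name are assumptions, and Gephi's implementation may differ in detail.

```python
from collections import deque

def closeness(graph, source):
    """Number of reachable nodes divided by the sum of shortest-path
    distances to them; higher values mean the node is closer to others."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    reachable = len(dist) - 1
    if reachable == 0:
        return 0.0  # assumed convention for isolates
    return reachable / sum(dist.values())

# Toy path graph a - b - c: the middle node is closest to the others.
toy = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
center = closeness(toy, "b")  # distances 1 and 1
edge = closeness(toy, "a")    # distances 1 and 2
```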

Finally, we examined the community structure of words in the giant component before and after the backbone had been extracted [40]. Communities are smaller sub-groups of nodes that tend to be more connected to each other than to nodes found in another community (see [33]). Using the Louvain community detection algorithm, a commonly used community detection algorithm [41–43], we found that before the backbone procedure was executed the giant component contained 26 communities. Modularity, Q, is typically used to measure the extent to which clear, well-defined communities are found in a network [44]. For a formal definition of Q see [40]. Positive Q values close to the maximum of +1.0 indicate the presence of clear, well-defined communities in the network. The community detection analyses in the giant component before the backbone procedure had Q = .68. For the words in the (smaller) giant component that emerged after the backbone procedure was executed, the words were distributed among 60 communities, with Q = .87.
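For a given assignment of nodes to communities, Q can be computed by comparing the fraction of edges that fall inside each community with the fraction expected from the degrees of its nodes. The Python sketch below uses that standard per-community form of modularity on a toy graph. It only evaluates a partition that is supplied to it and does not perform Louvain community detection; the function name and the toy graph are assumptions.

```python
def modularity(graph, communities):
    """Modularity Q of a partition of an undirected, unweighted graph
    that has at least one edge."""
    m = sum(len(nbrs) for nbrs in graph.values()) / 2  # number of edges
    q = 0.0
    for comm in communities:
        comm = set(comm)
        degree_sum = sum(len(graph[v]) for v in comm)
        # Each internal edge is seen from both endpoints, hence the division by 2.
        internal = sum(1 for v in comm for u in graph[v] if u in comm) / 2
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q

# Toy graph: two triangles (a, b, c) and (d, e, f) joined by the edge c - d.
toy = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
q_split = modularity(toy, [{"a", "b", "c"}, {"d", "e", "f"}])
q_single = modularity(toy, [set(toy)])
```

Splitting the toy graph into its two triangles gives a positive Q, whereas placing every node in a single community gives Q = 0, which illustrates why higher values indicate more clearly separated communities.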

Fig 2 shows the 60 communities in the giant component from the backbone (only the 10 largest communities are colored), and Fig 3 shows a single, representative community with words labeling the nodes. Visual inspection of Fig 3 confirms that words in the same community have several phonological sequences in common (e.g., /et/ as in the words bet, debt, pet, wet, set, etc.), consistent with the initial observation in [33].

Fig 2. The 60 communities found in the giant component extracted from the backbone procedure.

Only the 10 largest communities are colored; all other nodes/communities are grey.

https://doi.org/10.1371/journal.pone.0287197.g002

Fig 3. The words found in one of the communities identified in the giant component after the backbone procedure.

https://doi.org/10.1371/journal.pone.0287197.g003

Conclusion

Previous studies have identified at the micro-level individual nodes [21] and at the meso-level a set of nodes [23] in a phonological network that were “important” for lexical processing. In the present study we used a whole-network/macro-level approach to identify “important” nodes. Namely, we extracted the backbone from the phonological network of English words. In the backbone approach, a larger, complex network is distilled into a smaller, simplified subnetwork that maintains the basic and crucial features of the original network [25].

The backbone extraction process removed 68.5% of the redundant and unnecessary edges in the phonological network examined in [6]. We compared the psycholinguistic and network measures of the 6,211 words that remained in the giant component (which originally contained 6,508 words) to words that were originally in the giant component but were not in the backbone after the sparsification procedure. Words that remained in the giant component of the backbone occurred more frequently in the language, were shorter in length, were similar to more phonological neighbors, and were closer to other words compared to the words that did not “survive” extraction of the backbone. These lexical characteristics suggest that the words in the backbone of the phonological network might form a “kernel lexicon,” or a small but essential set of words that allows one to function (although perhaps not optimally) in a wide range of situations. Consider the analysis of 4.45 million words extracted from Massive Open Online Courses by [30], who found that the ~5000 most frequent words covered 95% of the course content, and that the ~9000 most frequent words covered 98% of the course content (see also Up Goer Five https://xkcd.com/1133/ and The Thing Explainer https://xkcd.com/thing-explainer/). Perhaps the words in the phonological backbone constitute a “kernel lexicon” of phonological word-forms that allows a typical speaker to navigate most day-to-day situations.

The edges that remained after the backbone extraction process may reflect important relationships or distinctions between words that cannot be obtained in some other way, such as through phonological rules (e.g., underspecification theories by [26–29]), semantic information, context, or visual features of the lips and jaw in the articulation of the words (as might be used during lip-reading; [45]). Thus, the nodes and edges in the phonological backbone may constitute words and phonological distinctions that are crucial for successful word recognition under less-than-ideal situations, such as when listening to a speaker who is wearing a mask in the era of COVID-19 [46].

Analyses of several network science measures revealed that the removal of redundant edges in the backbone extraction procedure significantly reduced the values for degree and clustering coefficient, and increased the number of communities, indicating that the network was becoming less interconnected overall. Although the removal of 68.5% of the edges by the backbone extraction procedure significantly reduced the overall connectivity of the network, no words became isolates (i.e., lexical hermits). Rather, any words that were severed from their original structure in the original network formed smaller components (i.e., lexical islands; see [20, 47] for the influence of “lexical islands” on language processing). The fact that the removal of a large percentage of edges resulted in so little damage to the system speaks to the resilience of the phonological network (see also [48]).

Finding a resilient kernel lexicon in the phonological network could be useful for scientists and clinicians in the speech, language, and hearing sciences. The set of words identified in the backbone may provide guidance on which words to focus on to facilitate typical development, and to accelerate rehabilitation efforts. Finally, with the increased application of network science to the speech, language, hearing and cognitive sciences, the backbone extraction method that we explored in the present study may prove useful in other applications of network science. We hope that researchers studying typically developing children (e.g., [3]), children with language disorders (e.g., [1, 2]), or the process of reading (e.g., [49]) will consider how the various techniques of identifying “important” nodes in a network might be fruitful for advancing those and other research areas.

References

1. Beckage N., Smith L., & Hills T. (2011). Small worlds and semantic network growth in typical and late talkers. PLoS ONE, 6(5), e19348. pmid:21589924
2. Benham S., Goffman L. & Schweickert R. (2018). An application of network science to phonological sequence learning in children with developmental language disorder. Journal of Speech Language Hearing Research, 61, 2275–2291. pmid:30167667
3. Bower C.A., Mix K.S., Yuan L. & Smith L.B. (2022). A network analysis of children’s emerging place-value concepts. Psychological Science, 33(7), 1112–1127. pmid:35699572
4. Siew C.S.Q.; Pelczarski K.M.; Yaruss J.S.; Vitevitch M.S. (2017). Using the OASES-A to illustrate how network analysis can be applied to understand the experience of stuttering. Journal of Communication Disorders, 65, 1–9. pmid:27907811
5. Vitevitch M. S. (ed.) (2019). Network Science in Cognitive Psychology. Routledge.
6. Vitevitch M.S. (2008). What can graph theory tell us about word learning and lexical retrieval? Journal of Speech Language Hearing Research, 51, 408–422. pmid:18367686
7. Vitevitch M.S. (2022). What Can Network Science Tell Us About Phonology and Language Processing? Topics in Cognitive Science, 14: 127–142. pmid:33836120
8. Luce P.A.; Pisoni D.B. (1998). Recognizing spoken words: the neighborhood activation model. Ear & Hearing, 19, 1–36. pmid:9504270
9. Castro N. & Vitevitch M.S. (2022). Using network science and psycholinguistic megastudies to examine the dimensions of phonological similarity. Language & Speech, pmid:35586894
10. Kleinberg J.M. (2000). Navigation in a small world. Nature, 406, 845. pmid:10972276
11. Latora V.; Marchiori M. (2001) Efficient behavior of small-world networks. Physical Review Letters, 87, 198701. pmid:11690461
12. Watts D.J. & Strogatz S.H. (1998). Collective dynamics of ‘small-world’ networks. Nature, 393, 440–442. pmid:9623998
13. Arbesman S., Strogatz S.H. & Vitevitch M.S. (2010). The Structure of Phonological Networks Across Multiple Languages. International Journal of Bifurcation & Chaos, 20, 679–685.
14. Chan K.Y. & Vitevitch M.S. (2009). The Influence of the Phonological Neighborhood Clustering-Coefficient on Spoken Word Recognition. Journal of Experimental Psychology: Human Perception & Performance, 35, 1934–1949. pmid:19968444
15. Chan K. Y., & Vitevitch M. S. (2010). Network structure influences speech production. Cognitive Science, 34, 685–697. pmid:21564230
16. Goldstein R., & Vitevitch M. S. (2014). The influence of clustering coefficient on word-learning: How groups of similar sounding words facilitate acquisition. Frontiers in Language Sciences, 5, 01307. pmid:25477837
17. Vitevitch M.S.; Chan K.Y. & Roodenrys S. (2012) Complex network structure influences processing in long-term and short-term memory. Journal of Memory & Language, 67, 30–44. pmid:22745522
18. Vitevitch M.S.; Ng J.W.; Hatley E. & Castro N. (2021). Phonological but not semantic influences on the speech-to-song illusion. Quarterly Journal of Experimental Psychology, 74, 585–597.
19. Castro N., Pelczarski K.M. & Vitevitch M.S. (2017). Using network science measures to predict lexical decision performance of adults who stutter. Journal of Speech, Language, and Hearing Research, 60, 1911–1918.
20. Vitevitch M.S. & Castro N. (2015). Using network science in the language sciences and clinic. International Journal of Speech Language Pathology, 17, 13–25. pmid:25539473
21. Goldstein R. & Vitevitch M.S. (2017). The Influence of Closeness Centrality on Lexical Processing. Frontiers in Psychology, 8, pmid:29018396
22. Borgatti S. P. (2005). Centrality and network flow. Social Networks, 27, 55–71.
23. Vitevitch M. S., & Goldstein R. (2014). Keywords in the mental lexicon. Journal of Memory & Language, 73, 131–147.
24. Borgatti S. P. (2006). Identifying sets of key players in a network. Computational, Mathematical and Organizational Theory, 12, 21–34.
25. Neal Z.P. (2022). Backbone: An R Package to Extract Network Backbones. PLOS ONE, 17 (5), https://doi.org/10.1371/journal.pone.0269137.
26. Archangeli D. (1988). Aspects of underspecification theory. Phonology 5, 183–207.
27. Kiparsky P. (1985). Some consequences of lexical phonology. Phonological Yearbook, 2, 85–138.
28. Mohanan K. P. (1991). On the bases of radical underspecification. Natural Language & Linguistic Theory, 9, 285–325.
29. Steriade D. (1995). “Underspecification and markedness,” in The Handbook of Phonological Theory, ed Goldsmith J. A. (Oxford and Cambridge, MA: Blackwell Publishing), 114–174.
30. Xodabande I., Ebrahimi H. & Karimpour S. (2022). How much vocabulary is needed for comprehension of video lectures in MOOCs: A corpus-based study. Frontiers in Psychology, 13:992638. pmid:36248503
31. Gomes Ferreira C.H., Murai F., Silva A.P.C., Trevisan M., Vassio L., Drago I., et al. (2022). On network backbone extraction for modeling online collective behavior. PLoS ONE 17(9): e0274218. pmid:36107952
32. Hamann M., Lindner G., Meyerhenke H., Staudt C. L., & Wagner D. (2016). Structure-preserving sparsification methods for social networks. Social Network Analysis and Mining, 6(1), 22. https://doi.org/10.1007/s13278-016-0332-2
33. Siew C. S. (2013). Community structure in the phonological network. Frontiers in psychology, 4, 553. pmid:23986735
34. Satuluri V., Parthasarathy S., & Ruan Y. (2011). Local graph sparsification for scalable clustering. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data (SIGMOD ’11). Association for Computing Machinery, New York, NY, USA, 721–732. https://doi.org/10.1145/1989323.1989399
35. Bastian M., Heymann S., & Jacomy M. (2009). Gephi: An open source software for exploring and manipulating networks. In Proceedings of the 3rd International AAAI Conference on Weblogs and Social Media; San Jose, CA, pp. 361–362.
36. JASP Team (2022). JASP (Version 0.16.3) [Computer software].
37. Nusbaum H. C., Pisoni D. B., & Davis C. K. (1984). Sizing up the Hoosier Mental Lexicon: Measuring the familiarity of 20,000 words. Research on Speech Perception Progress Report, 10, 357–376.
38. Kučera H., & Francis W. N. (1967). Computational analysis of present day American English. Providence, RI: Brown University Press.
39. Vitevitch M.S. & Luce P. (2016). Phonological neighborhood effects in spoken word perception and production. Annual Review of Linguistics, 2, 75–94.
40. Newman M. E. J. (2004). Detecting community structure in networks. Eur. Phys. J. B 38, 321–330.
41. Blondel V.D., Guillaume J.L., Lambiotte R. & Lefebvre E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 10, P10008.
42. Girvan M., and Newman M. E. J. (2002). Community structure in social and biological networks. Proc. Natl. Acad. Sci. U.S.A. 99, 7821–7826. pmid:12060727
43. Newman M. E. J., and Girvan M. (2004). Finding and evaluating community structure in networks. Phys. Rev. E 69:026113. pmid:14995526
44. Fortunato S. (2010). Community detection in graphs. Physics Reports, 486(3), 75–174.
45. Fisher C. G. (1968). Confusions among visually perceived consonants. Journal of Speech and Hearing Research, 11(4), 796–804. pmid:5719234
46. Cox B., Tuft S. E., Morich J., & McLennan C. T. (2023). EXPRESS: Examining listeners’ perception of spoken words with different face masks. Quarterly Journal of Experimental Psychology, 0(ja). https://doi.org/10.1177/17470218231175631
47. Siew C.S.Q. & Vitevitch M.S. (2016). Spoken word recognition and serial recall of words from components in the phonological network. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42, 394–410. pmid:26301962
48. Vitevitch M.S., Castro N., Mullin G.J.D. & Kulphongpatana Z. (2023). Exploring the resilience of the phonological network: Implications for developmental and acquired disorders. Brain Sciences, 13(2), 188.
49. Siew C.S.Q. & Vitevitch M.S. (2019). The phonographic language network: Using network science to investigate the phonological and orthographic similarity structure of language. Journal of Experimental Psychology: General, 148, 475–500. pmid:30802126

