Identifying the knowledge structure of electromagnetic fields and health research: Text network analysis and topic modeling

Background: With technological and scientific advancement, people are increasingly exposed to electromagnetic fields, particularly from portable devices such as mobile phones. However, despite the large amount of research conducted on this topic, there is currently no consensus regarding the health effects of electromagnetic field exposure. This study aimed to understand the knowledge structure and trends of electromagnetic field and health research through text network analysis and topic modeling.
Methods: PubMed, Embase, and Cochrane were searched, and 3,880 articles published before June 2021 were identified. We explored the main keywords and research topics regarding electromagnetic fields and human health by constructing a network of keywords. A social network analysis program was used to analyze the data, visualize the network, and perform topic modeling.
Results: Four keywords, “exposure,” “effect,” “cell,” and “cancer,” were highly correlated to other keywords, and each formed the center of a colony in the knowledge structure of research on electromagnetic fields and health. Five topics were derived from topic modeling: cell research, research on the adaptation of MRI, health effects of mobile phones, pain therapy, and exposure measurement. Cell research has been continuously performed, and many studies have been conducted on the health effects of mobile phones since 2000.
Conclusions: These findings will assist in gaining insights into and understanding changes in research on the health effects of electromagnetic fields, and suggest important areas and directions for future research.


Introduction
Electromagnetic radiation is produced by natural and anthropogenic sources [1], and owing to scientific and technological advancements, people are constantly exposed to electromagnetic fields (EMFs). Anthropogenic non-ionizing EMFs can be classified into the extremely low-frequency (ELF) and radiofrequency (RF) ranges [2,3]. ELF-EMFs are associated with the electricity supplied through power sockets, whereas RF-EMFs (which have higher frequencies) are used for information transmission via TV antennas, radio stations, mobile phones, Wi-Fi, and fifth-generation (5G) technology [2].
The increasing use of wireless devices has increased public exposure to EMFs [4]. Therefore, international scientific organizations and regulatory agencies have investigated the potential health risks associated with EMF exposure [4,5], and the International Commission on Non-Ionizing Radiation Protection (ICNIRP) and the International Committee for Electromagnetic Safety (ICES) have published guidelines for limiting public exposure to EMFs [5]. In 2002, the International Agency for Research on Cancer (IARC) classified ELF-EMFs as possibly carcinogenic to humans (Group 2B) due to limited clinical evidence, inadequate experimental support, and lack of plausible mechanisms at the exposure levels observed in epidemiological studies [1]. In 2011, the IARC also classified RF-EMFs into Group 2B for similar reasons and noted the need for additional long-term studies with adequate levels of RF-EMF exposure [6].
Studies have been conducted on the potential health effects of EMFs, including epidemiologic studies, animal and cell research, and EMF exposure assessments. However, the effects of EMFs on the human body remain unclear, and the World Health Organization (WHO) has continuously investigated this by promoting the International EMF Project since 1996 [7]. To assess and manage possible EMF risks, the WHO has identified and recommended exploring relevant research topics. Furthermore, in 2019, the WHO prioritized research on possible health implications of increased mobile phone usage [8,9], and commissioned systematic reviews to examine and synthesize the available data. As a consensus on this subject has yet to be reached, previous studies should be analyzed to identify evidence [2,10].
Although systematic reviews and meta-analyses are commonly used for this, they are unsuitable for macro-analyses as they aim to identify answers to specific research questions [11]. Social network analysis (SNA) is an analytical method that can be applied to a large amount of data and is commonly used to identify the contextual meaning of words and their relationships. Texts can be analyzed by coding them into conceptual or semantic networks [12]. Text network analysis (TNA) applies SNA to extensive text materials in big data [13]. TNA can therefore be used to identify knowledge structures and research topic trends based on the frequency, centrality, and co-occurrence of keywords [14]. Knowledge structure analysis using text networks quantitatively derives the key concepts of a particular field and helps visualize the relationships between them [15]. By providing quantification values for keywords, network analysis can offer a more novel perspective on research topics than conventional methods [12,14]. By identifying the knowledge structure of EMFs and health research, we can identify research trends and suggest future research directions. Therefore, in this study, we conducted TNA to identify research themes and trends over time and to investigate the properties of the resulting knowledge structure.

Methods

Data search and collection
The PubMed, Embase, and Cochrane Library databases were searched for EMF-related literature in June 2021 using the keyword "electromagnetic field" in the title or abstract (S1 Table). We limited the search to human research and manuscripts in English and excluded duplicates and articles without abstracts, which yielded 3,880 studies (Fig 1). We extracted key information from the included studies using citation information from the databases and organized it using a predefined Excel form (S2 Table).
To account for words that appear in different forms or are repeated, word refinement was performed before generating the keyword matrix. A dictionary comprising a thesaurus, defined words, and exception words was created. First, words and abbreviations with identical or similar meanings were grouped together and assigned one representative word. To prevent overlaps in meaning during the analysis, all words were converted to lowercase and singular form. Second, terms consisting of two or more morphemes were grouped and extracted as a single word. Finally, the exceptions list was created by determining the morphemes to be excluded from the analysis (such as analysis terms and abstract types). The search term "electromagnetic field" was included in the exceptions dictionary because it was mentioned in all studies. We repeatedly analyzed the words in the abstracts while creating the dictionary and decided which words to register based on the advice of a text network analysis expert and a librarian.
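The refinement steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary entries, the function name `refine`, and the underscore convention for multi-word terms are hypothetical and are not the dictionaries used in this study. For brevity, the sketch merges multi-word terms before applying the thesaurus.

```python
import re

# Hypothetical dictionary entries, for illustration only.
DEFINED_WORDS = [
    "electromagnetic fields", "electromagnetic field",
    "mobile phones", "mobile phone",
]
THESAURUS = {  # variant form -> representative (lowercase, singular) word
    "cells": "cell",
    "electromagnetic_fields": "electromagnetic_field",
    "mobile_phones": "mobile_phone",
    "cellphone": "mobile_phone",
}
EXCEPTIONS = {"electromagnetic_field", "to", "from", "of", "the", "and", "study"}


def refine(text):
    """Return the refined keyword tokens of one title or abstract."""
    text = text.lower()
    # Join defined multi-word terms into single tokens, longest phrase first.
    for phrase in sorted(DEFINED_WORDS, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, phrase.replace(" ", "_"), text)
    tokens = re.findall(r"[a-z_]+", text)
    # Map each token to its representative word, then drop exception words.
    tokens = [THESAURUS.get(tok, tok) for tok in tokens]
    return [tok for tok in tokens if tok not in EXCEPTIONS]


print(refine("Cells exposed to electromagnetic fields from mobile phones"))
```

In this toy input, the search term itself is removed by the exceptions list, while "mobile phones" survives as a single representative keyword.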

Generation of the keyword matrix and network
By applying the thesaurus, defined-words, and exception-words dictionaries, we identified 22,797 keywords and their frequency of appearance. In TNA, the main phenomena can be clearly identified by focusing on frequently repeated subject words; generally, only keywords appearing at a certain minimum frequency are included in the analysis [16]. In this study, words occurring 10 or more times were retained as keywords, so that the top 10% of words were included in the analysis to represent the main content of the text.
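A minimal sketch of this frequency cut-off is given below, assuming each document has already been reduced to a list of refined tokens. The function name `select_keywords` and the toy corpus are hypothetical; only the threshold of 10 comes from the text.

```python
from collections import Counter


def select_keywords(docs, min_freq=10):
    """Keep words whose total frequency across all documents is >= min_freq."""
    counts = Counter(token for doc in docs for token in doc)
    return {word: n for word, n in counts.items() if n >= min_freq}


# Toy corpus: "cell" occurs 11 times, "exposure" 10 times, "rare" once.
docs = [["cell", "exposure"]] * 10 + [["cell", "rare"]]
print(select_keywords(docs))
```

Whether frequency is counted per occurrence or per document is a design choice; the sketch counts every occurrence.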
In TNA, a node represents a keyword of a paper, and co-occurrence refers to two keywords appearing together in the same paper. Because the same keywords appear in different papers, links and networks form between them. Therefore, we generated a matrix of co-occurrence frequencies between the previously selected keywords and constructed a keyword network in which links represent co-occurrence relationships. Frequent co-occurrence of two words indicates that they are closely associated and have a significant contextual relationship [14]. We generated a total of 64,409 one-mode matrices from two-mode matrices and analyzed studies at ten-year intervals to identify changes in EMF research subjects over time.
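The conversion from a two-mode (paper by keyword) structure to a one-mode (keyword by keyword) co-occurrence structure can be sketched as follows. This is an illustrative reading, not the procedure of the software used in this study: it assumes binary presence of a keyword in a paper, so each link weight is the number of papers in which both keywords appear. The function name `cooccurrence` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs, keywords):
    """Count, for each keyword pair, the number of documents containing both."""
    links = Counter()
    for doc in docs:
        # Two-mode view: which of the selected keywords occur in this document.
        present = sorted(set(doc) & keywords)
        # One-mode view: every pair of co-present keywords gains one link count.
        for a, b in combinations(present, 2):
            links[(a, b)] += 1
    return links


docs = [
    ["cell", "exposure", "cancer"],
    ["cell", "exposure"],
    ["cancer", "risk"],
]
links = cooccurrence(docs, keywords={"cell", "exposure", "cancer"})
for pair, weight in sorted(links.items()):
    print(pair, weight)
```

In matrix terms, if B is the binary paper-by-keyword matrix, the off-diagonal entries of BᵀB hold the same counts.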

Keyword analysis and visualization
We analyzed keyword frequency and the degree, closeness, and betweenness centralities. These centrality indicators are commonly used in TNA [16] and serve as keyword quantification values. Centrality indicates how central a node is in a network, expressed as a relative ranking. Keywords with high centrality are considered main keywords. Degree centrality reflects how many connections a keyword has with other research keywords, i.e., the level of influence between keywords. Closeness centrality measures the connection distance between nodes to indicate their proximity. Betweenness centrality measures the level of mediation between keyword groups [16,17].
The knowledge structure was visualized as a sociogram showing the network structure, nodes, and connection strengths, using keywords with high frequency and degree centrality [12]. The network data and analysis results were graphically visualized using NetMiner 4.0 (Cyram Inc., Seongnam, Korea).
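To make the three indicators concrete, the following sketch computes them for a small unweighted, undirected keyword network using only the standard library. It is not NetMiner's implementation, and the normalization choices are assumptions: degree is divided by n - 1, closeness is the number of reachable nodes divided by the sum of shortest-path distances, and betweenness (computed with the Brandes algorithm) is left unnormalized. The function name and the toy network are hypothetical.

```python
from collections import deque


def centralities(adj):
    """adj maps each node to the set of its neighbours (undirected, unweighted)."""
    n = len(adj)
    degree = {v: len(adj[v]) / (n - 1) for v in adj}
    closeness = {}
    betweenness = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s: distances, path counts, predecessors.
        dist = {s: 0}
        sigma = dict.fromkeys(adj, 0.0)
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        total = sum(dist.values())
        closeness[s] = (len(dist) - 1) / total if total else 0.0
        # Brandes back-propagation of pair dependencies.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    # In an undirected graph every pair is visited from both endpoints.
    betweenness = {v: b / 2 for v, b in betweenness.items()}
    return degree, closeness, betweenness


# Toy star network: "exposure" links to three otherwise unconnected keywords.
adj = {
    "exposure": {"cell", "cancer", "risk"},
    "cell": {"exposure"},
    "cancer": {"exposure"},
    "risk": {"exposure"},
}
degree, closeness, betweenness = centralities(adj)
print(degree["exposure"], closeness["cell"], betweenness["exposure"])
```

In the toy network the hub keyword has the highest value on all three indicators, which is the pattern that identifies a main keyword.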

Topic modeling
Latent Dirichlet allocation (LDA) is the most frequently used algorithm in topic modeling that learns a set of topics from words that tend to occur together in documents [15]. It identifies hidden topics within documents and document sets and uncovers the ratio of topics for each document and the probability of each word being included in each topic [17].
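For reference, the standard generative model of LDA with symmetric Dirichlet priors can be written as below. The notation is an assumption introduced here for illustration: D documents, K topics, N_d words in document d, per-document topic distribution θ_d with prior α, and per-topic word distribution φ_k with prior β.

```latex
\begin{align*}
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D \\
\varphi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), & n &= 1,\dots,N_d \\
w_{d,n} \mid z_{d,n}, \varphi &\sim \mathrm{Categorical}(\varphi_{z_{d,n}})
\end{align*}
```

Here z_{d,n} is the latent topic of the n-th word w_{d,n} in document d. Prior values below 1 concentrate probability mass on a few topics per document and a few words per topic, which is why small values tend to give sparser distributions.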
We performed topic analysis using LDA and selected small Dirichlet hyperparameters, i.e., α = 0.1 (prior on the per-document topic distribution) and β = 0.01 (prior on the per-topic word distribution), to obtain sparse topic and word distributions and, consequently, more interpretable topics. As it is difficult to select an optimal number of topics [18], we fitted models with various numbers of topics and compared the similarities and differences in their contents. We also performed this analysis for different time periods to identify temporal changes.

Results
Table 1 shows the top 30 keywords by frequency and the three centrality indices. "Exposure," "cell," "patient," "effect," and "treatment" showed high frequency and centrality, suggesting that they appeared regularly (Table 1). The knowledge structure of EMFs and health was also identified. Four keywords (i.e., "exposure," "effect," "cell," and "cancer") were highly correlated to other keywords, and each colony was formed around these four keywords (Fig 2).

PLOS ONE
Electromagnetic fields and health research: Text network analysis and topic modeling

Topic modeling of EMFs and health research
LDA topic analysis identifies topics commonly included in the literature based on unsupervised learning. The topics are transformed into keyword combinations based on statistical results, and experts judge the meaning of the combinations and derive meaningful topics. Several rounds of LDA were performed on varying numbers of topics. After we grouped the subtopics through discussion, K = 5 topics with no overlapping meanings between groups were identified (Fig 2). Each topic was ranked with reference to word weight, and the top 10 collocates in the corresponding topic were extracted. We combined meaningful keywords to form topic groups and derived five groups (Fig 2). Topic 1 (cell research on EMF exposure) included "cell," "gene," "exposure," "expression," and "level." Topic 2 (adaptation of magnetic resonance imaging (MRI) for radiation therapy (RT) applied to cancer patients) included "patient," "image," "cancer," "radiation," and "MRI." Topic 3 (effects of EMFs from mobile phones) included "exposure," "risk," "mobile phone," "health," and "cancer." Topic 4 (pain therapy using EMFs) included "patient," "treatment," "pain," "therapy," and "pulsed electromagnetic field therapy." Topic 5 (measurement of EMF exposure) included "field," "exposure," "specific energy absorption rate," "measurement," "system," and "body." The network between keywords in each topic group is shown in Fig 3.

Trends in the topics of EMFs and health research over time
To identify trends in the topics of EMFs and health research, we performed topic modeling based on time periods. Prior to 1990, topics largely included "cell," "field," "exposure," "frequency," and "body." These cell research topics were reduced to 18% in the 1990s, when the most common topics included "exposure," "cancer," "risk," "child," and "leukemia." In the 2000s, research topics included "exposure," "specific energy absorption rate," "mobile phone," "system," and "field." From 2010 to June 2021, most EMF-related health research on mobile phones included "exposure," "mobile phone," "level," "risk," and "health" (S3 Table).

Discussion
This study aimed to provide insights into EMF and health research through quantitative and qualitative analyses of the network of main keywords in the published literature. We present a scientific perspective on the subject obtained by observing the trends in research topics and identifying the knowledge structure of EMF and health research. Our results show a macro network centered on the keywords "exposure," "effect," "cell," and "cancer" in the knowledge structure of EMF and health research. A keyword with high centrality can influence the research trend considerably and form a network upon connection with other keywords [13]. "Exposure" was the center of a large network group, indicating that many studies were performed around this network of keywords. The keywords "exposure standards," "mobile phone," "Wi-Fi," and "MRI" were also networked with "exposure." The network group of the keyword "effect" was linked to "muscle," "nerve," "stimulation," and "pulsed electromagnetic field therapy." In addition, "cancer" was centered on studies related to both "child" and "brain," and formed a strong network with "cell." Therefore, we confirmed that cancer research was widely performed using cell research.
Considering research trends over time, "bone," "fracture," "radiation," "energy," and "diathermy" were commonly observed before 1990. Several studies investigated the effects of pulsed EMFs on bone healing after the Food and Drug Administration approval in 1979 [19–21]. Research before 1990 also focused on local RF hyperthermia for tumor treatment, treatment modality design, and diathermy technique [22,23]. With the invention of MRI in the late 1970s, several studies addressed its safety, accuracy, and diagnostic capability [24].
"Child," "melatonin," "leukemia," "association," and "worker" were the main keywords in 1991-2000. Most studies in the 1990s, led by the WHO and ICNIRP, explored the health effects of ELF from residential power stations and home electronic appliances, and indicated a weak relation with childhood leukemia. Therefore, ELF was classified as a possible human carcinogen (Group 2B) by the IARC in 2002 [1,25]. Studies investigating adults in this period produced unclear results, particularly for the carcinogenic potential of occupational ELF exposure and nighttime exposure [26].
In 2001-2010, "pain," "RF," and "mobile phone" emerged, and "pulsed electromagnetic field therapy" and "specific energy absorption rate" became keywords with high centrality. In the 2000s, EMF studies started to focus on the relationship between RF exposure (mobile phones and wireless communication) and adverse health effects, particularly brain cancer. The IARC classified RF as a possible human carcinogen (Group 2B) in 2011 based on a long-term epidemiological study [6,27]. In 2011-2021 (2,320 studies), "data" emerged as a keyword and "pulsed electromagnetic field therapy" continued to rank highly in centrality. After the 2000s, few studies investigated the therapeutic effects of electric current therapies. In the 2010s, several studies addressed the efficacy of pulsed EMFs for chronic pain from musculoskeletal disorders [28].
LDA was used to identify focus topics based on keywords. By categorizing the keywords, we derived the following five meaningful topic groups. Topic 1 consisted of keywords related to "cell research on EMF exposure." "Cell," "gene," and "exposure" exhibited the highest frequency and centrality. As expected for epidemiologic research, randomized controlled trials of human health effects, particularly cancer, are limited by ethical issues and long-exposure requirements. To overcome these issues, in vitro experiments are performed to examine the response of human cells and genes to ELF-EMFs [29].
Topic 2 was related to "adaptation of MRI for RT of cancer patients." Introducing MRI prior to RT as a treatment pathway has attracted interest as it provides improved soft-tissue image contrast, thereby enabling RT to be tailored and adapted to patients [30].
Topic 3 was associated with "health effects of EMFs from mobile phones." In 2000, the IARC coordinated a study in 13 countries to investigate whether RF-EMFs from mobile phones increase the risk of cancer [27,31]. Since 2011, large-scale epidemiological studies have been conducted on this subject. The MOBI-KIDS study, a multinational epidemiological study, evaluated the potential carcinogenic effects of exposure to RF and ELF-EMFs from mobile phones on tumors of the central nervous system in children and adolescents [32]. Although a carcinogenic risk was not clearly identified, this study highlighted the need for further investigation of adverse effects on the central nervous system [33]. Other large-scale epidemiological studies include the Cohort Study of Mobile Phone Use and Health and the Advanced Research on Interaction Mechanisms of Electromagnetic Exposures with Organisms for Risk Assessment [34,35].
Epidemiological studies have primarily been performed for identifying long-term health impacts, including the causes of pediatric cancer, which have attracted public attention because of the vulnerability of children during their growth period [36].
Topic 4 comprised keywords associated with "pain therapy using EMF." EMFs have been used for pain relief for decades, including pulsed EMFs for musculoskeletal pain or bone healing. Several studies have been conducted to identify the efficacy or effect of such usage [28,37].
Topic 5 was associated with "measurement of EMF exposure." The measurement of EMF exposure using human modeling or field calculations is a major research topic, particularly for RF. Specific absorption rate (SAR) in the human head is the most common evaluation measure. For instance, the SAR of patients during MRI utilization, which generates RF, has been analyzed to investigate patient safety [38,39].
Analyzing the topic models over time confirmed the trends in EMF and health research topics. EMF-related cell research has been conducted continuously, although at a decreasing rate, and research on "pulsed electromagnetic field therapy" has also continued. Since 2000, an increasingly large number of studies have investigated the health effects of mobile phones, in accordance with the WHO's recommendations [8,9].

Limitations
Our investigation was limited to human studies, as the purpose was to establish the research trends of EMF effects on human health. Because animal research was excluded, some relevant studies on EMF effects may have been missed. Furthermore, the literature collection was limited to three databases. Nevertheless, they are representative international databases in the health field, so it can be assumed that most papers on EMFs and health were included. A methodological limitation was that only titles and abstracts were used to extract texts for TNA, and keywords with low frequency or centrality were excluded. Therefore, generalizing these results requires caution and supporting evidence.

Conclusion
Using TNA, we investigated the research trends on EMFs and human health through several approaches. To the best of our knowledge, this is the first study to identify this knowledge structure. The relationship between research keywords, the structure of the central keyword for each topic, and changes in the main research topics over time were identified, and the trends and semantic networks of published studies were detailed. Future research directions can be inferred based on these findings.
Since 1996, the WHO has organized the International EMF Project to promote intensive and high-quality research programs on EMF exposure and health risks. Several laboratory studies have been conducted on short-term effects, and epidemiologic studies have been conducted on long-term effects. In addition, large-scale multinational studies such as the MOBI-KIDS study, the INTEROCC study, and COSMOS have been conducted [32–35]. However, the IARC classified ELF and RF-EMFs as being possibly carcinogenic to humans (Group 2B) due to limited evidence of plausible mechanisms to explain the exposure levels observed in epidemiological studies. They suggested the need for additional long-term studies for identifying evidence of EMF exposure and health risks [1,6]. In this study, research on the health effects of mobile phones related to exposure to RF-EMFs and the measurement of EMF exposure were found to account for more than 50%. Additionally, in 2019, the WHO EMF Project commissioned systematic reviews that synthesize the results of available EMF studies on human effects [8,9]. Therefore, both an integrated study and a large-scale epidemiologic study that can provide high-level evidence on EMF exposure and health effects are needed.