Mining author relationship in scholarly networks based on tripartite citation analysis

Taking scholars in Scientometrics as examples, we develop five author relationship networks, namely, co-authorship, author co-citation (AC), author bibliographic coupling (ABC), author direct citation (ADC), and author keyword coupling (AKC). The time frame of the data sets is divided into two periods: before 2011 (i.e., T1) and after 2011 (i.e., T2). Through quadratic assignment procedure analysis, we found that some authors have ABC or AC relationships (i.e., potential communication relationship, PCR) but do not have actual collaborations or direct citations (i.e., actual communication relationship, ACR) among them. In addition, we noticed that PCR and AKC are highly correlated and that the old PCR and the new ACR are correlated and consistent. These facts indicate that PCR tends to produce academic exchanges based on similar themes, and that ABC has an advantage in predicting potential relations. Based on tripartite citation analysis, including AC, ABC, and ADC, we also present an author-relation mining process that can be used to detect deep and potential author relationships. We analyze its prediction capacity by comparing the T1 and T2 periods, which demonstrates that relation mining can be complementary in identifying authors based on similar themes and in discovering more potential collaborations and academic communities.


Introduction
Suitable partners or research followers are imperative in scientific research. Mining deeper author relationships in the academic network is achievable and can help scholars establish potential cooperative or reference relationships. The research field of vision can also be expanded, and the research content can be deepened. The establishment of a citation relationship among scholars is mainly based on the correlation of their research contents. If this relationship is deeply mined, potential partners could be found. Given that citation data are preserved completely and accurately in document databases, the processes and results of relationship mining would be feasible and reliable. As a mature quantitative research method in bibliometrics and scientometrics, citation analysis is extensively used in scientific evaluation, scholarly communications, academic behavior analysis, and information retrieval. Author citation analysis mainly includes three types: author co-citation (AC), author bibliographic coupling (ABC), and author direct citation (ADC), which are collectively called "tripartite citation analysis" in this study. For example, in a field, papers of Authors A and B are both cited by the same paper; thus, A and B have a co-citation relationship marked as AC (A, B). Authors C and D both cite the same paper in their respective articles; C and D thus have a bibliographic-coupling relationship marked as ABC (C, D). In addition, Author D cites a paper written by A in his bibliography, or vice versa; thus, D and A have a direct-citation or cross-citation relationship marked as ADC (A, D). In mining author relationships in scholarly networks based on tripartite citation analysis, two key questions should be addressed.
1. Which bibliographic-coupled or co-cited authors have not yet collaborated or do not cite each other regularly? If we call coupling and co-citation relationships potential communication relationships (PCR), and collaboration and direct citation actual communication relationships (ACR), could the discovery and usage of PCR contribute to the achievement of ACR? Furthermore, what is the quantitative relation between PCR and ACR? This concern is the first point to be investigated in this study.
2. In view of the similarities or diversity among tripartite citation relationships at the author level, how can tripartite relationships be synthetically used in discovering deeper author relationships that serve broader scholarly communication and relevant recommendations? Deducing, from these primary relationships, the integrated relationships between Authors A and C, B and D, or even B and C, together with the association strength of these potential relationships, is the second point to be answered in this study.
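The three primary relationships defined above can be illustrated with a minimal sketch. The toy records below are hypothetical (they are not data from this study), and the counting rules are one plausible reading of the definitions: AC counts citing papers that cite both authors, ABC counts shared cited papers, and ADC counts citations in either direction.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: paper id -> first author and cited paper ids.
papers = {
    "p1": {"author": "A", "refs": []},
    "p2": {"author": "B", "refs": []},
    "p3": {"author": "C", "refs": ["p1", "p2"]},
    "p4": {"author": "D", "refs": ["p1"]},
}

def pair(a, b):
    """Return an unordered author pair as a sorted tuple."""
    return tuple(sorted((a, b)))

ac, adc = Counter(), Counter()
refs_by_author = {}
for paper in papers.values():
    citing = paper["author"]
    # AC: two authors are co-cited when one paper cites both of them.
    cited_authors = {papers[r]["author"] for r in paper["refs"]}
    for a, b in combinations(sorted(cited_authors), 2):
        ac[(a, b)] += 1
    # ADC: the citing author cites a paper written by the cited author.
    for r in paper["refs"]:
        cited = papers[r]["author"]
        if cited != citing:
            adc[pair(citing, cited)] += 1
    refs_by_author.setdefault(citing, set()).update(paper["refs"])

# ABC: two authors are coupled when they cite the same paper.
abc = Counter()
for a, b in combinations(sorted(refs_by_author), 2):
    shared = len(refs_by_author[a] & refs_by_author[b])
    if shared:
        abc[(a, b)] = shared
```

In this toy case, A and B are co-cited by p3, C and D are coupled through p1, and D directly cites A, which mirrors AC (A, B), ABC (C, D), and ADC (A, D) in the text.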

Related studies
Separate study of tripartite citation analysis
AC analysis is the most commonly used method for the empirical analysis of disciplinary paradigms, and it has been frequently studied and improved. Numerous AC analyses have been conducted since Small [1] introduced document co-citation analysis and White and Griffith [2] developed AC analysis. Bibliographic coupling was proposed as early as 1963 [3]. However, the author coupling relationship did not gain considerable attention until it was formally proposed and empirically studied by Zhao and Strotmann [4]; the authors named this method ABC analysis, which can be used to complement AC analysis in comprehensively viewing the intellectual structure by mapping the research activities of active authors for a realistic picture of the current state of research in a field. In comparison with co-citation and bibliographic coupling, direct citation (sometimes also called inter-citation or cross-citation) is a citation relationship without a third-party paper. Paper-level direct citation has been used in different scenarios, such as research front detection [5][6], domain historiography mapping [7], and publication classification [8][9]. Boyack and Klavans [10] found that bibliographic coupling slightly outperforms co-citation analysis and that direct citation is the least accurate science mapping approach. Shibata et al. [11] revealed that direct citation could detect large and newly emerging clusters earlier, indicating that it exhibited the best performance in research front detection, whereas co-citation showed the worst. Numerous studies have focused on journal direct citation; several key research achievements have shown that journal direct citation can reveal the academic influence of journals, as well as the theme evolution and field division of periodicals [12][13]. Direct citation can also be used at the macrolevel, such as citation between subject categories, to build the global map of science [14]. Wang et al. [15] extensively studied ADC analysis, which can be used to determine author relationships from another angle and reveal the knowledge communication and disciplinary structure in scientometrics. This method was subsequently named ADC analysis by Yang and Wang [16].

Comparative study of tripartite citation analysis
The three types of citation analysis methods can reveal author relationships in a field in various ways. Some studies have focused on the comparative analysis of these methods, even comparing them with other author co-occurrence network analysis methods, such as co-authorship (CA), word-based author coupling (WAC), and journal-based author coupling (JAC). Related studies are especially represented by Lu, Yan, and Qiu et al. Lu and Wolfram [17] conducted a comparative study of word-based, topic-based, and author co-citation approaches to measure author research relatedness. Findings show that two word-based approaches produced similar outcomes, except in the case in which two authors were frequent co-authors for the majority of their articles, and that topic-based approach produced the most distinctive map. Yan and Ding [18] explored the similarities among six types of scholarly networks (bibliographic-coupling, direct citation, co-citation, topical, co-authorship, and co-word networks) aggregated at institution level; they also detected high similarity between co-citation and direct citation networks. Moreover, the authors recommended the use of hybrid or heterogeneous networks to study research interaction and scholarly communications. Qiu and Dong [19] constructed five types of author co-occurrence networks in the field of information library sciences, such as CA, WAC, JAC, AC, and ABC. In their research, the capabilities of different types of author co-occurrence relationships in revealing scientific structure are compared through hierarchical clustering and correlation analysis by quadratic assignment procedure (QAP) test. ABC analysis also exhibited a significant advantage in revealing discipline structure and presented the highest correlation with other networks. The idea of combining different author co-occurrence networks in scholarly communication and intellectual structure analysis was also proposed.

Combined study of tripartite citation analysis
The combination of these tripartite citation analysis methods (including AC, ABC, and ADC) has been extensively studied. Small [20] proposed a method for effectively combining them; however, only few researchers have adopted this combined linkage technique at a large scale. Persson [21] and Gómez-Núñez et al. [22][23][24] have attempted to combine these citation measures in a normalized manner to weigh existing direct citation relationships between articles or journals. According to Persson's research, direct citations weighted with shared references (bibliographic coupling) and co-citations at the article level could be better applied to domain intellectual structure detection. In addition, citation-based measure calculation and integration (involving co-citation, bibliographic coupling, and cross citation) at the journal level was also proposed and proven in the application of refining the journal classification, improving journal ranking, and further updating the subject classification structure proposed by Gómez-Núñez et al. At the author level, Wang [25] proposed a comprehensive and comparative approach by combining CA, AC, ABC, ADC, and author keyword coupling (AKC), supplemented by social network analysis (SNA), to evaluate the academic impact of the core authors in the field of scientometrics. Existing studies are focused on intellectual structure detection and optimization according to tripartite citation analysis. The assessment of the author scholarly impact by combining various citation analysis is also paid attention in few studies.

Mining author relationship in scholarly networks
Practical research on the discovery of potential author relationships in communication networks by tripartite citation analysis is limited. Currently, approaches for identifying potential collaboration mainly involve machine-learning techniques, link-prediction techniques, and SNA. Zhang and Yu [26] proposed supervised machine-learning approaches to predict research collaborations by the semantic features in the field of biomedicine and author network topological features, including co-authorship network connectivity, research profile similarity, collective productivity, and seniority. Chen and Fang [27] developed a latent collaboration index model for evaluating the collaboration probability among patent assignees by incorporating two network-related factors (i.e., degree and network distance) and complementary factors (i.e., assignees types, geographical distances, and topic similarities). Guns and Rousseau [28] introduced a method for predicting or recommending high-potential future collaborations based on a combination of link prediction and machine-learning techniques. Daud et al. [29] used discriminative and generative machine-learning techniques for predicting the emerging scholars in a co-author network based on three classes of features (i.e., author, venues, and co-authorship). These studies have focused on the use of combined relationships of direct citation and co-authorship in scholarly networks without considering other relation networks in discovering potential collaboration.

Data and methodology

Basic data
Scientometrics is an international journal, launched in 1978. The journal covers all aspects of scientometrics and published 46.31% of scientometrics research paper of the world [30]. Given the Scientometrics Journal as the representative communication channel in the field of scientometrics, the characteristic trends and patterns of the past decades in scientometric research become evident [31]. Bibliographic data from the journal of Scientometrics have formed the main data object in some of recent empirical studies focusing on mapping the intellectual structure [32] or detecting social network community [33] in the field of scientometrics. Therefore, this study also employed bibliographic data that cover all types of documents published in Scientometrics in 1978-2011 and 2011-2015 as representative experimental data object in the field of scientometrics. All data were retrieved from Web of Science (WOS). Data retrieval in the first period was completed in the middle of 2011; the data will be used in deduction and mining. Meanwhile, data retrieval in the second period was completed in the middle of 2015; the data will be used in verifying the results obtained by the first sample.
The first retrieval recalled a total of 2,989 documents, of which 2,982 include author information, 2,815 include references, and 2,812 include both author information and references. First authors are the most prominent contributors in most research studies. For example, when evaluating author influence levels, most studies considered only first authors (e.g., in uncovering knowledge communication [34] and in revealing implicit relationships [35]) because the cited references in the WOS database contain only the first listed author of the cited document. Moreover, a complex contribution allocation problem arises when all authors are considered in relation analysis. This problem has not been fully solved and is beyond the scope of this research. Thus, only the first author of each paper was considered in the current study. Counting only the first author in the citation data, the results include 35,796 citations, 16,057 cited authors, and 1,484 citing authors (as first signature identity in the publications). Each author's name is identified by surname and first initial only. The second dataset covers 1,318 documents in total, all of which include author information and 1,308 of which include references (involving 27,083 valid citations).

Methodologies
Thus far, a uniform standard for identifying core authors in scientometrics has not been developed. Lotka and Price identified excellent scientists according to the number of their published papers during the study on scientists' productivity and activity patterns [36]. Garfield treated those authors with high-cited frequency from SCI as excellent scientists [37]. Some scholars also adopted different approaches to evaluate core authors in information science; however, they all considered both the number of published papers and the cited frequency. Therefore, the present study identified 94 authors who have published 5 or more papers and received 10 or more citations as core authors from the first dataset.
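The core-author criterion stated above (5 or more papers and 10 or more citations) amounts to a simple two-threshold filter. The sketch below uses hypothetical author names and counts, not the study's data; the function name `select_core_authors` is likewise only illustrative.

```python
# Hypothetical per-author counts for illustration only.
author_stats = {
    "Author1": {"papers": 12, "citations": 340},
    "Author2": {"papers": 5, "citations": 10},
    "Author3": {"papers": 7, "citations": 4},
    "Author4": {"papers": 2, "citations": 95},
}

MIN_PAPERS = 5       # threshold stated in the text
MIN_CITATIONS = 10   # threshold stated in the text

def select_core_authors(stats, min_papers=MIN_PAPERS, min_citations=MIN_CITATIONS):
    """Keep authors who meet both the productivity and the citation threshold."""
    return sorted(
        name
        for name, s in stats.items()
        if s["papers"] >= min_papers and s["citations"] >= min_citations
    )

core_authors = select_core_authors(author_stats)
```

Both conditions must hold, so a highly cited author with few papers (Author4) or a productive but rarely cited author (Author3) is excluded.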
AC, ABC, and ADC analyses are used in discovering author relationships with co-citation, bibliographic coupling, and direct citation in scientometrics, respectively. CA and AKC analyses were also complementarily used to discover or verify author relationships in this study. AKC analysis was introduced by Liu et al. [38] and was formally proposed by Liu and Zhang [39]. This method was re-introduced and compared with CA and ABC analyses by Qiu and Wang [40], Liu and Wang [41], Song and Wu [42], and Yang et al. [43]. The AKC analysis is supposed to expand the keyword co-occurrence relationship at the author level; it can also be used to establish author relationships through the keyword coupling strength of authors' oeuvres. The oeuvres can be used to discover PCRs among authors bound by the same research themes and then describe the knowledge structure of a field or discipline.
The networks of CA and AC were directly constructed from their co-occurrence relationships in the same records. The ADC network comprised the two-way direct citing network between author pairs (i.e., symmetrized by summing the two directional citation values as the total correlation score). The citing and cited links should be treated equally as the direct relationship between author pairs. Thus, summing symmetrization was selected instead of the lowest or highest value method or an asymmetrical matrix. The symmetrizing process could be improved by involving the total number of citations and references of authors' publications to eliminate the effect of the absolute value. Although the original value reflects the actual situation, the direct citation frequencies should ideally be normalized. However, such normalization is left to another study, because researchers have not reached a consensus on which measure is most appropriate for normalization purposes. For ABC and AKC, basic matrixes, including the authors × cited references matrix and the authors × keywords matrix, were initially generated and then transformed into ABC and AKC networks via formulas, selecting the minimum method to calculate the coupling strength as suggested by Ma [44]. All of the original co-occurrence matrixes, including AC, ADC, ABC, CA, and AKC, are supplied in the Supporting Information (S1-S5 Tables corresponding to the period "Before 2011"; S6-S10 Tables corresponding to the period "After 2011"). Co-occurrence analysis and deductive reasoning methods are used in mining deeper and more potential author relationships based on the original tripartite citation analysis. A VBA program was used to process all types of citation analysis data. The final results of author relationship mining are visualized by the Network Workbench Tool software with MST-PathFinder Network Scaling. The use of PathFinder can simplify the network and highlight its important structural features and core associated nodes. This method was used in this study to highlight the visualization of the network and improve map readability.
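The two matrix constructions described above can be sketched as follows. This is a minimal illustration under stated assumptions: the ADC matrix is symmetrized by summing both directions, as the text describes, and the "minimum method" is read here as summing, over all shared items, the smaller of the two authors' frequencies. That reading is an assumption, and the function names and toy matrices are hypothetical.

```python
import numpy as np

def symmetrize_by_sum(directed):
    """ADC: add the citing and cited directions into one undirected score."""
    d = np.asarray(directed, dtype=float)
    sym = d + d.T
    np.fill_diagonal(sym, 0.0)  # ignore self-citation on the diagonal
    return sym

def coupling_min(occurrence):
    """ABC/AKC: coupling strength from an authors x items occurrence matrix.

    occurrence[i, k] is how often author i cites reference k (or uses
    keyword k). Assumed rule: sum over items of the pairwise minimum.
    """
    occ = np.asarray(occurrence, dtype=float)
    strength = np.minimum(occ[:, None, :], occ[None, :, :]).sum(axis=2)
    np.fill_diagonal(strength, 0.0)
    return strength

# Hypothetical toy inputs (3 authors).
directed_citations = [[0, 2, 0],
                      [1, 0, 3],
                      [0, 0, 0]]
author_by_reference = [[2, 1, 0],
                       [1, 1, 1],
                       [0, 0, 4]]

adc_sym = symmetrize_by_sum(directed_citations)
abc_strength = coupling_min(author_by_reference)
```

The same `coupling_min` helper applies to an authors × keywords matrix for AKC, since only the item type changes.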
The QAP is a unique method of measuring relationships in relational data. It compares the value of various corresponding elements in two (or more) squares and gives the Pearson correlation coefficient between two matrixes by comparing the corresponding grid values in each square [45]. A non-parametric test is performed on the coefficients based on the replacement of the matrix data. A comparison on proximity results in this study was conducted using QAP, and the statistics process (including centrality measurement) was annotated in the documentation of Ucinet software [46].
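A minimal sketch of a QAP correlation test is given below for readers without access to Ucinet. It is not the Ucinet implementation: the number of permutations, the one-tailed p-value, and the function names are assumptions chosen for illustration. The idea is to correlate the off-diagonal cells of two matrixes, then repeatedly permute the rows and columns of one matrix together to build a reference distribution.

```python
import numpy as np

def _offdiag(m):
    """Flatten all off-diagonal cells of a square matrix."""
    mask = ~np.eye(m.shape[0], dtype=bool)
    return m[mask]

def qap_correlation(x, y, n_perm=2000, seed=0):
    """Pearson correlation of two square matrixes with a permutation p-value."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    observed = np.corrcoef(_offdiag(x), _offdiag(y))[0, 1]
    n = x.shape[0]
    at_least_as_large = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        permuted = x[np.ix_(p, p)]  # relabel rows and columns together
        r = np.corrcoef(_offdiag(permuted), _offdiag(y))[0, 1]
        if r >= observed - 1e-12:
            at_least_as_large += 1
    # One-tailed p-value with the usual +1 correction.
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return observed, p_value

# Hypothetical symmetric toy matrix compared with itself.
_rng = np.random.default_rng(1)
_a = _rng.integers(0, 5, size=(6, 6))
demo = _a + _a.T
r_demo, p_demo = qap_correlation(demo, demo, n_perm=200)
```

Permuting rows and columns with the same index vector keeps each author's pattern of ties intact while breaking the alignment between the two networks, which is why ordinary significance tests on the cell values are not used.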

Study process

Discovery of PCRs
In this study, five original relation matrixes (CA, AC, ABC, ADC, and AKC) were first developed. These five matrixes were compared by QAP, and the result was saved and marked as QAP1. By excluding ACR (the CA and ADC matrixes) from PCR (the AC and ABC matrixes), the AC′ and ABC′ matrixes could be obtained (Fig 1). The AC′, ABC′, and AKC matrixes were re-compared, and the result was marked as QAP2. Then, a comparison between QAP1 and QAP2 was performed. AKC is based on the similarity of research themes (an "inherent connection"), whereas AC and ABC are based on citation relationships (an "exterior connection"). When the inherent and exterior connections are highly consistent with each other, the PCR is assumed to convert into ACR. To test this assumption, the results obtained by excluding the ACR relationships from PCR were compared with the AKC matrix (2011-2015) and the ACR matrix (2011-2015).
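The exclusion step described above (removing author pairs that already have ACR from a PCR matrix) can be sketched as a masking operation. The helper name `exclude_acr` and the toy matrixes are hypothetical, and treating any positive CA or ADC value as "has ACR" is an assumption.

```python
import numpy as np

def exclude_acr(pcr, *acr_matrices):
    """Zero out PCR cells for author pairs that already have any ACR tie."""
    pcr = np.asarray(pcr, dtype=float)
    has_acr = np.zeros(pcr.shape, dtype=bool)
    for m in acr_matrices:
        has_acr |= np.asarray(m) > 0
    return np.where(has_acr, 0.0, pcr)

# Hypothetical toy matrixes for three authors.
ac_matrix = [[0, 3, 2],
             [3, 0, 1],
             [2, 1, 0]]
ca_matrix = [[0, 1, 0],
             [1, 0, 0],
             [0, 0, 0]]
adc_matrix = [[0, 0, 0],
              [0, 0, 4],
              [0, 4, 0]]

ac_reduced = exclude_acr(ac_matrix, ca_matrix, adc_matrix)
```

Only the pair of authors 0 and 2 keeps its co-citation value here, because the other two pairs already collaborate or cite each other.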

Deep relationship mining between author pairs
In this study, the tripartite citation analysis could be applied in deep relationship mining at the author level. To make these relationships comparable, original relation matrixes should be normalized. The normalization method was based on Salton's cosine similarity measure, which results in similarity values ranging between 0 and 1. The following five steps (some of the steps are shown in Fig 2) aid in mining author relationships based on tripartite citation analysis, such as "A-C," "B-D," and "B-C," which have been discussed earlier. These steps could also be regarded as the algorithm in relation mining. The implication of each variable A, B, C, and D refers to the author of the matrix, L; Q refers to the relationship between the authors in the adjacency list O; and P refers to the relationship between the authors in the adjacency matrix.
… , Q_3i}, then marking one author in the pair {B_k, C_k} (also one author in the pair {L_2i, Q_2i}) as B_χ; the other author in the pair {B_k, C_k} (also one author in the pair {L_1i, Q_1i}) as C_χ; one author in the pair {L_3i, Q_3i} (also the other author in the pair {L_2i, Q_2i}) as A_χ; and the other author in the pair {L_1i, Q_1i} (also the other author in the pair {L_3i, Q_3i}) as D_χ. Finally, B_χ and C_χ could be connected through A_χ and D_χ, and the final relationship strength of B_χ and C_χ would be the highest of all the correlation scores (each equal to the product of X_k, Y_k, and Z_k). At this point, all relationships among the author pairs in {L_4i, Q_4i} have been established.
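The cosine normalization mentioned at the start of this subsection can be sketched as below. The text does not give the exact formula, so this is an assumption: each cell c_ij is divided by the square root of the product of the two authors' row totals, which keeps values between 0 and 1 for non-negative symmetric matrixes. The function name and toy matrix are hypothetical.

```python
import numpy as np

def salton_cosine(cooc):
    """Normalize a symmetric co-occurrence matrix to the range [0, 1].

    Assumed form: s_ij = c_ij / sqrt(c_i * c_j), where c_i is the row total
    of author i with the diagonal ignored.
    """
    c = np.asarray(cooc, dtype=float).copy()
    np.fill_diagonal(c, 0.0)
    totals = c.sum(axis=1)
    denom = np.sqrt(np.outer(totals, totals))
    # Leave cells at 0 where an author has no ties at all.
    return np.divide(c, denom, out=np.zeros_like(c), where=denom > 0)

# Hypothetical toy co-occurrence matrix for three authors.
toy = [[0, 4, 0],
       [4, 0, 2],
       [0, 2, 0]]
toy_norm = salton_cosine(toy)
```

Other variants of Salton's measure place each author's total citation count on the diagonal and use that as the denominator term; the choice should follow whichever totals are available for the data at hand.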
According to the above algorithm, potential relationships among no-direct-relationship core author set could be generated by the VBA program and Access databases. Finally, the comparison of new relationships and direct correlations (including CA, AC, ABC, ADC, and AKC) in 2011-2015 would be performed to identify the effectiveness of citation mining applied in the detection or promotion of more potential communications.
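The final mining step can be sketched in a few lines. This is not the authors' VBA and Access implementation; it is one reading of the description above, assuming the chain B-A (co-citation), A-D (direct citation), and D-C (bibliographic coupling) from the Introduction, with the B-C strength taken as the highest product of the three normalized link strengths over all intermediaries. The function name and toy values are hypothetical.

```python
import numpy as np

def mine_three_step(ac, adc, abc):
    """Strength of indirect B-C links through intermediaries A and D.

    For every ordered pair (b, c), take the maximum over distinct a, d of
    ac[b, a] * adc[a, d] * abc[d, c], then symmetrize with the maximum.
    """
    ac, adc, abc = (np.asarray(m, dtype=float) for m in (ac, adc, abc))
    n = ac.shape[0]
    best = np.zeros((n, n))
    for b in range(n):
        for c in range(n):
            if b == c:
                continue
            for a in range(n):
                for d in range(n):
                    if len({a, b, c, d}) < 4:
                        continue  # all four authors must be different
                    score = ac[b, a] * adc[a, d] * abc[d, c]
                    if score > best[b, c]:
                        best[b, c] = score
    return np.maximum(best, best.T)

# Hypothetical normalized matrixes; indices: A=0, B=1, C=2, D=3.
ac_n = np.zeros((4, 4)); ac_n[0, 1] = ac_n[1, 0] = 0.5    # AC(A, B)
adc_n = np.zeros((4, 4)); adc_n[0, 3] = adc_n[3, 0] = 0.8  # ADC(A, D)
abc_n = np.zeros((4, 4)); abc_n[2, 3] = abc_n[3, 2] = 0.5  # ABC(C, D)

mined = mine_three_step(ac_n, adc_n, abc_n)
```

The nested loops are quartic in the number of authors, which is tolerable for a core set of about a hundred authors but would need vectorizing for larger networks. Pairs that already have a direct relationship can be filtered out of `mined` afterwards.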

Results and discussion
Analysis results of AC, ABC, and ADC
According to the tripartite citation analysis of AC, ABC, and ADC, we obtained three original relation matrixes and the corresponding normalized matrixes (Fig 3).
These tripartite matrixes and the AKC matrix could be visualized by the Network Workbench Tool (Figs 4-7).
The core nodes in each network are different, as shown in Figs 4-7. In the AC network, "(Moed HF, Narin F, Vlachy J)-Garfield E-Braun T-Schubert A-Glanzel W-Egghe L-Rousseau R-Thelwall M" are core associated scholars, all of whom form the main path in the network. In the ABC network, the main associated scholars include "Schubert A-Glanzel W-Meyer M-Leydesdorff L," in which new core nodes, such as Garg KC, Bar-ilan J, Guan JC, and Zitt M, also emerge. In the ADC network, Leydesdorff L becomes the superior core node, and the associated path of "Abramo G-Glanzel W-Leydesdorff L-Bornmann L-Garfield E" becomes the main path. In the AKC network, "Schubert A-Rousseau R-Vinkler P-Glanzel W-Leydesdorff L" becomes the major associate scholars, and the key associations of Thelwall M, Guan JC, Zitt M, and Glanzel W are also reflected. These scholars are also at the heart of correlation formation among other scholars. Generally, the main connected path of "Garfield E-Schubert A-Glanzel W-Leydesdorff L" is more consistent in these four networks. However, the difference is also distinct, such that Garfield E is the main supporter for most core paths in the AC network, while the main supporter in the ABC and ADC networks are Glanzel W and Leydesdorff L, respectively. In view of the ADC revealing a more direct relationship, Leydesdorff L is more likely to be the builder of the potential connection.
As shown in Figs 4-7, author relevance was preliminarily identified by different citation methods. For example, Leydesdorff L has the strongest correlation with other authors in the direct citation network, whereas his core correlation in the three types of indirect network (PCR) is relatively low, and his cooperative correlation is even weak in some cases. Meanwhile, Narin F presents the highest correlation degree in the AC network, and his correlation degree is comparatively lower in other networks. Bornmann L and Sooryamoorthy R are strongly correlated in the ADC and AKC networks, respectively. However, their correlations are low in the other networks (Table 1). In Table 1, ND is the abbreviation of NrmDegree, which represents the normalized degree.

Results of PCR discovery
In this comparative study, AKC analysis was applied to produce the AKC matrix, for which the implementation process is similar to that of the ABC analysis (i.e., the authors are correlated with one another by indexing the same keywords). Table 2 presents the result of the QAP correlation test of the CA, AKC, AC, ABC, and ADC matrixes. The results show that the correlation between the CA and ABC matrixes is the strongest, followed by the AC and ADC matrixes, and the AC and ABC matrixes. These findings indicate that the current research and topic structures revealed within each of these three pairs are perhaps the most similar, or can be mutually complementary. In addition, the ABC matrix generally has the highest degree of correlation with all other relationship matrixes, which to some extent shows that the application of the ABC analysis can more accurately reveal the scientific structure of the disciplines. This may be one reason why numerous scholars use this analysis method to identify research groups and discover subject structure. This result supports the findings of Yan and Qiu [17][18], both of which revealed that ABC has nearly the highest similarity with other networks at the author level. Furthermore, the correlation coefficient of the AKC and ABC matrixes is at a middle level, which indicates that AKC and ABC analyses share similarities to a certain extent. This finding is consistent with the conclusion of Yang [37]. By excluding ACR connections (CA and ADC) from PCR (AC, ABC, and AKC), the matrixes AC′, ABC′, and AKC′ were obtained. The QAP correlation test for the three new matrixes was performed, with results shown in Table 2. The two groups of correlation strengths in successive QAP results were compared (marked in red). The comparison shows that the relation degrees among AC′, ABC′, and AKC′ also maintain significant correlations, especially AKC′ and ABC′, which share higher relevancy than the original matrixes. This condition indicates that the authors connected by PCR are likely to produce academic exchanges based on similar themes.
The three new matrixes with ACR connections from 2011 to 2015 were further compared in Table 3. According to the QAP analysis, the correlation coefficient between the new ACR (after 2011) and the old PCR (before 2011) is 0.225 (p<0.001). This result can also sustain the assumption about applying PCR in the detection of new academic exchanges. We converted the new relationships into author pairs and analyzed them with Pearson correlation. We found that the three previous relationships showed a more apparent correlation with the new PCR and ACR. Among those relationships, ABC has the strongest correlation and the highest predictability. In addition, the previous and new PCRs are consistent, and the correlation between the new PCR and ACR is significant. Meanwhile, ABC also reflects the highest correlation, followed by AC.
Further analysis of the author pairs before and after 2011 demonstrates that several author pairs have strong PCR correlations, such as Bordons M-Glanzel W, Katz JS-Leydesdorff L, and Braun T-Rousseau R. In addition, new ACR correlations appeared for these pairs after 2011, which suggests that the PCR relationship promotes the occurrence of the ACR relationship. The new main ACR author pairs are listed in Table 4. can be considered gaining more attention from colleagues and that more communication and linkages are established over time because of him.
To verify the existence of author relationship mining based on tripartite citation analysis proposed in this study, correlation analysis between the mining results and author direct relationship status (such as co-author and co-citation) was recently performed to reveal the predictive and practical value of the mining method and results. New AC, ABC, ADC, AKC, and CA matrixes from 2011 to 2015 have been investigated; and five matrixes, including CA, AC, ABC, ADC (symmetrized), and AKC were developed. Four author pairs could be identified according to the comparison of data mining before 2011(A-C and B-D), as well as evident relationship after 2011 (AKC@, CA@, AC@, ADC@, and ABC@), which are shown in Table 5. Although no co-authorship exists among these author pairs, the other direct relationships, such as AKC, ADC, and ABC, are still evident, especially Leydesdorff L and Prathap G.
For the normalization process, given the large number of zero values caused by the smaller amount of data within the limited time window, another standardization method was selected. Taking the AC matrix as an example: if the co-citation frequency between Authors A and B is x, the total frequency of A co-cited with all authors is m, and the total frequency of B co-cited with all authors is n, then the correlation strength between Authors A and B is x/m + x/n. By analogy, standardized matrixes of CA, AC, ABC, and ADC were obtained. Finally, a comprehensive correlation (CC) matrix was developed by adding the four types of correlation values. A Pearson correlation test was performed among author pairs of A-C, B-D, CC, and AKC, which correspond to two types of data sets, namely, A-C and B-D. The results are shown in Table 6, and the CC matrix was visualized using the Network Workbench Tool (Fig 11). MST-PathFinder Network Scaling was performed to show the network clearly. Therefore, the previously revealed correlations are not fully displayed. As shown in Table 6, certain degrees of positive correlation are observed among A-C, AKC, and CC, and between B-D and AKC, which could indicate that the indirect relationships mentioned above would turn into direct relationships to some extent in the near future. Scholars in the field have consciously or unconsciously paid close attention to, or formed citation links (including co-cited, coupled, and citing links) with, other scholars with whom they shared indirect rather than direct relationships, and have even produced substantial cooperation with one another. Notably, the correlation between the AKC and CC matrixes is relatively significant. This result can be compared with previous results shown in Tables 2 and 3, in which AKC is also significantly correlated with others, even though the related values are comparatively lower (except for ABC). Therefore, AKC analysis may help in revealing the evolution of the existing relationships. Meanwhile, the relationship mining method proposed in this study could aid in revealing unknown relationships and complement AKC analysis or other methods, such as topic analysis.
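The alternative standardization (x/m + x/n) and the CC matrix described above can be sketched as follows. The row totals m and n are taken here as off-diagonal row sums of the same matrix, which is an assumption about how "total frequency with all authors" is counted; the function names and toy matrixes are hypothetical.

```python
import numpy as np

def normalize_sum_ratio(matrix):
    """Standardize a symmetric count matrix as s_ij = x/m + x/n.

    x is the cell value, m and n are the row totals of the two authors.
    Authors with no ties get 0 instead of a division by zero.
    """
    c = np.asarray(matrix, dtype=float).copy()
    np.fill_diagonal(c, 0.0)
    totals = c.sum(axis=1)
    inv = np.divide(1.0, totals, out=np.zeros_like(totals), where=totals > 0)
    return c * inv[:, None] + c * inv[None, :]

def comprehensive_correlation(*matrices):
    """CC matrix: sum of the standardized relationship matrixes."""
    return sum(normalize_sum_ratio(m) for m in matrices)

def pairwise_pearson(a, b):
    """Pearson correlation over the upper-triangle author pairs of a and b."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    iu = np.triu_indices(a.shape[0], k=1)
    return np.corrcoef(a[iu], b[iu])[0, 1]

# Hypothetical toy matrixes for three authors.
ac_toy = [[0, 2, 2],
          [2, 0, 0],
          [2, 0, 0]]
ca_toy = [[0, 1, 0],
          [1, 0, 0],
          [0, 0, 0]]

ac_std = normalize_sum_ratio(ac_toy)
cc = comprehensive_correlation(ac_toy, ca_toy)
```

Unlike the cosine measure, this score ranges from 0 to 2, reaching 2 only when two authors are tied exclusively to each other, so CC values from four matrixes can be as high as 8.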

Conclusions
Various relationships exist in academic networks, such as CA, AC, ABC, ADC, and AKC. In a given field, the intensity and the associated attributes among scholars may exhibit significant differences in terms of the different network correlations. Some scholars showed strong ACR correlations, while some had key positions in PCR association. In this study, we compared the five types of matrixes by QAP and found that ABC has the nearly highest similarity with other networks. This finding can demonstrate the superiority of ABC analysis in revealing an academic community and its scientific structure. Furthermore, the correlation coefficient of ABC and AKC is higher than the coefficients among AKC and others, indicating that AKC and ABC can be complementarily applied in potential communication mining.
By comparing the relationship of ACR and PCR, a particular phenomenon was observed, in which only PCR existed among some scholars without ACR in the field of scientometrics, such as among Bonitz M, Nagpaul PS, Mccain KW, and Eto H. By analyzing PCR with the ACR correlations excluded (i.e., AC′, ABC′, and AKC′) by QAP, we found that the relationship degrees among AC′, ABC′, and AKC′ also maintain significant correlations, especially AKC′ and ABC′, which share higher relevancy than the original matrixes. This result indicates that the authors connected by PCR are likely to produce more academic exchanges and scientific innovations because of similar themes rather than social attribute association (e.g., teacher-student or co-worker relationships). Through Pearson correlation analysis, the case study confirmed that a significant correlation existed between the PCR that appeared before 2011 and the new ACR@ and PCR@, which appeared after 2011. Furthermore, continuity existed between PCR′ and PCR@, and the association between ACR@ and PCR@ was also significant. In particular, ABC′, ABC@, and other relations were highly correlated, which indicates that ABC analysis shows good potential for relationship prediction and discovery to some extent and may reflect actual communication.
On the basis of the algorithm design and the empirical analysis, deducing potential author relationships from the results of AC, ABC, and ADC analysis is feasible and practicable. For example, the relationship between Leydesdorff L and Prathap G revealed by A-C/B-D in the case study showed a high degree of correlation in practice after 2011 (including AC, ADC, and ABC relations, which did not exist before 2011). The author correlation between Breimer LH and Vaughan L obtained by two-time mining was also consistent with the new correlation in 2011, which again confirmed the validity and potential value of the proposed method for revealing author relationships.
The results presented above show that indirect relationships among interdisciplinary scholars or novice researchers can be mined by combining the three types of citation analysis, which supports specific scientific cooperation and broader communication. In comparison with the direct relationships revealed by Pearson correlation, the author mining method proposed in this study helps reveal unknown relationships and could complement AKC analysis or other methods, such as topic analysis. These methods could be applied to discover research fellows, explore potential partners, and track scholars with related research and their research directions.
In conclusion, this study attempted to discover PCRs. The correlations between the measurements suggest that establishing a co-citation or coupling relation may promote actual communication. This finding suggests that these two relations could identify potential collaboration partners for both individuals and teams. The proposed author relationship mining method based on tripartite citation analysis could also be an effective method for discovering future relationships among scholars and promoting scientific communication and innovation.
In addition, we recognize the limitations of the "core authors" dataset, which result from using only the first author as citation data and from setting thresholds on the number of publications and the citation frequency. These choices reflect the fact that only the first cited authors are tallied in the WOS database and the difficulty of obtaining a sufficient linking signal within a specific time window (e.g., the data for the first period, 1978-2011, were acquired in 2012 and are difficult to retrieve again at present). However, this paper was an attempt to propose an idea and a process for author relationship mining in the context of five types of scholarly networks; thus, the set of core authors targeted by this research is expected to be useful for applying the relationship mining method. Nevertheless, given the data limitation, a more credible empirical study with a sizable sample is needed to enhance the practicality and effectiveness of PCR discovery by tripartite citation analysis. Finally, the proposed method should also be applied in various fields. However, it was tested only in the field of scientometrics because of computational complexity, the amount of data obtained, and so on. In addition, some of the studies exhibit positive results that are applicable only in the field of scientometrics [47]. In the context of scientometrics, the results are easier to explain and to confirm rigorously. In further research, the proposed method should be applied in other fields to further confirm its effectiveness and rationality.