
HFKG-RFE: An algorithm for heterogeneous federated knowledge graph

Abstract

Federated learning ensures that data can be trained globally across clients without leaving the local environment, making it suitable for privacy-sensitive fields such as healthcare and finance. Knowledge graph technology provides a way to express the knowledge of the Internet in a form closer to the human cognitive world. Like many models, a knowledge graph embedding model requires a large amount of data for training. Data security has always been a focus of public attention and, driven by this situation, knowledge graphs have begun to be combined with federated learning. However, the combination of the two often faces the problem of statistical heterogeneity of federated data, which can degrade the performance of the trained model. Therefore, an algorithm for heterogeneous federated knowledge graphs (HFKG) is proposed that solves this problem by limiting model drift through contrastive learning. In addition, during training it was found that both the server aggregation algorithm and the performance of the client knowledge graph embedding model affect the overall performance of the algorithm. Therefore, a new server aggregation algorithm and a knowledge graph embedding model, RFE, are also proposed. This paper uses the DDB14, WN18RR, and NELL datasets and two methods of dataset partitioning to construct data-heterogeneous scenarios for extensive experiments. The experimental results show a stable improvement, proving the effectiveness of the federated algorithm HFKG, the knowledge graph embedding model RFE, and the federated knowledge graph relation embedding aggregation algorithm HFKG-RFE formed by combining the two.

1 Introduction

The knowledge graph has the ability to organize, manage, and understand massive information on the Internet. Its main idea is to embed triple entities and relationships into a continuous vector space and express the information in a form closer to the human cognitive world to benefit downstream tasks [1]. In recent years, knowledge graphs have been widely applied in fields such as intelligent question answering, intelligent recommendation, and information retrieval [2]. However, knowledge graphs composed of triplets are usually incomplete, and many studies embed the existing triplets into continuous vector spaces to predict the missing data [3]. The learning process of knowledge graph embedding requires a large amount of data, but data integration faces various problems such as industry competition, privacy and security concerns, and complex administrative procedures. Even achieving data integration between different departments of the same company faces numerous obstacles. In reality, it is almost impossible to integrate data scattered across various places and institutions, or the required cost is huge [4]. Therefore, there are barriers between data sources that are difficult to break down, resulting in the problem of data silos [5]. Federated learning was developed to address the issue of data silos, supporting federated machine learning models that meet user privacy protection, data security, data confidentiality, and government legal requirements [6]. Combining federated learning with knowledge graphs makes it possible to collaboratively learn knowledge graph embedding representations while keeping each client's data confidential when multiple knowledge graphs are distributed across different clients. This combination, however, often faces the problem of statistical data heterogeneity, meaning that the data trained by each participant follows a non-IID distribution [7].
Non-IID data causes clients to experience model drift during training [8,9], which reduces a client's ability to obtain information from other clients through federated learning [10]. Over multiple rounds of interactive training, the performance of the client's local model cannot be well optimized and improved, resulting in low accuracy of the trained model. Therefore, this paper proposes the algorithm HFKG-RFE to solve the problem of federated data heterogeneity. Two dataset partitioning methods are used on three datasets (DDB14, WN18RR, and NELL): random shuffling and splitting of triplets [11], and uneven partitioning of triplets [12]. Different numbers of clients are set up, and extensive experiments are conducted using four knowledge graph embedding models, TransE [13], DistMult [14], ComplEx [15], and RotatE [16], to show that HFKG-RFE can effectively solve the problem and that its performance is also improved. In summary, the main contributions of this paper are as follows:

  1. An algorithm, HFKG, is proposed to address the problem of statistical data heterogeneity by limiting model drift through contrastive learning, for the scenario of knowledge graph embedding under federated learning settings.
  2. The server aggregation algorithm of federated learning is improved to enhance the aggregation effect and optimize overall algorithm performance.
  3. A knowledge graph embedding model, RFE, is proposed to improve the local embedding accuracy of each client, thereby enhancing the overall performance of the federated knowledge graph algorithm.

2 Related work

2.1 Knowledge graph

A knowledge graph is a technique that uses graph models to describe the relationships between knowledge and modeled things [17-19]. Its main idea is to embed the entities and relationships of the knowledge graph into a continuous vector space. In existing works that combine federated learning with knowledge graphs, four classic knowledge graph embedding models are commonly used by local clients. The Euclidean embedding model TransE [13] represents the relationships and entities of triplets as vectors in the same space for model training; it performs well on large-scale knowledge graphs but cannot effectively handle complex relations. The tensor decomposition embedding model DistMult [14] restricts the bilinear transformation matrix and reduces the number of relation parameters in the latent variable model, achieving better performance in link prediction tasks; however, its combinatorial inference is overly simplified and it can only handle symmetric relations, which weakens its performance. The tensor decomposition embedding model ComplEx [15] introduces a complex vector space, enabling it to capture both symmetric and antisymmetric relations; however, the complex interactions between entities and relations require a large number of matrix and vector calculations, resulting in high computational complexity. The Euclidean embedding model RotatE [16] defines each relation as a rotation from the source entity to the target entity in complex vector space, allowing efficient and effective model training; however, it has only one rotation plane, which limits the performance achievable during embedding.

2.2 Federated learning

The common challenge faced when combining federated learning with knowledge graphs is statistical data heterogeneity between clients, which leads to model drift caused by the inconsistency between local model training and global model convergence [20-23]. Many algorithms have been proposed to solve this problem, including the commonly used federated learning algorithms Fedprox and Scaffold. Fedprox [24] adds a proximal term to the weights during client learning, and experiments have shown that it has better aggregation performance than Fedavg. Scaffold [25] computes the gradient of local data on the global model or reuses previously computed gradients, but compared with Fedavg it roughly doubles the communication size of each round. In existing work combining federated learning with knowledge graphs, client-server interaction is often carried out through the entities or relations of the dataset triplets to train the global model. Among entity-interaction approaches, FedE [26] trains models through entity interaction with the federated learning algorithm Fedavg, but is less effective when facing statistical data heterogeneity. FKGE [27] uses peer-to-peer joint embedding in a federated learning framework, which results in high communication costs. FedLU [12] performs heterogeneous knowledge graph entity embedding learning and unlearning, combining retroactive interference and passive decay to aggregate global models; however, the lifecycle of the knowledge graph affects the algorithm's continual learning and unlearning. Among relation-interaction approaches, FEDR [11] has demonstrated through attacks on federated models that federated entity embedding is far less secure than federated relation embedding.
Aggregating federated knowledge graph entity embeddings can lead to serious privacy breaches, whereas knowledge graph reconstruction attacks find it difficult to infer entities from relations, greatly reducing the probability of a privacy breach. In addition, the volume of shared relation queries is small and the communication cost is low. However, this algorithm does not address the problem of heterogeneous federated data.

The above work provides the corresponding ideas. From the perspective of client-server interaction through relations, which is more secure and has lower communication costs, a federated learning algorithm HFKG is proposed that uses the idea of embedding comparison to solve the problem of statistical data heterogeneity. In addition, an improved federated learning aggregation algorithm and a knowledge graph embedding model RFE are proposed to improve performance; combining these improvements yields the HFKG-RFE algorithm.

3 HFKG-RFE algorithm

The framework of the HFKG-RFE algorithm is shown in Fig 1, consisting of a server and multiple clients. Each client locally trains the knowledge graph embedding model on its own dataset and interacts with the server through a matrix of its unique relations. The server aggregates the relationship matrices uploaded by the clients and sends the result back to each client for model optimization and adjustment. Through continuous interactive training, the local knowledge graph embedding models learn more information and reach optimal performance.

Fig 1. Training process of HFKG-RFE algorithm.

https://doi.org/10.1371/journal.pone.0315782.g001

3.1 Federated algorithm HFKG

The overall process of HFKG is shown in Algorithm 1. First, clients are randomly selected according to the formula F × C, where F is the proportion of clients selected in each round and C is the total number of federated learning clients. The server sends the initialized relationship matrix to the selected clients, which perform local knowledge graph embedding model training and update the relationship matrix. The relationship matrix is then uploaded to the server for aggregation and updating. Through continuous interaction, each client learns more knowledge and model performance improves.

Algorithm 1: HFKG-RFE
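As an illustration of the round structure described above, the following is a minimal Python sketch. The function names (`hfkg_round`, `local_train`) are hypothetical, and the plain per-relation averaging is a simplification: the paper's actual server update is the weighted scheme of Sect. 3.3.

```python
import random

def hfkg_round(global_rel, clients, F, local_train):
    """One HFKG communication round (illustrative sketch).

    global_rel  : dict mapping relation id -> embedding (list of floats)
    clients     : list of client datasets
    F           : proportion of clients selected per round
    local_train : callable(client_data, rel_matrix) -> updated rel_matrix
    """
    # Select round(F * C) clients at random (at least one).
    k = max(1, round(F * len(clients)))
    selected = random.sample(range(len(clients)), k)

    # Each selected client trains locally from the current global matrix
    # and returns its updated relation embeddings.
    updates = [local_train(clients[i], dict(global_rel)) for i in selected]

    # Element-wise average over the clients that hold each relation
    # (simplified stand-in for the weighted aggregation of Sect. 3.3).
    new_global = {}
    for rid in global_rel:
        vecs = [u[rid] for u in updates if rid in u]
        if vecs:
            new_global[rid] = [sum(x) / len(vecs) for x in zip(*vecs)]
        else:
            new_global[rid] = global_rel[rid]
    return new_global
```

Repeating `hfkg_round` for a fixed number of rounds reproduces the interaction loop: download relation matrix, train locally, upload, aggregate.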

3.2 Client update

The four classic knowledge graph embedding models used by federated learning local clients are as follows: TransE [13], DistMult [14], ComplEx [15], RotatE [16]. The scoring function of the knowledge graph embedding benchmark model is shown in Table 1.

Table 1. Knowledge graph model rating function.

https://doi.org/10.1371/journal.pone.0315782.t001
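For reference, the four scoring functions summarized in Table 1 can be sketched in plain Python, with real-valued embeddings as lists of floats and ComplEx/RotatE embeddings as Python complex numbers. Higher scores indicate more plausible triplets; the function names are illustrative.

```python
import math

def score_transe(h, r, t):
    # TransE: -||h + r - t||_2
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def score_distmult(h, r, t):
    # DistMult: <h, r, t> = sum_i h_i * r_i * t_i
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))

def score_complex(h, r, t):
    # ComplEx: Re(<h, r, conj(t)>) in complex space
    return sum((hi * ri * ti.conjugate()).real for hi, ri, ti in zip(h, r, t))

def score_rotate(h, r, t):
    # RotatE: -||h o r - t||, r a unit-modulus rotation in the complex plane
    return -math.sqrt(sum(abs(hi * ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))
```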

The federated learning client conducts local training on its local data with the chosen knowledge graph embedding model; selecting a different embedding model changes the score function. When a selected client receives the relationship embedding Ec issued by the server, it adjusts its local model and updates the shared relationship table. For each triplet (h, r, t) in the client's local knowledge graph, a score is computed with the scoring function. The embedding process of the local client knowledge graph is shown in Fig 2, and the loss function is calculated from the positive and negative samples of the knowledge graph triplets.

Fig 2. Knowledge graph embedding optimization iteration process.

https://doi.org/10.1371/journal.pone.0315782.g002

The loss function of the client triplets selected according to the client ratio is calculated as Eq (1).

(1)  \mathcal{L} = -\log \sigma\big(\gamma + f(h, r, t)\big) - \sum_{i=1}^{n} p(h'_i, r, t'_i)\, \log \sigma\big(-f(h'_i, r, t'_i) - \gamma\big)

In the loss function formula, γ is a fixed margin hyperparameter, σ is the sigmoid function, and f is the knowledge graph scoring function. (h'_i, r, t'_i) is a negative sample generated for the knowledge graph triplet (h, r, t), and p(h'_i, r, t'_i) is the weight of the corresponding negative sample obtained by calculation, where α is the sampling temperature. The weight is defined as follows:

(2)  p(h'_j, r, t'_j) = \frac{\exp\big(\alpha\, f(h'_j, r, t'_j)\big)}{\sum_{i} \exp\big(\alpha\, f(h'_i, r, t'_i)\big)}

By calculating the gradient of the loss function and using an optimizer (such as the Adam optimizer), the model parameters are updated based on the gradient information to gradually reduce the loss, thereby locally optimizing and updating the knowledge graph embedding.
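A sketch of this client-side loss, assuming the standard self-adversarial negative sampling form of RotatE [16] that the surrounding definitions (margin γ, sampling temperature α, negative-sample weights) suggest; the function name and signature are hypothetical, and `pos_score`/`neg_scores` come from whichever scoring function the client has selected.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def self_adv_loss(pos_score, neg_scores, gamma, alpha):
    """Self-adversarial negative sampling loss (assumed RotatE-style form).

    pos_score  : score f(h, r, t) of the positive triplet
    neg_scores : scores of the sampled negatives (h', r, t')
    gamma      : margin hyperparameter
    alpha      : sampling temperature for the negative weights
    """
    # Softmax weights over negatives at temperature alpha (cf. Eq. 2).
    exps = [math.exp(alpha * s) for s in neg_scores]
    z = sum(exps)
    weights = [e / z for e in exps]

    # Margin-based log-sigmoid terms (cf. Eq. 1).
    loss = -math.log(sigmoid(gamma + pos_score))
    loss -= sum(w * math.log(sigmoid(-gamma - s))
                for w, s in zip(weights, neg_scores))
    return loss
```

Minimizing this loss pushes positive-triplet scores up and negative-triplet scores down, with harder negatives weighted more heavily.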

Inspired by MOON [6], contrastive learning is used to guide the embedding in solving the problem of statistical data heterogeneity. Traditional contrastive approaches do not consider the distance relationships between positive samples, negative samples, and anchors [28-31]. Therefore, the triplet selection method used in FaceNet [32] to solve face recognition under different poses and lighting conditions is adapted to the federated knowledge graph embedding scenario. Correct sample selection is crucial for model convergence. The target relationship samples required in this paper are the anchor relationship sample matrix, the positive relationship sample matrix, and the negative relationship sample matrix. The selected sample matrices are input into the triplet loss to measure the difference between the local and global model relationship matrices, thereby limiting model drift and addressing statistical data heterogeneity. Roughly aligned matched and unmatched relationship matrix triplets are generated, and the training process is shown in Fig 3. In each round, the federated learning algorithm is encouraged to better learn each client's knowledge graph by minimizing the distance between the relationship matrix issued by the server and the client's latest relationship matrix, and maximizing the distance between the client's latest relationship matrix and the relationship matrix obtained in the previous round of training.

Fig 3. Training process of triplet loss in federated learning.

https://doi.org/10.1371/journal.pone.0315782.g003

Specifically, the goal of triplet loss is to minimize the distance between anchor samples and positive samples, and maximize the distance between anchor samples and negative samples. The loss function is defined as:

(3)  \mathcal{L}_{triplet} = \sum_{i} \big[\, \lVert a_i - p_i \rVert_2^2 - \lVert a_i - n_i \rVert_2^2 + margin \,\big]_{+}, \quad [x]_{+} = \max(x, 0)

where a_i, p_i, and n_i are the corresponding rows of the anchor, positive, and negative relationship sample matrices.

By minimizing the triplet loss, relationship matrices of the same class move closer together while those of different classes become more dispersed, improving the ability of the federated learning algorithm to handle heterogeneous data.
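The triplet loss described above can be sketched as follows. The row-wise squared-Euclidean distances and the averaging over rows are assumptions, since the paper does not spell out the exact distance; the roles of the three matrices follow the anchor/positive/negative assignment of Fig 3.

```python
def triplet_loss(anchor, positive, negative, margin):
    """Triplet loss over relation-embedding matrices (sketch).

    Each argument is a list of equal-length embedding rows: the anchor is
    the client's latest relation matrix, the positive the server-issued
    global matrix, and the negative the client's previous-round matrix.
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    total = 0.0
    for a, p, n in zip(anchor, positive, negative):
        # Pull the anchor toward the positive, push it from the negative.
        total += max(0.0, sq_dist(a, p) - sq_dist(a, n) + margin)
    return total / len(anchor)
```

Adding this term to the client's embedding loss penalizes local updates that drift away from the global relation matrix.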

3.3 Server update

Before aggregation, the server obtains the IDs of all unique relationships from the local clients and maintains a relationship embedding table. This paper improves the aggregation method by considering the correlation between client and server data, so as to identify more valuable clients for aggregation. The server receives the initialized relationship embedding table uploaded by each client. To improve the aggregation efficiency of the federated server, the cosine similarity between the global relationship embedding and each client relationship embedding is calculated, together with the relationship existence vector. The specific calculation is shown in Eq. (4).

(4)

The existence vector captures the weight of each relationship across all clients, and the cosine similarity value is used to weight each client's relationship embedding vectors, so that the aggregation function considers not only the relationships between clients but also the relationships between the server and each client. The two are combined to compute each client's score in federated learning. Using Eq. (5), each client's proportion is obtained by division, and server aggregation is performed according to these ratios.

(5)

At the same time, the secure aggregation technique SecAgg [11] adopted by the FEDR algorithm is used in the aggregation stage to mask the relationship matrices uploaded by the clients, so that the server cannot read the real data, without affecting the aggregation result.
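A simplified sketch of the cosine-similarity-weighted aggregation of a single relation follows. The paper's exact combination with the relationship existence vector (Eqs. 4-5) and the SecAgg masking are not reproduced here; the function names and the zero-similarity fallback are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def aggregate_relation(global_vec, client_vecs):
    """Similarity-weighted aggregation of one relation (sketch).

    global_vec  : the server's current embedding of the relation
    client_vecs : embeddings uploaded by the clients holding this relation
    Returns the new global embedding as the similarity-weighted mean, so
    clients whose embeddings align with the server contribute more.
    """
    sims = [max(cosine(global_vec, v), 0.0) for v in client_vecs]
    z = sum(sims)
    if z == 0.0:  # fall back to a plain average if all similarities vanish
        sims, z = [1.0] * len(client_vecs), float(len(client_vecs))
    return [sum(s * v[i] for s, v in zip(sims, client_vecs)) / z
            for i in range(len(global_vec))]
```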

In summary, triplet selection is used to limit model drift and solve the problem of statistical data heterogeneity, and the server aggregation method is improved to enhance the aggregation performance of federated learning on the server side, providing better performance for algorithm training and application.

3.4 Knowledge graph embedding model RFE

To improve the aggregation effect on the federated learning server and to optimize the performance of the federated algorithm by improving the embedding performance of the clients, a knowledge graph embedding model RFE is proposed. The principle of vector invariance under rotation of entities and relations in three-dimensional space is used to map and split entity and relation vectors, and, combined with the calculation formulas of complex space, a better embedding completion effect is achieved. We project the head and tail entities h and t in three-dimensional space so that they satisfy the vector calculation formula. Two states of the relation r are considered: Fig 4 shows the state where r and t are perpendicular, and Fig 5 shows the state where they are not.

Fig 5. Rotation embedding in three-dimensional space.

https://doi.org/10.1371/journal.pone.0315782.g005

When the relationship r is perpendicular to the entity t, as shown in Fig 4, for each embedded element, the relationship is mapped, and the entity and relationship satisfy Eq. (6).

(6)

According to the formula for calculating the angle between vectors, obtain Eq. (7) and Eq. (8), where 𝜃 is the rotation angle.

(7)(8)

For each triplet (h, r, t), combined with the complex space calculation of rotating knowledge graph embedding, the distance function of knowledge graph embedding RFE is shown in Eq. (9), where  ∘  is the Hadamard product.

(9)

When the relationship r is not perpendicular to the entity t, as shown in Fig 5, the entity and relationship satisfy Eq. (10).

(10)

The relationship and tail entity are split in space to obtain Eq. (11), which is then converted into Eq. (12) according to the calculation in Eq. (9).

(11)(12)

Therefore, the scoring function of the knowledge graph embedding model RFE, combined with the Hadamard product calculation, expresses the rotating entity and relation mapping in complex space as Eq. (13), where ∘ is the Hadamard product.

(13)

4 Dataset construction

The experiment adopts three datasets from different fields: the medical database DDB14 [33]; the WN18RR [34] dataset, which covers the conceptual semantics and lexical relationships between English words; and the NELL [35] dataset, which contains factual knowledge extracted from hundreds of millions of web pages.

Considering that the knowledge graph datasets consist of unlabeled triplets without features, two partitioning methods are adopted to create data heterogeneity. The first randomly shuffles and splits the DDB14 and WN18RR datasets without replacement, so that no triplet is duplicated, and distributes the parts to the clients; random splitting ensures that the data across clients is heterogeneous [11]. The second assigns an uneven number of triplets to each client [12]. The dataset is divided into four configurations, with client numbers C of 5, 10, 15, and 20; the proportion of triplets per client in each configuration is shown in Fig 6. In addition, the triplets are split into training, validation, and test sets in proportions of 0.8, 0.1, and 0.1, respectively.

Fig 6. The proportion of triples in each client.

https://doi.org/10.1371/journal.pone.0315782.g006
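The two partitioning schemes and the 0.8/0.1/0.1 per-client split can be sketched as follows; the function names and the fixed seed are illustrative, and the per-client proportions for the uneven split are the ones shown in Fig 6.

```python
import random

def split_random(triples, num_clients, seed=0):
    """Method 1: shuffle the triples and split them without replacement,
    so no triple appears on two clients."""
    rng = random.Random(seed)
    shuffled = triples[:]
    rng.shuffle(shuffled)
    return [shuffled[i::num_clients] for i in range(num_clients)]

def split_uneven(triples, proportions, seed=0):
    """Method 2: give each client an unequal share of the triples;
    `proportions` are the per-client fractions (summing to 1)."""
    rng = random.Random(seed)
    shuffled = triples[:]
    rng.shuffle(shuffled)
    parts, start = [], 0
    for i, p in enumerate(proportions):
        end = (len(shuffled) if i == len(proportions) - 1
               else start + round(p * len(shuffled)))
        parts.append(shuffled[start:end])
        start = end
    return parts

def train_valid_test(triples, seed=0):
    """Per-client 0.8 / 0.1 / 0.1 split into train, validation, test."""
    rng = random.Random(seed)
    shuffled = triples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    a, b = round(0.8 * n), round(0.9 * n)
    return shuffled[:a], shuffled[a:b], shuffled[b:]
```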

5 Experiment and analysis

To evaluate the performance of the HFKG algorithm, the FEDR algorithm, which shares the same relationship matrix, and the Fedprox algorithm, which handles statistical data heterogeneity, were selected for comparison and analysis. In addition, for a fair comparison of the evaluation indicators, the corresponding hyperparameters are set to the same values. The evaluation indicators are Hits@1, Hits@3, Hits@10, and mean reciprocal rank (MRR).
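These indicators can be computed from the rank of each correct entity among all candidates; a minimal sketch (filtered-ranking details omitted, function name illustrative):

```python
def rank_metrics(ranks):
    """Link-prediction metrics from 1-based ranks of the correct entities.

    `ranks` holds, for each test triplet, the rank of the true entity
    among all candidates. Returns MRR and Hits@{1, 3, 10}.
    """
    n = len(ranks)
    mrr = sum(1.0 / r for r in ranks) / n
    hits = {k: sum(1 for r in ranks if r <= k) / n for k in (1, 3, 10)}
    return mrr, hits
```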

5.1 Federated algorithm HFKG

The experiment evaluates the effectiveness of the HFKG algorithm from three perspectives: the client, the server, and the client combined with the server.

5.2 Client training

The experiment uses FEDR as the baseline, and Table 2 shows the link prediction results when each of the three datasets is divided among C=10 clients. Bold numbers indicate the best or comparable performance. Compared with FEDR, HFKG generally achieves better or similar evaluation metrics. Taking NELL as an example, the MRR of HFKG increases by 13.02%, 3.99%, 0.45%, and 2.94% for the four embedding models, respectively. The method proposed in this paper to solve data heterogeneity by minimizing the distance between samples is therefore effective, and it is suitable for settings with a large number of clients.

Table 2. Performance of HFKG (triplet selection) on multiple clients divided by different datasets.

https://doi.org/10.1371/journal.pone.0315782.t002

5.3 Server aggregation

An improved server aggregation algorithm is proposed, with the commonly used weighted aggregation algorithm as the baseline. Table 3 shows the link prediction results when each of the three datasets is divided among C=10 clients. The evaluation metrics show that HFKG performs better. Taking NELL as an example, the MRR of HFKG increases by 12.92%, 5.09%, 1.83%, and 4.67% for the four embedding models, respectively. This indicates that weighting the clients' relationship embedding vectors by cosine similarity is effective and helps the server better aggregate the relationship matrices uploaded by the clients.

Table 3. Performance of HFKG (aggregation) on multiple clients divided by different datasets.

https://doi.org/10.1371/journal.pone.0315782.t003

5.4 Federated algorithm HFKG

The HFKG algorithm proposed in this paper combines the client-side improvement and the server aggregation improvement. Table 4 shows the link prediction results when each of the three datasets is divided among C=10 clients. Compared with FEDR, HFKG generally improves all four evaluation indicators, indicating that it performs better than the federated algorithm Fedavg used by FEDR and that aggregation efficiency is also improved.

Table 4. Performance of HFKG on multiple clients divided by different datasets.

https://doi.org/10.1371/journal.pone.0315782.t004

To evaluate how the HFKG algorithm handles federated data heterogeneity, the DDB14 and WN18RR datasets were divided among C=5, 10, 15, and 20 clients, and the federated learning algorithm Fedprox was selected for comparison and analysis. The data in Table 5 show that HFKG has better link prediction performance.

In addition, the NELL dataset was divided into different numbers of client triplet partitions, and TransE and RotatE were selected as the client knowledge graph embedding models for comparison, as shown in Table 6. Taking MRR as an example, the relative increases over Fedprox are 8.72%, 6.08%, 5.67%, and 3.16% for TransE and 4.15%, 2.49%, 3.06%, and 2.55% for RotatE, respectively. Overall, this demonstrates the ability of HFKG to handle statistical data heterogeneity.

In the above comparisons, the knowledge graph embedding models trained by HFKG on the three datasets achieve good performance improvements, indicating that the proposed HFKG algorithm effectively prevents client model drift and mitigates statistical data heterogeneity. However, some individual Hits@N indicators (N = 1, 3, 10) did not improve. MRR, which averages the reciprocal ranks, did improve, indicating that the overall proportion of correct triplets ranked highly in link prediction is large and that the HFKG algorithm is effective.

5.5 Knowledge graph RFE

The knowledge graph embedding model RFE is combined with the FEDR and Fedprox federated algorithms respectively, and its embedding performance is compared with the four classic knowledge graph embedding models frequently used in existing work. Table 7 reports results on the DDB14, WN18RR, and NELL datasets with mean reciprocal rank (MRR) as the metric. The bolded data represent the best knowledge graph embedding performance, indicating that the RFE model achieves significant results on the federated learning clients.

Table 7. Combination of RFE embedding model and federated algorithm.

https://doi.org/10.1371/journal.pone.0315782.t007

5.6 Algorithm HFKG-RFE

The federated learning algorithm HFKG and the knowledge graph embedding model RFE are integrated to obtain the algorithm HFKG-RFE, which combines the advantages of both. The algorithm is extensively tested on the DDB14, WN18RR, and NELL datasets to demonstrate its effectiveness. The bold data in Table 8 represent the best results among all algorithm settings.

Table 8. Comparison of algorithm performance results.

https://doi.org/10.1371/journal.pone.0315782.t008

Table 8 reports the MRR indicator, for which larger values mean higher predicted rankings. HFKG-RFE performs well on all three datasets. Moreover, as the number of clients increases, the data becomes sparser and the scenario more complex, yet the algorithm's performance advantage persists. Compared with the other algorithm settings, the link prediction results on the heterogeneous datasets improve steadily, indicating that the algorithm HFKG-RFE is effective.

6 Conclusion

This paper mainly studies aggregating relationship matrices across clients to improve client link prediction and complete the knowledge graph while the data never leaves the client's local environment. The HFKG algorithm is the first to introduce the triplet comparison idea from face recognition to address data heterogeneity between federated learning clients. In addition, the improved server aggregation algorithm ensures that clients with high contribution and value receive a larger proportion during aggregation, thereby guaranteeing the aggregation effect. At the same time, three-dimensional space mapping is combined with the Hadamard product and complex space calculation to produce the knowledge graph embedding model RFE, which achieves significant embedding effects locally on the client side. Extensive experimental data show that the proposed algorithm HFKG-RFE is effective. Although adding triplet samples slightly increases the computational cost, it improves model accuracy without increasing communication costs. Reducing the computational and collaboration costs of the algorithm is left to future work. In practical applications, clients may use multiple models simultaneously, which may affect overall performance through embedding aggregation. How to ensure stable federated learning performance in this situation and how to better defend against external attacks are the future research directions of this paper.

References

  1. Cao J, Fang J, Meng Z, et al. Knowledge graph embedding: A survey from the perspective of representation spaces. ACM Comput Surv. 2024.
  2. Ligabue PDM, Brando AAF, Peres SM. Applying a context-based method to build a knowledge graph for the blue Amazon. Data Intelligence. 2024(1).
  3. Ganchev I. Knowledge graph embedding using a multi-channel interactive convolutional neural network with triple attention. Mathematics. 2024;12(12):1821.
  4. Chai H, Huang Y, Xu L, Song X, He M, Wang Q. A decentralized federated learning-based cancer survival prediction method with privacy protection. Heliyon. 2024;10(11):e31873. pmid:38845954
  5. Milasheuski U, Barbieri L, Tedeschini BC. On the impact of data heterogeneity in federated learning environments with application to healthcare networks. IEEE Trans Artif Intell. 2024.
  6. Li Q, He B, Song D. Model-contrastive federated learning. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2021. p. 10713–22.
  7. Wu X, Pei J, Han X-H, Chen Y-W, Yao J, Liu Y, et al. FedEL: Federated ensemble learning for non-iid data. Expert Syst Appl. 2024;237:121390.
  8. Kang M, Kim S, Jin KH. FedNN: Federated learning on concept drift data using weight and adaptive group normalizations. Pattern Recogn. 2024;149.
  9. Ozfatura E, Ozfatura K, Gunduz D. FedADC: Accelerated federated learning with drift control. 2020.
  10. Sabah F, Chen Y, Sarwar AR. Model optimization techniques in personalized federated learning: A survey. Expert Syst Appl. 2024;243(1):122874.
  11. Zhang K, Wang Y, Wang H, et al. Efficient federated learning on knowledge graphs via privacy-preserving relation embedding aggregation. arXiv preprint. 2022.
  12. Zhu X, Li G, Hu W. Heterogeneous federated knowledge graph embedding learning and unlearning. Proceedings of the ACM web conference 2023; 2023. p. 2444–54.
  13. Bordes A, Usunier N, Garcia-Duran A, et al. Translating embeddings for modeling multi-relational data. Adv Neural Inform Process Syst. 2013;26.
  14. Yang B, Yih W, He X, et al. Embedding entities and relations for learning and inference in knowledge bases. arXiv preprint arXiv:1412.6575. 2014.
  15. Trouillon T, Welbl J, Riedel S, et al. Complex embeddings for simple link prediction. Proceedings of the international conference on machine learning; 2016. p. 2071–80.
  16. Sun Z, Deng ZH, Nie JY. RotatE: Knowledge graph embedding by relational rotation in complex space. arXiv preprint. 2019.
  17. Shokrzadeh Z, Feizi-Derakhshi MR, Balafar MA. Knowledge graph-based recommendation system enhanced by neural collaborative filtering and knowledge graph embedding. Ain Shams Eng J. 2024;15(1):102263.
  18. Wang H, Ren H, Leskovec J. Relational message passing for knowledge graph completion. Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining; 2021. p. 1697–707. doi: https://doi.org/10.1145/3447548.3467247
  19. Fang H, Wang Y, Tian Z. Learning knowledge graph embedding with a dual-attention embedding network. Expert Syst Appl. 2023;212:118806.
  20. Jang J, Ha H, Jung D. Fedclassavg: Local representation learning for personalized federated learning on heterogeneous neural networks. Proceedings of the 51st international conference on parallel processing; 2022. p. 1–10.
  21. Bonawitz K, Ivanov V, Kreuter B, et al. Practical secure aggregation for privacy-preserving machine learning. Proceedings of the 2017 ACM SIGSAC conference on computer and communications security; 2017. p. 1175–91.
  22. Yang Q, Liu Y, Chen T. Federated machine learning: Concept and applications. ACM Trans Intell Syst Technol (TIST). 2019;10(2):1–19.
  23. Li Q, Wen Z, Wu Z. A survey on federated learning systems: Vision, hype and reality for data privacy and protection. IEEE Trans Knowl Data Eng. 2021;35(4):3347–66.
  24. Li T, Sahu AK, Zaheer M, et al. Federated optimization in heterogeneous networks. Proc Mach Learn Syst. 2020;2:429–50.
  25. Karimireddy SP, Kale S, Mohri M. Scaffold: Stochastic controlled averaging for federated learning. Proceedings of the international conference on machine learning; 2020. p. 5132–43.
  26. Chen M, Zhang W, Yuan Z, et al. Embedding knowledge graphs in federated setting. Proceedings of the 10th international joint conference on knowledge graphs; 2021. p. 80–8.
  27. Peng H, Li H, Song Y. Differentially private federated knowledge graphs embedding. Proceedings of the 30th ACM international conference on information & knowledge management; 2021. p. 1416–25.
  28. Fu L, Zhang H, Gao G. Client selection in federated learning: Principles, challenges, and opportunities. IEEE Internet Things J. 2023.
  29. Hanzely F, Richtárik P. Federated learning of a mixture of global and local models. arXiv. 2020.
  30. Seo H, Park J, Oh S, et al. Federated knowledge distillation. Mach Learn Wireless Commun. 2022:457.
  31. Zidi I, Issaoui I, El Khediri S, Khan RU. An approach based on NSGA-III algorithm for solving the multi-objective federated learning optimization problem. Int J Inf Technol. 2024;16(5):3163–75.
  32. Schroff F, Kalenichenko D, Philbin J. FaceNet: A unified embedding for face recognition and clustering. Proceedings of the IEEE conference on computer vision and pattern recognition; 2015.
  33. Wang H, Ren H, Leskovec J. Relational message passing for knowledge graph completion. Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining; 2021. p. 1697–707.
  34. Dettmers T, Minervini P, Stenetorp P. Convolutional 2D knowledge graph embeddings. Proceedings of the AAAI conference on artificial intelligence; 2018;32(1).
  35. Xiong W, Hoang T, Wang WY. DeepPath: A reinforcement learning method for knowledge graph reasoning. arXiv preprint. 2017.