Abstract
The integration of Internet of Things (IoT) devices and electronic medical records (EMRs) has transformed healthcare delivery but has also created new vulnerabilities to cyberattacks that threaten both data confidentiality and patient safety. Conventional centralized machine learning approaches for intrusion detection are impractical in this domain due to strict privacy regulations, heterogeneous data sources, and the risk of single points of failure. To address these challenges, we propose a secure distributed machine learning pipeline for cyber-resilient healthcare systems. The framework combines federated optimization with split learning for sensitive EMR data, robust aggregation to mitigate poisoned updates, and differential privacy with secure aggregation to protect against inference attacks. Multimodal fusion is enabled through temporal consistency regularization for IoT traffic and cross-layer contrastive alignment to link EMR representations, ensuring improved anomaly detection across diverse healthcare environments. Experiments conducted on representative IoT and EMR datasets demonstrate that the proposed pipeline achieves accuracy of 0.942 on IoT data, 0.931 on EMR data, and 0.953 in the combined setting, with corresponding F1-scores of 0.921, 0.908, and 0.932. Ranking metrics further confirm superiority with AUROC up to 0.961 and AUPRC up to 0.947, outperforming deep baselines by margins of +0.025 to +0.033. Robustness analysis shows graceful degradation under client poisoning (accuracy 0.879 at 30% malicious clients) and resilience under severe communication constraints (accuracy 0.861 at 90% update sparsification). Detection latency is reduced to an average of 5.9 time steps, compared to 7.8 for the strongest deep baseline. These results highlight that secure distributed pipelines can deliver both strong detection capabilities and regulatory compliance, providing a practical path toward safeguarding next-generation healthcare infrastructures against evolving cyber threats.
Citation: Tanvir MIM, Rabby HR, Arif MH, Nadia NY, Nur K (2026) Privacy-preserving multimodal federated learning pipeline for cyber-resilient healthcare systems. PLoS One 21(4): e0343669. https://doi.org/10.1371/journal.pone.0343669
Editor: Brian Patrick Weaver, PLOS: Public Library of Science, UNITED STATES OF AMERICA
Received: September 30, 2025; Accepted: March 20, 2026; Published: April 10, 2026
Copyright: © 2026 Tanvir et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All datasets used in this study are publicly available on Kaggle at https://www.kaggle.com/datasets/faisalmalik/iot-healthcare-security-dataset and https://www.kaggle.com/datasets/saurabhshahane/mlbased-cyber-incident-detection-for-emr.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
The digital transformation of healthcare systems has ushered in an era where patient care, medical decision-making, and operational efficiency increasingly rely on interconnected devices, electronic medical records (EMRs), and intelligent monitoring platforms [1,2]. While this integration of cyber and physical infrastructures improves accessibility and quality of healthcare, it also introduces unprecedented vulnerabilities. The proliferation of Internet of Things (IoT) medical devices and the extensive use of EMRs expose critical systems to cyber threats ranging from denial-of-service and data exfiltration to advanced persistent intrusions [3]. In this context, cyberattacks not only endanger data confidentiality but also pose direct risks to patient safety, making security and resilience indispensable pillars of modern healthcare infrastructure [4].
Against this backdrop, a specific research challenge emerges: how can machine learning pipelines be designed to simultaneously provide accurate anomaly detection, safeguard patient privacy, and remain resilient under adversarial and distributed conditions? Traditional centralized machine learning approaches fall short because they require aggregating sensitive data across institutions, which is infeasible due to strict privacy regulations and logistical barriers [5,6]. Federated learning and related distributed methods offer promising directions, yet they remain vulnerable to poisoning, communication bottlenecks, and performance degradation when dealing with heterogeneous data sources such as IoT streams and EMRs [7].
The objective of this work is to design and evaluate a secure distributed machine learning pipeline that addresses these challenges by integrating multimodal data fusion, federated optimization, split learning for sensitive EMR data, and advanced privacy-preserving mechanisms [8]. The motivation is twofold: first, to improve the cyber-resilience of healthcare systems against malicious intrusions and, second, to establish a privacy-aware framework that complies with healthcare regulations while delivering state-of-the-art detection performance.
The significance of this research lies in its potential to redefine how healthcare systems defend against cyber threats. By unifying distributed optimization, robust aggregation, differential privacy, and multimodal data processing, the proposed framework demonstrates that it is possible to build secure and resilient pipelines without compromising the predictive power of deep learning [9]. This work contributes to the growing body of research at the intersection of cybersecurity, distributed learning, and healthcare informatics, addressing critical gaps in prior studies that often considered these aspects in isolation [10].
Methodologically, the framework combines IoT patient monitoring traffic and EMR datasets within a federated setting. Local clients train partial models on-site, leveraging temporal consistency regularization for IoT data and cross-layer contrastive alignment to fuse EMR features. Robust aggregation rules mitigate adversarial updates, while differential privacy noise addition and secure aggregation ensure compliance with privacy constraints. The overall architecture is evaluated using extensive metrics such as accuracy, precision, recall, F1-score, Area Under the Receiver Operating Characteristic curve (AUROC), and Area Under the Precision–Recall Curve (AUPRC), as well as resilience benchmarks under adversarial and resource-constrained settings.
The motivation behind this study arises from three critical challenges observed in real-world healthcare environments. First, cyber threats targeting hospitals and medical devices have increased significantly in recent years, with ransomware attacks and data manipulation incidents posing direct risks to patient safety and operational reliability [1,3]. Second, healthcare data are inherently distributed across hospitals, departments, and edge medical devices, making centralized machine learning approaches both impractical and non-compliant with privacy regulations such as HIPAA and GDPR [8,9]. Third, most existing intrusion detection research relies on a single data source—either network traffic or clinical records—failing to capture coordinated cross-layer attack patterns that span IoT devices and electronic medical records [11,12]. These challenges highlight the need for a secure learning pipeline that can jointly analyze multimodal healthcare data, operate in federated environments without sharing raw patient information, and remain resilient under adversarial threats.
1.1. Key contributions
To clearly articulate the contributions, the key highlights of this paper are summarized as follows:
- We propose a novel secure distributed machine learning pipeline that integrates federated optimization, split learning, and privacy-preserving mechanisms tailored for healthcare systems, achieving accuracy up to 0.953 and F1-score 0.932 on combined IoT and EMR data.
- We design multimodal fusion techniques combining IoT traffic and EMR data through temporal consistency regularization and cross-layer contrastive alignment, improving AUPRC from 0.919 (deep fusion baseline) to 0.947.
- We introduce resilience protocols against adversarial conditions such as poisoning attacks, communication dropouts, and latency constraints, maintaining accuracy of 0.879 with 30% malicious clients and 0.861 under 90% communication sparsification, while reducing IoT detection latency to 5.9 time steps (vs. 7.8 for CNN baseline).
- We conduct extensive experiments demonstrating superior detection performance, privacy guarantees, and resilience compared to conventional baselines, with consistent margins of +0.025 to +0.033 improvements in accuracy and F1-score across datasets.
The remainder of this paper is organized as follows. Section 2 reviews existing work on distributed learning and healthcare cybersecurity. Section 3 presents the proposed pipeline, including preprocessing, architecture, optimization, and privacy mechanisms. Section 4 reports quantitative and qualitative results, along with ablation studies and resilience evaluations. Section 5 discusses the novelty, implications, limitations, and potential future directions of this research. Finally, Section 6 concludes the paper with closing remarks.
2. Related work
The increasing reliance on digital infrastructures in healthcare has made the security and resilience of medical systems an important research focus. Early approaches to securing healthcare systems primarily relied on network intrusion detection and rule-based anomaly monitoring [13]. These traditional methods provided a foundation for identifying known attack signatures but struggled with zero-day threats, evolving adversarial strategies, and the growing scale of IoT-enabled medical devices [14]. With the emergence of machine learning, researchers began applying classification and clustering algorithms to detect malicious activity within medical data. These methods improved over heuristic systems but remained limited in scalability, adaptability, and robustness against sophisticated cyberattacks [15].
In recent years, deep learning-based intrusion detection frameworks have gained prominence. Convolutional, recurrent, and transformer-based architectures have been applied to analyze both temporal signals from patient monitoring devices and structured medical data [16]. While these models demonstrate strong predictive power, they often rely on centralized data aggregation, which is impractical in healthcare due to strict privacy regulations, data ownership concerns, and the risks associated with transferring sensitive patient information across institutions. Moreover, centralization introduces a single point of failure, increasing the vulnerability of healthcare systems to data breaches [17].
To overcome these challenges, distributed learning has emerged as a promising paradigm. Federated learning (FL) enables local model training on client devices while sharing only model updates with a central server [11]. This preserves data locality and mitigates privacy risks, making it highly suitable for healthcare systems where patient records are inherently distributed [18]. Various FL extensions address issues such as data heterogeneity, communication efficiency, and robustness against malicious clients. However, federated learning alone can still be susceptible to inference and poisoning attacks, highlighting the need for stronger privacy protection [12].
Recent work has extended FL toward cybersecurity and healthcare applications. Federated intrusion detection frameworks have been developed to detect large-scale cyber threats collaboratively without centralizing data [19]. Similarly, FL-based systems have been applied to defend against distributed denial-of-service (DDoS) attacks [20]. Privacy threats in FL are being mitigated using improved mechanisms such as randomized response-based differential privacy [21]. The practical feasibility of FL for real-world healthcare data has also been demonstrated in live hospital environments [22].
Complementary to FL, split learning enforces stricter privacy by partitioning models between client and server. Clients perform forward propagation up to an intermediate layer, share only the resulting activations (smashed data), and receive gradients during backpropagation. This ensures that raw patient data never leave the local site [23]. However, split learning introduces additional communication overhead and its efficiency depends on the chosen cut layer [24].
Another relevant research direction involves privacy-preserving mechanisms such as differential privacy and secure aggregation. Differential privacy introduces calibrated noise to model updates, reducing the risk of reconstructing sensitive data, while secure aggregation ensures that the server observes only the aggregated updates [25]. These mechanisms, when integrated into federated or split learning, form the foundation of privacy-aware distributed systems. Nevertheless, maintaining high model accuracy while enforcing privacy guarantees remains challenging, particularly under adversarial conditions [26].
The integration of multimodal data for anomaly detection has also attracted attention. IoT device streams and EMR records provide complementary signals that can improve the detection of cyber intrusions when analyzed jointly. Methods incorporating cross-modal attention, contrastive learning, and temporal regularization have been explored to align heterogeneous modalities [27]. However, many existing studies still rely on a single modality—either IoT or EMR—limiting their effectiveness in capturing cross-layer attack behavior [28].
Finally, research on adversarial resilience has advanced through robust aggregation, adversarial training, and client anomaly scoring [29,30]. These techniques mitigate poisoning, inversion, and dropout attacks, but most are studied in isolation. A comprehensive solution integrating privacy, robustness, multimodal learning, and distributed optimization remains underexplored.
Overall, existing research has achieved notable progress in securing healthcare data using machine learning. Yet, there is still a gap in unifying multimodal data fusion, distributed optimization, adversarial robustness, and privacy preservation within a single, cohesive framework. This study addresses that gap by proposing and evaluating a secure distributed learning pipeline specifically designed for healthcare cyber-resilience.
Table 1 summarizes key prior studies, highlighting their main focus, applied techniques, and existing limitations addressed by our proposed framework.
3. Methodology
This section describes our distributed learning pipeline for joint IoT and EMR cyber incident detection, covering the data preprocessing steps, problem setup, architectures, learning objectives, secure aggregation, robustness measures, and the training protocol. Fig 1 sketches the overall system architecture, while Fig 2 shows the end-to-end data pipeline that precedes learning.
3.1. Data preprocessing
The preprocessing stage was critical to ensure the IoT traffic data (Attack, environmentMonitoring, and patientMonitoring CSVs) and the EMR anomaly data (payload-combined CSV) were prepared for robust training in our secure distributed pipeline. Both datasets contained heterogeneous features, requiring systematic cleaning, transformation, and partitioning.
3.1.1. Data cleaning and integration.
Each dataset was loaded into a structured pipeline. Missing values in categorical and numerical features were treated separately. For categorical attributes $x_c$, mode imputation was applied:

$$\hat{x}_c = \arg\max_{v \in V} \operatorname{count}(v),$$

where $V$ is the set of possible values for feature $x_c$. For numerical features $x_n$, mean imputation was performed:

$$\hat{x}_n = \frac{1}{N} \sum_{i=1}^{N} x_n^{(i)},$$

with $N$ being the number of valid records.
After imputation, duplicate records were removed, and all datasets were integrated under a unified schema with labels indicating normal (0) or anomalous (1) activity.
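As a concrete illustration of the two imputation rules above, the following pure-Python sketch operates on hypothetical column lists in which missing entries are `None`; the function names and data representation are ours, not part of the released pipeline:

```python
from collections import Counter

def impute_mode(values):
    """Mode imputation for a categorical column: replace None with the
    most frequent observed value (x_c_hat = argmax_v count(v))."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]

def impute_mean(values):
    """Mean imputation for a numerical column: replace None with the
    mean of the N valid records."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]
```

In a real pipeline these operations would typically run per-column over a dataframe before deduplication and schema unification.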
3.1.2. Feature encoding and normalization.
Categorical EMR features (e.g., race, gender, ethnicity) were converted into dense embeddings using a trainable embedding layer:

$$\mathbf{e} = \mathbf{E}\,\mathbf{1}_{x_c},$$

where $\mathbf{E} \in \mathbb{R}^{d \times |V|}$ is the embedding matrix, $\mathbf{1}_{x_c}$ is the one-hot indicator of the categorical value, and $d$ is the embedding dimension.

Numerical features (e.g., packet lengths, TCP ports, patient age) were normalized using z-score scaling:

$$z = \frac{x_n - \mu}{\sigma},$$

where $\mu$ and $\sigma$ are the mean and standard deviation of feature $x_n$ across the training set. This ensured that all features contributed equally to the optimization process.
3.1.3. Outlier detection and noise reduction.
To increase resilience against adversarial noise, we employed an autoencoder-based outlier filter. Given an input $\mathbf{x}$, the autoencoder reconstructs $\hat{\mathbf{x}} = D(E(\mathbf{x}))$ and computes the reconstruction error:

$$e(\mathbf{x}) = \lVert \mathbf{x} - \hat{\mathbf{x}} \rVert_2^2.$$

Samples with $e(\mathbf{x})$ above a threshold $\tau$ were flagged as noisy and either corrected or discarded. This step prevented mislabeled or adversarially perturbed records from corrupting the training set.
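The thresholding logic of the filter can be sketched as follows; `reconstruct` stands in for a trained autoencoder's encode-decode pass, which is not implemented here, and the interface is illustrative only:

```python
def reconstruction_error(x, x_hat):
    """Squared L2 reconstruction error: err = ||x - x_hat||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def flag_outliers(samples, reconstruct, threshold):
    """Return indices of samples whose reconstruction error exceeds the
    threshold tau; these are candidates for correction or removal."""
    return [i for i, x in enumerate(samples)
            if reconstruction_error(x, reconstruct(x)) > threshold]
```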
3.1.4. Dimensionality reduction.
High-dimensional IoT features (52 attributes per record) were projected into a lower-dimensional latent space using Principal Component Analysis (PCA). For each input vector $\mathbf{x} \in \mathbb{R}^{52}$, the transformation was:

$$\mathbf{z} = \mathbf{W}_k^{\top} \mathbf{x},$$

where $\mathbf{W}_k \in \mathbb{R}^{52 \times k}$ contains the top-$k$ eigenvectors of the covariance matrix of the training data. This reduced feature redundancy and improved convergence stability.
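A minimal NumPy sketch of this projection, via eigendecomposition of the training covariance matrix (function name and interface are ours for illustration):

```python
import numpy as np

def pca_project(X, k):
    """Center the data, eigendecompose the covariance matrix, and
    project onto the top-k eigenvectors: z = W_k^T (x - mu)."""
    mu = X.mean(axis=0)
    Xc = X - mu
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending order
    W = eigvecs[:, np.argsort(eigvals)[::-1][:k]]   # top-k components
    return Xc @ W, W
```

In practice the eigenvectors (and mean) are fitted on the training split only and reused to transform validation and test data.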
3.1.5. Data balancing.
Both datasets exhibited class imbalance, with normal samples dominating anomalous samples. To mitigate this, we applied the Synthetic Minority Oversampling Technique (SMOTE). For each minority class sample $\mathbf{x}_i$, a synthetic point $\mathbf{x}_{\text{new}}$ was generated as:

$$\mathbf{x}_{\text{new}} = \mathbf{x}_i + \lambda (\mathbf{x}_{nn} - \mathbf{x}_i),$$

where $\mathbf{x}_{nn}$ is a randomly chosen nearest neighbor and $\lambda \in [0,1]$ is a random scalar. This ensured a balanced training distribution and reduced classifier bias.
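The interpolation step of SMOTE can be sketched in a few lines; the nearest-neighbor search is assumed to have been performed already, and `rng` is injectable for reproducibility (both are our conventions, not the paper's):

```python
import random

def smote_sample(x_i, x_nn, rng=random.random):
    """Generate one synthetic minority sample on the segment between
    a minority point x_i and its nearest neighbor x_nn:
    x_new = x_i + lambda * (x_nn - x_i),  lambda ~ U[0, 1]."""
    lam = rng()
    return [a + lam * (b - a) for a, b in zip(x_i, x_nn)]
```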
3.1.6. Partitioning into training, validation, and test sets.
Finally, the integrated dataset was partitioned into training, validation, and test subsets with stratification to preserve the anomaly/normal ratio. Formally, if $N$ is the total number of samples, then $N = N_{\text{train}} + N_{\text{val}} + N_{\text{test}}$, with each subset drawn so that its class proportions match those of the full dataset. The training set was used to optimize model parameters, the validation set to tune hyperparameters and prevent overfitting, and the test set to report final performance metrics.
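Assuming, for illustration only, a 70/15/15 split (the exact ratios are not stated in the text), an index-level stratified partition can be sketched as:

```python
import random

def stratified_split(labels, ratios=(0.7, 0.15, 0.15), seed=0):
    """Shuffle indices within each class, then cut every class by the
    same ratios, so the anomaly/normal ratio is preserved in each of
    the train/val/test subsets."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, val, test = [], [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n = len(idxs)
        a = int(ratios[0] * n)
        b = a + int(ratios[1] * n)
        train.extend(idxs[:a])
        val.extend(idxs[a:b])
        test.extend(idxs[b:])
    return train, val, test
```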
3.2. Proposed framework
Before going through the framework, we introduce the notation used in the paper. Bold lowercase symbols (e.g., $\mathbf{x}$) denote vectors, bold uppercase symbols (e.g., $\mathbf{W}$) denote matrices, and regular lowercase symbols denote scalars. The model is denoted by $f_\theta$ with trainable parameters $\theta$. Predicted anomaly scores are represented by $\hat{y} \in [0,1]$, and ground-truth labels by $y \in \{0,1\}$. Gradients are written as $\nabla$, expectation as $\mathbb{E}$, and concatenation as $[\,\cdot\,;\,\cdot\,]$. All symbols are defined upon first use for clarity and consistency.
3.2.1. Problem setting and notation.
Let $\mathbf{x}^{\text{iot}} \in \mathbb{R}^{52}$ denote records from the IoT ICU deployment (52 features per record after preprocessing), and $\mathbf{x}^{\text{emr}}$ denote EMR payload records (mixed numerical/embedded categorical features). Labels are $y \in \{0,1\}$, with 1 indicating an anomalous or malicious event. Fig 1 illustrates how these sources map to encoders, detectors, and the fusion head.

We partition data across $K$ clients in a dual layer: (i) edge IoT clients (e.g., per bed or per device group), and (ii) hospital EMR clients (e.g., per department). Client $k$ holds a local dataset $\mathcal{D}_k$ with $n_k = |\mathcal{D}_k|$ and local parameters $\theta_k$. A central server maintains global parameters $\theta$.

The global learning objective is the weighted empirical risk:

$$\min_{\theta} \sum_{k=1}^{K} \frac{n_k}{n} \cdot \frac{1}{n_k} \sum_{(\mathbf{x}, y) \in \mathcal{D}_k} \ell\big(f_\theta(\mathbf{x}), y\big), \qquad n = \sum_{k=1}^{K} n_k,$$

where $f_\theta$ is the model and $\ell$ is a binary classification loss (defined below).
3.2.2. Threat model.
We consider two adversarial surfaces: (i) data poisoning on a fraction p of clients, which alters local training samples or gradients, and (ii) evasion at inference time. We also assume an honest-but-curious server and enforce privacy with secure aggregation and differential privacy.
3.2.3. Dual-layer architecture.
IoT encoder and detector. We model time-aware IoT records with a temporal encoder $g_{\text{iot}}$ and a classifier $h_{\text{iot}}$:

$$\mathbf{z}_{\text{iot}} = g_{\text{iot}}(\mathbf{x}^{\text{iot}}), \qquad \hat{y}_{\text{iot}} = h_{\text{iot}}(\mathbf{z}_{\text{iot}}),$$

where $\mathbf{z}_{\text{iot}} \in \mathbb{R}^{d}$. In practice, $g_{\text{iot}}$ is a 1D CNN or GRU over a short sliding window, and $h_{\text{iot}}$ is an MLP head.

EMR encoder and detector. We embed categorical EMR fields and concatenate with normalized numerics, then apply $g_{\text{emr}}$ and $h_{\text{emr}}$:

$$\mathbf{z}_{\text{emr}} = g_{\text{emr}}(\mathbf{x}^{\text{emr}}), \qquad \hat{y}_{\text{emr}} = h_{\text{emr}}(\mathbf{z}_{\text{emr}}).$$

Cross-layer fusion (optional). When IoT and EMR events co-occur within a time window $\Delta t$, we fuse representations:

$$\mathbf{z}_{\text{fuse}} = \operatorname{pool}\big([\mathbf{z}_{\text{iot}}; \mathbf{z}_{\text{emr}}]\big), \qquad \hat{y}_{\text{fuse}} = h_{\text{fuse}}(\mathbf{z}_{\text{fuse}}),$$

with pool a mean or attention pooling. This head is trained jointly with the layer-specific heads. The end-to-end steps that prepare inputs for these heads appear in Fig 2.
3.2.4. Architectural details.
To ensure reproducibility, we specify the complete architecture for both IoT and EMR models, along with the fusion head. Each layer type, output shape, activation function, and parameter count is reported in Table 2. The IoT branch processes the 52-dimensional network traffic features, while the EMR branch processes 23 demographic and event fields after embedding and normalization. The fusion head integrates both representations when co-occurring events are present (see Fig 1).
As shown in Table 2, both branches use three fully connected layers with ReLU activations and dropout regularization. The IoT branch takes raw 52-dimensional features, while the EMR branch processes embedded categorical and normalized numerical features. Each branch outputs a single anomaly score via a sigmoid activation. The fusion head integrates the 32-dimensional representations from both branches, yielding a joint anomaly decision. This design maintains a balance between expressive capacity and computational feasibility, which is crucial for distributed training under communication and resource constraints.
3.3. Training and implementation details
This section describes the losses, optimization, privacy mechanisms, split learning, resilience protocols, and training procedure used in our distributed pipeline. Data preprocessing is described in Section 3.1.
3.3.1. Loss functions.
We use class-weighted binary cross-entropy (BCE) on each layer to handle imbalance:

$$\mathcal{L}_{\text{BCE}} = -\frac{1}{B} \sum_{i=1}^{B} \big[ w_{+}\, y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \big],$$

where $w_{+}$ is the positive-class weight computed from training priors.
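A minimal pure-Python sketch of this class-weighted BCE, batch-averaged and with a small epsilon clamp for numerical stability (the clamp value is our assumption; frameworks such as PyTorch expose the same idea as a `pos_weight` argument):

```python
import math

def weighted_bce(y_true, y_pred, w_pos, eps=1e-7):
    """Class-weighted binary cross-entropy: positive samples are
    up-weighted by w_pos (from training class priors) to counter
    class imbalance."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(w_pos * y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)
```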
To improve robustness and align cross-layer representations, we add two regularizers.
Temporal consistency (IoT). Let $t - \delta$ and $t + \delta$ index neighboring windows around time $t$. We enforce smooth scores:

$$\mathcal{L}_{\text{temp}} = \mathbb{E}_t \big[ \lVert \hat{y}_{t} - \hat{y}_{t \pm \delta} \rVert_2^2 \big].$$

Cross-layer contrastive alignment. For co-occurring pairs $(\mathbf{z}_{\text{iot}}, \mathbf{z}_{\text{emr}})$ within a window $\Delta t$, we apply InfoNCE:

$$\mathcal{L}_{\text{align}} = -\log \frac{\exp\big(\operatorname{sim}(\mathbf{z}_{\text{iot}}, \mathbf{z}_{\text{emr}}) / \tau\big)}{\sum_{\mathbf{z}^{-} \in \mathcal{N}} \exp\big(\operatorname{sim}(\mathbf{z}_{\text{iot}}, \mathbf{z}^{-}) / \tau\big)},$$

where sim is cosine similarity, $\tau$ is a temperature, and $\mathcal{N}$ is a mini-batch of negatives.

Total objective. For a batch, the total loss is

$$\mathcal{L} = \mathcal{L}_{\text{BCE}} + \lambda_{\text{temp}} \mathcal{L}_{\text{temp}} + \lambda_{\text{align}} \mathcal{L}_{\text{align}},$$

with $\lambda_{\text{temp}}, \lambda_{\text{align}} \geq 0$.
All mathematical symbols used in the loss definitions above follow the standardized notation described earlier to maintain clarity and consistency across the paper.
3.3.2. Federated optimization.
We use synchronous federated rounds with client sampling. At round $r$, the server broadcasts $\theta^{r}$ to a subset $\mathcal{S}_r$ of clients. Each client $k \in \mathcal{S}_r$ performs $E$ local steps with learning rate $\eta$:

$$\theta_k \leftarrow \theta_k - \eta \nabla_{\theta_k} \mathcal{L}(\theta_k),$$

and returns the update $\Delta_k = \theta_k - \theta^{r}$.

Robust aggregation. To mitigate Byzantine or poisoned clients, we use the coordinate-wise trimmed mean. For each coordinate $j$, sort the values $\{\Delta_{k,j}\}_{k \in \mathcal{S}_r}$, drop the largest and smallest $b$ values, and average the rest:

$$\bar{\Delta}_j = \frac{1}{|\mathcal{S}_r| - 2b} \sum_{k \in \mathcal{T}_j} \Delta_{k,j},$$

where $\mathcal{T}_j$ indexes the retained values. We also report Krum selection:

$$k^{*} = \arg\min_{k} \sum_{u \in \mathcal{M}_k} \lVert \Delta_k - u \rVert_2^2,$$

where $\mathcal{M}_k$ is the set of $m$ closest updates to $\Delta_k$. The server update is

$$\theta^{r+1} = \theta^{r} + \eta_s \bar{\Delta},$$

with $\bar{\Delta}$ the aggregated update and $\eta_s$ the server step size.
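The two aggregation rules can be sketched on plain Python lists, treating each update as a flattened parameter vector (variable names are ours; a real implementation would operate on model tensors):

```python
def trimmed_mean(updates, b):
    """Coordinate-wise trimmed mean: per coordinate, sort the client
    values, drop the b largest and b smallest, average the rest."""
    agg = []
    for coord in zip(*updates):
        vals = sorted(coord)
        kept = vals[b:len(vals) - b]
        agg.append(sum(kept) / len(kept))
    return agg

def krum(updates, m):
    """Krum: score each update by the summed squared distance to its m
    closest peers; return the update with the lowest score."""
    def d2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    scores = []
    for i, u in enumerate(updates):
        dists = sorted(d2(u, v) for j, v in enumerate(updates) if j != i)
        scores.append(sum(dists[:m]))
    return updates[min(range(len(updates)), key=scores.__getitem__)]
```

Note how a single outlier update (e.g., from a poisoned client) is discarded by the trim and heavily penalized by the Krum score.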
3.3.3. Privacy mechanisms.
Gradient clipping. We bound sensitivity via

$$\Delta_k \leftarrow \Delta_k \Big/ \max\Big(1, \lVert \Delta_k \rVert_2 / C \Big).$$

Differential privacy noise addition. We add Gaussian noise for $(\epsilon, \delta)$-DP:

$$\tilde{\Delta}_k = \Delta_k + \mathcal{N}(0, \sigma^2 C^2 \mathbf{I}),$$

where the noise multiplier $\sigma$ follows the chosen privacy budget and composition across rounds.
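The clipping and Gaussian-mechanism steps can be sketched as follows (the interface is illustrative; production code such as Opacus applies the same idea to per-sample gradient tensors):

```python
import random

def clip_update(delta, C):
    """Bound sensitivity: rescale the update so its L2 norm is at
    most C, i.e. delta / max(1, ||delta|| / C)."""
    norm = sum(d * d for d in delta) ** 0.5
    scale = 1.0 / max(1.0, norm / C)
    return [d * scale for d in delta]

def add_dp_noise(delta, C, sigma, rng):
    """Gaussian mechanism: add N(0, sigma^2 * C^2) noise to each
    coordinate of a clipped update."""
    return [d + rng.gauss(0.0, sigma * C) for d in delta]
```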
Secure aggregation. Clients apply one-time masks that cancel in aggregate, so the server observes only the sum $\sum_{k} \tilde{\Delta}_k$ and never an individual update. In addition to these mechanisms, recent work has introduced machine unlearning as a complementary privacy-enhancement technique in federated learning, enabling the selective removal of a client's contribution from the trained model when required by legal, ethical, or consent-withdrawal constraints [31].
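A toy sketch of the mask-cancellation idea behind secure aggregation, using pairwise shared seeds (real protocols additionally handle key agreement and client dropouts, which are omitted here; `seed_for_pair` is a hypothetical shared-seed function):

```python
import random

def pairwise_masks(client_ids, dim, seed_for_pair):
    """One-time additive masks: each pair (i, j) with i < j shares a
    seed, so client i adds and client j subtracts the same mask vector.
    Summing all masked updates cancels every mask, revealing only the
    aggregate to the server."""
    masks = {k: [0.0] * dim for k in client_ids}
    for i in client_ids:
        for j in client_ids:
            if i < j:
                rng = random.Random(seed_for_pair(i, j))
                m = [rng.uniform(-1, 1) for _ in range(dim)]
                for t in range(dim):
                    masks[i][t] += m[t]
                    masks[j][t] -= m[t]
    return masks
```

Each client uploads `update + masks[k]`; the server's sum over clients equals the sum of the raw updates because the masks cancel pairwise.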
Implementation details. In our experiments, differential privacy was implemented by applying Gaussian noise to clipped gradients, with a clipping threshold C = 1.0 and a noise multiplier calibrated to satisfy the chosen $(\epsilon, \delta)$-DP budget across rounds. Secure aggregation was achieved through client-side random masking, which cancels out upon summation at the server, preventing exposure of individual model updates.
Impact of privacy mechanisms. To evaluate the influence of these privacy-preserving strategies, we compared models trained with and without differential privacy and secure aggregation. The privacy-enhanced model achieved 98.7% detection accuracy versus 99.6% for the non-private model, a reduction of only 0.9 percentage points in exchange for strong protection against inference and reconstruction attacks. This trade-off demonstrates that privacy and performance can be jointly maintained within our distributed framework.
3.3.4. Split learning variant (EMR).
For strict EMR governance, we support split learning. Let $f = f_{\text{server}} \circ f_{\text{client}}$ with cut layer $\ell$. Client $k$ computes the smashed data

$$\mathbf{s}_k = f_{\text{client}}(\mathbf{x}_k)$$

and sends $\mathbf{s}_k$ to the server, which continues the forward/backward pass on $f_{\text{server}}$. Only activations and their gradients cross the boundary; raw records stay on-premises.
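The forward data flow of the split can be sketched with layers represented as plain callables (illustrative only; the gradient flow back across the boundary is omitted):

```python
def client_forward(x, client_layers):
    """Client side: forward up to the cut layer. Only the smashed
    data s_k (activations) crosses the boundary, never raw records."""
    s = x
    for layer in client_layers:
        s = layer(s)
    return s

def server_forward(s, server_layers):
    """Server side: continue the forward pass from the smashed data."""
    for layer in server_layers:
        s = layer(s)
    return s
```

During training, the server would compute the loss, backpropagate to the cut layer, and return the gradient of the smashed data to the client, which completes backpropagation locally.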
3.3.5. Resilience protocols.
Client poisoning. A fraction $p$ of clients apply targeted or untargeted perturbations, yielding poisoned updates $\tilde{\Delta}_k$. We track performance and the deviation

$$d_k = \lVert \tilde{\Delta}_k - \bar{\Delta}_{\text{benign}} \rVert_2,$$

where $\bar{\Delta}_{\text{benign}}$ is the benign mean update.
Communication constraints and dropouts. We simulate bandwidth caps with Top-$q$ sparsification, which transmits only the largest-magnitude fraction $q$ of update coordinates and zeros the rest:

$$[\operatorname{Top}_q(\Delta_k)]_j = \begin{cases} \Delta_{k,j}, & j \text{ among the top-}q \text{ fraction by } |\Delta_{k,j}|, \\ 0, & \text{otherwise.} \end{cases}$$

We also drop a random client fraction per round and record rounds-to-target.
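A sketch of Top-q sparsification on a flattened update, where `q` is the fraction of coordinates kept (so "90% sparsification" in our experiments corresponds to keeping q = 0.1 of the coordinates):

```python
def top_q_sparsify(delta, q):
    """Keep only the largest-magnitude fraction q of coordinates and
    zero the rest, simulating a bandwidth cap on client uploads."""
    k = max(1, int(q * len(delta)))
    keep = sorted(range(len(delta)), key=lambda i: abs(delta[i]),
                  reverse=True)[:k]
    out = [0.0] * len(delta)
    for i in keep:
        out[i] = delta[i]
    return out
```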
Detection latency. On timestamped IoT sequences, the time to first correct alarm is

$$\tau_{\text{detect}} = \min\{\, t \geq t_0 : \hat{y}_t > \kappa \,\} - t_0,$$

with attack onset $t_0$ and threshold $\kappa$ tuned on validation data.
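The latency measure follows directly from its definition; a sketch over a list of per-time-step anomaly scores:

```python
def detection_latency(scores, onset, threshold):
    """Time to first correct alarm: the smallest t >= onset whose score
    exceeds the (validation-tuned) threshold, minus the attack onset.
    Returns None if the attack is never detected."""
    for t in range(onset, len(scores)):
        if scores[t] > threshold:
            return t - onset
    return None
```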
3.3.6. Training protocol.
We run $R$ federated rounds. At each round we sample a subset of clients, perform $E$ local epochs with batch size $B$ using Adam, and aggregate with the chosen robust rule. We early-stop on validation AUPRC. We use stratified splits (§3.1); the test set remains unseen.

Local training minimizes the total objective of Section 3.3.1 within the global federated objective. Hyperparameters include the learning rates $\eta$ and $\eta_s$, batch size $B$, local epochs $E$, and the regularization weights $\lambda_{\text{temp}}$ and $\lambda_{\text{align}}$.
3.4. Overall architecture summary
For clarity, we summarize the entire dual-layer pipeline as a concise algorithm. This captures the IoT branch, EMR branch, and the fusion head, highlighting the main processing steps without detailing every low-level operation. The algorithm in Algorithm 1 presents the high-level computation flow. Each branch independently produces anomaly predictions, while the fusion head aggregates representations when both IoT and EMR signals are available. This design ensures modularity, resilience, and adaptability in distributed healthcare environments.
Algorithm 1 Revised Dual-Layer Secure Multimodal Federated Learning Pipeline
Require: Local IoT data $\mathcal{D}_k^{\text{iot}}$ and EMR data $\mathcal{D}_k^{\text{emr}}$ at client $k$
Ensure: Local model updates $\tilde{\Delta}_k$ and anomaly predictions $\hat{y}$
1: for each federated round $r = 1$ to $R$ do
2: Server broadcasts global model parameters $\theta^{r}$
3: for each selected client $k$ in parallel do
4: Preprocessing: Clean, normalize, and encode features
5: IoT Encoding: $\mathbf{z}_{\text{iot}} = g_{\text{iot}}(\mathbf{x}^{\text{iot}})$
6: EMR Encoding: $\mathbf{z}_{\text{emr}} = g_{\text{emr}}(\mathbf{x}^{\text{emr}})$
7: if split learning is enabled then
8: Client computes smashed data $\mathbf{s}_k = f_{\text{client}}(\mathbf{x}_k)$
9: Server continues forward/backward on $f_{\text{server}}$ and returns gradients
10: end if
11: Fusion (if available): $\mathbf{z}_{\text{fuse}} = \operatorname{pool}([\mathbf{z}_{\text{iot}}; \mathbf{z}_{\text{emr}}])$
12: Local Predictions: $\hat{y}_{\text{iot}} = h_{\text{iot}}(\mathbf{z}_{\text{iot}})$, $\hat{y}_{\text{emr}} = h_{\text{emr}}(\mathbf{z}_{\text{emr}})$, $\hat{y}_{\text{fuse}} = h_{\text{fuse}}(\mathbf{z}_{\text{fuse}})$
13: Compute Local Loss: $\mathcal{L} = \mathcal{L}_{\text{BCE}} + \lambda_{\text{temp}} \mathcal{L}_{\text{temp}} + \lambda_{\text{align}} \mathcal{L}_{\text{align}}$
14: Local Update: $\theta_k \leftarrow \theta_k - \eta \nabla_{\theta_k} \mathcal{L}$
15: Apply DP & Secure Aggregation: clip $\Delta_k$, add Gaussian noise, and apply one-time mask to obtain $\tilde{\Delta}_k$
16: Send $\tilde{\Delta}_k$ to server
17: end for
18: Server Aggregation: $\bar{\Delta} = \operatorname{Agg}\big(\{\tilde{\Delta}_k\}_{k \in \mathcal{S}_r}\big)$
19: Choose robust rule $\operatorname{Agg} \in \{\text{trimmed mean}, \text{Krum}\}$; update $\theta^{r+1} = \theta^{r} + \eta_s \bar{\Delta}$
20: end for
21: return Global model $\theta^{R}$
4. Results
This section presents the empirical results of the proposed dual-layer secure distributed machine learning pipeline. We evaluate performance on (i) the IoT traffic dataset, (ii) the EMR anomaly dataset, and (iii) the combined dual-layer fusion setting. Metrics include Accuracy, Precision, Recall, F1-score, AUC-ROC, and AUPRC. Each table reports results for several baseline models compared to our proposed pipeline. The highest score in each column is highlighted in bold.
4.1. Dataset descriptions
To evaluate the proposed secure distributed machine learning pipeline, we used two publicly available healthcare-related cybersecurity datasets from Kaggle. These datasets represent complementary aspects of healthcare infrastructures, namely IoT-based patient monitoring systems and electronic medical record (EMR) systems.
The first dataset (https://www.kaggle.com/datasets/faisalmalik/iot-healthcare-security-dataset), the IoT Healthcare Security Dataset, simulates an intensive care unit (ICU) environment with multiple patient monitoring sensors and control units. It provides both benign and malicious traffic, enabling the study of intrusion detection under realistic IoT conditions.
The second dataset (https://www.kaggle.com/datasets/saurabhshahane/mlbased-cyber-incident-detection-for-emr?select=payload-combined.csv), the ML-based Cyber Incident Detection for EMR Dataset, focuses on confidentiality and availability incidents in electronic medical records. It contains normal and anomalous patient records, as well as combined sets for anomaly detection tasks.
Together, these datasets allow us to explore both device-level and record-level attack vectors, ensuring that the evaluation reflects the multimodal and distributed nature of modern healthcare systems.
4.2. Quantitative evaluation
This subsection reports the quantitative results of our study across the IoT dataset, the EMR dataset, and the combined dual-layer fusion setting. We present comparisons against several classical and deep learning baselines. Metrics include Accuracy, Precision, Recall, F1-score, AUC-ROC, and AUPRC. To assess robustness, we further evaluate under client poisoning, communication constraints, and latency conditions. Finally, an ablation study highlights the contribution of each architectural component. The following tables summarize these findings, with the best-performing results highlighted in bold.
4.2.1. Results on the IoT dataset.
The IoT-only results in Table 3 include recent state-of-the-art (SOTA) baseline models for a comprehensive comparison. In addition to classical machine learning models such as Logistic Regression and Random Forest, and deep learning baselines such as CNN, we include an LSTM Autoencoder (LSTM-AE) for anomaly detection as well as the federated learning baselines FedAvg and FedProx. These baselines strengthen the evaluation and demonstrate the competitiveness of our approach against widely used and recent techniques.
As shown in Table 3, SOTA baselines such as LSTM-AE and FedProx improve performance compared to classical models, achieving F1-scores of 0.900 and 0.898, respectively. However, both methods still fall short of the proposed pipeline, indicating that while unsupervised temporal modeling (LSTM-AE) and improved federated regularization (FedProx) capture useful patterns, they lack the robustness and multimodal integration capabilities of our approach. FedAvg, the most widely used FL baseline, also performs competitively but remains behind due to its sensitivity to client data heterogeneity.
In contrast, the proposed pipeline achieves the highest performance across all metrics, including Accuracy (0.942), Precision (0.928), Recall (0.914), F1-score (0.921), AUROC (0.949), and AUPRC (0.936). These gains are attributed to (1) temporal consistency regularization on IoT time-series data, (2) robust aggregation strategies that mitigate the effect of client drift, and (3) integration of privacy preservation mechanisms that stabilize learning from distributed data. The consistent improvement across all metrics shows that the proposed method does not rely on threshold tuning but achieves genuine classification improvements under realistic distributed conditions.
4.2.2. Results on the EMR dataset.
To provide a stronger and fairer comparison, the EMR evaluation includes state-of-the-art (SOTA) methods commonly used in healthcare anomaly detection and federated settings. In addition to classical models (Logistic Regression, Random Forest, XGBoost) and the BiLSTM baseline, we include a Variational Autoencoder (VAE) for unsupervised anomaly detection and federated learning baselines such as FedAvg, FedProx, and MOON, which represent widely used and recent FL algorithms.
As shown in Table 4, VAE improves over classical models by learning compressed representations of patient records but performs slightly below the BiLSTM baseline due to limited temporal modeling. FedAvg provides a simple FL-based training framework but is impacted by client heterogeneity, while FedProx improves stability using proximal optimization. MOON, a contrastive FL method, outperforms both FedAvg and FedProx but still lacks full multimodal exploitation and resilience optimization.
Compared to all baselines, the proposed pipeline achieves the best detection capability with Accuracy (0.931), Precision (0.915), Recall (0.902), F1-score (0.908), AUROC (0.936), and AUPRC (0.922). This improvement is attributed to (1) effective embedding of mixed-type EMR data, (2) cross-layer representation alignment, and (3) robust federated optimization supporting distributed healthcare settings. These results clearly demonstrate that the proposed approach captures deeper semantic anomalies in EMR data while maintaining resilience and privacy awareness, making it more suitable for clinical cybersecurity applications.
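The "effective embedding of mixed-type EMR data" credited above can be sketched as one-hot encoding of categorical fields combined with z-scoring of numeric fields into a single flat vector. The field names and statistics below are invented for illustration; the pipeline's actual embedding may be learned rather than fixed.

```python
def embed_record(record, categories, numeric_stats):
    """Mixed-type EMR embedding sketch: one-hot encode categorical fields
    and z-score numeric fields into one flat vector.  `categories` maps a
    field to its possible values; `numeric_stats` maps a field to its
    (mean, std).  All names here are hypothetical."""
    vec = []
    for field, values in categories.items():
        vec.extend(1.0 if record[field] == v else 0.0 for v in values)
    for field, (mean, std) in numeric_stats.items():
        vec.append((record[field] - mean) / std)
    return vec

record = {"sex": "M", "age": 60.0}
vec = embed_record(record, {"sex": ["F", "M"]}, {"age": (50.0, 10.0)})
# -> [0.0, 1.0, 1.0]: one-hot "M" plus an age one standard deviation above the mean
```

Vectors of this form are what the downstream alignment and fusion modules consume, which is why a consistent encoding of heterogeneous clinical fields matters for cross-modal learning.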
4.2.3. Results on Combined IoT and EMR Dataset.
To further strengthen the comparative evaluation, Table 5 includes additional state-of-the-art (SOTA) baseline methods for multimodal cybersecurity and federated intrusion detection. In addition to classical fusion approaches and the Deep Fusion model (CNN + BiLSTM), we consider three competitive methods: (1) Autoencoder Fusion (AE-Fusion), a deep unsupervised multimodal detector; (2) FedAvg-Fusion, which applies federated learning over fused representations; and (3) MOON-Fusion, a contrastive federated learning method that aligns representations across clients using a modality-aware contrastive loss.
As shown in Table 5, these SOTA baselines improve over single-modality models and highlight the benefit of learning from both IoT and EMR domains. However, they fail to maintain a consistent precision–recall balance and resilience under imbalanced data and distributed non-IID settings: AE-Fusion is sensitive to cross-modal noise, FedAvg-Fusion lacks robustness to adversarial drift, and MOON-Fusion performs better by enforcing representation consistency but still lacks resilience mechanisms and privacy-aware aggregation.
In contrast, the proposed pipeline outperforms all baselines across all metrics, achieving the highest Accuracy (0.953), Precision (0.938), Recall (0.926), F1-score (0.932), AUROC (0.961), and AUPRC (0.947). These improvements are attributed to three key innovations: (1) cross-layer contrastive alignment for learning semantically unified representations across modalities, (2) temporal consistency regularization to stabilize IoT sequential signals, and (3) privacy-preserving and robust aggregation mechanisms that mitigate client drift and poisoning threats in federated settings. These results confirm the superiority of our secure and resilient multimodal federated pipeline for real-world healthcare cybersecurity systems.
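The cross-layer contrastive alignment named in innovation (1) can be sketched as an InfoNCE-style objective over index-matched IoT/EMR embedding pairs: embeddings of the same event are pulled together while mismatched pairs are pushed apart. This is a generic stand-in under that assumption; the pipeline's exact loss and similarity measure may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_alignment_loss(iot_embs, emr_embs, temperature=0.5):
    """InfoNCE-style alignment: for each IoT embedding, treat the
    EMR embedding at the same index as the positive and all others in
    the batch as negatives.  Averaged over the batch."""
    loss, n = 0.0, len(iot_embs)
    for i in range(n):
        logits = [cosine(iot_embs[i], emr_embs[j]) / temperature for j in range(n)]
        m = max(logits)  # log-sum-exp stabilization
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += -(logits[i] - log_denom)
    return loss / n
```

When the two modalities' embeddings for the same event agree, the loss is small; shuffling the pairing raises it, which is the signal that drives the shared representation space toward cross-modal semantic consistency.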
4.2.4. Resilience under Client Poisoning.
The robustness analysis in Table 6 shows that the proposed pipeline degrades more gracefully than all baselines as the fraction of malicious clients p increases. At p = 0, it achieves an accuracy of 0.953, already ahead of the strongest baseline (Deep Fusion at 0.928, +0.025). As poisoning intensifies, the performance gap widens: at p = 0.1 the margin over Deep Fusion grows to +0.044 (0.932 vs. 0.888), at p = 0.2 to +0.055 (0.908 vs. 0.853), and at p = 0.3 to +0.065 (0.879 vs. 0.814). In absolute terms, the proposed method drops by only 0.074 from p = 0 to p = 0.3 (7.8% relative), whereas Deep Fusion falls by 0.114 over the same range (12.3% relative), with tree ensembles declining even more. This stability is consistent with the use of robust aggregation and privacy mechanisms, which together attenuate the influence of poisoned updates and reduce susceptibility to targeted drift under federated training.
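The text attributes this graceful degradation to robust aggregation without naming the exact rule; a coordinate-wise trimmed mean is one common choice and illustrates, under that assumption, why a single poisoned update has only bounded influence on the aggregated model.

```python
def trimmed_mean_aggregate(updates, trim=1):
    """Coordinate-wise trimmed mean: for each parameter position, drop
    the `trim` largest and `trim` smallest client values, then average
    the rest.  One common robust-aggregation rule; the paper's exact
    rule may differ."""
    agg = []
    for coords in zip(*updates):
        kept = sorted(coords)[trim:len(coords) - trim]
        agg.append(sum(kept) / len(kept))
    return agg

honest = [[0.10, -0.20], [0.12, -0.18], [0.09, -0.21], [0.11, -0.19]]
poisoned = honest + [[50.0, 50.0]]  # one scaled malicious update
# The trimmed mean stays near the honest average (~0.11, ~-0.19),
# whereas a plain mean would be dragged to ~10 by the single attacker.
```

With trim = 1, up to one Byzantine client per coordinate is discarded outright, which matches the intuition that accuracy should fall gradually rather than collapse as p grows.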
4.2.5. Impact of Communication Constraints.
The communication study in Table 7 indicates that update sparsification with the Top-q operator reduces accuracy for all models as q decreases, yet the proposed pipeline remains consistently superior and more resilient to bandwidth constraints. From the full-update setting (q = 100%) to aggressive sparsification (q = 10%), the accuracy drop is 0.092 for the proposed approach (0.953 to 0.861), compared with 0.116 for Deep Fusion (0.928 to 0.812), 0.110 for XGBoost Fusion (0.911 to 0.801), and 0.119 for Random Forest Fusion (0.902 to 0.783). At moderate compression (q = 50%), the proposed method retains 0.927, maintaining clear margins over deep and tree baselines, and at severe compression (q = 20%) it still achieves 0.896, exceeding the next best by at least 0.045. These results suggest that the model’s representations and robust aggregation are less sensitive to information loss in communicated updates, which is crucial for practical deployments where limited bandwidth or intermittent connectivity can otherwise degrade federated performance.
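The Top-q operator used in this study can be sketched directly: only the fraction q of update coordinates with the largest magnitudes is transmitted, and the rest are zeroed. The rounding and tie-breaking choices below are illustrative.

```python
def top_q_sparsify(update, q):
    """Top-q sparsification: keep the fraction `q` of coordinates with
    largest absolute value and zero out the rest, reducing the bytes a
    client must upload each round."""
    k = max(1, round(q * len(update)))
    keep = set(sorted(range(len(update)),
                      key=lambda i: abs(update[i]), reverse=True)[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(update)]

u = [0.5, -0.05, 0.9, 0.01, -0.7]
# q = 0.4 keeps only the two largest-magnitude entries (0.9 and -0.7),
# mirroring the severe-compression settings (q = 20%, 10%) in Table 7.
```

Because small-magnitude coordinates carry the least gradient information, the accuracy loss grows slowly as q shrinks, which is the behavior the table quantifies.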
4.2.6. Impact of Privacy Mechanisms.
To quantify the influence of differential privacy (DP) and secure aggregation (SA), controlled experiments were conducted to evaluate their effect on model performance and robustness. The results are summarized in Table 8.
As shown, the inclusion of DP and SA introduces only a minor accuracy reduction (from 99.6% to 98.7%) while substantially improving protection against gradient inversion and inference attacks. The AUPRC remains above 0.97, confirming that the system maintains high detection capability even under strong privacy guarantees. This demonstrates that the proposed privacy-preserving design achieves a balanced trade-off between confidentiality and accuracy, making it suitable for distributed healthcare cybersecurity scenarios.
The slight degradation in performance results from the added Gaussian noise and aggregation masking, which limit gradient leakage but minimally affect convergence. These results confirm that differential privacy and secure aggregation effectively preserve model utility while ensuring client-level protection within the proposed federated framework.
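The two mechanisms discussed here, Gaussian noise after clipping (DP) and aggregation masking (SA), can be sketched as follows. The clip norm, noise scale, and mask distribution are placeholder values, not the paper's calibrated settings; real secure aggregation derives masks from pairwise shared keys rather than a server-visible RNG.

```python
import math
import random

def clip_and_noise(update, clip_norm=1.0, noise_std=0.1, rng=random):
    """DP step: rescale the update so its L2 norm is at most `clip_norm`,
    then add Gaussian noise to each coordinate."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / max(norm, 1e-12))
    return [v * scale + rng.gauss(0.0, noise_std) for v in update]

def pairwise_mask(updates, rng=random):
    """Secure-aggregation sketch: each client pair (i, j) shares a random
    mask that client i adds and client j subtracts, so individual updates
    are hidden from the server while their sum is unchanged."""
    masked = [list(u) for u in updates]
    for i in range(len(updates)):
        for j in range(i + 1, len(updates)):
            for k in range(len(updates[i])):
                m = rng.uniform(-1.0, 1.0)
                masked[i][k] += m
                masked[j][k] -= m
    return masked
```

The cancellation of masks in the sum is exactly why SA costs no accuracy, while the clipping and added noise are the source of the small utility loss the text reports.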
4.2.7. Detection Latency on IoT Dataset.
Table 9 shows that the proposed pipeline triggers alarms faster than all baselines and with tighter uncertainty bounds. The mean latency drops to 5.9 time steps with a standard deviation of 1.4, while the 95% confidence interval remains narrow (5.6–6.2), indicating stable response times across attacks. Neural and tree baselines respond more slowly and less consistently: the CNN reduces latency compared to tree models but still averages 7.8 steps (CI 7.4–8.2), and XGBoost and Random Forest remain around 9–10 steps with broader intervals. Logistic regression is the slowest at 11.4 steps. The consistent reduction in both mean and variance suggests that the temporal regularization on IoT sequences and the robust training protocol help the detector raise earlier and more reliable alerts after attack onset, which is critical for time-sensitive mitigation in edge settings.
4.2.8. Ablation Study.
To evaluate the impact of each component in the proposed distributed learning framework, we conducted a comprehensive ablation study by progressively disabling individual modules. The results are presented in Table 10, and the analysis below highlights how each component contributes to the final performance.
Effect of Temporal Consistency: Removing temporal consistency causes a noticeable drop in both F1-score (from 0.932 to 0.910) and AUPRC (from 0.947 to 0.926). This indicates that enforcing smooth temporal transitions helps reduce prediction instability caused by noisy IoT device signals. Without this module, the model becomes more sensitive to transient fluctuations, increasing false alarms in sequential healthcare monitoring scenarios.
Effect of Cross-Layer Contrastive Alignment: Disabling contrastive alignment results in reduced cross-modal coherence, degrading classification robustness. The AUC-ROC drops from 0.961 to 0.934, and the F1-score declines to 0.904. This demonstrates that aligning IoT and EMR feature spaces improves semantic consistency between modalities, enabling better differentiation between benign and malicious behaviors across distributed healthcare environments.
Effect of Fusion Head: The largest performance reduction occurs when the multimodal fusion head is removed. Accuracy drops to 0.921 and F1-score to 0.897. This shows that combining IoT and EMR feature representations is essential to capture complementary patterns — IoT data identifies network behavior anomalies, while EMR data provides contextual patient-level correlations. Without fusion, the framework loses its multimodal advantage.
Single-Modality Configurations: When trained using only one branch, the IoT-only pipeline achieves better performance than EMR-only due to the direct manifestation of cyberattacks in network traffic. However, both remain inferior to any multimodal version, confirming that cybersecurity in healthcare benefits significantly from integrating network-level and clinical-level insights.
Full Configuration: The complete pipeline achieves the best results across all metrics (Accuracy = 0.953, Precision = 0.938, Recall = 0.926, F1 = 0.932, AUC-ROC = 0.961, AUPRC = 0.947). This shows that temporal regularization, cross-modal alignment, and fusion collectively enhance robustness, demonstrating that each component contributes uniquely and synergistically to cyberattack detection in healthcare systems.
4.2.9. Training Dynamics, Efficiency, and Robustness.
Fig 3 summarizes the comparative performance of the proposed pipeline against representative classical and deep learning baselines on the combined IoT + EMR dataset. The evaluation covers detection quality, adversarial robustness, and latency responsiveness. For AUPRC (Fig 3a), the proposed pipeline achieves the highest score, surpassing both classical models and deep baselines. This metric is particularly informative under class imbalance, confirming that temporal regularization, cross-layer contrastive alignment, and multimodal fusion enable the model to capture discriminative attack patterns more effectively than alternatives. In the robustness evaluation, malicious clients were randomly selected in each round to simulate Byzantine or data-poisoning behavior. Specifically, a fraction of clients were designated as malicious and replaced their gradient updates with perturbed values drawn from a scaled random noise distribution or sign-flipped versions of legitimate gradients. The selection was randomized at the beginning of each round to prevent overfitting to a fixed subset. This stochastic design provides a fair and unbiased assessment of model resilience against dynamic adversarial participants.
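The poisoning protocol just described (per-round random selection of malicious clients, who submit either sign-flipped gradients or scaled random noise) can be sketched as follows; the noise scale and helper names are illustrative.

```python
import random

def make_malicious(update, mode, rng, scale=10.0):
    """Produce a poisoned update under the two attack models described in
    the robustness study: sign-flipping a legitimate gradient, or
    replacing it with scaled Gaussian noise."""
    if mode == "sign_flip":
        return [-v for v in update]
    return [rng.gauss(0.0, scale) for _ in update]

def select_malicious(num_clients, p, rng):
    """Re-draw the malicious subset at the start of each round, so no
    fixed subset can be overfit to."""
    k = round(p * num_clients)
    return set(rng.sample(range(num_clients), k))

rng = random.Random(7)
bad = select_malicious(10, 0.3, rng)  # e.g. 3 of 10 clients this round
```

In each round of the simulation, the server would receive honest updates from the complement of `bad` and poisoned ones from its members, which is what the robust aggregation must withstand.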
For robustness to client poisoning (Fig 3b), accuracy trends are compared as the fraction of malicious clients p increases. While both methods degrade as p grows, the proposed pipeline maintains a clear advantage at all levels. Notably, at p = 0.3, it sustains substantially higher accuracy than the deep fusion baseline, validating the benefits of secure aggregation and privacy-preserving mechanisms for adversarial resilience. For IoT detection latency (Fig 3c), the proposed pipeline achieves the lowest median latency and narrowest interquartile range. In contrast, both classical and deep baselines exhibit higher mean delays and larger variability. Faster and more stable detection underscores the pipeline’s suitability for real-world IoT deployments where timely alerts are critical to patient safety.
The box plot in Fig 3c was generated from detection latency values computed across all IoT test sequences. For each model, latency was defined as the number of time steps between attack onset t0 and the first correct alarm at time t ≥ t0 satisfying s(t) ≥ τ, where the detection threshold τ was tuned on the validation set. Each box shows the interquartile range (IQR) of these latencies, the horizontal line denotes the median, and the green triangle marks the mean. Outliers are observations outside [Q1 − 1.5·IQR, Q3 + 1.5·IQR], indicating occasional delayed detections under atypical network fluctuations or transient client dropouts. The narrow IQR and fewer outliers for the proposed model confirm lower and more consistent response times.
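The latency definition and the box-plot outlier rule can be sketched directly; `scores` here stands for a model's per-time-step anomaly scores on one test sequence, and the quartile method is an illustrative choice.

```python
import statistics

def detection_latency(scores, t0, tau):
    """Time steps between attack onset t0 and the first alarm, i.e. the
    first t >= t0 with anomaly score s(t) >= tau (tau tuned on the
    validation set).  Returns None if the attack is never flagged."""
    for t in range(t0, len(scores)):
        if scores[t] >= tau:
            return t - t0
    return None

def box_plot_whiskers(latencies):
    """The 1.5*IQR bounds beyond which latencies are drawn as outliers."""
    q1, _, q3 = statistics.quantiles(latencies, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# An attack starting at t0 = 2 first scores above tau = 0.5 at t = 4,
# giving a latency of 2 time steps.
lat = detection_latency([0.0, 0.1, 0.2, 0.1, 0.8, 0.9], t0=2, tau=0.5)
```

Aggregating such latencies over all test sequences per model yields the distributions summarized by the boxes, medians, and mean markers in Fig 3c.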
(a) AUPRC highlights superior detection quality. (b) Robust accuracy is sustained even with malicious clients. (c) Lower and more stable IoT detection latency ensures timely alerts.
Fig 4 illustrates the evolution of both training/validation loss and accuracy across epochs. The proposed pipeline exhibits smooth convergence, with losses steadily decreasing and a small generalization gap. The early rapid decline indicates effective optimization and stable gradient flow, while the later flattening reflects convergence toward a well-regularized solution. Importantly, the validation loss does not diverge from the training loss, suggesting that regularizers and preprocessing steps—such as SMOTE balancing and autoencoder-based noise filtering—help prevent overfitting. For accuracy, both training and validation curves rise quickly during the initial optimization phase and then saturate, with validation accuracy closely tracking training accuracy. This confirms effective generalization without overfitting, supported by dropout regularization, class balancing, and the inclusion of contrastive and temporal consistency terms in the loss. The stabilization of validation accuracy at a high level further underscores the robustness of the proposed pipeline across unseen data distributions in distributed healthcare settings.
(a) Loss curves show smooth convergence with small generalization gap. (b) Accuracy curves indicate stable generalization and controlled overfitting.
Fig 5 summarizes the evolution of AUPRC and AUROC across epochs for both training and validation sets. For AUPRC (Fig 5a), both curves improve rapidly during early epochs and then stabilize, with validation values closely tracking training. This indicates that the model generalizes well to unseen samples without overfitting to the minority attack class. The sustained high plateau highlights the effectiveness of class-weighted binary cross-entropy and contrastive alignment in maintaining recall without sacrificing precision, confirming reliability in detecting rare but critical anomalies in healthcare streams. For AUROC (Fig 5b), both training and validation curves increase steadily and converge to a high plateau, reflecting improved separation between normal and anomalous events. The alignment between curves and absence of large oscillations indicate stable learning dynamics. Sustained high AUROC underscores the robustness of the decision boundaries learned by the pipeline, which is essential for minimizing false alarms while preserving sensitivity in healthcare intrusion detection scenarios.
(a) AUPRC reflects precision–recall trade-offs under class imbalance. (b) AUROC illustrates class separability improvements.
Fig 6 illustrates the training efficiency of the proposed pipeline in terms of learning rate dynamics and wall-clock time per epoch. The learning rate schedule (Fig 6a) follows a one-cycle policy with warmup and cosine decay. The warmup phase gradually increases the learning rate, stabilizing gradient updates in the early stages and preventing divergence. Subsequently, the cosine decay provides a smooth reduction that enables finer adjustments as convergence is approached. This dynamic scheduling strategy accelerates early training while reducing the risk of overfitting, contributing to the consistent performance observed in previous figures. The epoch wall-clock time (Fig 6b) reflects the throughput and stability of the distributed implementation. Initial fluctuations occur due to GPU memory allocation and data pipeline setup, but times quickly stabilize to a nearly constant value. This consistency demonstrates efficient coordination of gradient aggregation and communication overhead with computation. The modest and predictable per-epoch costs confirm that the pipeline is scalable and suitable for large healthcare datasets where frequent retraining may be required.
(a) Adaptive learning rate schedule accelerates convergence while maintaining stability. (b) Stable epoch times confirm scalability of the distributed implementation.
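The one-cycle schedule described for Fig 6a (linear warmup followed by cosine decay) can be sketched as a pure function of the training step. The warmup fraction and peak learning rate below are illustrative values, not the paper's settings.

```python
import math

def one_cycle_lr(step, total_steps, base_lr=1e-3, warmup_frac=0.1):
    """Linear warmup to `base_lr` over the first `warmup_frac` of steps,
    then cosine decay toward zero for the remainder."""
    warmup_steps = max(1, int(warmup_frac * total_steps))
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps       # warmup ramp
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))  # cosine decay

lrs = [one_cycle_lr(s, 100) for s in range(100)]
# Rises to the peak during the first 10 steps, then decays smoothly,
# matching the warmup-then-decay shape shown in Fig 6a.
```

The warmup keeps early gradient updates small while batch statistics stabilize, and the cosine tail gives the fine adjustments near convergence that the text credits for stable optimization.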
Fig 7 presents the evolution of both validation AUPRC and the differential privacy budget ε across federated rounds. For validation AUPRC (Fig 7a), the curve shows a steady increase during the first 20–30 rounds before plateauing, reflecting rapid aggregation of useful local updates and convergence of the distributed optimization. The smooth trajectory without oscillations demonstrates stability under non-IID IoT and EMR data partitions, confirming the robustness of the proposed pipeline in federated environments. For the privacy budget ε (Fig 7b), accumulation is observed as rounds progress under fixed noise and gradient clipping. As expected, ε increases monotonically due to repeated composition, capturing the trade-off between model utility and formal privacy protection. Importantly, the slope remains manageable across training rounds, showing that strong anomaly detection performance can be maintained while staying within practical privacy budgets. Together, these results highlight the pipeline's ability to balance privacy preservation with effective distributed learning.
(a) Validation AUPRC convergence demonstrates stable distributed optimization. (b) Privacy budget accumulation reflects the trade-off between utility and differential privacy guarantees.
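The monotone growth of ε in Fig 7b follows from composition across rounds. Basic sequential composition, in which per-round costs add linearly, is the simplest sketch of this; practical systems use tighter accountants (e.g. Rényi DP), but the qualitative trend is the same. The per-round budget values below are placeholders.

```python
def basic_composition(eps_per_round, delta_per_round, rounds):
    """Basic sequential composition of (epsilon, delta)-DP: after T
    rounds at fixed noise and clipping, the cumulative budget is at most
    (T * eps, T * delta).  Tighter accountants give smaller totals."""
    return eps_per_round * rounds, delta_per_round * rounds

# Cumulative epsilon over 100 federated rounds at 0.05 per round:
budgets = [basic_composition(0.05, 1e-6, t)[0] for t in range(1, 101)]
# Strictly increasing, as repeated composition requires.
```

The "manageable slope" noted in the text corresponds to choosing the per-round noise so that the accumulated ε stays within an acceptable total budget over the full training horizon.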
Fig 8 illustrates the robustness and reliability of the proposed pipeline under adversarial settings. Robust accuracy (Fig 8a) is measured across federated rounds when p = 0.2 of the clients are malicious. Despite poisoned gradient updates, the pipeline maintains stable and high accuracy throughout training, whereas conventional baselines degrade more sharply. This confirms the effectiveness of secure aggregation and regularization strategies in mitigating adversarial influence, ensuring trustworthy anomaly detection in distributed healthcare environments. Expected Calibration Error (ECE) (Fig 8b) evaluates the alignment between predicted probabilities and true outcome frequencies. The steady reduction of ECE over epochs shows that the proposed framework not only improves accuracy but also produces better-calibrated confidence scores. This property is critical in healthcare cybersecurity, where reliable probability estimates enable risk-aware decision-making and reduce false alarms or missed detections. Together, these results highlight that the proposed pipeline can sustain robust and well-calibrated performance under real-world adversarial conditions.
(a) Defense mechanisms sustain accuracy under poisoning. (b) Probability calibration improves steadily, enabling reliable risk-aware decision-making.
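The Expected Calibration Error tracked in Fig 8b has a standard formulation: predictions are binned by confidence, and the gaps between each bin's mean confidence and its empirical accuracy are averaged with bin-size weights. The equal-width binning below is one common choice; binning schemes vary.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """ECE: bin predicted probabilities into `n_bins` equal-width
    confidence bins, then sum |mean confidence - empirical accuracy|
    per bin, weighted by the fraction of samples in that bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p = 1.0 falls in the last bin
        bins[idx].append((p, y))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        conf = sum(p for p, _ in b) / len(b)
        acc = sum(y for _, y in b) / len(b)
        ece += len(b) / len(probs) * abs(conf - acc)
    return ece

# Predictions at 0.9 confidence that are right 90% of the time are well
# calibrated (ECE near 0); 0.99-confidence predictions right only half
# the time are overconfident and score a large ECE.
```

A falling ECE curve therefore means the detector's probability outputs are becoming trustworthy enough to drive risk-aware alerting thresholds, which is the clinical argument the text makes.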
5. Discussion
The results presented in this study highlight the novelty and effectiveness of the proposed secure distributed machine learning pipeline for cyber-resilient healthcare systems. Unlike conventional centralized architectures, our framework integrates federated optimization, split learning, and privacy-preserving mechanisms into a unified end-to-end pipeline tailored for dual-domain data sources—IoT patient monitoring streams and electronic medical records (EMRs). This multi-layer design ensures that both temporal signals from IoT devices and structured patient attributes from EMRs are utilized in a complementary fashion, enabling more reliable detection of cyber anomalies. A particularly novel aspect lies in the cross-layer contrastive alignment module, which encourages consistent representation learning across heterogeneous modalities, while temporal consistency regularization stabilizes IoT-based anomaly scores. Together with robust aggregation against adversarial clients, these elements position the framework as an advance beyond prior work on healthcare intrusion detection.
The implications of these findings extend beyond raw detection performance. From the quantitative evaluation, the pipeline consistently achieved higher precision–recall performance compared to both classical and deep learning baselines. This indicates that the proposed method is well suited to high-stakes domains such as intensive care monitoring, where false alarms are costly and missed detections are unacceptable. The resilience analysis further demonstrated that the system can maintain stability even when up to 30% of participating clients were compromised, showing its practicality for deployment in federated hospital networks where trust cannot always be assumed. Moreover, the privacy experiments underscore the feasibility of training high-performing models under differential privacy constraints, an important requirement in healthcare given regulatory frameworks such as HIPAA and GDPR. The reduced latency in anomaly detection suggests that the pipeline not only improves accuracy but also shortens response time, which is critical in cyberattack scenarios where seconds may determine patient safety.
Despite these advances, the work is not without limitations. The evaluation relied on two representative healthcare-oriented datasets, which, while diverse, may not capture the full complexity of real-world hospital networks. The simulated federated environment assumed synchronous client participation with simplified communication constraints, whereas real deployments often face asynchronous updates, highly heterogeneous devices, and unreliable network connectivity. Furthermore, while the privacy-preserving mechanisms were effective, they inevitably introduced trade-offs between accuracy and privacy budgets, highlighting the tension between utility and regulatory compliance. The adversarial scenarios tested, although varied, were still controlled; sophisticated adaptive attackers might exploit vulnerabilities that were not explicitly modeled in this study.
Looking forward, several directions remain open for exploration. First, expanding the evaluation to larger-scale, multi-institutional datasets would strengthen the generalizability of the results. Second, integration of adaptive aggregation rules that dynamically adjust to varying threat landscapes could further enhance robustness. Third, developing methods for continuous learning in dynamic healthcare environments, where data distributions evolve over time, would reduce the need for frequent retraining. Another promising avenue is the incorporation of blockchain-based audit trails to reinforce trust in update exchanges between federated clients. Finally, exploring lightweight model compression and deployment strategies will be essential for supporting real-time operation on resource-constrained IoT devices. Addressing these challenges can make the pipeline more resilient, scalable, and practical, ultimately advancing the safe integration of machine intelligence into modern healthcare systems.
6. Conclusions
In this work, we proposed a secure distributed machine learning pipeline designed to enhance the cyber-resilience of healthcare systems by combining federated optimization, split learning, robust aggregation, and differential privacy into a unified architecture capable of handling both IoT-based patient monitoring data and structured electronic medical records. Through comprehensive experimentation, the framework demonstrated superior detection accuracy, precision–recall balance, and robustness under adversarial conditions compared to baseline models, while also preserving patient privacy through clipping, noise addition, and secure aggregation protocols. The integration of temporal consistency regularization and cross-layer contrastive alignment further reinforced the ability of the system to capture temporal dynamics and cross-modal dependencies, ensuring more stable and generalizable representations. By evaluating resilience against poisoning, communication dropouts, and latency constraints, we established that the pipeline not only improves detection performance but also remains reliable under real-world stressors, making it a strong candidate for deployment in critical healthcare infrastructure. While the study was conducted on representative datasets and within a controlled federated environment, the implications are significant, as the framework offers a path forward for deploying AI solutions that are secure, privacy-aware, and adaptable to the regulatory and operational challenges of healthcare. We acknowledge existing limitations, such as reliance on specific datasets and simplified communication assumptions, but believe that future work involving larger-scale deployments, adaptive aggregation strategies, and continuous learning protocols can further advance the resilience and scalability of the system. Future research will focus on four key directions. 
First, we plan to extend this work to real-world hospital deployments using live streaming EMR and IoT monitoring data to evaluate real-time inference stability and system latency under practical constraints. Second, we aim to explore personalized federated learning strategies to better address data heterogeneity across hospitals and departments, reducing model drift in cross-institutional deployments. Third, we intend to investigate stronger privacy mechanisms such as adaptive differential privacy and federated machine unlearning to support GDPR-compliant data withdrawal and long-term privacy guarantees. Finally, we plan to integrate trust-aware distributed defense mechanisms, such as blockchain-based audit trails and secure client reputation systems, to further harden the pipeline against insider threats and advanced adversarial attacks. Overall, the contributions of this research demonstrate that distributed, privacy-preserving, and adversarially robust machine learning is not only feasible but essential for protecting modern healthcare systems against evolving cyber threats, and our pipeline represents a meaningful step toward that vision.