A poisson flow-based data augmentation and lightweight diagnosis framework for imbalanced rolling bearing faults

  • Xin Liu,

    Roles Conceptualization, Writing – original draft, Writing – review & editing

    Affiliation CHN Energy BaoRiXiLe Energy Co., Ltd., Hulunbuir, China

  • Han Wang,

    Roles Project administration, Resources, Software, Supervision

    Affiliation CHN Energy BaoRiXiLe Energy Co., Ltd., Hulunbuir, China

  • Zhiyong Du,

    Roles Formal analysis, Funding acquisition, Investigation, Methodology

    Affiliation CHN Energy BaoRiXiLe Energy Co., Ltd., Hulunbuir, China

  • Xu Xu,

    Roles Data curation, Resources, Software, Writing – original draft

    Affiliation CHN Energy BaoRiXiLe Energy Co., Ltd., Hulunbuir, China

  • Bo Song

    Roles Conceptualization, Data curation, Validation, Visualization, Writing – original draft

    songbo1984923@163.com

    Affiliation CCTEG Shenyang Engineering Company, Shenyang, China

Abstract

Accurate diagnosis of rolling bearing faults is vital for the safe operation of rotating machinery. However, real-world fault datasets often suffer from severe class imbalance, which hinders the performance of deep learning models. To address this challenge, we propose PFRNet, a novel diagnostic framework integrating a Poisson Flow-based generative model with a lightweight residual network. Raw vibration signals are transformed into time-frequency representations via CWT to capture non-stationary fault features. The Poisson generative mechanism models sample evolution in high-dimensional latent space to synthesize realistic minority-class samples by learning statistical distributions of real data, mitigating imbalance. These augmented datasets are subsequently classified using an efficient residual network designed for robust feature extraction with minimal complexity. Experiments on the CWRU benchmark demonstrate that PFRNet outperforms state-of-the-art methods in diagnostic accuracy, robustness, and generalization across various imbalance scenarios. Quantitative evaluations further confirm that the generated samples closely resemble real data in both quality and diversity, supporting the effectiveness of the proposed method. The proposed approach offers a promising solution for reliable fault diagnosis under practical, imbalance-prone industrial conditions.

1. Introduction

In the field of high-end equipment manufacturing, condition monitoring and damage identification of core components in rotating machinery have become critical technical challenges for ensuring equipment reliability. As the central hub of power transmission in rotating machinery, the operational performance of rolling bearings directly affects the continuous operation of major equipment such as shield tunneling systems and wind turbines. Bearing components operating under long-term high-load conditions are prone to initiating micro-cracks on the raceway and rolling elements due to the coupled effects of alternating stress and material fatigue, which may evolve into macroscopic spalling failures under vibration and impact [1]. Industrial statistics indicate that approximately one-third of unplanned downtime in rotating machinery can be attributed to progressive damage in bearing components. The sudden onset of such faults often triggers cascading equipment failures, resulting in single-event economic losses potentially reaching millions. Timely and accurate diagnosis of bearing faults is crucial for extending equipment lifespan, reducing maintenance costs, and ensuring operational safety [2].

Vibration signals are the most commonly used and information-rich medium in bearing condition monitoring. By analyzing and processing vibration signals, various abnormal operating states of bearings can be effectively identified [3,4]. Early bearing fault diagnosis relied on manually extracted features combined with traditional classifiers, such as support vector machines (SVM) [5] and random forests (RF) [6]. These methods depend heavily on expert knowledge for feature selection and exhibit limited performance when handling complex signals, thereby constraining their applicability in large-scale and complex environments [7]. In recent years, deep learning has emerged as a mainstream approach due to its capabilities for automatic feature extraction and powerful classification performance [8,9]. Complementarily, nonlinear dynamic analysis methods have also gained attention. Techniques such as the time Poincaré plot index (TPPI) and the enhanced hierarchical Poincaré plot index (EHPPI) have demonstrated strong potential for capturing multiscale and nonlinear characteristics in vibration data, with EHPPI further improving diagnostic accuracy by enabling effective multi-sensor signal fusion [10,11]. Li et al. [12] proposed the KANs-CNN-FAN fusion network to extract nonlinear features from bearing signals. Tang et al. [13] developed a convolutional structure based on fusion units that can extract multi-scale features from signals, enabling fault diagnosis with strong generalization ability. Wang et al. [14] proposed a hybrid method combining residual networks and long short-term memory (LSTM) networks, which effectively extracts fault features from bearing signals. Gui et al. [15] introduced the QNN-BiLSTM hybrid model, which integrates the advantages of quadratic neural networks and bidirectional temporal networks to enhance both diagnostic efficiency and model interpretability. Guo et al. [16] designed a compact residual network for cross-domain bearing fault diagnosis by leveraging a teacher network enhanced by DANN-NAM and an IPKD-based quantization-coordinated optimization framework, achieving comparable accuracy with significantly reduced model size.

However, the practical deployment of existing intelligent diagnostic systems still faces significant challenges, primarily due to their strong dependence on the completeness of training data, and their limited generalization capability under domain shifts between different operating conditions. To tackle this challenge, Huang et al. [17] introduced a domain-independent compact boundary learning framework for universal diagnosis domain generation (UDDGN), which addresses arbitrary category shifts and enhances generalization without requiring target domain data. In real-world industrial environments, bearing fault samples are typically scarce, while samples from normal operating conditions are overwhelmingly abundant. This severe class imbalance poses a major challenge for training effective bearing fault diagnosis models [18,19]. Data imbalance tends to bias diagnostic models toward the majority class (i.e., the normal condition), resulting in low recognition rates for minority class faults, reduced generalization performance, and an increased risk of overfitting and misclassification. This issue is particularly pronounced in deep learning models, where neural networks tend to focus on learning the dominant class features while neglecting the subtle variations of minority classes, thereby compromising the sensitivity and accuracy of fault detection [20]. Therefore, effectively mitigating the impact of data imbalance and enhancing the discriminative ability for minority classes remain key bottlenecks in achieving high-performance bearing fault diagnosis.

To address the issue of data imbalance, traditional approaches primarily include resampling techniques (oversampling and undersampling), cost-sensitive learning, and threshold adjustment methods. For example, the Synthetic Minority Oversampling Technique (SMOTE) generates new samples by interpolating within the neighborhood of minority class instances to augment the dataset [21,22]. However, conventional oversampling methods often introduce redundant or noisy samples, leading to insufficient data diversity and limiting model improvement [23]. In contrast, undersampling may result in the loss of critical information, thereby degrading overall model performance. Additionally, cost-sensitive model adjustments often require manually specified parameters, making them difficult to adapt to complex operating conditions [24]. Overall, it is difficult for traditional data augmentation methods to generate samples with similar distribution to the original data.

Data augmentation has become an effective way to address the lack of fault samples in intelligent diagnosis tasks [25]. Among these methods, GAN-based techniques are widely explored for their sample generation capabilities [26]. For instance, Yang et al. [26] proposed a stacked contractive autoencoder combined with an auxiliary classifier GAN (VSC-ACGAN), which improves generation quality and diagnostic accuracy on imbalanced bearing datasets. Liu et al. [27] proposed ACGAN-SG, which integrates spectral normalization and gradient penalty into GANs to generate high-quality samples and reduce class imbalance. Similarly, Qin et al. [28] developed WGAN-GP to improve training stability by penalizing the discriminator’s gradients. Wang et al. [29] enhanced CGAN with spectral normalization to prevent mode collapse and stabilize the training process. To improve generation quality, several studies introduced domain-specific constraints. Ruan et al. [30] incorporated frequency-domain envelope spectrum error into the GAN loss function, guiding the model to generate more realistic fault features. Wang et al. [31] proposed M-D2GAN, which uses a dual-discriminator design and a class-specific generation strategy to handle diverse monitoring data. An additional discriminator was also introduced to address imbalance in normal samples. Other approaches focused on improving interpretability and diversity. Liu et al. [32] designed an interpretable variational autoencoder with sequential attention (AVAE-SQA) to augment imbalanced bearing datasets. Fusion-based models have also shown promise. Zhu et al. [33] presented DWCVAE-DFL, a dynamically weighted conditional variational autoencoder tailored for 3D blade tip clearance signals. Zhu et al. [34] combined Digital Twin technology with C-DCGAN to constrain generation and improve sample fidelity. Yu et al. [35] proposed a method that integrates physical modeling with a CycleGAN variant. By incorporating BiLSTM and multi-head attention, their model can generate fault samples even without real data. Wang et al. [36] used GANs with CNNs to generate balanced datasets for wind turbine SCADA data, enhancing diagnostic accuracy. More recently, Yu et al. [37] introduced ReF-DDPM, a conditional diffusion model that uses noise prediction and label-guided constraints to generate high-quality samples. This method improves both the accuracy and generalization of diagnosis under imbalanced conditions. Similarly, Wang et al. [38] proposed a method combining an improved DDPM with 1D-CNN to enhance small-sample training, effectively improving fault diagnosis performance. Despite these advances, many generative models still focus too heavily on dominant patterns. As a result, they often fail to capture subtle but critical features in real-world data. This limits the diversity and representational capacity of synthetic samples, which makes it difficult to replicate the complexity of actual industrial fault scenarios.

To address imbalanced bearing fault data, we propose PFRNet, a framework based on the Poisson Flow generative model. This approach augments fault data by synthesizing samples that match the statistical patterns of real faults, enabling accurate diagnosis under balanced conditions. Continuous wavelet transform (CWT) is employed to convert the raw vibration signals collected from sensors into time-frequency representations. The Poisson Flow generative model is then used to augment fault samples by generating time-frequency representations that closely match the feature distribution of the original data. Finally, the generated fault samples are combined with real data to construct a balanced dataset, facilitating effective fault diagnosis under data-balanced conditions. The overall methodological framework is illustrated in Fig 1. Experimental results show that, compared with existing approaches, the proposed method achieves higher diagnostic accuracy and effectively handles varying degrees of data imbalance.

Fig 1. Flowchart of the fault diagnosis based on the proposed PFRNet.

https://doi.org/10.1371/journal.pone.0332994.g001

The main contributions of this paper can be summarized as follows.

  1. A Poisson Flow-based spectrogram generation strategy is proposed to produce high-quality fault samples that preserve essential time-frequency structures and alleviate data imbalance.
  2. A compact residual network is developed to efficiently extract both local and global time-frequency features, enhancing diagnostic accuracy with minimal computational cost.
  3. An end-to-end framework integrates preprocessing, sample generation, and classification, enabling joint optimization for improved data augmentation and robust fault diagnosis.

The rest of this paper is organized as follows. Section 2 introduces the basic theoretical background. Section 3 presents experimental verification of the proposed approach. In Section 4, results are analyzed and compared with state-of-the-art methods. Section 5 presents the conclusion and future research prospects.

2. Theoretical Background

2.1. Vibration signal collection and preprocessing

In practical operation, rolling bearings are often affected by factors such as load fluctuations, structural nonlinearity, and installation misalignment, leading to non-stationary vibration signals whose frequency components vary over time. This non-stationarity reduces the discriminative capability of traditional time-domain or frequency-domain analysis methods in fault pattern recognition. To more effectively capture the underlying time-varying characteristics of the signals, this study employs Continuous Wavelet Transform (CWT) for time-frequency analysis, enabling dynamic spectral representation of the vibration data. The process of time-frequency feature extraction is illustrated in Fig 2.

Fig 2. Process for the conversion of time-frequency feature.

https://doi.org/10.1371/journal.pone.0332994.g002

CWT exhibits excellent time-frequency localization properties, making it particularly suitable for analyzing impact or transient signals. In this study, the Morlet wavelet, which resembles the shape of impact responses, is selected as the mother wavelet to enhance sensitivity to local features. CWT performs convolution of the signal across different time shifts and scale parameters, enabling a multi-resolution representation of the signal structure. Its mathematical formulation is given as follows:

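$$W_x(s, \tau) = \frac{1}{\sqrt{s}} \int_{-\infty}^{+\infty} x(t)\, \psi^{*}\!\left(\frac{t - \tau}{s}\right) dt \tag{1}$$

where $x(t)$ denotes the original signal, $\tau$ and $s$ represent the time shift and scale parameters, respectively, and $\psi^{*}(t)$ is the complex conjugate of the selected wavelet basis function.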

The Morlet wavelet used in this study is defined as follows:

$$\psi(t) = \pi^{-1/4}\, e^{i \omega_0 t}\, e^{-t^{2}/2} \tag{2}$$

where $\pi$ is the constant pi, $i$ is the imaginary unit, $\omega_0$ is the central angular frequency of the Morlet wavelet, and $t$ represents the time variable.

Through this transformation, the one-dimensional time-series signal is mapped into a two-dimensional time-frequency representation, where the horizontal axis corresponds to time, the vertical axis to frequency, and the pixel intensity indicates energy magnitude. The resulting time-frequency image simultaneously captures local temporal characteristics and frequency distribution features, offering a more discriminative input representation for deep network models.
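The transformation described above can be sketched numerically as follows. This is a minimal illustration rather than the preprocessing pipeline used in this study: the function names `morlet` and `cwt_morlet`, the central frequency ω0 = 6, the analysis frequencies, and the synthetic 100 Hz test tone are all assumptions introduced for the example, and the resizing of the result to a 64 × 64 × 3 image is omitted.

```python
import numpy as np


def morlet(t, omega0=6.0):
    """Complex Morlet wavelet pi^(-1/4) * exp(i*omega0*t) * exp(-t^2 / 2)."""
    return np.pi ** -0.25 * np.exp(1j * omega0 * t) * np.exp(-0.5 * t ** 2)


def cwt_morlet(x, scales, fs, omega0=6.0):
    """Return |W(s, tau)| with shape (len(scales), len(x))."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = (np.arange(n) - n // 2) / fs          # time grid centred on zero
    tf = np.empty((len(scales), n))
    for k, s in enumerate(scales):
        # Scaled, conjugated wavelet psi*(t / s) / sqrt(s) on the centred grid
        kernel = np.conj(morlet(t / s, omega0)) / np.sqrt(s)
        # Correlating x with the kernel equals convolving with its reversed copy
        coef = np.convolve(x, kernel[::-1], mode="same") / fs
        tf[k] = np.abs(coef)
    return tf


# Synthetic check signal: a 100 Hz tone sampled at 1 kHz (illustrative values)
fs = 1000.0
time = np.arange(1024) / fs
signal = np.sin(2 * np.pi * 100.0 * time)
freqs = np.arange(20.0, 201.0, 10.0)          # analysis frequencies in Hz
scales = 6.0 / (2 * np.pi * freqs)            # s = omega0 / (2 * pi * f)
tf_image = cwt_morlet(signal, scales, fs)
```

Each row of `tf_image` corresponds to one scale (and hence one analysis frequency) and each column to one time instant, so the array can be rendered directly as a time-frequency image. For the test tone, the energy concentrates in the row whose analysis frequency is closest to 100 Hz.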

2.2. Data augmentation process

To address the data imbalance issue caused by minority classes in rolling bearing fault diagnosis, this study introduces a Poisson Flow-based generative model to construct high-quality time-frequency feature samples. By incorporating the Poisson generative mechanism into the generation process of bearing time-frequency representations, the model simulates the evolution of sample trajectories in a high-dimensional latent space, thereby enriching the diversity of minority-class samples and mitigating performance degradation due to class imbalance. The core idea is to model samples as points evolving toward the data manifold through learned trajectory dynamics, resulting in the generation of new data instances that match the statistical distribution of real fault features.

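The Poisson generative model first maps the original time-frequency sample $\mathbf{y} \in \mathbb{R}^{N}$ to a point $\tilde{\mathbf{x}} = (\mathbf{x}, r)$ in a high-dimensional augmented space, where $\mathbf{x}$ is the perturbed sample and $r$ represents an additional perturbation variable. In this space, samples are constructed via a spherically symmetric perturbation kernel to form the conditional probability distribution $p_r(\mathbf{x} \mid \mathbf{y})$, which takes the following form:

$$p_r(\mathbf{x} \mid \mathbf{y}) \propto \frac{1}{\left(\lVert \mathbf{x} - \mathbf{y} \rVert_2^{2} + r^{2}\right)^{(N + D)/2}} \tag{3}$$

where $\mathbf{y}$ denotes a sample drawn from the original data distribution $p(\mathbf{y})$, $r$ represents the perturbation radius, $N$ is the dimensionality of the sample, and $D$ is the augmented dimensionality parameter introduced to enhance the model’s representational capacity.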

To ensure consistency in the shape of the perturbation kernel across different augmented dimensions, the Poisson-based model in this study defines the perturbation radius according to the following proportional scheme:

$$r = \sigma \sqrt{D} \tag{4}$$

where $\sigma$ denotes the standard noise intensity and $D$ is the augmented dimensionality.

This configuration keeps the perturbation kernel scale-aligned across different values of D; specifically, the intermediate perturbed distribution remains statistically consistent. Consequently, the geometric structure of the generation path remains stable and generalizable as the augmented dimensionality changes. As demonstrated by Xu et al. [39], the model achieves an optimal balance between robustness and generation quality when D = 2048, in contrast to the limiting case D → ∞, which corresponds to diffusion models. For a 64 × 64 × 3 image space, 2048 dimensions provide sufficient nonlinear modeling capability while maintaining the stability of the generation path.
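One way to draw perturbed samples from a kernel of this type is sketched below. It assumes the kernel has the form $p_r(\mathbf{x} \mid \mathbf{y}) \propto (\lVert \mathbf{x} - \mathbf{y} \rVert^{2} + r^{2})^{-(N+D)/2}$ with $r = \sigma\sqrt{D}$, and it samples the perturbation norm through a Beta-distributed variable and the direction uniformly on the sphere. The function name `perturb`, the toy dimensionality N = 64, and the value σ = 0.5 are illustrative assumptions, not settings taken from Table 1.

```python
import numpy as np


def perturb(y, sigma, D, rng):
    """Draw x ~ p_r(x | y) for a batch y of shape (batch, N), with r = sigma * sqrt(D)."""
    y = np.asarray(y, dtype=float)
    batch, N = y.shape
    r = sigma * np.sqrt(D)
    # Norm of the perturbation: u ~ Beta(N/2, D/2), then R = r * sqrt(u / (1 - u))
    u = rng.beta(N / 2.0, D / 2.0, size=batch)
    R = r * np.sqrt(u / (1.0 - u))
    # Direction of the perturbation: uniform on the unit sphere in R^N
    g = rng.standard_normal((batch, N))
    direction = g / np.linalg.norm(g, axis=1, keepdims=True)
    return y + R[:, None] * direction, r


rng = np.random.default_rng(0)
clean = np.zeros((4000, 64))                  # toy "clean" samples with N = 64
noisy, radius = perturb(clean, sigma=0.5, D=2048, rng=rng)
```

With D much larger than N, the per-coordinate variance of the perturbation is close to σ², which illustrates why tying r to σ through the square root of D keeps the noise scale comparable across different choices of D.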

The generation process in the Poisson-based model can be interpreted as the evolution of samples along the streamlines of a normalized electric field, with the direction given by the following integral form:

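$$\mathbf{E}(\tilde{\mathbf{x}}) = \frac{1}{S_{N+D-1}(1)} \int \frac{\tilde{\mathbf{x}} - \tilde{\mathbf{y}}}{\lVert \tilde{\mathbf{x}} - \tilde{\mathbf{y}} \rVert^{N+D}}\, p(\mathbf{y})\, d\mathbf{y} \tag{5}$$

where $\mathbf{E}(\tilde{\mathbf{x}})$ represents the electric field direction in the augmented space, $\tilde{\mathbf{x}}$ denotes the current position in this space, $\tilde{\mathbf{y}} = (\mathbf{y}, 0)$ is the augmented form of a real sample $\mathbf{y} \in \mathbb{R}^{N}$ drawn from $p(\mathbf{y})$, and $S_{N+D-1}(1)$ is the surface area of the unit $(N+D-1)$-sphere.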

However, due to the difficulty of explicitly solving the electric field function in high-dimensional space, a neural network is introduced to approximate its direction. The training objective is to make the network output closely align with the sample perturbation direction, and the corresponding loss function is defined as follows:

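$$\mathcal{L}(\theta) = \mathbb{E}_{r}\, \mathbb{E}_{\mathbf{y} \sim p(\mathbf{y})}\, \mathbb{E}_{\mathbf{x} \sim p_r(\mathbf{x} \mid \mathbf{y})} \left\lVert f_\theta(\tilde{\mathbf{x}}) - \frac{\tilde{\mathbf{x}} - \tilde{\mathbf{y}}}{r / \sqrt{D}} \right\rVert_2^{2} \tag{6}$$

where $f_\theta(\tilde{\mathbf{x}})$ represents the direction vector predicted by the model, and $(\tilde{\mathbf{x}} - \tilde{\mathbf{y}})/(r/\sqrt{D})$ denotes the normalized theoretical perturbation direction, with $\tilde{\mathbf{x}} = (\mathbf{x}, r)$ the perturbed point and $\tilde{\mathbf{y}} = (\mathbf{y}, 0)$ the augmented real sample.

By minimizing the directional loss, the network learns to backtrack the perturbed sample $\tilde{\mathbf{x}}$ to its center point $\tilde{\mathbf{y}}$, thereby establishing a reverse flow path from the sample to the target manifold.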

After training is completed, the Poisson-based model can initiate from a perturbed point far from the data manifold and perform reverse integration along the learned vector field to generate new samples. By integrating the trajectory of an ordinary differential equation (ODE), the process flows back toward the data manifold, enabling sample synthesis. The sampling formula is as follows:

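$$\tilde{\mathbf{x}}_{k+1} = \tilde{\mathbf{x}}_{k} - \eta\, \frac{f_\theta(\tilde{\mathbf{x}}_{k})}{\lVert f_\theta(\tilde{\mathbf{x}}_{k}) \rVert_2} \tag{7}$$

where $\tilde{\mathbf{x}}_{k}$ denotes the current sampling state, $\eta$ represents the step size that controls the scale of each reverse step, and $f_\theta(\tilde{\mathbf{x}}_{k}) / \lVert f_\theta(\tilde{\mathbf{x}}_{k}) \rVert_2$ indicates the normalization of the directional vector.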
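The reverse integration can be illustrated with a simple explicit Euler loop. The sketch below assumes each step moves a fixed distance against the normalized direction vector; the trained network is replaced by a hypothetical toy field generated by a single point charge, and the names `reverse_flow`, `toy_field`, and all numerical values are illustrative only.

```python
import numpy as np


def reverse_flow(x0, field, step, n_steps):
    """Euler integration of x_{k+1} = x_k - step * f(x_k) / ||f(x_k)||."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_steps):
        f = field(x)
        norm = np.linalg.norm(f)
        if norm < 1e-12:          # field vanishes: nothing left to follow
            break
        x = x - step * f / norm   # step against the normalized direction
    return x


# Toy stand-in for the learned field: it points away from a single "data" point
target = np.array([1.0, -2.0, 0.0])


def toy_field(x):
    return x - target


start = np.array([30.0, 40.0, 50.0])          # far from the data point
sample = reverse_flow(start, toy_field, step=0.5, n_steps=200)
```

Because every step has the same length, the trajectory approaches the data point at a constant rate and then stays within one step size of it. A real sampler would replace `toy_field` with the trained network and would typically use a smaller step or a higher-order ODE solver.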

Following the Poisson Flow formulation described above, the main parameters of the data generation module are listed in Table 1.

Table 1. The key parameters of the Poisson generation module.

https://doi.org/10.1371/journal.pone.0332994.t001

2.3. Fault diagnosis model

In fault feature recognition tasks, although deep networks possess strong feature extraction capabilities, they often encounter performance bottlenecks during training due to gradient vanishing or degradation issues. To address this, this study employs a ResNet architecture, centered on the residual mechanism, as the classifier. This effectively mitigates instability during deep model training and enhances the network’s capacity to model high-dimensional time-frequency features. By introducing identity mapping paths, the residual network allows input features to propagate directly across layers, enabling efficient sharing of critical information at multiple levels, thereby improving the model’s sensitivity to variations in bearing fault patterns.

The fundamental building block of the network is the residual unit, whose core idea is to explicitly learn the difference (residual) between the input and output, rather than directly fitting a complex mapping. For the $l$-th layer, its structure can be formally expressed as:

$$\mathbf{y}_l = \mathbf{x}_l + \mathcal{F}(\mathbf{x}_l, W_l) \tag{8}$$

$$\mathbf{x}_{l+1} = \mathrm{ReLU}(\mathbf{y}_l) \tag{9}$$

where $\mathbf{x}_l$ denotes the input to the $l$-th layer, and $\mathcal{F}(\cdot)$ represents a convolution-normalization-activation composite module with learnable parameters. $W_l$ refers to the set of learnable weights associated with the $l$-th composite module. The intermediate output $\mathbf{y}_l$ is the residual sum of the input and the transformed features, which is then passed through a ReLU activation to obtain the output $\mathbf{x}_{l+1}$ of the current layer.
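The residual unit can be sketched in a few lines. In this simplified illustration the convolution-normalization-activation module is replaced by two dense matrices, which is an assumption made only to keep the example short; the function names and weight shapes are hypothetical and do not correspond to the layers in Table 2.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def residual_unit(x, W1, W2):
    """Residual sum followed by ReLU, with a two-layer dense stand-in for F."""
    f = relu(x @ W1) @ W2     # F(x_l, W_l): stand-in for conv-norm-activation
    y = x + f                 # Eq (8): identity path plus residual branch
    return relu(y)            # Eq (9): activation gives the layer output


rng = np.random.default_rng(0)
features = rng.standard_normal((4, 16))       # batch of 4 feature vectors
W1 = 0.1 * rng.standard_normal((16, 16))
W2 = 0.1 * rng.standard_normal((16, 16))
out = residual_unit(features, W1, W2)
```

When the weights of the residual branch are zero, the unit reduces to `relu(x)`, which shows how the identity path lets input features pass through unchanged even if the branch has learned nothing.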

The introduction of residual paths allows the model to learn the residual function via skip connections, preventing the progressive degradation of information in deep networks. To accommodate the time-frequency image inputs generated in this study, a lightweight modification of the original ResNet architecture was implemented. The network structure parameters are shown in Table 2. Detailed descriptions of key training hyperparameters, such as the optimizer, learning rate, and batch size, are provided in Table 3. While maintaining the core residual stacking logic, the number of channels in the initial convolutional kernel was reduced, and excessively deep convolutional layers before the global average pooling were removed to avoid feature loss caused by overfitting or excessive dimensionality reduction. The network output is passed through a Softmax layer to obtain the predicted probabilities for each class, and training is conducted using the standard multi-class cross-entropy loss function:

Table 2. Structural parameters of fault diagnosis module.

https://doi.org/10.1371/journal.pone.0332994.t002

Table 3. Hyperparameter Configuration for PFRNet Classification Model Training.

https://doi.org/10.1371/journal.pone.0332994.t003

$$\mathcal{L}_{CE} = -\frac{1}{M} \sum_{i=1}^{M} \sum_{c=1}^{C} y_{i,c} \log \hat{y}_{i,c} \tag{10}$$

where $y_{i,c}$ denotes the one-hot encoded ground truth label, and $\hat{y}_{i,c}$ represents the predicted confidence that sample $i$ belongs to class $c$. Here, $M$ is the total number of training samples in a batch, and $C$ denotes the number of fault categories. The indices $i$ and $c$ iterate over samples and classes, respectively. By optimizing this loss function in an end-to-end manner, the network can effectively extract critical discriminative features embedded in the time-frequency images, enabling accurate classification and identification of various fault conditions.
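As a small worked example, the loss can be computed directly from raw network outputs. The sketch assumes the batch-averaged form of the cross-entropy and uses integer class indices in place of one-hot vectors, which is equivalent because only the true-class term of the inner sum is non-zero; the function names are illustrative.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(batch_logits, labels):
    """Mean of -log(predicted probability of the true class) over the batch."""
    loss = 0.0
    for logits, c in zip(batch_logits, labels):
        loss -= math.log(softmax(logits)[c])
    return loss / len(labels)


# A ten-class model that is completely undecided about one sample
undecided = cross_entropy([[0.0] * 10], [3])
```

For ten classes, an undecided prediction assigns probability 0.1 to the true class, so the loss equals log(10); any loss below that value indicates the model has learned something about the class.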

3. Experiment

The experiments were run with Python 3.8.10 and the PyTorch library. The testbed is equipped with an Intel Core i9-14900KF CPU, a GeForce RTX 4090D GPU, and 64 GB of RAM.

3.1. Data presentation

The Case Western Reserve University (CWRU) bearing dataset is a widely used benchmark in the field of mechanical fault diagnosis, primarily employed to evaluate the performance of bearing fault detection algorithms. The dataset was collected using the experimental platform shown in Fig 3, and detailed technical specifications of the bearings are provided in Appendix Table 4. The experimental data include vibration signals of the drive-end bearing under normal operating conditions, as well as characteristic vibrations under three typical fault modes: inner race fault, outer race fault, and rolling element fault. For each fault type, three different defect sizes were introduced to simulate varying severities of pitting damage, ranging from mild to severe.

3.2. Data partitioning

To investigate the applicability of bearing fault diagnosis under imbalanced sample distribution scenarios, this study employs data augmentation techniques to mitigate class imbalance in the original dataset. The dataset is divided into training, validation, and test sets in a ratio of 10:1:1.

In the experimental design, we define a dataset imbalance ratio (see Eq (11)) to quantify varying degrees of sample scarcity, assuming that the imbalance level is consistent across all fault categories. Synthetic samples are used exclusively during the training and validation phases, while the test set consists entirely of original data with an equal number of samples for each fault category. Model training is conducted using the mixed dataset for parameter optimization, and final performance evaluation is carried out on an independent test set. Table 5 provides the specific sample distribution across subsets when the imbalance ratio I is set to 0.9.

Table 5. The number of samples when the imbalance degree is 0.9.

https://doi.org/10.1371/journal.pone.0332994.t005

$$I = \frac{N_g}{N_g + N_o} \tag{11}$$

where $N_g$ represents the number of generated samples, and $N_o$ represents the number of original samples.
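The relation between the imbalance ratio and the sample counts can be made concrete with a short calculation. The sketch assumes the ratio is the share of generated samples within an augmented fault class, i.e. I = N_g / (N_g + N_o); the class size of 1000 is a hypothetical figure chosen for illustration and is not taken from Table 5.

```python
def imbalance_ratio(n_generated, n_original):
    """Share of generated samples in an augmented class."""
    return n_generated / (n_generated + n_original)


def split_counts(n_target, ratio):
    """Return (original, generated) counts for a class of n_target samples."""
    n_generated = round(n_target * ratio)
    return n_target - n_generated, n_generated


# Hypothetical class size of 1000 samples at two of the ratios studied
mild = split_counts(1000, 0.8)
severe = split_counts(1000, 0.98)
```

Under this assumption, raising the ratio from 0.8 to 0.98 shrinks the number of real fault samples per class by a factor of ten, which is why the highest ratio is the most demanding setting for the generator.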

3.3. Experimental results

The proposed model was evaluated on a ten-class dataset alongside CNN, VGGNet, AlexNet, and ViT for classification accuracy. For reproducibility, we briefly describe the main architectural settings of all models, including CNN, AlexNet, VGGNet, ViT, and PFRNet. All models were configured to accept 64 × 64 × 3 time–frequency inputs and produce ten-category outputs. Table 6 summarizes the core components of each model, including major layers and structural parameters. All results are averaged over five independent runs with different random seeds (42, 123, 456, 789, 1000) and reported as mean ± standard deviation. Table 7 reports classification accuracy under varying data imbalance levels.

Table 7. Fault diagnosis accuracy of different models (mean ± std over 5 runs with different random seeds).

https://doi.org/10.1371/journal.pone.0332994.t007

Experimental results show that the classification accuracy of all models declines as the class imbalance ratio increases from 0.8 to 0.98, underscoring the persistent challenge of imbalanced data in fault diagnosis. CNN experiences the most significant performance drop, with accuracy decreasing from 90.10% ± 1.20% to 84.20% ± 2.10%, indicating its high sensitivity to class distribution skew. In contrast, AlexNet and VGGNet exhibit greater resilience. At an imbalance ratio of 0.98, AlexNet achieves 94.00% ± 1.30%, while VGGNet reaches 95.20% ± 1.05%, reflecting the advantages of deeper convolutional architectures in handling data imbalance and random variation. ViT, a Transformer-based classifier, consistently outperforms VGGNet across all scenarios, with accuracy decreasing modestly from 98.80% ± 0.50% at 0.8 to 95.60% ± 0.95% at 0.98. Its lower standard deviations further highlight strong generalization capability.

The proposed PFRNet attains the highest accuracy across all imbalance levels, achieving 99.40% ± 0.32% at a ratio of 0.8 and 96.40% ± 0.70% at 0.98. Notably, it yields the smallest standard deviations across all settings, indicating strong robustness to both initialization and data variation. These results suggest that combining Poisson Flow-based augmentation with a lightweight residual network allows PFRNet to address class imbalance more effectively than conventional CNNs and Transformer-based models.

To further validate the training stability of PFRNet, Fig 4 displays training and validation accuracy curves under imbalance ratios of 0.8 and 0.98. In both cases, the model converges rapidly within 30 epochs, with training and validation curves increasing synchronously and maintaining narrow gaps. This demonstrates that the selected hyperparameters (Table 3) enable stable learning while avoiding overfitting under different imbalance conditions.

Fig 4. Training and validation accuracy curves of PFRNet under different imbalance ratios.

https://doi.org/10.1371/journal.pone.0332994.g004

4. Analysis

4.1. Confusion matrix and class-wise metric analysis

Fig 5 illustrates the confusion matrices of the proposed PFRNet model on the test set under different imbalance ratios (0.8, 0.9, 0.96, and 0.98). The results indicate that PFRNet demonstrates strong fault identification capabilities across all ratios. Particularly under milder imbalance (e.g., ratios of 0.8 and 0.9), the model achieves high accuracy for all fault categories, with diagonal elements in the confusion matrix approaching ideal values, suggesting excellent feature extraction and discrimination performance. Even under highly imbalanced conditions (imbalance ratio of 0.98), PFRNet maintains stable recognition for most classes, with only slight misclassifications in minority categories, and still outperforms conventional deep networks overall. These findings highlight the robustness and generalization capability of PFRNet; however, extreme imbalance can still hinder the model’s ability to recognize minority classes, underscoring the importance of data balancing in improving diagnostic performance.

Fig 5. Confusion matrix of PFRNet model test set under different data balance ratios.

https://doi.org/10.1371/journal.pone.0332994.g005

To further evaluate PFRNet’s class-wise performance under imbalanced conditions, Table 8 reports precision, recall, and F1-scores across four imbalance ratios (0.8–0.98). Most fault categories (e.g., B021, IR021, OR014) achieve near-perfect metrics (>98%) across all scenarios, indicating robust feature discrimination. For imbalance-sensitive minority classes (e.g., B007, IR014), slight performance declines are observed under extreme imbalance (ratio = 0.98); however, their F1-scores remain above 93%, confirming effective mitigation of under-recognition. The majority class (Normal) also maintains high stability (F1-score ≥ 97.07%), indicating that PFRNet avoids bias toward dominant categories. These results demonstrate PFRNet’s superior class-specific discriminative capability under imbalanced conditions.
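Class-wise precision, recall, and F1-score follow directly from a confusion matrix. The sketch below assumes the convention that rows index the true class and columns the predicted class; the 2 × 2 matrix is a made-up example and does not reproduce any value from Table 8.

```python
def class_metrics(cm):
    """Per-class (precision, recall, F1); cm[i][j] counts true class i predicted as j."""
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, true class differs
        fn = sum(cm[k]) - tp                        # true k, predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


# Made-up two-class confusion matrix for illustration
example = class_metrics([[9, 1], [2, 8]])
```

Reporting all three values per class matters under imbalance: a class can reach high precision while its recall drops if the model simply stops predicting it, which overall accuracy would not reveal.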

Table 8. Class-wise precision, recall, and F1-score of PFRNet under different imbalance ratios (%).

https://doi.org/10.1371/journal.pone.0332994.t008

Fig 6 displays the confusion matrices for CNN, AlexNet, VGGNet, and ViT at an imbalance ratio of 0.98. As data imbalance increases, all these models exhibit degraded performance. The traditional CNN shows high misclassification rates across categories, with severe confusion between IR007 and IR014, indicating poor recognition of minority classes. Although AlexNet and VGGNet achieve higher accuracy than CNN for some faults, they still misclassify minority categories such as OR014 and B014. ViT, despite some diagonal concentration in its confusion matrix, misclassifies minority cases and lacks consistent stability in predicting them. Notably, PFRNet outperforms all other models across most fault categories, exhibiting a more concentrated diagonal in the confusion matrix, which indicates more accurate and stable predictions. Even under extreme data imbalance, PFRNet accurately identifies typical faults such as IR007, IR021, and OR021, demonstrating strong feature extraction capabilities and adaptability to minority classes. These findings further confirm the robustness and practical value of PFRNet in real-world imbalanced data scenarios.

Fig 6. Confusion matrix for the comparison model test set at imbalance 0.98.

https://doi.org/10.1371/journal.pone.0332994.g006

4.2. Comparison of the effects of different data generation models

Table 9 presents the ablation results that quantify the individual contributions of data augmentation and classifier modules to diagnostic performance under imbalanced conditions. These results clearly demonstrate a positive correlation between diagnostic accuracy and the advancement of data augmentation strategies, underscoring the essential role of high-quality sample generation in mitigating class imbalance.

As the ablation baseline, the “No Augmentation” group (without any data augmentation) reveals that all classifiers suffer performance degradation when trained on raw imbalanced data. For instance, CNN achieves only 72.60% accuracy at an imbalance ratio of 0.98, while ResNet, despite its stronger feature extraction ability, achieves only 83.70% under the same condition. This significant performance gap between the baseline and augmented groups strongly confirms that data augmentation is indispensable for addressing class imbalance.

Among traditional generative approaches, GAN and its variants (DCGAN, BAGAN) enhance diagnostic accuracy over the baseline, but are outperformed by diffusion-based models. DDPM, a foundational diffusion model, achieves 93.90% accuracy with ResNet at an imbalance ratio of 0.98. Its optimized variant, DDIM, further improves this to 95.80% due to more efficient sampling. TransGAN, which integrates Transformer architecture with GAN, outperforms conventional GANs—for example, achieving 94.80% accuracy with ResNet at I = 0.98—by leveraging long-range feature modeling, but still falls short compared to diffusion-based methods.

The proposed PFRNet, as the core ablation target, outperforms all alternative augmentation modules across all imbalance levels, achieving 99.40% accuracy at I = 0.8 and 96.40% at I = 0.98. This superior performance stems from two key components validated by ablation analysis: (1) the Poisson Flow-based generative mechanism, which synthesizes minority-class samples that are statistically consistent with real data; and (2) the lightweight residual network, which efficiently extracts discriminative features from time-frequency representations. These results confirm that PFRNet’s integrated framework provides synergistic benefits beyond those of individual components, demonstrating its effectiveness in addressing imbalanced fault diagnosis.

4.3. Visual analysis of t-SNE features

To provide a more intuitive assessment of PFRNet’s performance in data generation, particularly its capability for class discrimination in high-dimensional feature space and its impact on feature learning for fault diagnosis, this study employs the t-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm to visualize the distribution of both generated and real samples in the feature space. As a nonlinear dimensionality reduction technique, t-SNE effectively preserves local structural relationships among high-dimensional features and is well suited for cluster visualization in complex feature spaces.

As illustrated in Fig 7, the generated samples exhibit strong cohesion with real samples of the same class in the low-dimensional space, and distinct, well-separated clusters are formed among different fault categories. This observation indicates that the samples generated by PFRNet are highly consistent with the real data in terms of feature representation, accurately simulating the original distribution patterns of various fault types and thereby enhancing the generalization capability of the diagnostic model.

Even under extremely imbalanced data conditions (imbalance ratio = 0.98), where slight overlaps among feature points of certain minority classes are observed, the overall clustering structure remains well separated. These findings further confirm the robustness and discriminative capacity of the proposed model in handling severe class imbalance. The generated samples not only effectively augment minority class data but also maintain strong inter-class separability in the feature space.

In summary, the t-SNE visualization results show that the samples generated by PFRNet preserve feature distribution consistency and enhance fault class discriminability.

Fig 7. t-SNE feature visualization of PFRNet-generated data and original data under different imbalance ratios.

https://doi.org/10.1371/journal.pone.0332994.g007
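A joint t-SNE embedding of real and generated features can be sketched with scikit-learn's `TSNE`. The feature arrays, the perplexity value, and the helper name `embed_real_and_generated` below are assumptions made for illustration; the paper does not state its t-SNE settings or feature extractor here.

```python
import numpy as np
from sklearn.manifold import TSNE

def embed_real_and_generated(real_feats, gen_feats, perplexity=10.0, seed=0):
    """Embed real and generated features together and return their 2-D coordinates."""
    # t-SNE has no out-of-sample transform, so both sets are embedded in one run.
    pooled = np.vstack([real_feats, gen_feats])
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=seed)
    coords = tsne.fit_transform(pooled)
    n_real = len(real_feats)
    return coords[:n_real], coords[n_real:]

# Hypothetical 16-dimensional features for two classes (synthetic, illustration only).
rng = np.random.default_rng(0)
real_demo = np.vstack([rng.normal(0, 1, (20, 16)), rng.normal(8, 1, (20, 16))])
gen_demo = np.vstack([rng.normal(0, 1, (20, 16)), rng.normal(8, 1, (20, 16))])
real_2d, gen_2d = embed_real_and_generated(real_demo, gen_demo)
```

Plotting `real_2d` and `gen_2d` with different markers, colored by class, gives the kind of view described for Fig 7. Because t-SNE distorts global distances, cluster overlap in such a plot is best read qualitatively.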

4.4. Quality analysis of generated data

As shown in Fig 8, we compare the time–frequency representations of original samples with those generated via Poisson Flow-based data augmentation across various fault types. For instance, in cases such as B007 and B014, the generated samples exhibit time–frequency feature distributions that closely align with the originals. In terms of energy concentration, temporal distribution of impact signals, and frequency band patterns, the augmented data effectively replicates the key structural characteristics of the original signals. This provides intuitive visual evidence supporting the effectiveness of the Poisson Flow-based augmentation method. By preserving fault-relevant time–frequency features, the generated samples demonstrate strong potential as reliable and informative inputs for imbalanced fault diagnosis tasks.

Fig 8. Comparison of time–frequency representations between original samples and generated samples for different fault types.

https://doi.org/10.1371/journal.pone.0332994.g008
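Time–frequency images like those compared in Fig 8 are obtained by applying a continuous wavelet transform to a vibration signal. The sketch below computes a scalogram with a complex Morlet wavelet in plain NumPy. The mother wavelet, the parameter `w0`, the scale grid, and the test tone are all assumptions for illustration, since the wavelet settings are not given in this section.

```python
import numpy as np

def morlet_scalogram(signal, scales, w0=6.0):
    """Magnitude of a Morlet CWT; rows are scales, columns are time samples.

    Assumes the signal is longer than the widest truncated wavelet.
    """
    signal = np.asarray(signal, dtype=float)
    out = np.empty((len(scales), signal.size))
    for i, s in enumerate(scales):
        half = int(np.ceil(4 * s))                  # truncate at +/- 4 scale widths
        t = np.arange(-half, half + 1) / s          # dimensionless wavelet time
        wavelet = np.pi ** -0.25 * np.exp(1j * w0 * t - 0.5 * t ** 2) / np.sqrt(s)
        # Correlating with the wavelet equals convolving with its reversed conjugate.
        coeffs = np.convolve(signal, np.conj(wavelet)[::-1], mode="same")
        out[i] = np.abs(coeffs)
    return out

# Synthetic tone at 0.5 rad/sample; the response should peak near scale w0 / 0.5.
n = np.arange(1024)
tone = np.sin(0.5 * n)
scales_demo = np.arange(2, 33)
scalogram = morlet_scalogram(tone, scales_demo)
peak_scale = int(scales_demo[np.argmax(scalogram[:, 512])])
```

Applying the same transform and scale grid to an original and a generated signal yields two images that can be compared directly in terms of energy concentration and frequency band patterns.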

To evaluate the generative model’s ability to simulate real fault data, three widely used metrics are adopted: Fréchet Inception Distance (FID), Inception Score (IS), and Maximum Mean Discrepancy (MMD), providing a comprehensive assessment of similarity between real and generated samples.

FID measures the distance between Gaussian distributions fitted to Inception-v3 features of real and generated data:

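$$\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \mathrm{Tr}\left(\Sigma_r + \Sigma_g - 2\left(\Sigma_r \Sigma_g\right)^{1/2}\right) \tag{12}$$

where $\mu$ and $\Sigma$ denote feature means and covariances from the pool3 layer, with subscripts $r$ and $g$ marking real and generated data. Fig 9 shows that PFRNet achieves lower FID, indicating better alignment with real data.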

Fig 9. Comparison of FID values for data generated by different data generation models.

https://doi.org/10.1371/journal.pone.0332994.g009
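The Fréchet distance between two Gaussians fitted to feature sets is the squared distance between the means plus a trace term over the covariances. A minimal NumPy sketch follows. It takes precomputed feature vectors as input (extracting Inception-v3 pool3 features is outside its scope), and the function name and synthetic features are hypothetical.

```python
import numpy as np

def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two (n_samples, n_dims) arrays."""
    feats_real = np.asarray(feats_real, dtype=float)
    feats_gen = np.asarray(feats_gen, dtype=float)
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_g = np.cov(feats_gen, rowvar=False)
    # Tr((Sigma_r Sigma_g)^{1/2}) is the sum of the square roots of the eigenvalues
    # of Sigma_r Sigma_g, which are real and non-negative for covariance matrices
    # up to numerical noise (hence the real part and the clip).
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_g) - 2.0 * tr_sqrt)

# Synthetic 8-dimensional features (illustration only, not Inception features).
rng = np.random.default_rng(0)
feats_demo = rng.normal(size=(200, 8))
fid_same = frechet_distance(feats_demo, feats_demo)
fid_shifted = frechet_distance(feats_demo, feats_demo + 1.0)
```

Shifting every feature by a constant leaves the covariance unchanged, so in that case the distance reduces to the squared norm of the mean shift. This makes a convenient sanity check for any implementation.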

MMD quantifies distributional differences using kernel embeddings:

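$$\mathrm{MMD}^2 = \mathbb{E}_{x, x'}\left[k(x, x')\right] + \mathbb{E}_{y, y'}\left[k(y, y')\right] - 2\,\mathbb{E}_{x, y}\left[k(x, y)\right] \tag{13}$$

where $x$ and $y$ represent real and generated samples, and $k(\cdot, \cdot)$ is the Gaussian kernel. The bandwidth parameter was selected using the median heuristic. As shown in Fig 10, PFRNet attains the lowest MMD in most categories, suggesting strong distributional consistency.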

Fig 10. Comparison of MMD values of data generated by different data generation models.

https://doi.org/10.1371/journal.pone.0332994.g010
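A kernel MMD estimate with a Gaussian kernel and a median-heuristic bandwidth can be sketched as follows. This version uses the biased (V-statistic) estimator and takes the median over pooled pairwise distances. Both choices are assumptions, since the exact estimator and heuristic variant used in the paper are not specified, and the helper names are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between rows of a and rows of b."""
    d = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.maximum(d, 0.0)                        # remove tiny negative round-off

def median_bandwidth(x, y):
    """Median heuristic: median pairwise distance over the pooled samples."""
    pooled = np.vstack([x, y])
    d2 = pairwise_sq_dists(pooled, pooled)
    upper = np.triu_indices_from(d2, k=1)            # distinct pairs only
    return max(float(np.sqrt(np.median(d2[upper]))), 1e-12)

def mmd2_gaussian(x, y, sigma=None):
    """Biased estimate of squared MMD between sample sets x and y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    if sigma is None:
        sigma = median_bandwidth(x, y)
    gamma = 1.0 / (2.0 * sigma ** 2)
    k_xx = np.exp(-gamma * pairwise_sq_dists(x, x))
    k_yy = np.exp(-gamma * pairwise_sq_dists(y, y))
    k_xy = np.exp(-gamma * pairwise_sq_dists(x, y))
    return float(k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean())

# Synthetic samples (illustration only).
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(100, 4))
mmd_near = mmd2_gaussian(x_demo, rng.normal(size=(100, 4)))
mmd_far = mmd2_gaussian(x_demo, x_demo + 3.0)
```

Samples drawn from the same distribution give a value close to zero, while a shifted copy gives a clearly larger value, which is the behavior the comparison in Fig 10 relies on.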

Inception Score (IS) evaluates both the clarity and diversity of generated samples based on the conditional label distribution predicted by Inception-v3. It is computed as:

$$\mathrm{IS} = \exp\left(\mathbb{E}_{x}\left[D_{\mathrm{KL}}\left(p(y \mid x)\,\Vert\,p(y)\right)\right]\right) \tag{14}$$

where $p(y \mid x)$ is the predicted label distribution for a generated sample $x$, and $p(y)$ is the marginal class distribution. IS was calculated using 10 splits, following the standard protocol. The results in Fig 11 show that the proposed model consistently achieves the highest IS across all fault categories, outperforming the competing approaches.

Fig 11. Comparison of IS values for data generated by different data generation models.

https://doi.org/10.1371/journal.pone.0332994.g011
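The Inception Score is the exponential of the average KL divergence between each sample's predicted label distribution and the marginal label distribution, averaged over splits. The sketch below takes a matrix of predicted class probabilities as input. The function name, the `eps` smoothing constant, and the toy probability matrices are assumptions for illustration.

```python
import numpy as np

def inception_score(probs, n_splits=10, eps=1e-12):
    """Mean and standard deviation of the Inception Score over n_splits chunks.

    probs has shape (n_samples, n_classes); each row is a predicted label distribution.
    """
    probs = np.asarray(probs, dtype=float)
    scores = []
    for chunk in np.array_split(probs, n_splits):
        marginal = chunk.mean(axis=0, keepdims=True)          # marginal class distribution
        kl = (chunk * (np.log(chunk + eps) - np.log(marginal + eps))).sum(axis=1)
        scores.append(np.exp(kl.mean()))
    return float(np.mean(scores)), float(np.std(scores))

# Two limiting cases with 10 classes and 100 samples (toy inputs).
uniform_probs = np.full((100, 10), 0.1)                        # uninformative predictions
onehot_probs = np.eye(10)[np.arange(100) % 10]                 # confident and class-balanced
is_uniform, _ = inception_score(uniform_probs)
is_onehot, _ = inception_score(onehot_probs)
```

The two toy cases bracket the metric: uninformative predictions score 1, while confident predictions spread evenly over the classes approach the number of classes.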

In summary, the evaluation results based on the three metrics consistently demonstrate that the proposed PFRNet model outperforms baseline models in terms of sample realism, distributional similarity, and diversity, thereby confirming its effectiveness and robustness in fault sample generation tasks.

5. Conclusions

This study proposes a fault diagnosis method that integrates continuous wavelet transform with a Poisson Flow-based sample generation mechanism to address imbalanced data distribution in rolling bearing fault diagnosis. The method extracts time-frequency features from raw signals using continuous wavelet transform and generates samples for underrepresented fault categories with a Poisson Flow generative model, thereby improving both data balance and sample diversity. Validation on the CWRU bearing dataset demonstrates that the proposed approach can generate high-quality fault samples and improves classification accuracy, stability, and robustness under various imbalance scenarios. These results highlight the potential of the method for practical application and industrial deployment.

It should be noted that the present validation is conducted on the CWRU dataset under controlled laboratory conditions. In real industrial environments, domain shifts and variations in operating conditions, such as fluctuating loads, variable speeds, and environmental noise, may influence the generalizability of the proposed framework. Future research will therefore focus on evaluating its performance on diverse industrial datasets and exploring domain adaptation strategies to enhance robustness in practical deployments.

References

  1. 1. Xu M, Shi Y, Deng M, Liu Y, Ding X, Deng A. An improved multi-scale branching convolutional neural network for rolling bearing fault diagnosis. PLoS One. 2023;18(9):e0291353. pmid:37703236
  2. 2. Yu S, Li Z, Gu J, Wang R, Liu X, Li L, et al. CWMS-GAN: A small-sample bearing fault diagnosis method based on continuous wavelet transform and multi-size kernel attention mechanism. PLoS One. 2025;20(4):e0319202. pmid:40215467
  3. 3. Tao L, Liu H, Ning G, Cao W, Huang B, Lu C. LLM-based framework for bearing fault diagnosis. Mech Syst Signal Process. 2025;224:112127.
  4. 4. Pandiyan M, Babu TN. Systematic Review on Fault Diagnosis on Rolling-Element Bearing. J Vib Eng Technol. 2024;12(7):8249–83.
  5. 5. Zhou J, Xiao M, Niu Y, Ji G. Rolling Bearing Fault Diagnosis Based on WGWOA-VMD-SVM. Sensors (Basel). 2022;22(16):6281. pmid:36016042
  6. 6. Wan L, Gong K, Zhang G, Yuan X, Li C, Deng X. An Efficient Rolling Bearing Fault Diagnosis Method Based on Spark and Improved Random Forest Algorithm. IEEE Access. 2021;9:37866–82.
  7. 7. Wang H, Zhou Z, Zhang L, Yan R. Multiscale Deep Attention Q Network: A New Deep Reinforcement Learning Method for Imbalanced Fault Diagnosis in Gearboxes. IEEE Trans Instrum Meas. 2024;73:1–12.
  8. 8. An Y, Zhang K, Liu Q, Chai Y, Huang X. Rolling Bearing Fault Diagnosis Method Base on Periodic Sparse Attention and LSTM. IEEE Sensors J. 2022;22(12):12044–53.
  9. 9. Hou J, Wu Y, Ahmad AS, Gong H, Liu L. A Novel Rolling Bearing Fault Diagnosis Method Based on Adaptive Feature Selection and Clustering. IEEE Access. 2021;9:99756–67.
  10. 10. Chen F, Zhao Z, Hu X, Liu D, Yin X, Yang J. A nonlinear dynamics method using multi-sensor signal fusion for fault diagnosis of rotating machinery. Adv Eng Inform. 2025;65:103190.
  11. 11. Chen F, Ding C, Hu X, He X, Yin X, Yang J, et al. Tensor Poincaré plot index: A novel nonlinear dynamic method for extracting abnormal state information of pumped storage units. Reliab Eng Syst Safe. 2025;254:110607.
  12. 12. Li Y, Gu X, Wei Y. A Deep Learning-Based Method for Bearing Fault Diagnosis with Few-Shot Learning. Sensors (Basel). 2024;24(23):7516. pmid:39686052
  13. 13. Tang T, Hu T, Chen M, Lin R, Chen G. A deep convolutional neural network approach with information fusion for bearing fault diagnosis under different working conditions. Proc Instit Mech Eng Part C J Mech Eng Sci. 2020;235(8):1389–400.
  14. 14. Wang Y, Cheng L. A combination of residual and long–short-term memory networks for bearing fault diagnosis based on time-series model analysis. Meas Sci Technol. 2020;32(1):015904.
  15. 15. Cui L, Tian X, Wei Q, Liu Y. A self-attention based contrastive learning method for bearing fault diagnosis. Exp Syst Appl. 2024;238:121645.
  16. 16. Guo W, Li X, Shen Z. A lightweight residual network based on improved knowledge transfer and quantized distillation for cross-domain fault diagnosis of rolling bearings. Exp Syst Appl. 2024;245:123083.
  17. 17. Huang Y, Huang W, Hu X, Liu Z, Huo J. UDDGN: Domain-Independent Compact Boundary Learning Method for Universal Diagnosis Domain Generation. IEEE Trans Instrum Meas. 2025;74:1–20.
  18. 18. Huang N, Chen Q, Cai G, Xu D, Zhang L, Zhao W. Fault Diagnosis of Bearing in Wind Turbine Gearbox Under Actual Operating Conditions Driven by Limited Data With Noise Labels. IEEE Trans Instrum Meas. 2021;70:1–10.
  19. 19. Wang H, Zhang X. Fault Diagnosis Using Imbalanced Data of Rolling Bearings Based on a Deep Migration Model. IEEE Access. 2024;12:5517–33.
  20. 20. Jalayer M, Kaboli A, Orsenigo C, Vercellis C. Fault Detection and Diagnosis with Imbalanced and Noisy Data: A Hybrid Framework for Rotating Machinery. Machines. 2022;10(4):237.
  21. 21. Irfan M, Mushtaq Z, Khan NA, Mursal SNF, Rahman S, Magzoub MA, et al. A Scalo Gram-Based CNN Ensemble Method With Density-Aware SMOTE Oversampling for Improving Bearing Fault Diagnosis. IEEE Access. 2023;11:127783–99.
  22. 22. Wang Z, Liu T, Wu X, Liu C. A diagnosis method for imbalanced bearing data based on improved SMOTE model combined with CNN-AM. J Comput Des Eng. 2023;10(5):1930–40.
  23. 23. Mao W, Liu Y, Ding L, Li Y. Imbalanced Fault Diagnosis of Rolling Bearing Based on Generative Adversarial Network: A Comparative Study. IEEE Access. 2019;7:9515–30.
  24. 24. Lyu P, Zheng P, Yu W, Liu C, Xia M. A Novel Multiview Sampling-Based Meta Self-Paced Learning Approach for Class-Imbalanced Intelligent Fault Diagnosis. IEEE Trans Instrum Meas. 2022;71:1–12.
  25. 25. Fan J, Yuan X, Miao Z, Sun Z, Mei X, Zhou F. Full Attention Wasserstein GAN With Gradient Normalization for Fault Diagnosis Under Imbalanced Data. IEEE Trans Instrum Meas. 2022;71:1–16.
  26. 26. Yang Z, Mao R, Ye L, Liu Y, Hu X, Li Y. VSC-ACGAN: bearing fault diagnosis model applied to imbalanced samples. Meas Sci Technol. 2025;36(3):036212.
  27. 27. Liu S, Dou L, Jin Q. Improved generative adversarial network for bearing fault diagnosis with imbalanced data. 2023 6th International Conference on Information Communication and Signal Processing (ICICSP). 2023, p. 343–47.
  28. 28. Qin Z. Improved generative adversarial network for bearing fault diagnosis with a small number of data and unbalanced data. [cited 15 May 2025]. Available from: https://www.mdpi.com/2073-8994/16/3/358
  29. 29. Wang R, Zhang S, Liu S, Liu W, Ding A. A bearing fault diagnosis method for high-noise and unbalanced dataset. SRT. 2022;5(1):28–45.
  30. 30. Ruan D, Song X, Gühmann C, Yan J. Collaborative Optimization of CNN and GAN for Bearing Fault Diagnosis under Unbalanced Datasets. Lubricants. 2021;9(10):105.
  31. 31. Wang C, Liu J, Zio E. A Modified Generative Adversarial Network for Fault Diagnosis in High-Speed Train Components with Imbalanced and Heterogeneous Monitoring Data. JDMD. 2022;:84–92.
  32. 32. Liu Y, Jiang H, Yao R, Zhu H. Interpretable data-augmented adversarial variational autoencoder with sequential attention for imbalanced fault diagnosis. J Manufact Syst. 2023;71:342–59.
  33. 33. Huang X, Zhang X, Xiong Y, Fan B, Dai F. Highly imbalanced fault diagnosis of turbine blade cracks via deep focal dynamically weighted conditional variational auto-encoder network. Adv Eng Inform. 2024;62:102612.
  34. 34. Zhu Y, Cheng J, Liu Z, Zou X, Cheng Q, Xu H, et al. Data generation approach based on data model fusion: An application for rolling bearings fault diagnosis with small samples. IEEE Trans Instrum Measur. 2024 [cited 20 May 2025]. Available from: https://ieeexplore.ieee.org/abstract/document/10764738/
  35. 35. Yu Y, Tang L, Liu Z, Xiang J. A Novel Bearing Fault Data Generation Strategy Combining Physical Modeling and CycleGAN Variant for Fault Diagnosis Without Real Samples. IEEE Trans Instrum Meas. 2025;74:1–17.
  36. 36. Wang H, Li T, Xie M, Tian W, Han W. Wind Turbine Fault Diagnosis with Imbalanced SCADA Data Using Generative Adversarial Networks. Energies. 2025;18(5):1158.
  37. 37. Yu T, Li C, Huang J, Xiao X, Zhang X, Li Y, et al. ReF-DDPM: A novel DDPM-based data augmentation method for imbalanced rolling bearing fault diagnosis. Reliab Eng Syst Safe. 2024;251:110343.
  38. 38. Wang Q, Sun Z, Zhu Y, Li D, Ma Y. A fault diagnosis method based on an improved diffusion model under limited sample conditions. PLoS One. 2024;19(9):e0309714. pmid:39226268
  39. 39. Xu Y, Liu Z, Tian Y, Tong S, Tegmark M, Jaakkola T. PFGM++: Unlocking the potential of physics-inspired generative models. arXiv. 2023.