Abstract
The relentless scaling of integrated circuits (ICs) into the nanoscale regime has intensified critical reliability challenges, such as Negative Bias Temperature Instability (NBTI), which manifests primarily as a progressive shift in the transistor threshold voltage (ΔVth). This study focuses on the prognostics of digital ICs where Metal-Oxide-Semiconductor Field-Effect Transistors (MOSFETs) serve as the fundamental building blocks. Accurate prognostics of device degradation are paramount for predicting circuit lifetime, yet traditional models struggle with the nonlinearity of the degradation process. To address this challenge, a novel fusion architecture is proposed that synergistically combines the Sparrow Search Algorithm (SSA) with a Long Short-Term Memory (LSTM) network. The resulting SSA-LSTM model automates the optimization of crucial LSTM hyperparameters—including the number of hidden units, learning rate, and iteration count—thereby enhancing the capability to learn complex temporal degradation patterns. Empirical Mode Decomposition (EMD) is further integrated as a pre-processing step to denoise the temporal data. This model significantly reduces the average absolute error of prediction, outperforming benchmark models such as PSO and EMD-PSO. It accurately captures the nonlinear trajectory of degradation curves and exhibits high sensitivity for early anomaly detection, providing a high-precision data-driven solution for the reliability evaluation and prediction of digital ICs.
Citation: Qian Y, Liu B, Su B, Zhang C (2025) Intelligence prediction of integrated circuit reliability based on SSA-LSTM fusion architecture. PLoS One 20(12): e0339394. https://doi.org/10.1371/journal.pone.0339394
Editor: Zeheng Wang, Commonwealth Scientific and Industrial Research Organisation, AUSTRALIA
Received: July 30, 2025; Accepted: December 7, 2025; Published: December 31, 2025
Copyright: © 2025 Qian et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: This research was funded by the Jiangsu Province General Project of Philosophy and Social Sciences for Higher Education Institutions (No. 2023SJYB0998).
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
Integrated Circuits (ICs) serve as the fundamental building blocks of the modern digital economy, underpinning advancements in computing, communications, and consumer electronics through their unparalleled capabilities in performance, miniaturization, and efficiency [1–3]. However, as IC feature sizes shrink to the nanoscale, their operational reliability is increasingly threatened by physical degradation mechanisms. Phenomena such as Negative Bias Temperature Instability (NBTI), Hot Carrier Injection (HCI), and Time-Dependent Dielectric Breakdown (TDDB) induce gradual and often irreversible performance shifts, with the threshold voltage drift (ΔVth) of transistors being a primary and critical manifestation [4–6]. The repercussions are severe; for instance, ΔVth drift can lead to timing violations, increased power consumption, and ultimately, functional failure, accounting for a significant portion of field returns in demanding sectors like automotive electronics.
The reliability assessment of ICs at the nanoscale is severely hampered by three fundamental limitations. First, established physical models become inadequate, failing to capture the complex interplay of multiple stress factors. Furthermore, conventional test methods, such as production line burn-in (PLBI), grow prohibitively expensive and inefficient for complex AI Systems-on-Chip (SoCs). Compounding these issues, shallow machine learning techniques lack the capacity to decode the intricate spatiotemporal patterns inherent in heterogeneous reliability data. Among various monitored parameters, the threshold voltage shift (ΔVth) has emerged as a quintessential metric. Its prominence stems from its direct correlation to dominant failure mechanisms like NBTI and HCI [7], its profound impact on circuit performance (governing switching speed and power consumption), and its practical measurability throughout the device lifecycle using standard automated test equipment (ATE) [8,9].
In this context, data-driven methods, particularly deep learning, offer a viable alternative for automatically learning degradation dynamics from massive datasets without requiring explicit physical equations [10–12]. While various data-driven approaches exist for prediction—including physics-informed statistical models like Bayesian updating and Gaussian Process Regression [13,14] and shallow machine learning methods [15,16]—these often fail to fully capture the complex degradation trajectory. Long Short-Term Memory (LSTM) networks are particularly suitable for modeling such temporal sequences due to their inherent architecture. However, their predictive performance remains highly sensitive to hyperparameter configuration [17–19], and manual tuning typically yields suboptimal results while being labor-intensive and dependent on expert intuition.
Recent research highlights the efficacy of hybrid models that integrate optimization algorithms with deep learning. For instance, Hang et al. [20] successfully combined Variational Mode Decomposition (VMD), SSA, and LSTM for safety prediction in aerial refueling, demonstrating the power of signal processing and intelligent optimization. Similarly, Ma et al. [21] developed a Mechanics-Guided Data-Driven model that integrates physical mechanisms with deep learning for high-accuracy prediction of complex geophysical processes. Inspired by these advancements, this study utilizes the Sparrow Search Algorithm (SSA) [22–24] to automatically and adaptively optimize the hyperparameters of the LSTM network. This fusion mitigates the guesswork in model configuration and enhances prediction robustness.
Building on the above literature, this paper proposes a method for effectively predicting transistor threshold voltage drift; the principal innovations of this work are threefold:
- Novel SSA-LSTM Fusion Architecture with Automated Hyperparameter Optimization: Introduces an intelligent framework that synergistically combines the global search capability of the Sparrow Search Algorithm with the superior temporal modeling of Long Short-Term Memory networks, enabling automated optimization of critical hyperparameters for robust IC reliability prediction.
- Precise ΔVth-Centered Degradation Tracking with Physical Foundation: Establishes the threshold voltage shift ΔVth as a direct and physically-grounded core metric, providing a critical linkage between transistor-level aging phenomena and circuit-level performance degradation through integrated mathematical modeling of power-law kinetics and acceleration effects.
- Demonstrated Empirical Superiority with High Prognostic Sensitivity: Validates significant performance advantages through rigorous comparative experiments, showing substantial prediction error reduction on accelerated aging data while maintaining exceptional sensitivity for early anomaly detection in practical intelligent health management systems.
The paper is organized as follows: Sect 2 develops the mathematical foundation for threshold voltage drift modeling, establishing both power-law and acceleration models. Sect 3 details the proposed SSA-LSTM fusion architecture, encompassing data preprocessing techniques, algorithmic fundamentals, and model integration strategies. Sect 4 presents the experimental setup and provides a comprehensive performance analysis through comparative validation. Finally, Sect 5 concludes the work by highlighting key findings and their implications for IC reliability prognostics.
2 Mathematical modeling establishment
2.1 Physics informed degradation modeling
The core of our prognostic approach is not purely data-driven but is informed by well-established physical models of IC degradation. This ensures that the LSTM network learns representations that are physically plausible.
The temporal evolution of the threshold voltage shift, ΔVth, is predominantly described by a power-law model, which is empirically well-established for capturing degradation mechanisms like NBTI [25]:

\[ \Delta V_{th}(t) = A \, t^{n} \tag{1} \]

where t is the stress time, A is a pre-factor or proportionality constant, and n is a dimensionless empirical exponent (fitting parameter).
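As a concrete illustration, the parameters A and n can be estimated from measured ΔVth samples by linear regression in log-log space, since log ΔVth = log A + n log t. The sketch below is in Python with synthetic, noise-free data; the function name and parameter values are illustrative, not from this study's dataset:

```python
import numpy as np

def fit_power_law(t, dvth):
    """Estimate A and n in dVth = A * t**n via least squares in log-log space."""
    log_t, log_v = np.log(t), np.log(dvth)
    n, log_A = np.polyfit(log_t, log_v, 1)   # slope = n, intercept = ln(A)
    return float(np.exp(log_A)), float(n)

# Synthetic, noise-free degradation curve with known A = 2.0, n = 0.25
t = np.arange(1.0, 101.0)                    # stress time, 1..100 hours
dvth = 2.0 * t ** 0.25                       # power-law degradation
A, n = fit_power_law(t, dvth)
```

On real, noisy measurements a robust or weighted fit would be preferable, but the log-log linearization is the standard first step for this model.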
Remark 1: All experiments are conducted under controlled stress conditions, where temperature and gate voltage are kept constant. This avoids the need for multi-variable modeling, which would significantly increase the complexity of the analysis.
Remark 2: Samples collected under different stress conditions are assumed to be independent and comparable. It is further assumed that environmental factors such as humidity, electromagnetic interference, and equipment deviations have been minimized to avoid introducing systematic errors in the measurement of ΔVth.
2.2 Accelerated model of life prediction
The parameter A in the power-law model (1) is not a true constant but a stress-dependent prefactor. Its temperature dependence is classically described by the Arrhenius relation. In this context, A(T) represents the value of the prefactor at a specific temperature T, and is modeled as:

\[ A(T) = A_{0} \exp\!\left( -\frac{E_a}{k_B T} \right) \]

where A0 is a voltage-dependent coefficient, Ea is the activation energy (e.g., 0.1–0.2 eV for interface trap generation and 0.8–1.0 eV for hole trapping in NBTI [26]), kB is Boltzmann’s constant, and T is the absolute temperature.
Similarly, the voltage dependence of the prefactor is frequently described by an exponential model, where the stress voltage V is the gate-to-source voltage applied during the accelerated test [27]:

\[ A_{0}(V) = B \exp\!\left( \gamma V \right) \]

Here, B is a proportionality constant and γ is the voltage acceleration factor, which typically ranges from 1 to 10 depending on the technology and oxide thickness [28].
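Taken together, the Arrhenius and voltage terms let one translate degradation observed under accelerated stress to milder use conditions. A minimal Python sketch of the thermal part, assuming an illustrative activation energy of 0.15 eV (within the interface-trap range quoted above) and hypothetical stress/use temperatures:

```python
import math

K_B = 8.617e-5  # Boltzmann constant in eV/K

def thermal_acceleration(ea_ev, t_stress_k, t_use_k):
    """Arrhenius acceleration factor A(T_stress)/A(T_use)
    = exp( Ea/kB * (1/T_use - 1/T_stress) )."""
    return math.exp(ea_ev / K_B * (1.0 / t_use_k - 1.0 / t_stress_k))

# Illustrative case: Ea = 0.15 eV, stress at 150 C, use at 55 C
af = thermal_acceleration(0.15, 150.0 + 273.15, 55.0 + 273.15)
```

With these illustrative numbers the stress condition ages the device roughly three times faster than the use condition; the voltage term would contribute a further multiplicative factor.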
Integration with Data-Driven Approach: Our SSA-LSTM model does not attempt to directly fit these equations. Instead, it learns the complex, nonlinear functional that maps a sequence of past ΔVth measurements and the corresponding stress conditions (T, V) to future values:

\[ \Delta \hat{V}_{th}(t+1) = f_{\theta}\!\left( \Delta V_{th}(t-k), \ldots, \Delta V_{th}(t); \, T, V \right) \]

where θ are the model parameters optimized by SSA. By training on data from multiple stress conditions, the model inherently learns to generalize the effects encapsulated in the physical acceleration models, enabling predictions under varying operational profiles.
Assumption 1: The failure threshold for chip lifetime prediction is predefined and well-established. It is assumed that the end of the chip’s useful life occurs when the threshold voltage shift exceeds a specified limit (e.g., 50 mV), which should be determined based on established reliability standards or manufacturing specifications.
Assumption 2: The trained LSTM model is assumed to effectively capture the nonlinear temporal evolution of ΔVth and provide accurate short-term predictions. Furthermore, the LSTM outputs are assumed to be compatible with power-law model fitting, enabling integration with acceleration models for lifetime extrapolation. The LSTM hyperparameters optimized by the SSA are presumed to offer near-optimal global solutions with good generalization performance on unseen data.
3 Threshold voltage drift (ΔVth) prediction model
3.1 Forecasting target
The prediction objective of this study is to perform high-precision modeling and forecasting of the threshold voltage drift (ΔVth) in transistors within integrated circuits, which serves as a key indicator of device aging and performance degradation. By constructing an LSTM prediction model optimized via the SSA, referred to as SSA-LSTM, the goal is to dynamically predict the nonlinear degradation process of ΔVth. This approach aims to provide accurate support for intelligent reliability prognostics and health management of integrated circuits. The prediction logic of this study is illustrated in Fig 1.
3.2 Data preprocessing
To enhance signal quality and extract meaningful degradation patterns from noisy voltage shift data, this study investigates two signal decomposition techniques: VMD [29] and EMD [30]. Both are widely used for nonstationary and nonlinear signal analysis, offering effective time-frequency decomposition capabilities.
3.2.1 Variational mode decomposition.
VMD is a non-recursive, variational technique that decomposes a signal into a finite set of band-limited intrinsic mode functions (IMFs), each centered around an adaptive center frequency. Unlike EMD, VMD is defined as a constrained optimization problem.
The decomposition is formulated as a constrained variational problem:

\[ \min_{\{u_k\},\{\omega_k\}} \sum_{k=1}^{K} \left\| \partial_t \!\left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t) \]

where ‖·‖₂ denotes the L2 norm of a signal, and its square corresponds to the signal’s energy. uk(t) represents the k-th mode function to be extracted from the original signal f(t), while ωk denotes its corresponding center frequency. The decomposition utilizes the Dirac delta function δ(t) and the convolution operator * to shift the signal into the frequency domain. The imaginary unit j and the partial time derivative ∂t are used in constructing the analytic signal for each mode, enabling the algorithm to isolate modes with minimal spectral overlap.
Remark 3: VMD ensures that each mode is compact in the frequency domain, which enhances separation accuracy. However, its effectiveness depends on carefully selecting the number of decomposition modes and the penalty parameter, which may require empirical tuning.
3.2.2 Empirical mode decomposition.
EMD is an adaptive, data-driven approach that decomposes a signal into a set of intrinsic mode functions (IMFs) without requiring a predefined basis or model. It is particularly suited for nonlinear and nonstationary time series.
The decomposition process follows a “sifting" algorithm:
1. Identify all local extrema (maxima and minima).
2. Construct upper and lower envelopes using cubic spline interpolation.
3. Compute the mean envelope: m(t) = (e_upper(t) + e_lower(t)) / 2, where e_upper(t) and e_lower(t) are the upper and lower envelopes from step 2.
4. Extract the detail: h(t) = x(t) − m(t), where x(t) is the signal being sifted.
5. Check whether h(t) satisfies IMF conditions. If not, repeat steps 1-4 on h(t).
6. Subtract the IMF from the signal and repeat for the residual.
The final decomposition yields:

\[ x(t) = \sum_{i=1}^{n} \mathrm{IMF}_i(t) + r_n(t) \]

where IMF_i(t) refers to the i-th Intrinsic Mode Function, which represents a simple oscillatory component extracted from the original signal. These IMFs capture localized time-frequency information at different scales. The term r_n(t) denotes the final residual signal after extracting n IMFs, typically representing the long-term trend or non-oscillatory baseline of the original data. Together, the IMFs and the residual form a complete and adaptive decomposition of the input signal.
Remark 4: EMD is particularly advantageous for analyzing nonlinear and nonstationary signals, such as degradation curves in electronic components, due to its fully data-driven and adaptive nature. Its high temporal resolution allows precise tracking of time-varying signal characteristics, while its robustness to variations in mode shapes ensures reliable extraction of degradation trends. These properties make EMD highly suitable for preprocessing tasks in reliability prediction models, where capturing subtle changes in early-stage degradation is critical.
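To connect the sifting steps to code, the following Python sketch performs a single sifting pass (steps 1–4). It substitutes linear interpolation for the cubic splines of step 2 to stay dependency-free, so it approximates EMD's envelope construction rather than reproducing a production implementation:

```python
import numpy as np

def sift_once(x):
    """One sifting pass of EMD: subtract the mean of the upper/lower envelopes.
    Linear interpolation stands in for the cubic splines used in practice."""
    idx = np.arange(len(x))
    # Step 1: locate all local extrema
    maxima = [i for i in range(1, len(x) - 1) if x[i] > x[i - 1] and x[i] > x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i] < x[i - 1] and x[i] < x[i + 1]]
    # Step 2: build envelopes (endpoints appended to anchor the interpolation)
    up_idx = [0] + maxima + [len(x) - 1]
    lo_idx = [0] + minima + [len(x) - 1]
    upper = np.interp(idx, up_idx, x[up_idx])
    lower = np.interp(idx, lo_idx, x[lo_idx])
    m = (upper + lower) / 2.0            # Step 3: mean envelope m(t)
    return x - m                         # Step 4: candidate detail h(t)

t = np.linspace(0.0, 1.0, 200)
x = np.sin(40 * np.pi * t) + t           # fast oscillation riding on a trend
h = sift_once(x)
```

Applied to this toy signal, one pass already strips most of the linear trend from the oscillatory detail; a full EMD would iterate until the IMF conditions hold and then recurse on the residual.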
3.3 Basic principle of LSTM
In deep learning research on sequence-modeling tasks, Recurrent Neural Networks (RNNs) are widely utilized due to their powerful capabilities in handling sequential data, such as in speech recognition and natural language processing [31–33]. However, traditional RNNs often suffer from the vanishing gradient problem when dealing with long sequences, making it difficult to learn long-range dependencies. To address this challenge, LSTM networks were introduced as an enhanced RNN architecture. With their gate-based mechanism, LSTMs effectively mitigate this issue and have become a key technology in sequence modeling.
3.3.1 LSTM gating mechanism.
The LSTM network addresses the short-term memory and gradient vanishing problems of traditional RNNs through its gating mechanism, comprising the forget, input, and output gates. These gates collectively regulate the retention, update, and output of information, enabling the network to effectively capture long-term dependencies. The logical structure of LSTM is depicted in Fig 2.
(1) Forget gate
The forget gate controls which information from the long-term cell state should be retained or discarded. It takes the current input x_t and the previous hidden state h_{t−1} as inputs, passing them through a Sigmoid function to generate values between 0 and 1. Values close to 0 imply that the corresponding information in the cell state will likely be discarded, while values close to 1 indicate that the information will be retained. The calculation formula of the forget gate is as follows:

\[ f_t = \sigma\!\left( W_f \cdot [h_{t-1}, x_t] + b_f \right) \]

where σ is the Sigmoid function, and Wf and bf are the weight matrix and bias vector of the forget gate, respectively.
(2) Input gate
The input gate determines which parts of the current input information should be incorporated into the cell state. It processes the current input along with the previous hidden state through a Sigmoid function to produce the update proportions, and simultaneously uses a Tanh function to preprocess the input. Multiplying these two results yields the information that will actually be updated into the cell state.

\[ i_t = \sigma\!\left( W_i \cdot [h_{t-1}, x_t] + b_i \right) \]
\[ \tilde{C}_t = \tanh\!\left( W_C \cdot [h_{t-1}, x_t] + b_C \right) \]

where Wi, WC and bi, bC are the corresponding weight matrices and bias vectors of the input gate, respectively.
According to the results of the forget gate and input gate, the cell state is updated. The specific formula is as follows:

\[ C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \]

where ⊙ stands for element-by-element multiplication.
(3) Output gate
The output gate controls which parts of the cell state will be outputted as the current hidden state. It processes the current input and previous hidden state through a Sigmoid function to determine the proportion of information to be outputted. This proportion is then multiplied by the cell state activated through a Tanh function to obtain the final hidden state at the current timestep.

\[ o_t = \sigma\!\left( W_o \cdot [h_{t-1}, x_t] + b_o \right) \]
\[ h_t = o_t \odot \tanh\!\left( C_t \right) \]

where Wo and bo are the weight matrix and bias vector of the output gate, respectively.
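The three gate equations above condense into a few lines of code. The following numpy sketch implements one LSTM timestep as written (σ for Sigmoid, ⊙ as element-wise `*`); the weight shapes assume the common formulation that concatenates h_{t−1} and x_t, and all dimensions are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM timestep: forget, input, and output gates plus the cell update.
    W maps the concatenated [h_prev; x_t] to each gate's pre-activation."""
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W["f"] @ z + b["f"])          # forget gate f_t
    i = sigmoid(W["i"] @ z + b["i"])          # input gate i_t
    c_tilde = np.tanh(W["c"] @ z + b["c"])    # candidate cell state
    c = f * c_prev + i * c_tilde              # cell state update (element-wise)
    o = sigmoid(W["o"] @ z + b["o"])          # output gate o_t
    h = o * np.tanh(c)                        # hidden state h_t
    return h, c

rng = np.random.default_rng(0)
H, D_IN = 4, 1                                # hidden units, input features
W = {k: rng.standard_normal((H, H + D_IN)) * 0.1 for k in "fico"}
b = {k: np.zeros(H) for k in "fico"}
h, c = lstm_step(rng.standard_normal(D_IN), np.zeros(H), np.zeros(H), W, b)
```

Because the output gate multiplies tanh(C_t) by a value in (0, 1), every component of the hidden state is bounded in magnitude by 1.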
3.4 Sparrow search algorithm
The SSA is a swarm intelligence optimization technique that mimics sparrow foraging and anti-predation behavior. It employs a population divided into explorers (searching for food), followers (exploiting found resources), and vigilants (issuing random warnings). This structure ensures effective exploration-exploitation balance, allowing SSA to efficiently solve complex optimization problems [20,23,34], as shown in Fig 3.
3.4.1 Description of mathematical formula of SSA.
During the foraging process, discoverers exhibit a broader search capability, actively guiding followers and continuously updating their own positions based on memory, thereby improving the efficiency of locating food sources. The position update follows the iterative formula below.

\[ X_{i,d}^{t+1} = \begin{cases} X_{i,d}^{t} \cdot \exp\!\left( \dfrac{-i}{\alpha \cdot iter_{\max}} \right), & R_2 < T \\[6pt] X_{i,d}^{t} + Q \cdot U, & R_2 \ge T \end{cases} \tag{15} \]

where Q is a random number drawn from a standard normal distribution, which introduces a scalar perturbation to the position. X_{i,d}^{t} denotes the d-th dimensional position of the i-th sparrow in the t-th generation of the population, where i = 1, 2, …, n, d = 1, 2, …, D, and D is the total number of dimensions. α is a uniformly distributed random number in the interval (0, 1]. iter_max is the maximum number of iterations, R2 is an alertness factor drawn from a uniform distribution over [0,1], T represents the alert threshold, typically ranging from 0.5 to 1, and U is a 1×D matrix with all elements equal to 1.
During the foraging process, followers continuously monitor the behavior of discoverers. Once a discoverer locates a new food source, the followers immediately move toward the updated position to compete for the resource. The position update rule for followers is given by Eq (16).

\[ X_{i,d}^{t+1} = \begin{cases} Q \cdot \exp\!\left( \dfrac{X_{worst}^{t} - X_{i,d}^{t}}{i^{2}} \right), & i > n/2 \\[6pt] X_{p}^{t+1} + \left| X_{i,d}^{t} - X_{p}^{t+1} \right| \cdot A^{+} \cdot U, & \text{otherwise} \end{cases} \tag{16} \]

where Xp represents the best position currently occupied by a discoverer, while X_worst denotes the worst position in the population. A is a 1×D matrix with each element randomly selected from the interval (–1, 1), and A⁺ = Aᵀ(AAᵀ)⁻¹ denotes the generalized inverse of matrix A. The parameter n refers to the total number of sparrows in the population.
Considering the potential threat of predators during foraging, approximately 10% to 20% of the sparrows are designated as scouts. These scouts are initially distributed randomly, and if danger is detected, they will immediately abandon the current food source and relocate to a safer position. The position update rule for scouts is defined by Eq (17).

\[ X_{i,d}^{t+1} = \begin{cases} X_{best}^{t} + \beta \cdot \left| X_{i,d}^{t} - X_{best}^{t} \right|, & f_i > f_g \\[6pt] X_{i,d}^{t} + K \cdot \dfrac{\left| X_{i,d}^{t} - X_{worst}^{t} \right|}{f_i - f_w + \varepsilon}, & f_i = f_g \end{cases} \tag{17} \]

where X_best denotes the current global best position found within the population. f_g and f_w represent the fitness values of the globally best and worst individuals, respectively, while f_i is the fitness of the i-th sparrow. β is a random variable that follows a standard normal distribution, and K is a randomly selected value from the interval [–1, 1] that determines the movement direction of the sparrow. ε is a small constant introduced to prevent division by zero.
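To make the division of labor among discoverers, followers, and scouts tangible, the sketch below condenses the three update rules into a toy Python implementation and runs it on a 2-D sphere function. The population size, iteration count, and 20%/10% role split mirror Sect 3.7, but the branch conditions are simplified, so this illustrates the search dynamics rather than reproducing the paper's MATLAB implementation:

```python
import numpy as np

def ssa_minimize(fit, dim, n=30, iters=50, lb=-5.0, ub=5.0, st=0.8, seed=0):
    """Minimal Sparrow Search Algorithm sketch: discoverers, followers, scouts."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n, dim))
    n_disc, n_scout = n // 5, max(1, n // 10)       # 20% discoverers, 10% scouts
    gbest, gval = X[0].copy(), fit(X[0])
    for _ in range(iters):
        X = X[np.argsort([fit(x) for x in X])]      # best individuals first
        # Discoverers: broad search; mode switches on the alarm value vs threshold
        for i in range(n_disc):
            if rng.random() < st:                    # safe: multiplicative shrink
                X[i] = X[i] * np.exp(-i / (rng.random() * iters + 1e-12))
            else:                                    # danger: random normal step
                X[i] = X[i] + rng.standard_normal(dim)
        # Followers: compete for the best discoverer's position
        best = X[0].copy()
        for i in range(n_disc, n):
            if i > n // 2:                           # worst-fed followers fly away
                X[i] = rng.standard_normal(dim) * np.exp((X[-1] - X[i]) / (i + 1) ** 2)
            else:
                X[i] = best + np.abs(X[i] - best) * rng.choice([-1.0, 1.0], size=dim)
        # Scouts: randomly chosen sparrows jump relative to the best position
        for i in rng.choice(n, size=n_scout, replace=False):
            X[i] = best + rng.standard_normal(dim) * np.abs(X[i] - best)
        X = np.clip(X, lb, ub)
        for x in X:                                  # track the best solution found
            v = fit(x)
            if v < gval:
                gbest, gval = x.copy(), v
    return gbest, gval

best, val = ssa_minimize(lambda x: float(np.sum(x ** 2)), dim=2)
```

Even this stripped-down version drives the population toward the sphere function's optimum at the origin, which is the exploration-exploitation behavior the role split is designed to produce.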
3.5 SSA-LSTM prediction model
After applying the data preprocessing method described in Sect 3.2, the dataset is divided into training and testing sets and loaded for model construction. The SSA is subsequently employed to optimize the key hyperparameters of the LSTM model. Once the optimal hyperparameters are obtained, they are fed into the LSTM model to construct predictive models for different output targets. The overall workflow of the SSA-LSTM prediction model is illustrated in Fig 4.
3.6 SSA-LSTM hyperparameter optimization strategy
The optimization of LSTM hyperparameters using SSA was formulated as a constrained search problem to minimize the prediction error on the validation set.
3.7 SSA configuration and execution
The sparrow population size was set to 30, and the algorithm was run for 50 iterations. The proportion of discoverers was set to 20% and the scouts to 10% of the population. The algorithm was terminated when the fitness improvement fell below a fixed tolerance for 10 consecutive iterations. The best-performing parameter set from this automated search was then used to train the final LSTM model on the combined training and validation sets before evaluation on the test set.
3.8 Predictive performance evaluation index
In the fields of machine learning and data analysis, several statistical metrics are commonly used to evaluate the predictive performance of a model. The following are three widely used and essential evaluation metrics:
1. Root Mean Square Error (RMSE)
RMSE measures the magnitude of the difference between predicted and actual values. A smaller RMSE indicates that the model’s predictions are closer to the true values, implying better performance. It is calculated as:

\[ \mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2 } \]

where ŷi is the predicted value, yi is the actual value, and n is the total number of data points.
2. Mean Absolute Percentage Error (MAPE)
MAPE expresses the average prediction error as a percentage of the actual values, providing an intuitive understanding of how large the errors are relative to the true values. A lower MAPE indicates higher overall prediction accuracy. The formula is:

\[ \mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \]
3. Coefficient of Determination (R2)
R2 evaluates how well the model explains the variance in the observed data. An R2 value closer to 1 indicates a better fit, whereas a value of 0 means the model fails to capture any variability. The formula is:

\[ R^2 = 1 - \frac{ \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 }{ \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2 } \]

where ȳ represents the mean of all actual values.
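The three metrics can be computed directly from prediction and target arrays; a small Python reference implementation with illustrative values only:

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(100.0 * np.mean(np.abs((y - y_hat) / y)))

def r2(y, y_hat):
    """Coefficient of determination."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = float(np.sum((y - y_hat) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
```

Note that MAPE is undefined when any actual value is zero, which is why normalized voltage-shift series are typically offset away from zero before evaluation.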
4 Simulation verification
To verify the effectiveness and superiority of the proposed EMD-SSA-LSTM forecasting model, a comparative analysis was conducted against two benchmark models: a standalone LSTM model and an EMD-LSTM. These models were applied to the same time series dataset under identical experimental conditions. To ensure fairness and consistency in comparison, the following simulation parameters were carefully configured and kept uniform across all predictive schemes.
4.1 Dataset provenance and experimental setup
Device Under Test and Stress Conditions: The accelerated life test data was obtained from a population of commercial p-type MOSFETs with a gate oxide thickness of 2.5 nm. The devices were subjected to constant high-temperature gate bias (HTGB) stress, a standard accelerated test for invoking NBTI.
Data Acquisition: The ΔVth was measured in-situ at predefined intervals using a precision semiconductor parameter analyzer (Keysight B1500A). The dataset comprises 512 time-series samples for each device, recorded at a fixed sampling interval of 1 hour until a predefined failure criterion of ΔVth = 50 mV was reached. Data from multiple devices and stress conditions (T = 125 °C, 150 °C; Vgs = −2.0 V, −2.2 V) were aggregated to create a robust and generalizable training set.
Data Partitioning and Preprocessing: The aggregated dataset was partitioned chronologically on a per-device basis, with 70% of the sequence used for training, 15% for validation (for early stopping), and the final 15% held out for testing. This ensures the model is evaluated on unseen future degradation data, simulating a real-world prognostic scenario. All data was normalized to the [0, 1] range based on the training set’s minimum and maximum values to ensure stable and efficient network training.
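A sketch of this partitioning-plus-scaling step in Python (the 70/15/15 ratios follow the text; the series here is synthetic). Note that when min-max parameters are taken from the training span only, later values of a monotonically degrading signal can exceed 1, which is the intended leak-free behavior:

```python
import numpy as np

def chrono_split_scale(series, train=0.70, val=0.15):
    """Chronological 70/15/15 split with min-max scaling fit on the training
    portion only, so no future information leaks into the normalization."""
    x = np.asarray(series, dtype=float)
    n_train = int(round(len(x) * train))
    n_val = int(round(len(x) * val))
    tr = x[:n_train]
    va = x[n_train:n_train + n_val]
    te = x[n_train + n_val:]
    lo, hi = tr.min(), tr.max()              # scaling statistics: training only
    scale = lambda s: (s - lo) / (hi - lo)
    return scale(tr), scale(va), scale(te)

tr, va, te = chrono_split_scale(np.arange(100.0))   # synthetic monotone series
```

Fitting the scaler on the full series instead would quietly leak the test set's range into training, inflating apparent accuracy.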
4.2 Simulation parameter setting
The time series dataset employed in this study comprises voltage signal measurements with a total of 512 samples, collected at a fixed sampling interval of one hour. During the data preprocessing phase, the raw signal was restructured into input–output pairs suitable for supervised learning by applying a sliding window technique. The lookback window size was set to 5, meaning that each input sample consists of five consecutive historical voltage values. The forecast horizon was set to 1, indicating that the model predicts the voltage value at the next time step. This configuration effectively captures the temporal dependencies inherent in the time series, allowing the model to learn the dynamic influence of past states on future outcomes.
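The sliding-window restructuring described above can be sketched as follows (Python; the array values are illustrative):

```python
import numpy as np

def make_windows(series, lookback=5, horizon=1):
    """Restructure a 1-D series into supervised (input, target) pairs
    with a sliding window of `lookback` past values per sample."""
    x = np.asarray(series, dtype=float)
    X, y = [], []
    for i in range(len(x) - lookback - horizon + 1):
        X.append(x[i:i + lookback])                 # lookback historical values
        y.append(x[i + lookback + horizon - 1])     # value `horizon` steps ahead
    return np.array(X), np.array(y)

X, y = make_windows(np.arange(10.0), lookback=5, horizon=1)
```

With a 512-sample series, a lookback of 5, and a horizon of 1, this yields 507 input-output pairs, each pairing five consecutive voltages with the next one.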
To better characterize the non-stationary nature of the voltage signal, EMD was applied to decompose the original signal into a set of IMFs. EMD adaptively separates a complex signal into a finite number of oscillatory components, each representing a distinct frequency band. In this study, the decomposition procedure was configured with a bandwidth penalty parameter to control the smoothness of the IMFs, and the DC component was suppressed by setting DC = 0. The frequency initialization method was set to init = 1, which uniformly distributes the initial center frequencies across the spectrum. A tight convergence tolerance was enforced to ensure decomposition accuracy. This setup enables effective extraction of signal features across multiple frequency scales, providing a refined basis for subsequent LSTM modeling.
An LSTM-based architecture served as the core forecasting model in this work. The network consists of an input layer, a single LSTM hidden layer, a ReLU activation layer, a fully connected regression layer, and an MSE loss function. The input dimension is determined by the lookback window size and the number of input features. For instance, in the case of a univariate voltage signal with a lookback window size of 5, the input dimension is 5. To ensure robust training and enhance generalization performance, the following hyperparameters were adopted: the maximum number of training epochs was set to 70, the initial learning rate to 0.01, and the gradient clipping threshold to 1. A learning rate schedule was applied, reducing the rate by a factor of 0.2 every 60 epochs. L2 regularization with a coefficient of 0.01 was incorporated to prevent overfitting, and the Adam optimizer was used for training. All experiments were executed on a computing node equipped with an Intel Xeon Gold 6248R processor (3.0 GHz base frequency) and 256 GB of RAM, utilizing the MATLAB R2021a programming environment. The average time to complete a full training cycle for the final EMD-SSA-LSTM model was approximately 529 seconds. Training progress was monitored in real-time via plotted loss curves. Prior to training, all input features were normalized to the range [0, 1] using min-max scaling to mitigate numerical instability and accelerate convergence.
4.3 Comparative analysis of forecasting performance
The simulation results (Figs 5–11) systematically illustrate performance improvements from signal decomposition and hyperparameter optimization. Figs 5 and 6 compare decomposition granularity: with K = 3, the signal separates into trend, oscillation, and noise; with K = 5, finer quasi-periodic structures are isolated, improving feature separation at increased complexity. Fig 7 shows the SSA-LSTM optimization converging within five iterations. The RMSE remains near 0.0178 for the first three iterations, drops sharply to 0.0146 by the fourth, and stabilizes, confirming effective convergence.
Figs 8 and 10 show the performance of the prediction models on the training and testing sets. Visually, the trajectory predicted by our EMD-SSA-LSTM model (red line) is closer to the ground truth (black line) on both sets. In contrast, the PSO-LSTM (blue) and EMD-PSO-LSTM (green) predictions show significant bias, particularly in capturing the exact amplitude and timing of voltage peaks. Figs 9 and 11 show the corresponding prediction errors. Quantitatively, the error curve of EMD-SSA-LSTM (red) remains consistently tighter and closer to zero, indicating lower bias and variance. The error bands of the PSO-based models are markedly wider with larger peak amplitudes, especially on the test set, indicating poorer generalization and stability. These results demonstrate that the SSA optimizer in our fusion architecture is more effective than the mature PSO algorithm at finding high-performance LSTM configurations, yielding higher prediction accuracy and better generalization, and strengthening the claim of novelty and practicality for the SSA-LSTM fusion.
5 Discussion and future work
This study successfully establishes and validates an EMD-SSA-LSTM framework for predicting transistor threshold voltage drift (ΔVth). The proposed methodology, integrating EMD for signal denoising and feature extraction with SSA for automated LSTM hyperparameter optimization, demonstrates superior capability in capturing the complex, nonlinear dynamics of IC aging. Extensive validation under accelerated stress conditions confirms that our model achieves significantly higher accuracy and robustness compared to multiple benchmarks, including standalone LSTM, EMD-LSTM, and newly added PSO-optimized variants, providing an effective data-driven tool for high-precision IC reliability prognostics.
Future research will focus on three key directions: multi-objective SSA optimization to balance accuracy with computational efficiency; multi-sensor fusion architectures integrating additional degradation indicators like leakage current for enhanced robustness; and hybrid signal processing techniques combined with advanced neural architectures to improve adaptability across diverse IC technologies and operational profiles.
References
- 1. Zhu K, Wen C, Aljarb AA, Xue F, Xu X, Tung V, et al. The development of integrated circuits based on two-dimensional materials. Nat Electron. 2021;4(11):775–85.
- 2. Rabaey JM, Chandrakasan A, Nikolic B. Digital integrated circuits. vol. 2. Englewood Cliffs: Prentice Hall; 2002.
- 3. Gray PR, Hurst PJ, Lewis SH, Meyer RG. Analysis and design of analog integrated circuits. John Wiley & Sons; 2024.
- 4. Haselman M, Hauck S. The future of integrated circuits: A survey of nanoelectronics. Proc IEEE. 2010;98(1):11–38.
- 5. Nagarajan R, Joyner CH, Schneider RP, Bostak JS, Butrie T, Dentai AG, et al. Large-scale photonic integrated circuits. IEEE J Select Topics Quantum Electron. 2005;11(1):50–65.
- 6. Yin L, Cheng R, Ding J, Jiang J, Hou Y, Feng X, et al. Two-dimensional semiconductors and transistors for future integrated circuits. ACS Nano. 2024;18(11):7739–68. pmid:38456396
- 7. Turzyński M, Musznicki P. A review of reduction methods of impact of common-mode voltage on electric drives. Energies. 2021;14(13):4003.
- 8. Umralkar RB, Kangrong L. Testing multiple high-speed channels using automated test equipment (ATE), SOC 93K, in parallel. In: 2021 IEEE 23rd electronics packaging technology conference (EPTC); 2021. p. 405–9. https://doi.org/10.1109/eptc53413.2021.9663994
- 9. Panigrahi JK, Acharya DP. GAN augmented phase noise modeling of PLL for use in automated testing. Int J Electron Lett. 2025;13(2):219–26.
- 10. Todorov A, Dotsch R, Wigboldus DHJ, Said CP. Data-driven methods for modeling social perception. Social Personal Psych. 2011;5(10):775–91.
- 11. Finegan DP, Zhu J, Feng X, Keyser M, Ulmefors M, Li W, et al. The application of data-driven methods and physics-based learning for improving battery safety. Joule. 2021;5(2):316–29.
- 12. Maddalena ET, Lian Y, Jones CN. Data-driven methods for building control — A review and promising future directions. Control Eng Pract. 2020;95:104211.
- 13. Nazir U, Mukdasai K, Sohail M, Singh A, Alosaimi MT, Alanazi M, et al. Investigation of composed charged particles with suspension of ternary hybrid nanoparticles in 3D-power law model computed by Galerkin algorithm. Sci Rep. 2023;13(1):15040. pmid:37699944
- 14. Errehymy A. Static and spherically symmetric wormholes in power-law f (R) gravity model. Phys Dark Universe. 2024;44:101438.
- 15. Xu Y, Zhou Y, Sekula P, Ding L. Machine learning in construction: From shallow to deep learning. Dev Built Environ. 2021;6:100045.
- 16. Jafari F, Moradi K, Shafiee Q. Shallow learning vs. deep learning in engineering applications. In: Shallow learning vs. deep learning: A practical guide for machine learning solutions. Springer; 2024. p. 29–76.
- 17. Zha W, Liu Y, Wan Y, Luo R, Li D, Yang S, et al. Forecasting monthly gas field production based on the CNN-LSTM model. Energy. 2022;260:124889.
- 18. Li Z, Yu H, Xu J, Liu J, Mo Y. Stock market analysis and prediction using LSTM: A case study on technology stocks. IAET. 2023:1–6.
- 19. Shi J, Zhong J, Zhang Y, Xiao B, Xiao L, Zheng Y. A dual attention LSTM lightweight model based on exponential smoothing for remaining useful life prediction. Reliab Eng Syst Safety. 2024;243:109821.
- 20. Hang B, Liang S, Guo P, Xu B. Safety analysis and prediction of UAVs aerial refueling docking based on deep learning data-driven method. IEEE Syst J. 2025.
- 21. Ma T, Chen H, Zhang K, Shen L, Sun H. The rheological intelligent constitutive model of debris flow: A new paradigm for integrating mechanics mechanisms with data-driven approaches by combining data mapping and deep learning. Expert Syst Applic. 2025;269:126405.
- 22. Gharehchopogh FS, Namazi M, Ebrahimi L, Abdollahzadeh B. Advances in sparrow search algorithm: A comprehensive survey. Arch Comput Methods Eng. 2023;30(1):427–55. pmid:36034191
- 23. Yue Y, Cao L, Lu D, Hu Z, Xu M, Wang S, et al. Review and empirical analysis of sparrow search algorithm. Artif Intell Rev. 2023;56(10):10867–919.
- 24. Fathy A, Alanazi TM, Rezk H, Yousri D. Optimal energy management of micro-grid using sparrow search algorithm. Energy Reports. 2022;8:758–73.
- 25. Pernstich KP, Haas S, Oberhoff D, Goldmann C, Gundlach DJ, Batlogg B, et al. Threshold voltage shift in organic field effect transistors by dipole monolayers on the gate insulator. J Appl Phys. 2004;96(11):6431–8.
- 26. Ribes G, Bruyère S, Monsieur F, Roy D, Huard V. New insights into the change of voltage acceleration and temperature activation of oxide breakdown. Microelectr Reliab. 2003;43(8):1211–4.
- 27. Suehle JS, Chaparala P, Messick C, Miller WM, Boyko KC. Experimental investigation of the validity of TDDB voltage acceleration models. In: International integrated reliability workshop final report. IEEE; 1993. p. 59–67. https://doi.org/10.1109/irws.1993.666293
- 28. Malozyomov BV, Martyushev NV, Bryukhanova NN, Kondratiev VV, Kononenko RV, Pavlov PP, et al. Reliability study of metal-oxide semiconductors in integrated circuits. Micromachines (Basel). 2024;15(5):561. pmid:38793134
- 29. Li H, Liu T, Wu X, Chen Q. An optimized VMD method and its applications in bearing fault diagnosis. Measurement. 2020;166:108185.
- 30. Flandrin P, Rilling G, Goncalves P. Empirical mode decomposition as a filter bank. IEEE Signal Process Lett. 2004;11(2):112–4.
- 31. Kasongo SM. A deep learning technique for intrusion detection system using a recurrent neural networks based framework. Comput Commun. 2023;199:113–25.
- 32. Lu M, Xu X. TRNN: An efficient time-series recurrent neural network for stock price prediction. Inform Sci. 2024;657:119951.
- 33. Weerakody PB, Wong KW, Wang G, Ela W. A review of irregular time series data handling with gated recurrent neural networks. Neurocomputing. 2021;441:161–78.
- 34. Xue J, Shen B. A survey on sparrow search algorithms and their applications. Int J Syst Sci. 2023;55(4):814–32.