
Data compression of Bridge Resilience Control: Algorithm and case analysis

  • Ming Chen

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    chenmchen1975@126.com

    Affiliation School of Architecture Engineering, Shanghai ZhongQiao Vocational and Technical University, Shanghai, China

Abstract

Bridge inspection and structural health monitoring represent the primary approaches to managing bridge resilience. Data acquired through inspection and monitoring activities provides an effective technical basis for the systematic implementation of bridge resilience control strategies. Yet uninterrupted monitoring and diverse inspection campaigns have yielded an enormous volume of data, which imposes stringent challenges on data storage, transmission, and processing. Consequently, data compression has become a research priority in the field of bridge resilience control. However, existing data compression algorithms are general-purpose data processing techniques that decouple the intrinsic physical relevance between monitoring data and bridge structural behavior. To tackle this limitation, this study integrates domain knowledge, the time-series characteristics of bridge monitoring data, and bridge deterioration models into the design of a novel data compression algorithm. This approach addresses the indiscriminate data compression inherent to conventional algorithms, thereby enabling efficient data compression while preserving critical bridge structural state information. By incorporating domain knowledge, the proposed method transforms raw monitoring data into information with engineering attributes. Based on these attributes, a set of interrelated monitoring data is further converted into a small subset of key data that is directly applicable to bridge resilience control practice. Leveraging the steady-state variation law of bridge operational performance, the dynamic structural characteristics of bridges are extracted from time-series monitoring data, which correspondingly reduces the storage demand of time-series datasets. For data sampling intervals interrupted by various types of system faults, a sparse data supplementation method is proposed.
After data supplementation, the complete dataset is further refined by utilizing the inherent time-series characteristics of the monitoring data, which not only ensures data integrity but also further reduces the overall data volume. Simulation analyses demonstrate that the domain knowledge-based compression method achieves a data compression ratio of 75%. Moreover, the comprehensive compression ratio exceeds 92% after the synergistic processing of time-series feature extraction and sparse data supplementation, with a data fidelity rate of 95%. These performance metrics indicate that the proposed method can reduce the data storage costs and transmission bandwidth consumption associated with bridge resilience control by 75% to 92%. Meanwhile, the 95% feature retention accuracy satisfies the engineering precision requirements for bridge resilience control assessments, which effectively reconciles the inherent contradiction between data compression efficiency and structural evaluation accuracy.

1. Introduction

1.1 Background

Bridges constitute the core components of urban infrastructure systems, and their structural resilience plays a decisive role in the overall resilience of urban engineering systems. Conventional structural inspection, structural health monitoring (SHM), and other technical methodologies have played a fundamental role in the resilience assessment and operational maintenance management of bridge structures. Nevertheless, with the rapid advancement of science and engineering technologies, bridge inspection and real-time structural monitoring are generating an enormous volume of multi-source heterogeneous data streams. Although such datasets contain comprehensive information reflecting the actual service status of bridges, they also impose severe challenges and heavy burdens on the efficiency of data storage and real-time transmission in engineering applications. Taking the bridge health monitoring system as a typical instance, various types of sensors are deployed for continuous data acquisition via high-frequency sampling protocols. Uncompressed raw monitoring data rapidly depletes storage resources, which inevitably results in a substantial escalation in capital expenditure for hardware capacity expansion and long-term maintenance.

By using data compression technology to reduce the volume of data, the deployment and maintenance costs of servers and cloud storage can be significantly reduced. At the same time, data compression supports full-lifecycle data archiving of bridges, providing a data foundation for long-term performance evolution analysis. In the field of bridge resilience control, commonly used algorithms can be divided into three categories: lossless compression, lossy compression, and time-series-specific compression. Lossless compression algorithms include Huffman coding, the LZW algorithm, and the DEFLATE algorithm. Huffman coding assigns code lengths according to symbol frequency. The LZW algorithm achieves adaptive encoding by dynamically constructing a dictionary. The DEFLATE algorithm combines LZ77's sliding-window matching with Huffman coding. The defining characteristic of lossless compression is that the reconstructed data is completely identical to the original data.
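As a concrete illustration of lossless compression, the short Python sketch below applies DEFLATE (via the standard zlib module, which combines LZ77 matching with Huffman coding) to simulated monitoring readings; the readings and their text formatting are invented for illustration, not taken from the case study.

```python
import zlib

# Simulated strain readings: slowly varying, hence highly redundant as text.
readings = [20.00 + 0.01 * (i % 50) for i in range(10_000)]
raw = ",".join(f"{v:.2f}" for v in readings).encode("ascii")

compressed = zlib.compress(raw, level=9)  # DEFLATE: LZ77 matching + Huffman coding
restored = zlib.decompress(compressed)

assert restored == raw                    # lossless: bit-exact reconstruction
print(f"raw {len(raw)} B -> compressed {len(compressed)} B "
      f"({1 - len(compressed) / len(raw):.1%} saved)")
```

Because the reconstruction is exact, such methods are safe for archival storage but cannot exploit the tolerance that engineering assessments allow.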

Lossy compression algorithms improve the compression ratio by discarding non-critical information. Typical lossy compression algorithms include the Discrete Cosine Transform (DCT), the wavelet transform, and Principal Component Analysis (PCA). DCT converts time-series data into frequency-domain data while retaining low-frequency trend components; the wavelet transform captures both time-domain and frequency-domain local features and can detect abrupt changes in the data; PCA achieves dimensionality reduction of high-dimensional data through orthogonal transformation and is often used as a preprocessing step in intelligent algorithms. Specialized compression algorithms for time-series data are designed for monitoring data with strong correlation and periodicity, and include differential encoding, swinging door trending (SDT), and online piecewise linear approximation (OLA). Differential encoding encodes the residuals of adjacent data and is often combined with entropy coding. The swinging door algorithm fits data with a line based on a threshold and saves only the inflection points. The OLA algorithm segments data in real time and is suitable for online compression tasks on edge computing nodes.
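The swinging door idea can be sketched in a few lines. The following simplified Python implementation is an illustrative version of the general technique, not the algorithm used later in this paper: it keeps a sample only when no single line within the error tolerance can cover all samples seen since the last kept point.

```python
def sdt_compress(times, values, tolerance):
    """Simplified swinging-door trending (SDT): keep a sample only when the
    corridor of lines within +/- tolerance from the last kept point closes."""
    kept = [(times[0], values[0])]
    t0, v0 = times[0], values[0]
    hi, lo = float("inf"), float("-inf")      # tightest door slopes so far
    for t, v in zip(times[1:], values[1:]):
        dt = t - t0
        hi = min(hi, (v + tolerance - v0) / dt)
        lo = max(lo, (v - tolerance - v0) / dt)
        if lo > hi:                           # doors crossed: corridor is broken
            kept.append((t, v))
            t0, v0 = t, v
            hi, lo = float("inf"), float("-inf")
    if kept[-1] != (times[-1], values[-1]):
        kept.append((times[-1], values[-1]))  # always retain the final sample
    return kept

# A perfectly linear ramp collapses to its two end points.
signal = [0.5 * t for t in range(100)]
print(sdt_compress(list(range(100)), signal, tolerance=0.1))
```

Near-linear stretches of monitoring data thus reduce to a handful of points, while abrupt changes force extra points to be retained.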

With the rapid development of artificial intelligence technology, and driven by the core requirements of structural disaster resistance, recovery, and adaptation, intelligent algorithms integrate traditional compression principles with artificial intelligence, edge computing, and related technologies. They achieve data dimensionality reduction and key-feature retention while ensuring the accuracy of resilience assessment, providing core support for real-time early warning, disaster response, and long-term operation and maintenance. The first research direction for intelligent algorithms is the adaptive optimization of classical algorithms, such as an adaptive swinging door algorithm whose threshold is optimized by an LSTM network, and a wavelet transform optimization method that integrates a genetic algorithm to screen wavelet basis functions. The second research direction is AI-driven deep compression technology. This research includes auto-encoders that combine a high compression ratio with damage-feature preservation under resilience evaluation constraints; compressed sensing that reduces the sampling and transmission load of sensor nodes by exploiting data sparsity; lightweight neural network compression schemes adapted to edge deployment through quantization, pruning, and related techniques, combined with domain-adaptive principal component analysis; and graph neural network fusion compression based on bridge structural topology. These intelligent algorithms can adapt to engineering scenarios such as extreme-disaster monitoring, long-term resilience assessment, real-time edge warning, and multi-source data fusion. Their goals are to transmit disaster mutation data at low power and to store historical data at a high compression ratio, so as to support performance degradation analysis, reduce node resource consumption, and accurately evaluate bridge resilience.

1.2 Motivation and innovation

The existing research results on data compression provide technical support for data compression in the field of bridge resilience control; in particular, the rapid development of intelligent algorithms has greatly improved the level of data compression. However, several key technologies in data compression for bridge resilience control still need to be broken through.

Firstly, general-purpose data compression algorithms do not incorporate bridge domain knowledge and do not take into account the mechanical properties of bridges or the physical meaning of the data, which may result in the loss of key structural state information in the compressed data.

Second, bridges themselves and the monitoring data collected during their operational time exhibit considerable inherent redundancy, yet existing general-purpose compression algorithms fail to effectively leverage this redundancy for data compression purposes.

Thirdly, due to limitations of the collection equipment and extreme weather, the data collected for bridge resilience control is also sparse, which requires effective data filling before the data can be applied.

To address the aforementioned issues, research has been conducted on the following aspects:

Firstly, a data compression method based on knowledge in the field of bridge resilience control has been proposed, which reduces the volume of bridge geometric data.

Secondly, a time-series data compression method has been proposed. This method focuses on data within the coverage range of the time step window, achieving dynamic compression of time-series data.

Thirdly, a method based on sparse data filling of the pre- and post-datasets is proposed, which supplements sparse data in the time series and refines the supplementary data, solving the problem of data loss in the field of bridge resilience control.

2. Related work

Bridge health monitoring is the main technical means of bridge resilience control. The research and application of bridge health monitoring began in the 1970s. After the 1990s, with the rapid development of large-scale bridge construction, bridge health monitoring systems have been widely applied. However, the application of a large number of data collection methods has led to an unusually large amount of data obtained for bridge health monitoring, prompting data compression processing to become the main research direction for bridge health monitoring.

Traditional data compression methods have been widely used in the data processing of bridge health monitoring systems, including wavelet transform, Fourier transform, PCA, Empirical Mode Decomposition, Huffman Encoding, SAX, and their fusion applications. Wavelet transform is a mathematical framework for multi-scale decomposition using adjustable scale wavelet functions. Its core principle is to decompose signals into components of different scales and compress them by removing redundant information. For example, Lorenzo Bernardini et al. [1] proposed a bridge damage detection method based on driving vibration. By using the wavelet transform to extract the time-frequency characteristics of vibration signals, this method can maintain a damage recognition accuracy of over 95% even when the data compression rate reaches 85%. The Fourier transform is a mathematical tool that converts time-domain signals into frequency-domain signals. Its core is to decompose any periodic or non-periodic signal that satisfies the Dirichlet condition into a superposition of sine/cosine waves of different frequencies, amplitudes, and phases. The Fourier transform is suitable for data compression of periodic, stationary signals, and the compressed data can be directly applied to feature extraction. For example, Premjeet Singh et al. [2] proposed a method that integrates natural excitation techniques and empirical Fourier decomposition to analyze environmental bridge vibration data and determine the modal parameters of the bridge. This method can provide an accurate and robust estimation of bridge modal parameters. PCA is an unsupervised data dimensionality reduction and feature extraction method that maps high-dimensional data to a low-dimensional space through linear transformation, reducing redundancy while preserving the main information of the data. 
Principal component analysis can eliminate the data correlation between sensors in bridge health monitoring systems and is suitable for large-scale multi-source data compression, with a data compression rate of over 90% [3]. EMD is an adaptive signal decomposition method that decomposes complex non-stationary and nonlinear signals into several stationary and physically meaningful intrinsic mode functions and a residual component. The empirical mode decomposition method does not require prior knowledge and has strong adaptive ability. This method is suitable for bridge health monitoring in complex environments, with a data compression rate of up to 50% while reducing signal noise [4]. Huffman encoding is a lossless data compression algorithm that reconstructs data that is completely identical to the original data after compression. This method can solve the problem of low spatial efficiency caused by unpredictable outliers in the compression process of time series data [4]. SAX is a data dimensionality reduction and symbolic representation method for time series. Its core is to convert continuous time series into discrete symbol streams, which can significantly compress data volume while preserving key features. After receiving real-time perception data in the monitoring system, SAX is applied to compress the data, and then efficient classification tasks are performed based on the compressed data to complete the evaluation of bridge structure status [4]. In order to integrate the advantages of traditional compression algorithms, scholars have fused multiple algorithms and applied them to data processing in the field of bridge health monitoring. For example, Zhou L et al. [5] proposed a scheme that combines complementary set empirical mode decomposition, wavelet threshold denoising, and PCA fusion to achieve data reduction while denoising. Zhang Feng Yuan et al. 
[6] proposed a compression method combining wavelet transform and LZW encoding, and applied multiple sets of actual sampling data for simulation testing. The results showed that the compression ratio was better than 10.8.
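As an illustration of the PCA-style compression surveyed above, the following minimal sketch uses numpy's SVD; it is not the cited authors' implementation, and the synthetic "sensor" data and the choice of two retained components are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
# 8 correlated "sensor channels", 1000 samples: 2 latent modes plus small noise,
# imitating the cross-sensor redundancy found in bridge monitoring data.
modes = rng.normal(size=(1000, 2))
mixing = rng.normal(size=(2, 8))
X = modes @ mixing + 0.01 * rng.normal(size=(1000, 8))

mean = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
k = 2                                   # keep the two dominant components
scores = (X - mean) @ Vt[:k].T          # compressed representation (1000 x 2)
X_rec = scores @ Vt[:k] + mean          # reconstruction from 2 components

rel_err = np.linalg.norm(X - X_rec) / np.linalg.norm(X)
print(f"stored {scores.size} values instead of {X.size}; rel. error {rel_err:.4f}")
```

Storing the scores plus the k basis vectors in place of the full matrix is what yields the >90% compression rates reported for multi-sensor data.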

The advantages of traditional data compression algorithms are mature principles, strong hardware adaptability, and high accuracy in data restoration after compression. However, traditional algorithms are limited to generic data compression: their compression and feature extraction processes are separated, requiring a redesign of the feature extraction algorithm for bridge structures. With the rapid development of artificial intelligence technology, the application of machine learning in the field of bridge resilience control has received widespread attention from scholars and the engineering community. Common machine learning methods include Bayesian methods, CNN, LSTM, Transformer, GNN, and reinforcement learning. The Bayesian method is a statistical inference framework based on probability theory, whose core is to update knowledge of unknown parameters through prior probabilities and observation data, outputting a posterior probability distribution. Kullaa Jyrki [7] combined Bayesian theory with virtual sensing technology to address the problem that dense sensor networks and repeated collection of vibration data generate a large amount of data that must be stored. CNN is suitable for processing data with a grid structure; in bridge resilience control, the combination of CNN's convolutional, pooling, and fully connected layers is used to extract data features and compress the original data [8]. LSTM is a special RNN architecture designed around gating mechanisms and cell states, and it excels at capturing long-term dependencies in time-series data. LSTM is often combined with convolutional networks, generative adversarial networks, and similar models for bridge monitoring data compression, which can reduce data volume while ensuring the accuracy of subsequent structural state evaluation.
LSTM can map high-dimensional raw data to a low-dimensional space while preserving its implicit structural features, thereby reducing data transmission volume [9]. Transformer is a deep learning model proposed by the Google team, which uses a self-attention mechanism to directly calculate the correlation weight between each position in the sequence and all other positions, achieving the function of capturing global dependencies and complex correlations of data [10]. Transformer uses Embedding technology to transform high-dimensional data into fixed-dimensional feature vectors, which can significantly compress data and save storage space and data transmission. GNN is suitable for processing graph-structured data, adept at capturing topological relationships between data, achieving feature extraction, and data compression. GNN is commonly used to identify defects such as cracks, concrete spalling, and steel corrosion in bridge structures, and can compress high-dimensional data into low-dimensional data [11]. The advantages of reinforcement learning are dynamic adaptive decision-making and multi-objective optimization. Through the interaction learning between the agent and the environment, redundant information is eliminated, achieving dimensional compression of high-dimensional data [12]. The research on machine learning algorithms focuses on the accuracy of data compression, but neglects the physical requirements of bridge resilience control, and has shortcomings such as data dependence and poor interpretability. Improvement is needed to establish the correlation between machine learning results and the mechanical principles of bridge structures.

During the operational time of bridges, both bridge inspection and bridge health monitoring obtain data with time-series characteristics. Therefore, the compression of time-series data is an important aspect of bridge resilience control. In existing literature, there are relatively few studies that directly focus on compressing time-series data of bridges, and the vast majority of the literature focuses on feature extraction of bridges. The process of extracting bridge features is also the process of compressing time-series data. Time-series data compression algorithms are divided into two categories: lossless compression and lossy compression. Lossless compression is a reversible compression technique that achieves 100% restoration of compressed data by identifying and eliminating statistical redundancy in the data [13]. The core principle of lossy compression is to actively discard redundant or secondary information in the data that has no significant impact on the target application scenario in exchange for a higher compression ratio, allowing irreversible information loss in the compression and decompression process. Lossy compression is the main method for processing bridge time-series data. At present, research on lossy compression algorithms for bridge temporal data is focused on the field of machine learning. On the basis of maintaining the status information of the bridge, scholars have integrated multiple machine learning algorithms to compress the temporal data of the bridge. For example, combining CNN with BiGRU can extract data features while identifying temporal features between data, achieving data compression [14]. In order to predict the degradation of high-strength steel wire in long-term service, Long Xiao et al. [15] proposed a hybrid prediction model that integrates the sparrow search algorithm and LSTM. Simulation analysis shows that the algorithm has significant advantages in both convergence speed and optimization accuracy. Wang Ziyi et al. 
[16] integrated the grey wolf optimization algorithm and LSTM, using the grey wolf optimization algorithm to synergistically optimize the hyper-parameters of LSTM, while integrating temporal feature extraction and signal decomposition techniques. The case analysis results indicate that, compared to CNN and LSTM, the algorithm combining the grey wolf optimization algorithm and LSTM is more suitable for bridge displacement prediction tasks. In addition, time-series data compression algorithms in other fields also have certain reference value for the field of bridge engineering. For example, in order to solve the problem of the relatively lagging development of computing power and storage capacity in high-performance computing platforms in computational fluid dynamics, Adalberto Perez et al. [17] introduced Gaussian process regression under the Bayesian framework, which can achieve posterior recovery of initially discarded information. Research has confirmed that this method is not only suitable for compressing three-dimensional turbulent spatial field data, but also for compressing discrete time series datasets. In the field of the Internet of Things, efficient data compression technology is crucial for reducing storage costs and improving query performance. Due to its high precision and wide dynamic range, floating-point sequential data poses great challenges to data compression. To address this issue, Wenjing Wang et al. [18] proposed a numerical pattern-aware compression algorithm for floating-point time-series data. This algorithm introduces a classification model at the time window level to identify hidden numerical patterns in the data, and constructs a two-layer decision architecture to achieve a balance between compression ratio and time overhead. Johannes Pöppelbaum et al. [19] proposed a novel quaternion temporal data compression method based on neural network models. 
This method first divides long-term data into several data segments, extracts the minimum, maximum, mean, and standard deviation of each data segment as representative features, and encapsulates these features into quaternions to generate quaternion numerical time-series data. In the field of cloud-based digital twin systems, monitoring key performance indicators is the core link in ensuring system security and reliability. However, the monitoring data generated by such systems is massive, and data compression has become a necessary means of saving transmission bandwidth and storage space. Zicong Miao et al. [20] proposed a collaborative compression method for multivariate temporal data based on a two-step compression scheme. This method first applies a morphology-based clustering algorithm to group the multivariate temporal data; it then applies an optimized compressive sensing technique to achieve collaborative compression of the grouped data. The experimental results show that this method can achieve efficient data compression while effectively preserving the complex temporal correlations between indicators: at a compression ratio of 30%, the root mean square error of the correlation between reconstructed data and original data is only 0.0489.
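Lossless time-series compression of the kind contrasted above can be illustrated with differential encoding, which stores the first sample plus the residuals of adjacent samples; small residuals then entropy-code well. A minimal sketch (the quantized readings are invented for illustration):

```python
def delta_encode(samples):
    """Lossless differential encoding: first sample plus adjacent residuals."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def delta_decode(encoded):
    """Exact inverse: cumulative sum starting from the stored first sample."""
    out = [encoded[0]]
    for r in encoded[1:]:
        out.append(out[-1] + r)
    return out

readings = [1000, 1002, 1003, 1001, 1000, 999, 1001]   # quantized sensor counts
enc = delta_encode(readings)
assert delta_decode(enc) == readings                    # 100% restoration
print(enc)   # [1000, 2, 1, -2, -1, -1, 2]
```

The residual stream has a much narrower value range than the raw stream, which is exactly what a downstream entropy coder such as Huffman coding exploits.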

3. Related definitions

3.1 Data

Data is the fine-grained information of bridge structures, and the data for bridge resilience control is defined as equation (1).

d = ⟨l, t, v, r, k, w⟩   (1)

In equation (1), d is the data. l is the label of the data, which is unique in bridge resilience control. t is the type of the data. v is the value of the data. r is the associated information of the data. k is the knowledge information that constrains the data. w is the data weight used to represent the importance of the data in data compression, with 0 < w ≤ 1. For data with different knowledge constraints, there are significant differences in the value of w. Taking the section in bridge structures as an example: for rectangular sections, the width data and height data have the same importance when calculating the moment of inertia of the section; for I-shaped sections, however, the importance of the flange thickness and web height is significantly higher than that of the other data. The dataset consisting of all data for bridge resilience control is defined as equation (2).

D = {d_1, d_2, …, d_n}   (2)

The dataset D represented by equation (2) is a collection of all types of data for bridge resilience control. The amount of data contained in this set increases with operational time throughout the entire life cycle of the bridge. The dataset D is the direct object of data compression research.
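Definitions (1) and (2) can be sketched as a small Python data structure. The field names and example values below are illustrative choices, not part of the formal definition:

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Data:
    """One bridge-resilience data item, mirroring definition (1)."""
    label: str                      # unique label of the data
    dtype: str                      # type of the data
    value: float                    # value of the data
    assoc: Optional["Data"] = None  # associated data
    knowledge: Any = None           # knowledge constraining the data
    weight: float = 1.0             # importance weight, 0 < weight <= 1

# The dataset of definition (2): a collection of all data items.
dataset = [
    Data("s1.bf", "flange_width", 0.40, weight=0.9),
    Data("s1.tf", "flange_thickness", 0.02, weight=0.9),
    Data("s1.hw", "web_height", 1.20, weight=0.8),
]
assert all(0 < d.weight <= 1 for d in dataset)   # weight constraint from (1)
```

The higher weights on the flange data reflect the example in the text: for I-shaped sections, flange and web dimensions matter more than the remaining data.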

3.2 Knowledge

Knowledge is the core element that distinguishes different application fields. The knowledge in the field of bridges is defined as equation (3)

k = ⟨l_k, R, v_k⟩   (3)

In equation (3), l_k is the identifier for knowledge k. R is the relationship between the value of data v and the knowledge value v_k, with R ∈ {=, ≠, <, ≤, >, ≥}. v_k is the value of knowledge k, which comes from various codes or engineering experience. Therefore, the determination of data compliance does not need to be obtained through the training of large-scale models, which can significantly reduce the training workload of large-scale models. In this study, knowledge is applied to large-scale models through knowledge templates. Different data correspond to different domain knowledge, and the corresponding knowledge templates also differ. The cross-sectional knowledge template is defined as equation (4).

T_s = ⟨d_1, d_2, …, d_m, A⟩   (4)

In equation (4), A represents the knowledge associations between data. For example, in the cross-sectional knowledge template, the association a_ij ∈ A is the angle between the data d_i and d_j.

3.3 Correlation

Correlation refers to the relationship between data. The correlations studied in this article include geometric correlations and time-series correlations. Geometric correlation refers to the geometric shapes or structures composed of the data within a subset D_s of the dataset D, D_s ⊆ D. When D_s = D, D_s constitutes the entire bridge structure. Time-series correlation refers to the correlation generated in the bridge structure over time. For example, the acceleration data and displacement data collected by the bridge health monitoring system all satisfy time-series correlation.

3.3.1 Geometric correlation.

Based on geometric correlation, data can be used to establish sections, components, substructures, etc. Geometric correlation is defined as equation (5)

c_g = ⟨D_s, t_c, p, key, val, d_e⟩   (5)

In equation (5), D_s represents the data involved in the correlation, with D_s ⊆ D. t_c is the correlation type; for cross-sectional data, t_c = section. p is the correlation parameter, such as the geometric placement parameters in the geometric correlation of cross-sections. key is the key of the correlation, used to represent the attributes of the correlation. val is the value of the correlation, used to characterize the specific function of the correlation. d_e is the data associated with the influence of the correlation; d_e is related to the engineering characteristics and correlation attributes, and can be a single data item (a section) or multiple items (a substructure or structure). By applying geometric correlations, data can be integrated and evolved into sections, components, substructures, and structures.

3.3.2 Time-series correlation.

Time-series correlation represents the variation of data over time. Time-series correlation can be continuous or discrete, with equal or unequal intervals. Time-series correlation is defined as equation (6).

c_t = ⟨t_c, τ⟩   (6)

In equation (6), t_c represents the correlation type, which can be acceleration, displacement, strain, etc., and τ is the time of data acquisition. Time-series correlation can characterize the full lifecycle changes of certain data during the operational time of bridge structures, and it is the most direct information for recording continuous changes in bridge operation status.

4. Data compression methods

The goal of data compression is to reduce the amount of data required for bridge resilience control supported by large models. It mainly includes three aspects: first, knowledge-based compression, which mainly handles the compression of data with geometric correlations in bridges; second, time-series compression, which mainly handles the compression of data with time-series correlations; and third, sparse data compression, which mainly handles the compression of incomplete data in bridge resilience control.

4.1 Data compression based on domain knowledge

The significant difference between bridge resilience control data and conventional data lies in its domain engineering properties. The application of these engineering properties helps to significantly reduce the amount of data required for large-scale model training and improve model training efficiency.

4.1.1 Weight setting.

The main problem solved by intelligent algorithms in existing large-scale models is finding the optimal solution, and the selection of weights is one of the core components of such algorithms. Knowledge-based data compression applies domain knowledge to weight adjustment, avoiding the traditional method of using a large amount of data to train and adjust weights, which can significantly reduce the amount of training data. For example, in the process of refining cross-sectional data, each data item d_i has a corresponding importance coefficient θ_i. Based on θ_i, the weights of the large-scale model algorithm can be adjusted to accelerate convergence. Taking the neural network commonly used in large-scale models as an example, its output function and squared error are given by equations (7) and (8), respectively.

y_j = f(∑_{i=1}^{n} w_ij · x_i)   (7)

In equation (7), y_j is the output of the neural network, f is the activation function, w_ij is the connection weight between the i-th input neuron and the j-th output neuron, x_i is the value of the i-th input neuron, and n is the number of input neurons.

E = (1/2) ∑_{j=1}^{m} (ŷ_j − y_j)²   (8)

In equation (8), ŷ_j is the expected output value, and m is the number of output neurons. Using the gradient descent optimization algorithm, the weight update rule is equation (9).

w_ij ← w_ij − η · ∂E/∂w_ij   (9)

In equation (9), η is the learning factor, which is generally a positive constant. Therefore, the selection of η becomes the key to the efficiency and quality of large-scale model training: smaller values of η reduce training efficiency, while larger values may fail to reach the optimal solution.

In the process of data compression based on domain knowledge, η is designed as a variable whose value is limited by the importance coefficient θ_i corresponding to the data d_i. Taking the I-shaped cross-section as an example, the learning factor is given by equations (10) and (11).

η_i = η_0 · (1 − θ_i)   (10)

θ_i = Γ(d_i)   (11)

In equation (11), Γ(·) is used to extract the importance coefficient θ_i of the data d_i. The physical meaning of equation (10) is that a smaller learning factor is adopted for data with higher importance, while a larger learning factor is adopted for data with lower importance, so as to improve the training efficiency of the large model.
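The idea of equations (9)-(11) can be sketched with per-weight learning factors on a toy regression problem. The mapping from importance coefficients to learning factors below (eta = 0.1 · (1 − 0.5 · θ)), the importance values, and all other numbers are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))             # 4 section features (e.g., I-section data)
true_w = np.array([2.0, -1.0, 0.5, 3.0])  # ground-truth mapping to be learned
y = X @ true_w

# Importance coefficients per feature and the derived per-weight learning
# factors: higher importance -> smaller, more careful step, as in eq. (10).
theta = np.array([0.9, 0.9, 0.3, 0.9])
eta = 0.1 * (1.0 - 0.5 * theta)

w = np.zeros(4)
for _ in range(500):
    grad = 2 / len(X) * X.T @ (X @ w - y)  # gradient of the squared error, eq. (8)
    w -= eta * grad                         # element-wise update, one eta per weight

print(np.round(w, 3))
```

Because `eta` is a vector, each weight follows its own update rule from equation (9), with the step size governed by its feature's importance coefficient.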

4.1.2 Algorithm.

The difference between knowledge-based data compression algorithms and traditional algorithms lies in the application of domain knowledge to set the learning factor η. At the same time, domain knowledge is applied to convert raw data into bridge data with engineering properties.

Taking the I-shaped section as an example, each data record stores dimensions such as flange width, flange thickness, and web height, which can be combined into an I-shaped section through domain knowledge. A neural network model is invoked in the proposed algorithm. This model is a fully connected deep neural network: the input layer consists of 4 feature dimensions, the hidden part is composed of three layers with the ReLU activation function, and the output layer contains a single neuron with a linear activation function. During training, adaptive training is realized by means of early stopping and learning-rate decay, which ensures the convergence of the model while mitigating overfitting. In addition, the differentially weighted features are applied throughout the entire training process. The mean squared error (MSE) is selected as the loss function for this network. This loss function works well with the Adam optimizer, enabling stable gradient calculation and fast convergence. Meanwhile, the square term in the MSE imposes a heavier penalty on large errors, driving the model to prioritize the correction of samples with large deviations.
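The architecture described above (4 inputs, three ReLU hidden layers, one linear output, MSE loss) can be sketched as a plain NumPy forward pass. The hidden-layer width of 16 is an assumption, since the paper does not state it, and the weights here are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, weights, biases):
    """Forward pass of the described network: ReLU hidden layers, linear output."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, h @ W + b)       # ReLU activation
    return h @ weights[-1] + biases[-1]      # single linear output neuron

def mse(y_pred, y_true):
    """Mean squared error loss, as paired with the Adam optimizer in the text."""
    return float(np.mean((y_pred - y_true) ** 2))

# 4 input features -> three hidden layers (width 16, an assumption) -> 1 output
sizes = [4, 16, 16, 16, 1]
weights = [rng.normal(0.0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

x = rng.normal(size=(8, 4))   # 8 samples of flange/web dimension features
y = mlp_forward(x, weights, biases)
```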

Algorithm: Bridge Section Data Closed-Loop Processing and Dataset Update Algorithm

// Input: D – Original bridge structure dataset; T – Cross-sectional data template

// Output: D – Updated bridge structure dataset

Algorithm BridgeSectionDataProcessing(D, T)

   S := ∅

   for each d in D do

     if d is cross-sectional data then

      current_d = d

      while current_d.next ≠ ∅ do

       Add current_d to S

       current_d = current_d.next

      end while

      Add current_d to S

     end if

   end for

   M = NeuralNetworkModel(S)

   for each s in S do

     Create new data point d*

     d*.v = M(s)

     Add d* to D

   end for

   RecursiveRemove(D, S, T)

   return D

end Algorithm
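The closed-loop pseudocode might be rendered in Python roughly as follows. The record layout (a `next` link following the template) and the identity stand-in for the trained model are illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Record:
    kind: str                        # e.g. "cross-section" or "compressed"
    value: float
    next: Optional["Record"] = None  # template-defined link to the next entry

def bridge_section_data_processing(D, model):
    """Collect template-linked cross-sectional records, map each through the
    model, append the compressed points, and remove the raw records."""
    S = []
    for d in list(D):
        if d.kind == "cross-section":
            cur = d
            while cur.next is not None:    # walk the template-linked chain
                S.append(cur)
                cur = cur.next
            S.append(cur)
    for s in S:                            # one engineering data point per entry
        D.append(Record("compressed", model(s.value)))
    for s in S:                            # remove raw records still in D
        if s in D:
            D.remove(s)
    return D

# Chain of I-section dimensions: flange width -> flange thickness -> web height
web = Record("cross-section", 1400.0)
flange_t = Record("cross-section", 80.0, next=web)
flange_w = Record("cross-section", 800.0, next=flange_t)
D = bridge_section_data_processing([flange_w], model=lambda v: v)
```

After the call, the dataset contains only records with engineering attributes, closing the loop the pseudocode describes.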

4.2 Compression of time-series data

The data for bridge resilience control includes real-time data collected by the bridge health monitoring system. The typical characteristic of this type of data is continuous cyclic sampling, with an exceptionally large volume of data that requires a significant amount of computing and communication resources. The method of time-series data compression is to analyze and transform the data within the window period, and then replace the entire window period with the center point data.

4.2.1 Definition and transformation of time-series data.

(1) Displacement mode dataset

The definition of displacement mode dataset is shown in equation (12).

$\Phi(t) = \{ \phi_1(t), \phi_2(t), \ldots, \phi_n(t) \}$ (12)

In equation (12), $\phi_i(t)$ is the displacement mode value analyzed based on the $i$-th sensor, $\Phi(t)$ is an ordered, time-dependent set, and $n$ is the number of sensors in the collection system. The acquisition times of all $\phi_i(t)$ are synchronized.

(13)

In equation (13), $\phi_i(t_k)$ is the displacement mode value at time $t_k$, and $\phi_i(t_{k+1})$ is the displacement mode value at time $t_{k+1}$. The format follows equation (1).

(2) Window size

Time-series data compression refers to the compression of data within the window period. The size of the window is defined as the window size, denoted by $W$. When $W$ is small, the data loss is relatively small but the compression ratio is low; when $W$ is large, the compression ratio is high but the data loss is relatively large. The window size is correlated with the operational time of the bridge and with abrupt incidents during its operation. The nonlinear model for the natural deterioration of concrete proposed in Ref. [21] is expressed as equation (14).

(14)

In equation (14), $C_t$ denotes the bridge technical condition in the $t$-th year, and the value of 95 represents the initial condition score of the bridge. The initial value of the window size $W_0$ can be selected according to the operational time of the bridge based on equation (14); it is calculated by equation (15) as follows:

(15)
(3) Data transformation

Data transformation converts the time-series data for bridge resilience control based on engineering properties. After transformation, the engineering significance of the data becomes clearer and the process of data compression is simplified. For example, equation (16) transforms the displacement modes of multiple acquisition points at the same time into a curvature mode:

$\kappa_i = \dfrac{\phi_{i-1} - 2\phi_i + \phi_{i+1}}{l^2}$ (16)

In equation (16), $\kappa_i$ is the curvature mode at measurement point $i$, $\phi_i$ is the displacement mode at measurement point $i$, and $l$ is the distance between two adjacent measuring points. Through data transformation, the displacement mode obtained by the acquisition system is converted into the curvature mode of each measuring point.
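One way to realize the displacement-to-curvature transformation is the standard central-difference estimate over equally spaced measuring points; the sketch below assumes this form and an illustrative 15 m span with 1 m sensor spacing.

```python
import numpy as np

def curvature_mode(phi, l):
    """Central-difference curvature estimate at interior measuring points:
    kappa_i = (phi_{i-1} - 2*phi_i + phi_{i+1}) / l**2."""
    phi = np.asarray(phi, dtype=float)
    return (phi[:-2] - 2.0 * phi[1:-1] + phi[2:]) / (l * l)

# Displacement mode sampled every metre along a 15 m span (illustrative)
x = np.linspace(0.0, 15.0, 16)
phi = np.sin(np.pi * x / 15.0)        # first-bending-mode-like shape
kappa = curvature_mode(phi, l=1.0)    # negative (sagging) along the span
```

For the sinusoidal mode shape the estimate closely tracks the analytic curvature $-(\pi/15)^2 \sin(\pi x / 15)$, which is a quick sanity check on the transformation.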

4.2.2 Time-series data compression algorithm.

Applying data compression algorithms to process the curvature modes within the window and replacing the time-series data with the center point data reduces the amount of system data. The steps of the compression are as follows.

Algorithm: Curvature Mode Data Compression Based on GIN Classification

// Input:

// - G: Curvature mode dataset (raw input)

// - t: Bridge operational time (parameter)

// Output: C – Compressed curvature mode

Algorithm TimeSeriesDataCompression(G, t)

   W = InitialWindowSize(t) // Initial window size from equation (15)

   C := ∅; P := ∅

   for each g in G do

      p = GenerateCurvatureModeDiagram(g)

     Add p to P

      k = CallGINClassification(P)

     while k > 1 do

       W = W / 2 // Halve the window step size

       k = CallGINClassification(P)

     end while

     sum_dj = 0

     for j from 1 to W do

      sum_dj = sum_dj + d_j

     end for

     c = sum_dj / W

     Add c to C

     Remove g from G

   end for

   if G ≠ ∅ then

     goto Step 3 // Re-enter loop if unprocessed data remains in G

   end if

   return C

end Algorithm

This algorithm transforms displacement modes into curvature modes and applies a large-scale image classification algorithm to classify the curvature mode maps within a time step. When the number of classifications equals 1, it is considered that the curvature mode data at each time point can be refined, and the mean of all curvature modes is taken as the representative value of the refined data.

In the above algorithm, the GIN model is invoked to classify the graphs generated from the curvature modes. When the number of categories equals 1, the data in the category are compressed by taking their mean value. When the number of categories is greater than 1, half of the window size is used to perform reclassification. Meanwhile, the data fidelity rate is set to 95% in the program design.
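The window-halving loop can be sketched as follows. The range-based `n_classes` function is a stand-in for the GIN graph classifier and is an assumption made only for illustration; the compression logic (mean of a steady window, halving otherwise) follows the steps described above.

```python
import numpy as np

def n_classes(window, tol=0.5):
    """Stand-in for CallGINClassification: one class if the window is
    near-steady, otherwise more than one (illustrative assumption)."""
    return 1 if (max(window) - min(window)) < tol else 2

def compress_series(series, w0, classify):
    """Replace each window by its mean; halve the window while the
    classifier reports more than one class, down to a single sample."""
    out, i = [], 0
    while i < len(series):
        w = min(w0, len(series) - i)
        while classify(series[i:i + w]) > 1 and w > 1:
            w //= 2                        # halve the window step size
        out.append(float(np.mean(series[i:i + w])))
        i += w
    return out

data = [0.0] * 8 + [5.0] + [0.0] * 7       # steady signal with one abrupt peak
compressed = compress_series(data, w0=8, classify=n_classes)
```

The abrupt value 5.0 survives as its own compressed point instead of being averaged away, which is the behavior the classification step is meant to guarantee.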

4.3 Sparse data compression

Affected by factors such as unexpected equipment failures and extreme weather, the data acquired by the information acquisition system for bridge resilience control may suffer from missed sampling, leading to incomplete data information [22–24]. Directly discarding sparse data reduces the continuity of bridge resilience control and fails to effectively characterize the continuous variation of the bridge's operational performance throughout its whole life cycle. To effectively utilize these sparse data, they are first supplemented in this study and then compressed with the time-series data compression algorithm introduced in Section 4.2.2. The method of data supplementation is as follows:

The compensation window size for missing data is set as $W$ in accordance with equation (15). Let the missing data set be $X = \{x_1, x_2, \ldots, x_n\}$, where $x_i$ denotes the missing value of the $i$-th sensor, $n$ represents the number of sensors in the acquisition system, and $t_s$ is the sampling time. The $i$-th entry of the preceding dataset window is taken as $p_i$, the data mean value of the $i$-th sensor in the preceding window; the $i$-th entry of the subsequent dataset window is taken as $s_i$, the data mean value of the $i$-th sensor in the subsequent window. Equation (17) is then applied to fill the missing data set:

$x_i = \lambda p_i + (1 - \lambda) s_i$ (17)

Equation (17) adopts a linear combination that comprehensively considers the correlation between the preceding and subsequent data and the missing data. In this equation, $\lambda$ denotes the reference value for the linear combination, with a value range of $[0, 1]$. The value of $\lambda$ is set to 1 when the missing bridge data are collected at the initial operation stage of the bridge (on the assumption that there is no structural deterioration in the initial stage), and the values at other sampling times are calculated in accordance with equation (18):

(18)

Equation (18) indicates that $\lambda$ decreases with increasing operational time. For long-term missing data, the filled value is closer to the subsequent data, thus reflecting the most unfavorable state for bridge resilience control.
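A minimal sketch of the filling rule, assuming equation (17) takes the stated linear-combination form with the preceding-window mean $p_i$ and subsequent-window mean $s_i$ (the time-dependent $\lambda$ of equation (18) is supplied here as a plain parameter):

```python
def fill_missing(pre_means, post_means, lam):
    """Fill each sensor's missing value as a lambda-weighted combination of
    the preceding-window mean p_i and the subsequent-window mean s_i."""
    assert 0.0 <= lam <= 1.0, "lambda must lie in [0, 1]"
    return [lam * p + (1.0 - lam) * s for p, s in zip(pre_means, post_means)]

pre = [2.0, 10.0]    # per-sensor means of the preceding window (illustrative)
post = [4.0, 6.0]    # per-sensor means of the subsequent window
early = fill_missing(pre, post, lam=1.0)    # initial operation stage
late = fill_missing(pre, post, lam=0.25)    # long operational time
```

With $\lambda = 1$ the fill reproduces the preceding data, while a small $\lambda$ pulls the fill toward the subsequent data, matching the most-unfavorable-state interpretation above.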

5. Simulation analysis

5.1 Compression based on knowledge

In the simulation analysis, 50 cross-sectional dimension sets were randomly selected as the validation set from a dataset of 3965 doubly symmetric I-shaped cross-sections (with flange thickness ranging from 700 mm to 1000 mm, web height ranging from 1200 mm to 1600 mm, and web thickness of 80 mm). Among the remaining cross-sections, 80% were used as the training set and the other 20% as the test set. A neural network model was then adopted for calculation, and the predicted values and actual values of the 50 cross-sections output by the model are shown in Fig 1 ($I_x$) and Fig 2 ($I_y$), respectively.

thumbnail
Fig 1. Comparison between predicted and actual values of $I_x$ for the doubly symmetric I-shaped cross-section.

https://doi.org/10.1371/journal.pone.0346272.g001

thumbnail
Fig 2. Comparison between predicted and actual values of $I_y$ for the doubly symmetric I-shaped cross-section.

https://doi.org/10.1371/journal.pone.0346272.g002

The moment of inertia of the cross-section about the x-axis ($I_x$) is presented in Fig 1, for which the maximum relative error of the predicted values relative to the actual values is 7.03%, with a mean relative error of 2.42%. As shown in Fig 2 for the moment of inertia about the y-axis ($I_y$), the maximum and mean relative errors are 4.54% and 1.28%, respectively.

5.2 Time-series data compression

Taking a simply supported beam with a reinforced concrete T-shaped cross-section as an example, C50 concrete was adopted for the beam, with a calculated span of 15 m, a flange width of 800 mm, a flange thickness of 100 mm, a cross-section height of 1200 mm and a web thickness of 100 mm. 8% non-Gaussian noise (including traffic noise, wind noise and temperature noise) was added to the simulated acceleration data. The time-series data compression algorithm proposed in Section 4.2.2 was applied, and the corresponding compression results are presented in Fig 3.

thumbnail
Fig 3. Results of time-series data compression.

https://doi.org/10.1371/journal.pone.0346272.g003

As can be seen from Fig 3, the number of rows of the original data is 3600, and the number of rows of the data after compression is 262, with a compression ratio of 92.72%. The average data fidelity after compression is 97.77%, and the minimum data fidelity is 95.00%. To verify the advantages of the algorithm proposed in this paper compared with the existing algorithms, a comparative study was conducted between the proposed algorithm and the conventional algorithms (PCA, 1D Convolutional Auto-Encoder, Wavelet Transform, SAX and PAA) in the simulation analysis. Under the condition of 95% accuracy, the compression ratios of various algorithms are presented in Fig 4.
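The reported compression ratio follows directly from the row counts:

```python
def compression_ratio(n_original, n_compressed):
    """Fraction of rows removed by compression, as reported for Fig 3."""
    return 1.0 - n_compressed / n_original

ratio = compression_ratio(3600, 262)   # ~0.9272, i.e. 92.72%
```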

thumbnail
Fig 4. Compression ratios of various algorithms (95% accuracy).

https://doi.org/10.1371/journal.pone.0346272.g004

As can be seen from Fig 4, the compression ratio of the algorithm proposed in this paper is the highest under the same accuracy. The compression ratios of the PAA and SAX algorithms are close to that of the proposed time-series data compression algorithm. However, PAA and SAX have inherent defects when applied to bridge resilience control. Specifically, the PAA algorithm divides time-series data into fixed segments of equal length and uses the mean value of each segment to represent the characteristics of the entire segment. If the peak value of the bridge acceleration falls exactly on a segment boundary, this peak value will be averaged with the other data in the segment, resulting in the loss of key signals of structural anomalies. The SAX algorithm adds a symbol-mapping step on top of PAA; it therefore inherits the defects of PAA and introduces new ones. For example, SAX divides the symbol boundaries through the quantiles of the normal distribution, and these boundaries are fixed. In addition, SAX only symbolizes the PAA value of a single segment without considering the temporal correlation between segments. Similarly, PCA, the auto-encoder and the wavelet transform do not incorporate the characteristics of bridge structures, which may lead to the loss of key structural information and render the dimensionality-reduction results devoid of engineering significance.
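The PAA defect described above is easy to reproduce: a single acceleration peak is averaged into its segment and disappears from the approximation. This sketch is illustrative only, not the comparison code used in the simulation.

```python
import numpy as np

def paa(series, n_segments):
    """Piecewise Aggregate Approximation: split into equal-length segments
    and represent each segment by its mean."""
    segments = np.array_split(np.asarray(series, dtype=float), n_segments)
    return np.array([seg.mean() for seg in segments])

signal = np.zeros(32)
signal[15] = 8.0                 # structural-anomaly peak near a segment boundary
approx = paa(signal, 4)          # the 8.0 peak is smeared down to 1.0
```

After PAA the maximum of the approximation is 1.0, so the anomaly signature is essentially lost — exactly the failure mode that makes PAA unsuitable for bridge resilience control.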

5.3 Sparse data filling and compression

The same case as that of time-series data compression was adopted for sparse data compression. In this study, it is assumed that the modal displacement data of Node 5 are missing in the range of 10% to 70%. The data filling method proposed in Section 4.3 was applied, and the errors after filling are presented in Table 1. In the simulation analysis, the operational time of the bridge is set as 10 years, and the window size is taken as 90 according to equation (15).

thumbnail
Table 1. Error statistics of sparse data after filling.

https://doi.org/10.1371/journal.pone.0346272.t001

As can be seen from Table 1, the maximum MSE and mean MSE after data filling are both relatively small for different data missing ratios, which indicates that the filled results are consistent with the true values. It should be noted that the error becomes significantly large when abrupt changes occur in the data within the missing segment. When the missing ratio of sparse data is 30%, and the window sizes are set as 90 (10 years of operational time), 80 (16 years), 70 (20 years), 60 (24 years) and 50 (26 years), the corresponding data compression ratios are presented in Table 2.

As can be seen from Table 2, within the range of window sizes from 50 to 90, the data compression ratio is close to 90%, and there is little variation across different window sizes. This indicates that the result of data filling is essentially a weighted average of the preceding and subsequent data. The maximum difference between the compression ratio after filling and that of the original data is 4.4%.

6. Conclusion and outlook

Data is the core of bridge resilience control. However, with the continuous development of bridge resilience control technologies, the generation of massive data has posed new challenges to data storage, transmission and processing, resulting in low efficiency of bridge resilience control. To address this problem, domain knowledge, time-series data characteristics and bridge deterioration models are integrated into the data compression algorithm in this study, which achieves a substantial simplification of the data required for bridge resilience control. The main research conclusions are as follows:

First, domain knowledge is introduced into the general model, and a data compression algorithm based on domain knowledge is proposed. The results of simulation analysis show that the data compression ratio of this algorithm reaches 75%. Meanwhile, the compressed data has a clear physical meaning and can be directly applied to the resilience control of bridges.

Second, a compression algorithm combining time-series data and the bridge deterioration model is proposed. This algorithm realizes the refinement processing of dynamic time-series data, with the maximum refined compression ratio of the data reaching more than 92%. The results of comparative analysis with existing algorithms show that when the data fidelity rate is the same, the compression ratio of the algorithm proposed in this paper is higher, and the physical information of bridge structures can be retained effectively.

Third, a sparse data completion method based on the data sets before and after the measuring points is proposed. Meanwhile, the above-mentioned refined compression algorithm for time-series data is applied to compress the completed data, with the maximum data compression ratio after processing exceeding 90%.

The data compression algorithms proposed in this paper integrate the domain knowledge of bridge resilience control with the inherent time-series characteristics of data, which significantly reduces the volume of data required for bridge resilience control. Nevertheless, several issues in this study remain to be further explored and discussed:

First, the effectiveness of the algorithms is verified by simulated data in this study, and their actual performance will be validated in subsequent tests on real bridges. In particular, there are still differences between the simulated noise in the study and the actual noise of real bridges. Meanwhile, noise reduction algorithms will also be a key focus of future research.

Second, in the data compression based on domain knowledge, a biaxially symmetrical I-shaped section is adopted for analysis in this study. In follow-up research, analysis and comparison should be conducted for other types of cross-sections.

Third, the simply supported beam bridge is taken as an example to study the data compression algorithms in this research. Subsequent studies should carry out relevant analysis for other bridge types, such as continuous beam bridges, arch bridges, cable-stayed bridges and suspension bridges.

Fourth, the sparse data completion in this study only considers the data sets before and after a single measuring point. In the future, the research on sparse data completion for consecutive multiple measuring points will be carried out as a key focus.

References

  1. Bernardini L, Bono FM, Collina A. Drive-by damage detection based on the use of CWT and sparse autoencoder applied to steel truss railway bridge. Adv Mech Eng. 2025;17(5):1–24.
  2. Singh P, Bana D, Sadhu A. Improved bridge modal identification from vibration measurements using a hybrid empirical Fourier decomposition. J Sound Vib. 2024;590:118598.
  3. Burrello A, Marchioni A, Brunelli D. Embedding principal component analysis for data reduction in structural health monitoring on low-cost IoT gateways. ACM. 2019.
  4. He D, Wang B, Gao X. An adaptive filtering method for bridge vibration signals based on improved CEEMDAN and multi-scale permutation entropy. EESRJ. 2021;8(4).
  5. Zhou L, Lai P, Zhao W, Yang Y, Shi A, Li X, et al. A noise reduction method for GB-RAR bridge monitoring data based on CEEMD-WTD and PCA. Symmetry. 2025;17(4):588.
  6. Zhang F-y, Yang D, Gong X-y, Zou J, Lan L. A wavelet-based data compression algorithm for bridge vibration. 2012 IEEE 14th International Conference on Communication Technology; 2012. p. 334–41. https://doi.org/10.1109/icct.2012.6511239
  7. Kullaa J. Damage detection and localization under variable environmental conditions using compressed and reconstructed Bayesian virtual sensor data. Sensors (Basel). 2021;22(1):306. pmid:35009842
  8. Chencho, Li J, Hao H, Li L. Structural damage classification of large-scale bridges using convolutional neural networks and time domain responses. J Perform Constr Facil. 2024;38(4):20242216181569.
  9. Guo A, Jiang A, Lin J, Li X. Data mining algorithms for bridge health monitoring: Kohonen clustering and LSTM prediction approaches. J Supercomput. 2019;76(2):932–47.
  10. Li Z, Li D, Sun T. A transformer-based bridge structural response prediction framework. Sensors (Basel). 2022;22(8):3100. pmid:35459083
  11. Datta Rachuri R, Liao D, Sarikonda S, Kondur DV. A multimodal fusion framework for bridge defect detection with cross-verification. 2024 IEEE International Conference on Big Data (BigData); Washington, DC, USA; 2024. p. 3292–300. https://doi.org/10.1109/BigData62323.2024.10825867
  12. Gadiraju DS, Azam SE, Khazanchi D. SHM-traffic: DRL and transfer learning based UAV control for structural health monitoring of bridges with traffic. arXiv:2402.14757. 2024. https://arxiv.org/abs/2402.14757
  13. Hwang S-H, Kim K-M, Kim S, Kwak JW. Lossless data compression for time-series sensor data based on dynamic bit packing. Sensors (Basel). 2023;23(20):8575. pmid:37896669
  14. Pham-Hong Q, Tran Hung V, Nguyen CT, Bui PL, Mai-Duc A. Damage detection in steel truss bridges using 1D-CNN-BiGRU network with time-series data. Eng Comput. 2025;12:1–21.
  15. Xiao L, Lu X, Huang T, Lv Y, Peng M, Miao C, et al. Hybrid SSA-LSTM based mechanical property degradation prediction for corroded steel wires of long-span cable supported bridges. Case Stud Constr Mater. 2025;23:e05488.
  16. Wang Z, Liu H, Han Y, Jiang L. A hybrid GWO-VMD-LSTM surrogate model for vehicle-track-bridge response prediction under near-fault earthquakes. Int J Str Stab Dyn. 2025;10:2750090.
  17. Perez A, Rezaeiravesh S, Ju Y, Laure E, Markidis S, Schlatter P. Compression of turbulence time series data using Gaussian process regression. Comput Phys Commun. 2026;319:109914.
  18. Wang W, Liu L, Zhang K, Yang K, Kuang L, Zheng Z, et al. NPAC: numeric pattern aware compression algorithm for floating-point time-series data. World Wide Web. 2025;28(6).
  19. Pöppelbaum J, Schwung A. Time series compression using quaternion valued neural networks and quaternion backpropagation. Neural Netw. 2025;188:107465. pmid:40286679
  20. Miao Z, Li W, Pan X. Multivariate time series collaborative compression for monitoring systems in securing cloud-based digital twin. J Cloud Comput: Adv Syst Appl. 2024;13:16.
  21. Tianzhi H, Zhigang M, Longyu W. Research and application of nonlinear deterioration model based on bridge technical condition. West China Commun Sci Technol. 2018;6:92–5.
  22. Longji Z, Zhi Y, Jiaqing L, Wenhua L, Jingchun M. Spatiotemporal dependency data imputation for long-term health monitoring of concrete arch bridges. Sci Rep. 2025;15(1):36218. pmid:41102339
  23. Entezami A, Sarmadi H, Behkamal B. Long-term health monitoring of concrete and steel bridges under large and missing data by unsupervised meta learning. Eng Struct. 2023;279:115616.
  24. Xin J, Mo X, Jiang Y, Tang Q, Zhang H, Zhou J. Recovery method of continuous missing data in the bridge monitoring system using SVMD‐assisted TCN–MHA–BiGRU. Struct Control Health Monit. 2025;2025(1):8833186.