Digital media pattern design compression and optimization method based on K-means clustering and LLE dimensionality reduction

  • Binmei Liu,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliation The Academy of VR and Art, Jiangxi University of Software Professional Technology, Nanchang, China

  • Shiwan Zhou,

    Roles Investigation, Project administration, Writing – review & editing

    zhou649792703@163.com

    Affiliation The Academy of VR and Art, Jiangxi University of Software Professional Technology, Nanchang, China

  • Jing Sun,

    Roles Formal analysis, Funding acquisition, Writing – review & editing

    Affiliation College of Creative Arts, Universiti Teknologi MARA, Selangor, Malaysia

  • Xiaojuan Liu,

    Roles Software, Supervision, Writing – original draft

    Affiliation College of Humanities, Gandong University, Fuzhou, China

  • Wenting Lu

    Roles Methodology, Software, Validation

    Affiliations The Academy of VR and Art, Jiangxi University of Software Professional Technology, Nanchang, China, Postdoctoral Research Station Pioneer Software Inc, Jiangxi University of Software Professional Technology, Nanchang, China

Abstract

The rapid development of digital media demands higher quality in pattern design. Simultaneously, significant redundant information hinders the transmission and sharing of these patterns, creating an urgent need for image compression. However, traditional image compression methods struggle to balance efficiency and quality. To address this, a new image compression model for digital media patterns is proposed, based on K-means clustering and Locally Linear Embedding methods. This model creates an efficient compression solution by integrating dynamic clustering parameter selection based on color histograms, multi-dimensional image segmentation of color and texture, and a local linear embedding dimensionality reduction algorithm using dynamic neighborhood selection. The model achieves a compression ratio of 84% and a Peak Signal-to-Noise Ratio of 41 dB, with no significant quality difference before and after compression, validating the effectiveness of the improvements made to the basic methods. In practical application experiments, the model’s multi-scale structural similarity reaches 0.71. When processing a large number of patterns, the fastest response time is 189 ms, with a minimum memory usage of 16.1 MB. The shortest processing time on a single-core processor is 0.34 s. These experimental results demonstrate that the model balances compression efficiency and quality, offering superior compression performance, good robustness, and the ability to handle various complex tasks and adapt to different application scenarios, meeting the high standards of the digital media industry for image compression.

1. Introduction

With the development of the internet and the iteration of various graphic design software, digital media patterns have not only improved in quality but also become more content-rich, while facing storage and transmission issues. Digital media patterns are visual elements in digital media art, typically defined as graphics or decorative motifs created via digital technology with specific design intentions, including but not limited to ordered combinations of visual elements such as color composition, texture features, shape layout, etc. Digital media patterns are transformed, reorganized, and integrated while respecting the symbolic meaning of the original image to better align with modern aesthetics and market demand. Their high resolution and complex designs provide users with a strong visual impact [1,2]. However, these characteristics impose higher demands on image storage and transmission, requiring high-quality compression of the original patterns to ensure smooth transmission and sharing of digital media patterns [3]. The large volume of data and the presence of redundant data make the storage and transmission of digital media patterns difficult and inefficient [4]. To improve storage and transmission efficiency, save bandwidth and energy, and enhance user experience, compression and optimization of digital media patterns are necessary. However, traditional image compression methods achieve smaller image sizes at the cost of image quality [5,6]. Huang et al. proposed an image data semantic communication model that utilizes reinforcement learning and adaptive semantic coding to perform pixel-level image encoding and reconstruction. The results indicate that the model has noise resistance and can compress and reconstruct images under low bit rate conditions [7]. To address the hardware resource burden caused by hyperspectral pattern processing and storage, Zhang et al. proposed an unsupervised dimensionality reduction method based on dynamic mode decomposition. The method showed good performance in terms of average peak signal-to-noise ratio and average principal angle mapping in comparative experiments, proving it to be a useful tool [8]. To balance the high compression ratio and image quality in medical image processing, and alleviate pressure on storage capacity and hardware devices, Dimililer et al. trained machine learning algorithms to ensure both compression ratio and image quality. Experimental results showed that the radial basis function neural network can be effectively used for classifying the optimal compression ratio, achieving a highest recognition rate of 90.625% [9]. Duan et al. proposed a new lossy image compression scheme based on variational autoencoders to address the lossy image compression issue. This scheme enables efficient decoding and variable-rate compression, with numerous experimental results indicating significant advantages in compression rate and rate-distortion performance over existing benchmark methods [10].

The K-means clustering algorithm is a widely used clustering analysis method due to its simplicity, speed, and broad applicability. It has become a favored research topic among experts and scholars in various fields. For example, Nie et al. re-expressed the objective function of the traditional K-means clustering algorithm as a maximization problem, reducing the need for additional intermediate variables and lowering computational complexity, thereby alleviating pressure on computer resources [11]. To achieve image compression in environments with limited transmission bandwidth and storage capacity, Liang et al. proposed a compression algorithm that combines the K-means clustering algorithm with neural networks. The results showed that compared with traditional neural network algorithms, the proposed algorithm improved the running time by 9.5% [12]. Zhao et al. proposed a lossless encryption method based on dual images, which introduces the characteristics of memristors into four-dimensional cellular neural networks and obtains partial keys through the K-means to make the keys more random. The results show that the proposed method exhibits good encryption performance in various indicators such as key sensitivity and information entropy [13].

Locally Linear Embedding (LLE), known for its ability to handle high-dimensional data, has been studied and optimized by scholars both domestically and internationally. For example, Lee et al. proposed a method for detecting abnormalities in the manufacturing industry based on the LLE algorithm, addressing the limitations of data utilization caused by imbalanced data. This method clusters the normal data distribution using the LLE algorithm to label different types of anomalies. Comparative experiments confirmed the superiority of this method in dimensionality reduction, with an F-1 score of 0.86 [14]. Chu et al. combined kernel-based LLE and partial least squares to fully exploit both local and global data information in a dataset, proposing a new machine operation performance evaluation method. Experiments showed that this method had significant advantages in detecting machine issues efficiently, helping producers adjust machines in time to maximize overall economic benefits [15].

In summary, experts and scholars in various fields have explored and researched image compression processing and optimization, with applications already realized in some areas. However, existing methods still have shortcomings in balancing compression quality and computational efficiency. Traditional K-means clustering relies heavily on manually set parameters and selects its initial centers randomly, while traditional LLE algorithms are sensitive to neighborhood parameters and exhibit instability on high-dimensional feature data. In response to these issues, this study proposes a K-means-LLE compression model that combines dynamic parameter selection with multi-dimensional feature fusion, filling the research gap in adaptive parameter optimization, complementary use of multi-dimensional features, and local structure preservation for dimensionality reduction in image compression. The research aims to construct a compression model that can adaptively handle digital media patterns, achieve a better balance between compression ratio and visual quality, and promote efficient storage, fast transmission, and intelligent processing of digital media resources.

The innovation of this study lies in: (1) proposing a clustering center initialization method based on geometric principles, and dynamically determining the number of clusters using color histograms to improve the stability and clustering quality of K-means, reducing the dependence on manually set parameters. (2) improving the nearest neighbor selection mechanism of the LLE algorithm, introducing class sample information to enhance the dimensionality reduction effect, and adaptively adjusting the neighborhood size based on local curvature measurement. (3) building an end-to-end compression model that integrates improved K-means and LLE algorithms to achieve high-quality compression and efficient dimensionality reduction.

2. Methods and materials

2.1. Design of pattern compression method based on k-means clustering

K-means represents the entire image using fewer colors, thereby achieving rapid image compression. The process of compressing an image using K-means involves first converting the image into a specific one-dimensional array, with each pixel represented as an element. Then, the clustering algorithm is applied to these one-dimensional arrays, clustering them into a fixed number of clusters. Based on the algorithm’s results, the centroid of each cluster is located, and the color of the centroid replaces the original pixel colors in the image. Finally, the compressed pixels are reconstructed to obtain the compressed image. K-means segmentation is simple and independent of dataset size. However, since the initial cluster centers are selected randomly, the quality of the clustering results depends heavily on the selection of these centers, leading to potential instability. Therefore, this study selects initial cluster centers using Euclidean distance and vector angle parameters, prioritizing points that maximize both inter-cluster distance and centroid vector angle differences (Fig 1).
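The baseline procedure described above (flatten the pixels, cluster them, replace each pixel with its centroid color, and reshape) can be illustrated with a minimal NumPy sketch. It uses plain K-means with random initial centers, which is the baseline the study improves on, not the proposed method. The function names `assign` and `kmeans_quantize`, the iteration count, and the seed are assumptions made for illustration.

```python
import numpy as np

def assign(pixels, centers):
    """Return the index of the nearest center for every pixel."""
    d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans_quantize(image, k, n_iter=20, seed=0):
    """Reduce an (H, W, C) uint8 image to at most k colors (hypothetical helper)."""
    # Flatten the image into a list of pixels, one row per pixel.
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    rng = np.random.default_rng(seed)
    # Random initial centers drawn from the pixels (baseline behavior).
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(pixels, centers)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # keep the previous center if a cluster is empty
                centers[j] = members.mean(axis=0)
    labels = assign(pixels, centers)  # final assignment with the updated centers
    palette = np.clip(np.rint(centers), 0, 255).astype(np.uint8)
    # Replace every pixel with its centroid color and restore the image shape.
    return palette[labels].reshape(image.shape), palette
```

Because the initial centers are random, two runs with different seeds can yield different palettes, which is the instability that motivates the geometric initialization discussed next.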

Fig 1. Initialization of cluster centers based on geometric principles.

https://doi.org/10.1371/journal.pone.0350623.g001

As seen in Fig 1, the algorithm for initializing cluster centers based on geometric principles first reduces the dimensionality of the dataset, transforming high-dimensional data into a two-dimensional dataset. The mean center of the dataset is then computed, and the distance from each cluster to the mean center is calculated. The cluster with the largest distance is chosen as the first initialization center. The vector between the first initialization center and the mean center serves as the reference vector, and the Euclidean distance between the mean center and each of the other clusters, excluding the first initialization center, is calculated to form the distance set. The angles formed by the three points (the mean center, the first initialization center, and each of the other clusters) are calculated to form the angle set. Both the distance set and the angle set are sorted in descending order, and the cluster with the smallest combined rank of distance and angle is selected as the next initialization center. This process is repeated until k initialization centers are found. The vector between two clusters is calculated as shown in Equation (1).

$$\vec{v}_{xy} = y - x \tag{1}$$

In Equation (1), $x$ and $y$ represent two different clusters. The Euclidean distance between the two vectors is obtained using Equation (2).

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{2}$$

In Equation (2), $x$ and $y$ represent two $n$-dimensional clusters. The angle between the two vectors is calculated using Equation (3).

$$\theta = \arccos\left(\frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}\right) \tag{3}$$

In Equation (3), $u$ and $v$ represent two vectors. Using geometric principles to select cluster centers can reduce the impact of random cluster center selection on the clustering results and yield better clustering performance. However, the selection of the k value still depends on manual input, requiring the user to have a high level of expertise to select an appropriate value. Color histograms quantify the colors in an image, enabling the statistical analysis of the image’s color features [16,17]. This technique adapts well to various image variations and provides robust global feature descriptions. It has been successfully applied in fields such as photographic post-processing and computer vision [18]. Therefore, the study introduces color histograms to dynamically determine the number of clusters that best suits the image’s color features (Fig 2).
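The geometric initialization described with Fig 1 can be sketched as follows. This is an illustrative reading, not the authors' implementation: the text does not say how the reference vector changes between rounds, so the sketch measures each angle against the most recently chosen center, and that choice is an assumption. The function name `geometric_init` is hypothetical, and the dimensionality-reduction step is omitted (the points are used as given).

```python
import numpy as np

def geometric_init(points, k):
    """Pick k initial centers using distance and angle ranks (illustrative sketch)."""
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)                  # mean center of the dataset
    offsets = points - mean                     # vectors from the mean center
    dist = np.linalg.norm(offsets, axis=1)      # Euclidean distance to the mean center
    chosen = [int(dist.argmax())]               # farthest point is the first center
    while len(chosen) < k:
        ref = offsets[chosen[-1]]               # assumption: latest center is the reference
        remaining = np.array([i for i in range(len(points)) if i not in chosen])
        cos = offsets[remaining] @ ref / (dist[remaining] * np.linalg.norm(ref) + 1e-12)
        angle = np.arccos(np.clip(cos, -1.0, 1.0))
        # Rank 0 is the largest value, matching the descending sort in the text.
        dist_rank = np.argsort(np.argsort(-dist[remaining]))
        angle_rank = np.argsort(np.argsort(-angle))
        best = remaining[(dist_rank + angle_rank).argmin()]
        chosen.append(int(best))
    return points[chosen]
```

The combined rank favors points that are both far from the mean center and far in direction from the reference vector, so the initial centers tend to be spread across the data rather than bunched on one side.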

Fig 2. Dynamically determining the number of clusters based on the color histogram.

https://doi.org/10.1371/journal.pone.0350623.g002

As shown in Fig 2, the first step in dynamically determining the number of clusters based on color histograms is to convert the image’s RGB colors to the Hue-Saturation-Intensity (HSI) color space. After the color transformation, the colors are quantized, grouping similar colors into a primary color, yielding the color feature values. A color histogram is built based on these values. After determining the threshold for the number of clusters, a vertical and horizontal scan is performed to find the actual peak values of the color features. Among them, vertical scanning refers to traversing each color interval of the histogram and selecting intervals with count values higher than the preset minimum peak height threshold as candidate peaks. Horizontal scanning refers to checking the interval distance between adjacent peaks in the candidate peaks. If it is less than the set minimum distance, the lower peaks are merged or removed, and the final number of retained peaks is the adaptively determined cluster number k. The peak detection process sets a minimum peak height threshold equal to 15% of the histogram’s maximum value. It requires a minimum distance of 5 histogram intervals between peaks to avoid false peaks caused by noise. The number of reliable peaks identified through this process corresponds to the final number of clusters k, achieving adaptive partitioning of image color complexity. The equation for calculating the number of clusters is shown in Equation (4).

(4)

Equation (4) depends on two quantities: the number of color feature values with a cluster number greater than or equal to 1, and the number of clusters in a single feature value. Multi-dimensional segmentation methods can integrate different feature types to complement and validate each other, thus improving segmentation accuracy and reasonability [19,20]. Therefore, the study uses color and texture as supplementary constraints to jointly determine the segmentation result through this multi-dimensional segmentation method (Fig 3).
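The vertical and horizontal histogram scans described for Fig 2 can be sketched in a few lines. The sketch assumes the quantized color histogram is already available as a 1-D array of bin counts. It resolves "merged or removed" by keeping the taller of two peaks that are too close, which is one plausible reading rather than the confirmed procedure. The function name `count_histogram_peaks` is hypothetical. The defaults follow the stated settings (15% of the histogram maximum, minimum spacing of 5 bins).

```python
import numpy as np

def count_histogram_peaks(hist, height_frac=0.15, min_distance=5):
    """Return (k, peak_bins) for a 1-D color histogram (illustrative sketch)."""
    hist = np.asarray(hist, dtype=float)
    # Vertical scan: keep bins whose count exceeds the minimum peak height.
    threshold = height_frac * hist.max()
    candidates = np.flatnonzero(hist > threshold)
    # Horizontal scan: visit candidates from tallest to shortest and drop any
    # candidate closer than min_distance bins to an already retained peak.
    order = candidates[np.argsort(-hist[candidates], kind="stable")]
    kept = []
    for b in order:
        if all(abs(int(b) - p) >= min_distance for p in kept):
            kept.append(int(b))
    return len(kept), sorted(kept)
```

For example, a histogram with tall bins at positions 3 and 4 retains only the taller of the two, because they are a single interval apart.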

Fig 3. Multi-dimensional image segmentation method.

https://doi.org/10.1371/journal.pone.0350623.g003

As shown in Fig 3, the multi-dimensional image segmentation method first extracts the color, texture, and position distance features of the target image. Then, the similarity between the multi-dimensional features of different clusters is calculated. After normalizing the multi-feature similarity, the final similarity value is obtained and used as the basis for image segmentation. The color similarity between different clusters is calculated using Equation (5).

(5)

Equation (5) is computed for each compared pair of clusters from the hue and saturation components of the two clusters, and its result represents the difference between the compared clusters in the color feature dimension. For distance similarity, the study uses the Manhattan distance, with the formula shown in Equation (6).

$$d(x, y) = \sum_{i=1}^{n} \lvert x_i - y_i \rvert \tag{6}$$

In Equation (6), $n$ represents the vector dimension, and $d(x, y)$ represents the position distance between vectors $x$ and $y$. The texture histogram is constructed directly from the grayscale values of the image, and the histogram features are formed by statistical analysis of the distribution of pixel grayscale values in each region. For texture feature similarity, this study calculates the intersection distance by summing the bin-wise minimum values of the texture histograms of different clusters. The smaller the value, the lower the texture similarity. The calculation of the intersection distance is shown in Equation (7).

$$I(H_a, H_b) = \sum_{i=1}^{N} \min\big(H_a(i), H_b(i)\big) \tag{7}$$

In Equation (7), $N$ represents the number of bins in the color concentration area, and $H_a$ and $H_b$ are the texture histograms of the two compared clusters. After obtaining the color, position distance, and texture similarities of the clusters, the results are normalized to obtain a unified multi-dimensional similarity value. Finally, the multi-dimensional similarity values are summed to obtain the final similarity value. The normalization process is implemented using Equation (8).

$$S' = \frac{S - S_{\min}}{S_{\max} - S_{\min}} \tag{8}$$

The Min-Max normalization method is a linear data standardization method that scales the original data proportionally to unify measurement standards [21,22]. In Equation (8), $S$, $S_{\max}$, and $S_{\min}$ represent the original similarity value, the maximum similarity value, and the minimum similarity value, respectively, and $S'$ is the normalized value. Finally, the color, position distance, and texture similarities are added to form the final comprehensive similarity, which is used as the basis for K-means image segmentation. By integrating geometric principles, histograms, and multi-dimensional segmentation methods into K-means, the study creates a dynamic parameter acquisition and multi-dimensional segmentation image compression method called DM-K-means (Fig 4).
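The position, texture, and normalization steps above can be sketched with three small helpers. The color similarity of Equation (5) is not reproduced, because its exact form is not recoverable from the text. The function names are hypothetical, and returning zeros when all scores are equal is an assumption added to avoid division by zero.

```python
import numpy as np

def manhattan(a, b):
    """Manhattan (L1) distance between two position vectors."""
    return float(np.abs(np.asarray(a, float) - np.asarray(b, float)).sum())

def histogram_intersection(h1, h2):
    """Sum of bin-wise minima of two texture histograms."""
    return float(np.minimum(h1, h2).sum())

def min_max(values):
    """Scale a set of scores linearly into the range [0, 1]."""
    values = np.asarray(values, float)
    span = values.max() - values.min()
    if span == 0:  # assumption: identical scores map to zero
        return np.zeros_like(values)
    return (values - values.min()) / span
```

Given per-pair score arrays `color`, `position`, and `texture` (placeholder names), the comprehensive similarity described in the text would be `min_max(color) + min_max(position) + min_max(texture)`.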

Fig 4. DM-K-means image compression method flow.

https://doi.org/10.1371/journal.pone.0350623.g004

As shown in Fig 4, the proposed DM-K-means image compression method first determines the compression parameter, the number of clusters k, using the histogram-based dynamic parameter acquisition method. Then, the geometric principle method is used to compute the initial centers. The color, texture, and position distance features of the image are extracted, and the multi-dimensional similarity is calculated. Based on the multi-dimensional similarity, the image is clustered. Finally, the image is reconstructed based on the clustering results, completing the image compression task.

2.2. Construction of compression and optimization model for digital media pattern design

The DM-K-means image compression method enables the compression of digital media patterns, allowing for better storage and transmission of patterns. However, image processing often encounters the “curse of dimensionality.” Excessive image data dimensions increase the computational complexity of the method, which impacts the overall model performance. Experts in various fields are exploring effective dimensionality reduction methods to facilitate efficient data sharing and utilization [23,24]. The LLE algorithm reduces dimensionality by reconstructing local linear relationships from high-dimensional space within a low-dimensional manifold. This method performs well in reducing computational complexity and has a wide range of applications. Therefore, the study applies the LLE algorithm for dimensionality reduction of image data. The LLE algorithm’s implementation steps are as follows: first, select m nearest neighbors for a given sample, then determine the weight coefficients for the relationship between the sample and its neighbors. Finally, based on these weight coefficients, the data points are reconstructed while preserving the local linear relationships between low-dimensional and high-dimensional spaces, achieving dimensionality reduction. However, the LLE algorithm is highly sensitive to the size of the neighborhood and the selection of nearest neighbors. Improper selection can affect the dimensionality reduction results. Therefore, the study improves the LLE algorithm’s neighborhood size and nearest neighbor selection to enhance the dimensionality reduction quality (Fig 5).
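The three steps of the original LLE algorithm listed above can be sketched with NumPy. This is the standard, unmodified algorithm with a fixed neighborhood size, not the improved variant the study develops. The parameter `n_neighbors` plays the role of m in the text. The regularization term `reg` is a common numerical safeguard and an assumption of this sketch.

```python
import numpy as np

def lle(X, n_neighbors, n_components, reg=1e-3):
    """Standard LLE: returns the embedding and the weight matrix (sketch)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Step 1: find the nearest neighbors of every sample.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)                 # a point is not its own neighbor
    neighbors = np.argsort(d2, axis=1)[:, :n_neighbors]
    # Step 2: solve for the weights that best reconstruct each sample.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]               # neighbors centered on sample i
        G = Z @ Z.T                              # local Gram matrix
        G += np.eye(n_neighbors) * reg * max(np.trace(G), 1e-12)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()         # weights of each row sum to 1
    # Step 3: embed using the bottom eigenvectors of (I - W)^T (I - W).
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, eigvecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    return eigvecs[:, 1:n_components + 1], W     # skip the constant eigenvector
```

Changing `n_neighbors` alters both the weight matrix and the resulting embedding, which illustrates the sensitivity to neighborhood size noted in the text.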

As shown in Fig 5, the improved nearest neighbor selection method adjusts effective distances to balance sparse and dense data points, ensuring a more uniform distribution. The method then introduces class samples to exclude data points from different classes, forming the first round of nearest neighbor data sets. Next, the original data points are compared with the first round of nearest neighbor data, and the changes in their position distances are sorted in ascending order. The data points with the smallest changes are selected as the second round of nearest neighbor data. Equation (9) describes the process of incorporating class information.

(9)

Equation (9) combines the initially changed distance, the maximum distance between data points from different classes, the class of each data point, and an empirical parameter α that controls data point spacing (set to 0.4 in this study). Its result is the distance between data points after class information has been integrated. The changed distance is shown in Equation (10).

(10)

Equation (10) is defined for two measured data points and uses the average distance between each of them and the other data points. The changes in distances between data points in high-dimensional and low-dimensional spaces are calculated according to Equation (11).

(11)

Equation (11) compares the distances between data points and their neighboring points in the high-dimensional and low-dimensional spaces, and yields the distance error between the two spaces. A smaller error indicates that the topological structure differs less between the two spaces. For neighborhood size selection, the study uses a manifold curvature metric. In regions where the local curvature of the data points is high, a larger neighborhood is chosen, and vice versa. The curvature is proportional to the ratio of the position distance to the Euclidean distance. The process for measuring the local curvature of data points is shown in Equation (12).

(12)

Equation (12) uses the position distance and the Euclidean distance between any two points in the initial neighborhood of the data point. The indicator function and its average value in the high-dimensional space are calculated for all data points, with the expressions for the indicator function and its average value given in Equation (13).

(13)

The final neighborhood size of the data points is determined dynamically, as shown in Equation (14).

(14)

Equation (14) takes the initially preset neighborhood size and returns the final computed neighborhood size. The study integrates the improved nearest neighbor and neighborhood size selection methods into the original LLE algorithm, designing a dynamic neighborhood selection LLE algorithm, named DAF-LLE (Fig 6).

As shown in Fig 6, the DAF-LLE algorithm first maps high-dimensional image data to a low-dimensional space and dynamically optimizes the neighborhood structure for dimensionality reduction. In the nearest neighbor selection stage, the algorithm adjusts the spacing between data points based on category information, prioritizes selecting samples of the same class as candidate neighbors, and then determines the final set of neighbors based on the principle of minimizing distance changes. In the adaptive stage of neighborhood size, the algorithm dynamically adjusts the neighborhood range of each data point based on local curvature evaluation, assigning larger neighborhoods to areas with high curvature. Finally, in the weight calculation and reconstruction stage, the weight matrix is calculated based on the selected neighbors, and the optimal weights are obtained by minimizing the reconstruction error, and data dimensionality reduction is completed in a low-dimensional space. The loss function is shown in Equation (15).

$$\varepsilon(W) = \sum_{i=1}^{N} \Big\lVert x_i - \sum_{j=1}^{N} w_{ij} x_j \Big\rVert^2 \tag{15}$$

In Equation (15), $N$ represents the number of data points, and $w_{ij}$ denotes the contribution of the $j$-th data point to reconstructing the $i$-th data point. If $x_j$ is not a neighboring point of $x_i$, then $w_{ij} = 0$. Finally, the study combines the DM-K-means image compression method and the DAF-LLE dimensionality reduction algorithm to build a digital media pattern compression model based on K-means and LLE, named K-means-LLE (Fig 7).

Fig 7 displays the processing framework of the K-means-LLE model for patterns. In the compression process, the DM-K-means stage achieves initial compression by reducing the number of colors, but its output dimension is still relatively high, especially when processing large-sized images, which can bring significant storage and computational overhead. Therefore, after DM-K-means compression, the DAF-LLE dimensionality reduction step is introduced to map the high-dimensional feature space output by DM-K-means to a carefully constructed low dimensional manifold space. The high-dimensional features here mainly refer to the feature set composed of color, texture, and spatial position information. DAF-LLE preserves the local neighborhood structure of sample points from the high-dimensional feature space while removing redundant information and noise in the low-dimensional embeddings, further reducing the data required to represent compressed images while maximizing visual fidelity. This dimensionality reduction operation improves the overall compression pipeline, not only reducing the bit rate required for subsequent storage and transmission of these compressed features, but also accelerating any possible subsequent processing tasks, thereby achieving a better balance between compression ratio and computational efficiency. The pseudocode of the proposed K-means LLE algorithm is shown in Table 1.

Table 1. Pseudo code of K-means LLE algorithm.

https://doi.org/10.1371/journal.pone.0350623.t001

2.3. Experimental design

In the DM-K-means method, the number of clusters k is dynamically determined through color histogram analysis, with a threshold set at 15% of the total number of feature values. In the DAF-LLE algorithm, the empirical parameter α is set to 0.4 after grid search optimization. The neighborhood size is dynamically adjusted based on local curvature, starting from an initial preset neighborhood size of 12 and adaptively calculated using Equation (14). The number of nearest neighbors used when constructing the weight matrix is set to 8. All parameters were determined through cross-validation in preliminary experiments to ensure that the model maintains stable performance on different types of images. To validate the performance of the proposed K-means-LLE model, both ablation and comparative experiments were carried out. The ablation experiments verify the effectiveness of the model’s internal improvements, comparing K-means, LLE, and the combined K-means-LLE model under controlled variables to confirm the contribution of each component to the final performance. The comparative experiments evaluate the comprehensive performance of the model in a wider competitive setting, comparing the complete K-means-LLE model against three established methods: Joint Photographic Experts Group (JPEG), High Efficiency Video Coding (HEVC), and Efficient Learned Image Compression (ELIC). Together, the two experiments validate the model from different angles: the ablation experiments support its design rationale, and the comparative experiments test its practical competitiveness. For the ablation experiments, the ADE20K image dataset was used, which is an open scene understanding dataset divided into a training set and a validation set. The training set contains 25,574 patterns, and the validation set contains over 2,000 patterns. The dataset covers patterns from over 300 different scene categories, including sports, daily life, and work. For the comparative experiments, the ImageNet dataset was used, a large-scale image database organized according to the WordNet hierarchy that contains 14 million high-resolution patterns across more than 20,000 categories, formatted in PASCAL VOC. Both the ablation and comparative experiments were conducted in the same experimental environment (Table 2).

Table 2. Experimental environment parameters.

https://doi.org/10.1371/journal.pone.0350623.t002

After the ablation experiments validated the effectiveness of the proposed optimization, the study proceeded with comparative experiments on the practical application of the K-means-LLE model. The experiments were conducted in the same environment as described in Table 2, using the ImageNet dataset. The first step involved a comprehensive evaluation of the four models. “Comprehensive” here refers to whether the models could achieve both high compression ratios and low image differences after compression.

3. Results

3.1. Experimental setup and ablation study

To evaluate the sensitivity of the model to key parameters, this study conducted testing on three parameters: cluster number k, empirical parameter α, and neighborhood size. The experiment was conducted on 100 images in the ADE20K dataset, with only one parameter adjusted at a time and the remaining parameters fixed as default values. The changes in compression ratio and Peak Signal-to-Noise Ratio (PSNR) were recorded (Table 3).

Table 3. Sensitivity analysis results of three parameters.

https://doi.org/10.1371/journal.pone.0350623.t003

Table 3 indicates that the k value significantly impacts performance. When the k value increases from 20 to 150, the PSNR increases from 38.6 dB to 45.1 dB, while the compression ratio decreases from 88.5% to 65.5%. The parameter α yields stable performance within the range of 0.3–0.5, but it decreases slightly beyond this range. The influence of neighborhood size on model performance is relatively small. Based on the experimental environment in Table 2 and the ADE20K image dataset, the study first conducted a comparison of the compression ratios of the K-means, LLE, and K-means-LLE models. The experiment involved compressing 100 patterns of two different resolutions using three image compression models (Fig 8).

Fig 8. Comparison results of compression ratios of three models at different image resolutions.

https://doi.org/10.1371/journal.pone.0350623.g008

From Fig 8, it can be seen that when the resolution of the compressed image is 1920 × 1080, the compression ratio for LLE ranged from 48% to 53%, with a maximum of 79% and a minimum of 38%, with a few outliers. For K-means, the compression ratio ranged from 57% to 62%, with a maximum of 83% and a minimum of 48%. The proposed K-means-LLE model had a compression ratio ranging from 77% to 84%, with no outliers. For high-resolution patterns, K-means-LLE performed significantly better than the basic models. When the compressed patterns were of low resolution (768 × 512), K-means-LLE still demonstrated superior compression performance, with a median compression ratio of 79%, outperforming LLE (61%) and K-means (62%). These results indicate that the compression ratio of K-means-LLE is less affected by image resolution, and regardless of the image quality, K-means-LLE can still perform high-quality compression. Moreover, these results demonstrate that the proposed optimized model provides a significant improvement in compression performance over the basic models. After the compression ratio experiments, the study continued by comparing the PSNR of LLE, K-means, and K-means-LLE. PSNR quantifies the compression effect of the models by comparing the pixel value differences between the original and compressed patterns. A higher PSNR indicates that the difference between the compressed and original patterns is smaller, meaning better compression quality. The experiments selected two groups of patterns with different content complexities: one group containing patterns with detailed textures like leaves and hair, and another group containing simple patterns with a single color, like the sky and blackboards. All experiments were independently repeated 10 times, and the results were reported in the form of mean ± standard deviation. Paired t-test was used for statistical significance analysis, with a significance level set at p < 0.05 (Fig 9).
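The text describes PSNR only in words. The sketch below uses the standard definition, 10·log10(MAX² / MSE), where MSE is the mean squared pixel difference. The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions, since the text does not state the bit depth.

```python
import numpy as np

def psnr(original, compressed, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two images of equal shape."""
    original = np.asarray(original, dtype=np.float64)
    compressed = np.asarray(compressed, dtype=np.float64)
    mse = np.mean((original - compressed) ** 2)   # mean squared pixel error
    if mse == 0:
        return float("inf")                       # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Under this definition, every halving of the root-mean-square pixel error raises the PSNR by about 6 dB.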

Fig 9. Comparison results of PSNR between three models on simple content images and complex content images.

Note: “*” indicates a significant difference compared to K-means-LLE, p < 0.05.

https://doi.org/10.1371/journal.pone.0350623.g009
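The significance marking in Fig 9 relies on a paired t-test over 10 repeated runs. The sketch below shows one way such a comparison could be computed with the Python standard library. The PSNR lists are made-up placeholder values, not the study's measurements, and the function names are hypothetical. The constant 2.262 is the two-sided critical value of Student's t distribution for 9 degrees of freedom at the 0.05 level.

```python
import math
import statistics

# Two-sided critical t value for df = 9 (10 paired runs) at alpha = 0.05.
T_CRIT_DF9 = 2.262


def mean_std(values):
    """Return (mean, sample standard deviation) for reporting as mean ± SD."""
    return statistics.mean(values), statistics.stdev(values)


def paired_t_statistic(a, b):
    """t statistic of the paired differences a[i] - b[i]."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Placeholder PSNR values in dB for 10 runs (illustrative only).
proposed = [41.2, 40.8, 41.5, 40.9, 41.1, 41.3, 40.7, 41.0, 41.4, 40.6]
baseline = [20.3, 19.8, 20.1, 20.4, 19.7, 20.2, 19.9, 20.0, 20.5, 19.6]

t_stat = paired_t_statistic(proposed, baseline)
significant = abs(t_stat) > T_CRIT_DF9
m, s = mean_std(proposed)
print(f"proposed: {m:.2f} ± {s:.2f} dB, t = {t_stat:.2f}, significant = {significant}")
```

Comparing against a fixed critical value avoids needing a t-distribution routine. With a statistics package available, an exact p-value could be reported instead.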

From Fig 9(a), it can be seen that for simple content patterns, the average PSNR of K-means-LLE was 48 dB, meaning that the difference between the original and compressed patterns was almost imperceptible to the human eye. The average PSNR was 28 dB for LLE and 21 dB for K-means, indicating that the difference between the compressed and original patterns was noticeable. K-means-LLE therefore produced the least perceptible difference when processing simple content patterns. From Fig 9(b), it can be seen that when compressing complex content patterns, the average PSNR of all three models decreased to some extent. K-means-LLE maintained an average PSNR of 41 dB, with minimal distortion. LLE had an average PSNR of 20 dB, indicating considerable distortion during compression. K-means had an average PSNR of 13 dB, the most severe distortion of the three, with visible quality degradation. These results show that, regardless of the complexity of the image content, K-means-LLE can perform high-quality compression.

In the final part of the ablation experiments, the study visually compared the compression effects of the three models (Fig 10).

Fig 10. Visual quality comparison between the original image and the output results of three compression methods (Source: http://mbd.baidu.com/newspage/data/dtlandingsuper?nid=dt_5615633959172790718).

https://doi.org/10.1371/journal.pone.0350623.g010

As seen in Fig 10(b), after the original image was compressed using K-means, it was difficult for the human eye to detect any differences at normal size. However, upon zooming in, some blurring was visible, and the image quality showed a slight decline. In Fig 10(c), when the image was compressed using LLE, noticeable differences in clarity were observed, especially in the detailed areas, where blurring was more severe. This indicates that LLE caused some loss of information during the compression process. In Fig 10(d), when K-means-LLE was used, no significant difference in image quality was observed, and the details were preserved with adequate clarity. In summary, K-means-LLE significantly outperforms the basic K-means and LLE models in image compression performance, demonstrating that the proposed optimization is effective.

3.2. Verification of model’s practical application performance

The comprehensive experiments compared K-means-LLE with JPEG, HEVC, and ELIC, using both the compression ratio and PSNR to evaluate the models’ performance (Fig 11).

Fig 11. Comparison results of the compression ratio and PSNR index of four compression models.

https://doi.org/10.1371/journal.pone.0350623.g011

From Fig 11(a) and 11(b), it can be seen that as the bit rate increased, JPEG reached a maximum compression ratio of 75.2% with a PSNR of 33.4 dB, while HEVC reached a maximum compression ratio of 69.7% with a PSNR of 43.1 dB. Although HEVC had a slightly lower compression ratio than JPEG, its PSNR was higher, suggesting that HEVC outperforms JPEG overall, maintaining better image quality with only a slight sacrifice in compression ratio. ELIC exceeded both JPEG and HEVC on both metrics, with a maximum compression ratio of 86.1% and a PSNR of 45.3 dB. The K-means-LLE model achieved the best performance on both metrics, with a maximum compression ratio of 91.1% and a PSNR of 47.2 dB. This demonstrates that K-means-LLE combines a high compression ratio with a high PSNR, providing an excellent balance between compression effect and image quality.

Next, the study used the Multi-Scale Structural Similarity Index (MS-SSIM) to compare the four models. MS-SSIM evaluates the structural similarity of the patterns after compression by comparing luminance, contrast, and structure at multiple scales. A higher MS-SSIM indicates that the compressed image is more similar to the original image, and thus the distortion is smaller (Fig 12).
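For reference, the standard formulation of MS-SSIM combines luminance, contrast, and structure terms over M scales as shown below. The study does not state which constants, weights, or number of scales it used, so the symbols here (the stabilizing constants C_1, C_2, C_3 and the exponents α_M, β_j, γ_j) follow the common definition and should be read as assumptions rather than the authors' exact settings.

```latex
\begin{aligned}
l(x,y) &= \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \\
c(x,y) &= \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \\
s(x,y) &= \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}, \\
\mathrm{MS\text{-}SSIM}(x,y) &= \left[l_M(x,y)\right]^{\alpha_M}
  \prod_{j=1}^{M} \left[c_j(x,y)\right]^{\beta_j} \left[s_j(x,y)\right]^{\gamma_j}
\end{aligned}
```

Here μ and σ denote local means and standard deviations, σ_xy is the local covariance, and the subscript j indexes successively downsampled versions of the two images. Luminance is compared only at the coarsest scale M.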

Fig 12. Comparison results of four models on MS-SSIM metrics.

https://doi.org/10.1371/journal.pone.0350623.g012

From Fig 12(a), JPEG’s MS-SSIM ranged from a minimum of 0.36 to a maximum of 0.69, and dropped below 0.5 once the number of patterns increased to 116. In Fig 12(b), ELIC had a maximum MS-SSIM of 0.72 and a minimum of 0.39, dropping below 0.5 when the number of compressed patterns reached 148. Fig 12(c) shows that HEVC’s MS-SSIM was mostly below 0.5, with a maximum of 0.67 and a minimum of 0.37. In contrast, Fig 12(d) shows that K-means-LLE had the highest MS-SSIM values, with a maximum of 0.79 and a minimum of 0.41. These data demonstrate that K-means-LLE preserves more of the structural integrity of the original patterns during compression, providing superior compression performance.

After verifying the compression performance of K-means-LLE, the study tested its practical applicability. The first step compared the response time and memory usage of the four models. The experiment used 10 groups of patterns, each containing 100 patterns from different categories. For each group, the study measured the response time and memory usage of each model when processing the 100 patterns simultaneously (Fig 13).

Fig 13. Comparison of response time and memory usage when four models process 100 images simultaneously.

https://doi.org/10.1371/journal.pone.0350623.g013

As shown in Fig 13(a), across the 10 groups of experimental subjects, the response time of ELIC ranged from 612 ms to 1012 ms, indicating relatively long response times. For HEVC, the response time ranged from 322 ms to 571 ms, which is relatively short. For JPEG, it ranged from 384 ms to 732 ms, placing it between HEVC and ELIC. For K-means-LLE, the response time ranged from 189 ms to 294 ms, clearly faster than the other three models. The small difference between its shortest and longest response times suggests that K-means-LLE not only processed multiple patterns quickly but also had stable response times.

Fig 13(b) presents the memory usage results, showing that K-means-LLE occupied the least memory among the four models, significantly less than HEVC and ELIC, and slightly less than JPEG. When compressing 100 patterns simultaneously, memory usage ranged from 25.1 MB to 34.8 MB for ELIC, from 20.8 MB to 27.9 MB for HEVC, and from 18.2 MB to 25.0 MB for JPEG. For K-means-LLE, memory usage ranged from 16.1 MB to 23.2 MB. These results indicate that K-means-LLE outperformed the comparison models in both response time and memory usage when compressing a large number of patterns simultaneously, making it efficient and practical for handling multiple processing tasks.

Next, the study verified the dependence of the four models on the operating environment through comparative experiments. The experiments were conducted in two steps. The first step measured the time required to process 100 patterns with increasing bandwidth using a single-core processor. The second step repeated the same process using a dual-core processor (Fig 14).
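The study does not describe how the response time and memory usage in Fig 13 were instrumented. As a rough sketch of one possible approach, the snippet below times a batch of 100 items and records peak memory with Python's standard `tracemalloc` module. The `stand_in_compress` function (a zlib call) is only a placeholder for a real compressor, `profile_batch` is a hypothetical helper name, and the batch is processed sequentially rather than simultaneously.

```python
import time
import tracemalloc
import zlib


def stand_in_compress(data: bytes) -> bytes:
    """Placeholder compressor (zlib); NOT the K-means-LLE model."""
    return zlib.compress(data, 6)


def profile_batch(items, compress=stand_in_compress):
    """Return (elapsed_ms, peak_mib, outputs) for compressing every item once."""
    tracemalloc.start()
    start = time.perf_counter()
    outputs = [compress(item) for item in items]
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # get_traced_memory() returns (current, peak) in bytes for Python allocations.
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed_ms, peak_bytes / (1024 * 1024), outputs


# Synthetic batch of 100 small "patterns" (4 KiB of a repeated byte each).
batch = [bytes([i % 256]) * 4096 for i in range(100)]
elapsed_ms, peak_mib, outputs = profile_batch(batch)
print(f"{len(outputs)} items in {elapsed_ms:.1f} ms, peak {peak_mib:.3f} MiB")
```

Note that `tracemalloc` only tracks memory allocated through Python's allocator, so native buffers used by an image library would need a process-level measurement instead.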

Fig 14. Comparison of the processing time of four models on single-core processors and dual-core processors with different bandwidths.

https://doi.org/10.1371/journal.pone.0350623.g014

Fig 14 shows the processing times of the four models on single-core and dual-core processors and highlights their dependence on computing resources. As shown in Fig 14(a), with a single-core processor, as the bandwidth increased from 100 Mbps to 500 Mbps, the processing time for ELIC decreased from 4.95 s to 2.84 s. For HEVC, the processing times were 3.75 s and 3.21 s, and for JPEG, the longest and shortest processing times were 2.98 s and 1.59 s, respectively. For K-means-LLE, the processing time was 1.22 s at 100 Mbps and 0.34 s at 500 Mbps. Fig 14(b) shows the results for the dual-core processor, on which the processing times of all four models were reduced. With a bandwidth of 100 Mbps, the processing times for ELIC, HEVC, JPEG, and K-means-LLE were 3.25 s, 2.42 s, 1.59 s, and 0.97 s, respectively. Compared with the single-core times at the same bandwidth, these correspond to reductions of 1.70 s, 1.33 s, 1.39 s, and 0.25 s. Notably, K-means-LLE had the shortest processing time on both the single-core and dual-core processors, with the smallest difference in processing time between the two. These experimental results demonstrate that K-means-LLE has a lower dependence on computing resources and can run efficiently on a variety of machines, indicating its broad applicability.
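The single-core versus dual-core comparison at 100 Mbps can be re-derived directly from the processing times quoted for Fig 14. The short sketch below does this arithmetic. It assumes that JPEG's single-core time at 100 Mbps is the 2.98 s figure, and the helper name `reductions` is illustrative.

```python
# Processing times in seconds at 100 Mbps, as quoted in the text for Fig 14.
single_core_s = {"ELIC": 4.95, "HEVC": 3.75, "JPEG": 2.98, "K-means-LLE": 1.22}
dual_core_s = {"ELIC": 3.25, "HEVC": 2.42, "JPEG": 1.59, "K-means-LLE": 0.97}


def reductions(single, dual):
    """Absolute time saved (s) when moving from single-core to dual-core."""
    return {name: round(single[name] - dual[name], 2) for name in single}


saved = reductions(single_core_s, dual_core_s)
# Model whose processing time changes least between the two processors.
least_dependent = min(saved, key=saved.get)

for name, delta in saved.items():
    share = 100.0 * delta / single_core_s[name]
    print(f"{name}: -{delta:.2f} s ({share:.1f}% of single-core time)")
print("smallest change:", least_dependent)
```

The model with the smallest absolute change is the one whose run time depends least on the extra core, which is the sense in which the text describes low dependence on computing resources.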

4. Discussion

To meet the high demands of modern digital media patterns for image compression technology, this study proposed an image compression model, K-means-LLE, based on K-means clustering and dimensionality reduction techniques. Ablation experiments and comparative tests verified the model’s performance. The ablation results showed that K-means-LLE achieved a compression ratio of 84% and a PSNR of 48 dB, indicating that it can maintain high visual quality while significantly reducing data volume. This performance improvement is mainly due to the dynamic clustering parameter selection and multi-dimensional feature fusion strategy, which effectively overcome the limitations of traditional methods in initial center selection and neighborhood sensitivity [25]. Unlike the approach of Rui et al., which uses hierarchical clustering to enhance computational efficiency [26], the dynamic parameter and multi-dimensional feature fusion mechanism proposed in this study focuses more on ensuring and enhancing compression quality through parameter adaptation and feature complementarity while improving efficiency. The dimensionality reduction optimization method used by Oh et al. [27] focuses on reducing computational resource overhead. In contrast, the DAF-LLE algorithm proposed in this study focuses more on maintaining the local manifold structure of the data to ensure visual fidelity after dimensionality reduction, thus achieving a better balance between resource consumption and reconstruction quality.

In practical applications, K-means-LLE achieved a compression ratio of 91.1% and a PSNR of 47.2 dB, demonstrating its ability to maintain high-quality compression performance even when processing complex images. The slight difference in PSNR between the ablation experiments and the practical application tests (48 dB and 47.2 dB) is mainly due to the different datasets used: 48 dB is the result on the ADE20K image dataset, while 47.2 dB is the result on the ImageNet dataset. This result is of great significance in practical applications, as a high compression ratio means that about 90% of storage space and transmission bandwidth can be saved, which is especially valuable for network transmission and resource-limited mobile devices. A high PSNR ensures that the compressed image has almost no visual distortion, making the model particularly suitable for scenarios that require high detail preservation, such as art and design and high-precision printing [28]. Although the rate-distortion-based method of Kim et al. also aims to balance compression ratio and quality [29], the K-means-LLE model proposed in this study achieves better overall performance in key indicators such as PSNR and compression ratio through a collaborative architecture of clustering and dimensionality reduction.

In the multi-scale structural similarity experiments, K-means-LLE achieved the highest structural similarity of 0.79, attributed to the use of multi-dimensional feature similarity calculations. Similar to the approach of Lin et al., which integrates multi-scale features to enhance the robustness of drone tracking [30], this study integrates color, texture, and positional features in multi-dimensional image segmentation. When handling patterns of varying complexities, K-means-LLE exhibited the shortest response time of 189 ms, with memory usage as low as 16.1 MB, demonstrating its efficiency and stability in handling a large number of image tasks. This feature makes the K-means-LLE model particularly suitable for real-time or near-real-time application scenarios, such as online image editing platforms and multimedia messaging services. In addition, the model exhibits shorter execution times and smaller performance fluctuations on both single-core and dual-core processors, indicating its low dependence on hardware resources and good cross-platform applicability [31].

Furthermore, K-means-LLE displayed fast processing speeds across different processor specifications, which can be attributed to the reasonable design of the model. Regarding processing speed in image compression, Zhang et al. saved decoding time by decoupling the structure, thereby shortening the image compression duration [32], while Kim et al. reduced memory usage by dynamically scheduling the workload of the model [33]. The K-means-LLE model in this study demonstrates comparable or even better efficiency with its lightweight architecture that combines clustering and dimensionality reduction. The experimental results demonstrated that the K-means-LLE model provides superior image compression performance and practicality, offering more professional technical support for digital media workers and contributing to the intelligent and diversified development of digital media.
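To make the storage and bandwidth claim in this discussion concrete, the sketch below works through the arithmetic for one uncompressed 1920 × 1080 RGB image. It assumes that the reported compression ratio is the fraction of the original size that is removed (a space-saving percentage), which is how the discussion interprets it. The function names and the 100 Mbps link are illustrative choices.

```python
def compressed_size(original_bytes, saving_ratio):
    """Size after compression, assuming saving_ratio is the fraction removed."""
    if not 0.0 <= saving_ratio < 1.0:
        raise ValueError("saving_ratio must be in [0, 1)")
    return original_bytes * (1.0 - saving_ratio)


def transfer_time_s(size_bytes, bandwidth_mbps):
    """Idealized transfer time in seconds (no protocol overhead or latency)."""
    return size_bytes * 8 / (bandwidth_mbps * 1_000_000)


# One uncompressed 1920 x 1080 image with 3 bytes per pixel (8-bit RGB).
raw_bytes = 1920 * 1080 * 3
small_bytes = compressed_size(raw_bytes, 0.911)  # 91.1% ratio from the text

print(f"raw: {raw_bytes} bytes, compressed: {small_bytes:.1f} bytes")
print(f"time at 100 Mbps: {transfer_time_s(raw_bytes, 100):.4f} s "
      f"-> {transfer_time_s(small_bytes, 100):.4f} s")
```

Under this interpretation, both the stored size and the idealized transfer time shrink to about 9% of their original values.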

In summary, the K-means-LLE model not only achieves a good balance between compression efficiency and quality but also demonstrates significant storage and bandwidth saving potential in practical applications, adapting to various complex scenarios. The successful practice of this model provides strong support for the intelligent processing and efficient transmission of digital media content, which helps to promote the development of related industries in a more efficient and environmentally friendly direction. However, this study did not conduct in-depth transferability validation on domain-specific images, such as medical or satellite images, so the applicability and performance of the model in fields with unique imaging characteristics and professional requirements remain unclear. Therefore, future work should conduct systematic transferability validation on more specialized datasets, such as NIH ChestX-ray14 in medical imaging and EuroSAT in remote sensing. At the same time, targeted domain adaptation techniques should be introduced to optimize the feature extraction module and enhance the model’s generalization ability in professional domains.

5. Conclusion

Digital media patterns, due to their rich content and vibrant color representation, have gained attention in various fields such as virtual reality, film, and gaming. However, substantial redundant information limits the storage and transmission of digital media patterns, creating an urgent need for compression. Traditional image compression methods struggle to balance compression efficiency and quality. To address this, this study combines K-means clustering and LLE dimensionality reduction techniques to propose a new digital media pattern compression model. This model dynamically determines the number of clusters and initial centers, utilizes multi-dimensional features for accurate segmentation, and applies an improved dimensionality reduction algorithm to process high-dimensional features. As a result, while maintaining high-quality visual performance, it achieves higher compression ratios, faster processing speeds, and lower memory usage. In the ablation experiments, the compression ratio and the image quality after compression were superior to those of the baseline models, demonstrating the effectiveness of the optimizations and improvements made to the model. In practical application experiments, the model maintained the quality of the compressed patterns while ensuring a high compression ratio, with minimal structural differences between the patterns before and after compression. When handling a large number of patterns, the model exhibited faster response times and lower memory usage than the comparison models, while also showing low dependence on processor resources. Overall, the proposed digital media pattern compression model exhibited excellent compression performance and practicality, meeting the high standards required by the digital media industry for image compression.

References

  1. Fatima M, Ahmed QM, Paracha O. Examining sustainable consumption patterns through green purchase behavior and digital media engagement: A case of Pakistan’s postmillennials. FS. 2024;26(5):867–85.
  2. Peters C, Schrøder KC, Lehaff J, Vulpius J. News as they know it: Young adults’ information repertoires in the digital media landscape. Digital Journalism. 2021;10(1):62–86.
  3. Reiter F, Matthes J. Correctives of the mainstream media? A panel study on mainstream media use, alternative digital media use, and the erosion of political interest as well as political knowledge. Digital Journalism. 2021;11(5):813–32.
  4. Rahebi J. Vector quantization using whale optimization algorithm for digital image compression. Multimed Tools Appl. 2022;81(14):20077–103.
  5. Tang Z, Wang H, Yi X, Zhang Y, Kwong S, Kuo C-CJ. Joint graph attention and asymmetric convolutional neural network for deep image compression. IEEE Trans Circuits Syst Video Technol. 2023;33(1):421–33.
  6. Tang T, Li M, Zhang X, Zheng S, Li W. SR-LBSCC: Super resolution based screen content image compression at low bitrate. Pattern Recognition Letters. 2025;191:96–102.
  7. Huang D, Gao F, Tao X, Du Q, Lu J. Toward Semantic Communications: Deep Learning-Based Image Semantic Coding. IEEE J Select Areas Commun. 2023;41(1):55–71.
  8. Zhang Y, Yuan P, Jiang L, Ewe HT. Novel data-driven spatial-spectral correlated scheme for dimensionality reduction of hyperspectral images. IEEE J Sel Top Appl Earth Observations Remote Sensing. 2022;15:3877–90.
  9. Dimililer K. DCT-based medical image compression using machine learning. Signal Image Video Process. 2022;16(1):3877–90.
  10. Duan Z, Lu M, Ma J, Huang Y, Ma Z, Zhu F. QARV: Quantization-Aware ResNet VAE for Lossy Image Compression. IEEE Trans Pattern Anal Mach Intell. 2024;46(1):436–50. pmid:37812557
  11. Nie F, Li Z, Wang R, Li X. An Effective and Efficient Algorithm for K-Means Clustering With New Formulation. IEEE Trans Knowl Data Eng. 2023;35(4):3433–43.
  12. Liang Y, Zhao M, Liu X, Jiang J, Lu G, Jia T. An adaptive image compression algorithm based on joint clustering algorithm and deep learning. IET Image Process. 2024;18(3):829–37.
  13. Zhao Y, Zheng M, Zhang Y, Yuan M, Zhao H. Novel dual-image encryption scheme based on memristive cellular neural network and K-means alogrithm. Nonlinear Dyn. 2024;112(21):19515–39.
  14. Lee T, Kim Y, Hyun Y, Mo J, Yoo Y. Unsupervised Anomaly Detection Process Using LLE and HDBSCAN by Style-GAN as a Feature Extractor. Int J Precis Eng Manuf. 2023;25(1):51–63.
  15. Chu F, Mo S, Hao L, Lu N, Wang F. Operating performance assessment of complex nonlinear industrial process based on kernel locally linear embedding PLS. IEEE Trans Automat Sci Eng. 2024;21(1):593–605.
  16. Chang Q, Li X, Zhao Y. Reversible data hiding for color images based on adaptive three-dimensional histogram modification. IEEE Trans Circuits Syst Video Technol. 2022;32(9):5725–35.
  17. Avi-Aharon M, Arbelle A, Raviv TR. Differentiable histogram loss functions for intensity-based image-to-image translation. IEEE Trans Pattern Anal Mach Intell. 2023;45(10):11642–53. pmid:37224367
  18. Li W, Liu X, Yang J, Zhao C, Deng H. A color correction method based on incremental multilevel iterative histogram matching. IEEE Sensors J. 2024;24(17):27892–901.
  19. Zeng B, Chen L, Zheng Y, Chen X. Adaptive multi-dimensional weighted network with category-aware contrastive learning for fine-grained hand bone segmentation. IEEE J Biomed Health Inform. 2024;28(7):3985–96. pmid:38640043
  20. Zhang Z, Yu C, Zhang H, Gao Z. Embedding tasks into the latent space: Cross-space consistency for multi-dimensional analysis in echocardiography. IEEE Trans Med Imaging. 2024;43(6):2215–28. pmid:38329865
  21. Renukadevi P, Amaran S, Vikram A, Prabhakara Rao T, Ishak MK. Enhancing cybersecurity through fusion of optimization with deep wavelet neural networks on denial of wallet attack detection in serverless computing. IEEE Access. 2025;13:47111–22.
  22. Mohammadrezaei M, Maleki Z, Tabesh A, Khajehoddin SA. A framework for normalizing physical features of Li-Ion batteries to form a generic health estimation model. IEEE Trans Transp Electrific. 2024;10(3):6880–92.
  23. Weber E, Papadopoulos DP, Lapedriza A, Ofli F, Imran M, Torralba A. Incidents1M: A large-scale dataset of images with natural disasters, damage, and incidents. IEEE Trans Pattern Anal Mach Intell. 2023;45(4):4768–81.
  24. Thanyawet N, Ratsamee P, Uranishi Y, Kobayashi M, Takemura H. Identifying disaster regions in images through attention shifting with a retarget network. IEEE Access. 2024;12:143754–66.
  25. Shi Z, Chen L, Ding W, Zhang C, Wang Y. Parameter-free robust ensemble framework of fuzzy clustering. IEEE Trans Fuzzy Syst. 2023;31(12):4205–19.
  26. Rui L, Yang S, Chen S, Yang Y, Gao Z. Smart network maintenance in an edge cloud computing environment: An adaptive model compression algorithm based on model pruning and model clustering. IEEE Trans Netw Serv Manage. 2022;19(4):4165–75.
  27. Oh Y, Lee N, Jeon Y-S, Poor HV. Communication-Efficient Federated Learning via Quantized Compressed Sensing. IEEE Trans Wireless Commun. 2023;22(2):1087–100.
  28. Patra A, Saha A, Bhattacharya K. Second level storage space optimization for lossless image compression using diffraction grating. J OPTICS-UK. 2025;54(4):1811–5.
  29. Kim S, Do J, Kang J, Kim HY. Rate-rendering distortion optimized preprocessing for texture map compression of 3D reconstructed scenes. IEEE Trans Circuits Syst Video Technol. 2024;34(5):3138–55.
  30. Lin Y, Yin M, Zhang Y, Guo X. Robust UAV tracking via information synergy fusion and multi-dimensional spatial perception. IEEE Access. 2025;13:39886–900.
  31. Reddy BNK, Zia Ur Rahman M, Lay-Ekuakille A. Enhancing reliability and energy efficiency in many-core processors through fault-tolerant network-on-chip. IEEE Trans Netw Serv Manage. 2024;21(5):5049–62.
  32. Zhang Z, Esenlik S, Wu Y, Wang M, Zhang K, Zhang L. End-to-end learning-based image compression with a decoupled framework. IEEE Trans Circuits Syst Video Technol. 2024;34(5):3067–81.
  33. Kim J, Han S, Ko G, Kim J-H, Lee C, Kim T, et al. EPU: An energy-efficient explainable AI accelerator with sparsity-free computation and heat map compression/pruning. IEEE J Solid-State Circuits. 2024;59(3):830–41.