
Construction of consumer satisfaction evaluation index system for green products based on online comments

  • Changlu Zhang,

    Roles Data curation, Methodology, Project administration

    Affiliations School of Economic & Management, Beijing Information Science & Technology University, Beijing, China, School of Computer Science, Beijing Information Science & Technology University, Beijing, China

  • Zihao WEI ,

    Roles Validation, Writing – review & editing

    18237332027@163.com

    Affiliations School of Computer Science, Beijing Information Science & Technology University, Beijing, China, Beijing Key Lab of Green Development Decision Based on Big Data, Beijing, China

  • Jian Zhang,

    Roles Formal analysis, Supervision

    Affiliations School of Economic & Management, Beijing Information Science & Technology University, Beijing, China, Beijing Key Lab of Green Development Decision Based on Big Data, Beijing, China

  • Liqian Tang

    Roles Conceptualization

    Affiliations School of Economic & Management, Beijing Information Science & Technology University, Beijing, China, Beijing Key Lab of Green Development Decision Based on Big Data, Beijing, China

Abstract

Promoting green product consumption and its shift towards low-carbon alternatives is essential for implementing new development paradigms and achieving carbon neutrality objectives. This study applies text mining techniques to online user comments from e-commerce platforms, focusing on consumer satisfaction with green products. Keywords were extracted from user feedback with the KeyBERT model, keyword vectors were trained with Word2Vec, and K-means clustering was then used to build a consumer satisfaction evaluation index system for energy-saving air conditioning products on the JD platform. The findings indicate that consumers weigh functionality, service quality, aesthetic appeal, pricing, logistics, and installation in their evaluations. It is recommended that manufacturers improve installation procedures, refine aesthetic designs, and emphasize functional advantages to raise consumer satisfaction. However, this study is limited to a single product category, and further research with a broader dataset is needed to validate these findings. Future investigations should consider a wider range of green products and draw on more diverse data sources.

1. Introduction

As environmental issues become increasingly critical, green consumption has garnered significant public interest. Driven by the concept of sustainable development, substantial progress has been achieved in promoting green consumption. During China’s “13th Five-Year Plan” period, the overall green consumption level among urban and rural residents exceeded 25% [1,2]. Rapid economic growth and rising living standards have heightened public awareness of and interest in green products. To stimulate the adoption of these products, various countries and regions have instituted unified green product certifications through authoritative bodies, aiming to enhance consumer willingness to engage in green consumption behaviors [3,4].

Despite these advancements, research on the green consumption experiences and satisfaction levels of residents is limited, yet it is crucial for supporting supply-side structural reforms and facilitating high-quality development. The rapid rise of e-commerce has introduced multi-dimensional dynamics in online shopping, culminating in a new consumption paradigm [5]. The increasing variety of green products available on e-commerce platforms has further fueled consumer enthusiasm for expressing their experiences and emotions online [6].

Online reviews represent active consumer feedback post-purchase, offering rich and valuable insights into consumer sentiments. These reviews are characterized by substantial data volume, diverse information, and easy accessibility. They not only inform prospective buyers but also provide vital data for the enhancement of green products [7,8]. Merchants can utilize these insights to better understand consumer needs and satisfaction levels, ultimately improving product offerings and refining marketing strategies. Consequently, in the digital intelligence era, online reviews have emerged as a pivotal foundation for businesses to innovate and for consumers to make informed purchasing decisions [9].

However, existing research on product satisfaction predominantly relies on questionnaire surveys and traditional satisfaction models. The exploration of consumer satisfaction through the analysis of online reviews for green products remains scarce. Traditional models typically impose a predefined satisfaction index system through surveys, which can introduce uncertainty due to factors such as the validity of the indicators, survey scale, costs, and feedback timeframes [10–12]. In contrast, the subjective feelings reflected in actively shared online review texts provide a more accurate representation of consumer sentiments. Therefore, evaluating consumer satisfaction regarding green products through text mining methods holds significant potential for fostering green consumption and achieving high-quality development.

This article aims to analyze online review texts of green products on leading e-commerce platforms using text mining techniques to construct a consumer satisfaction evaluation index system, thereby enhancing the scientific rigor and objectivity of the evaluation process. The article is structured as follows. The literature review synthesizes existing research on consumer satisfaction and text mining. The model construction section develops a green product consumer satisfaction evaluation model based on text mining. The empirical research section presents the empirical findings.

2. Literature review

Applying text mining methods to online comment texts makes it possible to objectively identify the dimensions that consumers care about and their real consumption experience when evaluating satisfaction with green products. This is key to promoting green consumption and green development. Therefore, we reviewed the existing literature from two perspectives: consumer satisfaction and online comment text mining.

Consumer satisfaction refers to the degree to which consumers are satisfied with a certain product or service [13]. It reflects a person’s feelings after consuming a product or service, compared with their expectations. Consumer satisfaction can be influenced by service quality, product quality, and purchasing decisions. Domestic and foreign scholars have conducted extensive research on consumer satisfaction. According to the methods and models used for data acquisition, this research can be roughly divided into three categories:

The first is to construct a consumer satisfaction index model using traditional methods such as interviews and questionnaires. For example, Wang Hongxin and Liu Yuhui (2015) collected data on consumers’ online purchases of fresh agricultural products through survey questionnaires, proposed seven factors and corresponding hypotheses affecting consumer satisfaction based on the Chinese Consumer Satisfaction Index model, and then conducted statistical analysis [14]. Li Ning et al. (2019) obtained basic data by distributing survey questionnaires to consumers and constructed a satisfaction factor model for consumers purchasing fresh agricultural products online [15]. Li Wen et al. (2020) distributed questionnaires to fresh agricultural product consumers under the O2O model and studied the relationships among factors affecting shopping satisfaction through correlation analysis [16]. Yang Hongyan et al. (2020) used association rules to mine three years of consumer food safety satisfaction survey data and identified the factors influencing consumer satisfaction with food safety [17]. Lee et al. (2019) found through survey questionnaires that the webpage design of e-commerce platforms has a positive impact on consumer satisfaction, so that improving the design can enhance consumer purchase intention [18,19]. Birjoveanu (2019) first qualitatively analyzed the willingness and behavior of e-commerce consumers in Liaoning Province and constructed a theoretical model. Then, through quantitative analysis of questionnaire survey data, model fitting and hypothesis testing were used to verify that factors such as the design elements of the e-commerce platform interface and the security of network information have a significant impact on consumer willingness and satisfaction [20].

The second is to use the information in online comment text data to construct a traditional consumer satisfaction index model. For example, Wei Helin et al. (2020) focused on five aspects of product characteristics and consumer reviews and used stepwise regression to explore how online word-of-mouth influences the online booking of team tourism products, as well as the relationship between the dependent variable and the indicator characteristics [21]. Zhang Yanfeng et al. (2019) constructed a technology acceptance model and a linear regression model to study the factors affecting comment time. They found that the model can effectively capture the relationship between consumer online comment behavior and comment time, which helps to discover and predict patterns in consumer comment time [22]. Geebren et al. (2021) applied a partial least squares structural equation model to 659 satisfaction records from electronic banking and demonstrated a significant positive impact of trust on customer satisfaction. They also examined the relationship between trust and intermediary structure assurance, service quality, and customer satisfaction [23].

The third is to use online comment text information to construct a consumer satisfaction evaluation model by calculating emotional tendencies. For example, Sun Baosheng et al. (2022) used online tourism review data and text mining technology to construct a tourist satisfaction evaluation index system and evaluation model, and quantitatively evaluated tourists’ ecotourism satisfaction based on a tourism sentiment lexicon [24]. Geng Xiaoli et al. (2019) built an LDA topic model from user online comments and conducted sentiment analysis, on the basis of which they explored the most important factors affecting online purchase satisfaction [25]. Zhang Zhengang et al. (2022) extracted product attributes from online comments, conducted sentiment analysis, and then constructed a classification model to identify different attribute requirements [26]. Zheng Songyin et al. (2022) built a digital service vocabulary from museum user comments and extracted aspect-level statements. They then performed sentiment classification on these statements and analyzed the factors influencing user experience based on the classification results [27]. Zhao Yuqing et al. (2020) improved the traditional KANO model and combined it with online comments to objectively measure user demand acquisition and demand fulfillment [28]. Tu Min (2020) used online comment clustering to calculate product satisfaction and constructed a satisfaction quartet, providing a new research approach for measuring user satisfaction [29]. Hao et al. (2021) used text mining to analyze the logistics factors that affect consumer satisfaction with JD agricultural products and optimized the e-commerce delivery path [30].

This review of the domestic and international literature shows that researchers use different methods to study consumer satisfaction in different fields. Scholars have gradually shifted from traditional methods such as questionnaires and interviews to online comment text mining. At the same time, they have paid extensive attention to consumer purchasing behavior and preferences for products with certification marks. However, that research mainly focuses on consumers’ pre-purchase behavioral intentions, and studies of post-purchase satisfaction with labeled products are still lacking.

Online comment content on e-commerce platforms is currently a research hotspot in the field of management decision-making and an important direction of information mining. In recent years, the scope of research on online comments has been continuously expanding, and research results have emerged. According to the different research content, it can be roughly divided into four aspects: number of comments, quality of comments, length of comments, and sentiment of comments [31].

In terms of the quantity characteristics of online comments: Cui et al., in exploring the impact of online comment data on the sales of newly launched products, found that the number and validity of online comments have a significant impact on those sales [32]. Shi Wenhua, Zhong Biyuan, and Zhang Qi (2017) investigated the relationship between movie box office revenue and the number of online film reviews. Their comparative analysis of online film reviews and online short reviews revealed that the quantity of online short reviews significantly influences box office performance [33].

Yang Xian et al. demonstrated the relationship between the number of online comments and consumer purchase intention by constructing a model of the relationship between the number of comments and purchase intention [34]. Niu Gengfeng et al. studied the impact mechanism of comment quality and quantity on consumer purchase intention through experimental design. Their research findings show that both the quantity and quality of comments affect consumer purchasing decisions, and have varying degrees of impact on individuals with different cognitive needs [35].

In terms of the quality characteristics of online comments: Sun Jin et al. found that the higher the quality of comments, the more they affect consumers’ online purchasing decisions [36]. Cao Yu et al. conducted research based on the theory of regulatory orientation, which showed that consumer cognition, consumer purchase intention, and comment quality are closely related. They also found that high-quality online reviews have a greater impact on consumer purchasing decisions [37]. Hong Fei et al. studied college student consumers and proposed a theoretical model of the impact of online comments on college students’ online shopping. Their research showed that the higher the quality of online comments, the more useful value consumers gain and the more purchase intention is promoted [38]. Huo Hong et al. studied the importance of perceived risk in the quality of online comments and demonstrated empirically that high-quality comments can help consumers avoid perceived risk to a certain extent [39].

In terms of the length characteristics of online comments, Chevalier et al. found that the length of online comment text positively affects the usefulness of comments [40]. Li Ang et al. found through research that the more words online comments have, the higher their usefulness [41]. Ye et al. introduced online review research into the hotel industry and found that the quantity and quality of online review texts have a certain impact on hotel sales by analyzing relevant online review texts [42].

In terms of the emotional characteristics of online comments, Zhao Tianrui et al. used deep learning technology to construct a Korean film review sentiment dictionary and then built a sentiment analysis model that effectively performed sentiment analysis of Korean short texts [43]. Wang Yang conducted online comment data mining, emotion analysis, and opinion extraction for spray products in small and medium-sized agricultural equipment. After fully collecting user opinions on product attributes, he proposed corresponding improvement strategies [44]. Munuswamy et al. extracted valuable information from social user comments using sentiment rating prediction methods. They calculated the sentiment value of a single user’s product based on a sentiment dictionary, thereby predicting project ratings and calculating the reputation of the product [45]. Wang Weina combined subjective online text and sentiment analysis techniques with consumer satisfaction knowledge. She focused on the important attributes of the relevant products, which provides a reference for their subsequent optimization [46]. Liu Yulin et al. established an emotional index based on emotional tendencies and dynamically monitored emotional changes in online comment texts to track the emotional trends of e-commerce platforms [47].

In light of these insights, this study not only addresses the gap in the existing literature regarding post-purchase satisfaction of green products but also offers a novel perspective by emphasizing the significance of extracting authentic consumer experiences through online comment text mining. Unlike prior research predominantly focused on pre-purchase intentions, this investigation delves into the specific determinants of consumer satisfaction within the realm of green product consumption. The insights gained from this research can significantly inform businesses on strategies to optimize their offerings in the green market, thus fostering sustainable consumption patterns. Furthermore, the findings of this study pave the way for future research endeavors aimed at exploring the multifaceted dimensions of consumer satisfaction in environmentally friendly product categories.

Thus, online comment text mining provides a robust method for understanding consumer satisfaction and can offer actionable insights for green product optimization. Sentiment analysis, in particular, serves as a powerful tool for evaluating satisfaction levels and identifying improvement areas, ultimately contributing to consumption upgrading and high-quality development. The emphasis on post-purchase satisfaction in the context of green products is underscored as a distinct contribution to the literature. In contrast to prior research, which predominantly explores general drivers of consumer satisfaction or is focused on pre-purchase intentions, this work investigates specific determinants of satisfaction within green product consumption. By providing an authentic consumer perspective, this approach aligns with broader goals of fostering sustainable consumption patterns and advancing high-quality development within green markets.

3. Model construction

We crawled user review data for green products on e-commerce platforms and used natural language processing technology to denoise and segment the review text. The KeyBERT algorithm was then used to extract keywords from the comment texts, and the Word2Vec tool was used to train keyword word vectors. Finally, the K-means algorithm was used to cluster the keywords and construct a green product consumer satisfaction evaluation index system.

3.1. Keyword extraction for online comments based on KeyBert

KeyBERT is a keyword extraction method developed by Maarten Grootendorst. It embeds sentences or documents into high-dimensional vector representations using BERT and enables users to extract keywords or key phrases from a given text [48]. The basic steps of the KeyBERT method are as follows:

Firstly, encode the comment text. We denote the comment text that requires keyword extraction as t and encode it with a pre-trained BERT model, which yields a vector representation of each word in the text. These vectors form a matrix, as shown in formula (1).

  • Input: The comment text t.
  • Output: A matrix of word vectors H, where each vector hᵢ represents the i-th word in the comment text.

$$H = [h_1, h_2, \ldots, h_n] \tag{1}$$

The hᵢ in the formula represents the vector of the i-th word in a comment text.

Secondly, the vectorization of the comment text. In order to obtain the vector representation V(t) of the comment text, it is necessary to summarize the vectors of each word. A common method is to perform an average operation on word vectors, which involves adding up all word vectors and dividing them by the number of words n. The specific calculation is shown in formula (2).

  • Input: The matrix of word vectors H and the number of words n.
  • Output: The vector representation V(t) of the comment text.

$$V(t) = \frac{1}{n}\sum_{i=1}^{n} h_i \tag{2}$$

Finally, calculate the similarity weight and extract keywords. To measure the similarity between each word and the comment text, this article uses cosine similarity for calculation. Based on the calculated similarity value, we select the top ranked words as the key feature words for this comment statement. Assuming the keyword is w and its vector is represented as V(w), the cosine similarity calculation between the keyword w and the comment text t is shown in formula (3).

  • Input: The vector representation V(w) of the keyword and the vector representation V(t) of the comment text.
  • Output: The similarity weight sim(w,t) between the keyword w and text t.

$$\mathrm{sim}(w, t) = \frac{V(w) \cdot V(t)}{\lVert V(w) \rVert \, \lVert V(t) \rVert} \tag{3}$$

where sim(w, t) represents the similarity weight between keyword w and text t.

In summary, KeyBERT uses the BERT model to encode text into vector representations and uses cosine similarity to measure the similarity between candidate keywords and the text. Sorting the candidates by this similarity yields a list of keywords for the text.
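The pooling and similarity steps in formulas (2) and (3) can be sketched in a few lines of plain Python. This is only an illustration: the two-dimensional toy vectors stand in for real BERT outputs, and the helper names `mean_pool`, `cosine_similarity`, and `rank_keywords` are hypothetical, not part of the KeyBERT library.

```python
import math

def mean_pool(word_vectors):
    """Formula (2): average the word vectors h_1..h_n into V(t)."""
    n = len(word_vectors)
    dim = len(word_vectors[0])
    return [sum(vec[d] for vec in word_vectors) / n for d in range(dim)]

def cosine_similarity(a, b):
    """Formula (3): dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_keywords(word_to_vector, top_n):
    """Score each word against the pooled text vector and keep the top_n."""
    text_vector = mean_pool(list(word_to_vector.values()))
    scored = [(word, cosine_similarity(vec, text_vector))
              for word, vec in word_to_vector.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Toy vectors (assumed values, not produced by BERT).
toy_vectors = {
    "installation": [1.0, 0.0],
    "fast": [0.8, 0.6],
    "quiet": [0.0, 1.0],
}
top_keywords = rank_keywords(toy_vectors, top_n=2)
print(top_keywords)
```

In this toy case "fast" ranks first because its vector points closest to the average direction of the three words.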

3.2. Construction of keyword vector based on Word2Vec

The principle of Word2Vec is to use the idea of deep learning to transform the processing of text content into vector operations in a high-dimensional vector space, so that the semantic similarity of text becomes the similarity of vectors in that space. The core idea is to train words into high-dimensional real-valued vectors. Because these vectors carry rich word information, synonyms can be found by computing similarity from the vector distance between words [49]. The basic steps for constructing keyword vectors based on Word2Vec are as follows:

Firstly, the comment text is segmented using natural language processing techniques, and the segmentation results can be used as a corpus.

  • Input: The raw comment text.
  • Output: A segmented corpus of words.

Secondly, Word2Vec is used to train keyword vectors and convert the keywords in comments into vector form. The two important models in Word2Vec are the CBOW model and the Skip-gram model. The CBOW model predicts the current word given its context and is suitable for small datasets. This article uses the CBOW model to calculate the word vectors of keywords, and the calculation process is shown in formula (4).

  • Input: The keywords extracted using KeyBERT.
  • Output: The vector representation V(w) for each keyword.
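
$$\mathcal{L} = \sum_{w \in \mathcal{C}} \log p\big(w \mid \mathrm{context}(w)\big) \tag{4}$$

where w is the word whose vector is to be predicted, $\mathcal{C}$ is the segmented corpus, and “context” refers to the surrounding words of w.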
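To make the CBOW idea concrete, the sketch below builds the (context, target) training pairs that a CBOW model learns from: each word is the prediction target and its neighbours within a window form the context. The function name `cbow_pairs`, the window size, and the English example tokens are illustrative assumptions; the actual vector training would be left to a Word2Vec implementation.

```python
def cbow_pairs(tokens, window=2):
    """Return (context_words, target_word) pairs for one segmented comment."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]        # up to `window` words before
        right = tokens[i + 1:i + 1 + window]       # up to `window` words after
        context = left + right
        if context:                                # skip single-word comments
            pairs.append((context, target))
    return pairs

# Hypothetical segmented comment.
tokens = ["air", "conditioner", "cools", "quickly"]
pairs = cbow_pairs(tokens, window=1)
for context, target in pairs:
    print(context, "->", target)
```

A larger window gives each target more context words, at the cost of mixing in words that are less closely related to it.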

3.3. Construction of evaluation index system based on K-means

(1) Extraction of satisfaction evaluation indicators for green products.

In order to extract satisfaction evaluation indicators for green products, we used the K-means method to cluster keywords. In our study, k represents the number of keyword clusters. The specific steps for extracting consumer satisfaction indicators based on the K-means method are as follows:

  1. Use the elbow method to determine the value of k, the number of sets into which the feature words are to be clustered.
  2. Select k initial word vectors as the cluster centers.
  3. For each feature word vector in the dataset, calculate its Euclidean distance to each center and assign it to the nearest one, which yields the clusters corresponding to the k centers. This completes the first round of clustering.
  4. Update the mean vector of each cluster based on the word vectors assigned to it.
  5. Repeat steps 3 and 4 until a termination condition is reached, at which point the clustering of feature words is complete.
  6. Select the eight feature words with the highest similarity scores from each cluster to define the evaluation dimension.
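Steps 2 to 5 above can be sketched as a small pure-Python K-means loop. This is a minimal illustration under stated assumptions: the initial centers are drawn at random with a fixed seed, the termination condition is "assignments stop changing or `max_iter` is reached", and the toy points stand in for real keyword vectors. The value of k is supplied by the caller rather than chosen by the elbow method.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def kmeans(vectors, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 2: pick k initial centers from the data (assumed strategy).
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [None] * len(vectors)
    for _ in range(max_iter):
        # Step 3: assign every vector to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: euclidean(v, centers[c]))
            for v in vectors
        ]
        # Step 5: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = mean_vector(members)
    return labels, centers

# Toy 2-D "keyword vectors" forming two obvious groups (assumed data).
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
labels, centers = kmeans(points, k=2)
print(labels)
```

The cluster numbering depends on the random initial centers, so only the grouping, not the label values, is meaningful.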

(2) Calculation of the weight of satisfaction evaluation indicators for green products.

The k evaluation indicators can be obtained through K-means clustering, and the weights of each indicator are determined based on the KeyBert similarity score of the feature words under each indicator after clustering. The specific calculation steps are as follows:

Firstly, according to formula (3), the similarity weights of keyword w in all texts are summed and divided by the number of texts N to obtain the average similarity weight of keyword w. The calculation is shown in formula (5):

$$\overline{\mathrm{sim}}(w) = \frac{1}{N}\sum_{n=1}^{N} \mathrm{sim}(w, t_n) \tag{5}$$

where $\mathrm{sim}(w, t_n)$ represents the similarity weight of keyword w in the n-th text $t_n$, and $\overline{\mathrm{sim}}(w)$ represents the average similarity weight of keyword w over all texts.

Secondly, for each evaluation indicator u, calculate the sum $S(u)$ of the similarity weights of all its feature keywords, as shown in formula (6). For the evaluation index system z, calculate the sum $S(z)$ of all evaluation indicator weights, as shown in formula (7):

$$S(u) = \sum_{i=1}^{m} \overline{\mathrm{sim}}(w_i) \tag{6}$$

$$S(z) = \sum_{j=1}^{k} S(u_j) \tag{7}$$

where $\overline{\mathrm{sim}}(w_i)$ represents the similarity weight of the i-th keyword $w_i$ among the m keywords of evaluation indicator u, and $S(u_j)$ represents the sum of the similarity weights of all keywords in the j-th of the k evaluation indicators in the evaluation index system.

Finally, by performing normalization, we can obtain the proportion of each evaluation indicator u, which is the weight value w(u) of the evaluation indicator u, as shown in formula (8):

$$w(u) = \frac{S(u)}{S(z)} \tag{8}$$
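The weighting procedure of formulas (5) to (8) amounts to averaging, summing, and normalizing. The sketch below works through it on made-up numbers; the indicator names, keywords, and similarity scores are hypothetical, and it assumes each keyword's average is taken over the texts in which it received a score.

```python
def average_similarity(scores_per_text):
    """Formula (5): mean similarity weight of one keyword across texts."""
    return sum(scores_per_text) / len(scores_per_text)

def indicator_weights(indicator_keywords, keyword_scores):
    """Formulas (6)-(8): sum per indicator, sum overall, then normalize."""
    # Formula (6): per-indicator sum of average keyword similarities.
    sums = {
        indicator: sum(average_similarity(keyword_scores[w]) for w in words)
        for indicator, words in indicator_keywords.items()
    }
    # Formula (7): total over all indicators.
    total = sum(sums.values())
    # Formula (8): each indicator's share of the total.
    return {indicator: s / total for indicator, s in sums.items()}

# Hypothetical similarity scores of each keyword in the texts where it occurs.
keyword_scores = {
    "install": [0.6, 0.8],
    "technician": [0.5],
    "price": [0.3, 0.5],
    "discount": [0.4],
}
# Hypothetical clustering result: indicator -> feature keywords.
indicator_keywords = {
    "installation": ["install", "technician"],
    "price": ["price", "discount"],
}
weights = indicator_weights(indicator_keywords, keyword_scores)
print(weights)
```

Because of the normalization in formula (8), the weights always sum to 1 regardless of how many indicators or keywords there are.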

The resulting construction process of the evaluation index system for consumer satisfaction of green products based on text mining is shown in Fig 1. Please refer to S1 Table.

Fig 1. The construction process of consumer satisfaction evaluation index system for green products.

https://doi.org/10.1371/journal.pone.0322470.g001

4. Empirical research

In July 2022, the State Council Executive Meeting called for multiple measures to expand consumption, including measures to support the consumption of green and intelligent home appliances. Accordingly, our study selected green, energy-saving household appliances as the empirical research object and used online comments on energy-saving air conditioners on the JD platform as empirical data.

4.1. Construction of keyword database

We used a web-scraping tool to collect approximately 11,000 user feedback entries on Level 1 energy-saving air conditioners. First, we preprocessed the data: we retained comments longer than 15 but shorter than 200 characters and then performed data cleaning, word segmentation, and stop-word removal. Next, we used the KeyBERT method to select the five highest-weighted nouns, gerunds, verbs, and adjective phrases in each comment text as keywords for the evaluation index system. We then conducted word frequency analysis on the selected keywords and kept those with a frequency greater than 10 to construct a core feature word database. Some high-frequency keywords are shown in Table 1.

The data collection and analysis methods used in this study adhered to the terms and conditions of the data source to ensure compliance and ethical standards.

We gathered a dataset comprising 16,000 user feedback entries pertaining to Level 1 energy-saving air conditioning units. Initially, we performed data preprocessing, screening texts with comment lengths greater than 15 and less than 200 characters. This preprocessing included data cleaning, word segmentation, and the removal of stop words. In the next step, we employed the KeyBERT method to identify the top five keywords from each comment text, specifically focusing on nouns, gerunds, verbs, and adjective phrases that carried higher weights. The consideration of Part-of-Speech (POS) tagging during this process is crucial, as the POS provides insights into the role and function of keywords within sentences. The POS of a keyword directly influences its function; for example, nouns typically serve as subjects or core concepts, while verbs often indicate actions or behaviors. By analyzing the POS of keywords, we can gain a better understanding of their true meaning and role in the sentence, thereby accurately capturing the themes and core content of the text. Following this, we conducted a word frequency analysis on the identified keywords and retained those with a frequency greater than 10 to construct a core feature word database. Some high-frequency keywords are presented in Table 1. This methodology not only refined our keyword selection but also ensured that the keywords accurately reflected consumer sentiments regarding energy-saving air conditioning products.
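The two filtering rules described here, the comment-length window and the keyword-frequency threshold, can be sketched as follows. The function names and the toy inputs are hypothetical; the thresholds (15, 200, and 10) are the ones stated in the text, and the keyword lists are assumed to have already been produced by the extraction step.

```python
from collections import Counter

def filter_by_length(comments, min_len=15, max_len=200):
    """Keep comments longer than min_len and shorter than max_len characters."""
    return [c for c in comments if min_len < len(c) < max_len]

def core_feature_words(keywords_per_comment, min_freq=10):
    """Count keywords across comments and keep those occurring more than min_freq times."""
    counts = Counter(word for keywords in keywords_per_comment for word in keywords)
    return {word: n for word, n in counts.items() if n > min_freq}

# Toy comments: too short, within range, too long.
comments = ["good", "x" * 20, "y" * 250]
kept = filter_by_length(comments)

# Toy keyword lists: two keywords appear 11 times, one appears 10 times.
keywords_per_comment = [["installation", "quiet"]] * 11 + [["price"]] * 10
core_words = core_feature_words(keywords_per_comment)
print(len(kept), core_words)
```

Both comparisons are strict, matching the wording "greater than", so a keyword that occurs exactly 10 times is dropped.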

4.2. Word vectorization of feature keywords

Our study used the neural-network-based CBOW model to map words to distributed vectors based on contextual information and the internal structure of sentences. Using the Word2Vec model and the comment corpus, we transformed the feature words in the core feature word database into high-dimensional word vectors. Part of the resulting keyword vectors is shown in Table 2.

Table 2. Partial data representation of word2vec word vectorization.

https://doi.org/10.1371/journal.pone.0322470.t002

4.3. Clustering of feature keywords and definition of evaluation indicators

We determined the optimal number of clusters using the SSE elbow method and used the sum of squared errors generated by model iteration to determine the appropriate clustering points. The results showed that the optimal number of clusters was around 7–8. After determining the optimal number of clusters through SSE, we calculated the Euclidean distance from the word vector of the keywords to the nearest cluster center point, and formed clusters for the keywords. Each clustering theme in the clustering results can form a green product consumer satisfaction evaluation index. The final clustering results are shown in Table 3.

We determined the optimal number of clusters using the Sum of Squared Errors (SSE) elbow method, in which the SSE quantitatively represents the degree of sample aggregation. A smaller SSE value indicates a tighter grouping of samples within each cluster [50]. The SSE is calculated using the following formula:

$$\mathrm{SSE} = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - m_i \rVert^2, \qquad m_i = \frac{1}{n_i} \sum_{x \in C_i} x \tag{9}$$

where $C_i$ represents the i-th cluster, x is a sample point in $C_i$, $n_i$ denotes the number of sample points in $C_i$, and $m_i$ is the mean of all samples within $C_i$. As the number of clusters 𝑘 increases, the degree of aggregation within each cluster tends to improve, leading to a finer partitioning of the samples and a corresponding reduction in the SSE.

Initially, when 𝑘 is less than the true number of clusters, an increase in 𝑘 rapidly enhances the aggregation within each cluster, resulting in a significant decrease in SSE. However, once 𝑘 reaches the true number of clusters, further increases yield diminishing returns in terms of aggregation, causing the SSE reduction to level off. Consequently, the relationship between SSE and 𝑘 typically forms an elbow shape, with the elbow corresponding to the true number of clusters in the data.
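The elbow behaviour can be seen on a tiny hand-made example. The sketch below computes the SSE for three partitions of four one-dimensional points; the data and the partitions are assumptions chosen for illustration (they are written out by hand rather than produced by running K-means), and `sse` is a hypothetical helper name.

```python
def sse(clusters):
    """Sum of squared distances from each point to its cluster mean."""
    total = 0.0
    for points in clusters:
        dim = len(points[0])
        # Cluster mean m_i.
        center = [sum(p[d] for p in points) / len(points) for d in range(dim)]
        # Squared distance of every member to the mean.
        total += sum(
            sum((p[d] - center[d]) ** 2 for d in range(dim)) for p in points
        )
    return total

# Four toy points that visibly form two groups: {0, 2} and {10, 12}.
partitions = {
    1: [[[0.0], [2.0], [10.0], [12.0]]],
    2: [[[0.0], [2.0]], [[10.0], [12.0]]],
    3: [[[0.0]], [[2.0]], [[10.0], [12.0]]],
}
sse_by_k = {k: sse(clusters) for k, clusters in partitions.items()}
print(sse_by_k)
```

Going from one to two clusters removes almost all of the error, while a third cluster barely helps, so the elbow sits at k = 2 for this toy data.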

The final clustering results indicated an optimal number of clusters around 7–8. After establishing the optimal cluster count through the SSE method, we calculated the Euclidean distance from the word vectors of the keywords to the nearest cluster centroids, thereby forming clusters for the keywords. Each clustering theme identified in the results contributes to the development of a consumer satisfaction evaluation index for green products. The final clustering results are presented in Table 3.

4.4. Weight calculation and analysis of evaluation indicators

Firstly, the KeyBert model was used to calculate the similarity of feature keywords in each category.

Secondly, we calculated the sum of the similarity of keywords in each category and normalized them, as shown in Table 4.

Table 4. Similarity weights of feature keywords for each category.

https://doi.org/10.1371/journal.pone.0322470.t004

Finally, we calculated the weight of each evaluation indicator based on its category. The resulting weights are 27% for installation, 19.6% for appearance, 18.9% for function, 12.7% for price, 11.6% for service, and 10.2% for logistics.
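One plausible reading of the weighting steps is sketched below: sum the keyword similarity scores within each category, then divide by the grand total. The similarity scores and the function name `category_weights` are hypothetical. Only the six reported percentages are taken from the text, and the sketch simply checks that they sum to 100%.

```python
def category_weights(similarities):
    """Sum the similarity scores per category and normalize so the weights sum to 1."""
    totals = {category: sum(scores) for category, scores in similarities.items()}
    grand_total = sum(totals.values())
    return {category: total / grand_total for category, total in totals.items()}


# Hypothetical similarity scores for the keywords of three categories.
example_similarities = {
    "installation": [0.62, 0.58, 0.55],
    "appearance": [0.60, 0.45],
    "price": [0.40, 0.30],
}
example_weights = category_weights(example_similarities)
print({category: round(w, 3) for category, w in example_weights.items()})

# Weights reported in the text (percent); they should add up to 100.
reported_weights = {
    "installation": 27.0,
    "appearance": 19.6,
    "function": 18.9,
    "price": 12.7,
    "service": 11.6,
    "logistics": 10.2,
}
print(round(sum(reported_weights.values()), 1))
```

Because the weights are normalized sums, a category can gain weight either by containing more keywords or by containing keywords with higher similarity scores.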

Among the six evaluation indicators, consumers attach the greatest importance to the installation, appearance, and green functions of energy-saving air conditioners. When consumers consider purchasing energy-efficient air conditioners, their emphasis on these three indicators far exceeds their emphasis on the other factors. The functionality of energy-saving air conditioners is mainly reflected in energy efficiency, cooling and heating performance, frequency conversion, and related aspects. It is the core manifestation of their green attributes and directly affects how well the product performs in daily use. Regarding installation, consumers expect the unit to be installed promptly within the scheduled time after purchase, and installation efficiency and the proficiency of installation technicians have a significant impact on the consumer experience. Appearance is also highly valued. Consumers not only expect the product to deliver its functional characteristics but also want energy-saving air conditioners to have an attractive design that coordinates with their home decoration.

In addition, price, logistics, and service are factors that consumers consider when making purchases. Our empirical research was conducted on the JD e-commerce platform, which can achieve standardization and uniformity in services, logistics, and pricing. Therefore, consumer attention to these three categories of indicators is relatively low.

5. Conclusions

In this study, text mining methods were used to extract keywords, compute word vectors, and cluster comment data from user reviews on e-commerce platforms, culminating in an evaluation index system for consumer satisfaction with green products. User-generated online comments contain rich information that reflects consumption and usage experiences more comprehensively and objectively than traditional survey methods. Applying KeyBert, word2vec, and K-means to the comment texts showed that consumer satisfaction with energy-saving air conditioning products is assessed primarily on installation, appearance, functionality, logistics, price, and service. Consumers placed the greatest emphasis on installation, appearance, and functional characteristics, suggesting that manufacturers should prioritize these aspects in product development.

This study has limitations. Online comments may include false reviews, which can compromise the objectivity of the data. In addition, the data were drawn only from JD’s e-commerce platform, so future research should examine a broader range of high-quality comments across multiple platforms.

The proposed method differs from prior work by integrating advanced text mining techniques to provide a more nuanced understanding of consumer sentiment, highlighting the critical role of online reviews in shaping product evaluations. Previous studies in the field have used keyword extraction and clustering methods but often rely on basic frequency-based approaches, which limits their ability to capture deeper relationships between words and their contextual meanings. In contrast, our method uses word embeddings generated by word2vec to capture semantic relationships between keywords, providing a more comprehensive understanding of consumer sentiment.

The main contribution of this study is a robust framework for analyzing consumer feedback, which can be adapted for future research across diverse product categories.

Supporting information

S1 Table. Relevant data underlying the findings described in the manuscript.

https://doi.org/10.1371/journal.pone.0322470.s001

(XLSX)

Acknowledgments

The authors would like to thank the editor and anonymous reviewers for their valuable comments.

References

  1. Hu A, Zhou S, Xie Y. Green modernization with Chinese characteristics: review and prospect. Study on the National Conditions of Modernization with Chinese Characteristics. 2024:173–205.
  2. Xiong P, Zhao C, Chen H. Research on the influencing factors of public green consumption behavior under the carbon peak target. Business Economics Research. 2022;15:52–56.
  3. Jamoussi B, Abu-Rizaiza A, Al-Haij A. Sustainable building standards, codes and certification systems: the status quo and future directions in Saudi Arabia. Sustainability. 2022;14(16).
  4. Trukhina N, Barinov V, Andiyunina Y, et al. Innovation and certification as the basis for the development of energy-efficient construction. International Scientific Conference on Business Technologies for Sustainable Urban Development (SPbWOSCE); 2018.
  5. Puengwattanapong P, Leelasantitham A. A holistic perspective model of plenary online consumer behaviors for sustainable guidelines of the electronic business platforms. Sustainability. 2022;14(10).
  6. Fülöp MT, Topor DI, Căpușneanu S, Ionescu CA, Akram U, et al. Utilitarian and hedonic motivation in E-commerce online purchasing intentions. Eastern European Economics. 2023;61(5):591–613.
  7. Nagamma P, Pruthvi HR, Nisha KK, et al. An improved sentiment analysis of online movie reviews based on clustering for box-office prediction. International Conference on Computing, Communication & Automation (ICCCA); 2015:933–937.
  8. Lam T, Heales J, Hartley N. The role of positive online reviews in risk-based consumer behaviours: an information processing perspective. Aslib Journal of Information Management. 2025;77(2):282–305.
  9. Luo Y, Yang Z, Liang Y, Zhang X, Xiao H. Exploring energy-saving refrigerators through online e-commerce reviews: an augmented mining model based on machine learning methods. Kybernetes. 2022;51(9):2768–2794.
  10. Li S-P, Lin Y-H, Huang C-C. Application of the innovative model NIPA to evaluate service satisfaction. Sustainability. 2022;14(16):10036.
  11. Stojic D, Ciric Z, Sedlak O, et al. Students’ views on public transport: satisfaction and emission. Sustainability. 2020;12(20).
  12. Ma C, Gong X, Qiao Q. Exploration of consumer satisfaction and influencing factors in online shopping of sports goods. Business Economics Research. 2022;(09):76–79.
  13. Yang T, Wu J, Zhang J. Knowing how satisfied/dissatisfied is far from enough: a comprehensive customer satisfaction analysis framework based on hybrid text mining techniques. International Journal of Contemporary Hospitality Management. 2023; ahead-of-print.
  14. Wang H, Liu Y. Empirical study on the factors influencing consumer satisfaction of online purchasing of fresh agricultural products. Consumer Economy. 2015;31(06):81–86.
  15. Li N, Sun J, Li D. Empirical study on the influencing factors of consumer satisfaction with online fresh agricultural products. Business Economics Research. 2019;(11):144–147.
  16. Li W, Song H, Pan Y, et al. Empirical analysis of consumer satisfaction evaluation and improvement of fresh agricultural products under O2O model. China Agricultural Resources and Regionalization. 2020;41(01):129–137.
  17. Yang H, Zhou F, Tian Y. A study on consumer food safety satisfaction based on association rules. Management Review. 2020;32(04):286–297.
  18. Lee C, Lee E. E-commerce, competition & ASEAN economic integration. ISEAS-Yusof Ishak Institute; 2019:12–31.
  19. Oh S-S, Song J, Kim J. Increasing prevalence of multidrug-resistant mcr-1-positive Escherichia coli isolates from fresh vegetables and healthy food animals in South Korea. Int J Infect Dis. 2020;92:53–55. pmid:31877351
  20. Birjoveanu CV, Birjoveanu M. Automated verification of e-commerce protocols for complex transactions. Springer International Publishing; 2019.
  21. Wei H, Wu D. Research on the impact of online comments on online booking of group travel products. Market Weekly. 2020;03:66–68.
  22. Yanfeng Z, He L, Lihui P, et al. Empirical study on the influencing factors of the timeliness characteristics of online user comment behavior. Modern Intelligence. 2019;39(01):60–69+77.
  23. Geebren A, Jabbar A, Luo M. Examining the role of consumer satisfaction within mobile eco-systems: evidence from mobile banking services. Computers in Human Behavior. 2021;114:106584.
  24. Sun B, Ao C, Wang J, et al. Research on satisfaction evaluation of ecotourism based on network text mining. Operations Research and Management. 2022;31(12):165–172.
  25. Geng X, Chen L. Research on the influencing factors of online user satisfaction based on sentiment analysis and LDA model. Microcomputer Applications. 2019;35(06):38–41.
  26. Zhang Z, Luo T. Product requirements analysis based on online comment data mining and Kano model. Management Review. 2022;34(11):109–117.
  27. Zheng S, Wang P, Ding H, et al. Research on user experience of museum digital services based on aspect level emotional analysis. Information Science. 2022;40(04):171–178.
  28. Zhao Y, Ruan P, Liu X, et al. Research on user satisfaction evaluation based on online comments. Management Review. 2020;32(03):179–189.
  29. Tu M. Research on user satisfaction based on online comment clustering. Jilin University; 2020.
  30. Hao H, Guo J, Xin Z, et al. Research on e-Commerce distribution optimization of rice agricultural products based on consumer satisfaction. IEEE Access. 2021;9:135304–135315.
  31. Sung E, Chung WY, Lee D. Factors that affect consumer trust in product quality: a focus on online reviews and shopping platforms. Humanities and Social Sciences Communications. 2023;10(1):1–10.
  32. Cui G, Lui H-K, Guo X. The effect of online consumer reviews on new product sales. International Journal of Electronic Commerce. 2012;17(1):39–58.
  33. Shi W, Zhong B, Zhang Q. A comparative study of the impact of online film reviews and online short reviews on box office revenue. China Management Science. 2017;25(10):162–170.
  34. Yang X, Party Y, Wu J. Analysis of the relationship between product review quantity and purchase quantity based on symbolic regression. Journal of Systems Engineering. 2020;35(03):289–300.
  35. Niu G, Li G, Geng X, et al. The impact of the quantity and quality of online comments on online shopping willingness: the moderating effect of cognitive needs. Psychological Science. 2016;39(06):1454–1459.
  36. Sun J, Zheng Y, Chen J. Research on the impact of perceived online comment credibility on consumer trust: the moderating effect of uncertainty avoidance. Management Review. 2020;32(04):146–159.
  37. Cao Y, Li Q, Wan G. Research on the impact of online reviews on consumer decision-making for purchasing leisure food. Management Review. 2020;32(03):157–166.
  38. Hong F, Zheng H, Zhou Y, et al. A study on the impact of online comments on the purchase intention of college students. Business Economics Research. 2019;(08):52–56.
  39. Huo H, Zhang C. The impact of quality richness in online reviews on purchasing heterogeneous products. Enterprise Economics. 2018;37(06):77–83.
  40. Chevalier JA, Mayzlin D. The effect of word of mouth on sales: online book reviews. Journal of Marketing Research. 2006;43(3):345–54.
  41. Li A, Zhao Z. Research on factors influencing the usefulness of online comments based on signal transmission theory. Modern Intelligence. 2019;39(10):38–45.
  42. Ye Q, Law R, Gu B. The impact of online user reviews on hotel room sales. International Journal of Hospitality Management. 2009;28(1):180–182.
  43. Zhao T, Liu C. Construction of an emotional dictionary for Korean film critics based on deep learning. Information Technology and Informatization. 2021;(01):250–253.
  44. Wang Y. Emotional analysis and mining based on mobile product review text. Enterprise Technology and Development. 2019;05:130–132.
  45. Munuswamy S, Saranya MS, Ganapathy S, et al. Sentiment analysis techniques for social media-based recommendation systems. National Academy Science Letters-India. 2021;44(3):281–287.
  46. Wang W. Research and application of mining algorithms for Chinese online comment opinion. Xi’an University of Science and Technology; 2017.
  47. Liu Y, Kan L. E-commerce online comment data mining based on text sentiment analysis. Statistics and Information Forum. 2018;33(12):119-1.
  48. Issa B, Jasser MB, Chua HN, et al. A comparative study on embedding models for keyword extraction using KeyBERT method. 2023 IEEE 13th International Conference on System Engineering and Technology (ICSET). IEEE; 2023:40–45.
  49. Church KW. Word2Vec. Natural Language Engineering. 2017;23(1):155–162.
  50. Long W, Zhang X, Zhang L. Business process clustering method based on k-means and elbow rule. Journal of Jianghan University (Natural Science Edition). 2020;48(1):81–90.