Peer Review History

Original Submission: January 11, 2024
Decision Letter - Qin Xiang Ng, Editor

PONE-D-24-01459
tRF-BERT: A transformative approach to aspect-based sentiment analysis in the Bengali language
PLOS ONE

Dear Dr. Mridha,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by May 02 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Qin Xiang Ng, MBBS, MPH

Academic Editor

PLOS ONE

Journal Requirements:

1. When submitting your revision, we need you to address these additional requirements.

Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. We note that your Data Availability Statement is currently as follows: All relevant data are within the manuscript and its Supporting Information files.

Please confirm at this time whether or not your submission contains all raw data required to replicate the results of your study. Authors must share the “minimal data set” for their submission. PLOS defines the minimal data set to consist of the data required to replicate all study findings reported in the article, as well as related metadata and methods (https://journals.plos.org/plosone/s/data-availability#loc-minimal-data-set-definition).

For example, authors should submit the following data:

- The values behind the means, standard deviations and other measures reported;

- The values used to build graphs;

- The points extracted from images for analysis.

Authors do not need to submit their entire data set if only a portion of the data was used in the reported study.

If your submission does not contain these data, please either upload them as Supporting Information files or deposit them to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of recommended repositories, please see https://journals.plos.org/plosone/s/recommended-repositories.

If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent. If data are owned by a third party, please indicate how others may request data access.

Additional Editor Comments:

Apologies for the delay in securing reviewers for this manuscript. After reviewing the manuscript as well as the reviewers' comments and feedback, it is quite apparent that major revisions are necessary before the resubmitted manuscript can be considered. There is no guarantee of acceptance. 

1. Given the journal's biomedical and public health focus, some applications of sentiment analysis in public health research should be highlighted in the Introduction section (see: https://pubmed.ncbi.nlm.nih.gov/37376407; https://pubmed.ncbi.nlm.nih.gov/37358808).

2. Although the authors provided a description of the proposed hybrid transformative Random Forest and Bidirectional Encoder Representations from Transformers (tRF-BERT) model, there is limited explanation of the technical underpinnings, such as the specific architecture details, the interaction between the Random Forest and BERT components, and how exactly the hybrid model outperforms its constituent parts.

3. While the manuscript mentions using two open-source benchmark datasets, Cricket and Restaurant, for Aspect-Based Sentiment Analysis, I am unable to find any citations for these datasets, and the authors do not provide further statistics about these datasets (e.g., number of samples, distribution of classes). A more thorough dataset description is necessary.

4. The comparison against existing works primarily focuses on the final performance metrics. A comparison which discusses the nature of the datasets used in those works, model complexities, and computational resources required, would offer a clearer picture of the proposed model's advantages and limitations.

5. The author strongly focuses on F1 score and accuracy for evaluating model performance. Incorporating additional metrics such as Precision-Recall AUC, Matthews Correlation Coefficient, or analysis of the model's performance across different aspects/categories could provide a more comprehensive evaluation.

6. While some hyperparameters are listed, the process of selecting these values or any optimization strategy employed is not discussed. Detailing the hyperparameter tuning process, including the range of values explored, would strengthen the methodological rigor.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: No

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: - The paper provides a technically sound piece of scientific research with data that supports the conclusions.

- The authors present the statistical analysis appropriately.

- The authors provide data underlying the findings in their manuscript.

- This paper is well-written in English with some needed revisions.

Specific comments:

- In the abstract section, there is the sentence “it was clear that all the models used for our work achieved better results than any of the previous work”. What does the word “it” refer to? If it is unclear, please revise!

- In the sentence “A crucial part of this research involved finding or creating a dataset specifically designed for Bengali aspect-level sentiment analysis” on page 2, the authors state that this paper created a dataset. However, page 7 mentions that this study used publicly available datasets. Please clarify this issue!

- There is the sentence “it’s crucial to consider the semantics of the given aspect as a new and distinct piece of information, separate from the context itself” on page 4. Do not use contractions in academic papers, such as “it’s”, and do not use the word “it” as a syntactic expletive. Apply this convention consistently throughout the paper!

- Which pre-trained BERT and RoBERTa models were used for the fine-tuning process? Did the authors perform a pre-training process for the BERT model or use models already pre-trained by others? The authors should clearly address this issue.

- In the Tokenization and embeddings section on page 8, the paper mentions the use of the ‘bert-base-uncased’ tokenizer for tokenization. However, the ‘bert-base-uncased’ and ‘roberta-base’ tokenizers are pre-trained on English. How can these tokenizers be applied to the Bengali language?

- What are the M, N, and T terms in the BERT model? Please define them!

- In the BERT model section on page 10, please briefly define the unique tokens, i.e., [CLS], [SEP], and [EOS]!

- If the authors want to elaborate on the Q, K, and V parameters, they should explain where these parameters come from and how they correlate with the BERT inputs, i.e., X and Y.

- Please paraphrase the sentence “It is marked as “IsNext” if it does, and “NotNext” if it doesn’t.” What does “it” refer to?

- As we can see in Tables 2 and 3, the number of samples in each data class or category is imbalanced. How can the proposed model address this issue?

- Are Bangla and Bengali different terms? If they are the same term, please choose one of them and use the selected term consistently.

- In the performance evaluation, the proposed model used cross-validation; please provide more detail on the implementation, such as the value of the k parameter.

- For the experimental results of aspect detection and sentiment classification in this research, please also include, in Tables 7-10, the results when the model uses only TF-IDF and RF for the classification task. How is the performance of TF-IDF and RF?

- The authors can cite the paper https://doi.org/10.1186/s40537-023-00782-9, which also proposes a hybrid strategy for sentiment analysis.

Reviewer #2: This paper proposes a mix of random forest and pretrained transformer models for two Aspect-Based Sentiment Analysis (ABSA) tasks: aspect category detection and aspect sentiment classification. The study primarily targets the Bengali language and uses two existing Bengali text datasets in its experiments. The results show that the proposed tRF-BERT model outperforms the independent models on the ABSA tasks.

Pros:

1. ABSA for low resource languages like Bengali is interesting

2. The proposed model looks like an ensemble model for ABSA tasks, which is interesting

3. Focus on two different tasks is a plus

Cons:

1. There is no novelty in this paper. The only novel part is combining the results of two different classifiers and feeding them to a neural network model for final classification. However, this looks like an extended random forest algorithm that simply receives more data for a classification task.

2. There are several questions with experiments:

(a) Why has the proposed model not been evaluated against other ABSA models? This is a fast-growing research area, and many methods can be adapted to other languages too.

(b) Why are results for the ABSA tasks with random forest alone not given?

(c) I couldn't find any citations for the datasets.

(d) Why is the runtime not compared? Is the performance boost worth the execution time this model takes?

(e) Why were no experiments done on English datasets?

3. The random forest model uses TF-IDF features. This entirely ignores the concept of language models and makes it inefficient and unreliable for ABSA tasks. With TF-IDF features, the model must know all the data beforehand, including some of the words from the test data. How can this method scale to larger problems?

4. Would it be possible to add other models such as SVM, CNN, or other traditional models? How would the performance change in that scenario?

5. It is evident from previous methods that random forests perform very poorly on ABSA tasks. How does it make sense to give equal importance to both models' predictions? Could they be weighted?

6. I do not understand how the predictions are fed again into a neural network.

Overall, this paper lacks novelty and requires a significant addition of contributions before another round of submission.
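Reviewer point 3 above concerns TF-IDF's reliance on a vocabulary known before prediction time. A minimal pure-Python sketch (a generic illustration with invented toy sentences, not the authors' pipeline) shows the behaviour in question: the vocabulary and IDF weights are fixed when the vectoriser is fitted, so a word first seen at test time simply cannot be represented.

```python
import math
from collections import Counter

def fit_tfidf(docs):
    """Build a TF-IDF vocabulary and IDF weights from training docs only."""
    vocab = sorted({w for d in docs for w in d.split()})
    n = len(docs)
    df = Counter(w for d in docs for w in set(d.split()))
    idf = {w: math.log(n / df[w]) + 1.0 for w in vocab}
    return vocab, idf

def transform(doc, vocab, idf):
    """Vectorise one document; words outside the fitted vocabulary are dropped."""
    tf = Counter(doc.split())
    return [tf[w] * idf[w] for w in vocab]

train = ["the pitch was slow", "the batting was brilliant"]
vocab, idf = fit_tfidf(train)

# "umpire" never appeared at fit time, so it contributes nothing to the vector:
vec = transform("the umpire was slow", vocab, idf)
print("umpire" in vocab)       # False: test-only words cannot be represented
print(len(vec) == len(vocab))  # True: dimensionality is fixed at fit time
```

This is the scaling concern in the comment: every new domain word requires refitting the vectoriser, whereas a subword language model can tokenize unseen words.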

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

Journal: PLOS ONE

Manuscript ID: PONE-D-24-01459

Title: tRF-BERT: A transformative approach to aspect-based sentiment analysis in the Bengali language

Authors: Shihab Ahmed, Moythry Manir Samia, Maksuda Haider Sayma, Md Mohsin Kabir, M. F. Mridha

Dear Editor and Reviewers,

The authors thank the editor and all reviewers for the opportunity to revise our paper and for their specific and essential comments. We have revised the paper and restructured several sections. The updated version presents all the changes.

Response of Editor

Editor Comment-1: Given the journal’s biomedical and public health focus, some applications of sentiment analysis in public health research should be highlighted in the Introduction section (see: https://pubmed.ncbi.nlm.nih.gov/37376407; https://pubmed.ncbi.nlm.nih.gov/37358808).

Author’s Response: We thank the Editor for suggesting these studies for improving our paper.

Author’s Action: To address the editor’s concern, two new paragraphs were added at the beginning of the “Introduction” section, along with the recommended studies as examples.

Editor Comment-2: Although the authors provided a description of the proposed hybrid transformative Random Forest and Bidirectional Encoder Representations from Transformers (tRF-BERT) model, there is limited explanation of the technical underpinnings, such as the specific architecture details, the interaction between the Random Forest and BERT components, and how exactly the hybrid model outperforms its constituent parts.

Author’s Response: We appreciate the Editor for emphasizing this crucial aspect to enhance our paper.

Author’s Action: Architectural details of both BERT and Random Forest are provided in the “BERT model” and “RF” subsections of the “Methods and Materials” section of the manuscript; see also Figs 2, 3, and 4. Additionally, the specifics of the proposed tRF-BERT model are presented in the “Proposed tRF-BERT Model” subsection. Only the newly added segments are highlighted in blue in the manuscript.

Editor Comment-3: While the manuscript mentions using two open-source benchmark datasets, Cricket and Restaurant, for Aspect-Based Sentiment Analysis, I am unable to find any citations for these datasets, and the authors do not provide further statistics about these datasets (e.g., number of samples, distribution of classes). A more thorough dataset description is necessary.

Author’s Response: We express our gratitude to the Editor for emphasizing the importance of this aspect in enhancing our paper.

Author’s Action: The citations for the datasets and the link to the data source are included in the “Data Availability” section of the manuscript. Additional statistical information regarding the datasets, such as the number of samples and class distribution, has been included in the “Data source” subsection within the “Data collection and preprocessing” section, in Tables 2 and 3 of the manuscript.

Editor Comment-4: The comparison against existing works primarily focuses on the final performance metrics. A comparison which discusses the nature of the datasets used in those works, model complexities, and computational resources required, would offer a clearer picture of the proposed model’s advantages and limitations.

Author’s Response: We appreciate the Editor’s helpful suggestion. The comparison between related works and ours is presented in Table 14 of the manuscript. All the studies discussed in Table 14 use the same two datasets as we did. The statistics for these two datasets are provided in Tables 2 and 3 of the manuscript.

Editor Comment-5: The author strongly focuses on F1 score and accuracy for evaluating model performance. Incorporating additional metrics such as Precision-Recall AUC, Matthews Correlation Coefficient, or analysis of the model’s performance across different aspects/categories could provide a more comprehensive evaluation.

Author’s Response: We thank the Editor for this observation. We are especially grateful for the suggestion of additional metrics such as Precision-Recall AUC and Matthews Correlation Coefficient, as well as the recommendation to analyze the model’s performance across different aspects/categories. We intend to include these suggestions in future editions.

Author’s Action: We incorporated accuracy, precision, recall, and F1-score as evaluation metrics, which helped us evaluate the models and compare them with others in the domain. The response to this comment can be found in the “Performance Evaluation” subsection of the “Methods and materials” section and in the “Result Analysis” section of the manuscript.

Editor Comment-6: While some hyperparameters are listed, the process of selecting these values or any optimization strategy employed is not discussed. Detailing the hyperparameter tuning process, including the range of values explored, would strengthen the methodological rigor.

Author’s Response: We thank the editor for pointing out this important point.

Author’s Action: We have tried our best to address this issue in the “Hyperparameters” subsection of the “Methods and Materials” section of our manuscript. The first paragraph has been added to the manuscript and colored blue in response to this comment.


Response of Reviewer-1

Reviewer-1 Comment-1: In the abstract section, there is a sentence “it was clear that all the models used for our work achieved better results than any of the previous work”; what does the word “it” refer to? If it is unclear, please revise!

Author’s Response: We thank the reviewer for the observation.

Author’s Action: We revised the mentioned sentence in the “Abstract” section for better understanding.

Reviewer-1 Comment-2: In the sentence “A crucial part of this research involved finding or creating a dataset specifically designed for Bengali aspect-level sentiment analysis” on page 2, the authors state that this paper created a dataset. However, page 7 mentions that this study used publicly available datasets. Please clarify this issue!

Author’s Response: We thank the reviewer for mentioning the issue. In our research, we did not create a new dataset; instead, we used publicly available datasets.

Author’s Action: We changed the sentence mentioned in the “Introduction” section to make the matter clear.

Reviewer-1 Comment-3: There is the sentence “it’s crucial to consider the semantics of the given aspect as a new and distinct piece of information, separate from the context itself” on page 4. Do not use contractions in academic papers, such as “it’s”, and do not use the word “it” as a syntactic expletive. Apply this convention consistently throughout the paper!

Author’s Response: We thank the reviewer for making this suggestion regarding our paper.

Author’s Action: We updated the mentioned sentence according to the reviewer’s advice in the first paragraph of the “Literature Review” section.

Reviewer-1 Comment-4: Which pre-trained BERT and RoBERTa models were used for the fine-tuning process? Did the authors perform a pre-training process for the BERT model or use models already pre-trained by others? The authors should clearly address this issue.

Author’s Response: We thank the reviewer for mentioning the issue. For this study, we used BERT and RoBERTa models already pre-trained by others.

Author’s Action: Please refer to the first paragraph under the heading “BERT-based model” in the “Proposed tRF-BERT model” subsection of the “Methods and materials” section of the manuscript.

Reviewer-1 Comment-5: In the Tokenization and embeddings section on page 8, the paper mentions the use of the ‘bert-base-uncased’ tokenizer for tokenization. However, the ‘bert-base-uncased’ and ‘roberta-base’ tokenizers are pre-trained on English. How can these tokenizers be applied to the Bengali language?

Author’s Response: We thank the reviewer for focusing on this issue.

Author’s Action: A short explanation of using the ‘bert-base-uncased’ and ‘roberta-base’ tokenizers for our Bengali datasets has been added to the “Tokenization and embeddings” subsection of the “Methods and materials” section of our manuscript.

Reviewer-1 Comment-6: What are the M, N, and T terms in the BERT model? Please define them!

Author’s Response: We thank the reviewer for the observation.

Author’s Action: We briefly defined M, N, and T in the “BERT model” sub-subsection of the “Aspect and sentiment prediction models” subsection of the “Methods and materials” section.

Reviewer-1 Comment-7: In the BERT model section on page 10, please briefly define the unique tokens, i.e., [CLS], [SEP], and [EOS]!

Author’s Response: We thank the reviewer for the observation.

Author’s Action: We briefly defined the unique tokens [CLS], [SEP], and [EOS] in the “BERT model” sub-subsection of the “Aspect and sentiment prediction models” subsection of the “Methods and materials” section.

Reviewer-1 Comment-8: If the authors want to elaborate on the Q, K, and V parameters, they should explain where these parameters come from and how they correlate with the BERT inputs, i.e., X and Y.

Author’s Response: We thank the reviewer for the observation.

Author’s Action: We elaborated on the Q, K, and V parameters and their correlation with the BERT inputs in the “BERT model” sub-subsection of the “Aspect and sentiment prediction models” subsection of the “Methods and materials” section.

Reviewer-1 Comment-9: Please paraphrase the sentence “It is marked as “IsNext” if it does, and “NotNext” if it doesn’t.” What does “it” refer to?

Author’s Response: We appreciate the reviewer’s helpful suggestion. The word “it” refers to the relationship between the second sentence and the first sentence in the original text.

Author’s Action: We paraphrased the sentence for better understanding in the “BERT model” sub-subsection of the “Aspect and sentiment prediction models” subsection of the “Methods and materials” section.

Reviewer-1 Comment-10: As we can see in Tables 2 and 3, the number of samples in each data class or category is imbalanced. How can the proposed model address this issue?

Author’s Response: We thank the reviewer for mentioning the issue. To evaluate the performance of our model, we employed a combination of metrics: Accuracy, Precision, Recall, and F1-Score. While Accuracy is sufficient for balanced datasets, imbalanced datasets require a more nuanced approach. In such cases, Precision, Recall, and F1-Score provide valuable insights. By considering these metrics, we gain a comprehensive understanding of how well our model identifies every case, even when the classes are not evenly distributed.

Author’s Action: As shown in Tables 7, 8, 9, and 10 in the manuscript, our proposed model demonstrates strong performance across all evaluation metrics, indicating its effectiveness in handling both balanced and imbalanced scenarios. A paragraph addressing this issue has been added in the “Results” subsection, immediately above Table 7, of the “Result analysis” section in the manuscript.

In future work, we aim to address the issue of imbalanced data classes by exploring various data balancing techniques such as oversampling, undersampling, and class weighting.
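The trade-off described in this response can be illustrated with a small self-contained Python sketch (the labels are invented for illustration, not taken from the Cricket or Restaurant datasets): on an imbalanced label set, a degenerate majority-class predictor scores high accuracy, while per-class precision, recall, and F1 expose its failure on the minority class.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for one class (one-vs-rest)."""
    tp = sum(t == p == label for t, p in zip(y_true, y_pred))
    fp = sum(p == label and t != label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Imbalanced toy labels: 8 "pos", 2 "neg".
y_true = ["pos"] * 8 + ["neg"] * 2
y_pred = ["pos"] * 10  # a degenerate predictor that always outputs the majority class

# Accuracy looks respectable, but the minority class has zero recall:
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                                    # 0.8
print(per_class_metrics(y_true, y_pred, "neg"))    # (0.0, 0.0, 0.0)
```

This is why the response pairs accuracy with per-class precision, recall, and F1 when the class distribution is skewed.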

Reviewer-1 Comment-11: Are Bangla and Bengali different terms? If they are the same term, please choose one of them and use the selected term consistently.

Author’s Response: We thank the reviewer for pointing out this important point. Bengali and Bangla refer to the same language: Bangla is the language’s name in Bangla, while Bengali is the term used in English.

Author’s Action: To avoid confusion, we used “Bengali” consistently throughout the paper.

Reviewer-1 Comment-12: In the performance evaluation, the proposed model used cross-validation; please provide more detail on the implementation, such as the value of the k parameter.

Author’s Response: We appreciate the esteemed reviewer for highlighting this concern.

Author’s Action: In response, we have included the value of k in the “Cross-Validation” subsection of the “Methods and materials” section of the manuscript. Only the newly added lines are highlighted in blue.
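For readers unfamiliar with the k parameter discussed here, the sketch below shows how k-fold cross-validation partitions a dataset. The value k=5 is used purely as an illustration; the value actually chosen by the authors is stated in the manuscript, not in this letter.

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k folds; each fold is the held-out
    test set exactly once, and the remaining indices form the training set."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    splits, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        splits.append((train, test))
        start += size
    return splits

splits = kfold_indices(10, 5)
print(len(splits))     # 5 train/test splits
print(splits[0][1])    # first held-out fold: [0, 1]
```

Reporting k matters because the per-fold training-set size (here, n - n/k samples) directly affects the variance of the reported scores.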

Reviewer-1 Comment-13: For the experimental results of aspect detection and sentiment classification in this research, please also include, in Tables 7-10, the results when the model uses only TF-IDF and RF for the classification task. How is the performance of TF-IDF and RF?

Author’s Response: We are grateful to the respected reviewer for the valuable comments.

Author’s Action: We have included the results obtained from the Random Forest model with TF-IDF in Tables 7, 8, 9, and 10 in the manuscript.

Reviewer-1 Comment-14: The authors can cite the paper https://doi.org/10.1186/s40537-023-00782-9, which also proposes a hybrid strategy for sentiment analysis.

Author’s Response: We thank the reviewer for suggesting this supporting material for our study.

Author’s Action: We have cited this study in the seventh paragraph of the “Introduction” section of the manuscript.


Response of Reviewer-2

Reviewer-2 Comment-1: This paper proposes a mix of random forest and pretrained transformer models for two Aspect-Based Sentiment Analysis (ABSA) tasks: aspect category detection and aspect sentiment classification. The study primarily targets the Bengali language and uses two existing Bengali text datasets in its experiments. The results show that the proposed tRF-BERT model outperforms the independent models on the ABSA tasks.

Pros:

1. ABSA for low-resource languages like Bengali is interesting.

2. The proposed model looks like an ensemble model for ABSA tasks, which is interesting.

3. Focus on two different tasks is a plus.

Author’s Response: We thank the respectful reviewer for these excellent remarks.

Reviewer-2 Comment-2: Cons: 1. There is no novelty in this paper. The only novel part is combining the results of two different classifiers and feeding them to a neural network model for final classification. However, this looks like an extended random forest algorithm that simply receives more data for a classification task.

Author’s Response: We thank the reviewer for providing such insightful observations. We have tried our best to respond.

Author’s Action: Our proposed model outperformed all existing works on the publicly available ‘Cricket’ and ‘Restaurant’ datasets in the field of Bengali ABSA. The contributions made by our study are the markers that ensure the novelty of the research in this field; they are listed in the “Introduction” section of the manuscript.

Reviewer 2 Comment-3:There are several questions with experiments: (a)

Why the proposed model has not been evaluated against other ABSA models?

This area is fast growing research and many methods can be applied approximately

to other languages too. (b) Why results of ABSA tasks just with random

forest are not given? (c) I couldn’t find any citations for the dataset (d) Why

the runtime is not compared? Is this performance boost worth the execution

time this model takes? (e) Why no experiments done on English datasets?

Author’s Response: We are grateful to our respected reviewer for the valuable comments.

(a) Author’s Action: Our research primarily concentrated on Bengali Aspect-Based Sentiment Analysis (ABSA) tasks. As such, our investigation was limited to comparing our model exclusively with existing Bengali ABSA models. This comparison is given in Table 14 of our manuscript, in the “Comparison with previous works” subsection. However, we aim to extend our evaluation in future studies to include comparisons with models designed for other languages.

(b) Author’s Action: We have included the results obtained from the Random Forest model with TF-IDF in Tables 7, 8, 9, and 10 in the manuscript.

(c) Author’s Action: The datasets used in this study are available at [https://github.com/atik-05/Bangla ABSA Datasets]. Additionally, citation 16 in the reference section provides further details about the dataset and includes the DOI link.

(d) Author’s Action: We have incorporated a comparison of runtime in Table 11 in the manuscript, in the “Results” subsection of the “Result and analysis” section.

Attachments
Attachment
Submitted filename: tRF_BERT_Response to Reviewers.pdf
Decision Letter - Qin Xiang Ng, Editor

PONE-D-24-01459R1

tRF-BERT: A transformative approach to aspect-based sentiment analysis in the Bengali language

PLOS ONE

Dear Dr. Mridha,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jun 27 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Qin Xiang Ng, MBBS, MPH

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors generally already addressed my previous comments. However, there is a point that should be clarified by the author as follows:

- The authors said that they used "pre-trained BERT and RoBERTa models by others". Please give the reference where those pre-trained models can be found.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 2

Response of Reviewer-1

Reviewer-1 Comment-1: The authors generally already addressed my previous comments. However, there is a point that should be clarified by the author as follows: - The authors said that they used "pre-trained BERT and RoBERTa models by others". Please give the reference where those pre-trained models can be found.

Author’s Response: We thank the reviewer for pointing out the issues.

Author’s Action: We had already referenced two papers, citations 14 and 32, which provide in-depth insight into the pre-trained models, though initially these were not cited uniformly. Now, these references are consistently cited throughout the text wherever pre-trained models are mentioned.

Attachments
Attachment
Submitted filename: tRF_BERT.pdf
Decision Letter - Qin Xiang Ng, Editor

PONE-D-24-01459R2

tRF-BERT: A transformative approach to aspect-based sentiment analysis in the Bengali language

PLOS ONE

Dear Dr. Mridha,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jul 22 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Qin Xiang Ng, MBBS, GDMH, MPH

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: N/A

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: Authors have addressed most of the previous reviews. However, I have the following comments for this revised manuscript:

Use of TF-IDF features in the proposed model: TF-IDF performs decent, not only in the ABSA problem, but also for several other NLP classifiers. The only problem with TF-IDF is its lack of generalization. Authors did not show any evidence how the model performs if tokens are not present during the training phase but becomes available during the test. How about other classifier models like SVM and NB instead of Random Forest? They are traditional ML models too.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 3

Response of Reviewer-2

Reviewer-2 Comment-1: Authors have addressed most of the previous reviews. However, I have the following comments for this revised manuscript:

Use of TF-IDF features in the proposed model: TF-IDF performs decent, not only in the ABSA problem, but also for several other NLP classifiers. The only problem with TF-IDF is its lack of generalization. Authors did not show any evidence how the model performs if tokens are not present during the training phase but becomes available during the test. How about other classifier models like SVM and NB instead of Random Forest? They are traditional ML models too.

Author’s Response: We thank the reviewer for pointing out the issues. We have tried our best to make the use of the tokenizer during the training phase more explicit.

Author’s Action: It is known that TF-IDF works well with trained words or vocabularies. In our research, we used vocabularies that are most commonly used in sentiment analysis, and as a result, TF-IDF performed satisfactorily. For future work, we intend to explore other technologies such as word2vec and BERT tokenizers to address the issue of generalization and improve the model’s performance with previously unseen tokens.

The tokens are present during the training phase, as shown in Figure 4 on page 17 (Section “Methods and Materials”). We have also added a statement indicating the use of the tokenizer with TF-IDF in the last paragraph of the subsection “Tokenization and Embeddings” in the “Methods and Materials” section. (New additions are coloured blue.)

Additionally, we have experimented with SVM and NB instead of Random Forest with the BERT model, and the results are mentioned in Table 13 on page 22. To avoid confusion, we have updated the names of the models.
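The generalization issue the reviewer raises can be demonstrated directly: with TF-IDF, the vocabulary is fixed at fit time, so any test-time token outside it simply drops out of the representation. A minimal sketch using scikit-learn's `TfidfVectorizer` as an illustrative stand-in (not the paper's exact pipeline, and with toy sentences in English for readability):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Vocabulary is learned from the training corpus only.
vectorizer = TfidfVectorizer()
vectorizer.fit(["the food was great", "the service was slow"])

# "dessert" never appeared during training, so it is silently ignored;
# only "the", "was", and "great" receive non-zero TF-IDF weights.
test_vec = vectorizer.transform(["the dessert was great"])
print(test_vec.nnz)  # 3 non-zero features, not 4
```

This is why subword tokenizers (such as those used by BERT) generalize better: an unseen word is split into known subword units rather than discarded outright.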

Attachments
Attachment
Submitted filename: Response to Reviewers (1).docx
Decision Letter - Qin Xiang Ng, Editor

tRF-BERT: A transformative approach to aspect-based sentiment analysis in the Bengali language

PONE-D-24-01459R3

Dear Dr. Mridha,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at Editorial Manager® and clicking the ‘Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Qin Xiang Ng, MBBS, GDMH, MPH

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Formally Accepted
Acceptance Letter - Qin Xiang Ng, Editor

PONE-D-24-01459R3

PLOS ONE

Dear Dr. Mridha,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission,

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Qin Xiang Ng

Academic Editor

PLOS ONE

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.