A facial expression recognition network using hybrid feature extraction

Dandan Song; Chao Liu

doi:10.1371/journal.pone.0312359

Peer Review History

Original Submission May 14, 2024
1 Jul 2024 Decision Letter - Qionghao Huang, Editor
PONE-D-24-19404
A Facial Expression Recognition Network Using Hybrid Feature Extraction
PLOS ONE
Dear Dr. Song,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
==============================
ACADEMIC EDITOR: Thank you for submitting your manuscript to PLOS ONE. We have received feedback from reviewers on your paper. While the reviewers acknowledge the potential of your work, they have raised several concerns that need to be addressed. In light of these comments, we invite you to submit a revised version of your manuscript that addresses the reviewers' concerns for further consideration. Please include a detailed response to each point raised by the reviewers, and clearly indicate the changes made in the manuscript.
==============================
Please submit your revised manuscript by Aug 15 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript:
- A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
- A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
- An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.
We look forward to receiving your revised manuscript.
Kind regards,
Qionghao Huang
Academic Editor
PLOS ONE
Journal Requirements:
When submitting your revision, we need you to address these additional requirements.
1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf
2. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections does not match. When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.
[Note: HTML markup is below. Please do not edit.]
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1.
Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Partly Reviewer #2: Yes ******** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: N/A ****** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ****** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ****** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. 
(Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: The paper introduces a facial expression recognition network called HFE-Net, designed to capture both subtle changes in expression features and overall facial expression information. The method has been extensively tested on three public facial expression datasets, validating that the hybrid feature extraction block can enhance the network’s ability to recognize facial expressions. However, there are several areas that need improvement:
- In the Introduction section, the research problem posed by the authors is not very clear. Please clearly state the current Research Gap, how your work differs from existing studies, and briefly outline your technical contributions.
- In the Related Work section, it would be beneficial to include some recent related works (e.g., Face2nodes: learning facial expression representations with relation-aware dynamic graph convolution networks, INS, 2023; FER-CHC: Facial expression recognition with cross-hierarchy contrast, ASOC, 2023; Emotion recognition from large-scale video clips with cross-attention and hybrid feature weighting neural networks, ERPH, 2023; Facial expression recognition with grid-wise attention and visual transformer, INS, 2021). This would provide readers with a more current understanding of this field.
- Regarding model design, the authors use two Feature Extraction Modules. The motivation behind this design is not very clear, and it’s uncertain whether other feature extraction networks could also be used. Please conduct some data analysis in the experimental section to confirm the rationality of the architecture design.
- In terms of experiments, it is recommended to add more datasets, such as AffectNet, to better demonstrate the model’s generalizability.
- It is suggested to add comparisons with the latest methods from recent years, as the baselines currently compared are not state-of-the-art (SOTA) methods.
- The paper still has some issues with language grammar or fluency. Please carefully check and revise.
Reviewer #2: This paper presents a facial expression recognition network by proposing a Hybrid Feature Extraction Block, which consists of parallel Big Model and Multi-head Self-attention. Overall, the paper is well presented. Nevertheless, I have a few concerns about the paper on the following points:
- What are the authors' explanations for using Big Model rather than already existing methods in capturing long-range dependencies in feature maps?
- The explanation of equation 2 states the Softmax is normalized, but the equation doesn’t have a normalization factor. Please correct this mismatch.
- In the Cross-entropy loss function, equation 6, both variables are the same (p(i)), which is inaccurate because the loss is calculated between the predicted and true values.
- The authors' discussion of why the proposed method outperformed existing approaches would provide more insights into their proposed network.
- Labeling the confusion matrices with associated expressions rather than the indices can make them better for comparison.
******
6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No
********
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. https://doi.org/10.1371/journal.pone.0312359.r001
Revision 1
23 Aug 2024 Author Response
Reviewer #1: The paper introduces a facial expression recognition network called HFE-Net, designed to capture both subtle changes in expression features and overall facial expression information. The method has been extensively tested on three public facial expression datasets, validating that the hybrid feature extraction block can enhance the network’s ability to recognize facial expressions. However, there are several areas that need improvement:
1. In the Introduction section, the research problem posed by the authors is not very clear. Please clearly state the current Research Gap, how your work differs from existing studies, and briefly outline your technical contributions.
Reply: Dear reviewer, as you requested, I have rephrased the description of the problems with current facial expression classification models in the Introduction and highlighted the technical contributions of this paper. Specifically, current facial expression recognition mainly adopts network architectures based on convolutional neural networks and Transformers. Convolutional kernels have strong local modeling capabilities but insufficient global modeling capabilities. Although a Transformer can calculate the correlation between all pixels in an image, it also introduces too much background information during feature extraction, which can reduce the model's ability to focus. To address these issues, the HFE-Net proposed in this article uses convolutional kernels and a multi-head self-attention mechanism to extract features from different perspectives in images. In addition, this article uses the Big Model (renamed: Feature Fusion Device), which adopts a different modeling method, to calculate the correlation between distant elements and improve the network's ability to focus on facial expression features.
2. In the Related Work section, it would be beneficial to include some recent related works (e.g., Face2nodes: learning facial expression representations with relation-aware dynamic graph convolution networks, INS, 2023; FER-CHC: Facial expression recognition with cross-hierarchy contrast, ASOC, 2023; Emotion recognition from large-scale video clips with cross-attention and hybrid feature weighting neural networks, ERPH, 2023; Facial expression recognition with grid-wise attention and visual transformer, INS, 2021). This would provide readers with a more current understanding of this field.
Reply: Dear reviewer, following your recommendation, I have added an introduction to this prior work in the Related Work section.
3. Regarding model design, the authors use two Feature Extraction Modules. The motivation behind this design is not very clear, and it’s uncertain whether other feature extraction networks could also be used. Please conduct some data analysis in the experimental section to confirm the rationality of the architecture design.
Reply: Dear reviewer, the motivation for using the multi-head self-attention mechanism and the Big Model (renamed: Feature Fusion Device) in this paper comes from the insufficient global modeling ability of convolution kernels and the insufficient local focusing ability of Transformers. First, global modeling ability allows the model to better extract complete facial expression features. Second, local modeling ability helps the network concentrate its learning on the foreground regions of the image. To validate this argument, I added a related Multilayer Perceptron (MLP) experiment to the ablation study.
4. In terms of experiments, it is recommended to add more datasets, such as AffectNet, to better demonstrate the model’s generalizability.
Reply: Dear reviewer, as you requested, I have added comparison experiments and ablation experiments on the AffectNet dataset.
5. It is suggested to add comparisons with the latest methods from recent years, as the baselines currently compared are not state-of-the-art (SOTA) methods.
Reply: Dear reviewer, as you requested, I have added experimental results for the latest methods from recent years.
6. The paper still has some issues with language grammar or fluency. Please carefully check and revise.
Reply: Dear reviewer, as you requested, I have re-read the article and corrected the grammar.
Reviewer #2: This paper presents a facial expression recognition network by proposing a Hybrid Feature Extraction Block, which consists of parallel Big Model and Multi-head Self-attention. Overall, the paper is well presented. Nevertheless, I have a few concerns about the paper on the following points:
1. What are the authors' explanations for using Big Model rather than already existing methods in capturing long-range dependencies in feature maps?
Reply: Dear reviewer, I would like to offer the following explanation. Existing feature extraction modules are mainly built from convolution kernels, Transformers, MLPs, and so on. Among them, the feature pyramid captures long-range dependencies in the feature map by using convolution kernels of different sizes, but this method increases the computational complexity of the network. Similarly, although other feature extraction modules can capture long-range context information, they often increase the computational burden of the network. By contrast, the Big Model (renamed: Feature Fusion Device) captures long-range context information by shifting the spatial position of image pixels, which not only reduces the computational complexity of the network but also improves its classification performance.
2. The explanation of equation 2 states the Softmax is normalized, but the equation doesn’t have a normalization factor. Please correct this mismatch.
Reply: Dear reviewer, I have corrected this problem. Thank you for your guidance.
3. In the Cross-entropy loss function, equation 6, both variables are the same (p(i)), which is inaccurate because the loss is calculated between the predicted and true values.
Reply: Dear reviewer, I have corrected this problem. Thank you for your guidance.
4. The authors' discussion of why the proposed method outperformed existing approaches would provide more insights into their proposed network.
Reply: Dear reviewer, as you requested, we have explained why the proposed method is superior to the existing methods in the "Comparison with state-of-the-art methods", "Discussion", and "Conclusion and prospects" sections.
5. Labeling the confusion matrices with associated expressions rather than the indices can make them better for comparison.
Reply: Dear reviewer, this article presents our experimental results following the format of the corresponding experiments in previous articles.
Attachment
Submitted filename: Response to Reviewers.doc
https://doi.org/10.1371/journal.pone.0312359.r002
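Reviewer #2's points on equation 2 and equation 6 concern the normalization factor in Softmax and the use of two distinct quantities (true and predicted) in the cross-entropy loss. The manuscript's corrected equations are not reproduced in this review history, so the following is only a minimal sketch of the standard textbook definitions, not the authors' formulation. The function names `softmax` and `cross_entropy`, the `eps` guard, and the example values are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Standard softmax: exponentiate each score, then divide by the sum.

    The division by `total` is the normalization factor that makes the
    outputs sum to 1.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true, p_pred, eps=1e-12):
    """Cross-entropy between a true distribution and a predicted one.

    The two arguments play different roles: y_true weights the log of
    p_pred. `eps` (an illustrative guard) avoids log(0).
    """
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, p_pred))


# Hypothetical three-class example.
logits = [2.0, 1.0, 0.1]      # raw network scores (illustrative values)
probs = softmax(logits)       # predicted distribution
y_true = [1.0, 0.0, 0.0]      # one-hot true label
loss = cross_entropy(y_true, probs)
print(probs, loss)
```

With a one-hot label, the loss reduces to the negative log of the probability assigned to the true class, which is why substituting the same variable for both arguments would no longer measure prediction error.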
7 Oct 2024 Decision Letter - Qionghao Huang, Editor
A Facial Expression Recognition Network Using Hybrid Feature Extraction
PONE-D-24-19404R1
Dear Dr. Song,
We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.
An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.
If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
Kind regards,
Qionghao Huang
Academic Editor
PLOS ONE
Additional Editor Comments (optional):
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1.
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #1: All comments have been addressed Reviewer #2: All comments have been addressed ******** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Yes ****** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ****** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ****** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. 
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ****** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The author's response has addressed my concerns on this paper, and I recommend an acceptance of this paper. Reviewer #2: The authors have addressed the reviewers' comments and revised the manuscript accordingly. Good luck. ****** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No ******** https://doi.org/10.1371/journal.pone.0312359.r003
Formally Accepted
5 Nov 2024 Acceptance Letter - Qionghao Huang, Editor
PONE-D-24-19404R1
PLOS ONE
Dear Dr. Song,
I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.
At this stage, our production department will prepare your paper for publication. This includes ensuring the following:
* All references, tables, and figures are properly cited
* All relevant supporting information is included in the manuscript submission
* There are no issues that prevent the paper from being properly typeset
If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.
Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
If we can help with anything else, please email us at customercare@plos.org.
Thank you for submitting your work to PLOS ONE and supporting open access.
Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Qionghao Huang
Academic Editor
PLOS ONE
https://doi.org/10.1371/journal.pone.0312359.r004

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.