G-CutMix: A CutMix-based graph data augmentation method for bot detection in social networks

Yan Li; Shuhao Shi; Xiaofeng Guo; Chunhua Zhou; Qian Hu

doi:10.1371/journal.pone.0331978

Peer Review History

Original Submission August 29, 2024
16 Jan 2025 Decision Letter - Riaz Ul Amin, Editor

PONE-D-24-37596 G-CutMix: a CutMix-based Graph Data Augmentation Method for Bot Detection in Social Networks PLOS ONE

Dear Dr. Li,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Mar 02 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:
- A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
- A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
- An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future.
For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Riaz Ul Amin Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. Thank you for stating the following financial disclosure: “the National Natural Science Foundation of China 61872448, 62002387” Please state what role the funders took in the study. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." If this statement is not correct you must amend it as needed. Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf. 3. We note that your Data Availability Statement is currently as follows: All relevant data are within the manuscript and its Supporting Information files. Please confirm at this time whether or not your submission contains all raw data required to replicate the results of your study. Authors must share the “minimal data set” for their submission. 
PLOS defines the minimal data set to consist of the data required to replicate all study findings reported in the article, as well as related metadata and methods (https://journals.plos.org/plosone/s/data-availability#loc-minimal-data-set-definition). For example, authors should submit the following data: - The values behind the means, standard deviations and other measures reported; - The values used to build graphs; - The points extracted from images for analysis. Authors do not need to submit their entire data set if only a portion of the data was used in the reported study. If your submission does not contain these data, please either upload them as Supporting Information files or deposit them to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of recommended repositories, please see https://journals.plos.org/plosone/s/recommended-repositories. If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent. If data are owned by a third party, please indicate how others may request data access. 4. PLOS requires an ORCID iD for the corresponding author in Editorial Manager on papers submitted after December 6th, 2016. Please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager. 
Additional Editor Comments: Dear Yan Li, Thank you for submitting your manuscript, titled "G-CutMix: a CutMix-based Graph Data Augmentation Method for Bot Detection in Social Networks," to PLOS One. After careful review, we have received feedback from the reviewers and conducted an independent editorial evaluation. While your work presents significant potential and addresses an important topic, several substantive concerns must be addressed before we can consider it for publication. We are therefore inviting you to submit a revised version of your manuscript. The reviewers’ comments, which are included below, outline specific areas that require major revision. You are advised to respond point-by-point to all reviewer comments. Clearly indicate how each concern has been addressed in your revised manuscript. Highlight the changes in the revised manuscript, either using track changes or by providing a marked-up version. Ensure that your revisions meet the rigorous standards of transparency, reproducibility, and ethical research practices upheld by PLOS One. Please note that the revised manuscript will undergo further review to ensure that the concerns have been adequately addressed. We look forward to your resubmission and appreciate your dedication to refining your work. If you have any questions or require clarification on the reviewers' feedback, do not hesitate to contact us. Sincerely, Dr. Riaz Ul Amin Associate Editor PLOS One

Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Partly Reviewer #2: No Reviewer #3: Yes ******** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: No Reviewer #2: N/A Reviewer #3: Yes ****** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ****** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: No Reviewer #3: Yes ****** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The irregular and complex nature of graph data poses substantial challenges that traditional methods struggle to handle. 
The authors contribute to addressing these challenges by proposing G-CutMix, an augmentation method based on the CutMix technique and designed specifically for bot detection in social networks. The research investigates the performance of various augmentation methods; the proposed one performs CutMix operations between the original graph and a shuffled graph, enhancing the robustness of bot detection models. The authors demonstrate that G-CutMix outperforms existing graph data augmentation techniques like DropEdge and MixupForGraph across various graph neural network architectures. The approach effectively mimics real-world scenarios, making it a powerful tool against sophisticated bot behaviors. The authors present G-CutMix as a significant advance in bot detection that leverages graph convolutional networks and innovative data augmentation techniques, with promising results. However, the reliance on specific hyperparameters, such as the adaptive threshold α, may limit the method's applicability in varying contexts or datasets. Additionally, the paper does not sufficiently address the computational complexity introduced by the G-CutMix method, which could impact its scalability in real-world applications. Authors could explore other advanced augmentation techniques like GraphSAGE or node feature perturbation, which could provide complementary benefits to G-CutMix. Additionally, the paper does not report whether techniques such as adversarial training or semi-supervised learning could also enhance the robustness of bot detection models. The potential of ensemble methods, which combine multiple models to improve prediction accuracy, is another avenue not considered in the paper. Furthermore, the use of temporal data analysis to track bot behavior over time could provide additional insights that G-CutMix does not address.
G-CutMix depends on the quality of the shuffled graph, which may introduce noise and affect the overall performance of the model. The impact of this could have been explored. The method's effectiveness is contingent on the chosen hyperparameters, which may require extensive tuning for different datasets, complicating its implementation. G-CutMix primarily focuses on user relationships, potentially overlooking other critical factors such as content analysis or user behavior patterns in bot detection. The results presented in the paper may not account for the variability in bot behavior across different social networks, which could skew the effectiveness of G-CutMix. There is a lack of detailed analysis on the impact of different graph structures on the performance of G-CutMix, which could reveal potential weaknesses in the method. Additionally, the paper does not sufficiently address the potential for overfitting, especially given the complexity introduced by the augmentation process. The write-up and structure of the paper require significant improvement: Figure 4 and Figure 5 appear in the conclusion section, and the relevant discussion is missing from the corresponding section of the paper. The related work section is insufficient; the authors need to present a more extensive review of the literature. The hyperparameters need to be clearly presented in a table. The models other than G-CutMix should be evaluated on the same dataset used for evaluating G-CutMix, which may help in further developing the contribution. The authors should make the suggested changes before the paper can be accepted for publication. Reviewer #2: This manuscript is lack of core novelty and overall poorly presented. Proposed section is not enough as per the standard of this journal. In addition, the results are not presented well and lack of validations.
Reviewer #3: This paper proposes a CutMix-based graph data augmentation method (G-CutMix) to improve the performance of bot detection in social networks. By integrating graph convolutional networks (GNNs) with graph mixing techniques, the authors introduce new node feature enhancement and attribute connection modules, demonstrating superior performance across multiple benchmark datasets compared to existing methods. Below are some minor comments: The proposed G-CutMix method effectively enhances GNN training by introducing a node mixing technique, particularly excelling in scenarios with small datasets, which is a commendable technical innovation. However, when compared with existing augmentation methods (e.g., MixupForGraph), the authors are encouraged to further analyze the theoretical advantages of G-CutMix beyond the experimental performance improvements. The tabular results are clear, but the visualizations (e.g., t-SNE plots) would benefit from detailed explanations of the differences in distributions across methods to enhance their persuasive power. The description of the methodology is detailed, but certain formulas (e.g., the choice of α in CutMix) and parameter settings (e.g., the edge dropout rate in DropEdge) lack a thorough explanation of their specific impact on performance. It is recommended to supplement experiments or analyses to increase the rigor of the descriptions. The structure of the paper is clear, and the language is precise, but some sections (e.g., the literature review in the introduction) could be further streamlined to improve the overall flow of the paper. Overall, this paper proposes a novel and effective graph data augmentation method with a certain degree of technical innovation, and its effectiveness is verified through comprehensive experiments. If the theoretical analysis and experimental discussions are supplemented and improved, the contribution of the paper will be more significant. ****** 6. 
PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: Shafqaat Ahmad Reviewer #2: No Reviewer #3: No ******** [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. https://doi.org/10.1371/journal.pone.0331978.r001
Revision 1
30 May 2025 Author Response

Manuscript ID: PONE-D-24-37596
Article Title: G-CutMix: a CutMix-based Graph Data Augmentation Method for Bot Detection in Social Networks

Reviewer#1, Concern#1: The reliance on specific hyperparameters, such as the adaptive threshold α, may limit the method's applicability in varying contexts or datasets. Additionally, the paper does not sufficiently address the computational complexity introduced by the G-CutMix method, which could impact its scalability in real-world applications.

Author response: We thank the reviewer for the time and effort of reviewing our manuscript.

Hyperparameter Sensitivity Analysis: In Section 5.2 (Parameter Sensitivity Analysis), we expanded our discussion of the hyperparameter α (CutMix ratio). Experiments in Figure 5 demonstrate that α = 0.3 achieves optimal performance across all datasets (MGTAB, Cresci-15, Twibot-20). For instance, on MGTAB, varying α from 0.1 to 0.9 results in accuracy fluctuations within ±1.5%, indicating robustness. We further provide general guidelines for α:
- For datasets with sparse graphs (e.g., Twibot-20), α ∈ [0.2, 0.4] is recommended.
- For dense graphs (e.g., MGTAB), α ∈ [0.3, 0.5] yields stable results.

Computational Complexity Analysis: A new subsection, Computational Complexity, quantifies the training time of G-CutMix compared to baseline methods. For example, on MGTAB with a GCN backbone, G-CutMix increases training time by 20% compared to vanilla GCN. These results confirm that G-CutMix’s performance gains outweigh its modest computational cost, making it suitable for real-world deployment.

Table 6. Computational complexity comparisons for different methods (s)

Method            MGTAB   Twibot-20   Cresci-15
GCN               16.11   18.92       9.35
SAGE              18.64   21.56       10.13
GAT               20.53   24.58       12.62
G-CutMix (GCN)    20.84   22.54       11.53
G-CutMix (SAGE)   22.36   24.84       12.43
G-CutMix (GAT)    23.62   27.15       12.98

________________________________________

Reviewer#1, Concern#2: Authors could explore other advanced augmentation techniques like GraphSAGE or node feature perturbation. Additionally, techniques such as adversarial training or semi-supervised learning could enhance robustness. The potential of ensemble methods is another avenue not considered.

Author response: We appreciate the reviewer’s suggestions for extending our work. Here are our clarifications:

GraphSAGE Integration: Our framework is backbone-agnostic and explicitly supports GraphSAGE (abbreviated as SAGE in Table 2 and Table 3). For instance, G-CutMix (SAGE) achieves 85.97% accuracy on MGTAB, outperforming vanilla SAGE by 4.87%. This demonstrates that G-CutMix effectively complements existing GNN architectures.

Adversarial Training and Semi-Supervised Learning: While these techniques are promising, they focus on training paradigms (e.g., robustness to perturbations, label efficiency), which are orthogonal to our core contribution: graph-specific data augmentation. We plan to explore this in future work.

Ensemble Methods: The baseline RF-GNN (Table 2) is an ensemble method combining multiple GNNs. G-CutMix (GAT) outperforms RF-GNN by 1.71% F1 on Twibot-20, showing that our augmentation strategy alone achieves competitive results without ensemble overhead.

________________________________________

Reviewer#1, Concern#3: The method's effectiveness is contingent on the quality of the shuffled graph, which may introduce noise. The impact of this should be explored.

Author response: We clarify that the shuffled graph generation process is structure-preserving (Definition 1), ensuring isomorphism with the original graph.
To validate this: Ablation Study (Table 5): Removing the shuffle module ("w/o shuffle") reduces accuracy on MGTAB by 2.02% (GCN) and 1.65% (GAT), demonstrating its necessity for diversity.

________________________________________

Reviewer#1, Concern#4: G-CutMix primarily focuses on user relationships, potentially overlooking content analysis or user behavior patterns.

Author response: Our framework explicitly incorporates behavioral and content features. For MGTAB, we use 20 user attributes (Section 4.1 Dataset), including:
- Behavioral: Post frequency, follower/following ratios, account age.
- Content: BERT embeddings of tweets, description length, verified status.

On Twibot-20, we include 17 attributes such as profile metadata and tweet semantics. These features are concatenated with graph embeddings before classification, ensuring a holistic representation (Section 3.3, Node Cutmix Module).

________________________________________

Reviewer#1, Concern#5: Lack of graph structure impact analysis and overfitting concerns.

Author response:

Graph Structure Analysis: Section 5.3 (CutMix for Heterogeneous Graphs) evaluates multi-relation graphs (Table 4). For example, G-CutMix+RGAT achieves 80.69% accuracy on Twibot-20 using both follower and friend relations, demonstrating adaptability to complex structures.

Overfitting Mitigation: Our shallow GNN design (2 layers) and early stopping (200 epochs) prevent overfitting. Validation curves show stable accuracy, with <1% gap between training and test performance. G-CutMix inherently diversifies training data, acting as a regularizer. For instance, on Cresci-15, G-CutMix reduces validation loss by 15% compared to vanilla GCN.

________________________________________

Reviewer#1, Concern#6: Structural issues: misplaced figures, insufficient literature review, hyperparameter clarity.
Author response:

Expanded Literature Review: The Related Work section now cites 12 additional papers, including recent advances in graph augmentation (e.g., GraphMix, G-Mixup) and bot detection (e.g., BotRGCN, SATAR). We also contrast G-CutMix with MixupForGraph, highlighting its advantages in preserving graph locality (Section 2.3 Data Augmentation).

Supplementary experimental analysis: The consistent superiority of G-CutMix-enhanced models (average F1 improvements of 1.91% on MGTAB, 1.95% on Cresci-15, and 3.82% on Twibot-20) stems from its ability to synergize follower and friend relations. Unlike baseline RGCN/RGAT that process relations sequentially, our method’s isomorphic shuffling and feature fusion create implicit cross-relational attention, for instance amplifying signals where follower-friend reciprocity indicates coordinated bot behavior. While both RGCN and RGAT benefit from G-CutMix, the greater improvements with RGAT highlight our method’s compatibility with attention mechanisms. The learnable merging weights in G-CutMix likely synergize with RGAT’s edge-specific attention, enabling adaptive reweighting of mixed features.

The ablation study results in Table 5 reveal critical interdependencies between G-CutMix's components and dataset characteristics. The most pronounced performance degradation occurs when removing the Attribute Connection Module (average 3.8% F1 drop across datasets), and it is particularly severe on Twibot-20 (6.2% accuracy decline for GCN), suggesting that social bots' attribute camouflage strategies, such as profile metadata manipulation, require explicit attribute correlation modeling to detect. Interestingly, while Node Shuffle removal impacts MGTAB most significantly (1.5-2.5% accuracy reduction), its effect diminishes on Twibot-20, where temporal behavioral patterns dominate, implying that structural isomorphism preservation becomes less critical when bots exhibit strong activity sequence signatures.
These findings collectively demonstrate that G-CutMix's power emerges from the synergistic combination of its components rather than any single module.

Misplaced figures: We have systematically readjusted the placement of figures and tables throughout the article to enhance the overall logical flow and visual consistency.

________________________________________

Reviewer#2, Concern#1: This manuscript is lack of core novelty and overall poorly presented.

Author response: We respectfully disagree but acknowledge the need to clarify our contributions.

Novelty: G-CutMix is the first method to adapt CutMix to graph-structured data for bot detection. Unlike image-based CutMix, we propose:
- Isomorphic Graph Shuffling to preserve structural integrity (Section 3.2).
- Attribute Connection Module to retain original node features post-mixing (Section 3.3).

These innovations are validated by outperforming MixupForGraph by 3.2% F1 on Twibot-20 (Table 2).

________________________________________

Reviewer#2, Concern#2: Proposed section is not enough as per the standard of this journal. In addition, the results are not presented well and lack of validations.

Author response: Your comment is very important to us. To our knowledge, G-CutMix is the first to adapt CutMix for graphs by combining feature mixing with isomorphic shuffling, addressing irregular graph structures.

Added Recent Works: We have added seven state-of-the-art references (e.g., BotRGCN, SATAR, and BotCL) to systematically integrate recent breakthroughs in graph neural network-based detection methodologies. The restructured introduction now establishes a coherent narrative flow, beginning with fundamental challenges in bot detection, progressing through critical limitations in existing approaches, and concluding with our novel technical framework.
Supplementary discussion: We have supplemented the background and motivation and the discussion of the t-SNE plot results, improved the discussion of the results in Tables 4 and 5, and added and explained the experiments on computational complexity.

________________________________________

Reviewer#3, Concern#1: The proposed G-CutMix method effectively enhances GNN training by introducing a node mixing technique, particularly excelling in scenarios with small datasets, which is a commendable technical innovation. However, when compared with existing augmentation methods (e.g., MixupForGraph), the authors are encouraged to further analyze the theoretical advantages of G-CutMix beyond the experimental performance improvements.

Author response: Thank you for highlighting this crucial aspect. We have expanded the theoretical analysis in Section 3.1 (Background and Motivation) to clarify the unique advantages of G-CutMix:

Locality Preservation via Binary Masking: Unlike MixupForGraph, which linearly interpolates node features, G-CutMix employs region-level feature swapping using a binary mask M (Equation 1). This ensures that local structural patterns (e.g., community-specific interactions) are preserved during augmentation, whereas linear interpolation may blur such patterns. For example, in social graphs, bots often exhibit localized behavioral anomalies (e.g., sudden spikes in follower requests). By retaining intact feature regions, G-CutMix helps GNNs capture these subtle signals more effectively.

________________________________________

Reviewer#3, Concern#2: The tabular results are clear, but the visualizations (e.g., t-SNE plots) would benefit from detailed explanations of the differences in distributions across methods to enhance their persuasive power.

Author response: Your comment is very important to us.
Supplementary discussion: Following the reviewer's suggestion, we have added the following discussion of the t-SNE plot results: “The embeddings obtained by the GCN model exhibit the highest degree of overlap in the feature space, as evidenced by the fact that most green points are occluded by orange points. DropEdge and MixupGCN demonstrate better distinguishability in the feature space compared to GCN. The embeddings generated by CutMix show the lowest overlap in the feature space and achieve the highest distinguishability.”

________________________________________

Reviewer#3, Concern#3: The description of the methodology is detailed, but certain formulas (e.g., the choice of α in CutMix) and parameter settings (e.g., the edge dropout rate in DropEdge) lack a thorough explanation of their specific impact on performance. It is recommended to supplement experiments or analyses to increase the rigor of the descriptions.

Author response: Your comment is very important to us. In Section 5.2 (Parameter Sensitivity Analysis), we expanded our discussion of the hyperparameter α (CutMix ratio). Experiments in Figure 5 demonstrate that α = 0.3 achieves optimal performance across all datasets (MGTAB, Cresci-15, Twibot-20). For instance, on MGTAB, varying α from 0.1 to 0.9 results in accuracy fluctuations within 1.5%, indicating robustness. We further provide general guidelines for α:
- For datasets with sparse graphs (e.g., Twibot-20), α ∈ [0.2, 0.4] is recommended.
- For dense graphs (e.g., MGTAB), α ∈ [0.3, 0.5] yields stable results.

Based on the reviewers' suggestions, we have supplemented the description of DropEdge parameters in this manuscript.

________________________________________

Reviewer#3, Concern#4: The structure of the paper is clear, and the language is precise, but some sections (e.g., the literature review in the introduction) could be further streamlined to improve the overall flow of the paper.
Overall, this paper proposes a novel and effective graph data augmentation method with a certain degree of technical innovation, and its effectiveness is verified through comprehensive experiments. If the theoretical analysis and experimental discussions are supplemented and improved, the contribution of the paper will be more significant.

Author response: Your comment is very important to us. We restructured the Introduction to enhance clarity and conciseness:

Added Recent Works: Incorporated 7 new references (e.g., BotRGCN, SATAR, BotCL) to reflect the latest advances in GNN-based bot detection. The revised introduction now progresses logically from problem statement to technical gaps and our solution.

________________________________________

Attachments
Attachment Submitted filename: Response to Reviewers.docx
https://doi.org/10.1371/journal.pone.0331978.r002
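The binary-mask feature swapping described in the response to Reviewer#3, Concern#1 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of a contiguous block of feature columns as the masked region, and the reading of α as the fraction of columns taken from the shuffled graph are all assumptions made for illustration. The second helper mirrors the stated purpose of the Attribute Connection Module (retaining original node features after mixing) by plain concatenation, which is likewise an assumption.

```python
import random


def cutmix_features(x, alpha, rng):
    """Swap a block of feature columns between each node and a paired node.

    x     : list of n feature rows, each a list of d floats
    alpha : assumed to be the fraction of columns taken from the paired node
    rng   : random.Random instance, so that runs are reproducible
    """
    n, d = len(x), len(x[0])

    # Shuffled graph: a permutation of node indices. Relabelling nodes this
    # way leaves the graph structure isomorphic to the original.
    perm = list(range(n))
    rng.shuffle(perm)

    # Binary mask M over feature columns (hypothetical contiguous region):
    # 1 keeps the node's own value, 0 takes the paired node's value.
    width = round(alpha * d)
    start = rng.randrange(d - width + 1)
    mask = [0 if start <= j < start + width else 1 for j in range(d)]

    # x_mix[i] = M * x[i] + (1 - M) * x[perm[i]], applied column by column.
    mixed = [
        [m * a + (1 - m) * b for m, a, b in zip(mask, x[i], x[perm[i]])]
        for i in range(n)
    ]
    return mixed, mask, perm


def attach_original(x, mixed):
    """Concatenate each node's original features with its mixed features."""
    return [orig + mix for orig, mix in zip(x, mixed)]


if __name__ == "__main__":
    features = [[float(i)] * 10 for i in range(5)]
    mixed, mask, perm = cutmix_features(features, 0.3, random.Random(0))
    print("mask:", mask)
    print("pairing:", perm)
    print("node 0 after mixing:", mixed[0])
```

Because each mixed entry is copied whole from one node or the other, no feature value is averaged. That is the contrast with linear interpolation drawn in the response: an interpolating scheme would replace every entry with a weighted blend of the two nodes.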
24 Aug 2025 Decision Letter - Filipi Silva, Editor

G-CutMix: a CutMix-based Graph Data Augmentation Method for Bot Detection in Social Networks PONE-D-24-37596R1

Dear Dr. Li,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. For questions related to billing, please contact billing support.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards, Filipi N. Silva Academic Editor PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1.
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #1: All comments have been addressed Reviewer #2: (No Response) ******** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Partly Reviewer #2: (No Response) ****** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: (No Response) ****** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: (No Response) ****** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. 
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: (No Response) ****** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: I recommend accepting the manuscript for publication, as the revisions comprehensively address all concerns, and the work presents a valuable and innovative contribution to graph-based bot detection. Reviewer #2: Authors well revised this version, I recommend to accept it in the current form. There are no more comments ****** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: SHAFQAAT AHMAD Reviewer #2: No ******** https://doi.org/10.1371/journal.pone.0331978.r003
Formally Accepted
Acceptance Letter - Filipi Silva, Editor

PONE-D-24-37596R1
PLOS ONE

Dear Dr. Li,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited
* All relevant supporting information is included in the manuscript submission
* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Filipi N. Silva
Academic Editor
PLOS ONE

https://doi.org/10.1371/journal.pone.0331978.r004

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.