Peer Review History

Original Submission: October 31, 2023
Decision Letter - Marcelo Arruda Nakazone, Editor

PONE-D-23-35647
The Potential of the Transformer-based Survival Analysis Model, SurvTrace, for Predicting Recurrent Cardiovascular Events and Stratifying High-risk Patients with Ischemic Heart Disease
PLOS ONE

Dear Dr. Kodera,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

ACADEMIC EDITOR:

The manuscript is interesting but will require further reworking and a major revision.

While the reviewers recognize the potential interest of the subject studied, they raised a number of important issues that need to be properly addressed.

==============================

Please submit your revised manuscript by Jan 28 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Marcelo Arruda Nakazone, M.D., Ph.D.

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The manuscript presents a retrospective cohort study aimed at assessing the precision of a survival analysis model for predicting cardiovascular events. It would be beneficial for the authors to align their reporting with the Tripod guidelines to enhance the clarity and reproducibility of their research.

In terms of methodology, the manuscript would benefit from explicitly stating any exclusion criteria that were applied to the study population. Furthermore, a rationale should be provided for the allocation of 10% of the dataset for validation purposes, with the remaining 90% utilized for the development and testing of the model.

Regarding the statistical analysis, it is crucial for the authors to specify the exact multiple imputation method employed. It is also advisable to make the Python code used for the analysis available, as this would greatly aid in the transparency and replicability of the study. Lastly, performing a sensitivity analysis to gauge the influence of missing data on the study's outcomes—comparing results from complete case analysis with those from multiple imputation—would significantly strengthen the findings.

Reviewer #2: I am grateful for the opportunity to review this manuscript evaluating the accuracy of a survival analysis algorithm that's based on the transformer deep learning model in predicting the development of major adverse cardiovascular events among patients who underwent percutaneous coronary intervention. The manuscript is well-written and easy to digest. However, I note some important comments below:

1. This analysis is based on data from a single healthcare system. Patients were not actively followed prospectively to evaluate study outcomes. Therefore, if a patient develops the outcome but they were not admitted at the hospital, they will likely be adjudged as not having the outcome because it was not observed in the hospital. This may introduce a huge amount of bias in outcome ascertainment. The authors need to develop some measures that will reduce this bias. For example, the authors can limit their analysis to established patients of the hospital (i.e. patients who are in a way loyal to the hospital). Established patients can be defined based on certain plausible criteria or based on previous studies.

2. Furthermore, the authors need to address or discuss how demographic changes such as emigration or even a patient moving away from the hospital service area can affect their ability to track outcome development among the patients.

3. Was there any attempt to ensure that the analysis cohort is made up of patients who are getting PCI for the first time? Otherwise, can the authors provide data on the proportion of included patients who have already received a previous PCI before the index event (per this study's criteria)?

4. In line 87, the authors mentioned that "Any variable exhibiting a Pearson’s correlation coefficient exceeding 0.90 was omitted from the set of explanatory variables used for model training." How did the authors decide which of the two variables with a correlation >0.9 between them is dropped from the analysis?

5. How did the authors arrive at the decision to use 90% of the dataset for training? Was there any form of sensitivity analysis done to arrive at the optimal data splitting ratio?

6. The authors have not presented the measures of the fitness of the deep learning models (For example, the test and training accuracy by epoch)

7. Model hyperparameter tuning were not discussed. How does a reader know that the final model is the best model?

8. The authors should ideally provide the codes used for their analysis.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Ang Yee Gary

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

Reviewer #1: The manuscript presents a retrospective cohort study aimed at assessing the precision of a survival analysis model for predicting cardiovascular events. It would be beneficial for the authors to align their reporting with the Tripod guidelines to enhance the clarity and reproducibility of their research.

We would like to express our sincere gratitude for your thorough and insightful review of our manuscript. Your constructive comments and suggestions are greatly appreciated and have provided valuable guidance in improving the quality and clarity of our research. We recognize the importance of adhering to established guidelines and methodological rigor, and we are grateful for the opportunity to improve our work based on your expert feedback.

After reviewing the TRIPOD guidelines and considering your comments, we have revised our manuscript to better meet these standards. This revision focused on providing a more detailed and transparent presentation of our methodology, data analysis, and results. By aligning our report with the TRIPOD guidelines, we have improved both the clarity and replicability of our research.

・In terms of methodology, the manuscript would benefit from explicitly stating any exclusion criteria that were applied to the study population.

Thank you for your valuable comments regarding clarification of the exclusion criteria applied to our study population. As this study enrolled all consecutive patients who underwent percutaneous coronary intervention (PCI) at our institution, we did not apply any specific exclusion criteria. To clarify this point, we have revised the manuscript on page 4, line 58 as follows: "This study involved consecutive enrollment of patients who underwent percutaneous coronary intervention (PCI) at the Department of Cardiovascular Medicine, University of Tokyo Hospital, between 2005 and 2019. Within this timeframe, the initial PCI performed at our hospital was designated as the index procedure for each individual patient and used for analysis." We believe this revision provides a clearer understanding of our study design and patient selection process.

・Furthermore, a rationale should be provided for the allocation of 10% of the dataset for validation purposes, with the remaining 90% utilized for the development and testing of the model.

Thank you for highlighting the need to provide a detailed rationale for the allocation of our dataset into training, validation, and test sets. In our study, we allocated 10% of the dataset as a test set and divided the remaining 90% into training and validation sets at a 3:1 ratio. This decision was guided by the requirements of the Transformer model used in our analysis, which requires a substantial amount of data for effective training while ensuring an adequate amount for unbiased testing.

To improve the clarity of our methodology, we have updated Figure 1. This revised figure now includes a note on the validation data set and illustrates the data partitioning process. The legend for Figure 1, found on page 11, line 198, has been revised to read: “This figure illustrates the flowchart of the study. Initially, all data were split into training and test datasets at a 9:1 ratio. To address missing values, multiple imputation was applied to both datasets, generating five pseudo-complete datasets for each. A separate 25% segment of the training dataset was reserved for validation. Subsequently, survival analysis was performed on each pseudo-complete dataset, and the c-index was calculated. Finally, Rubin’s rules were used to integrate the c-index values from each dataset to compute the final result. In the figure, yellow-green represents the data used for training the model, orange represents the validation data, and pink represents the data used for testing post-training.”
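The partitioning and pooling steps described in this legend can be sketched in Python as follows. This is a simplified illustration only, not the authors' actual code (which is provided as S1 File); the function names, the toy inputs, and the assumption that each dataset contributes a point estimate with a within-imputation variance are all ours.

```python
# Illustrative sketch: a 9:1 test split, a further 25% validation hold-out
# from the training portion, and Rubin's-rules pooling of per-imputation
# c-index estimates. Names and inputs are hypothetical, not from the paper.
import numpy as np
from sklearn.model_selection import train_test_split

def partition(records, seed=0):
    train_val, test = train_test_split(records, test_size=0.10, random_state=seed)
    train, val = train_test_split(train_val, test_size=0.25, random_state=seed)
    return train, val, test  # roughly 67.5% / 22.5% / 10% of the data

def rubin_pool(estimates, variances):
    """Pool m per-imputation estimates: the point estimate is the mean,
    and the total variance is the mean within-imputation variance plus
    the between-imputation variance inflated by (1 + 1/m)."""
    m = len(estimates)
    q_bar = float(np.mean(estimates))
    u_bar = float(np.mean(variances))
    b = float(np.var(estimates, ddof=1))
    return q_bar, u_bar + (1 + 1 / m) * b

train, val, test = partition(list(range(100)))
pooled_c, pooled_var = rubin_pool([0.70, 0.72, 0.68, 0.71, 0.69], [0.01] * 5)
```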

In addition, to thoroughly address concerns about the impact of the percentage of test data on our results, we performed extensive sensitivity analyses with different allocations of the test set, including a scenario where 20% of the dataset was used for testing. The methodology and results of these analyses are carefully detailed in our new manuscript. Specifically, on page 7, line 120, the text reads: "To assess the robustness of our findings, we performed three distinct sensitivity analyses: first, by omitting missing values; second, by adjusting the percentage of test sets; and third, by excluding patients with a history of PCI." This section outlines the steps taken in our sensitivity analyses. Further, the results of these analyses are elaborated on page 11, line 188, with the statement: “The second sensitivity analysis involved adjusting the proportion of the test dataset to 20%. Following this modification, the analysis was performed using one of the five pseudo-complete datasets generated by the multiple imputation method, including both training and test datasets. This adjustment yielded a c-index of 0.68 for SurvTrace and 0.66 for the conventional scoring system.”

・Regarding the statistical analysis, it is crucial for the authors to specify the exact multiple imputation method employed. It is also advisable to make the Python code used for the analysis available, as this would greatly aid in the transparency and replicability of the study. Lastly, performing a sensitivity analysis to gauge the influence of missing data on the study's outcomes—comparing results from complete case analysis with those from multiple imputation—would significantly strengthen the findings.

We very much appreciate your comments on our statistical analysis, in particular the need to specify the multiple imputation method used and the importance of transparency and replicability in our study.

First, we have addressed your comment about specifying the multiple imputation method. We have updated our manuscript on page 5, line 85, to clearly state: “In this study, we used Python to generate five pseudo-complete datasets, applying multiple imputations using the Bayesian Ridge method (S1 File).” In addition, to increase transparency and facilitate replication of our study, we have included the Python script used for the analysis as S1 file.
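For readers unfamiliar with this approach, the imputation step described above can be sketched with scikit-learn's `IterativeImputer` wrapping a Bayesian Ridge estimator. This is a minimal illustration under our own assumptions (toy column names, five posterior draws), not the script supplied as S1 File.

```python
# Sketch of multiple imputation with Bayesian Ridge: drawing from the
# posterior with different seeds yields distinct pseudo-complete datasets.
# Column names and data are illustrative, not from the study.
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import BayesianRidge

def impute_datasets(df, n_datasets=5):
    """Generate n pseudo-complete copies of df by stochastic imputation."""
    datasets = []
    for seed in range(n_datasets):
        imputer = IterativeImputer(
            estimator=BayesianRidge(),
            sample_posterior=True,  # sample imputations rather than use the mean
            random_state=seed,
        )
        filled = imputer.fit_transform(df)
        datasets.append(pd.DataFrame(filled, columns=df.columns))
    return datasets

toy = pd.DataFrame({"age": [64, 71, np.nan, 58], "ldl": [120, np.nan, 95, 140]})
imputed = impute_datasets(toy)
```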

Furthermore, per your suggestion for a sensitivity analysis to assess the influence of missing data on the results of our study, we have performed such analyses. On page 7, line 120, the following has been added: “To assess the robustness of our findings, we performed three distinct sensitivity analyses: first, by omitting missing values; second, by adjusting the percentage of test sets; and third, by excluding patients with a history of PCI.” In addition, the results of the first sensitivity analysis are elaborated on page 11, line 185: “In the first sensitivity analysis, cases with missing values were excluded from both training and test datasets. Post-exclusion, the training dataset comprised 2137 cases, and the test dataset contained 254 cases. The c-index for SurvTrace was 0.71, compared with 0.66 for the conventional scoring system.” This analysis confirms that the overall trends in our results remain consistent even when accounting for missing data.

Reviewer #2: I am grateful for the opportunity to review this manuscript evaluating the accuracy of a survival analysis algorithm that's based on the transformer deep learning model in predicting the development of major adverse cardiovascular events among patients who underwent percutaneous coronary intervention. The manuscript is well-written and easy to digest. However, I note some important comments below:

We would like to express our deepest gratitude for your time and effort in reviewing our manuscript. We are particularly grateful for your positive comments on the clarity and readability of our work. It is encouraging to hear that our manuscript, which evaluates the accuracy of a survival analysis algorithm based on the Transformer deep learning model in predicting major adverse cardiovascular events in patients undergoing percutaneous coronary intervention, has been well received. We appreciate your constructive feedback and are committed to addressing your concerns and suggestions.

1. This analysis is based on data from a single healthcare system. Patients were not actively followed prospectively to evaluate study outcomes. Therefore, if a patient develops the outcome but they were not admitted at the hospital, they will likely be adjudged as not having the outcome because it was not observed in the hospital. This may introduce a huge amount of bias in outcome ascertainment. The authors need to develop some measures that will reduce this bias. For example, the authors can limit their analysis to established patients of the hospital (i.e. patients who are in a way loyal to the hospital). Established patients can be defined based on certain plausible criteria or based on previous studies.

We sincerely appreciate your critical observation regarding the potential bias in outcome ascertainment due to the retrospective nature of our study and the reliance on data from a single healthcare system. As you rightly point out, there is indeed a possibility that not all events were captured, particularly if they occurred outside the hospital or if patients were not admitted to our hospital for subsequent care.

In response to your insightful suggestion, we have made significant additions to our manuscript to address this issue. Specifically, we have added a statement on page 10, line 169, that reads: “During the observation period, 683 subjects (17.3%) were lost to follow-up, including 610 cases in the training dataset and 73 cases in the test dataset.” This addition is intended to provide transparency regarding the extent of follow-up within our study population.

In addition, we have further acknowledged this limitation and the potential for selective bias in our results on page 19, line 299, in the Limitations section: “Fourth, this study was retrospective in nature, with events meticulously tracked in the EHRs. Despite this thorough tracking, some events might have been overlooked as a result of patients relocating or transferring to other hospitals, potentially leading to selective bias. To mitigate this issue, future prospective studies employing survival analysis with the Transformer model are necessary.”

We believe these changes will help provide a more comprehensive understanding of the scope and limitations of our study.

2. Furthermore, the authors need to address or discuss how demographic changes such as emigration or even a patient moving away from the hospital service area can affect their ability to track outcome development among the patients.

Thank you for your important observation regarding the impact of demographic changes, such as migration or patients moving out of the hospital's service area, on our ability to track patient outcomes. As mentioned in response to point 1, we acknowledge that as a single-center, retrospective study, there is indeed a risk of selective bias.

Given your comment, we extended the limitations section of our manuscript as mentioned above. On page 19, line 299, we added: “Fourth, this study was retrospective in nature, with events meticulously tracked in the EHRs. Despite this thorough tracking, some events might have been overlooked as a result of patients relocating or transferring to other hospitals, potentially leading to selective bias. To mitigate this issue, future prospective studies employing survival analysis with the Transformer model are necessary.”

3. Was there any attempt to ensure that the analysis cohort is made up of patients who are getting PCI for the first time? Otherwise, can the authors provide data on the proportion of included patients who have already received a previous PCI before the index event (per this study's criteria)?

Thank you for raising the important question regarding the inclusion criteria of our patient cohort, especially concerning those who have previously undergone percutaneous coronary intervention (PCI).

In this study, we consecutively enrolled patients who underwent PCI at our hospital. For purposes of analysis, the first PCI performed at our institution was considered the index PCI for each patient. Therefore, our analysis cohort included patients with a history of PCI prior to the index PCI at our institution. Page 4, line 58 reads “This study involved consecutive enrollment of patients who underwent percutaneous coronary intervention (PCI) at the Department of Cardiovascular Medicine, University of Tokyo Hospital, between 2005 and 2019. Within this timeframe, the initial PCI performed at our hospital was designated as the index procedure for each individual patient and used for analysis.”

In response to your insightful suggestion, we have conducted a sensitivity analysis to evaluate the impact of including patients with a prior history of PCI. On page 7, line 120, the following has been added: “To assess the robustness of our findings, we performed three distinct sensitivity analyses: first, by omitting missing values; second, by adjusting the percentage of test sets; and third, by excluding patients with a history of PCI.” In addition, the results of the sensitivity analysis are elaborated on page 11, line 192: “In the final sensitivity analysis, after excluding patients with a history of PCI from one of the five pseudo-complete training and test datasets, the c-index for SurvTrace was 0.69, compared with 0.63 for the conventional scoring system.”

4. In line 87, the authors mentioned that "Any variable exhibiting a Pearson’s correlation coefficient exceeding 0.90 was omitted from the set of explanatory variables used for model training." How did the authors decide which of the two variables with a correlation >0.9 between them is dropped from the analysis?

We apologize for any lack of clarity in our manuscript regarding the methodology for handling highly correlated variables. We appreciate your question, which has highlighted an area in need of further explanation.

To address the issue of multicollinearity in our dataset, we referred to the methodology outlined in reference 14. Based on this, when two features exhibited a Pearson’s correlation coefficient exceeding 0.90, we removed the one with the highest overall correlation to all other features.

To provide greater clarity on this aspect, we have revised our manuscript accordingly. On page 6, line 91, we have added the following clarification: "In cases where two features were highly correlated, the one with the greater overall correlation to all features was eliminated."
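The filter just described can be sketched as follows. This is our own minimal illustration of the rule (the authors' actual code is provided as S1 File), with hypothetical column names: among any pair with |r| > 0.90, the member with the greater mean absolute correlation to all remaining features is dropped.

```python
# Sketch of the correlation filter: iteratively remove, from each highly
# correlated pair, the feature most correlated with everything else.
import numpy as np
import pandas as pd

def drop_highly_correlated(df, threshold=0.90):
    cols = list(df.columns)
    while True:
        corr = df[cols].corr().abs().to_numpy()
        np.fill_diagonal(corr, 0.0)  # ignore self-correlation
        if corr.max() <= threshold:
            return df[cols]
        i, j = np.unravel_index(np.argmax(corr), corr.shape)
        # drop the member of the pair with the higher overall correlation
        victim = i if corr[i].mean() >= corr[j].mean() else j
        cols.pop(victim)

features = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [1, 2, 3, 4, 5, 7],   # nearly identical to x (|r| > 0.9)
    "z": [5, 1, 4, 2, 3, 1],   # only weakly correlated with x and y
})
reduced = drop_highly_correlated(features)
```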

5. How did the authors arrive at the decision to use 90% of the dataset for training? Was there any form of sensitivity analysis done to arrive at the optimal data splitting ratio?

Thank you for highlighting the need to provide a detailed rationale for the allocation of our dataset into training, validation, and test sets. In our study, we allocated 10% of the dataset as a test set and divided the remaining 90% into training and validation sets at a 3:1 ratio. This decision was guided by the requirements of the Transformer model used in our analysis, which requires a substantial amount of data for effective training while ensuring an adequate amount for unbiased testing.

To improve the clarity of our methodology, we have updated Figure 1. This revised figure now includes a note on the validation data set and illustrates the data partitioning process. The legend for Figure 1, found on page 11, line 198, has been revised to read: “This figure illustrates the flowchart of the study. Initially, all data were split into training and test datasets at a 9:1 ratio. To address missing values, multiple imputation was applied to both datasets, generating five pseudo-complete datasets for each. A separate 25% segment of the training dataset was reserved for validation. Subsequently, survival analysis was performed on each pseudo-complete dataset, and the c-index was calculated. Finally, Rubin’s rules were used to integrate the c-index values from each dataset to compute the final result. In the figure, yellow-green represents the data used for training the model, orange represents the validation data, and pink represents the data used for testing post-training.”

Attachments
Attachment
Submitted filename: Respons_To_Reviewers_20240124.docx
Decision Letter - Marcelo Arruda Nakazone, Editor

PONE-D-23-35647R1
The Potential of the Transformer-based Survival Analysis Model, SurvTrace, for Predicting Recurrent Cardiovascular Events and Stratifying High-risk Patients with Ischemic Heart Disease
PLOS ONE

Dear Dr. Kodera,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by May 11 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Marcelo Arruda Nakazone, M.D., Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments:

Previous comments have been considered; nevertheless, the manuscript requires minor revisions.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Thank you for addressing my concerns.

The outcome occurs in 8% of the population.

How would you manage this imbalance in outcomes?

Reviewer #2: The authors have addressed all my concerns. I just have two suggestions.

1. The authors should consider removing the word "retrospectively" and replacing "post-index" with "post-index event" in the following sentence (in the abstract):

"Each patient’s initial PCI at our hospital was designated as the index procedure, and a composite of major adverse cardiovascular events (MACE) was retrospectively monitored for up to two years post-index."

2. The authors should also consider replacing "selective bias" with "selection bias."

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Ang Yee Gary

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 2

Reviewer #1: Thank you for addressing my concerns.

The outcome occurs in 8% of the population.

How would you manage this imbalance in outcomes?

Thank you for your valuable comments regarding the imbalanced data on outcomes. We appreciate your input, as the issue of imbalanced data is one of the critical challenges in developing machine learning models.

In this study, we did not employ oversampling or undersampling methods to address the imbalanced data for the following reasons:

Undersampling method

The undersampling method reduces the number of majority class instances to match the number of minority class instances, which decreases the overall sample size. SurvTRACE, the analysis method used in this study, is based on the Transformer architecture and requires a relatively large sample size. We were concerned that a reduction in sample size might impair the performance of SurvTRACE; therefore, we decided against adopting the undersampling method.
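For illustration only, the sample-size concern can be sketched with simple arithmetic. The cohort size below is a hypothetical round number, not the study's actual n; only the 8% event rate comes from the study:

```python
# Illustrative sketch: how 1:1 undersampling shrinks a cohort with a
# rare outcome. The 10,000-patient cohort is a hypothetical example;
# the 8% event rate matches the rate reported in the study.
def undersampled_size(n_total: int, event_rate: float) -> int:
    """Rows kept after matching the majority-class count to the minority-class count."""
    n_events = round(n_total * event_rate)
    # keep all events plus an equal number of randomly chosen non-events
    return 2 * n_events

# With an 8% event rate, 1:1 undersampling retains only 16% of the data.
print(undersampled_size(10_000, 0.08))  # -> 1600
```

With an 8% event rate, balancing the classes this way would discard 84% of the cohort, which is why we judged the sample-size cost too high for a Transformer-based model.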

Oversampling method and SMOTE

The oversampling method and SMOTE generate synthetic data from minority class instances (SMOTE by interpolating between neighboring minority class instances). In this study, we used a multiple imputation method to handle missing values and generate pseudo-complete data for analysis. We were concerned that creating additional synthetic data on top of the already imputed data might yield a noisier dataset. Consequently, we decided against adopting these methods. Moreover, the report by Wallace et al. (2011 IEEE 11th International Conference on Data Mining, Vancouver, BC, Canada, pp. 754-763, doi: 10.1109/ICDM.2011.33) indicates that applying SMOTE to imbalanced datasets with an outcome incidence rate of 5-10% and a dimensionality exceeding 100 may not lead to significant improvements in accuracy. The outcome incidence rate in our study is 8%, and the dimensionality is 171, which falls within the range indicated in Wallace et al.'s report.

Based on the above reasons, we decided not to adopt oversampling or undersampling methods for the imbalanced data in this study and instead performed the analysis using the original dataset as is.

However, as you have rightly pointed out, addressing the imbalanced data problem is crucial for improving the performance of machine learning models. In future research, we will strive to explore more effective methods for dealing with imbalanced data and aim to enhance the accuracy of the models.

We sincerely appreciate your valuable comments, which we believe will contribute to the improvement of our research.

Reviewer #2: The authors have addressed all my concerns. I just have two suggestions.

1. The authors should consider removing the word "retrospectively" and replacing "post-index" with "post-index event" in the following sentence (in the abstract):

"Each patient’s initial PCI at our hospital was designated as the index procedure, and a composite of major adverse cardiovascular events (MACE) was retrospectively monitored for up to two years post-index."

2. The authors should also consider replacing "selective bias" with "selection bias."

We greatly appreciate your valuable suggestions for improving our manuscript. We have carefully considered your comments and made the following changes:

Regarding your first suggestion, we agree that removing the word "retrospectively" and replacing "post-index" with "post-index event" will enhance the clarity of the sentence. We have revised the sentence in the abstract as follows:

Original (Page 1, Lines 12-14): "Each patient's initial PCI at our hospital was designated as the index procedure, and a composite of major adverse cardiovascular events (MACE) was retrospectively monitored for up to two years post-index."

Revised (Page 1, Lines 12-14): "Each patient's initial PCI at our hospital was designated as the index procedure, and a composite of major adverse cardiovascular events (MACE) was monitored for up to two years post-index event."

As per your second suggestion, we have replaced the term "selective bias" with "selection bias" throughout the manuscript to ensure the correct terminology is used.

Original (Page 19, Lines 302-304): "Despite this thorough tracking, some events might have been overlooked as a result of patients relocating or transferring to other hospitals, potentially leading to selective bias."

Revised (Page 19, Lines 302-304): "Despite this thorough tracking, some events might have been overlooked as a result of patients relocating or transferring to other hospitals, potentially leading to selection bias."

We believe that these changes, based on your insightful comments, will significantly improve the quality and readability of our manuscript. We sincerely appreciate the time and effort you have taken to review our work and provide such constructive comments.

Thank you once again for your valuable suggestions.

Attachments
Attachment
Submitted filename: Respons_To_Reviewers_20240424.docx
Decision Letter - Marcelo Arruda Nakazone, Editor

The Potential of the Transformer-based Survival Analysis Model, SurvTrace, for Predicting Recurrent Cardiovascular Events and Stratifying High-risk Patients with Ischemic Heart Disease

PONE-D-23-35647R2

Dear Dr. Kodera,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the 'Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Marcelo Arruda Nakazone, M.D., Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Please kindly add unbalanced dataset as one of the limitation as this can help improves the manuscript.

Reviewer #2: The authors have addressed all my comments. As mentioned in my earlier comments, the most significant limitation of this study is that it relied on data from a single institution, which may make the authors unable to identify outcome events when patients seek care in other hospitals. The authors have made efforts to address this limitation. I am not fully confident that these measures will significantly eliminate the bias created by this situation. Barring this limitation, the article is technically sound.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Dr Ang Yee Gary

Reviewer #2: No

**********

Formally Accepted
Acceptance Letter - Marcelo Arruda Nakazone, Editor

PONE-D-23-35647R2

PLOS ONE

Dear Dr. Kodera,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Professor Marcelo Arruda Nakazone

Academic Editor

PLOS ONE

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.