Peer Review History

Original Submission: February 23, 2023
Decision Letter - Rittal Mehta, Editor

PONE-D-23-05410

Synthetic Data in Cancer and Cerebrovascular Disease Research: A Novel Approach to Big Data

Synthetic Data in Cancer/Stroke Research

PLOS ONE

Dear Dr. Lun,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jul 08 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Rittal Mehta

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

3. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

4. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please delete it from any other section.

5. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

6. We note that you have referenced (Lun et al, unpublished data, manuscript embargo) which has currently not yet been accepted for publication. Please remove this from your References and amend this to state in the body of your manuscript: (ie “Bewick et al. [Unpublished]”) as detailed online in our guide for authors

http://journals.plos.org/plosone/s/submission-guidelines#loc-reference-style

Additional Editor Comments:

In the current manuscript, the authors’ objectives were two-fold: 1) to compare key differences in demographics, acute treatments, hospital length of stay, and costs between cancer patients with ischemic stroke and non-cancer patients with ischemic stroke using synthetic data produced by MDClone, and 2) to validate the use of synthetic data in cancer and stroke research by comparing key statistical properties between the synthetic dataset and the source dataset from which the synthetic data originates.

The reviewer finds the use of MDClone fascinating and recognizes its value for feasibility analysis and hypothesis generation. However, the reviewer is concerned about the conclusion for objective 1 in the absence of any multivariable analysis. The reviewer also noticed there were no p-values in any of the tables. The authors are advised to perform multivariable analyses to quantify the association of the presence of cancer with outcomes of interest among patients with ischemic stroke, and to add p-values to both tables in their univariate analysis.

To analyze the data and answer the questions stated in the aim of the manuscript, appropriate bivariate and multivariable analyses need to be performed.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In the current manuscript, the authors’ objectives were two-fold: 1) to compare key differences in demographics, acute treatments, hospital length of stay, and costs between cancer patients with ischemic stroke and non-cancer patients with ischemic stroke using synthetic data produced by MDClone, and 2) to validate the use of synthetic data in cancer and stroke research by comparing key statistical properties between the synthetic dataset and the source dataset from which the synthetic data originates.

The reviewer finds the use of MDClone fascinating and recognizes its value for feasibility analysis and hypothesis generation. However, the reviewer is concerned about the conclusion for objective 1 in the absence of any multivariable analysis. The reviewer also noticed there were no p-values in any of the tables. The authors are advised to perform multivariable analyses to quantify the association of the presence of cancer with outcomes of interest among patients with ischemic stroke, and to add p-values to both tables in their univariate analysis.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

Response to Reviewers – PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

3. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

We have included both synthetic datasets: one for patients with cancer and a history of ischemic stroke, and one for patients with ischemic stroke and no history of cancer.

4. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please delete it from any other section.

5. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

6. We note that you have referenced (Lun et al, unpublished data, manuscript embargo) which has currently not yet been accepted for publication. Please remove this from your References and amend this to state in the body of your manuscript: (ie “Bewick et al. [Unpublished]”) as detailed online in our guide for authors

http://journals.plos.org/plosone/s/submission-guidelines#loc-reference-style

These have all been corrected, thank you.

Reviewer #1: In the current manuscript, the authors’ objectives were two-fold: 1) to compare key differences in demographics, acute treatments, hospital length of stay, and costs between cancer patients with ischemic stroke and non-cancer patients with ischemic stroke using synthetic data produced by MDClone, and 2) to validate the use of synthetic data in cancer and stroke research by comparing key statistical properties between the synthetic dataset and the source dataset from which the synthetic data originates.

The reviewer finds the use of MDClone fascinating and recognizes its value for feasibility analysis and hypothesis generation. However, the reviewer is concerned about the conclusion for objective 1 in the absence of any multivariable analysis. The reviewer also noticed there were no p-values in any of the tables. The authors are advised to perform multivariable analyses to quantify the association of the presence of cancer with outcomes of interest among patients with ischemic stroke, and to add p-values to both tables in their univariate analysis.

Thank you for your review. We have added p-values to Tables 1 and 2 and described in our Methods section the statistical tests that were utilized to obtain the p-values.

We have also performed a multivariable analysis quantifying the association between cancer and the outcome of recurrent ischemic stroke:

Results:

Using the synthetic dataset produced by MDClone, we identified 5 predictors of the primary outcome “recurrent ischemic stroke after the reference event” with a binary logistic regression model. Older age, a history of ischemic stroke, and dyslipidemia were found to be positive predictors of the outcome, while atrial fibrillation and a history of venous thromboembolism were negative predictors (Table 3). Using the original dataset, we identified 3 predictors of recurrent stroke. Atrial fibrillation and venous thromboembolism were again found to be negative predictors of the outcome, with adjusted odds ratios similar to those from the synthetic data (aOR 0.67 [95%CI 0.49 – 0.91] from synthetic data compared to 0.71 [95%CI 0.52 – 0.97] from original data for atrial fibrillation; aOR 0.51 [95%CI 0.31 – 0.85] from synthetic data compared to 0.43 [95%CI 0.25 – 0.72] from original data for venous thromboembolism). Dyslipidemia was similarly identified as a positive predictor of the outcome (aOR 1.89 [95%CI 1.09 – 3.27] from synthetic data compared to 1.82 [95%CI 1.06 – 3.13] from original data). In the original dataset, there was no significant association between age or prior ischemic stroke and the outcome: in exploratory analyses, the p-values from their likelihood ratio tests were >0.1, and therefore they were not included in the final regression model. The Cox & Snell R² values for both final logistic regression models were low (0.026 for the synthetic dataset and 0.017 for the original dataset), suggesting that only approximately 2.6% of the variation in our primary outcome can be attributed to the predictors in the synthetic-data model.
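For readers who wish to check the agreement between the two models, the following is a minimal Python sketch, not the authors' analysis code. It takes the adjusted odds ratios and 95% confidence intervals reported above, backs out the approximate standard error of each log odds ratio (assuming Wald-type intervals, which the response does not state), and tests whether each point estimate falls inside the other dataset's interval. The function names and the `cox_snell_r2` helper are illustrative assumptions; the helper implements the standard Cox & Snell formula from the null and fitted log-likelihoods, which are not reported in the response.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal

def log_or_and_se(aor, lower, upper):
    """Log odds ratio and approximate SE implied by a reported aOR and 95% CI.

    Assumes a Wald-type interval, i.e. exp(beta +/- Z_95 * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(aor), se

def cox_snell_r2(ll_null, ll_model, n):
    """Cox & Snell pseudo R-squared from null and fitted log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

def estimates_agree(a, b):
    """True if each point estimate lies inside the other estimate's CI."""
    return b[1] <= a[0] <= b[2] and a[1] <= b[0] <= a[2]

# (aOR, lower 95% CI, upper 95% CI) as reported in the Results text above.
reported = {
    "atrial fibrillation": {"synthetic": (0.67, 0.49, 0.91),
                            "original": (0.71, 0.52, 0.97)},
    "venous thromboembolism": {"synthetic": (0.51, 0.31, 0.85),
                               "original": (0.43, 0.25, 0.72)},
    "dyslipidemia": {"synthetic": (1.89, 1.09, 3.27),
                     "original": (1.82, 1.06, 3.13)},
}

for name, est in reported.items():
    beta, se = log_or_and_se(*est["synthetic"])
    agree = estimates_agree(est["synthetic"], est["original"])
    print(f"{name}: synthetic log-OR {beta:.3f} (SE {se:.3f}), agree={agree}")
```

This overlap check is only a rough descriptive comparison. The synthetic data are derived from the original data, so the two estimates are not independent and a formal test of their difference would need a different approach.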

Discussion:

In the multivariable model, three of the five predictors were identified with similar odds ratios in both datasets, but two were not identified in the real patient dataset. Several studies have been published using synthetic data produced by MDClone, suggesting growing recognition and support for its use amongst the scientific community.8,33 We were able to find three previous studies that compared synthetic data produced by MDClone to original data.7,9,34 All three studies found that results derived from synthetic data were representative of real data in terms of basic descriptive statistical properties. For variables with large patient numbers, the results were highly accurate and consistent with the original data, but in the context of large missingness of data or small patient numbers, the results of synthetic data were less reliable.7,34 One study found that for smaller population studies that evaluated confounders and effect modifiers in multivariable regression models, clear trends were still correctly observed with synthetic data, although the predictions were of moderate accuracy compared to original data.7 This is in line with the findings from our current study. We believe that the reason previous ischemic stroke was found to be a predictor with synthetic data but not the real dataset is related to the low proportion of patients with this diagnosis: only 22 patients were identified to have a history of ischemic stroke in both the synthetic and original datasets. The high missingness of data would render this covariate unreliable, which is further supported by the wide confidence interval for its adjusted odds ratio (aOR 2.57, 95%CI 1.07 – 6.18). The other covariate that was identified with the synthetic dataset but not the original dataset was age categorized by decades, which may be related to skewing of data caused by the automatic removal of outliers by MDClone (Figure 1) and the proximity of the lower 95% CI to 1.
A previous study also found that low sample size, highly irregular distributions, and high sparsity of data can all affect the data synthesis process and the interpretability of synthetic data.3

Decision Letter - Luiz Sérgio Fernandes de Carvalho, Editor

PONE-D-23-05410R1

Synthetic data in cancer and cerebrovascular disease research: a novel approach to big data

Synthetic data in cancer/stroke research

PLOS ONE

Dear Dr. Lun,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. The manuscript has improved significantly since the last review, but there are still queries to be answered, as reported by Reviewer #2.

Please submit your revised manuscript by Oct 26 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Luiz Sérgio Fernandes de Carvalho, PhD, MSc, MD

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have addressed the comments from the reviewer, and the manuscript may now be accepted. The authors are to be commended for this manuscript.

Reviewer #2: Thank you for the opportunity to review. This is a truly novel and interesting manuscript.

Minor comments and questions to the authors:

1. CCI in outcome analyses: since you're investigating cancer patients and the CCI contains a cancer variable, have you adjusted the CCI by removing the cancer component and recalculating it?

2. Looking at your original dataset time horizon (2002-2019), which is very long, have you considered adjusting costs for inflation in your analyses to make comparisons valid? Current cost estimates presented in the manuscript should be adjusted for inflation.

3. How was stroke encounter defined, and how would the audience know whether the cost of an encounter can truly be associated with stroke? Also, how were indirect costs calculated?

Thank you!

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Mohammed Ali Alvi

Reviewer #2: Yes: Jan Sieluk

**********

Revision 2

Reviewer #1: The authors have addressed the comments from the reviewer, and the manuscript may now be accepted. The authors are to be commended for this manuscript.

Thank you for taking the time to review our manuscript.

Reviewer #2: Thank you for the opportunity to review. This is a truly novel and interesting manuscript.

Minor comments and questions to the authors:

1. CCI in outcome analyses: since you're investigating cancer patients and the CCI contains a cancer variable, have you adjusted the CCI by removing the cancer component and recalculating it?

1. Thank you for this suggestion. The Charlson Comorbidity Index was automatically calculated by MDClone and therefore the cancer variable was not removed and re-calculated. This limitation has been added to the discussion section of our manuscript: “There were also some automatically calculated variables (i.e. the Charlson Comorbidity Index), where the individual components of the score could not be individually analyzed.”
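To illustrate what the reviewer's suggestion would involve if component-level flags were available (the authors state they were not accessible in MDClone), the sketch below removes the cancer contribution from a total Charlson score. It assumes the original Charlson weights (2 points for any malignancy, 6 for metastatic solid tumor) and a hierarchical scoring in which a patient with metastatic disease is credited only the higher weight. Other CCI versions use different weights, and the function name is hypothetical.

```python
def cci_without_cancer(total_cci, any_malignancy, metastatic_solid_tumor):
    """Return the Charlson score with the cancer component removed.

    Assumes original Charlson weights and hierarchical cancer scoring:
    metastatic solid tumor = 6 points, otherwise any malignancy = 2 points.
    """
    if metastatic_solid_tumor:
        cancer_points = 6
    elif any_malignancy:
        cancer_points = 2
    else:
        cancer_points = 0
    adjusted = total_cci - cancer_points
    if adjusted < 0:
        # The flags imply more cancer points than the total score contains.
        raise ValueError("cancer flags are inconsistent with the total score")
    return adjusted

# Hypothetical patient: total CCI of 5 including a non-metastatic malignancy.
print(cci_without_cancer(5, any_malignancy=True, metastatic_solid_tumor=False))
```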

2. Looking at your original dataset time horizon (2002-2019), which is very long, have you considered adjusting costs for inflation in your analyses to make comparisons valid? Current cost estimates presented in the manuscript should be adjusted for inflation.

Thank you for this suggestion. We have broken down the median cost associated with encounters by year and adjusted each year’s median cost for inflation based on the 2022 Canadian consumer price index (Statistics Canada. Table 18-10-0005-01 Consumer Price Index, annual average, not seasonally adjusted). We have presented these data in the supplemental materials document S4 table. Pooling of the median values from each year was not undertaken, as medians are not amenable to further mathematical calculations. However, we have also calculated the mean cost associated with encounters from each year and presented the original mean by year as well as the inflation-adjusted cost (adjusted to 2022), and presented the total mean sum for the cancer cohort (S4 Table A) and the non-cancer cohort (S4 Table B). The original mean for the cancer cohort was $15,314.50 and the inflation-adjusted cost was $20,686.29. By comparison, the original mean for the non-cancer cohort was $14,410.78, and after adjusting for inflation the 2022 equivalent would be $17,295.31. Our conclusions are therefore similar: the cancer cohort has higher costs associated with their encounters compared to the non-cancer cohort. We did not present these inflation-adjusted means in the main text of our manuscript because cost was not a normally distributed value, and therefore presentation of the original median values would be more statistically correct. We have also added this to the text of our main manuscript: “We also calculated the median and average costs associated with stroke encounters for each year between 2005 – 2019 and adjusted the costs for inflation based on the 2022 Canadian consumer price index.28 The average costs associated with encounters in the cancer and stroke cohort were consistently higher compared to encounters in the non-cancer and stroke cohort (Supplemental Materials S4 Table).”
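The per-year adjustment described above can be sketched as follows. This is a minimal illustration, not the authors' calculation: it assumes the usual ratio method (cost multiplied by the target-year index and divided by the cost-year index), and every CPI value and cost in the example is a made-up placeholder rather than a Statistics Canada figure or a study result.

```python
def adjust_for_inflation(cost, cpi_cost_year, cpi_target_year):
    """Express a cost in target-year dollars using the ratio of CPI values."""
    if cpi_cost_year <= 0:
        raise ValueError("CPI for the cost year must be positive")
    return cost * cpi_target_year / cpi_cost_year

# Placeholder index values for illustration only (NOT real CPI figures).
placeholder_cpi = {2005: 100.0, 2012: 110.0, 2022: 125.0}

# Hypothetical mean encounter cost per year (NOT study data).
mean_cost_by_year = {2005: 10000.0, 2012: 12000.0}

adjusted_to_2022 = {
    year: round(adjust_for_inflation(cost, placeholder_cpi[year],
                                     placeholder_cpi[2022]), 2)
    for year, cost in mean_cost_by_year.items()
}
print(adjusted_to_2022)
```

Adjusting each year separately before summarizing matters because a single pooled figure mixes dollars from different years.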

We hope this is satisfactory to the reviewer.

3. How was stroke encounter defined, and how would the audience know whether the cost of an encounter can truly be associated with stroke? Also, how were indirect costs calculated?

Stroke encounters were identified using International Classification of Diseases, 10th Revision (ICD-10) codes. The list of codes can now be found in Supplemental Materials Table S3. For Emergency Department visits, encounters were included only if stroke was the “most responsible diagnosis” associated with the visit. For hospitalizations, encounters were included if ischemic stroke was defined as the “primary problem”.
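The inclusion rule above can be sketched as a simple filter on the main diagnosis code of each encounter. This is an illustrative sketch only, not the authors' implementation: the `Encounter` type, its field names, and the setting labels are hypothetical, and the single prefix `I63` (the ICD-10 category for cerebral infarction) is an assumption standing in for the full code list in Table S3.

```python
from dataclasses import dataclass

# Assumed stand-in for the code list in Table S3; I63 is the ICD-10
# category for cerebral infarction. The study's actual list may differ.
STROKE_CODE_PREFIXES = ("I63",)


@dataclass
class Encounter:
    setting: str         # hypothetical labels: "ED" or "inpatient"
    main_diagnosis: str  # "most responsible diagnosis" (ED) or "primary problem" (inpatient)


def is_stroke_encounter(encounter):
    """Include an encounter only when its main diagnosis is a stroke code."""
    code = encounter.main_diagnosis.strip().upper()
    return code.startswith(STROKE_CODE_PREFIXES)


# Hypothetical records, for illustration only.
encounters = [
    Encounter("ED", "I63.9"),
    Encounter("inpatient", "J18.9"),
    Encounter("inpatient", "i63.4"),
]
stroke_encounters = [e for e in encounters if is_stroke_encounter(e)]
```

Filtering on the main diagnosis rather than on any listed diagnosis is what ties the cost of an encounter to stroke, as opposed to an unrelated admission in a patient with a history of stroke.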

Information regarding costs was available for all inpatient encounters from April 2002 and for ED encounters from April 2011. Direct costs were defined as all expenses in direct functional centers related to patient care, including salaries, supplies, and equipment amortization. Indirect costs included overhead allocation based on the percentage of the activity in the functional center, and consisted of costs not directly related to the patient, such as human resources, finance, health records, administration fees, and building maintenance.

The above statements have been added to our revised manuscript.

Submitted filename: PLOS ONE Revisions Response to Reviewers V2.docx
Decision Letter - Luiz Sérgio Fernandes de Carvalho, Editor

Synthetic data in cancer and cerebrovascular disease research: a novel approach to big data Synthetic data in cancer/stroke research

PONE-D-23-05410R2

Dear Dr. Lun,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Luiz Sérgio Fernandes de Carvalho, PhD, MSc, MD

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I have already reviewed the revised version before. The authors have addressed some more comments for another reviewer who recommended some changes to the revised version. Congrats again!

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Mohammed Ali Alvi

**********

Formally Accepted
Acceptance Letter - Luiz Sérgio Fernandes de Carvalho, Editor

PONE-D-23-05410R2

PLOS ONE

Dear Dr. Lun,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Luiz Sérgio Fernandes de Carvalho

Academic Editor

PLOS ONE

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.