Machine learning for predicting the diagnosis of tuberculous versus malignant pleural effusion: External validation and accuracy in two different settings

Alberto Garcia-Zamalloa; Rafael Arnay; Iván Castilla-Rodriguez; Javier Mar; Jose Manuel Gonzalez-Cava; Oliver Ibarrondo; Iñaki Salegui; Juan Antonio De Miguel; Nekane Mugica; Borja Aguinagalde; Jon Zabaleta; Begoña Basauri; Marta Alonso; Nekane Azcue; Eva Gil; Irati Garmendia; Jorge Taboada

doi:10.1371/journal.pone.0329668

Peer Review History

Original Submission March 22, 2025
15 May 2025 Decision Letter - Guocan Yu, Editor
PONE-D-25-09667
Machine learning for predicting the diagnosis of pleural tuberculosis: external validation and accuracy in two different settings.
PLOS ONE
Dear Dr. Garcia-Zamalloa,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
Please submit your revised manuscript by Jun 29 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org . When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript:
- A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
- A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
- An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future.
For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols . We look forward to receiving your revised manuscript. Kind regards, Guocan Yu Academic Editor PLOS ONE Journal requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf . 2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse. 3. In the online submission form, you indicated that [All the data are included in a database in our Department, and they cannot be shared publicly, but we have no problem to share all them for everyone who ask for it and who meet the criteria for access to confidential data.]. All PLOS journals now require all data underlying the findings described in their manuscript to be freely available to other researchers, either 1. In a public repository, 2. Within the manuscript itself, or 3. 
Uploaded as supplementary information. This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If your data cannot be made publicly available for ethical or legal reasons (e.g., public availability would compromise patient privacy), please explain your reasons on resubmission and your exemption request will be escalated for approval. 4. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information. [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Yes ******** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ****** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. 
If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ****** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ******** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: Peer Review for Manuscript PONE-D-25-09667 Date: 9-April-2025 “Machine learning for predicting the diagnosis of pleural tuberculosis: external validation and accuracy in two different settings” These are my comments of the peer review for the manuscript requested. My general comments: In general, the authors presented a well-conducted of external validation study. This research was developed, based on a authors’ recent publication of development prospective cohort model (reported in 2021) of Adenosine deaminase (ADA) for pleural tuberculosis in low tuberculosis (TB) prevalence. The training cohort showed promising results, providing a good rationale for an external validation study. The analysis is very good; however, I think there is room for improvement in writing. To my own perception, I think the structure and presentation of this article writing are still not smooth. The plus point of this external validation is that the developed pleural TB model (TPE) was tested in a completely different cohort with different TB/TPE incidence. 
In this study, although test cohort showed a bit lower accuracy, predictive values, than the Train study cohort. I think these real-life data are highly appreciated, demonstrating the applicability of the machine learning TPE models. This highlights the real-life data testing in the external validation cohort. In clinical practice, tuberculous pleural effusion (TPE) is difficult for diagnosis, particularly in immunocompromised patients and/or limited resource countries with high TB burden. Hence, I raise a clinical question regarding the immune status of the study participants (training and test cohorts). Because HIV co-infection prevalence is highly significant among patients with TPE, do authors have any data about HIV infection or immune testing (CD4 cell counts…etc) in these cohorts? In additionally, a short justification for sample size calculation in external validation cohorts enhances the validity of the study as well. My specific comments: 1. Title: The title fully describes study aim and objectives. 2. Abstract: The abstract is well written. However, I also think one minor point for amendment, as below: Lines 45 to 47: The authors presented a sample size of the Test cohort with 832 consecutive patients in Bajo Deba health district (1996-2012), but did not show the sample size of the Train cohort from a prospective cohort study (2013-2020). To be consistent and transparent in data, the sample size (how many patients in Train cohort?) should be described (stated) in this section as well. 3. Introduction: There is room for improvement in the introduction section, as follows: Line 67: 95% UI needs to be fully written, as readers are not familiar with this term-UI. 
Lines 92 to 94: “The model is freely available as an app (at https://pleurapp.ispana.es/) to help other physicians or thoracic surgeons apply this approach when dealing with exudative and lymphocytic pleural effusions”. This information should not be placed in the introduction because it was not connected with the flow of the main idea discussed. This can be relocated to the discussion as appropriate.
Lines 99 to 100 in the introduction section: “we compared the diagnostic accuracy of the ML procedure and the classical Bayesian analysis system for TPE in both different clinical scenarios (Bajo Deba 1996-2012 and Gipuzkoa 2013-2022)”. The authors mentioned two study cohorts to be modelled without brief introduction before. Hence, I recommend a brief introduction (1-2 sentences) of these 2 cohorts in the previous paragraphs.
4. Materials and Methods and Results
There is room for Materials and Methods section for improvement, regarding data presentation.
4.1 I can understand that the authors aimed to place emphasis on the Test cohort (external validation), so proactively present the Test study cohort as the first group, while the Train study cohort as the second group. Intuitively, this style of data presentation brings readers (like me) to a certain level of confusion and needs to reread and rethink. Therefore, I recommend that the authors present data as routine to characterize the first group = Train cohort, and second group = external validation cohort. This order of presentation should be consistent throughout the manuscript text and Tables. In the Tables 1 and 2, the authors first present: Test to Train cohorts (in sequence); then in Tables 3 and 4: Train to Test cohorts (in sequence). The transition in data presentation will make readers (eg, like my case) confused and take some time to reread, rethink the study data.
Therefore, I highly recommend a consistent presentation style of study data, I prefer training to testing cohort data presentation (from left to right), as the conventional way.
4.2 In Lines 213 to 218, the authors do not need to describe detailed data about confusion matrices of all machine learning models one by one, because readers can track all these information in Figures 2 and 4 presented. Only salient features from these data should be stated in the manuscript text.
4.3 Lines 237-239 and Lines 242-245: “The estimated PPV area is calculated as a function of the pre-test probability (prevalence) using the sensitivity and specificity of each classifier obtained in the training dataset (Gipuzkoa): (sensitivity * prevalence) / ((sensitivity * prevalence) + (1 - specificity) * (1 - prevalence)).” “The real PPVs in Gipuzkoa and Bajo Deba are calculated as the true positives divided by the sum of the true positives and false positives, obtained in each dataset: TP / (TP + FP)”. The formulas should be relocated in the Methods section, as it is more appropriate.
5. Discussion: The discussion is comprehensively discussed and well written. From this study, the machine learning models outperformed the Bayesian modelling, as shown in a different study setting with different prevalence of TPE and malignancy.
Conclusion: I think this is a great study, and minor amendments are suggested to make it more comprehensible for readers. I agree that this study is appropriate for publication. Many thanks, Best regards,
Reviewer #2: Overall well-written. See attached DOCX file for some reorganizing suggestions. Some of the Results have been included in the Methods section. The incidence of TB between the two groups is not so significant.
********
6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy . Reviewer #1: Yes: Nguyen Tat Thanh (MD, PhD) Reviewer #2: No ******** [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org . Please note that Supporting Information files do not need this step. Attachments Attachment Submitted filename: Peer Review-Manuscript PONE-D-25-09667.pdf Attachment Submitted filename: PONE-D-25-09667_reviewer.docx https://doi.org/10.1371/journal.pone.0329668.r001
Revision 1
4 Jun 2025 Author Response
Reviewer #1: Peer Review for Manuscript PONE-D-25-09667 Date: 9-April-2025 “Machine learning for predicting the diagnosis of pleural tuberculosis: external validation and accuracy in two different settings”
- (REVIEWER): These are my comments of the peer review for the manuscript requested. My general comments: In general, the authors presented a well-conducted of external validation study. This research was developed, based on a authors’ recent publication of development prospective cohort model (reported in 2021) of Adenosine deaminase (ADA) for pleural tuberculosis in low tuberculosis (TB) prevalence. The training cohort showed promising results, providing a good rationale for an external validation study. The analysis is very good; however, I think there is room for improvement in writing. To my own perception, I think the structure and presentation of this article writing are still not smooth.
* (AUTHORS) Thank you very much for your general suggestion about the structure of the work. We have thoroughly followed your comments and, as a result, we think that we have improved the overall structure of the paper. Particularly, we have presented the training results first and then the testing results. Besides, we have improved the Material and Methods section.
- (R) The plus point of this external validation is that the developed pleural TB model (TPE) was tested in a completely different cohort with different TB/TPE incidence. In this study, although test cohort showed a bit lower accuracy, predictive values, than the Train study cohort. I think these real-life data are highly appreciated, demonstrating the applicability of the machine learning TPE models. This highlights the real-life data testing in the external validation cohort. In clinical practice, tuberculous pleural effusion (TPE) is difficult for diagnosis, particularly in immunocompromised patients and/or limited resource countries with high TB burden.
Hence, I raise a clinical question regarding the immune status of the study participants (training and test cohorts). Because HIV co-infection prevalence is highly significant among patients with TPE, do authors have any data about HIV infection or immune testing (CD4 cell counts…etc) in these cohorts?
* (A) Thank you for your question. Indeed, tuberculosis is more prevalent amongst patients coinfected with HIV, but in this sense we must state that:
- Screening for human immunodeficiency virus (HIV) was performed in all patients diagnosed with any form of tuberculosis in the Gipuzkoa Region following the guidelines of the Tuberculosis Control Program implemented in the Basque Country since 2003.
- There were no cases of TPE coinfected with HIV in our series from 2013 to 2022 in Gipuzkoa Region, namely the Training Cohort. Only three patients diagnosed with HIV infection developed pleural effusion through this period, and it was malignant in all cases, as we reported in our prospective project (1). The absence of cases of TPE amongst HIV coinfected patients was probably due to the widespread antiretroviral treatment.
- Unfortunately, regarding the Testing Cohort in Bajo Deba Health District from 1996 to 2012, patients coinfected with HIV were seen and followed up at the Regional Donostia University Hospital. This cohort was retrospective and we only have the information stored at the Bajo Deba Health District Hospital.
- Nevertheless, and as pointed out in our first report from 1998 to 2008 (2), ADA accuracy is known to be equally reliable in HIV-positive patients with TPE, even in those with low CD4 T-cell count (3,4), and even in renal transplant recipients (5).
o 1) Garcia-Zamalloa A, Vicente D, Arnay R, Arrospide A, Taboada J, Castilla-Rodriguez I, et al. (2021) Diagnostic accuracy of adenosine deaminase for pleural tuberculosis in a low prevalence setting: A machine learning approach within a 7-year prospective multi-center study.
PLoS ONE 16(11): e0259203. https://doi.org/10.1371/journal.pone.0259203
o 2) Garcia-Zamalloa A, et al. (2012) Diagnostic accuracy of adenosine deaminase and lymphocyte proportion in pleural fluid for tuberculous pleurisy in different prevalence scenarios. PLoS ONE 7(6): e38729.
o 3) Riantawan P, et al. (1999) Diagnostic value of pleural fluid adenosine deaminase in tuberculous pleuritis with reference to HIV coinfection and a Bayesian analysis. Chest 116: 97-103.
o 4) Baba K, et al. (2008) Adenosine deaminase activity is a sensitive marker for the diagnosis of tuberculous pleuritis in patients with low CD4 counts. PLoS ONE 3(7): e2788.
o 5) Krenke R, et al. (2010) Use of pleural fluid levels of adenosine deaminase and interferon gamma in the diagnosis of tuberculous pleuritis. Curr Opin Pulm Med 16: 367-375.
- (R) In additionally, a short justification for sample size calculation in external validation cohorts enhances the validity of the study as well.
* (A) Thank you very much for the suggestion. We have included this information in the manuscript. Besides, we must state that our intention was to include as many pleural effusions as possible in the Testing Cohort, due to the fact that it was retrospective (Bajo Deba 1996-2012). Nevertheless, as expressed in our report from 2021 (1), the minimum sample size was set at 200 patients.
o 1) Garcia-Zamalloa A, Vicente D, Arnay R, Arrospide A, Taboada J, Castilla-Rodriguez I, et al. (2021) Diagnostic accuracy of adenosine deaminase for pleural tuberculosis in a low prevalence setting: A machine learning approach within a 7-year prospective multi-center study. PLoS ONE 16(11): e0259203. https://doi.org/10.1371/journal.pone.0259203
- (R) My specific comments: 1. Title: The title fully describes study aim and objectives.
* (A) We finally decided to modify the title by following Reviewer 2's suggestion: “Machine learning for predicting the diagnosis of tuberculous versus malignant pleural effusion: external validation and accuracy in two different settings”. Thank you.
- (R) 2. Abstract: The abstract is well written. However, I also think one minor point for amendment, as below: Lines 45 to 47: The authors presented a sample size of the Test cohort with 832 consecutive patients in Bajo Deba health district (1996-2012), but did not show the sample size of the Train cohort from a prospective cohort study (2013-2020). To be consistent and transparent in data, the sample size (how many patients in Train cohort?) should be described (stated) in this section as well.
* (A) Thank you for the suggestion. We do find it very reasonable. We have included the number of pleural effusions of the Training cohort from 2013 to 2020, subsequently extended to 2022.
- (R) 3. Introduction: There is room for improvement in the introduction section, as follows: Line 67: 95% UI needs to be fully written, as readers are not familiar with this term-UI.
* (A) Thank you for the suggestion. We have modified it.
- (R) Lines 92 to 94: “The model is freely available as an app (at https://pleurapp.ispana.es/) to help other physicians or thoracic surgeons apply this approach when dealing with exudative and lymphocytic pleural effusions”. This information should not be placed in the introduction because it was not connected with the flow of the main idea discussed. This can be relocated to the discussion as appropriate.
* (A) Thank you very much for your suggestion. We have moved it into the Discussion chapter as last paragraph.
- (R) Lines 99 to 100 in the introduction section: “we compared the diagnostic accuracy of the ML procedure and the classical Bayesian analysis system for TPE in both different clinical scenarios (Bajo Deba 1996-2012 and Gipuzkoa 2013-2022)”.
The authors mentioned two study cohorts to be modelled without brief introduction before. Hence, I recommend a brief introduction (1-2 sentences) of these 2 cohorts in the previous paragraphs.
* (A) Thank you for the suggestion. We have included a brief exposition regarding the pleural effusions included in the two cohorts and the two different prevalence settings. Following the amendments of Reviewer 2, we also changed the term “higher prevalence setting” to “different prevalence setting”.
- (R) 4. Materials and Methods and Results There is room for Materials and Methods section for improvement, regarding data presentation. 4.1 I can understand that the authors aimed to place emphasis on the Test cohort (external validation), so proactively present the Test study cohort as the first group, while the Train study cohort as the second group. Intuitively, this style of data presentation brings readers (like me) to a certain level of confusion and needs to reread and rethink. Therefore, I recommend that the authors present data as routine to characterize the first group = Train cohort, and second group = external validation cohort. This order of presentation should be consistent throughout the manuscript text and Tables. In the Tables 1 and 2, the authors first present: Test to Train cohorts (in sequence); then in Tables 3 and 4: Train to Test cohorts (in sequence). The transition in data presentation will make readers (eg, like my case) confused and take some time to reread, rethink the study data. Therefore, I highly recommend a consistent presentation style of study data, I prefer training to testing cohort data presentation (from left to right), as the conventional way.
* (A) Thank you for the recommendation. We have restructured the text and tables to introduce Gipuzkoa (Training) first and then Bajo Deba (Testing).
- (R) 4.2 In Lines 213 to 218, the authors do not need to describe detailed data about confusion matrices of all machine learning models one by one, because readers can track all these information in Figures 2 and 4 presented. Only salient features from these data should be stated in the manuscript text.
* (A) Thank you for the suggestion. We have added a clarification at the beginning of the paragraph to emphasize that we refer to the TP, FP, TN and FN values of a classification method simply consisting of using the ADA 40 + LP 50 criterion. The aim of this paragraph is to present a comparison with the results obtained by ML models shown in Figures 2 and 4.
- (R) 4.3 Lines 237-239 and Lines 242-245: “The estimated PPV area is calculated as a function of the pre-test probability (prevalence) using the sensitivity and specificity of each classifier obtained in the training dataset (Gipuzkoa): (sensitivity * prevalence) / ((sensitivity * prevalence) + (1 - specificity) * (1 - prevalence)).” “The real PPVs in Gipuzkoa and Bajo Deba are calculated as the true positives divided by the sum of the true positives and false positives, obtained in each dataset: TP / (TP + FP)”. The formulas should be relocated in the Methods section, as it is more appropriate.
* (A) Following your recommendation, we have modified the Material and Methods section to introduce the comparative analysis of ML and Bayesian analysis for estimating positive and negative predictive values as a function of pre-test probability. We have placed the mentioned formulas in this section.
- (R) 5. Discussion: The discussion is comprehensively discussed and well written. From this study, the machine learning models outperformed the Bayesian modelling, as shown in a different study setting with different prevalence of TPE and malignancy. Conclusion: I think this is a great study, and minor amendments are suggested to make it more comprehensible for readers. I agree that this study is appropriate for publication.
* (A) Many thanks, Best regards,
- Reviewer #2: Overall well-written. See attached DOCX file for some reorganizing suggestions. Some of the Results have been included in the Methods section.
* (A) Thank you for the amendment. It is true that the Material and Methods section shows a comparative analysis of our data. However, the aim of this analysis is to show that there are no statistically significant differences between both data sets. We prefer to see this analysis as part of the methodology that we followed to validate our Materials (Data), in order to proceed to train and test ML models with this data, rather than treat it as a result in itself. Also, following some suggestions of Reviewer 1, we have expanded the Material and Methods chapter with the methodology followed to train and test the ML models and the comparative analysis of ML and Bayesian analysis for estimating positive and negative predictive values as a function of pre-test probability.
- (R) The incidence of TB between the two groups is not so significant.
* (A) Thank you for the suggestion. We replaced the term “higher” prevalence with “different” prevalence. We also changed the title following Reviewer 2’s suggestion and we made some corrections to keep a consistent order in the presentation of the Training and Testing results (in this order).
Attachments
Attachment Submitted filename: 20250531_response_to_reviewers.docx
https://doi.org/10.1371/journal.pone.0329668.r002
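The two predictive-value formulas quoted in comment 4.3 can be illustrated with a minimal Python sketch. The function names and the example numbers are hypothetical and are not taken from the manuscript. The NPV function uses the standard Bayes form, which the exchange mentions but does not spell out, so it is an assumption about what the authors computed.

```python
def estimated_ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Estimated PPV as a function of pre-test probability (prevalence).

    Follows the formula quoted in comment 4.3:
    (sens * prev) / ((sens * prev) + (1 - spec) * (1 - prev)).
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("PPV is undefined when no positive results are expected")
    return true_pos / (true_pos + false_pos)


def estimated_npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Estimated NPV via Bayes' theorem (assumed form, not quoted in the letters)."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    if true_neg + false_neg == 0.0:
        raise ValueError("NPV is undefined when no negative results are expected")
    return true_neg / (true_neg + false_neg)


def observed_ppv(tp: int, fp: int) -> float:
    """Observed ("real") PPV from confusion-matrix counts: TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("PPV is undefined when there are no positive predictions")
    return tp / (tp + fp)


if __name__ == "__main__":
    # Hypothetical classifier: same sensitivity and specificity, two prevalences.
    for prev in (0.1, 0.5):
        ppv = estimated_ppv(0.9, 0.9, prev)
        npv = estimated_npv(0.9, 0.9, prev)
        print(f"prevalence={prev:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
    # Hypothetical counts: 45 true positives, 5 false positives.
    print(f"observed PPV={observed_ppv(45, 5):.3f}")
```

Holding sensitivity and specificity fixed while varying the prevalence shows why the estimated PPV is expected to differ between two settings with different pre-test probabilities, whereas the observed PPV depends only on the counts in each dataset.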
21 Jul 2025 Decision Letter - Guocan Yu, Editor Machine learning for predicting the diagnosis of tuberculous versus malignant pleural effusion: external validation and accuracy in two different settings. PONE-D-25-09667R1 Dear Dr. Alberto Garcia-Zamalloa, We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at Editorial Manager® and clicking the ‘Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Kind regards, Guocan Yu Academic Editor PLOS ONE Additional Editor Comments (optional): Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. 
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #1: All comments have been addressed Reviewer #3: All comments have been addressed ******** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #3: Yes ****** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #3: Yes ****** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #3: Yes ****** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. 
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #3: Yes ******** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: Peer Review for Manuscript PONE-D-25-09667R1 Date: 10-June-2025 “Machine learning for predicting the diagnosis of tuberculous versus malignant pleural effusion: external validation and accuracy in two different settings” The revised manuscript is much improved. All the suggested points have been resolved. I agree that the paper should be published. My general comments: In general, the authors presented a well-conducted external validation study. This research builds on the authors’ recent publication (reported in 2021) of a model, developed in a prospective cohort, that uses adenosine deaminase (ADA) to diagnose pleural tuberculosis in a setting of low tuberculosis (TB) prevalence. The training cohort showed promising results, providing a good rationale for an external validation study. The analysis is very good; however, I think there is room for improvement in the writing. In my view, the structure and presentation of the article are still not smooth. The strength of this external validation is that the developed pleural TB (TPE) model was tested in a completely different cohort with a different TB/TPE incidence. Although the Test cohort showed slightly lower accuracy and predictive values than the Train cohort, I think these real-life data are highly valuable, demonstrating the applicability of the machine learning TPE models. This highlights the real-life data testing in the external validation cohort. 
In clinical practice, tuberculous pleural effusion (TPE) is difficult to diagnose, particularly in immunocompromised patients and/or in resource-limited countries with a high TB burden. Hence, I raise a clinical question regarding the immune status of the study participants (training and test cohorts). Because the prevalence of HIV co-infection is considerable among patients with TPE, do the authors have any data about HIV infection or immune testing (CD4 cell counts, etc.) in these cohorts? In addition, a short justification of the sample size calculation for the external validation cohort would enhance the validity of the study. My review: All my comments were appropriately answered. My specific comments: 1. Title: The title fully describes the study aim and objectives. My review: The updated title is accepted. 2. Abstract: The abstract is well written. However, I suggest one minor amendment, as below: Lines 45 to 47: The authors presented the sample size of the Test cohort, with 832 consecutive patients in the Bajo Deba health district (1996-2012), but did not show the sample size of the Train cohort from the prospective cohort study (2013-2020). For consistency and transparency, the sample size of the Train cohort (how many patients?) should be stated in this section as well. My review: These comments have been amended as appropriate. 3. Introduction: There is room for improvement in the introduction section, as follows: Line 67: 95% UI should be written out in full, as readers may not be familiar with the term UI. My review: My comment was resolved as appropriate. Lines 92 to 94: “The model is freely available as an app (at https://pleurapp.ispana.es/) to help other physicians or thoracic surgeons apply this approach when dealing with exudative and lymphocytic pleural effusions” This information should not be placed in the introduction because it is not connected with the flow of the main idea discussed. 
This can be relocated to the discussion as appropriate. My review: My comment was resolved as appropriate. Lines 99 to 100 in the introduction section: “we compared the diagnostic accuracy of the ML procedure and the classical Bayesian analysis system for TPE in both different clinical scenarios (Bajo Deba 1996-2012 and Gipuzkoa 2013-2022)”. The authors mention the two study cohorts to be modelled without having introduced them beforehand. Hence, I recommend a brief introduction (1-2 sentences) of these 2 cohorts in the previous paragraphs. My review: My comment was resolved as appropriate. 4. Materials and Methods and Results There is room for improvement in the Materials and Methods section regarding data presentation. 4.1 I understand that the authors aimed to place emphasis on the Test cohort (external validation), and so present the Test study cohort as the first group and the Train study cohort as the second group. This style of data presentation causes readers (like me) a certain level of confusion and requires rereading and rethinking. Therefore, I recommend that the authors present the data in the conventional order: first group = Train cohort, second group = external validation cohort. This order of presentation should be consistent throughout the manuscript text and Tables. In Tables 1 and 2, the authors first present Test then Train cohorts (in sequence); then in Tables 3 and 4, Train then Test cohorts (in sequence). This change in data presentation will confuse readers (e.g., in my case) and require some time to reread and rethink the study data. Therefore, I highly recommend a consistent presentation style for the study data; I prefer presenting training then testing cohort data (from left to right), as is conventional. My review: My comment was resolved as appropriate. 
4.2 In Lines 213 to 218, the authors do not need to describe the confusion matrices of all machine learning models in detail one by one, because readers can find all this information in Figures 2 and 4. Only salient features of these data should be stated in the manuscript text. My review: My comment was resolved as appropriate. 4.3 Lines 237-239 and Lines 242-245 “The estimated PPV area is calculated as a function of the pre-test probability (prevalence) using the sensitivity and specificity of each classifier obtained in the training dataset (Gipuzkoa): (sensitivity * prevalence) / ((sensitivity * prevalence) + (1 - specificity) * (1 - prevalence)).” “The real PPVs in Gipuzkoa and Bajo Deba are calculated as the true positives divided by the sum of the true positives and false positives, obtained in each dataset: TP / (TP + FP)” The formulas should be relocated to the Methods section, as it is more appropriate. My review: My comment was resolved as appropriate. 5. Discussion: The discussion is comprehensive and well written. In this study, the machine learning models outperformed the Bayesian modelling, as shown in a different study setting with a different prevalence of TPE and malignancy. My review: My comment was resolved as appropriate. Conclusion: I think this is a great study. The revised manuscript resolved all my comments. I agree that this study is appropriate for publication. Many thanks, Best regards, Reviewer #3: Authors have addressed the revisions. All required questions have been answered and all responses meet formatting specifications ******** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? 
For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: Nguyen Tat Thanh, MD PhD Reviewer #3: Yes: Harun Agca ******** https://doi.org/10.1371/journal.pone.0329668.r003
Formally Accepted
Acceptance Letter - Guocan Yu, Editor PONE-D-25-09667R1 PLOS ONE Dear Dr. Garcia-Zamalloa, I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team. At this stage, our production department will prepare your paper for publication. This includes ensuring the following: * All references, tables, and figures are properly cited * All relevant supporting information is included in the manuscript submission, * There are no issues that prevent the paper from being properly typeset You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps. Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing. If we can help with anything else, please email us at customercare@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Guocan Yu Academic Editor PLOS ONE https://doi.org/10.1371/journal.pone.0329668.r004
