Grade prediction of lesions in cerebral white matter using a convolutional neural network

Noriaki Takemura; Yuya Shinkawa; Kazuo Ishii

doi:10.1371/journal.pone.0313516

Peer Review History

Original Submission June 16, 2024
14 Oct 2024 Decision Letter - Xiaohui Zhang, Editor

PONE-D-24-21793
Grade prediction of lesions in cerebral white matter using a convolutional neural network
PLOS ONE

Dear Dr. Ishii,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Nov 28 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,
Xiaohui Zhang
Academic Editor
PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Partly

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g., participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The paper addresses the important topic of automating the prediction of cerebral white matter lesion grades using convolutional neural networks (CNN) and MRI data, which is both timely and relevant. The focus on integrating clinical data, such as hypertension, with MRI images to predict lesion severity is novel and could have significant clinical applications, especially in neuroimaging and diagnostic tools for cerebrovascular diseases. The authors used a comprehensive set of MRI modalities, including T1-weighted, T2-weighted, and FLAIR images, which provided robust input data for the CNN. The study's attempt to optimize image sizes and axes range enhances the precision of the model's performance, resulting in high test accuracy and area under the curve (AUC) scores for different lesion grades.

Comments:

(1) While the model performs well for grades 0 to 2, the small sample size for grades 3 and 4 (39 and 8 patients, respectively) weakens the generalizability of the results. This imbalance could lead to overfitting, particularly for higher lesion grades, as the model may not have learned enough representative features. Can the authors discuss this limitation?

(2) Can the authors discuss the generalizability of the proposed model for other populations or MRI devices?

(3) Can the authors provide some interpretability of the model, such as which MRI features or regions contribute most to the model's decisions? This would increase the clinical utility of the approach in the future.

Reviewer #2: In this paper, Takemura et al. established a diagnostic method for cerebral white matter lesions using MRI images and examined the relationship between the MRI images and the medical checkup data using a co-occurrence network diagram. The authors performed detailed evaluations of the performance of the convolutional neural network for the image size, range, and axial position of the MRI images. However, I found many details regarding the neural network experiments that need to be included, and the results lack solid explanation. I recommend that the authors address the concerns below:

Major comments:

1. Line 143-145: “Cerebral white matter lesions were evaluated using MRI images categorized into the following five grades.” What quantitative criteria are used to categorize the grades? Are there specific thresholds? Please expand the dividing methods here.

2. Fig 5:
a) It is unclear what the neural network’s output is in the paper or how the output images from the neural network are further used to classify the grades of white matter lesions.
b) How is the convolutional neural network created? What are the training data and testing data? And what is the loss function used in training? How did the author handle the imbalance in the number of MRI images across different grades?

3. Line 245-253: the definition of the image size is not clear here. Does the author downsample the original image or crop the original image to the target image size?

Minor comments:

1. I recommend using arrows to indicate cerebral white matter lesions in the image. This will help the reader see the regions more easily.

2. Line 329: “Figure 18”, a typo; the authors seem to refer to figure 19.

3. Fig 22: the letters inside each circle are too small.

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

https://doi.org/10.1371/journal.pone.0313516.r001
Revision 1
16 Oct 2024 Author Response

Comments of Reviewer #1:

(1) While the model performs well for grades 0 to 2, the small sample size for grades 3 and 4 (39 and 8 patients, respectively) weakens the generalizability of the results. This imbalance could lead to overfitting, particularly for higher lesion grades, as the model may not have learned enough representative features. Can the authors discuss this limitation?

Answer: I added Table 2 to clarify the quantitative criteria used to categorize the grades. The grade distribution was Grade 0 : Grade 1 : Grade 2 : Grade 3 : Grade 4 = 551 (48.1%) : 441 (38.5%) : 107 (9.3%) : 39 (3.4%) : 8 (0.7%) in 1,146 subjects. This probably reflects the real grade distribution in this hospital and region. Normally, a small number of patients shows high variance in grading. So, the results for Grade 3 and Grade 4 might be less generalizable because of overfitting, particularly for higher lesion grades. Therefore, this grading model may have a limitation for severe grades. However, the model is practical enough for clinical use in screening, because a physician can check and diagnose by detailed inspection. With this distribution and this number of Grade 3 and Grade 4 patients, such cases can be checked and diagnosed by detailed inspection. As the number of cases accumulates, the model will be able to generalize better. So, the model should be updated to maintain grading quality and practical generalizability. Because a machine learning model is data-driven, its grading quality and generalizability should be maintained by updating the model with additional data and adequate data administration.

(2) Can the authors discuss the generalizability of the proposed model for other populations or MRI devices?

Answer: As I mentioned in the previous answer, machine learning is normally a data-driven tool. So, the proposed model may be practical in clinical use for other populations and MRI devices. However, there are some challenges in maintaining generalizability because the approach is data-driven. The grading quality should be maintained by updating the model with adequate training data, because grading quality depends on the training data. To apply the proposed model to other populations or MRI devices, the ML user should maintain the grading quality using techniques such as transfer learning and fine-tuning, and should evaluate the quality of the data grading. So, the proposed model is neither final nor perfect in terms of generalizability. The model should be maintained and updated to keep its grading quality practical.

(3) Can the authors provide some interpretability of the model, such as which MRI features or regions contribute most to the model's decisions? This would increase the clinical utility of the approach in the future.

Answer: Yes, we can add some interpretability to the model using a causal inference approach, such as logistic regression analysis and a Bayesian network model, instead of the co-occurrence network. That is a directed network instead of an undirected network. Using a directed network, we can give some interpretability to the model and show treatment guidelines. Elderly patients show multimorbidity and polypharmacy, and they are sometimes very difficult to treat. So, a causal inference approach using a directed network, such as a Bayesian network, should be effective for the interpretability of non-communicable diseases (NCDs), such as diabetes mellitus, cardiovascular diseases, cerebrovascular diseases, and dementia. Using medical big data from PHRs, we can realize this approach. Unfortunately, in this study, the number of patients was not sufficient to show interpretability for Grade 3 and Grade 4. But for Grades 0-2, we could show the relationship between hypertension grade and WMH using the co-occurrence network.

Based on the above answers, I added the following sentences to the Discussion:

The grade distribution was as follows: Grade 0: 551 (48.1%), Grade 1: 441 (38.5%), Grade 2: 107 (9.3%), Grade 3: 39 (3.4%), and Grade 4: 8 (0.7%) among 1,146 subjects. This likely reflects the actual grade distribution in this hospital and region. Normally, a small number of patients show high grade variance in the grading. The results for Grades 3 and 4 may indicate less generalizability, probably due to overfitting. Therefore, there may be some limitations in the higher grades within this grading model. However, the model might still be practical for clinical use in screening, as physicians can perform detailed inspections to verify diagnoses. As the number of cases accumulates, the model should be able to achieve greater generalizability.

Since machine learning is typically data-driven, the proposed model may be practical for clinical use across different populations and MRI devices. However, there are some challenges in maintaining generalizability. The grading quality should be upheld by continuously updating the model with an appropriate training dataset, as the quality of grading depends on the quality of the training data. To apply the proposed model to different populations or MRI devices, ML users must ensure and maintain grading quality through techniques like transfer learning and fine-tuning, while also evaluating the quality of the data grading. Therefore, the proposed model is neither final nor perfect in terms of generalizability.

We can enhance the model's interpretability by using causal inference approaches, such as logistic regression analysis and Bayesian network models, instead of a co-occurrence network. A directed network, as opposed to an undirected network, can provide some interpretability to the model and help in defining treatment guidelines. In elderly patients, who often exhibit multimorbidity and polypharmacy, treatment can be particularly challenging. Therefore, a causal inference approach using directed networks, like a Bayesian network, would be effective in interpreting non-communicable diseases (NCDs) such as diabetes mellitus, cardiovascular diseases, cerebrovascular diseases, and dementia. This approach can be implemented using large-scale medical data from personal health records (PHRs). Unfortunately, in this study, the number of patients was insufficient to demonstrate interpretability for Grades 3 and 4. However, in Grades 0-2, we were able to show the relationship between hypertension grade and white matter hyperintensities (WMH) using a co-occurrence network.

Comments of Reviewer #2:

Major comments:

1. Line 143-145: “Cerebral white matter lesions were evaluated using MRI images categorized into the following five grades.” What quantitative criteria are used to categorize the grades? Are there specific thresholds? Please expand the dividing methods here.

Answer: I added Table 2 as shown below:

Cerebral white matter lesions were evaluated using MRI images categorized into the following five grades: grade 0, in which no lesions are detected, through grade 4, in which most lesions are observed [13], as shown in Table 2 [4,14].

Table 2. Grading the severity of white matter hyperintensities (WMH) [4,14]

Deep and subcortical white matter hyperintensity (DSWMH)
Grade 0    Absence
Grade 1    Punctate foci (< 3 mm in diameter)
Grade 2    Punctate foci (≥ 3 mm in diameter)
Grade 3    Confluence of foci with unclear boundary
Grade 4    Large confluent areas

I also added the journal data for Reference 4: Cerebrovasc Dis. 2007;24(2-3):202-9.

2. Fig 5:

a) It is unclear what the neural network’s output is in the paper or how the output images from the neural network are further used to classify the grades of white matter lesions.

Answer: The neural network’s output consists of probabilities for Grade 0, Grade 1, Grade 2, Grade 3, and Grade 4 using the output layer (type=Dense, output=5, activation=softmax). The grade with the highest prediction probability was classified as the predicted grade.

b) How is the convolutional neural network created? What are the training data and testing data? And what is the loss function used in training? How did the author handle the imbalance in the number of MRI images across different grades?

Answer:

How is the convolutional neural network created?: The convolutional neural network was constructed using the Keras module (Sequential from keras.models, and Conv2D (convolution layer), MaxPooling2D (pooling layer), Activation, Dropout (dropout layer), Flatten (flatten layer), and Dense (dense layer) from keras.layers). It consists of thirteen layers, as shown in Figure 5 and Table 5, with the following structure: Convolution, Convolution, Pooling, Dropout, Convolution, Convolution, Pooling, Flatten, Dense, Dropout, Dense, Dropout, Dense.

What are the training data and testing data?: Each learning process was performed using the mini-batch method (batch size=32). The training data and testing data were constructed automatically by the model_selection.train_test_split method of the sklearn module. Each batch was automatically and randomly split into (training data: testing data)=3:1.

And what is the loss function used in training?: The loss function used in training is categorical_crossentropy from the Keras module. The neural network's output provides probabilities for Grade 0, Grade 1, Grade 2, Grade 3, and Grade 4, using the output layer (type=Dense, output=5, activation=softmax). The grade with the highest prediction probability is classified as the predicted grade.

How did the author handle the imbalance in the number of MRI images across different grades?: Each grade evaluation in the ROC curve was conducted using a dataset consisting of (indicated Grade: other Grades)=1:1, e.g., (Grade 0: Grades 1-4)=1:1.

3. Line 245-253: the definition of the image size is not clear here. Does the author downsample the original image or crop the original image to the target image size?

Answer: The original MRI image was changed from 30x30 to 300x300 pixels using the Image.resize() method of the Python Imaging Library (PIL), and analyzed.

Based on the above answers, I added explanations to the Methods and Results:

Methods: The convolutional neural network was constructed using the Keras module, specifically Sequential from keras.models, and Conv2D (convolution layer), MaxPooling2D (pooling layer), Activation, Dropout (dropout layer), Flatten (flatten layer), and Dense (dense layer) from keras.layers. It consists of thirteen layers, as shown in Figure 5 and Table 5, with the following structure: Convolution, Convolution, Pooling, Dropout, Convolution, Convolution, Pooling, Flatten, Dense, Dropout, Dense, Dropout, Dense. Each learning process was performed using the mini-batch method (batch size = 32). The training and testing data were automatically generated by the model_selection.train_test_split method from the sklearn module. Each batch was randomly split into a 3:1 ratio for training data and testing data, respectively. The loss function used in training is categorical_crossentropy from the Keras module. The neural network’s output provides probabilities for Grade 0, Grade 1, Grade 2, Grade 3, and Grade 4 through the output layer (type=Dense, output=5, activation=softmax). The grade with the highest predicted probability is classified as the predicted grade. Each grade evaluation in the ROC curve was conducted using a dataset consisting of a 1:1 ratio between the indicated grade and the other grades, e.g., (Grade 0: Grades 1-4)=1:1.

Results: The original MRI image was changed from 30x30 to 300x300 pixels using the Image.resize() method of the Python Imaging Library (PIL), and analyzed.

Minor comments:

1. I recommend using arrows to indicate cerebral white matter lesions in the image. This will help the reader see the regions more easily.

Answer: Arrows were added to Fig 2 and Fig 3.

2. Line 329: “Figure 18”, a typo; the authors seem to refer to figure 19.

Answer: Thank you for your suggestion. I added Fig 18 because I had forgotten to include it.

3. Fig 22: the letters inside each circle are too small.

Answer: Fig 22 was modified with larger letters.

Attachment
Submitted filename: Response to Reviewers.docx

https://doi.org/10.1371/journal.pone.0313516.r002
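
The prediction rule and the balanced ROC evaluation described in the author response can be illustrated with a short sketch. This is not the authors' code. The function names `predict_grade` and `balanced_one_vs_rest`, the fixed random seed, the example probabilities, and the choice to subsample the larger group at random are all assumptions made for illustration. Only the five-way softmax output, the highest-probability rule, the reported per-grade subject counts, and the 1:1 ratio between the indicated grade and the other grades come from the response. The response also speaks of MRI images, while the counts used here are per subject, so the sizes below are indicative only.

```python
import random
from collections import Counter

GRADES = [0, 1, 2, 3, 4]


def predict_grade(probabilities):
    """Return the grade whose softmax probability is highest."""
    if len(probabilities) != len(GRADES):
        raise ValueError("expected one probability per grade (5 values)")
    return max(GRADES, key=lambda g: probabilities[g])


def balanced_one_vs_rest(labels, target_grade, seed=0):
    """Pick sample indices so that target_grade : other grades = 1 : 1.

    Assumption: the larger group is randomly subsampled down to the size
    of the smaller one; the response does not say how the 1:1 set was drawn.
    """
    rng = random.Random(seed)
    positives = [i for i, g in enumerate(labels) if g == target_grade]
    negatives = [i for i, g in enumerate(labels) if g != target_grade]
    n = min(len(positives), len(negatives))
    return sorted(rng.sample(positives, n) + rng.sample(negatives, n))


# Per-grade subject counts reported in the response (1,146 subjects in total).
counts = {0: 551, 1: 441, 2: 107, 3: 39, 4: 8}
labels = [grade for grade, n in counts.items() for _ in range(n)]

# Hypothetical softmax output for one image (not taken from the study).
example_probs = [0.05, 0.10, 0.70, 0.10, 0.05]
predicted = predict_grade(example_probs)

# Balanced evaluation set for Grade 3 versus all other grades.
subset = balanced_one_vs_rest(labels, target_grade=3)
subset_counts = Counter(labels[i] == 3 for i in subset)
```

Under these assumptions, the Grade 3 evaluation set holds 39 Grade 3 subjects and 39 subjects drawn from the other grades. The same procedure for Grade 4 would yield only 8 and 8, which makes the small-sample concern raised by Reviewer #1 concrete.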
25 Oct 2024 Decision Letter - Xiaohui Zhang, Editor

Grade prediction of lesions in cerebral white matter using a convolutional neural network
PONE-D-24-21793R1

Dear Dr. Ishii,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager and clicking the ‘Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Xiaohui Zhang
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g., participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)
Reviewer #2: (No Response)

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

https://doi.org/10.1371/journal.pone.0313516.r003
Formally Accepted
31 Oct 2024 Acceptance Letter - Xiaohui Zhang, Editor

PONE-D-24-21793R1
PLOS ONE

Dear Dr. Ishii,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited
* All relevant supporting information is included in the manuscript submission
* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Xiaohui Zhang
Academic Editor
PLOS ONE

https://doi.org/10.1371/journal.pone.0313516.r004

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.