Comparative and complementary use of Bayesian inference and supervised learning for predictive modeling of coffee rust incidence among Kenyan smallholder farmers

Maurice Wanyonyi; Jacqueline Gogo Akelo; Veronicah Nyokabi Njenga; Frankline Obwoge Keraro; Titus Mutua Kioko

doi:10.1371/journal.pclm.0000754

Peer Review History

Original Submission: October 25, 2025
3 Dec 2025 Decision Letter - Noureddine Benkeblia, Editor

PCLM-D-25-00394
Integrating Bayesian Inference and Supervised Learning for Predictive Modeling of Coffee Rust Incidence Among Kenyan Smallholder Farmers
PLOS Climate

Dear Dr. Wanyonyi,

Thank you for submitting your manuscript to PLOS Climate. After careful consideration, we feel that it has merit but does not fully meet PLOS Climate’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================
Although the manuscript was found to be of interest for the field, the reviewers have raised many concerns, and the authors should address them before the suitability of the manuscript for publication is reconsidered.
==============================

Please submit your revised manuscript by December 18, 2025. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at climate@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pclm/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:
• A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
• A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
• An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.
Kind regards,
Noureddine Benkeblia, Dr. Sci., Dr. Agr.
Academic Editor
PLOS Climate

Journal Requirements:

1. Please note that PLOS Climate has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/climate/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

2. In the online submission form, you indicated that “The data for this research will be made available upon request”. All PLOS journals now require all data underlying the findings described in their manuscript to be freely available to other researchers, either (1) in a public repository, (2) within the manuscript itself, or (3) uploaded as supplementary information. This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. If your data cannot be made publicly available for ethical or legal reasons (e.g., public availability would compromise patient privacy), please explain your reasons by return email and your exemption request will be escalated to the editor for approval. Your exemption request will be handled independently and will not hold up the peer review process, but will need to be resolved should your manuscript be accepted for publication. One of the Editorial team will then be in touch if there are any issues.

3. We ask that a manuscript source file is provided at Revision. Please upload your manuscript file as a .doc, .docx, .rtf or .tex.

4. Please provide separate figure files in .tif or .eps format.
For more information about figure files please see our guidelines: https://journals.plos.org/climate/s/figures and https://journals.plos.org/climate/s/figures#loc-file-requirements

5. Some material included in your submission may be copyrighted. According to PLOS’s copyright policy, authors who use figures or other material (e.g., graphics, clipart, maps) from another author or copyright holder must demonstrate or obtain permission to publish this material under the Creative Commons Attribution 4.0 International (CC BY 4.0) License used by PLOS journals. Please closely review the details of PLOS’s copyright requirements here: PLOS Licenses and Copyright. If you need to request permissions from a copyright holder, you may use PLOS's Copyright Content Permission form. Please respond directly to this email or email the journal office and provide any known details concerning your material's license terms and permissions required for reuse, even if you have not yet obtained copyright permissions or are unsure of your material's copyright compatibility.

Potential Copyright Issues: Figure 1 and 2: Please confirm (a) that you are the photographer; or (b) provide written permission from the photographer to publish the photo(s) under our CC-BY 4.0 license.

6. If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does this manuscript meet PLOS Climate’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.
Reviewer #1: Partly
Reviewer #2: Yes

2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes
Reviewer #2: No

4. Is the manuscript presented in an intelligible fashion and written in standard English?
Reviewer #1: No
Reviewer #2: Yes

Reviewer #1: Dear authors, I had the pleasure of reading your manuscript.
Please consider my review, which proceeds chapter by chapter.

Abstract: Well-structured, technically sound abstract with a clear scientific narrative. However, the abstract mentions both Bayesian hierarchical modeling and supervised learning, but the results section emphasizes only the performance of Logistic Regression. The abstract states that the system “can be replicated and interpreted to predict sustainable coffee disease.” The phrase “predict sustainable coffee disease” is semantically strange: sustainability refers to the agricultural system, not the disease. I would suggest: “...can be replicated and interpreted to support sustainable coffee disease management.”

Introduction: The text has a solid foundation and demonstrates good technical and contextual mastery of the topic of coffee leaf rust and of statistical and computational modeling approaches. The first paragraph is confusing and redundant. The opening sentence contains repetitions and lacks fluidity: “...which causes a reduction of up to half of the yield... is coffee leaf rust (Hemileia vastatrix) that causes an up to fifty percent yield loss...” It repeats “causes” and “up to fifty percent yield loss”. Suggested rewording: “One of the most devastating pathogens affecting Arabica coffee in the tropics is Hemileia vastatrix, the causal agent of coffee leaf rust, responsible for yield losses of up to 50% and severe socioeconomic impacts across affected regions.” It seems that lines 18–22 and 20 repeat the same idea. “Support Vector Form” should perhaps be “Support Vector Machine (SVM)”. “control precision coffee disease” is semantically incorrect. Rewrite: “...into a scalable decision-support system for precision coffee disease management.” The introduction contains several long sentences, which could be simplified and divided to make them easier to read.
The text should follow the recommended logical structure:
• Paragraph 1: Context and importance (disease, losses, economic relevance).
• Paragraph 2: Limitations of current methods and recent advances (ML, remote sensing).
• Paragraph 3: Specific gaps (lack of uncertainty, hierarchy, interpretability).
• Paragraph 4: Study proposal (hybrid model, main objectives, and contributions).

Materials and Methods: The section has a good ethical and methodological basis, but there is room for improvement. A positive point is that ethics and data management are well addressed. The same problem as in the introduction applies: some sentences are redundant and long. The description of the study (“varied ecological gradient” → “providing a homogeneous and representative basis”) is contradictory: if the gradient is varied, it is not homogeneous.

Study Variables: The concepts of “rust incidence” and “rust severity” need further definition (there may be ambiguity as to whether both were used or only one). Some issues that the authors should address:
• Were both variables used in different models (e.g., binomial for incidence and continuous for severity)? If only one was modeled, the other should be briefly mentioned as a complementary or validation variable. If these questions are answered later in the manuscript, the authors can ignore this part.
• What was the source and resolution of the climate variables (local weather stations, sensors, interpolation, etc.)?
• Were continuous variables normalized before model fitting (z-score, min-max)?
• Was multicollinearity checked?
Figure 2 appears blurred; a better-quality image should be submitted.

Data Processing: This is one of the most comprehensive methodological sections. It covers the entire pipeline in detail: preprocessing, balancing, specification of supervised and Bayesian models, validation, and scalability analysis. Good work by the authors.
However, the text needs a thorough revision to improve clarity and narrative flow (long and redundant sentences).

Spatio-temporal cross-validation: Very good concept, but it would be useful to indicate the number of blocks and the spatial granularity. Were folds defined by county boundaries or spatial buffers? How many folds were used?

Class Distribution and SMOTE: The class counts could also be presented as percentages: “6,295 YES and 3,555 NO” → “64% positive, 36% negative cases.” The authors should try to answer the following questions (ignore if they are answered later):
• Which hyperparameters were optimized?
• What technique was used (grid search, Bayesian optimization, cross-validation)?
• What software or library was used (e.g., scikit-learn 1.4.2, R caret, TensorFlow)?

Results:
• Distribution and dispersion: “Were all predictors approximately normally distributed? Were any variables transformed (e.g., logarithm of precipitation or distance) prior to modeling?”
• Representativeness: “Do these means differ substantially across municipalities, or were they aggregated across all locations? Could aggregation mask local effects?”
• Temporal dependence: “Given that lagged incidence and history of previous outbreaks were included, how many time points per farm were available on average?”
• Spatial autocorrelation: “Was the high standard deviation in distance (813 m) indicative of clustered infection foci?”

Regarding disease incidence: Were formal tests of difference between groups (infected vs. uninfected) applied? This is crucial to support the argument of “significant differences.”

I still maintain that the results section and the entire text are scientifically rigorous, but the English needs to be revised, as the text currently requires a great deal of concentration from the reader.

Discussion: There is good consistency between results and literature. Methodological limitations are acknowledged, which reinforces transparency.
There appears to be no mention of statistical limitations: there is no reference to possible overfitting or to imbalance between classes (incidence vs. absence). I would suggest that the authors restructure the discussion text as follows:
• 4.1 Environmental and spatial drivers
• 4.2 Management and agronomic implications
• 4.3 Methodological contributions and uncertainty quantification
• 4.4 Limitations and future work

Conclusion: The conclusion mixes a summary of findings, methodological insights, and implications without clear sections. It is suggested to divide it into three blocks:
• Summary of key findings
• Methodological and practical implications
• Broader significance and future potential

References: References must follow the Vancouver style bibliography, as stated in the journal guidelines: https://journals.plos.org/plosone/s/submission-guidelines#loc-style-and-format
Most are directly related to Hemileia vastatrix, machine learning, and agricultural applications. Good integration between classic and recent studies (2020–2025).
References 5, 17, 18 (preprints / medRxiv / Authorea):
• Problem: Preprints and servers such as Authorea or medRxiv are not yet peer-reviewed.
• Suggestion: Keep in the text, but explicitly indicate “preprint (not peer-reviewed)” in the body of the article, or replace with published versions, if available.
Coding or OCR errors: “Vel’asquez” → Velásquez; “S’anchez” → Sánchez; “Oca na-Zu niga” → Ocaña-Zúñiga.
Note that some references do not have a DOI or pagination.

Reviewer #2: This is a technically sound paper that shows the role of modeling applied to climate phenomena in a multidisciplinary context. The quantitative approach, the focus on machine learning tools, and their potential as predictive tools are well explained. This reviewer encourages the authors to expand a little more on the climate aspects of the research. This will be more in line with the audience of this journal.
The paper could expand a bit more on the science behind the microclimatic factors discussed. The authors present robust results from their analyses, which are significant. Results are presented clearly in figures, tables, and paragraphs. The paper would be easier to read if the results were repeated less across all three presentation options: data presented in tables should not also be presented graphically in figures and in paragraph form. The reviewer encourages the authors to provide some of the raw data used for the analyses. It may also help the paper if the epidemiological and ecological context of coffee rust disease were expanded so that readers not familiar with the disease and pathogen get a better picture of the subject matter.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public. If published, this will include your full peer review and any attached files. For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: Yes: Ricardo Ramos
Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

https://doi.org/10.1371/journal.pclm.0000754.r001
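
The percentages that Reviewer #1 proposes in the first-round comments on class distribution follow directly from the two class counts quoted in the review. A minimal Python sketch of that arithmetic is given here; the function and variable names are illustrative only and do not come from the manuscript.

```python
def class_shares(counts):
    """Return each class's share of the total as a fraction between 0 and 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one observation is required")
    return {label: count / total for label, count in counts.items()}


# Class counts quoted in the review ("6,295 YES and 3,555 NO").
counts = {"YES": 6295, "NO": 3555}

shares = class_shares(counts)

# Whole-number percentages, as in the reviewer's suggested wording.
percentages = {label: round(100 * share) for label, share in shares.items()}

for label, pct in percentages.items():
    print(f"{label}: {counts[label]} observations, about {pct}%")
```

The two counts sum to 9,850, the dataset size mentioned elsewhere in the reviews, and the rounded shares reproduce the 64% and 36% figures in the comment.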
Revision 1
10 Dec 2025 Author Response
Attachment (submitted filename): Response to Reviewers.docx
https://doi.org/10.1371/journal.pclm.0000754.r002
18 Mar 2026 Decision Letter - Daniel Parkes, Editor

PCLM-D-25-00394R1
Integrating Bayesian Inference and Supervised Learning for Predictive Modeling of Coffee Rust Incidence Among Kenyan Smallholder Farmers
PLOS Climate

Dear Dr. Wanyonyi,

Thank you for submitting your manuscript to PLOS Climate. After careful consideration, we feel that it has merit but does not fully meet PLOS Climate’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please pay particular attention to the new reviewers' comments, as they highlight methodological concerns that must be addressed.

Please submit your revised manuscript by May 01 2026 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at climate@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pclm/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:
• A letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
• A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
• An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.
Kind regards,
Daniel Parkes, PhD
Staff Editor
PLOS Climate

Journal Requirements:
If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.

Additional Editor Comments (if provided):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #1: All comments have been addressed
Reviewer #3: All comments have been addressed
Reviewer #4: (No Response)

2. Does this manuscript meet PLOS Climate’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.
Reviewer #1: Partly
Reviewer #3: No
Reviewer #4: Yes

3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #3: No
Reviewer #4: Yes

4. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes
Reviewer #3: Yes
Reviewer #4: Yes

5. Is the manuscript presented in an intelligible fashion and written in standard English?
Reviewer #1: Yes
Reviewer #3: Yes
Reviewer #4: Yes

Reviewer #1: Dear authors, I had the pleasure of reading your work. I will analyze it chapter by chapter.

Abstract and work: Initially, it seems to be solid work with considerable merit, using real large-scale data in the particular case of Africa.

Introduction: It is relevant and well-founded in the literature; however, the authors should pay attention to some writing problems. For example, in the list of “four long-standing weaknesses”, point (4) is truncated and repeats part of (2). “Support Vector Form” should be “Support Vector Machine”. SHAP and PDP are not interpretability models; they are post-hoc explanation tools. I advise the authors to reread and correct these and other minor details.

Materials: The materials section is generally acceptable. However, there seems to be some minor confusion. The authors say “providing a homogeneous and representative basis for predictive modeling,” but they also say that the counties are diverse in altitude, climate, and management. So the basis is not homogeneous, but perhaps heterogeneous?
Class distribution and SMOTE: This section is ambitious and well-informed, but it has methodological problems. For example, the use of SMOTE on spatiotemporal data is problematic: SMOTE is not appropriate for spatiotemporal epidemiological data unless explicitly adapted or justified. The authors should also report whether SMOTE was applied after the training-test split, whether it was reapplied in each fold of the cross-validation, and whether it was used only for the ML models or also for the Bayesian model.

Discussion and conclusion: The Discussion and Conclusion insist on terms such as “universal”, “generalized to tropical crops”, and “climate-adaptive agriculture broadly”. The problem is that the study is Kenya-specific: there is no external validation and no demonstrated empirical transferability. The same message appears repeatedly in the Discussion (paragraphs 1, 4, 6); please shorten as much as possible, as the article is already very long. The Limitations and Future Work section is the best part, but it does not mention SMOTE, possible observation biases, or residual spatial dependence. It seems defensive on the part of the authors, not critical. I would advise the authors to replace “hybrid integration” with “comparative and complementary use of Bayesian and supervised learning approaches” and to explicitly admit the limitations of SMOTE, the limitations of probabilistic calibration, and the absence of external validation.

Bibliographic references: The references are generally solid, mainstream, and appropriate. Some problems: reference 13 (Castro et al., 2023) is unpublished/a preprint, which is a serious problem. Unpublished work or preprints should not be used as evidence in a Discussion or Conclusion; they are only acceptable if clearly identified as preliminary work. The same applies to reference 18.

Reviewer #3: The manuscript proposes a hybrid Bayesian-machine learning framework for predicting coffee leaf rust incidence in Kenyan smallholder systems. Below are some comments on the paper:

1. The abstract and several results sections claim that Logistic Regression was "the most accurate predictive algorithm." However, Table 6 shows that Random Forest achieved the highest accuracy (0.7843) compared to Logistic Regression (0.7792). What Logistic Regression actually led in was AUC-ROC (0.867). The authors confuse or use interchangeably the terms "accuracy" and "AUC-ROC," which are different metrics. This imprecision carries through from the abstract to the discussion.

2. In Table 2 (logistic regression), "Past Outbreak History (Yes = 1)" has a positive coefficient (β = 0.53, p < .001), indicating that a history of previous outbreaks increases the probability of infection. In Table 4 (Bayesian hierarchical model), the same predictor has a negative coefficient (β = −0.453, HDI [−0.657, −0.237]), indicating the opposite effect. This direct contradiction between the two core models of the study is neither discussed nor explained anywhere in the manuscript.

3. Table 4 includes predictors such as "Management Intensity," "Farm Density," "Canopy Structure," and "Soil Moisture Retention" that do not appear in the "Study Variables" section and are not defined anywhere in the manuscript. There is no explanation of how these variables were constructed or why the Bayesian model uses a different set of predictors than the supervised models, compromising the comparability that the authors claim between both approaches.

4. The authors report that after applying SMOTE, the training distribution consisted of 4,925 instances per class (lines 327–328). If the partition was an 80:20 stratified split over 9,850 observations, the training set would contain approximately 5,036 instances of the majority class (80% of 6,295). SMOTE equalizes the minority class to the level of the majority class, so the expected number would be 5,036 per class, not 4,925.

5. References 18 and 19, both non-peer-reviewed preprints by the first author (Wanyonyi M), deal with coronary heart disease prediction and not with agriculture, coffee, or plant pathology. They are cited to justify the use of SMOTE, a well-established technique with numerous published references in indexed journals.

6. The authors emphasize the use of spatio-temporal cross-validation for the supervised machine learning models, but the Bayesian hierarchical model was evaluated using standard LOO-CV and WAIC (Table 5), which do not account for the spatial structure of the data. Given that spatial dependence is a central theme of the manuscript and motivated the block cross-validation design, evaluating the Bayesian model with techniques that ignore this structure represents a methodological inconsistency.

7. The authors attribute the low performance of ANN (accuracy = 0.7452) to the "lack of a significant amount of features and the lack of a deep learning architecture." This explanation is contradictory: if the features are insufficient, a deeper architecture would not resolve the problem. Furthermore, the authors used a network with a single hidden layer (Equation 6) but do not justify why more complex architectures were not explored, nor do they discuss whether the issue is truly the model's capacity or its regularization.

8. In the "Model Evaluation and Scalability Analysis" section, the authors state that "Calibration was measured using the Brier score." However, the numerical value of the Brier score is not reported in any table or results text for any model. The only calibration assessment presented consists of the calibration curves in Figure 13, which are qualitative.

9. The "Model Scalability and Deployment Feasibility" section compares the eight supervised models in terms of training time, model size, inference latency, and RAM usage. However, the Bayesian hierarchical model, which requires HMC with 4 chains and 2,000 iterations, is likely the most computationally expensive model in the study and is entirely excluded from this analysis.

10. The authors describe a cross-validation strategy with six spatial blocks (counties) and five temporal folds within each block, which would imply 30 training-validation combinations. However, the results presented in Table 6 show a single value per metric per model, not 30 values or averages with standard deviations. No table or figure is presented to demonstrate that this validation strategy was actually implemented, nor is there any discussion of whether performance varies across counties or time periods. The note in Table 6 mentions "held-out test data from stratified cross-validation," which contradicts the description of spatio-temporal block cross-validation.

11. The title, abstract, and Figure 3 present the study as a "hybrid model" that integrates both approaches. However, in the actual implementation, the Bayesian model and the supervised models are trained and evaluated entirely independently. No actual integration mechanism is described, such as stacking, weighted ensemble, transfer of posterior probabilities as features, or joint calibration. Figure 3 includes a block labeled "Hybrid combination (stacking / weighted ensemble)" that has no correspondence with any procedure described in the methods or any result presented. The framework is not hybrid in its execution; it is a parallel comparison of two independent approaches presented under an integration label that never materializes.

Reviewer #4: This manuscript presents a comprehensive and methodologically rigorous framework for predicting coffee leaf rust (Hemileia vastatrix) incidence in Kenyan smallholder farming systems.
The authors integrate Bayesian hierarchical logistic regression with multiple supervised machine learning algorithms and modern interpretability tools (partial dependence and SHAP analyses). The study addresses an important and timely problem at the intersection of plant disease epidemiology, climate risk, and data driven decision support for smallholder agriculture. Major Strength of the study is the high-quality dataset and validation strategy: The longitudinal dataset (9,850 plot time observations across six counties from 2018–2023) is rich for this context. The use of spatio temporal block cross validation appropriately addresses spatial and temporal dependence, strengthening the credibility of the performance estimates. Major Comments 1. Although the manuscript is positioned as a hybrid Bayesian–machine learning framework, the results consistently show that logistic regression performs as well as or better than more complex machine learning methods across discrimination, calibration, and computational efficiency. This is an important and interesting finding, but the framing could be clarified. It remains somewhat ambiguous whether the hybrid framework is intended as an integrated predictive system or primarily as a comparative modelling exercise. The authors should more explicitly state that one of the key insights of the study is that interpretable models can outperform or match complex learners in this application. This could be emphasized more clearly in the Abstract, Results synthesis and Conclusion. 2. The manuscript is thorough, but several sections, particularly within the results are overly detailed and repetitive. Numerical values presented in tables are frequently restated in the text. Some figure descriptions repeat information already conveyed clearly elsewhere without adding interpretive value. Condense descriptive sections and focus the narrative more strongly on interpretation, comparison, and implications. 
This would improve readability without sacrificing scientific rigor.

Minor Comments

1. Some sentences, particularly in the Introduction and Discussion, are long and could be simplified.

2. Minor typographical and stylistic inconsistencies remain and should be corrected during final editing.

3. A brief clarification of why higher resolution measurements were unavailable would further strengthen the limitations section.

Figures

4. The number of figures is high; the authors may wish to consider moving some descriptive figures to supplementary material.

******

If published, this will include your full peer review and any attached files. Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public. For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #3: No

Reviewer #4: No

******

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

https://doi.org/10.1371/journal.pclm.0000754.r003
Revision 2
27 Mar 2026 Author Response Attachments Attachment Submitted filename: Response to Reviewers.pdf https://doi.org/10.1371/journal.pclm.0000754.r004
1 Apr 2026 Decision Letter - Girma Gezimu Gebre, Editor

Comparative and Complementary Use of Bayesian Inference and Supervised Learning for Predictive Modeling of Coffee Rust Incidence Among Kenyan Smallholder Farmers

PCLM-D-25-00394R2

Dear Dr. Wanyonyi,

We are pleased to inform you that your manuscript 'Comparative and Complementary Use of Bayesian Inference and Supervised Learning for Predictive Modeling of Coffee Rust Incidence Among Kenyan Smallholder Farmers' has been provisionally accepted for publication in PLOS Climate.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow-up email from a member of our team. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they'll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact climate@plos.org.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Climate.
Best regards,

Girma Gezimu Gebre, PhD
Academic Editor
PLOS Climate

******

Additional Editor Comments (if provided):

Reviewer Comments (if any, and for reference):

Reviewer's Responses to Questions

Comments to the Author

Reviewer #1: All comments have been addressed

Reviewer #4: All comments have been addressed

******

Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #1: Yes

Reviewer #4: Yes

******

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #4: Yes

******

4. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes

Reviewer #4: Yes

******

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #4: Yes

******

Reviewer #1: In the future, the authors should respond, topic by topic, to each of the reviewers. The authors limited themselves to submitting a new paper with the overall result of 3 revisions, which makes it very difficult to evaluate the article again, especially after so much time has passed since the first submission. Even so, I believe that the article meets the minimum requirements for publication.

Reviewer #4: (No Response)

******

If published, this will include your full peer review and any attached files. Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public. For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #4: No

******

https://doi.org/10.1371/journal.pclm.0000754.r005

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.