Imaging strategies for follow-up during adjuvant nivolumab in esophageal cancer: A multicenter retrospective cohort study

Tamara J. Huizer; Anniek Strijdhorst; Laurens V. Beerepoot; Mark I. van Berge Henegouwen; Leni van Doorn; Sarah Derks; Bas P. L. Wijnhoven; Bianca Mostert; Hanneke W. M. van Laarhoven

doi:10.1371/journal.pone.0350105

Peer Review History

Original Submission: December 3, 2025
20 Feb 2026 Decision Letter - Zhanzhan Li, Editor
PONE-D-25-61933
Optimizing imaging strategies for adjuvant nivolumab in esophageal cancer: the planning of scanning
PLOS One
Dear Dr. Strijdhorst, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please submit your revised manuscript by Apr 06 2026 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript: A letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. 
For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Zhanzhan Li Academic Editor PLOS One Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. Thank you for stating the following in the Competing Interests section: “The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Disclaimer: TJH and AS have no discolures. HvL reports: Research funding and/or medication supply: Amphera, Anocca, Astellas, AstraZeneca, Beigene, Boehringer, BMS,Daiichy-Sankyo, Dragonfly, MSD, Myeloid, ORCA, Servier; Consultant/advisory role: Auristone, Incyte, Merck, Myeloid, Servier; Speaker role: Astellas, Beigene, Benecke, BMS, Daiichy-Sankyo, JAAP, Medtalks, Novartis, Springer, and Travel Congress Management B.V. BM reports: Research funding and/or medication supply: BMS, Pfizer; Consultant/advisory role: Lilly, AstraZeneca; Speaker role: Servier, BMS, Amgen. 
SD reports: a consultant or advisory role for BMS (related to checkpoint inhibitors); research funding, medication supply, or both from Incyte (related to checkpoint inhibitors); and speaker roles for Servier, BMS, and Benecke. LVB reports: speaker role: Medtalks, BMS, Servier, Travel Congress Management. BW reports: Research funding BMS, consulting and speaker fee Medtronic, speaker role Travel Congress Management B.V. MvBH declares consultancies for Johnson and Johnson, Stryker, BBraun Intuitive and Medtronic. All fees and grants paid to institution.” Please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials, by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared. Please include your updated Competing Interests statement in your cover letter; we will change the online submission form on your behalf. 3. We note that your Data Availability Statement is currently as follows: “All relevant data are within the manuscript and its Supporting Information files.” Please confirm at this time whether or not your submission contains all raw data required to replicate the results of your study. Authors must share the “minimal data set” for their submission. PLOS defines the minimal data set to consist of the data required to replicate all study findings reported in the article, as well as related metadata and methods (https://journals.plos.org/plosone/s/data-availability#loc-minimal-data-set-definition). 
For example, authors should submit the following data: - The values behind the means, standard deviations and other measures reported; - The values used to build graphs; - The points extracted from images for analysis. Authors do not need to submit their entire data set if only a portion of the data was used in the reported study. If your submission does not contain these data, please either upload them as Supporting Information files or deposit them to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of recommended repositories, please see https://journals.plos.org/plosone/s/recommended-repositories. If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent. If data are owned by a third party, please indicate how others may request data access. 4. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please delete it from any other section. 5. If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.
Reviewers' comments: Reviewer's Responses to Questions. Comments to the Author
1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Partly; Reviewer #2: Yes; Reviewer #3: No
2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes; Reviewer #2: Yes; Reviewer #3: No
3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g., participant privacy or use of data from a third party), those must be specified. Reviewer #1: No; Reviewer #2: Yes; Reviewer #3: No
4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes; Reviewer #2: Yes; Reviewer #3: Yes
5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. 
(Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: Dear Authors, Thank you for submitting this manuscript evaluating CT imaging strategies during adjuvant nivolumab treatment for esophageal cancer. This study addresses a clinically relevant question that has practical implications for patient follow-up after curative-intent treatment. The multi-center design and real-world data complement the pivotal CheckMate 577 trial findings. I have carefully reviewed your manuscript against PLOS ONE publication criteria. Below I provide detailed feedback organized by major issues, minor issues, and specific recommendations for improvement. MAJOR ISSUES MAJOR ISSUE 1: DATA AVAILABILITY STATEMENT CONTRADICTION There is a contradiction between your submission form responses and the manuscript text regarding data availability: Submission Form (Page 4-5): - "Yes - all data are fully available without restriction" - "All relevant data are within the manuscript and its Supporting Information files." Manuscript Text (Lines 283-286): - "Due to the nature of the study, in which exception consent was obtained from participants, the data used in this research cannot be made publicly available. Data sharing is not permitted as per the consent agreement." RECOMMENDATION: Option A - If data truly cannot be shared publicly: 1. Correct the submission form by selecting "No" for unrestricted availability 2. Revise the Data Availability Statement to read: "Data cannot be shared publicly due to ethical restrictions related to participant consent. The ethics approval (MEC2023-0631) specified that individual-level data would not be publicly released. Requests for de-identified, aggregated data may be directed to the corresponding author (Anniek Strijdhorst, [email]) and will require approval from the Medical Research Ethical Committee of Erasmus University Medical Center Rotterdam." 3. 
Specify what data CAN be shared (e.g., aggregated statistics, additional subgroup analyses upon request) Option B - If data can be made available: 1. Prepare a de-identified dataset removing all patient identifiers 2. Deposit in a public repository (e.g., Dryad, Figshare, or institutional repository) 3. Include the DOI or accession number in your Data Availability Statement 4. Ensure the manuscript text matches the submission form I recommend Option A given your ethics approval conditions, but ensure all statements are consistent throughout the submission. MAJOR ISSUE 2: CONCLUSIONS EXCEED WHAT THE DATA CAN SUPPORT The manuscript title, abstract, and conclusions make claims that exceed what a retrospective, non-randomized observational study can establish: Title: "Optimizing imaging strategies..." Abstract (Lines 46-47): "Routine CT imaging every 4 months, starting at 4 months after surgery, effectively detects recurrences during adjuvant nivolumab treatment while reducing unnecessary imaging." Conclusion (Lines 268-274): Implies the 4-month interval is the recommended or optimal strategy. However, your data show that both imaging strategies produced equivalent outcomes: Q3M group (n=67): - 12 patients (17.9%) developed recurrence - 8 of 12 recurrences (66.7%) detected by routine CT - Diagnostic yield: 5% at 3 months, 8% at 6 months, 3% at 9 months Q4M group (n=84): - 15 patients (17.9%) developed recurrence - 10 of 15 recurrences (66.7%) detected by routine CT - Diagnostic yield: 9% at 4 months, 8% at 8 months Both strategies showed nearly identical recurrence rates (~18%) and detection rates (~67%). Your study design cannot determine which is superior because: 1. Patients were not randomized to imaging intervals 2. Institutional practice determined interval (potential confounding) 3. Sample sizes preclude detecting meaningful differences 4. No formal statistical comparison between groups was performed RECOMMENDATION: 1. 
Revise the title to: "Characterizing imaging strategies for adjuvant nivolumab in esophageal cancer: a multi-center retrospective cohort study" OR "Imaging follow-up patterns during adjuvant nivolumab for esophageal cancer: real-world data from three Dutch centers" 2. Revise the abstract conclusion (Lines 46-51) to: "In this retrospective cohort, routine CT imaging at 4-month intervals detected the majority of on-treatment recurrences. The gradual decline in disease-free survival suggests that recurrences are distributed over time rather than clustered at specific time points. These real-world data may help inform, but cannot definitively establish, optimal follow-up intervals. Prospective studies comparing imaging strategies are needed." 3. Revise the Discussion conclusion (Lines 268-279) to explicitly state: "This observational study cannot determine the optimal imaging interval. Our findings describe current practice patterns and outcomes, which may inform clinical decision-making pending prospective comparative studies." 4. Add a statement acknowledging that both Q3M and Q4M showed equivalent outcomes in your cohort (12/67 vs 15/84 recurrences; 8/12 vs 10/15 detected by routine CT), and that selection of imaging interval should consider institutional resources, patient preferences, and clinical judgment rather than presumed superiority of one approach. MAJOR ISSUE 3: MULTIPLE DATA DISCREPANCIES Several numerical inconsistencies were identified that must be corrected: DISCREPANCY 1 - Patient Age: Line 160: "The nivolumab cohort included 151 patients, with a median age of 69 years (IQR, 60-72)..." Table 1: "Median age (range) - year: 66 [60-72]" The IQR values match (60-72), but the median differs by 3 years (69 vs 66). 
DISCREPANCY 2 - Lymph Node Status P-Value: Line 193-194: "both the early and late recurrence groups had a higher proportion of patients with ≥N1 disease post-surgery (p = 0.039)" Table 4: Shows p = 0.021 for "Pathological lymph-node status post-surgery" These p-values differ substantially (0.039 vs 0.021) for what appears to be the same analysis. I suspect Table 1 (66 years) is correct for age based on the IQR, and Table 4 (p=0.021) may be correct for the lymph node analysis, with the text containing typographical errors - but please verify from original data. MAJOR ISSUE 4: STROBE CHECKLIST NOT PROVIDED Lines 100-101 state: "Data collection and reporting followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines." However, no STROBE checklist is included with the submission. PLOS ONE recommends (and for some study types requires) completed reporting checklists as supplementary material. RECOMMENDATION: 1. Complete the STROBE checklist for cohort studies (22 items) Available at: https://www.strobe-statement.org/checklists/ 2. For each item, indicate: - The page/line number where the item is addressed, OR - "N/A" with brief explanation if not applicable 3. Submit as "S1_STROBE_Checklist.pdf" or similar 4. Add a statement to Methods: "A completed STROBE checklist is provided as Supporting Information (S1 Checklist)." MAJOR ISSUE 5: INCOMPLETE LIMITATIONS DISCUSSION While the Discussion acknowledges some limitations (Lines 228-231, 238-243), several important limitations are not explicitly addressed: 1. INABILITY TO COMPARE IMAGING STRATEGIES: The non-randomized design means institutional preference determined imaging interval, not random assignment. Any differences (or lack thereof) could reflect selection bias or confounding rather than true equivalence. 2. LEAD-TIME BIAS: Earlier or more frequent detection of recurrence through imaging may not translate to improved outcomes. 
Patients detected earlier appear to survive longer from diagnosis simply because diagnosis occurred earlier, not because of any true survival benefit. 3. DETECTION BIAS: The 67% detection rate by routine CT may be influenced by the timing of scans. Some "symptom-driven" detections may have been caught on the next routine scan regardless. 4. LIMITED GENERALIZABILITY: The cohort is predominantly male (82%, 124/151), adenocarcinoma (86%, 130/151), from Dutch academic centers. Results may not generalize to other populations, histologies, or healthcare settings. 5. EVOLVING TREATMENT LANDSCAPE: With FLOT becoming standard for adenocarcinoma, the population eligible for nCRT + adjuvant nivolumab is changing, potentially limiting future applicability. MINOR ISSUES MINOR ISSUE 1: TYPOGRAPHICAL AND FORMATTING ERRORS Location: Line 94-95 Issue: Awkward sentence structure Current: "...an incomplete pathological response and received at least one cycle of adjuvant nivolumab." Suggested revision: "...incomplete pathological response, and who received at least one cycle of adjuvant nivolumab." 
Location: Throughout Issue: Inconsistent spacing around punctuation and parentheses (For example line 66, 72 and 95) Recommendation: Careful proofreading for formatting consistency Cite line 68-69 MINOR ISSUE 2: PD-L1 ANALYSIS - CONFUSING PRESENTATION AND INTERPRETATION The PD-L1 subgroup analysis in Table 4 has confusing presentation: Table 4 shows for "PD-L1 CPS <5": - Early recurrence (0-6 mo): 7/11 (33) - Late recurrence (7-12 mo): 1/1 (17) - No recurrence: 8/29 (7) - P = 0.001 The format "7/11 (33)" appears to show: - Numerator/Denominator with available PD-L1 data - Percentage in parentheses refers to % of TOTAL group (7/21 = 33%) This is confusing because: - 7/11 = 63.6% of patients WITH PD-L1 data had CPS <5 - But (33) shows 33% of the total early recurrence group (7/21) - Similarly, 8/29 = 27.6% but (7) shows 8/124 = 6.5% The more clinically meaningful comparison is among patients WITH available PD-L1 data: - Early recurrence: 7/11 (63.6%) had CPS <5 - No recurrence: 8/29 (27.6%) had CPS <5 Additionally, this is based on very small numbers (only 41 patients had PD-L1 data available per Line 195), making the p=0.001 finding potentially unreliable. RECOMMENDATION: 1. Clarify the Table 4 presentation - specify whether percentages refer to: - Proportion of those with available PD-L1 data, OR - Proportion of the total group 2. Consider presenting both: "7/11 (63.6% of those tested; 33.3% of early recurrence group)" 3. More explicitly frame as hypothesis-generating in the text (Lines 195-196): "In an exploratory analysis limited by small sample size (n=41 with available PD-L1 data), patients with early recurrence were more likely to have CPS <5 (7/11, 63.6%) compared to those without recurrence (8/29, 27.6%), p=0.001. This finding should be interpreted with caution given the small numbers and requires validation in larger cohorts." 
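The two denominators discussed under Minor Issue 2 can be made explicit with a short calculation. The sketch below is illustrative only: it uses the counts quoted above for Table 4 (7 of 11 tested among 21 early-recurrence patients; 8 of 29 tested among 124 patients without recurrence), and the function name `pdl1_percentages` is hypothetical rather than anything taken from the manuscript.

```python
def pdl1_percentages(cps_low, tested, group_total):
    """Return (% of tested patients, % of the whole group) with PD-L1 CPS <5."""
    return (100 * cps_low / tested, 100 * cps_low / group_total)


# Counts as quoted in the review of Table 4: (CPS <5, tested, group size)
groups = {
    "Early recurrence (0-6 mo)": (7, 11, 21),
    "No recurrence": (8, 29, 124),
}

for name, (cps_low, tested, total) in groups.items():
    of_tested, of_group = pdl1_percentages(cps_low, tested, total)
    print(
        f"{name}: {cps_low}/{tested} = {of_tested:.1f}% of tested; "
        f"{cps_low}/{total} = {of_group:.1f}% of group"
    )
```

Reporting both percentages side by side, as in the second recommendation above, removes the ambiguity about which denominator a value in parentheses refers to.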
MINOR ISSUE 3: CT READING METHODOLOGY NOT SPECIFIED The manuscript does not describe how CT scans were interpreted: - Single radiologist or consensus reading? - Were readers blinded to clinical information? - Were standardized criteria used for defining recurrence on imaging? RECOMMENDATION: Add to Methods section (around Line 121-125): "CT scans were interpreted by [board-certified radiologists / the treating institution's radiology department] according to standard clinical practice. Recurrence was defined as [new lesions suspicious for malignancy / RECIST criteria / clinical judgment]. [Readers were/were not blinded to clinical symptoms at time of interpretation.]" MINOR ISSUE 4: MISSING COMPARISON TABLE FOR SELECTION BIAS ASSESSMENT Of 229 eligible patients, only 151 (66%) received nivolumab. The 78 patients who did not receive treatment may differ systematically from treated patients. Understanding these differences is important for assessing selection bias. The reasons for not starting treatment are listed (Lines 155-157): - Poor performance status: n=22 - Patient preference: n=33 - Contraindications: n=5 - Post-operative death: n=7 - Pre-treatment recurrence: n=11 However, baseline characteristics of these patients are not compared to treated patients. RECOMMENDATION: Add a supplementary table (S1 Table) comparing baseline characteristics between: - Patients who received nivolumab (n=151) - Eligible patients who did not receive nivolumab (n=78) Include: age, sex, histology, stage, pathological response, performance status, and reason for not receiving treatment. This allows readers to assess whether the treated cohort is representative of the broader eligible population. MINOR ISSUE 5: TIMING CONVENTION INCONSISTENCY Lines 117-118 define DFS from "start of adjuvant nivolumab," but some analyses and discussion reference time from surgery. This can cause confusion when interpreting timing of events. 
For example: - DFS is from nivolumab start (Line 117) - But nivolumab starts ≤16 weeks after surgery (Line 117-118) - Table 3 shows "Time (months)" without specifying reference point - Discussion mentions "4 months after surgery" (Line 46) RECOMMENDATION: 1. Clearly state the reference time point (surgery vs. nivolumab start) for each analysis 2. Consider adding a note that nivolumab was initiated within 16 weeks of surgery (per protocol), so timepoints approximately align 3. In Table 3 header, specify: "Time (months from nivolumab initiation)" 4. In abstract/conclusions, clarify whether "4 months after surgery" means 4 months from surgery or 4 months from nivolumab start MINOR ISSUE 6: CONFIDENCE INTERVALS FOR KEY ESTIMATES Key percentages are reported without confidence intervals: - On-treatment recurrence: 27/151 (18%) - Detection by routine CT: 18/27 (67%) - 12-month DFS: 75% This makes it difficult to assess precision of estimates, which is particularly important given the relatively small sample size. RECOMMENDATION: Add 95% confidence intervals for: - On-treatment recurrence rate: 27/151 = 17.9% (95% CI: 12.1-24.9%) - Detection by routine CT: 18/27 = 66.7% (95% CI: 46.0-83.5%) - DFS at 4, 8, and 12 months (from Kaplan-Meier analysis) MINOR ISSUE 7: REFERENCE TO UNPUBLISHED DATA Lines 68-72 reference "Updated results from CheckMate 577, presented at the 2025 ASCO Annual Meeting" with overall survival data. Reference 12 cites this as a conference abstract. RECOMMENDATION: This is acceptable for providing context, but add a statement noting: "These data have been presented in abstract form only and have not yet undergone peer review. Final conclusions regarding overall survival benefit should await full publication." 
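For the confidence intervals requested under Minor Issue 6, a minimal sketch of one way to compute them is given here. The review does not state which interval method produced the quoted values; this sketch assumes the exact (Clopper-Pearson) binomial interval, and the helper names are hypothetical. It uses only the Python standard library.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p). Adequate for small n such as 151."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, iters=60):
    """Root of an increasing function f on [lo, hi] with f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""
    # Lower bound: the p at which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha/2.
    upper = 1.0 if k == n else _bisect(
        lambda p: alpha / 2 - binom_cdf(k, n, p)
    )
    return lower, upper


# Counts quoted in the review: on-treatment recurrence and detection by routine CT
for label, k, n in [("On-treatment recurrence", 27, 151),
                    ("Detected by routine CT", 18, 27)]:
    lo, hi = clopper_pearson(k, n)
    print(f"{label}: {k}/{n} = {100 * k / n:.1f}% "
          f"(95% CI {100 * lo:.1f}-{100 * hi:.1f}%)")
```

The output can be compared against the intervals quoted above. Other methods (for example Wilson) would give slightly different bounds, so the manuscript should name the method it uses.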
MINOR ISSUE 8: DIAGNOSTIC YIELD CALCULATION CLARIFICATION The diagnostic yield calculations (Lines 180-185) could be clearer: Q3M group: - 5% (3/62) at 3 months - 8% (4/51) at 6 months - 3% (1/32) at 9 months - 17% (3/18) at 12 months Q4M group: - 9% (6/68) at 4 months - 8% (4/49) at 8 months The denominators (62, 51, 32, 18, 68, 49) represent patients who had CT scans at each timepoint, not the total cohort. This is appropriate but could be stated more explicitly. RECOMMENDATION: Add clarification: "Diagnostic yield was calculated as the number of recurrences detected divided by the number of patients who underwent routine CT scanning at each timepoint. Denominators decrease over time due to prior recurrence, treatment discontinuation, or study end." QUESTIONS REQUIRING CLARIFICATION 1. PROTOCOL DEVIATIONS: Were there instances where patients deviated from their assigned imaging protocol (e.g., Q3M patient receiving scan at 4 months due to scheduling)? If so, how were these handled in the analysis? 2. SYMPTOM-DRIVEN VS. ROUTINE SCANS: How were "symptom-driven" scans (Table 2, n=3, 11%) distinguished from routine scans in your data collection? Was this based on documented clinical indication, or timing relative to protocol? 3. PRE-TREATMENT RECURRENCES: For the 11 patients with symptomatic recurrence before starting nivolumab, what was the median time from surgery to recurrence detection? This information would help contextualize the potential value of baseline imaging. 4. LOSS TO FOLLOW-UP: Were any patients lost to follow-up during the study period? If so, how many and how were they handled in survival analyses? 5. IMAGING PROTOCOLS: Were there differences in CT protocols between institutions (e.g., contrast, slice thickness, body regions imaged)? Could this affect detection sensitivity? 6. P-VALUE DISCREPANCY: Please clarify the correct p-value for the lymph node status analysis - is it 0.039 (Line 193-194) or 0.021 (Table 4)? 7. 
VERIFICATION: Given the multiple numerical discrepancies identified, can you confirm that all statistics have been independently verified against the source database? STRENGTHS OF THE MANUSCRIPT I want to acknowledge several strengths of this work: 1. CLINICAL RELEVANCE: This study addresses a genuine gap in clinical knowledge. The CheckMate 577 trial established efficacy but did not define optimal surveillance strategies. Clinicians need guidance on follow-up imaging, and real-world data are valuable. 2. MULTI-CENTER DESIGN: Including three hospitals (two academic, one teaching) enhances generalizability beyond single-institution experience. The variation in imaging protocols between centers (Q3M vs Q4M) provides a natural comparison, albeit non-randomized. 3. REAL-WORLD DATA: Complementing randomized trial data with real-world evidence is valuable for understanding how treatments perform in routine clinical practice outside controlled trial conditions. 4. TRANSPARENT REPORTING: The authors are forthcoming about limitations, including small sample size for subgroup analyses and lack of formal sample size calculation. 5. APPROPRIATE METHODS: The statistical approach (Kaplan-Meier, descriptive statistics) is appropriate for the study design and research questions. 6. CLINICAL CONTEXT: The Discussion appropriately situates findings within the evolving treatment landscape (FLOT, updated CheckMate 577 data). 7. PRACTICAL IMPLICATIONS: The finding that recurrences are distributed gradually over time (rather than clustered at a specific timepoint) has practical implications for follow-up scheduling. 8. COMPREHENSIVE DATA COLLECTION: The study collected relevant clinical variables including tumor characteristics, treatment details, imaging timing, and outcomes, enabling meaningful descriptive analysis. 
FINAL RECOMMENDATION RECOMMENDATION: MAJOR REVISION This manuscript addresses an important clinical question and provides useful real-world data on imaging surveillance during adjuvant nivolumab therapy for esophageal cancer. The multi-center design and appropriate methodology are strengths. However, several issues must be addressed before publication: CRITICAL (must be corrected): - Resolve the data availability statement contradiction (Major Issue 1) - Correct all numerical discrepancies - age (69 vs 66) and p-value (0.039 vs 0.021) (Major Issue 3) IMPORTANT (significantly affects interpretation): - Temper conclusions to match observational study design; both Q3M and Q4M showed equivalent outcomes (Major Issue 2) - Provide STROBE checklist as supplementary material (Major Issue 4) - Expand limitations discussion to address lead-time bias, selection bias, and generalizability concerns (Major Issue 5) MINOR (should be addressed but less critical): - Fix typographical errors (Minor Issue 1) - Clarify PD-L1 table presentation (Minor Issue 2) - Add CT reading methodology (Minor Issue 3) - Consider supplementary table comparing treated vs untreated (Minor Issue 4) - Clarify timing conventions (Minor Issue 5) - Add confidence intervals (Minor Issue 6) - Add caveat about unpublished CheckMate 577 OS data (Minor Issue 7) - Clarify diagnostic yield methodology (Minor Issue 8) With appropriate revisions addressing these concerns, this work could make a meaningful contribution to clinical practice by informing (though not definitively establishing) follow-up imaging strategies for patients receiving adjuvant nivolumab after curative-intent treatment for esophageal cancer. I encourage the authors to carefully verify all numerical data, temper their conclusions to match the observational study design, and resubmit. 
Respectfully submitted, Peer Reviewer Reviewer #2: To the authors: Thank you for submitting your manuscript entitled “Optimizing imaging strategies for adjuvant nivolumab in esophageal cancer: the planning of scanning,” which evaluates real‑world imaging strategies during adjuvant nivolumab therapy for esophageal or gastroesophageal junction cancer following neoadjuvant chemoradiotherapy (nCRT) and R0 resection. By comparing 3‑monthly versus 4‑monthly CT‑based follow‑up schedules and their respective diagnostic yields for recurrence detection, the study suggests that a 4‑monthly scanning interval may balance the need for timely detection with the aims of minimizing radiation exposure and controlling healthcare costs. Although this study is of considerable interest, there are several issues I would like to raise prior to publication. 1. Exclusions for nivolumab: The study reports that 74 patients (21%) with a complete pathological response (ypT0N0) and 20 patients (6%) with microscopic irradical resection (R1) after surgery were excluded from the nivolumab cohort. Could you further clarify the specific follow‑up strategies, including imaging intervals, implemented for these patients who did not receive adjuvant nivolumab? In particular, were follow‑up intervals extended for ypT0N0 patients or shortened for R1 patients, and what were the actual surveillance strategies used in routine practice for these groups? 2. Imaging strategy for high‑risk esophageal cancer patients: The study identifies ≥ypN1 disease and PD‑L1 CPS <5 as factors associated with a higher risk of on‑treatment recurrence, although PD‑L1 data were available only in a subset of patients. Could you clarify whether the proposed strategy of “routine CT imaging every 4 months” has been specifically evaluated in these higher‑risk subgroups? Alternatively, should a more frequent interval, such as a Q3M strategy, be considered for such patients to ensure sufficiently timely detection of recurrence? 3. 
Comparison of Q4M vs Q3M interval strategy based on the diagnostic yield of CT scans: The study reports that, in the Q3M group, the diagnostic yield for recurrence detection was 5% at 3 months, 8% at 6 months, 3% at 9 months, and 17% at 12 months. In the Q4M group, the diagnostic yield was 9% at 4 months and 8% at 8 months, and no routine 12‑month CT scans were performed. To definitively conclude that Q4M is “superior” to Q3M, the following questions need to be addressed: 3‑1. Rationale for the absence of 12‑month CT scans in the Q4M group: Please provide a clear explanation for why routine CT scans were not performed at 12 months for patients in the Q4M group. Understanding the rationale for this omission is important for a complete comparison of the longer‑term performance of the two imaging strategies. 3‑2. Statistical significance of differences in diagnostic yield: Given the observed diagnostic yields in the Q3M and Q4M groups, was any statistical analysis performed to assess whether there is a significant difference in recurrence detection rates between these schedules at comparable time points? A formal statistical comparison would strengthen any claims regarding the relative performance of the two strategies. 3‑3. Impact of delayed detection on patient outcomes: The Q4M schedule inherently introduces an approximately one‑month delay in the first routine evaluation (3 vs 4 months) and a two‑month delay in the second evaluation (6 vs 8 months) compared with the Q3M schedule. Could you discuss whether such delays might adversely affect patients’ chances for potentially curative or salvage therapy, or otherwise worsen prognosis? Timely identification of recurrence is important to initiate salvage treatment as soon as possible. Therefore, a thorough discussion of the clinical implications of these detection delays is warranted. 
Reviewer #3: Conceptual issue: The paper considers two protocols, Q3M which has CAT scans at 3, 6, 9 and 12 months, and Q4M which has CAT scans at 4 and 8 months. The authors would like to reach some conclusion about whether Q4M is competitive with Q3M for detection of relapse. Unfortunately I think this is close to impossible due to the limitations of the data. There is no shared timepoint between the two protocols, so at any time where they could be compared, one has been scanned more recently than the other. Relapses which are not caught by scan are sometimes caught because they become symptomatic, so that later scans are not done at all, introducing further complication. The authors divide detected recurrences into early (0-6 months) and late (7-12 months). However, since Q4M cannot detect recurrences in months 5 or 6 until month 8, an identical recurrence will be "early" in Q3M and "late" in Q4M. And the "late" category does not include months 9-12 for Q4M at all as those can never be caught by scanning, since no 12 month scan is done. Perhaps there is a sophisticated type of statistical analysis which could work here. This reviewer is not an expert in trial statistics. However, no such analysis has been attempted. The only hope I see of getting any answer to the relative effectiveness of Q3M vs Q4M is Monte Carlo simulation where various assumptions can be made about the underlying distribution of events, they can be subjected to Q3M and Q4M, and the resulting simulated data can be compared to the real data. The results of the study may be worth presenting anyway--there is a good deal of information here. But I am very uncomfortable with almost all conclusions drawn by the authors. Certainly the statement that Q4M seems adequate is not based in the data. Presentational issues: (1) There are discrepancies both within the front matter (likely generated by the submission website) and between front matter and paper. These need to be checked and corrected. 
Specifically: Front matter: states no funding Paper: lists funding sources Front matter: answers "yes" to "are data available?"; instructions specifically state that "by request" should be treated as a "no" answer and explained Paper: states that data are only available by request Both paper and front matter state that there are no known competing financial interests, and then go on to list, in detail, potential competing interests. I don't understand this: it appears contradictory. (2) The section "Cohort selection" is not sufficiently clear. I strongly recommend making a flowchart. Classes of patients are mentioned without saying whether they were included or excluded. For example, the text mentions 18 patients who were in other trials, and 74 patients who had a complete pathological response. I think the 18 were included and the 74 were excluded but this is not stated. It is also unclear throughout this section what the denominator of the given percentages is. For example, 11 patients had recurrence before adjuvant treatment: this is given as 5%, but 5% of what? Total patients? Eligible patients? Why are these 11 separated from "the main reasons for not starting adjuvant treatment"?--isn't recurrence a reason for not starting adjuvant treatment? The trial protocol does a better job with this; I recommend basing the publication version on the protocol. (3) A number of abbreviations are used without explanation: someone deeply involved with this specific cancer will know them, but cancer researchers from other subdisciplines may not, and general readers definitely will not. In one case two different unexplained abbreviations turn out to refer to the same thing. Minor points: p. 8 "as advocated in the Checkmate 577" Strange choice of words: is "mandate" or "recommend" intended? p. 295 "discolures" for "disclosures" p. 10 clarify whether CPS was based on specimen analysis previously carried out by others (who? when? with what criteria?) 
or by the current researchers. p. 10 Some patients apparently had recurrence before their first dose of nivolumab (and were therefore not dosed). What was the DFS value for these patients? Zero? Or were they excluded from DFS calculation? p. 11 "were summarized as means with standard deviations or medians with interquartile ranges" -- I am concerned this represents "experimenter degrees of freedom." Why were two different approaches used? p. 11 "real-world cohort and the CheckMate 577 trial" I would be comfortable contrasting simulated data with real-world data, but this retrospective trial and the prospective CheckMate were both done in the real world with real patients. I recommend "retrospective" or simply "this study." ****** 6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Reviewer #3: No ******** [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] To ensure your figures meet our technical requirements, please review our figure guidelines: https://journals.plos.org/plosone/s/figures You may also use PLOS’s free figure tool, NAAS, to help you prepare publication quality figures: https://journals.plos.org/plosone/s/figures#loc-tools-for-figure-preparation. NAAS will assess whether your figures meet our technical requirements by comparing each figure against our figure specifications. 
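Editorial illustration of Reviewer #3's Monte Carlo suggestion: the sketch below is not an analysis performed by the authors or the reviewers. It assumes, purely for illustration, that a recurrence becomes visible on CT at a time drawn uniformly over months 0 to 12, that an undetected recurrence becomes symptomatic after a fixed lag, and that every routine scan is perfectly sensitive. The function names, the uniform onset distribution, and the 2.5-month symptom lag are hypothetical placeholders and do not come from the manuscript.

```python
import random

Q3M = (3, 6, 9, 12)  # routine scan months in the 3-monthly protocol
Q4M = (4, 8)         # routine scan months in the 4-monthly protocol (no 12-month scan)


def detect(onset, schedule, symptom_lag):
    """Return (month_of_detection, mode) for one simulated recurrence.

    onset: month at which the recurrence becomes visible on CT (assumed).
    symptom_lag: months from onset until symptoms prompt a work-up (assumed).
    """
    symptomatic_at = onset + symptom_lag
    # First routine scan at or after onset, if the schedule has one left.
    next_scan = min((m for m in schedule if m >= onset), default=None)
    if next_scan is not None and next_scan <= symptomatic_at:
        return next_scan, "routine"
    return symptomatic_at, "symptoms"


def simulate(schedule, n=100_000, symptom_lag=2.5, seed=0):
    """Return (share detected by routine CT, mean delay from onset in months)."""
    rng = random.Random(seed)
    routine = 0
    total_delay = 0.0
    for _ in range(n):
        onset = rng.uniform(0.0, 12.0)  # hypothetical onset distribution
        when, mode = detect(onset, schedule, symptom_lag)
        routine += mode == "routine"
        total_delay += when - onset
    return routine / n, total_delay / n


if __name__ == "__main__":
    for name, schedule in (("Q3M", Q3M), ("Q4M", Q4M)):
        share, delay = simulate(schedule)
        print(f"{name}: routine-CT share {share:.2f}, mean delay {delay:.2f} months")
```

The output of such a toy model is driven entirely by the assumed onset distribution and symptom lag. Comparing simulated and observed data, as the reviewer proposes, would require replacing those placeholders with distributions justified by the cohort.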
https://doi.org/10.1371/journal.pone.0350105.r001
Revision 1
23 Apr 2026 Author Response Point-to-point response letter _________________________________________ Reviewer 1 _______________________________________ Reviewer #1: Dear Authors, Thank you for submitting this manuscript evaluating CT imaging strategies during adjuvant nivolumab treatment for esophageal cancer. This study addresses a clinically relevant question that has practical implications for patient follow-up after curative-intent treatment. The multi-center design and real-world data complement the pivotal CheckMate 577 trial findings. I have carefully reviewed your manuscript against PLOS ONE publication criteria. Below I provide detailed feedback organized by major issues, minor issues, and specific recommendations for improvement. We would like to thank the Reviewer for the thorough and constructive comments. We have made every effort to address the recommendations and believe that these revisions have improved the manuscript. For clarity and readability, we have combined some points where appropriate. Please find below our responses to the Reviewer’s comments and a summary of the changes made to the manuscript. MAJOR ISSUES MAJOR ISSUE 1: DATA AVAILABILITY STATEMENT CONTRADICTION There is a contradiction between your submission form responses and the manuscript text regarding data availability: Submission Form (Page 4-5) and information files." Manuscript Text (Lines 283-286): "Due to the nature of the study, in which exception consent was obtained from participants, the data used in this research cannot be made publicly available. Data sharing is not permitted as per the consent agreement." Thank you for your feedback regarding the data availability statement. We have updated the statement in the manuscript (page 12) as requested. However, we are unable to change the selection in the submission portal, as we do not have access to this option in the revision phase. "Data cannot be shared publicly due to ethical restrictions related to participant consent. 
The ethics approval (MEC2023-0631) specified that individual-level data would not be publicly released. Requests for de identified, aggregated data (including summary statistics and additional subgroup analyses upon request) may be directed to the Department of Internal Oncology at Erasmus University Medical Center via interne.oncologie@erasmusmc.nl and will require approval from the Medical Research Ethical Committee of Erasmus University Medical Center Rotterdam." MAJOR ISSUE 2: CONCLUSIONS EXCEED WHAT THE DATA CAN SUPPORT The manuscript title, abstract, and conclusions make claims that exceed what a retrospective, non-randomized observational study can establish: Title: "Optimizing imaging strategies..." Abstract (Lines 46-47): "Routine CT imaging every 4 months, starting at 4 months after surgery, effectively detects recurrences during adjuvant nivolumab treatment while reducing unnecessary imaging." Conclusion (Lines 268-274): Implies the 4-month interval is the recommended or optimal strategy. However, your data show that both imaging strategies produced equivalent outcomes: Q3M group (n=67): - 12 patients (17.9%) developed recurrence - 8 of 12 recurrences (66.7%) detected by routine CT - Diagnostic yield: 5% at 3 months, 8% at 6 months, 3% at 9 months Q4M group (n=84): - 15 patients (17.9%) developed recurrence - 10 of 15 recurrences (66.7%) detected by routine CT - Diagnostic yield: 9% at 4 months, 8% at 8 months Both strategies showed nearly identical recurrence rates (~18%) and detection rates (~67%). Your study design cannot determine which is superior because: 1. Patients were not randomized to imaging intervals 2. Institutional practice determined interval (potential confounding) 3. Sample sizes preclude detecting meaningful differences 4. No formal statistical comparison between groups was performed RECOMMENDATION: 1. 
Revise the title to: "Characterizing imaging strategies for adjuvant nivolumab in esophageal cancer: a multi-center retrospective cohort study" OR "Imaging follow-up patterns during adjuvant nivolumab for esophageal cancer: real-world data from three Dutch centers" 2. Revise the abstract conclusion (Lines 46-51) to: "In this retrospective cohort, routine CT imaging at 4-month intervals detected the majority of on-treatment recurrences. The gradual decline in disease-free survival suggests that recurrences are distributed over time rather than clustered at specific time points. These real-world data may help inform, but cannot definitively establish, optimal follow-up intervals. Prospective studies comparing imaging strategies are needed." 3. Revise the Discussion conclusion (Lines 268-279) to explicitly state: "This observational study cannot determine the optimal imaging interval. Our findings describe current practice patterns and outcomes, which may inform clinical decision-making pending prospective comparative studies." 4. Add a statement acknowledging that both Q3M and Q4M showed equivalent outcomes in your cohort (12/67 vs 15/84 recurrences; 8/12 vs 10/15 detected by routine CT), and that selection of imaging interval should consider institutional resources, patient preferences, and clinical judgment rather than presumed superiority of one approach. Thank you for your thoughtful and detailed feedback. We fully agree that, given the retrospective and non-randomized design of our study, we cannot definitively establish the superiority of any imaging regimen. We appreciate your careful consideration of the study’s limitations, including the lack of randomization, potential confounding by institutional practice, and limited sample sizes, which preclude formal statistical comparison between groups. In response, we have revised the manuscript to ensure that our claims are closely aligned with the data and presented in a more descriptive manner. 
Specifically: • The title and abstract have been updated to reflect the observational nature of the study and to avoid overstating the conclusions as follows: “Imaging strategies for follow-up during adjuvant nivolumab in esophageal cancer: a multicenter retrospective cohort study” • The manuscript now explicitly states that both imaging strategies (Q3M and Q4M) produced nearly identical recurrence and detection rates, and that selection of imaging interval should be guided by institutional resources, patient preferences, and clinical judgment rather than presumed superiority. We revised the conclusion in the abstract as follows: “Routine baseline CT imaging did not detect recurrences, while routine imaging during adjuvant nivolumab identified the majority of recurrences. The gradual decline in disease-free survival suggests that recurrences are evenly distributed over time, supporting a routine imaging interval, such as every 3 or 4 months as used in our study. These real-world data may help inform clinicians, and future studies can further evaluate optimal imaging intervals.” • The discussion and conclusion sections have been revised to clarify that our findings describe real-world practice patterns and outcomes, which may inform clinical decision-making, but cannot determine the optimal imaging interval. We also acknowledge the need for prospective comparative studies to address this question. “In conclusion, this retrospective cohort study showed that 18% of patients with esophageal cancer, previously treated with neoadjuvant chemoradiotherapy and resection, developed recurrent disease during adjuvant treatment with nivolumab. This observational study cannot determine the optimal imaging interval. Our findings describe current practice patterns and outcomes, which may inform clinical decision-making pending prospective comparative studies. 
Routine baseline CT scans appear to have limited utility, as early recurrences were predominantly detected based on clinical symptoms. However, considering the timing of recurrences, follow-up intervals of 3–4 months for early recurrences and 7–8 months for late on-treatment recurrences could be reasonable options. Reducing the number of scans requires careful consideration of recurrence detection, imaging frequency, feasibility, and healthcare costs. At the same time, follow-up imaging remains essential, as many recurrences detected by routine CT scans were asymptomatic. Given the high costs of nivolumab and the potential for severe adverse effects, timely detection of recurrence is crucial to avoid unnecessary exposure to treatment in patients who may not benefit. The choice of routine imaging intervals should be guided by institutional resources, patient preferences, and clinical judgment.” MAJOR ISSUE 3: MULTIPLE DATA DISCREPANCIES Several numerical inconsistencies were identified that must be corrected: DISCREPANCY 1 - Patient Age: Line 160: "The nivolumab cohort included 151 patients, with a median age of 69 years (IQR, 60-72)..." Table 1: "Median age (range) - year: 66 [60-72]". The IQR values match (60-72), but the median differs by 3 years (69 vs 66). DISCREPANCY 2 - Lymph Node Status P-Value: Line 193-194: "both the early and late recurrence groups had a higher proportion of patients with ≥N1 disease post-surgery (p = 0.039)" Table 4: Shows p = 0.021 for "Pathological lymph-node status post-surgery" These p-values differ substantially (0.039 vs 0.021) for what appears to be the same analysis. I suspect Table 1 (66 years) is correct for age based on the IQR, and Table 4 (p=0.021) may be correct for the lymph node analysis, with the text containing typographical errors - but please verify from original data. We apologize for the inconsistencies in the representation of the data. 
We have carefully reviewed the original dataset and corrected the manuscript as follows: • Patient Age: The correct median age is 66 years (IQR 60–72), as shown in Table 1. The text has been updated to reflect this value. • Lymph Node Status P-Value: The correct p-value for pathological lymph-node status post-surgery is 0.021, as indicated in Table 4. The text has been revised accordingly. We have ensured that all numerical values in the manuscript are consistent with the original data. Thank you for bringing these discrepancies to our attention. MAJOR ISSUE 4: STROBE CHECKLIST NOT PROVIDED Lines 100-101 state: "Data collection and reporting followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines." We apologize that the STROBE checklist was not readily found within the submission. It was originally uploaded as Supporting Information, which may have caused it to be overlooked among the supplemental files. To ensure clarity, we have re-uploaded the checklist as supplemental data and revised the manuscript as follows: "Data collection and reporting followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines. (15) A completed STROBE checklist is provided as Supporting Information (S1 Checklist)." MAJOR ISSUE 5: INCOMPLETE LIMITATIONS DISCUSSION While the Discussion acknowledges some limitations (Lines 228-231, 238-243), several important limitations are not explicitly addressed: 1. INABILITY TO COMPARE IMAGING STRATEGIES: The non-randomized design means institutional preference determined imaging interval, not random assignment. Any differences (or lack thereof) could reflect selection bias or confounding rather than true equivalence. 2. LEAD-TIME BIAS: Earlier or more frequent detection of recurrence through imaging may not translate to improved outcomes. 
Patients detected earlier appear to survive longer from diagnosis simply because diagnosis occurred earlier, not because of any true survival benefit. 3. DETECTION BIAS: The 67% detection rate by routine CT may be influenced by the timing of scans. Some "symptom-driven" detections may have been caught on the next routine scan regardless. 4. LIMITED GENERALIZABILITY: The cohort is predominantly male (82%, 124/151), adenocarcinoma (86%, 130/151), from Dutch academic centers. Results may not generalize to other populations, histologies, or healthcare settings. 5. EVOLVING TREATMENT LANDSCAPE: With FLOT becoming standard for adenocarcinoma, the population eligible for nCRT + adjuvant nivolumab is changing, potentially limiting future applicability. Thank you for these valuable suggestions. We have specifically addressed the issues of selection bias, lead-time bias, detection bias, limited generalizability, and the evolving treatment landscape, as suggested by the reviewer. We have extended the limitations section in the manuscript to address the points raised, as follows (page 10): “Several limitations must be considered when interpreting the findings of this study. First, detection bias may have occurred, as some recurrences detected during scheduled imaging were already clinically suspected, potentially overestimating the detection rate in both strategies. Second, the non-randomized design and variability in institutional practices prevented a formal comparison between the Q3M and Q4M imaging strategies, as any differences between groups may reflect confounding factors rather than inherent differences. The generalizability of the results may be limited due to the homogeneity of the cohort, which was predominantly male (82%) and mainly comprised of adenocarcinoma patients (86%) from Dutch centers. These factors may limit the applicability of the findings to other populations, histologies, or healthcare settings. 
Lastly, while early detection may help spare patients from unnecessary treatments and their associated risks, timely detection of recurrence through routine imaging does not necessarily translate into better patient outcomes. Based on our data, we cannot determine whether detecting recurrence 1-2 months earlier with one strategy (Q3M vs Q4M) would lead to improved outcomes, such as greater opportunities for salvage surgery or other curative interventions, especially considering that the majority of recurrences were distant.” MINOR ISSUES MINOR ISSUE 2: PD-L1 ANALYSIS - CONFUSING PRESENTATION AND INTERPRETATION The PD-L1 subgroup analysis in Table 4 has confusing presentation: Table 4 shows for "PD-L1 CPS <5": The more clinically meaningful comparison is among patients WITH available PD-L1 data: - Early recurrence: 7/11 (63.6%) had CPS <5 - No recurrence: 8/29 (27.6%) had CPS <5 Additionally, this is based on very small numbers (only 41 patients had PD-L1 data available per Line 195), making the p=0.001 finding potentially unreliable. RECOMMENDATION: 1. Clarify the Table 4 presentation - specify whether percentages refer to: - Proportion of those with available PD-L1 data, OR Proportion of the total group 2. Consider presenting both: "7/11 (63.6% of those tested; 33.3% of early recurrence group)" 3. More explicitly frame as hypothesis-generating in the text (Lines 195-196): "In an exploratory analysis limited by small sample size (n=41 with available PD-L1 data), patients with early recurrence were more likely to have CPS <5 (7/11, 63.6%) compared to those without recurrence (8/29, 27.6%), p=0.001. This finding should be interpreted with caution given the small numbers and requires validation in larger cohorts." Thank you for this comment. We added these detailed numbers in Table 4, and added the more cautious interpretation to the Discussion as requested. 
MINOR ISSUE 3: CT READING METHODOLOGY NOT SPECIFIED The manuscript does not describe how CT scans were interpreted: Single radiologist or consensus reading? Were readers blinded to clinical information? Were standardized criteria used for defining recurrence on imaging? Thank you for this comment. We agree that the methodology for CT scan interpretation should be clarified. As the CT scans were performed as part of routine clinical care, they were interpreted by board-certified radiologists according to standard clinical practice at each participating center. No standardized research criteria or consensus readings were applied, and readers were not blinded to clinical information. We have clarified this in the revised manuscript as follows (Method section, page 4-5): "CT scans were performed as part of routine clinical care and were retrospectively reviewed for the purposes of this study. Board-certified radi Attachments Attachment Submitted filename: Point-to-point rebuttal_05042026.docx https://doi.org/10.1371/journal.pone.0350105.r002
11 May 2026 Decision Letter - Zhanzhan Li, Editor Imaging strategies for follow-up during adjuvant nivolumab in esophageal cancer: a multicenter retrospective cohort study PONE-D-25-61933R1 Dear Dr. Anniek Strijdhorst, We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at Editorial Manager® and clicking the ‘Update My Information' link at the top of the page. For questions related to billing, please contact billing support. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Kind regards, Zhanzhan Li Academic Editor PLOS One Additional Editor Comments (optional): Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. 
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #1: All comments have been addressed Reviewer #2: All comments have been addressed ******** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Yes ****** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ****** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified. Reviewer #1: Yes Reviewer #2: Yes ****** 5. Is the manuscript presented in an intelligible fashion and written in standard English? 
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ****** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: Dear Authors, Thank you for the substantial and largely responsive revision. The manuscript is significantly improved: the title and conclusions now match the observational design, the prior numerical discrepancies are reconciled, the Methods on CT interpretation and CPS provenance are now specified, the limitations section has been broadened, the new patient-selection flowchart (Figure 1) is a clear addition, and the STROBE checklist is included as Supporting Information. The Reviewer 2 and Reviewer 3 concerns about high-risk subgroups, the Monte Carlo idea, and ambiguous cohort selection have been engaged thoughtfully. Below I summarize the small number of new or residual issues that should be cleaned up before acceptance. 
REMAINING ISSUES (R1) MAJOR ISSUES MAJOR ISSUE 1 (R1): NEW NUMERICAL DISCREPANCY: FIGURE 1 VS RESULTS TEXT PROBLEM: The new flowchart (Figure 1) shows under Exclusion (n=78): - Patients' preference (n=33) - WHO performance status ≥2 (n=22) - Recurrent disease within 12 weeks after surgery (n=10) - Postoperative death (n=8) - Contra indication ICI/other (n=5) Total: 33 + 22 + 10 + 8 + 5 = 78 ✓ But the Results text (line 164-165) states: "The main reasons for not starting adjuvant treatment were poor performance status (n=22), patient preference (n=33), and symptomatic recurrence (n=11) detected before starting adjuvant treatment." And the Discussion (line 244-245) states: "Eleven patients (5%) had symptomatic recurrences diagnosed prior to nivolumab initiation." The rebuttal letter also uses n=11. The flowchart says 10, the text and rebuttal say 11. Likewise, the flowchart shows postoperative death n=8, while the original R0 manuscript and rebuttal cited n=7. Also, the text now lists only 22 + 33 + 11 = 66 patients out of 78 exclusions, leaving 12 unaccounted for in the narrative (the 5 contraindications and 7-8 postoperative deaths are silently dropped). Given that the prior review flagged numerical inconsistencies as a major issue, introducing a new mismatch between Figure 1 and the running text undermines the verification statement provided in the rebuttal. It also affects the median-time-to-recurrence figure (7.9 weeks IQR 4-10) since the denominator differs. SPECIFIC RECOMMENDATION: 1. Reconcile to a single number: verify whether n = 10 or n = 11 patients had pre-nivolumab symptomatic recurrence, and update Figure 1, Results text (line 164), Discussion (line 244), and the rebuttal/flowchart consistently. 2. In the Results narrative, list ALL five exclusion reasons so the totals reconcile to 78. 
Suggested wording: "Reasons for not starting adjuvant treatment were patient preference (n=33), poor performance status (WHO ≥2, n=22), symptomatic recurrence detected within 12 weeks of surgery (n=10), postoperative death (n=8), and contraindications to immune checkpoint inhibitors (n=5)." 3. Re-verify the 7.9-week median timing using whichever denominator is correct. MAJOR ISSUE 2 (R1): MISPLACED 95% CONFIDENCE INTERVAL PROBLEM: Results, line 175-176: "During treatment, 27 (18%) (95% CI: 12-25%) patients developed recurrent disease, predominantly distant metastases (89%). Eighteen patients (12%) (95% CI: 46-84%) were asymptomatic and their recurrence was detected by routine imaging." The CI "46-84%" cannot describe a percentage of 12. It is the CI for 18/27 = 66.7% — i.e., the proportion of recurrences detected by routine CT (which the original review estimated as 95% CI 46.0-83.5%). The CI has been attached to the wrong proportion. SPECIFIC RECOMMENDATION: Rewrite the sentence so the CI is paired with the proportion it describes. Suggested wording: "During treatment, 27/151 patients (18%, 95% CI 12-25%) developed recurrent disease, predominantly distant metastases (24/27, 89%). Of these 27 recurrences, 18 (67%, 95% CI 46-84%) were detected by routine imaging in asymptomatic patients." Note: 18 detected by routine CT corresponds to Table 2 ("Evaluation CT scans 18 (67)"), so the 18/27 framing is correct. FIGURE ANALYSIS F1 (Major) — Figure 1 (flowchart) vs Results text mismatch Already covered under Major Issue 1 (R1). The flowchart itself is well-constructed and the arithmetic within Figure 1 is internally consistent. The mismatch is between the figure and the running text. F2 (Minor) — Figure 1 wording: "Recurrent disease within 12 weeks after surgery" is clearer than "symptomatic recurrence" (used in the text) because it specifies the time window. Recommend the running text adopt the same phrasing. 
F3 (Minor) — Figure 1 abbreviation list: lists "WHO; World Health Organisation" but the flowchart uses "WHO performance status ≥2." Consider clarifying as "WHO performance status, World Health Organization performance status scale" or "ECOG/WHO performance status" so the figure stands alone. F4 (Minor) — Figure 2 (Kaplan-Meier): The curve is appropriate. The number-at-risk panel is included. The 12-month KM-estimated DFS appears to land at ~75% with 95% CI ~67-82% (visual estimate). The text reports 75% (95% CI: 69-83%) — the lower bound 69% is slightly higher than the visual ~67%, but within plotting tolerance. No action needed. MINOR ISSUES MINOR ISSUE 1 (R1): RESIDUAL TYPOGRAPHIC ERRORS - Abstract, line 31 (and page 1 of rebuttal letter): "DFSwas 89%" → "DFS was 89%" (missing space). - Results, line 158: "78% received nCRT with carboplatin/paclitaxel.After surgery" → add a space after the period. - Methods, line 93 (clean MS): the parallelism is still broken — "neoadjuvant treatment with chemoradiotherapy prior to surgery, microscopic radical resection (R0), incomplete pathological response and who received at least one cycle of adjuvant nivolumab." Suggest: "...chemoradiotherapy prior to surgery, achieved microscopic radical resection (R0) with incomplete pathological response, and received at least one cycle of adjuvant nivolumab." - Methods, lines 116-119 (tracked-change residue): in the tracked version "biopies" and "surgerical" appear briefly; the clean version uses "biopsies" and "surgical" — verify the typeset PDF uses the clean spelling. - Discussion, line 251 (clean MS): minor missing space — "...clinical evaluation.Although..." → "clinical evaluation. Although" - Conflict-of-interest section: "Daiichy-Sankyo" appears twice — the brand is "Daiichi Sankyo." Recommend correcting. - Methods, line 136-137: "summarized as mean with standard deviation (SD) when normally distributed,or medians" — missing space after comma. 
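Editorial illustration for Major Issue 2 (R1): the pairing of each confidence interval with its proportion can be checked numerically. The sketch below computes an exact (Clopper-Pearson) 95% interval by bisection, using only the Python standard library. The choice of the Clopper-Pearson method is an assumption, since this record does not state which interval method the manuscript used, and `exact_ci` is a hypothetical helper name.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target, increasing, tol=1e-10):
    """Bisection on [0, 1] for a monotone function f such that f(p) = target."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k events among n patients."""
    # Lower bound: P(X >= k | p) = alpha/2; this tail probability rises with p.
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    # Upper bound: P(X <= k | p) = alpha/2; this tail probability falls with p.
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper


if __name__ == "__main__":
    cases = (
        ("recurrence during treatment", 27, 151),
        ("recurrences detected by routine CT", 18, 27),
        ("asymptomatic detections among all treated", 18, 151),
    )
    for label, k, n in cases:
        lo, hi = exact_ci(k, n)
        print(f"{label}: {k}/{n} = {k / n:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
```

Under this assumed method, the interval for 18/27 should land close to the 46-84% quoted in the manuscript, whereas the interval for 18/151 cannot, which is consistent with the reviewer's reading that the interval was attached to the wrong proportion.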
MINOR ISSUE 2 (R1): LIMITATIONS: TWO ADDITIONS RECOMMENDED

The rewritten limitations paragraph (lines 287-301) is much improved but two prior sub-points are not represented:

a) Selection bias for the 78 untreated patients: The reasons are listed but the limitations paragraph does not acknowledge the residual uncertainty this introduces. One sentence would suffice, e.g.: "Although reasons for not starting nivolumab are reported, baseline characteristics of the 78 untreated patients were not compared to the treated cohort, leaving residual uncertainty regarding selection effects."

b) Evolving treatment landscape: This is discussed in the body (lines 275-285) but not flagged as a limitation. Consider adding: "Finally, with perioperative FLOT becoming standard for adenocarcinoma after the ESOPEC trial, the population eligible for nCRT followed by adjuvant nivolumab is shifting, which may limit the future applicability of these findings."

MINOR ISSUE 3 (R1): RESIDUAL DIRECTIONAL LANGUAGE IN DISCUSSION/CONCLUSION

The conclusion has been substantially tempered, but two phrases still read as recommendations:

- Discussion, line 295-297: "Taken together, these data suggest that performing a CT scan around 3-4 months may be valuable for detecting early recurrences."
- Conclusion, lines 336-338: "...follow-up intervals of 3-4 months for early recurrences and 7-8 months for late on-treatment recurrences could be reasonable options."

These are appropriately hedged ("may be valuable," "could be reasonable options") so this is not a barrier to acceptance, but for full consistency with the new tempered framing, consider rewording the second to: "...the observed timing of recurrences was compatible with the 3-4 and 7-8 month intervals used in our cohort, but the optimal interval cannot be inferred from this study."
MINOR ISSUE 4 (R1): COMPARISON WITH CHECKMATE 577 (75% vs 62% AT 12 MONTHS)

The Discussion (line 287-289) states: "The 1-year DFS in our cohort was 75%, compared to 62% in the CheckMate 577 study. While no formal statistical tests were performed, these findings suggest that disease recurrence in this real-world retrospective cohort may be similar to that observed in a large clinical trial."

The reported difference (75% vs 62%) is a 13-percentage-point gap, arguably better-than-CheckMate-577 outcomes, not "similar." Either the language should reflect that the real-world cohort appears to do at least as well as the trial, or possible explanations should be offered (selection effects, different risk distribution, missed early recurrences without standardized 12-week scan, shorter follow-up censoring effects, higher proportion adenocarcinoma in real world, etc.).

STRENGTHS OF THE R1 REVISION

1. The title change ("Optimizing..." → "Imaging strategies for follow-up a multicenter retrospective cohort study") cleanly resolves the single biggest issue from R0.
2. The new Figure 1 flowchart is informative, internally consistent, and clearly shows the 451 → 353 → 229 → 151 cascade across three centers.
3. Methods on CT interpretation, recurrence definition, symptom-driven classification, and CPS provenance are now explicit and reproducible.
4. CIs added for headline estimates; CPS finding now correctly framed as exploratory and hypothesis-generating.
5. Numerical discrepancies from R0 are corrected and verification was independently performed.
6. Reviewer 3's conceptual concerns about non-comparability of Q3M and Q4M are addressed by removing direct comparative statistical claims and recasting the manuscript as descriptive.
7. The new clinical-relevance comparison of timing-from-surgery for symptomatic recurrence (7.9 weeks) vs baseline CT (10.6 weeks) is a nice quantitative addition that strengthens the "limited utility of baseline CT" claim.
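
For readers who wish to reproduce the confidence-interval check raised under Major Issue 2 (R1), the following is a minimal sketch only. It assumes an exact (Clopper-Pearson) binomial interval; the manuscript's actual CI method is not stated in this review, so this choice, and the function names `binom_cdf`, `solve_increasing` and `clopper_pearson`, are illustrative assumptions. It uses only the counts quoted in the review (27/151 and 18/27).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, target, iterations=100):
    """Bisection on [0, 1] for f(p) = target, where f is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: P(X >= k | p) = alpha / 2
        lower = solve_increasing(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    if k == n:
        upper = 1.0
    else:
        # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2
        upper = solve_increasing(lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    # Counts quoted in the review: recurrences among treated patients,
    # and recurrences detected by routine imaging among all recurrences.
    for label, k, n in [("recurrence", 27, 151), ("routine detection", 18, 27)]:
        lo, hi = clopper_pearson(k, n)
        print(f"{label}: {k}/{n} = {k / n:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
```

Under this assumption, the interval computed for 18/27 does not contain 12%, which is consistent with the observation that the 46-84% interval belongs to the 67% proportion and not to the 12% figure.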
FINAL RECOMMENDATION

RECOMMENDATION: ACCEPT with MINOR REVISION

The R1 revision substantively addresses the prior major concerns (overclaiming, data discrepancies, STROBE checklist, limitations, data availability text, PD-L1 presentation, CT methodology, timing convention, CIs, diagnostic-yield clarification). The new Figure 1 and expanded Methods are real improvements. What remains is a small set of clean-up items, none of which require new analysis:

CRITICAL TO FIX:

- Reconcile Figure 1 vs text on n=10 vs n=11 pre-treatment recurrences and n=7 vs n=8 postoperative deaths (Major Issue 1, R1).
- Move/relabel the misplaced 95% CI (46-84%) to the 67% (18/27) it describes (Major Issue 2, R1).

EDITORIAL:

- Fix residual typos ("DFSwas," "paclitaxel.After," "Daiichy-Sankyo," awkward parallelism in the eligibility sentence) (Minor Issue 1, R1).
- Add one sentence each on selection bias (untreated cohort) and FLOT evolving landscape to the limitations paragraph (Minor Issue 2, R1).
- Optionally soften the two remaining recommendation-flavored phrases in Discussion/Conclusion (Minor Issue 3, R1).
- Reword the 75% vs 62% comparison so "similar" reflects the actual 13-point gap or offer explanations (Minor Issue 4, R1).

With these clean-up edits, the manuscript would be suitable for publication.

Respectfully submitted,
Peer Reviewer
Review Date: May 2026 (R1)

Reviewer #2: Thank you for your thorough and thoughtful responses to my comments. All of my concerns have been adequately addressed, and the revisions have improved the clarity and rigor of the manuscript. I have no further questions or concerns at this time.

******

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review?
For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Ming-Ching Lee, MD, PhD, FACS, FCCP

********

https://doi.org/10.1371/journal.pone.0350105.r003
Formally Accepted
Acceptance Letter - Zhanzhan Li, Editor

PONE-D-25-61933R1

PLOS One

Dear Dr. Strijdhorst,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS One. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited
* All relevant supporting information is included in the manuscript submission
* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

You will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Zhanzhan Li
Academic Editor
PLOS One

https://doi.org/10.1371/journal.pone.0350105.r004
