Peer Review History

Original Submission: September 1, 2022
Decision Letter - Chloë Montes Strevens, Editor
Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

PSTR-D-22-00073

Accurate detection and identification of insects from camera trap images with deep learning

PLOS Sustainability and Transformation

Dear Dr. Bjerge,

Thank you for submitting your manuscript to PLOS Sustainability and Transformation. After careful consideration, we feel that it has merit but does not fully meet PLOS Sustainability and Transformation's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 30 days, by Dec 14 2022 11:59PM. If you need more time than this to complete your revisions, please reply to this message or contact the journal office at SustainTransform@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pstr/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Chloë Montes Strevens

Section Editor

PLOS Sustainability and Transformation

Journal Requirements:

1. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article's retracted status in the References list and also include a citation and full reference for the retraction notice.

2. Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published.

a. State the initials, alongside each funding source, of each author to receive each grant.

If you did not receive any funding for this study, please simply state: “The authors received no specific funding for this work.”

3. We ask that a manuscript source file is provided at Revision. Please upload your manuscript file as a .doc, .docx, .rtf or .tex.

4. Please provide separate figure files in .tif or .eps format only and remove any figures embedded in your manuscript file. Please also ensure that all files are under our size limit of 10MB.

For more information about figure files please see our guidelines:

https://journals.plos.org/sustainabilitytransformation/s/figures

https://journals.plos.org/sustainabilitytransformation/s/figures#loc-file-requirements

5. Figures 1-3: Please confirm (a) that you are the photographer; or (b) provide written permission from the photographer to publish the photo(s) under our CC-BY 4.0 license.

Additional Editor Comments (if provided):

This study offers an exciting opportunity to advance the field of automated animal detection in entomology, pushing boundaries that have been limited by animal size, diversity and behaviour. I particularly applaud the researchers for their efforts to increase the impact of their work by making their large training dataset public. Given the level of innovation and the undoubted value the research will bring to this field of study, I am delighted to accept the paper pending minor revisions. Specific points for revision are detailed in the two sets of reviewer comments and I ask that you carefully consider the feedback and provide a detailed response. In particular, I urge you to attend to two key points related to the publicly available dataset: first, please ensure all data are accessible and easily identifiable; second, please consider adding your models alongside your data.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Does this manuscript meet PLOS Sustainability and Transformation’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

--------------------

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

--------------------

3. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

--------------------

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS Sustainability and Transformation does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

--------------------

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Comments to the authors

This is a well-presented work that includes a public, annotated dataset of insects captured with time-lapse cameras in semi-natural conditions, together with a detection and classification approach based on the YOLO framework. The authors could consider making the best-performing model(s) publicly available alongside the training data (e.g. in PyTorch or ONNX format).

Line-by-line comments

Pg2 ln25: perhaps this is the first benchmark dataset for insect detection + identification? The “first” aspect could be included in the abstract here.

Pg 2 ln36: “the promise of deep learning to process such sensor signals” is slightly ambiguous wording; perhaps “potential” is more appropriate here.

Pg3 ln 73-74: perhaps here it is useful to point the reader to a relevant publication or two discussing the challenge of Out-Of-Distribution (OOD) and hierarchical classification; for inspiration: https://ieeexplore.ieee.org/abstract/document/9415076 ; https://dl.acm.org/doi/abs/10.1145/3324884.3416609

Pg4 ln116-122: the split strategy is a little confusing: for the initial dataset you adopted an 80/20 train/val split, with a holdout set as large as train+val (?), but in the benchmark dataset you adopt an 85/15 split and do not change the size of the holdout set. It is not entirely clear from the M&M why this choice was made (if it’s arbitrary, please state so here).

Table 1: misspelling on line 17 “Non-Coccinalidae Coleoptera”. Make reader’s life easy and put in the caption the approximate split percentages of training and validation sets e.g. 80/20 train/val.

Pg8 ln201-206: (optional) the authors may consider pointing out that the model more readily generalizes from the 3 classes of Syrphidae to other Diptera, while it struggles to generalize from the 4 classes of Vespoidea to Hymenoptera, this in spite of similar imbalance between the rarest class (Eristalis tenax and Vespula vulgaris respectively) and the more abundant taxa.

Pg8 ln206-209: for the reader’s convenience, please provide the % of false positives and false negatives over the total.

Figure 8: in the caption, please change “species” to “taxa”.

Figure 8: (optional): the fact that the figure is presented in 4 panels makes it hard to read; the authors could consider plotting all the taxa’s detections in the same panel, clipping the most frequent ones (i.e. hoverflies and honeybees).

Pg9 ln250: please check reference #27, I could find no mention of iNaturalist or picture quality comparisons in that paper. If the intention is simply to point out that field-collected images are of low quality, this is sufficiently self-evident and no reference is needed.

Pg12 ln345-348: if an annotation tool was used to aid in the drawing of bounding boxes, etc. it should be mentioned and credited here.

Pg13 ln362-363: what does semi-randomly mean? Please provide some brief info on whether you separated the images more or less evenly over time of day and season, and across different cameras.

Pg14 ln389-404: (optional) the detailed explanation of the features introduced in YOLOv4 could be reduced slightly. In any case I would suggest keeping the parts on bag of freebies and focal loss function.

Pg14 ln410-411: (optional) “These models were chosen as they have a size, speed, and accuracy that is compatible with most reasonably priced GPUs.” could be rephrased as “the models were chosen because their size allows them to be run at the desired speed on most reasonably priced GPUs”; please check that this is the intended meaning.

Reviewer #2: In this manuscript, Bjerge et al. release a large dataset of annotated insects (and background) in a semi-natural context (freely flying and visiting flowers). They also propose a method to detect insects using YOLOv5 on this dataset, which they validate and test on unseen taxa. Finally, they report the seasonal patterns of the detected taxa.

Overall, the manuscript is a very valuable addition to computational entomology, but several points must be addressed before publication.

Data:

Firstly, I applaud the authors for their extensive work and the very large publicly available data they provide in this paper (almost 30000 insects). However, I have several important points to raise regarding data availability.

1. Critically, please do verify the integrity of `train1201.zip`. I cannot extract the version I downloaded (md5 checksum: `6831b05cab0988743a113819eb23be75`). I tried twice, and can definitely extract the other two files.

2. Related to this point, please upload your data on a DOI-based server (e.g. Zenodo). I think this will be required at some point in the editorial process and will help with point 1 -- by providing checksums and such.

3. Please document the naming convention of image files -- e.g. we have names like `12_13-20190704072200-snapshot.jpg` and `S5_187-Juli25_0_347-20190725082130`, what are fields such as 12, 13...
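The integrity check requested in point 1 can be illustrated with a minimal sketch using only the Python standard library. It is not the procedure the reviewer or authors used; the function names are hypothetical, and the archive name and reference digest would be whatever the data host publishes.

```python
import hashlib
import zipfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_corrupt_member(path):
    """Return the name of the first corrupt member, or None if all members pass.

    Raises zipfile.BadZipFile if the file is not a readable ZIP archive at all.
    """
    with zipfile.ZipFile(path) as archive:
        return archive.testzip()
```

A downloader would compare `md5_of_file("train1201.zip")` with the digest published next to the file, then call `first_corrupt_member` to confirm that every member passes its CRC check before extracting.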

Background diversity: Although the number of images is very impressive, images (the ones I could download) are highly similar, and from only 5 cameras (almost identical backgrounds through time). Can the authors discuss how this impacts the generalisation and usefulness of the dataset? Have the authors tried to, for example, keep one camera out for testing? I would assume YOLO strongly learns the background and that performance will be lower on new backgrounds... In general, please discuss general applicability and complementarity with other datasets.
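The "keep one camera out for testing" idea can be sketched as follows. This is an illustrative assumption, not the authors' split: the function name is hypothetical, and how a camera identifier is obtained for each image is left to the caller because the file naming convention is not documented here.

```python
from collections import defaultdict


def leave_one_camera_out(records):
    """Yield (held_out_camera, train_files, test_files) once per camera.

    `records` is an iterable of (filename, camera_id) pairs. Deriving
    camera_id from the real dataset is an assumption left to the caller.
    """
    by_camera = defaultdict(list)
    for filename, camera_id in records:
        by_camera[camera_id].append(filename)

    for held_out in sorted(by_camera):
        test_files = list(by_camera[held_out])
        train_files = [
            name
            for camera, names in sorted(by_camera.items())
            if camera != held_out
            for name in names
        ]
        yield held_out, train_files, test_files
```

Training once per fold and comparing test scores across held-out cameras would give a rough indication of how much the detector depends on a familiar background.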

Single pass /one-stage classification: The authors have opted for a single pass classification using YOLO. They also justified that one-step detectors are faster. Since the reader understands that the majority of images have no insect, would it make sense to run a first conservative detection of generic "insects", and then use a higher performance classifier on the minor portion of positive images? This approach might have the potential to generalise better to other contexts (season, geography, flowers, ...). In addition, a two-pass approach might be better at handling the unbalance in class frequencies (at least in the first pass). Please discuss, but also explain how your available dataset can already be used to train a more general insect detector, if necessary.
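The two-pass idea raised here can be made concrete with a small sketch. It is hypothetical throughout: `detect_insects` and `classify_crop` stand in for any generic detector and any classifier, the box format is assumed, and the low default threshold only illustrates a "conservative" first pass that prefers false positives over misses.

```python
from typing import Callable, Iterable, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def two_pass(
    images: Iterable[object],
    detect_insects: Callable[[object], List[Tuple[Box, float]]],
    classify_crop: Callable[[object, Box], str],
    min_confidence: float = 0.1,
) -> List[Tuple[int, Box, str]]:
    """Run a generic insect detector, then classify only the retained boxes."""
    results = []
    for index, image in enumerate(images):
        for box, confidence in detect_insects(image):
            if confidence < min_confidence:
                continue  # first pass: drop only very unlikely detections
            # second pass: the classifier sees only the candidate regions
            results.append((index, box, classify_crop(image, box)))
    return results
```

Because the classifier runs only on images with candidate boxes, its cost scales with the minority of images that contain insects rather than with the full image stream.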

Figure 8: Figure 8 appears somewhat out of context and several points may/should be improved.

1. It is rather abruptly introduced, and the reader does not necessarily understand why this is done. An introductory sentence in the result section may help.

2. Please state the sampling frequency. The reader understands counts were averaged (mean between cameras over multiple records?) over a time period (a day?). The decision to aggregate data per time period is an implicit model and could be explained (at least in the methods). The y-axis must contain the sampling period too e.g. "detection per day"?

3. The authors refer to "abundance" in the figure, but also "occurrence" in the text (l228) somewhat indiscriminately and claim "There is a clear seasonal dynamic in the occurrence of the various insects". First, the method does "only" record visits (occurrence), not abundance stricto sensu. Indeed, the visit rate of many pollinators is greatly impacted by the weather and other factors. Second, for the reader, it is unclear whether the observed patterns result from high-frequency weather change or from longer-term seasonal effects. This point could help explain the high interday variations (peaks).

4. It would be easier for the reader if the plot were organised as 8 rows/facets (in the same order as the rest of the paper), with their own y-scale, but with an aligned x-axis. At the moment, species are just paired because their y-axis is compatible, which is narratively confusing. Having 8 vertically aligned subfigures would also simplify the comparison of phase/timing between peaks, and facilitate the visualisation of close peaks.

5. Please also add the total number of detections per taxon (overall sum), either on the figure or the legend, or both.

6. The figure could contain error bars or quantification of the variation (for instance, what are the discrepancies between cameras)
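Points 2 and 6 above (detections per day, averaged over cameras, with a measure of between-camera variation) can be sketched as below. This is an assumed aggregation, not the one used in the manuscript: the function name and record layout are hypothetical, and it assumes every listed camera was recording on every day, so a camera with no detections counts as zero.

```python
from collections import defaultdict
from statistics import mean, stdev


def daily_summary(detections, cameras):
    """Summarise detections per day and taxon across cameras.

    `detections` is an iterable of (camera_id, date, taxon) tuples.
    Returns {(date, taxon): (mean_per_camera, stdev_across_cameras)}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for camera_id, date, taxon in detections:
        counts[(date, taxon)][camera_id] += 1

    summary = {}
    for key, per_camera_counts in counts.items():
        # Cameras without a detection that day contribute a zero.
        values = [per_camera_counts.get(camera, 0) for camera in cameras]
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[key] = (mean(values), spread)
    return summary
```

The spread returned for each day could be drawn as error bars, and labelling the y-axis "detections per camera per day" would make the sampling period explicit.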

Minor points

* Figure 2: please add a scale bar in both pixels and mm (even approximate), for the reader to put things in context.

* Please justify the ranking of the different taxa in figures/tables. Or rank through taxonomical principle, alphabetically, ...

* Discuss the use of YOLOv7 -- although I do not expect the authors to use YOLOv7 in this publication.

* Figure 6 is not referenced in the text? The reader expects it in l184, maybe.

* Figure 6: please also add the marginal numbers of observations, e.g. in brackets after the name (although this can be found in previous tables, it helps in this context).

* Figures 4 and 5 are probably for the supplementary material as the reader sees them as specific implementation details, and the MS already has 8 main-text figures

* Methods: state, at least, the GPS location where the trial took place in the acquisition section

* Why are the images acquired only between 05:00 and 22:00? The reader suspects the data were acquired at high latitude, shortly after the summer solstice. Does this decision exclude otherwise visible crepuscular pollinators in the middle of summer?

* Check double commas ",," in affiliation b

--------------------

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

--------------------

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

Attachments
Submitted filename: Rebuttal letter to PSTR-D-22-00073.pdf
Decision Letter - Wei-Ta Fang, Editor

Accurate detection and identification of insects from camera trap images with deep learning

PSTR-D-22-00073R1

Dear Dr. Bjerge,

We are pleased to inform you that your manuscript 'Accurate detection and identification of insects from camera trap images with deep learning' has been provisionally accepted for publication in PLOS Sustainability and Transformation.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow-up email from a member of our team. 

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they'll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact SustainTransform@plos.org.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Sustainability and Transformation.

Best regards,

Wei-Ta Fang, Ph.D.

Section Editor

PLOS Sustainability and Transformation

***********************************************************

Reviewer Comments (if any, and for reference):

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Does this manuscript meet PLOS Sustainability and Transformation’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)?

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

**********

6. Review Comments to the Author

Reviewer #1: The authors have addressed all the concerns, and, together with the suggestions of the other reviewer, I think the manuscript is ready for publication. In particular:

- my comment on figure 7 (Pg8) stemmed from potential confusion between the classes above and below the black line, but the caption clarifies this and having it next to the figure is sufficient to resolve this.

- new Figure 8 is much more readable.

- extended explanation about train/val split is also much clearer.

- data and model availability, description and searchability (i.e. addition to Zenodo) have also been improved.

Please note that on Pg10 ln259 (manuscript with track changes), in the sentence "The variations in visit rates are be impacted by the weather and the longer-term seasonal effects such as different species of blooming sedum plants.", the word "Sedum" should be correctly capitalized and italicized.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public.

For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

**********

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.