Peer Review History

Original Submission
December 28, 2022
Decision Letter - Kirsten Bomblies, Editor, Ian R. Henderson, Editor

Dear Dr Teterina,

Thank you very much for submitting your Research Article entitled 'Genomic diversity landscapes in outcrossing and selfing Caenorhabditis nematodes' to PLOS Genetics.

The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the current manuscript. Based on the reviews, we will not be able to accept this version of the manuscript, but we would be willing to review a much-revised version. We cannot, of course, promise publication at that time.

Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by each reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments are included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool.  PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, use the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.

Yours sincerely,

Ian R. Henderson

Academic Editor

PLOS Genetics

Kirsten Bomblies

Section Editor

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Review of Genomic diversity landscapes in outcrossing and selfing Caenorhabditis nematodes

Anastasia A. Teterina1,2, John H. Willis1, Matt Lukac1, Richard Jovelin3, Asher D. Cutter3, Patrick C. Phillips1

General

I really liked this MS and think the data and the analyses are (in large measure) important and well performed. I have an issue with the AI modelling component which seems under-described and under-performs.

The core findings - that genome-wide diversity, LD and related measures are all indicative of higher effective population sizes in CaeRema, and that the arms-and-centres structure identified in CaeEleg is present in CaeRema and is likely driven by the same recombinational landscape - are important, and show the impact of selfing on CaeEleg and especially the impact of linked selection/sweeps. These findings are not _unexpected_ as single genome assemblies of several Caenorhabditis [and other] species have already shown similar strong patterning on chromosomes, but they are significantly deepened by the population genomic analysis of CaeRema. While this is currently a 2-way comparison (and thus could be significantly biased by unknown factors impacting either species), the analyses are well performed and - given that they conform to much theoretical inference - are likely to be very strong.

One of the critical findings from the work of the Andersen lab recently has been the discovery and definition of a large number and span of highly divergent regions in CaeEleg where short reads do not map. This generates a false run of homozygosity against the reference if the reference region(s) are present, or an uncovered gap if alternative regions are present. This structure is significant (up to 20% of the genome is affected). Is similar structure present in the CaeRema cross parents, or in the resampling data (perhaps particularly in the divergent individuals)? If present, defining it might assist in partitioning analysis of variation to only deal with regions of the reference that can be/are expected to be covered, and thus can have credible allele calls. The comment at line 969 makes me wonder if just this process might be at play here.

The modelling is interesting in showing the interacting impacts of different population level processes (neutral and adaptive) in patterning the diversity along chromosomes, and is a useful extension of the inferences drawn from the population genomic data. The statement "as predicted by theory" rings loud here - what is new or surprising in these analyses, and what does the contrast between CaeRema and CaeEleg bring to the discussion? The text here could be shortened significantly in this context.

The convolutional neural network component of the work is the weakest and I would suggest that it is removed from the MS. As I understand it the authors used a set of parameters derived from the empirical data to generate a set of synthetic data on a set of model chromosomes, and then back tested the emergent classifier on the real data. The classifier performed poorly, suggesting that the real CaeRema data derived from a selfing population rather than a sexual one. While it is important to publish negative results, I am not convinced that these constitute real negative data - they suggest that the CNN was not able to extract enough information from the model chromosomes, which in turn suggests the modelling was not modelling real world processes (as we assume that the measurement of CaeRema biology - i.e. outcrossing predominates in a somewhat closed and thus inbred population - is correct).

Analytic comments

The CaeRema and CaeEleg genomes are different sizes (~25%) and it is thus of interest to ask where this difference lies. This analysis impacts on analyses that compare the "proportion" of each chromosome that is in each partition (arm; centre; arm). My calculations indicate that the centre partitions in both species have very similar spans, and that the expansion in CaeRema (or shrinkage in CaeEleg) is largely due to change in the arm partition spans. In the context of discussions of what drives the differences in patterns of evolution between the species this is I think germane.

Similarly, it would be good to know the arms vs centres distribution of the ~56% of the CaeRema genome for which ancestral states could be inferred: my suspicion is that these 56% are biased to the centres, and this might impact analyses (though from Fig 3 the contrasts are so clear that I suspect that any impact might be minimal).

In lines 278ff it is mentioned that a "clearer definition" of the boundaries of arms and centres was possible. Does this mean that the numerical base position of the boundaries was changed because of this "clarification"? If not, it is not clear what clarification was afforded.

Lines 308ff: it is not clear what analysis is being reported. Is this the reference genome vs the imputed LCA? Do the authors have an explanation for the strong difference in tr/tv ratios? Why would the between species differences be so different from the within species ones for this metric? I would have liked to see a discussion of the impact of using a population of CaeRema and a single reference CaeLate in imputing the ancestor: is this the source of the difference?
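For readers less familiar with the metric queried above, the tr/tv ratio is the number of transitions (A<->G, C<->T) divided by the number of transversions (all other substitutions). The following is a minimal sketch only; the function names and the toy list of substitutions are hypothetical and are not taken from the manuscript or its pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True if ref -> alt stays within purines or within pyrimidines."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(substitutions):
    """Ratio of transitions to transversions for (ref, alt) base pairs."""
    ts = tv = 0
    for ref, alt in substitutions:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt:
            continue  # not a substitution
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    return ts / tv if tv else float("inf")


# Hypothetical toy data: three transitions and two transversions.
toy_subs = [("A", "G"), ("C", "T"), ("T", "C"), ("A", "C"), ("G", "T")]
ratio = ts_tv_ratio(toy_subs)
print(ratio)
```

Applying the same counting separately to within-species polymorphisms and to between-species substitutions is one way the two ratios could be put side by side, assuming both sets of sites pass the same filters.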

Minor questions

Tab3 line 398

What does "low accuracy" mean in the methods column? Is it a statement on the performance of the method class as a whole (in which case remove) or a particular form of coalescent analysis (in which case we need a definition)?

Textual comments

There are several textual infelicities in the MS which make reading somewhat difficult in places. Some are (imho) grammatically wrong, others I think obscure the meaning and content.

I found the opening paragraphs of the introduction over-general, and sometimes hard to parse scientifically. This could be shortened and focussed. The general statement that there are both selective and neutral processes at work, and that these processes overlap with each other in their impacts on the genome, and that general structural features of the genome - driven by selection on structure in general - can have consequences for/impose constraints on realised change, is unproblematic... but is more textbook introduction than research paper introduction.

eg line 48 "factors act and constantly interact"? Not sure of the meaning of this phrase.

The authors close the legends of (most) figures with a narrative discussion of what the figure is being used to show. My style is to leave the figures to show data and to have such narrative in the main text. I am not sure what PLoS protocol is.

Other text / presentation notes

* arms and centres are defined twice, and the definitions suggest capital-A Arms and capital-C Centres are the short forms (which are not used again).

* I would suggest moving the lists of values/means, p-values, SDs etc from section "Genetic diversity.." [lines 186ff: lines 194ff, 208ff, 222ff, 227ff] to a table where the hypothesis tested, the statistical test and the list of values could be presented accessibly. The current extended sentence listing all values is hard to read and obscures the statement of the actual findings. I would also suggest using the same exponent for all measures within a set/comparison (eg if values are 1.2 x 10^-2 and the SD is given as 2.3 x 10^-3, this suggests greater accuracy/significant figures for the SD). Give the same number of significant figures for each, and think about normalising between species where one is x 10^-2 and the other x 10^-3.

* the plots of metrics across the genomes are stated to be measured in sliding windows of 100 kb (mostly) [eg Fig 2, Fig 3]. Is this really a sliding window? If so, what were the step sizes? Or is it a tiled window of 100 kb (which is equivalent to a sliding window of 100 kb with a step of 100 kb but, I would contend, is categorically different - no sites are counted more than once in a tiled window, whereas they are in a sliding window)?

* I have a campaign to halt the use of the word "worm" when nematode is correct (and there are so many other worm-like taxa). Perhaps the authors could look at the ~9 instances of the use of worm(s) and decide if nematode(s) would be better.

* spans are sometimes presented as "100-kb", and sometimes as "100 kb". I think 100 kb is correct in each case. There are a few instances of no separation between value and unit

* sometimes "ten times" is spelt out, sometimes it is given "10x" [etc]; use the spelt out version

* line 232 "recombination rate IN"

* line 239 location of "details" missing

* occasional flipping between past tense and present tense in reporting of work done / results found (even in same sentence) - see lines 354ff for eg.

* line 340 is this a new paragraph, or just a misplaced carriage return?

* line 440 data "show" (data are plural)
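The distinction drawn in the list above between sliding and tiled windows can be made concrete with a minimal sketch. The helper names, the 1 Mb toy chromosome and the 50 kb step are illustrative assumptions; only the 100 kb window size comes from the comment itself.

```python
def window_starts(chrom_len, window, step):
    """Start coordinates (0-based) of all full windows along a chromosome."""
    return list(range(0, chrom_len - window + 1, step))


def windows_covering(site, starts, window):
    """Number of windows that contain a given site."""
    return sum(1 for s in starts if s <= site < s + window)


chrom_len = 1_000_000  # hypothetical 1 Mb chromosome
window = 100_000       # 100 kb windows

# Tiled: step equals the window size, so windows never overlap.
tiled = window_starts(chrom_len, window, step=window)
# Sliding: step smaller than the window (50 kb is an assumed value).
sliding = window_starts(chrom_len, window, step=50_000)

site = 250_000
tiled_hits = windows_covering(site, tiled, window)
sliding_hits = windows_covering(site, sliding, window)
print(len(tiled), tiled_hits)
print(len(sliding), sliding_hits)
```

With tiled windows each site contributes to exactly one window, whereas with a step smaller than the window neighbouring windows share sites, so their summary statistics are not independent.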

Reviewer #2: Taking advantage of the Caenorhabditis clade, where selfing has evolved independently three times, Teterina et al. use genomic analyses and evolution simulations to provide insight into the contribution that selfing vs. outcrossing has on genetic diversity. The authors find that while the overall pattern of diversity mirrors the underlying pattern of genetic recombination on chromosome arms vs. centers in both C. elegans (selfing) and C. remanei (outcrossing), C. remanei has significantly higher levels of genetic diversity as is predicted for an outcrossing species. This is an important study that sets the stage for future work to elucidate contributions of reproduction, selection and demography to genetic diversity. The results will be of interest to the broad audience of PLOS Genetics.

1. The authors should consider making the manuscript more accessible to those outside population biology – they often mention programs, parameters, and values that have little meaning to those outside the field.

2. The authors should provide more information in the introduction about what is known about selfing vs. outcrossed species. There are some very interesting genetic studies from the Haag, Schedl and Ellison labs that show only a couple of mutations are required to convert between these two different reproductive modes.

3. I understand that this is outside the scope of this study, but it seems a more powerful approach would be to look at more than one selfing vs. obligate outcrossing species to see whether the differences observed here are more generalizable. In the absence of this, the authors should be a little more careful in not generalizing their results beyond the two strains they examined.

Reviewer #3: This paper provides some important findings for C. remanei, including a genetic map and a polymorphism/LD survey of natural isolates. The species is of broad interest because it is an outcrossing relative of C. elegans, and because of the strikingly high level of nucleotide diversity found in natural populations. The authors also pair the new data with a reanalysis of existing data for a C. elegans population, and the contrast between the two is informative. Overall the genetic maps between the two species are qualitatively similar, and the drastic polymorphism and LD differences between the species (which were previously discovered), are consistent with the major life style difference (i.e. remanei is an outcrossing species whereas elegans reproduces primarily by self-fertilization).

One significant puzzle is not explained in the paper, and really needs to be addressed more fully. The inbreeding coefficient for C. remanei was found to be 0.38, a strikingly high figure for an outcrossing species. The PC plot (fig S4) does not really help us understand the cause of this, and neither does the corresponding text “ This cluster was most likely formed from individuals from a single-family lineage displaying intensive”.

There are a couple things about the Fis value that need attention here. (1) Are there runs of autozygosity consistent with consanguineous mating? (2) How long ago did the inbreeding occur? (3) How could inbreeding have occurred if the individuals mostly came from different isopods? (4) Were the sampled isopods from different species?

Lastly, regarding the high Fis, could there be a sequencing or pipeline problem that missed a lot of heterozygotes? The methods are thin regarding this possibility, and appear to not even tell us what the read depth was (sorry if I missed it somehow).
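The concern about missed heterozygotes can be illustrated with a minimal sketch of a per-site inbreeding coefficient, F_IS = 1 - H_obs / H_exp, where H_exp = 2p(1 - p) under Hardy-Weinberg proportions. This is a simple estimator without sample-size correction, and the function name and genotype counts are hypothetical; it is not the authors' pipeline.

```python
def fis(n_hom_ref, n_het, n_hom_alt):
    """Per-site F_IS = 1 - H_obs / H_exp for a biallelic site.

    Simple estimator, no small-sample correction (an assumption of
    this sketch). Returns NaN for monomorphic sites.
    """
    n = n_hom_ref + n_het + n_hom_alt
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    h_exp = 2 * p * (1 - p)                # expected heterozygosity
    h_obs = n_het / n                      # observed heterozygosity
    if h_exp == 0:
        return float("nan")
    return 1 - h_obs / h_exp


# Hypothetical site in Hardy-Weinberg proportions (p = 0.5, n = 100).
true_counts = (25, 50, 25)
# Same site if 20 of the 50 heterozygotes were miscalled as homozygotes,
# split evenly between the two homozygous classes.
undercalled_counts = (35, 30, 35)

fis_true = fis(*true_counts)
fis_undercalled = fis(*undercalled_counts)
print(fis_true, fis_undercalled)
```

In this toy case the allele frequency is unchanged but observed heterozygosity drops, so the estimate rises well above zero without any inbreeding. This is why read depth and genotype-calling details matter when interpreting a high F_IS.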

The analysis of changing population size on pp 16-18 and figure 5 is confusing. The two parts of figure 5 seem to tell different stories over recent times, and in particular the SMC++ suggests a flat history for everyone back to 1000 generations ago (albeit with a wide variance depending on the replicate). This is not consistent with the LD based story, nor the text. On balance it seems these analyses just are not working well, and it might be best to just report the conflicting signals, along with a figure of the SFS.

Unfortunately, the simulations and the neural network analyses add very little to the paper. The problems here start with the motivating rationale, which was to simulate data under a wide array of population genetic models to look for the kinds of model differences that mirror the observed differences seen between the elegans and remanei data. This kind of approach, without specific questions, and without much use of data driven estimators, seems bound to provide results that are hard to interpret.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Mark Blaxter

Reviewer #2: No

Reviewer #3: No

Revision 1

Attachments
Attachment
Submitted filename: Response_to_reviewers_PLoS_C.remanei_popgen.pdf
Decision Letter - Kirsten Bomblies, Editor, Ian R. Henderson, Editor

Dear Dr Teterina,

Thank you very much for submitting your Research Article entitled 'Genomic diversity landscapes in outcrossing and selfing Caenorhabditis nematodes' to PLOS Genetics.

The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some concerns that we ask you address in a revised manuscript.

We therefore ask you to modify the manuscript according to the review recommendations. Your revisions should address the specific points made by each reviewer.

In addition we ask that you:

1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images.

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Ian R. Henderson

Academic Editor

PLOS Genetics

Kirsten Bomblies

Section Editor

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #2: The revised manuscript by Teterina et al. presents genetic and genomic analyses and evolution simulations to provide insight into the contribution that selfing vs. outcrossing has on genetic diversity. The authors find that while the overall pattern of diversity mirrors the underlying pattern of genetic recombination on chromosome arms vs. centers in both C. elegans (selfing) and C. remanei (outcrossing), C. remanei has significantly higher levels of genetic diversity as is predicted for an outcrossing species. Evolutionary simulations provide insight into contributions of selection, recombination, mutation, and selfing on genetic variation. The authors have done a good job addressing the previous reviews and the results will be of interest to the broad audience of PLOS Genetics.

Reviewer #3: This paper is improved.

However, I remain concerned about the estimates of historic population size, and apparent inbreeding.

There seems to be a good chance that something fairly fundamental about the history of the sampled population has been overlooked. There are two main clues to this. First are the analyses of historic population size. As mentioned previously, the two kinds of analyses shown in the figure have little in common. Both can’t be right. Coloring figure 5 does little to alleviate the discrepancy.

Second, there is the high level of Fis. The authors suggest recent consanguineous mating, which could certainly do it. However, this would also leave a high variance in Fis along the chromosomes, something which was not observed. Again, something is amiss.

The paper is informative on the mapping and polymorphism fronts, but the population genetic analyses are incomplete.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Reviewer #3: No

Revision 2

Attachments
Attachment
Submitted filename: Response_to_reviewers_second_revision.pdf
Decision Letter - Kirsten Bomblies, Editor, Ian R. Henderson, Editor

Dear Dr Teterina,

We are pleased to inform you that your manuscript entitled "Genomic diversity landscapes in outcrossing and selfing Caenorhabditis nematodes" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional acceptance, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field.  This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about making your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Ian R. Henderson

Academic Editor

PLOS Genetics

Kirsten Bomblies

Section Editor

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-22-01473R2

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Formally Accepted
Acceptance Letter - Kirsten Bomblies, Editor, Ian R. Henderson, Editor

PGENETICS-D-22-01473R2

Genomic diversity landscapes in outcrossing and selfing Caenorhabditis nematodes

Dear Dr Teterina,

We are pleased to inform you that your manuscript entitled "Genomic diversity landscapes in outcrossing and selfing Caenorhabditis nematodes" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Marianna Bach

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.