Peer Review History
| Original Submission July 17, 2019 |
|---|
|
PONE-D-19-19081 Analyzing the Fine Structure of Distributions PLOS ONE Dear Dr. rer. nat. Thrun, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we have decided that your manuscript does not meet our criteria for publication and must therefore be rejected. I am sorry that we cannot be more positive on this occasion, but hope that you appreciate the reasons for this decision. Yours sincerely, Qichun Zhang, PhD Academic Editor PLOS ONE Additional Editor Comments (if provided): Two reviewers returned the critical comments focusing on the novelty of the manuscript. Basically, the author redo some existing method using Python where the new features have not been demonstrated clearly. In addition, the English writing should be pre-checked where some typos would affect the readability of the manuscript. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Partly ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: I Don't Know Reviewer #2: No ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. 
For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The paper draws important attention to the pitfalls of existing distributional visualizations for effectively summarizing the nuances of non-normal distributions. Particularly, the paper assesses the efficacy of existing graphical representations (violin plot, box plot, bean plot) for summarizing skewed, multimodal, and uniform distributions, and provides a context and implementation to introduce 'mirror density plots' as an alternative. The context given for existing visual tools is sound, though there are some areas that could be improved: - Regarding histograms in section 2.2: “…in this work, only default parameter will be used because layman would probably not adjust parameters”. Given that the target user of a statistical visualization package in R or Python likely has experience in data science or statistics, this assumption warrants re-examination. 
For example: adding (bins=“auto”) is a common procedure for researchers using matplotlib’s built-in histogram function (see: https://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram_bin_edges.html#numpy.histogram_bin_edges). Additional context for histograms could be improved with an acknowledgement of non-uniform binning methods, though this may not be in scope for the paper. - It is surprising that this paper contains no ordinary density plots accompanied by sub-axis rugs, which are common methods for analyzing distributions. Similarly, ridgeline plots do not make an appearance, either in discussions of existing visualization methods, or in schematic comparisons of multiple dimensions. Given that these graphical representations seem more common than bean plots, for example, which are discussed at length, background context would benefit from the inclusion of more ordinary non-symmetric representations of distributions. - The paper would benefit from a ridgeline plot with a single axis, comparing the same non-normal distribution with various binning methods (e.g. each: default histogram using n=10; SciPy's "auto" method mentioned above; Scott's Rule mentioned in paper and detailed on SciPy link above; proposed PDE method; any others potentially relevant according to authors' literature review) The fundamental scientific contribution of the paper is the usage of the Pareto Density Estimation (PDE) to construct a visualization of a univariate distribution which captures non-normal characteristics of distributions, such as skew, multimodality, and uniformity. This method appears well-supported, and is explained concisely, with easily accessible packages for both R and Python to supplement the work. 
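As an illustration of the binning comparison Reviewer #1 suggests above, the following is a minimal sketch using NumPy's `numpy.histogram_bin_edges`, the function behind the linked documentation page. The bimodal sample, the seed, and the dictionary labels are assumptions made for this example only; they are not taken from the manuscript, and the PDE method itself is not reproduced here.

```python
import numpy as np

# Hypothetical bimodal sample (illustrative only, not data from the manuscript).
rng = np.random.default_rng(42)
data = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(4.0, 0.5, 500)])

# Bin edges under three of the rules named in the review:
# a fixed count of 10 bins, NumPy's "auto" rule, and Scott's rule.
edges = {
    "default (10 bins)": np.histogram_bin_edges(data, bins=10),
    "auto": np.histogram_bin_edges(data, bins="auto"),
    "scott": np.histogram_bin_edges(data, bins="scott"),
}

# Number of bins each rule produces for the same sample.
bin_counts = {name: len(e) - 1 for name, e in edges.items()}

for name, count in bin_counts.items():
    print(f"{name}: {count} bins")
```

Comparing the bin counts (or plotting the resulting histograms on a shared axis) shows how strongly the visible structure of the same sample can depend on the binning rule, which is the reviewer's point about relying on defaults.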
While the PDE method for binning appears well-defended, the implementation into a visual language leaves some questions unanswered: - In the broader data visualization community, "Mirror Density" plots are bivariate distributions: for example, one might construct two violin plots of distributions, conditioned on a second binary value (e.g. control vs. experiment), split the resultant forms in half lengthwise, and position them opposite one another to create a comparative representation of the conditioned distributions (see: https://www.d3-graph-gallery.com/graph/density_mirror.html). In this bivariate application, the comparative symmetry adds value to the analytic process. It is unclear from the paper whether the authors are aware of this namespace convergence, but independent of nomenclature, the paper would benefit from an assessment of the analytic value for making a univariate density plot symmetric. - In section 3.5 "The high-dimensional data set (d=45)... is investigated by selecting 12 features": Ridgeline plots with the PDE binning method may be a more space-conservative method of implementing the algorithm (though admittedly, d=45 remains a non-trivial 'curse of dimensionality'). - With regard to the German stock market data in section 3.5, the schematic MD (Fig. 9) and violin (Fig. 10) plots compare distributions in very different ranges. The paper would benefit from the removal of 'InterestExpense' and 'CapitalExpenditures' from the exemplary features, perhaps to be replaced with features of a range more similar to the other features in the schematic plots. - In general, plots should be ordered where possible, e.g. Fig. 5b,c,d should show skew parameter xi in order [0.6,0.95,1,1.1] for clarity. - Stacked histograms are not advisable for this application. Stacked histograms make sense when considering how categories sum to a total population (e.g. when exploring various revenue sources and the resultant aggregate revenue in a single graphic). 
For comparing model distributions of various skewness parameters, e.g. in Fig. 3a, 5a (are these the same histogram?), stacking does not seem appropriate. Neither does stacking seem appropriate for histograms of normalized data, e.g. Robustly Normalized values for Income Tax Share (ITS) and Municipality Income Tax Yield (MTY) in Fig. 12a. The sum total of 2 bins normalized from different ranges provides no substantial comparative analytic value. In both stacked histogram cases: overlaying, rather than stacking, may provide the intended visual effect, and would be appropriate to the data. As a reviewer with experience in data visualization, I feel confident in my assessment of this component of the paper; however, it is my hope that fellow peer reviewers can speak in a more informed manner on the statistical evaluations and experiments performed. The paper's conclusion in Section 5 "current density estimation approaches can lead to major misinterpretations if the default setting is not adjusted" seems to suggest that the scientific community would benefit greatly from the addition of PDE binning methods to existing open-source visualization packages such as ggplot, matplotlib, seaborn and plotly. I hope the authors consider integrated contribution to existing open-source tools. The paper would be improved by a spelling check, and a grammar proofread by a native English speaker. Typos and grammatical oversights do not obstruct communication, but do inhibit narrative flow. Reviewer #2: This paper studies the mirrored density (MD) plot and shows various structures of the MD results. This paper proposes an MD plot implemented in Python. Since the mirrored density (MD) plot has already been developed in R, the contribution of this paper is not clearly justified. It is not clear what kind of new features this paper introduces into the MD plot. ********** 6. PLOS authors have the option to publish the peer review history of their article. 
If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: Jane Lydia Adams Reviewer #2: No |
| Revision 1 |
|
PONE-D-19-19081R1 Analyzing the Fine Structure of Distributions PLOS ONE Dear Dr. rer. nat. Thrun, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. We would appreciate receiving your revised manuscript by Jun 05 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols Please include the following items when submitting your revised manuscript:
Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. We look forward to receiving your revised manuscript. Kind regards, Dr Fatemeh Vafaee and Dr David Mayerich Academic Editors PLOS ONE Journal requirements: 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. Thank you for stating the following financial disclosure: 'No' a) Please provide an amended Funding Statement that declares *all* the funding or sources of support received during this specific study (whether external or internal to your organization) as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now. b) Please state what role the funders took in the study. If any authors received a salary from any of your funders, please state which authors and which funder. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." c) Please include your amended statements within your cover letter; we will change the online submission form on your behalf. 3. Thank you for stating the following in your Competing Interests section: 'No' a. Please update your Competing Interests statement to state any Competing Interests. 
If you have no competing interests, please state "The authors have declared that no competing interests exist.", as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now b. This information should be included in your cover letter; we will change the online submission form on your behalf. Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests 4. Please ensure that you refer to Figure 7 in your text as, if accepted, production will need this reference to link the reader to the figure. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #1: (No Response) Reviewer #4: (No Response) ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. 
Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #4: Yes ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #4: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #4: Yes ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #4: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The authors have made significant improvements to the manuscript in accordance with reviewer comments. 
These improvements include the addition of more exemplary data mining applications, new visualization methods with thorough comparative assessments, and careful clarification of novel scientific contribution (which was present in the initial draft, but has been more explicitly declared in the revision). There are a few minor formatting and language changes needed before publication: Page 6: “visualizing the b of the estimated probability density distribution (pdf) which will be called in short the distribution of the variable”: If ‘b’ is a variable, it is recommended that it be italicized to avoid confusion. Grammarly unfortunately doesn’t catch ‘atomic typos’, so be on the lookout for those in final revision. For example, p.11: “The bean plot has a mayor limitation” → “major”, p. 19 acknowledgements: “web scrapping” → ‘web scraping’. Note also p.12: "distrubition” → ‘distribution’. Please add: - label to y-axis in Fig. 1a - y-axis values to Fig. 20, 21 - titles to Fig. 13-18, 25-30 Fig. 10b particularly aids reader understanding of distributional differences, and this reviewer is appreciative of its addition, along with other ridgeline plots and the accompanying assessment of their merits. The authors’ thoughtful and comprehensive revision of this paper merits its publication. A note to the editor: It would aid ease of reading for figures and their captions to be included in-context within the paper, with paper body wrapped around. If PLOS intends to increasingly publish content related to data visualization (which would be in the scientific interest), this is a recommended amendment to the paper layout criteria. Reviewer #4: General comments: The authors introduce the Mirrored Density plot as a method to automate the visualisation of univariate densities, with a focus on the case where many features from the same dataset need to be visualised. 
The authors rightly point out that in a situation where the distributions of many variables need to be inspected as part of an exploratory analysis it is crucial that visualisation tools provide robust defaults that avoid producing misleading plots for a wide variety of distributions. At the core of this manuscript is the authors’ argument that their MD plot, which uses Pareto Density Estimation to obtain a density estimate, is superior to other commonly used visualisations, like the ridgeline, violin, or bean plot. While the argument is generally well presented and the authors offer several examples that are well suited to illuminate the differences between the various visualisation techniques, there is a key point the authors appear to be missing. The process of visualising the distribution of a univariate variable consists of two main steps, density estimation and visualisation. The authors make a convincing argument that PDE is better suited to the task than other commonly used methods, as it doesn’t rely on the user choosing appropriate parameters. However, the authors conflate the issue of density estimation with the visualisation by equating different visualisation approaches with the default estimation techniques offered by the implementations used in the comparison. The fact that PDE is well suited to the task isn’t particularly surprising but making it readily available for data visualisations is indeed a useful contribution. Considering that the primary contribution of this manuscript is relating to data visualisation (rather than density estimation) I am surprised that they do not offer a more systematic discussion of the relative merits of the different visualisation methods included in the comparison. As I see it the major features that distinguish these plots are 1. Horizontal vs vertical display 2. Presence or absence of a rug 3. Whether density estimates are displayed beyond the range of the data 4. 
Whether the density estimate is mirrored to create a symmetric display The authors have chosen a particular combination of these features but do not articulate clearly why they believe this to be desirable nor do they provide any evidence that this particular visualisation (as opposed to density estimation) is superior to others. In fact, it seems to me that the MD plot is essentially a violin plot with different default density estimation. Detailed comments: 1. The introduction contains several references to histograms that seem less relevant now that histograms have been replaced by ridgeline plots for the purpose of the comparison. It would be helpful to shift the focus to ridgeline plots earlier. On pages 3 and 8, it is stated that the comparison will include histograms but there is no mention of the ridgeline plot. 2. I agree that the naming of the plot has potential for confusion with the existing plot of a similar name. The authors may wish to consider whether an alternative name would suit them better. I would, however, discourage arguments about which of the two methods is more appropriately named in the manuscript. 3. The quality of written English in the manuscript is generally acceptable but could be improved in a few places. I would encourage the authors to follow through on their plan to obtain assistance in editing the manuscript prior to publication. ********** 7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: Jane L. 
Adams Reviewer #4: Yes: Peter Humburg [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step. |
| Revision 2 |
|
PONE-D-19-19081R2 Analyzing the Fine Structure of Distributions PLOS ONE Dear Dr. Thrun, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. We apologize for the delay in getting back to you as we had difficulty in finding enough reviewers for the revised version of your manuscript. Not all the initial reviewers were available to review the revised version and finding a reviewer with relevant expertise who would accept to review has taken a long time. Nonetheless, we invite you to submit a revised version of the manuscript that addresses the points raised by Reviewer #5. Please submit your revised manuscript by Aug 28 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols We look forward to receiving your revised manuscript. Kind regards, Fatemeh Vafaee, Ph.D. Academic Editor PLOS ONE [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #4: All comments have been addressed Reviewer #5: (No Response) ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #4: Yes Reviewer #5: Partly ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #4: Yes Reviewer #5: N/A ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? 
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #4: Yes Reviewer #5: Yes ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #4: Yes Reviewer #5: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #4: (No Response) Reviewer #5: The authors present a variant of the violin plot, termed “mirrored density” plot, which is intended to provide users with a more useful depiction of the underlying univariate distribution for the purposes of data exploration. The authors correctly highlight the fact that the default parameters for many popular packages may not be suitable for data exploration purposes as they are not sensitive enough to the fine structure of the data. 
The mirrored density plot is proposed to address this shortcoming of existing visualization software by utilizing Pareto Density Estimation for the estimation of univariate probability densities. In order to argue for the adoption of mirrored density plots, the authors present a series of experiments on both simulated and real datasets in which mirrored density plots are compared to violin, ridgeline, and bean plots. Statistical tests were performed on simulated datasets to test for the presence of certain assumed/designed features (i.e., bimodality and/or skewness), and plots were qualitatively inspected for agreement with these tests. I commend the authors for reproducing their work in Python in addition to R to ensure that performance is not implementation dependent. Reproducibility is an important concern, and it is good to see the authors taking measures to ensure consistency. There are several issues that I feel need to be addressed: 1) It is unclear why vioplot was used as the representative package for violin plots. ggplot2 is more widely used and accepted within the R community (155K monthly downloads vs 9K – although ggplot does a lot more than violin plots to be fair). Furthermore, the underlying functionality for MDplot is provided by ggplot2’s violin plot (https://github.com/Mthrun/DataVisualizations/blob/bc76a8c6dc737cb5c593479a534ef2a5b60b330e/R/ClassMDplot.R#L148), so it seems strange not to use this package for comparison. Please see Replication_Exp1_Fig3.svg. This figure shows the mirrored density plot overlaid with two violin plots from ggplot2. The green outline was produced with default parameters, and the red line with the minor adjustments which will be described below. When using ggplot2’s violin plot, multimodality is clearly visible when the second mean is 2.4 or 2.5 unlike the plots produced by vioplot. Please see Replication_Uniform_Fig1.svg. This is a replication of the 1000 uniform samples figure. 
Again, 2 violin plots are presented on top of the MD plot. The green line is with default parameters. The red line, which almost exactly matches the MD plot, uses kernel=’rectangular’, and adjust=’0.8’. I understand that there is an argument for providing useful default parameters, but I am not convinced it warrants an entirely new package. The use of vioplot instead of ggplot2 is largely responsible for the author’s claim that “Violin plots in R were not able to visualize the bimodality, which was surprising.” 2) “Statistical testing indicated that the ridgeline plot, bean plot, and MD plot have a similar sensitivity regarding bimodality and skewness as long as the sample is large enough.” – This is a gross misrepresentation of the statistical testing performed in this work. The statistical tests referred to were intended to test for the presence of bi-modality or skewness in the simulated datasets. They do not assess the performance of plotting methods. As such, this should not be taken as statistical evidence supporting MDplot’s performance. This paper is ultimately a qualitative comparison of methods and should be treated as such. 3) There is no justification/discussion regarding sample sizes in the simulated datasets which seem to have been chosen arbitrarily. Why were 1000 samples included for the uniform example, 15500 for multi-modality, and 15000 for skewness? It would also be valuable to see how each method performs at various sample-sizes as not all data exploration takes place with such a large sample size. The smaller/real dataset experiments do not address this question as the “ground-truth” behind the structure is ultimately unknown. 4) There is no discussion surrounding limitations/shortcomings of the work. It is important to provide this information for potential users so they can make a well-informed decision about whether this package is appropriate for their data. 
I strongly recommend a discussion surrounding the shortcomings of the qualitative nature of this work. Quantitative comparisons are possible for this sort of work; for example, blind surveys could be conducted to see whether individuals can detect underlying structure from the plots alone (or whether they detect structure which is not there). Furthermore, there is no discussion surrounding the tendency of this method to overfit to the data.
Minor corrections:
• Throughout the manuscript, both in text and in figures, when referring to a normal distribution, m and sd should be replaced with µ and σ, respectively (e.g., Fig 3b).
• The authors should attempt to install their package (DataVisualizations) on a clean installation of R. It does not properly install the required packages. These packages have to be added manually.
• It would be nice to have figures either superimposing the MD/violin/bean plots or showing them side by side for an easy visual comparison.
• When referring to the skewed normal distribution, you should use SN, and not N, to avoid confusion with the actual normal distribution (e.g., Fig 5b).
• Plots should be formatted consistently. For example, the titles of some are bolded (5b) while the others are not (5a).
• Fig 6a uses the naming “beanplot” whereas the rest of the paper uses “bean plot”.
• In “Given a feature in the data space, there are several approaches for evaluating univariate structures using the indications of the quantity and range of values, e.g., quantile-quantile plots”, “e.g.” should just be spelled out as “for example”.
• “The counterparts of the box plot are the range bar [7], and its extension to the notched box plot [8] is nearly unable to visualize multimodality [3]; therefore, it is disregarded in this work.” I think you mean to say “The boxplot and its counterparts (i.e., range bar and notched box) are unable to visualize multimodality and are therefore disregarded in this work.”, but I’m not sure.
• The following quote is missing an ending quote: “Pareto density estimation (PDE), the radius for hypersphere density estimation is chosen optimally w.r.t information theoretic ideas [28].
• W.r.t (with respect to) above should probably be spelled out for clarity. Square brackets can be used to indicate a quote has been altered.
• There are several places where the phrase “ridgeline plot, violin plot and bean plot” is used. I would suggest changing to “ridgeline, violin, and bean plots” for brevity. Should this suggestion be ignored, then “Although the Python ridgeline and the violin plot use density estimators implemented in different packages” should be made consistent.
• “Plot” in the quote above should be “plots”; the same correction applies to “in contrast to the histogram and MD plot.”
• Remove the first comma in “The results show that the MD plot is the only schematic plot, which is appropriate for every case and does not require adjustments to its process of density estimation by various parameters” (currently it reads as if the MD plot is the only schematic plot, which it isn’t).
• In “Using web scraping, the information of n=269 cases was extracted.” replace “was” with “were”.
Overall, the English is good; however, there are minor typos and punctuation mistakes scattered throughout. I acknowledge that this was professionally vetted by Nature/Springer for language editing, but they missed several mistakes.
Further notes regarding the similarity of “rectangle” kernel estimation to the Pareto Density Estimation approach:
• I have included another comparison of violin and MD plots applied to uniformly sampled data (small_uniform_sample.svg). Again, I was able to get quite a similar result using the “rectangle” option for kernel density estimation.
• In all fairness, this required a smaller bandwidth factor (adjust was set to 0.5 instead of 0.8).
• This suggests that there may be an argument to be made in support of PDE, as it does a better job showing the fine-grained structure of the data.
• Furthermore, I concede that these plots taper off towards the end, which may be misleading to end users.
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. |
| Revision 3 |
|
PONE-D-19-19081R3 Analyzing the Fine Structure of Distributions PLOS ONE Dear Dr. Thrun, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please submit your revised manuscript by Sep 18 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols
We look forward to receiving your revised manuscript. Kind regards, Fatemeh Vafaee, Ph.D. Academic Editor PLOS ONE
Additional Editor Comments (if provided): Comment from Editor: I appreciate your effort in improving the manuscript as per the reviewers' comments. Before the paper can be accepted, please address the minor comment raised by the reviewer, review the manuscript for English quality to make sure that there are no grammatical errors, and improve the figures' quality wherever possible.
Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #5: All comments have been addressed ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. 
Reviewer #5: (No Response) ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #5: (No Response) ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #5: (No Response) ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #5: (No Response) ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #5: I thank the authors for taking the time to consider my recommendations, and I am reasonably satisfied with how they have been addressed. In particular, I am pleased to see a discussion surrounding method limitations and a revision of the interpretation of statistical testing. 
Although it would arguably have been more appropriate to compare MD plots to geom_violin (instead of vioplot) in the main figures, the authors have included in-text references to SI F, which does make these comparisons, and have noted that geom_violin is capable of detecting bimodality. They have noted that they are happy to improve figures/grammar upon acceptance, so I will leave it to the editor to make a decision regarding this matter. I will include one small nitpick, however. The authors replaced all occurrences of "e.g" with "for example". My apologies for not being clearer in my comments. I was only suggesting that the one instance of "e.g" be replaced with "for example", as it was in-sentence. It is still appropriate (and probably preferable) to make use of "e.g" inside parentheses (e.g here). |
| Revision 4 |
|
Analyzing the Fine Structure of Distributions PONE-D-19-19081R4 Dear Dr. Thrun, We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Kind regards, Fatemeh Vafaee, Ph.D. Academic Editor PLOS ONE |
| Formally Accepted |
|
PONE-D-19-19081R4 Analyzing the Fine Structure of Distributions Dear Dr. Thrun: I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Fatemeh Vafaee Academic Editor PLOS ONE |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.