Peer Review History

Original Submission: April 10, 2022
Decision Letter - Yangyang Xu, Editor

PONE-D-22-10602: Data-Driven Models for Atmospheric Air Temperature Forecasting at a Continental Climate Region. PLOS ONE

Dear Dr. Hameed,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Aug 06 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Yangyang Xu

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections does not match. 

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

3. Thank you for stating the following financial disclosure: 

"This study is funded from Almaarif university college"

Please state what role the funders took in the study.  If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." 

If this statement is not correct you must amend it as needed. 

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

4. Thank you for stating the following in the Acknowledgments Section of your manuscript: 

" The authors like to thank Al-Maarif university college for supporting this study "

We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. 

Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: 

"This study is funded from Almaarif university college"

5. Please include your full ethics statement in the ‘Methods’ section of your manuscript file. In your statement, please include the full name of the IRB or ethics committee who approved or waived your study, as well as whether or not you obtained informed written or verbal consent. If consent was waived for your study, please include this information in your statement as well. 

6. We note that Figure 1 in your submission contains [map/satellite] images which may be copyrighted. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For these reasons, we cannot publish previously copyrighted maps or satellite images created using proprietary data, such as Google software (Google Maps, Street View, and Earth). For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (a) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (b) remove the figures from your submission:

a. You may seek permission from the original copyright holder of Figure(s) [#] to publish the content specifically under the CC BY 4.0 license.  

We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission.

In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

b. If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

The following resources for replacing copyrighted map figures may be helpful:

USGS National Map Viewer (public domain): http://viewer.nationalmap.gov/viewer/

The Gateway to Astronaut Photography of Earth (public domain): http://eol.jsc.nasa.gov/sseop/clickmap/

Maps at the CIA (public domain): https://www.cia.gov/library/publications/the-world-factbook/index.html and https://www.cia.gov/library/publications/cia-maps-publications/index.html

NASA Earth Observatory (public domain): http://earthobservatory.nasa.gov/

Landsat: http://landsat.visibleearth.nasa.gov/

USGS EROS (Earth Resources Observatory and Science (EROS) Center) (public domain): http://eros.usgs.gov/#

Natural Earth (public domain): http://www.naturalearthdata.com/


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: N/A

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The paper presents an interesting case study on applying four machine learning models to forecast air temperature at the Cray station in North Dakota, using selected historical temperatures as features. The forecasting performance of the four models is compared. There are several aspects that are not adequately addressed by the paper (see the major comments below), making the comparison results less solid.

Major comments:

1. As shown in Figure 6, the temperature data in the paper exhibit strong seasonality. A common practice for seasonal time-series data is to decompose the data into a seasonal component and a deseasonalized component, and then use ML models to predict the latter. This usually improves forecast performance since some of the temporal dependence in the data can be captured by the seasonal models. The paper should also compare with the ML models using deseasonalized data.
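[Editorial note: a minimal sketch of the decomposition the reviewer describes, on synthetic data standing in for the station record; the seasonal component here is a simple day-of-year climatology, one common choice among several.]

```python
import numpy as np

# Sketch: subtract a day-of-year climatology before fitting an ML model.
# Synthetic 5-year "temperature" series: annual cycle plus noise.
rng = np.random.default_rng(0)
period = 365
t = np.arange(5 * period)
series = 10 * np.sin(2 * np.pi * t / period) + rng.normal(0, 1, t.size)

# Seasonal component: mean value for each day-of-year over the record.
doy = t % period
climatology = np.array([series[doy == d].mean() for d in range(period)])
deseasonalized = series - climatology[doy]

# The ML model would be trained on `deseasonalized`; a forecast for
# day-of-year d is then model_output + climatology[d].
print(deseasonalized.std() < series.std())  # True: seasonality removed
```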

2. It is unclear to me how tuning parameters are chosen for each ML model using the "trial-and-error method," and hence the performance comparison in the paper is not entirely convincing (and the results may not be reproducible either). I think the paper should rigorously describe the tuning parameter selection procedure, including but not limited to: What are the hyperparameters being tuned? What are their candidate values? What are the criteria to select the optimum values? What are the selected hyperparameters for the results reported in Section 3?

3. I think more details and justifications are needed on using autocorrelation function (ACF) and partial autocorrelation function (PACF) to choose the number of lags in the candidate models (Table 3). While it seems that the paper is using the lags that lead to PACF values larger than a certain threshold to determine the features included in the daily models, this does not seem to be the case for the weekly models. Also, the weekly models in general achieve better performance by including more features. Does including lag 8 and beyond further improve the performance?

4. The paper should include the performance of traditional time-series models (e.g., ARIMA) as a benchmark, otherwise it is hard to gauge the performance of the four ML models. There are also many off-the-shelf data-based approaches for univariate time-series forecasting, such as THETA [1], Prophet [2] and TBATS [3]. It would be interesting to see what their performance would be for this data set.
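[Editorial note: a sketch of what the simplest such benchmark involves, an AR(p) model fit by ordinary least squares on synthetic data; a production benchmark would use a full ARIMA implementation, e.g. from statsmodels, on the actual station record.]

```python
import numpy as np

# Sketch of a classical benchmark: an AR(p) model fit by ordinary least
# squares in pure NumPy.
def fit_ar(y, p):
    # Design matrix: intercept plus lags 1..p of the series.
    X = np.column_stack([y[p - k - 1 : len(y) - k - 1] for k in range(p)])
    X = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef  # [intercept, weight on lag 1, ..., weight on lag p]

# Synthetic AR(1) series with true coefficient 0.8.
rng = np.random.default_rng(1)
y = np.zeros(2000)
for i in range(1, len(y)):
    y[i] = 0.8 * y[i - 1] + rng.normal()

coef = fit_ar(y, 1)
print(round(coef[1], 2))  # close to the true value 0.8
```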

5. It is unclear how forecasting is performed on the test data set. For example, to forecast $T_t$ on 2020-1-1 using M2, do you plug in the actual values of $T_{t-1}$ and $T_{t-3}$, or do you plug in their predicted values? From a practical perspective, these two approaches correspond to a one-step-ahead forecast and a multiple-step-ahead forecast, respectively. The paper should clarify which of these two types of forecasting is considered.
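[Editorial note: the distinction can be sketched with a hypothetical toy model; the function and values below are illustrative only, not the paper's.]

```python
# One-step-ahead plugs in observed lags; multi-step-ahead recursively
# feeds the model's own predictions back in.
def one_step(model, actuals):
    # Each forecast uses the *observed* previous value.
    return [model(a) for a in actuals[:-1]]

def recursive(model, start, horizon):
    # Each forecast feeds back the model's own previous prediction.
    preds, x = [], start
    for _ in range(horizon):
        x = model(x)
        preds.append(x)
    return preds

model = lambda x: 0.9 * x            # toy stand-in for a fitted model
actuals = [10.0, 9.5, 8.0, 7.0]
print([round(v, 2) for v in one_step(model, actuals)])   # [9.0, 8.55, 7.2]
print([round(v, 2) for v in recursive(model, 10.0, 3)])  # [9.0, 8.1, 7.29]
```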

6. The paper compares RT and QRT with GBR. I would recommend also comparing their performance with random forest [4] and quantile regression forest [5], which are known to have lower prediction variance than RT and QRT, respectively.

7. In Section 4, the paper summarizes the forecast errors by scatter plots, histograms, and boxplots, which seem redundant to me as they all convey similar information on the model performance. In addition, it would be interesting to see a plot of residuals and/or errors against the time, from which one can examine the temporal patterns of the residuals/errors. A strong temporal dependence in the residuals may suggest that the model is underfitting the dependence among the data and more complex models may be needed.

Minor comments:

1. What is the sample size of the training and test data sets in each model?

2. In the second paragraph of Section 2.1, what are the reported summary statistics? The standard deviation for July does not seem correct.

3. In Figure 8, are the models for different months trained on the same data set from 2000-2015, or is the model for a particular month trained on the data from that month only (e.g., the model for January trained on January data only)? Due to the strong seasonality in the data, the results from the two approaches may be different.

4. In Equations (1) and (3), there should be a transpose sign between w and x. The optimization formulations in Equations (2) and (3) should include the "slack variables."

5. The mathematical notations should be clearly defined and used consistently. For example, throughout the paper, $\gamma$, $y$ and $T$ are all used to denote the dependent variable. Equations in Section 2.5 are not correctly numbered.

6. The writing in Section 3 could be improved. For example, the detailed discussion on model fitting during the training phase seems unnecessary and could be misleading as the main focus of the paper is on out-of-sample forecasting.

Reference:

[1] Assimakopoulos, V. and Nikolopoulos, K. (2000). The theta model: a decomposition approach to forecasting. International Journal of Forecasting 16, 521-530.

[2] Taylor SJ, Letham B. (2017). Forecasting at scale. PeerJ Preprints 5:e3190v2 https://doi.org/10.7287/peerj.preprints.3190v2

[3] De Livera, A.M., Hyndman, R.J., & Snyder, R. D. (2011). Forecasting time series with complex seasonal patterns using exponential smoothing, Journal of the American Statistical Association, 106(496), 1513-1527.

[4] Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.

[5] Meinshausen, N., & Ridgeway, G. (2006). Quantile regression forests. Journal of Machine Learning Research, 7(6).

Reviewer #2: Review of: Data-Driven Models for Atmospheric Air Temperature Forecasting at a Continental Climate Region.

This paper uses multiple machine-learning or deep-learning methods to predict daily and weekly air temperature, given the air temperature of previous days (or weeks). Since machine learning and deep learning have become some of the main techniques used in various fields, the application of these methods in weather and atmospheric science has been growing. In that sense, I think this paper could be one of the case studies of applying ML and DL methods to weather prediction.

However, there are some concerns regarding the performance of the model. I think the performance of the developed model is not sufficient to claim that the authors are actually predicting air temperature. Additional analysis to increase the model performance is needed for this paper to be considered for publication.

Major comments:

1. Section 2.2 – 2.4

These methodologies are widely used, and it is unnecessary to explain the standard procedures for each method. What is important is to highlight the advantage of each model and the variations the authors made to each model to make it more suitable for this study. For example:

1) The authors used an RBF kernel for SVR. What is the bandwidth of the kernel? Is there a regularization parameter?

2) For the DT methods, what are the depths of the trees? What is the minimum number of samples required to split a node?

3) Elaborate on the pruning method. How is this done?

4) For GBR, what is the learning rate? What is the depth of the model?

Overall, there are many hyperparameters that the authors selected, but none of these values were given. It would be more informative to give these numbers than to explain the standard model background. Furthermore, with the hyperparameter values given, it is essential to perform a sensitivity test on the hyperparameters; the authors note that trial and error with 100 iterations was used for this. Was there a big difference in the performance of the model? What are the selected parameters?

2. Please elaborate on the ACF and PACF (e.g., provide their equations). Why are these functions used to determine the best lags?
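[Editorial note: one standard sample definition of both functions, sketched in plain NumPy, with the PACF at lag k taken as the last coefficient of a Yule-Walker AR(k) fit; library routines such as statsmodels' `acf`/`pacf` would be used in practice. The demonstration series is synthetic.]

```python
import numpy as np

# Sample ACF; PACF via successive Yule-Walker AR(k) fits.
def acf(y, nlags):
    y = y - y.mean()
    c0 = np.dot(y, y) / len(y)
    return np.array([1.0] + [np.dot(y[:-k], y[k:]) / (len(y) * c0)
                             for k in range(1, nlags + 1)])

def pacf(y, nlags):
    r = acf(y, nlags)
    out = [1.0]
    for k in range(1, nlags + 1):
        # Toeplitz autocorrelation matrix for lags 1..k.
        R = np.array([[r[abs(i - j)] for j in range(k)] for i in range(k)])
        out.append(np.linalg.solve(R, r[1 : k + 1])[-1])
    return np.array(out)

# For an AR(1) process, the PACF should cut off after lag 1 -- which is
# why the PACF is used to pick the lagged features of an AR-type model.
rng = np.random.default_rng(2)
y = np.zeros(4000)
for i in range(1, len(y)):
    y[i] = 0.7 * y[i - 1] + rng.normal()

p = pacf(y, 3)
print(abs(p[1]) > 0.5 and abs(p[2]) < 0.1 and abs(p[3]) < 0.1)  # True
```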

3. On the other hand, despite the RT and QRT models showing the best performance during the training phase, they came last during the testing phase since they may overfit the data during the training phase.

Did the authors fix the hyperparameters to prevent the overfitting? I think if the authors suspect that overfitting happened, they should fix the models.

4. For the daily predictions, the MAE of the best model (SVR) is about 2.7 °C, and for the weekly predictions, the MAE is 3.3 °C. The problem with these numbers is that we don’t know how good they are, so there should be some kind of baseline for comparison. For example, what would be the MAE of the daily predictions if we just used the previous day’s temperature as the prediction? Or what if we just used the seasonality of temperature for the prediction?
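[Editorial note: the persistence baseline the reviewer describes can be sketched as follows, on synthetic data; the station record would be substituted for `temps`.]

```python
import numpy as np

# Persistence ("same as yesterday") baseline: the error floor any ML
# model should beat.
def persistence_mae(series):
    preds = series[:-1]              # yesterday's value as today's forecast
    return np.mean(np.abs(series[1:] - preds))

rng = np.random.default_rng(3)
t = np.arange(3 * 365)
temps = 10 * np.sin(2 * np.pi * t / 365) + rng.normal(0, 3, t.size)
print(f"persistence MAE: {persistence_mae(temps):.2f} degC")
```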

5. One of the concerns is that the last 30% of the observation record is used as the testing dataset. As the authors mentioned in the introduction, the climate is changing, so the physical drivers of temperature, as well as the mean and trend of temperature, may change during the observation period. The authors could use 30% of randomly selected data as the testing data, or add a "year" term to the model, and see how the results change.

6. A more detailed assessment of the predictions should be made. For example, looking at Fig. 10, the prediction errors are higher at colder temperatures. Naturally this is reasonable since temperature variability is higher in winter. A seasonal analysis of temperature should be included. (Now that I look at Figure 8, I see the seasonal analysis.) Please discuss this in the main text, and please provide some insight into why the increase of error in winter is happening.

7. The authors discussed outliers (or extreme temperature events) using the IQR. One of the most important features in forecasting temperature is the ability to capture extreme temperatures. So, rather than using the IQR with all datasets, the authors should examine how the models capture seasonal extreme temperatures (for example, 2-sigma values in June, July, and August for heat extremes).

8. Overall, I don’t think the model is doing a good job of forecasting temperature. I took the referenced data (is it Cray or Crary? I could not find a station named Cray). For the 2000-2021 period, I simply compared each daily temperature with the daily temperature of the day before, and the RMSE was 3.83 °C for the entire period, while the MAE was 2.91 °C. I mentioned this in a previous comment, but this means the model is not significantly better than simply taking the temperature of the day before as its prediction. Furthermore, the mean error in my calculation was -0.0015 °C, and the STD was 3.83 °C. Comparing this with Fig 11, there is no significant improvement in the model. The attempt to utilize ML to forecast temperature is indeed valuable, but better model performance is needed for this research to be justified.

Minor comments:

It is well known that numerous meteorological and ecological events, human life, and crops in agricultural areas are significantly influenced by climate conditions as well as the environment's physical conditions.

- “Environment’s physical conditions” are rather vague. Please clarify or give examples.

Given these climate conditions, the temperature is a critical factor that can change and is one of the most significant meteorological parameters.

- What are “these climate conditions”? This sentence is confusing. Are you trying to say that temperature can alter other environments?

Moreover, air temperature is one of the most influential factors in the evapotranspiration phenomenon, which is vital for managing water resources and agricultural activities.

- “phenomenon” can be deleted

Furthermore, it has been observed that the temperature varies significantly, which may be responsible for the changes in weather throughout the time [41]

- This is unclear. Are you referring to the seasonal variation of temperature or the year-to-year variation of extreme temperatures?

The upper air patterns of each season have distinctive features, bringing several weather conditions.

- What are the distinctive features and several weather conditions? Elaborate or delete this sentence.

Accordingly, the continental climate of North Dakota makes the forecasting of the weather patterns a problematic task.

- Why is it difficult to forecast the weather in a continental climate compared to other climates? Please elaborate.

Tables 1 and 2 show the statistical characteristics of the minimum, mean, average, standard deviation, and skewness of the daily and weekly air temperature at the Cray meteorological station from 2000 to 2021.

- Is it Cray or Crary? I could not find station named Cray. Please make sure.

Are all the diagrams for the models (Figs 2-4) really necessary? I advise deleting these figures.

There are lots of interesting figures, but they are not discussed in the main text. Please elaborate on the figures in the main text.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

Dear Editor and reviewers,

All the received comments have been carefully addressed.

Attachments
Attachment
Submitted filename: Response to Reviewers.docx
Decision Letter - Yangyang Xu, Editor

PONE-D-22-10602R1: Data-Driven Models for Atmospheric Air Temperature Forecasting at a Continental Climate Region. PLOS ONE

Dear Dr. Hameed,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Oct 19 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Yangyang Xu

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: I appreciate the effort authors made to improve the paper. I think this is publishable after minor revisions.

1. This paragraph is unclear

North Dakota is located in the middle of North America and is subjected to extreme climate conditions, with hot summers and cold winters.

- Isn’t this the same for all regions? Why did you select North Dakota?

Furthermore, it has been observed that the temperature varies extremely from season to season, which may be responsible for the changes in weather throughout the time [42].

- Same. Of course, weather is causing temperature variation. Why is ND special? Or does it represent the general climate in continental US?

The continental climate of North Dakota makes forecasting weather patterns a problematic task. The difficulty in forecasting may be related to the climate throughout the year. Knowing that the high variation in temperatures through the seasons may hinder predicting the temperature

- It is confusing what the authors are trying to tell us. Please clarify.

2. On the SVR method, the authors say: “14 However, this equation is unreliable in many hydrological challenges (nonlinear- regression analyses)”. However, the authors are predicting temperature, which is not hydrological (it might be in some aspects, but not generally). Please clarify or make it consistent.

3. This is clearly wrong: The first reason may be associated with the length of the data set used to develop the models. The St.D of the temperature records in February is very high (St.D = 7.695). The second significant reason is the data length used in this study.

The first reason is associated with the variability of temperature in wintertime, not the length of the data. Furthermore, the authors claim this is due to insufficient data (in the second reason). I assume this is because February has only 28 days, compared to other months with 30 or 31 days. Do the authors think a 2-day (about 8%) difference really makes a large difference in prediction performance? This can be tested by shortening the other months to 28 days. If the authors are going to make this claim, it should be tested. However, I think this is not the main driver of the performance difference. The performance difference also appears in the weekly analysis, in which February has roughly the same length (4-5 weeks) as the other months.

4. The last reason may be related to the extreme negative values. This case study has a slight increase in temperature over time; therefore, there are more days with high temperatures than days with low temperatures.

I don’t think the trend in temperature really matters. The authors are claiming that there are very few extreme-low temperatures to train on, and that this is mostly due to the high variability of temperature in wintertime, which overlaps with the first reason.

5. Same. Please make more high-level analysis on the reason: Moreover, the advanced analysis of forecasting error exhibits that the performance of the models is significantly affected by the length, consistency, and variability of data. As February has a fewer number of days as well as higher variability of recorded data, the monitored error in this month is higher than in other months.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 2

Dear Editor,

Thank you for giving us the opportunity to submit a revised draft of our manuscript titled

"Data-Driven Models for Atmospheric Air Temperature Forecasting at a Continental Climate Region" to PLOS ONE. The authors appreciate the time and effort that you and the reviewers have dedicated to providing valuable feedback on our manuscript, and we are grateful to the reviewers for their insightful comments. We have incorporated changes to reflect the suggestions provided by the reviewers, and all changes in the manuscript have been highlighted.

Here is a point-by-point response to the reviewers' comments and concerns.

Comments from Reviewer

Reviewer #2: I appreciate the effort the authors made to improve the paper. I think this is publishable after minor revisions.

1. This paragraph is unclear

North Dakota is located in the middle of North America and is subjected to extreme climate conditions, with hot summers and cold winters.

- Isn’t this the same for all regions? Why did you select North Dakota?

Reply:

- Yes, this is also true of other regions with a continental climate. One of the characteristics of this climate is the large difference between summer and winter temperatures.

- The reasons for selecting North Dakota are:

1- The case study is located in a climate with significant temperature changes across the seasons.

2- Data availability.

3- It has a remarkable location: North Dakota is almost equidistant from the Equator and the North Pole.

All of the above points have been incorporated into the revised manuscript.

Furthermore, it has been observed that the temperature varies extremely from season to season, which may be responsible for the changes in weather throughout the time [42].

- Same. Of course, weather is causing temperature variation. Why is ND special? Or does it represent the general climate in continental US?

Reply: Due to its far inland location and its nearly equal distance from the North Pole and the Equator, North Dakota experiences noticeable temperature fluctuations. We have incorporated this point into the revised manuscript.

The continental climate of North Dakota makes forecasting weather patterns a problematic task. The difficulty in forecasting may be related to the climate throughout the year. Knowing that the high variation in temperatures through the seasons may hinder predicting the temperature

- It is confusing what the authors are trying to tell us. Please clarify.

Reply: Thank you for pointing out this problem. The authors have done their best to clarify this sentence as per the reviewer's recommendations.

2. On SVR method, authors say: “14 However, this equation is unreliable in many hydrological challenges (nonlinear- regression analyses)”. However, the authors are predicting temperature, which is not hydrological (it might be in some aspect, but not generally). Please clarify or make it consistent.

Reply: The authors have now clarified this sentence as per the reviewer's recommendations; it is now consistent with the context of temperature forecasting.

3. This is clearly wrong: The first reason may be associated with the length of the data set used to develop the models. The St.D of the temperature records in February is very high (St.D = 7.695). The second significant reason is the data length used in this study. The first reason is associated with the variability of temperature in wintertime, not the length of the data. Furthermore, the authors claim this is due to insufficient data (in the second reason). I assume this is because February has only 28 days, compared to other months with 30 or 31 days. Do the authors think a 2-day (about 8%) difference really makes a large difference in prediction performance? This can be tested by shortening the other months to 28 days. If the authors are going to make this claim, it should be tested. However, I think this is not the main driver of the performance difference. The performance difference also appears in the weekly analysis, in which February has roughly the same length (4-5 weeks) as the other months.

Reply: The authors are very thankful to the reviewer for pointing out this issue. We fully agree that the length of the data is not what makes a considerable difference in the prediction error; the variability of temperature is more likely the main reason. We have adjusted the manuscript accordingly and have also examined the effect of extreme values on model performance.

4. The last reason may be related to the extreme negative values. This case study has a slight increase in temperature over time; therefore, there are more days with high temperatures than days with low temperatures. I don’t think the trend in temperature really matters. The authors are claiming that there are very few extreme-low temperatures to train on, and that this is mostly due to the high variability of temperature in wintertime, which overlaps with the first reason.

Reply:

1. The authors completely agree with the reviewer that the first reason, related to data duration, does not have a significant effect on model performance. Therefore, as per the reviewer's suggestion, the authors have removed it from the manuscript and have instead discussed this issue in terms of the variability of the data.

2. Moreover, the authors would like to clarify that the extreme negative values explain the performance of the models for the months when temperatures are very low (i.e., January, February, and December). For these months, the extreme negative values are fewer in comparison to other months (see Table 1), leading to higher forecasting errors (see Figure 7).

5. Same. Please make more high-level analysis on the reason: Moreover, the advanced analysis of forecasting error exhibits that the performance of the models is significantly affected by the length, consistency, and variability of data. As February has a fewer number of days as well as higher variability of recorded data, the monitored error in this month is higher than in other months.

Reply: A more high-level analysis has been performed accordingly; please see page 27, lines 4-9.

Attachments
Attachment
Submitted filename: Response to Reviewers.docx
Decision Letter - Yangyang Xu, Editor

Data-Driven Models for Atmospheric Air Temperature Forecasting at a Continental Climate Region.

PONE-D-22-10602R2

Dear Dr. Hameed,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Yangyang Xu

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: I acknowledge the effort the authors have made to reflect my suggestions. I think this paper is now publishable.

However, I am not capable of going through minor grammatical issues, so please make sure your paper is grammatically sound.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

**********

Formally Accepted
Acceptance Letter - Yangyang Xu, Editor

PONE-D-22-10602R2

Data-Driven Models for Atmospheric Air Temperature Forecasting at a Continental Climate Region.

Dear Dr. Hameed:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Yangyang Xu

Academic Editor

PLOS ONE

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.