Our objective was to quantify the similarity in the meteorological measurements of 17 stations under three weather networks in the Alberta oil sands region. The networks were for climate monitoring under the water quantity program (WQP) and the air program, including Meteorological Towers (MT) and Edge Sites (ES). The meteorological parameters were air temperature (AT), relative humidity (RH), solar radiation (SR), barometric pressure (BP), precipitation (PR), and snow depth (SD). Among the various measures evaluated for finding correlations in this study, we found that Pearson’s coefficient (r) and absolute average error (AAE) would be sufficient. We also applied the percent similarity method, treating a station pair as similar when at least 75% of its paired values agreed. Our results showed that we could optimize the networks by selecting the fewest stations (for each network) needed to describe the measured variability in the meteorological parameters. We identified that five stations are sufficient for the measurement of AT, one for RH, five for SR, three for BP, seven for PR, and two for SD in the WQP network. The MT network requires six for AT, two for RH, six for SR, and four for PR, and the ES network requires six for AT, three for RH, six for SR, and two for BP. This study could potentially be critical to rationalize/optimize weather networks in the study area.
Citation: Deshmukh D, Ahmed MR, Dominic JA, Zaghloul MS, Gupta A, Achari G, et al. (2022) Quantifying relations and similarities of the meteorological parameters among the weather stations in the Alberta Oil Sands region. PLoS ONE 17(1): e0261610. https://doi.org/10.1371/journal.pone.0261610
Editor: Ashraf Dewan, Curtin University, AUSTRALIA
Received: September 23, 2021; Accepted: December 6, 2021; Published: January 13, 2022
Copyright: © 2022 Deshmukh et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data used in this study is freely accessible and downloadable from their respective websites, and it is mentioned in the manuscript. The links are as follows: <http://www.ramp-alberta.org/data/map/default.aspx?c=Climate> and <https://wbea.org/network-and-data/monitoring-stations/>.
Funding: This work was funded under the Oil Sands Monitoring (OSM) Program. It was independent of any position of the OSM Program. The fund was awarded to QKH having an agreement no. 19GRAEM25. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
In general, understanding the weather conditions (in other words, the meteorology) of the atmosphere plays a critical role in our sustainable existence on the Earth’s surface, including real-time weather analysis for forecasting weather-induced calamities [1,2]. Such analysis is based on the commonly monitored meteorological parameters, including air temperature (AT), relative humidity (RH), solar radiation (SR), atmospheric pressure (or barometric pressure, BP), precipitation (PR; i.e., rain, snow, freezing rain, sleet, hail, drizzle, and fog), and wind (i.e., speed and direction). These parameters may also be used for other applications. For example, the measurements of AT and BP are used to study the movement of air and energy exchange within the atmosphere. Also, AT and RH are key elements that strongly influence the growth of plants and organisms living in a particular region, and they inform public services and environmental policy. AT is also directly related to the land surface temperature, which has driven the increasing intensity of surface urban heat islands in many cities [5,6], and it helps determine temperature regimes in vegetation phenology. On the other hand, SR plays a key role in the energy balances of various physical, chemical and biological processes and is utilized as the most abundant of all renewable energy resources. In addition, PR measurements have many applications in flood estimation, computing plant water requirements, hydrological analyses and studying water-related issues including erosion and quality. For example, extreme AT and PR events caused warming in the coastal and inland areas, and PR records help in forecasting river flow. Moreover, wind speed and direction recordings are often required to issue weather-related warnings, and they also play a major role in the movement and distribution of spores, pollen, and pollution elements in the atmosphere [12,13].
The meteorological parameters are measured by the weather stations in a network to understand the dynamics of weather conditions for a particular region. Weather stations are usually distributed at specific distances to ensure that their measurements and observations are adequate for the broader needs of climate-dependent services, applications, and research for a region. These observations are required to fulfil certain criteria and accuracies in describing the meteorological parameters. As such, the World Meteorological Organization (WMO) provides a few guidelines to ensure the representativeness of the observations. Such representativeness is often related to space (horizontal spacing between two stations) and time (measurement interval). The horizontal spacing depends on several factors, including station locations (land or sea), type of recordings (continuous or non-continuous), and spatial scale of the weather prediction model (i.e., global, regional or local). For example, the horizontal spacing between two land stations should not exceed 250 km in a populated area and 300 km in sparsely populated areas, whereas the limit is 250 km for sea stations. In the case of various weather-related models, the guidelines suggest that at least one station is required for each 10000, 2500, and 100 km2 area for the Numerical Weather Prediction (NWP) model, Global Model (GM), and Regional Model (RM), respectively. On the other hand, measurement frequency varies depending on the potential applications. For example, intervals of minutes are used for aviation, hours for agriculture, and days for climate description (historical record and description of average daily weather events).
Weather stations are often established for specific purposes by different agencies without coordination or adherence to any established guidelines, which results in a sub-optimal network (with possible overlap) in a region whose stations are spaced more closely or more widely than recommended. Such overlapping networks in a region, serving different stakeholders, usually involve measurements of the same parameters with near-similar readings, although they are intended for different goals and purposes. However, such overlapping networks might result in observational redundancy without improving the quality of information, and they require higher operating and maintenance costs funded from the same source, i.e., the provincial or federal government. Three distinct networks of weather stations are currently operational in the oil sands region of Alberta under the Oil Sands Monitoring (OSM) Program. The program undertakes environmental monitoring within the area that integrates air, water, land, and biodiversity to assess any impacts of oil sands activities on the environment. Research on the rationalization or optimization of networks (i.e., redundancy or gaps in the networks), which establishes the relationship between network density and network performance under such overlap, is found in the literature [20,21].
Rationalization of climate monitoring stations has been implemented in Canada over the last three decades, which has allowed the reduction of several stations without sacrificing useful details of climatic information [22,23]. However, the approach was often restricted to a maximum of two parameters, such as AT and PR, due to their wider applications in climatic and hydrological modelling [24–27]. Such an approach works by capturing and comparing anomalies in the entire network and its various subsets. These anomalies were estimated using three approaches: (i) time-series trend analysis per decade [20,21]; (ii) statistical analysis of the parameters, such as mean, median, variance, standard deviation, and coefficient of variation [24,28,29]; and (iii) spatial descriptive statistics in a GIS (Geographic Information System) environment [27,30–32]. In addition to these approaches, closeness and similarity between two datasets are also estimated using two distinct analyses, i.e., graphical and quantitative. In graphical analysis, the observations of two stations are visually compared for the same period (time-series plot) to identify time-related variations between two datasets, such as linear or nonlinear trends, upward and downward shifts, and the presence of error. Additionally, the scatter plot is another graphical representation that is frequently used to test model performance (closeness between two datasets) by using the coefficient of determination and the slope of the fitted line. On the other hand, quantitative analysis is classified into two categories: (i) analysis of association, and (ii) analysis of coincidence.
Analysis of association is the accuracy estimation using indices such as the coefficient of determination, Pearson correlation coefficient, Spearman’s correlation coefficient, Nash-Sutcliffe coefficient, and cosine similarity. Among these indices/measures, Pearson’s correlation coefficient is one of the most widely used statistics today, because it determines both the strength and direction of the relationship between two variables. On the other hand, analysis of coincidence includes several metrics, such as absolute average error (AAE), relative difference (RD), mean squared error (MSE), root mean square error (RMSE), and bias (B). Among these, AAE is a more natural measure of average error compared with the widely used RMSE. This is because RMSE is a function of three characteristics of a set of errors, while AAE is relatively simple to calculate. While these methods indicate the closeness and similarity between two datasets and the ability of one dataset to predict another, they do not estimate actual similarity, i.e., the number of similar values (data points) in each station-pair in the datasets. Moreover, to the best of our knowledge, no approach in the literature has considered instrumental errors to quantify data closeness or similarity in finding redundant parameters/stations in weather networks. Hence, assuming that the weather stations are sufficiently dense, we set our overall goal to rationalize the existing networks with a new concept of percentage similarity (PS) and to identify the number of weather stations and their measured parameters that provide a similar set of observations. For this, performing a parametric similarity analysis for the measured parameters, including AT, RH, SR, BP, PR, and snow depth (SD), would be appropriate. In fulfilling our overall goal, the specific objectives in this study were to:
- evaluate various measures in both the analysis of association and coincidence in finding the most representative measures in establishing the relationship;
- calculate the percentage of similarity of the measurements in the datasets by considering the instrumental errors to find the similarity among the weather stations and meteorological parameters of the networks; and
- determine an optimal network of weather stations in the oil sands region based on the estimation of the percentage of similarity in each meteorological parameter.
2. Study area and data availability
2.1. Study area
The lower Athabasca River Basin (ARB) of Northern Alberta in Canada, also a part of the Athabasca oil sands area, is our study area. Three distinct networks of weather stations measure the meteorological parameters in the area (Fig 1). The networks included: (i) OSM Water Quantity Program (WQP): seven stations (i.e., C1, C2, C3, C4, C5, L1 and L2); (ii) WBEA Meteorological Towers (MT): six stations (i.e., JP104, JP107, JP201, JP213, JP311, and JP316); and (iii) WBEA Edge Sites (ES): six stations (i.e., JE306, JE308, JE312, JE316, JE323, and R2) (see Fig 1). These 19 weather stations of the networks span between longitude 109°W and 114°W, and latitude 56°N and 58°N. Note that the minimum and maximum distances between stations in each network were 11.91 and 153.56 km, 69.31 and 241.83 km, and 36.48 and 186.05 km for OSM WQP, WBEA MT, and WBEA ES, respectively. This sufficiently fulfills the WMO requirement for networks of stations (i.e., at most 300 km between stations in sparsely populated land areas) to represent the study area. The Athabasca River passes through the study area, which contains various tributaries of the Athabasca and Clearwater rivers. The landscape varies from upland Boreal forests to poorly drained wetlands within the lowland regions. The elevation of the stations in each network varies from 294 to 559 m, 256 to 626 m, and 299 to 520 m msl (mean sea level) for OSM WQP, WBEA MT, and WBEA ES, respectively. The landscape characteristics of the stations, i.e., topography, surrounding vegetation and closeness to a water body, are presented in Table 1.
A digital elevation model (source: USGS; https://earthexplorer.usgs.gov/; accessed on 15 Nov 2021) was used in the background, and the 2015 landcover classes and the major rivers (source: Government of Canada provided under Open Government Licence–Canada that allows adaptation for any lawful purpose; https://open.canada.ca/data/en/dataset/; accessed on 15 Nov 2021) over it were made semi-transparent to understand the topography.
The climatic regime of the area is sub-arctic, where the average annual AT varies from 0.7 to 1°C. It is characterized by long and cold winters, short and wet summers, and short spring and fall seasons. Spring and fall receive a substantial amount of the annual total PR, which varies from 376 to 456 mm. The wettest month in the region is July, while November through April are the driest months. Based on the climate normals for 1981–2010 at Fort McMurray, the area has an average annual RH of 40.1 to 87.5%, an average BP of 96.9 to 97.2 kPa, and an average annual SD of 0 to 30 cm. Besides, the area receives an average annual SR of 108–128 W/m2.
2.2. Meteorological parameters and data availability
Stations C1 to C5 of OSM WQP record daily AT in °C, RH in %, SR in W/m2, BP in kPa, PR in mm, and SD in cm at 2 m height, whereas L1 and L2 record daily AT, RH, and PR. On the other hand, WBEA MT stations record hourly AT, RH, SR, and PR at 2 m, where AT and RH are also measured at 16, 21, and 29 m. In the case of the WBEA ES network, all stations record hourly AT, RH, SR, and BP at 2 m. Wind speed and direction are also measured at these stations, but these variables were not analyzed within the scope of this paper. The available periods of data records for the networks are provided in Table 2.
3.1. Analysis of association
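We applied several measures including Pearson correlation coefficient (r), coefficient of determination (R2), Spearman’s correlation coefficient (Rs), Nash-Sutcliffe efficiency (E), and cosine similarity (Cosθ), as shown in Eqs 1 to 5 (given here in their standard forms).

$$r = \frac{\sum_{i=1}^{n} (D_{1,i} - \bar{D}_1)(D_{2,i} - \bar{D}_2)}{\sqrt{\sum_{i=1}^{n} (D_{1,i} - \bar{D}_1)^2}\,\sqrt{\sum_{i=1}^{n} (D_{2,i} - \bar{D}_2)^2}} \tag{1}$$

$$R^2 = 1 - \frac{RSS}{TSS} \tag{2}$$

$$R_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \tag{3}$$

$$E = 1 - \frac{\sum_{i=1}^{n} (D_{1,i} - D_{2,i})^2}{\sum_{i=1}^{n} (D_{1,i} - \bar{D}_1)^2} \tag{4}$$

$$\cos\theta = \frac{\sum_{i=1}^{n} D_{1,i} D_{2,i}}{\sqrt{\sum_{i=1}^{n} D_{1,i}^2}\,\sqrt{\sum_{i=1}^{n} D_{2,i}^2}} \tag{5}$$

where D1 and D2 refer to observational data recorded at Station A and B, respectively, with means $\bar{D}_1$ and $\bar{D}_2$; n is the number of observations; $d_i$ is the difference between the ranks of the i-th observations of D1 and D2; RSS is the residual sum of squares; and TSS is the total sum of squares.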
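The association measures named above can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard textbook definitions, treats D1 as the reference series for the Nash-Sutcliffe efficiency, and computes Spearman's coefficient as Pearson's r on average ranks (which matches the rank-difference formula when there are no ties). The function names and the sample temperatures are hypothetical and are not taken from the study.

```python
import math


def pearson_r(d1, d2):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(d1)
    mean1, mean2 = sum(d1) / n, sum(d2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(d1, d2))
    ss1 = sum((a - mean1) ** 2 for a in d1)
    ss2 = sum((b - mean2) ** 2 for b in d2)
    return cov / math.sqrt(ss1 * ss2)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(d1, d2):
    """Spearman's coefficient as Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(d1), average_ranks(d2))


def nash_sutcliffe(d1, d2):
    """Nash-Sutcliffe efficiency with d1 as the reference (assumption)."""
    mean1 = sum(d1) / len(d1)
    rss = sum((a - b) ** 2 for a, b in zip(d1, d2))
    tss = sum((a - mean1) ** 2 for a in d1)
    return 1 - rss / tss


def cosine_similarity(d1, d2):
    """Cosine of the angle between the two series seen as vectors."""
    dot = sum(a * b for a, b in zip(d1, d2))
    norm1 = math.sqrt(sum(a * a for a in d1))
    norm2 = math.sqrt(sum(b * b for b in d2))
    return dot / (norm1 * norm2)


# Hypothetical daily air temperatures (deg C) at two stations.
station_a = [1.2, 3.4, 5.1, 7.8, 6.0]
station_b = [1.0, 3.9, 4.8, 8.1, 6.4]
r = pearson_r(station_a, station_b)
print("r:", round(r, 3), "R2 of a simple linear fit:", round(r ** 2, 3))
print("Rs:", round(spearman_rs(station_a, station_b), 3))
print("E:", round(nash_sutcliffe(station_a, station_b), 3))
print("cos(theta):", round(cosine_similarity(station_a, station_b), 3))
```

For a simple linear regression of one series on the other, R2 equals the square of r, which is why the sketch does not define a separate function for it.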
3.2. Analysis of coincidence
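We adopted several measures including AAE, RD, MSE, RMSE, and B, as shown in Eqs 6 to 10 (given here in their standard forms; RD and B follow the descriptions in sub-section 4.2). Note that the meanings of all symbols in these equations are the same as described in sub-section 3.1. Analysis of association.

$$AAE = \frac{1}{n} \sum_{i=1}^{n} \left| D_{1,i} - D_{2,i} \right| \tag{6}$$

$$RD = \frac{100}{n} \sum_{i=1}^{n} \frac{\left| D_{1,i} - D_{2,i} \right|}{\max(D_{1,i}, D_{2,i})} \tag{7}$$

$$MSE = \frac{1}{n} \sum_{i=1}^{n} (D_{1,i} - D_{2,i})^2 \tag{8}$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (D_{1,i} - D_{2,i})^2} \tag{9}$$

$$B = \frac{1}{n} \sum_{i=1}^{n} (D_{1,i} - D_{2,i}) \tag{10}$$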
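A minimal Python sketch of the coincidence measures follows. It assumes the usual definitions of AAE, MSE and RMSE, takes B as the mean signed difference, and takes RD as the mean percent deviation relative to the larger value of each pair, as the paper describes it in words. Skipping pairs whose larger value is not positive is an assumption made here for illustration, and the function names are hypothetical.

```python
import math


def aae(d1, d2):
    """Absolute average error: mean of the absolute differences."""
    return sum(abs(a - b) for a, b in zip(d1, d2)) / len(d1)


def mse(d1, d2):
    """Mean squared error."""
    return sum((a - b) ** 2 for a, b in zip(d1, d2)) / len(d1)


def rmse(d1, d2):
    """Root mean square error."""
    return math.sqrt(mse(d1, d2))


def bias(d1, d2):
    """Mean signed difference; positive and negative terms can cancel."""
    return sum(a - b for a, b in zip(d1, d2)) / len(d1)


def relative_difference(d1, d2):
    """Mean percent deviation relative to the larger value of each pair.

    Assumption: pairs whose larger value is not positive are skipped,
    because the ratio is undefined or misleading for them.
    """
    terms = [
        abs(a - b) / max(a, b) * 100
        for a, b in zip(d1, d2)
        if max(a, b) > 0
    ]
    return sum(terms) / len(terms) if terms else float("nan")


# Hypothetical daily solar radiation (W/m2) at two stations.
station_a = [110.0, 95.0, 130.0, 120.0]
station_b = [100.0, 99.0, 125.0, 131.0]
print("AAE:", aae(station_a, station_b))
print("RMSE:", round(rmse(station_a, station_b), 2))
print("B:", bias(station_a, station_b))
print("RD (%):", round(relative_difference(station_a, station_b), 2))
```

Because B keeps the sign of each difference, it can be close to zero even when AAE is large, which is consistent with the weaker agreement between B and AAE reported later in the paper.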
3.3. Determination of the representative measure
It might be possible that one estimate from each of the association and coincidence measures could sufficiently describe the similarities in meteorological observations to represent the entire datasets. Identifying one such measure would reduce the ambiguity in using multiple measures. Therefore, to find such a representative measure in each group, we performed linear regressions among measures. In case of the association, where all measures were dimensionless, we performed the comparison of the estimates from all meteorological parameters together. In contrast for the measures of coincidence, we separately plotted the estimates derived for various meteorological parameters as the estimates were in different units.
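As a rough illustration of the regression step described above, the sketch below fits an ordinary least-squares line between two measures computed over the same station-pairs and reports the coefficient of determination of that fit. The helper name and the sample values are hypothetical; the paper does not state which software or fitting options were used.

```python
def linear_fit_r2(x, y):
    """Least-squares line y = intercept + slope * x and its R2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    tss = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - rss / tss


# Hypothetical AAE and RMSE values (deg C) for five station-pairs.
aae_values = [0.6, 0.9, 1.3, 1.7, 2.1]
rmse_values = [0.8, 1.2, 1.7, 2.2, 2.8]
slope, intercept, r2 = linear_fit_r2(aae_values, rmse_values)
print(round(slope, 3), round(intercept, 3), round(r2, 3))
```

A high R2 in such a fit suggests that one measure carries nearly the same information as the other, which is the rationale the section gives for keeping a single representative measure per group.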
3.4. Similarity analysis
We performed a similarity analysis on the station pairs for the variables (i.e., meteorological parameters) using acceptable values of the instrumental error suggested by the standard operating procedure (SOP; [40–43]) by applying Eq 11.

$$PS = \frac{N_2}{N_1} \times 100 \tag{11}$$

where N1 is the total data count, and N2 is the count of data pairs that satisfy the following conditions:
- the absolute difference between D1 and D2 is within ± 0.5°C for AT, ± 5% for RH, and ± 2.5 cm for SD, as suggested in the SOP; and
- the % deviation between D1 and D2 falls within 20% for hourly SR, 10% for daily SR, 2% for PR, and 1% for BP, as per the SOP. In this case, the deviation is calculated relative to the higher value between D1 and D2.
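The percentage similarity described above can be sketched as follows. This is an illustrative reading of the two conditions, not the authors' implementation: the tolerance values come from the text, while the dictionary keys, function names, and the handling of two zero readings (counted as similar) are assumptions.

```python
# Tolerances quoted in the text: absolute for AT, RH and SD,
# percent of the larger reading for SR, PR and BP.
ABSOLUTE_TOLERANCE = {"AT": 0.5, "RH": 5.0, "SD": 2.5}
PERCENT_TOLERANCE = {"SR_hourly": 20.0, "SR_daily": 10.0, "PR": 2.0, "BP": 1.0}


def is_similar(value1, value2, parameter):
    """True when a pair of readings agrees within the instrumental error."""
    diff = abs(value1 - value2)
    if parameter in ABSOLUTE_TOLERANCE:
        return diff <= ABSOLUTE_TOLERANCE[parameter]
    larger = max(value1, value2)
    if larger == 0:
        # Assumption: two zero readings (e.g. two dry days) count as similar.
        return diff == 0
    return diff / larger * 100 <= PERCENT_TOLERANCE[parameter]


def percent_similarity(d1, d2, parameter):
    """PS = N2 / N1 * 100, with N1 the pair count and N2 the similar pairs."""
    n1 = len(d1)
    n2 = sum(is_similar(a, b, parameter) for a, b in zip(d1, d2))
    return n2 / n1 * 100


# Hypothetical daily air temperatures (deg C) at two stations.
station_a = [10.0, 12.0, 15.0, 20.0]
station_b = [10.3, 12.6, 15.5, 19.0]
print(percent_similarity(station_a, station_b, "AT"))
```

How zero readings are treated matters most for PR, where dry days dominate many records, so that choice should be checked against the SOP before reusing the sketch.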
4. Results and discussion
4.1. Comparison among the measures of association
Fig 2 shows the plots of association measures that were estimated for all the meteorological parameters with reference to r. Our analyses revealed that the association between two datasets could be described by using only r, as high correlations (R2 > 0.84) were observed between r and the other measures (i.e., R2, Rs, Cosθ, E) in most cases. The R2 indicates the proportion of variance in one variable explained by another. In general, values of R2 higher than 0.50 were considered significant and acceptable [44,45], and values higher than 0.70 were considered strong [46,47]. In the case of E, we considered only the positive estimates, as negative estimates would indicate that the observed mean would be a better predictor in this context. Consequently, we opted to use r as the representative measure of association for the entire datasets.
4.2. Comparison among measures of coincidence
For the comparison, we plotted AAE against other relevant measures (i.e., RMSE, MSE, RD and B) for various meteorological parameters (see Figs 3 and 4). In general, the AAE estimates were well correlated, with strong R2 values (> 0.96), with all other estimates except for RD and B. This would be the case because RD provided an estimate of percent deviation from the highest record between the two stations for a parameter of interest, while AAE, RMSE and MSE provided the actual differences. On the other hand, B presented the summation of both negative and positive differences between two data records; thus, it would potentially give different values in comparison to AAE, RMSE and MSE. Taking these into consideration, we opted to employ AAE as the representative measure of coincidence for the entire datasets.
Measures of coincidence related relationships among (a) AAE, RMSE and MSE, and (b) AAE, RD and B for AT in °C; (c) AAE, RMSE and MSE, and (d) AAE, RD and B for RH in %; (e) AAE, RMSE and MSE, and (f) AAE, RD and B for SR in W/m2 for the entire dataset. Data outliers are circled in panels (e) and (f).
Measures of coincidence related relationships among (a) AAE, RMSE and MSE, and (b) AAE, RD and B for BP in kPa; (c) AAE, RMSE and MSE, and (d) AAE, RD and B for PR in mm; and (e) AAE, RMSE and MSE, and (f) AAE, RD and B for SD in cm for the entire dataset. Data outliers are circled in panel (a).
4.3. Relations and similarity analysis
We determined the relations and similarity for each station-pair for the associated parameters according to the estimated values of r, AAE, and PS. Here, we required a PS of at least 75% to consider a station-pair similar. Estimated values of r, AAE, and PS for each station-pair of the three networks are shown in Tables 3–6, and the associated regression equations are shown in Supporting Information. A detailed relations and similarity analysis for identifying the fewest required stations in each network for each meteorological parameter is presented in the following sub-sections. Note that the analysis was performed on each network separately (instead of all three networks combined), and we considered this a limitation of the study. We did so because we wanted to retain the different goals and purposes for which the agencies established the three distinct networks.
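One way to turn pairwise PS values into a short list of representative stations is a greedy selection, sketched below. This is only an illustrative heuristic under stated assumptions: the paper does not describe its selection as an algorithm, and it also weighs factors such as record length and which parameters a station records. The function name, the tie-breaking rule (first station in the list wins), and the PS values are hypothetical.

```python
def select_stations(stations, ps, threshold=75.0):
    """Greedily pick stations until every station is either selected or
    similar (PS >= threshold) to a selected one.

    ps maps unordered station pairs, given as tuples, to PS in percent.
    """

    def similar(a, b):
        value = ps.get((a, b), ps.get((b, a)))
        return value is not None and value >= threshold

    uncovered = set(stations)
    selected = []
    while uncovered:
        # Choose the station that represents the most uncovered stations.
        best = max(
            stations,
            key=lambda s: sum(1 for t in uncovered if t == s or similar(s, t)),
        )
        selected.append(best)
        uncovered -= {t for t in uncovered if t == best or similar(best, t)}
    return selected


# Hypothetical PS values (%) for four stations; not taken from the tables.
stations = ["A", "B", "C", "D"]
ps = {
    ("A", "B"): 80.0, ("A", "C"): 78.0, ("A", "D"): 40.0,
    ("B", "C"): 60.0, ("B", "D"): 30.0, ("C", "D"): 50.0,
}
print(select_stations(stations, ps))
```

With these made-up values, station A represents B and C, while D has no similar partner and must be kept, so two stations remain.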
4.3.1. OSM WQP network.
Table 3 shows the relations and similarity analysis on 21 station-pairs for the six associated meteorological parameters (i.e., AT, RH, SR, BP, PR, and SD) in this network. For AT, we found that all the station-pairs were strongly related (i.e., r ≥ 0.99 and AAE between 0.59 and 2.09°C), while the PS values were in the range of 33.99 to 84.60%. Considering the PS values, we found that at least five stations (i.e., C1, C2, C4, C5, and L1) would be required for representing this network. The other two stations (i.e., C3 and L2) were similar to station C1, likely due to their spatial closeness (see Fig 1) and similar altitude (303 to 331 m msl).
For RH, we observed that all the station-pairs were highly related, with r ≥ 0.83 and AAE values within the acceptable range of the SOP (i.e., less than 10%; see Table 3). We also noticed that the PS values of the station-pairs were greater than 80%, except for C4 vs C5 and C5 vs L1. Considering the PS values, we identified that only one station (i.e., C1, with the longest data records) would be required for representing this network. Such high similarity was observed possibly because the stations were very close to waterbodies. Note that station L2 gave the highest PS value; however, this station did not record all meteorological parameters.
In the case of SR, we observed that all the station-pairs were very strongly related, with r values ≥ 0.92 and low AAE values ≤ 29.35 W/m2 (see Table 3). However, the PS values were less than or equal to 42.29%; therefore, we considered that there was no acceptable similarity in SR among the stations, and at least five stations (i.e., C1, C2, C3, C4, and C5) would be required in the network. Note that two stations (i.e., L1 and L2) did not record SR measurements. Such dissimilarity would probably be attributed to several factors, including altitude, terrain, air quality, cloud cover, and vegetation, that affect the amount of SR received at any place.
In the case of BP, we found that four stations (i.e., C2, C3, C4, and C5) were recording the data. These station-pairs showed strong relationships, with r values ≥ 0.81 and AAE values in the range of 0.48 to 3.17 kPa (see Table 3). Considering the PS values, we identified that at least three stations (i.e., C2, C3, and C5) would be required for representing this network (see Table 3). The other station (i.e., C4) was similar to station C3 (i.e., PS = 98.41%), likely due to their similar altitudes (i.e., 295 and 305 m msl).
Further, in the case of PR, we noticed that the station-pairs had reasonable relations, as reflected in r (between 0.46 and 0.83) and AAE values (between 0.56 and 1.47 mm) (see Table 3). However, PS values were extremely low for all station-pairs (i.e., between 2.91 and 9.79%); such high dissimilarity in the network was due to the variable nature of PR, which is evident even at small spatial scales. This also suggested that all seven stations (i.e., C1, C2, C3, C4, C5, L1, and L2) would be required for representing PR in this network.
Finally, in the case of SD, station C1 showed very strong relations (i.e., r ≥ 0.92) and the least error (i.e., AAE ≤ 3.87 cm) among the four station-pairs (see Table 3). Besides, we found that the PS values of three station-pairs with C1 were strong (i.e., > 76%; see Table 3); the exception was station C5, with a PS value of 70.10%. These measures indicated that C1 would be representative of SD in the network for three stations, i.e., C2, C3, and C4. Such strong similarity would be due to the small altitude differences among the station locations, given that snow accumulation on the ground depends on latitude and time of year in addition to altitude, vegetation and wind. Moreover, the dissimilarity of C5 (altitude 559 m msl) with C1 (altitude 303 m msl) would be due to the larger elevation difference between these locations. Therefore, both C1 and C5 would be the minimum required stations for SD in the network.
4.3.2. WBEA MT network.
Tables 4 and 5 show the relations and similarity analysis of six stations (i.e., JP104, JP107, JP201, JP213, JP311, and JP316) for the AT, RH, SR, and PR parameters, where AT and RH were measured at different heights (i.e., 2, 16, 21, and 29 m; see Table 4). For AT in this network, we found that all station-pairs were highly related, with r values from 0.72 to 0.99 and AAE values ranging from 1.46 to 4.5°C (see Table 4). However, the PS values were low (i.e., between 20.30 and 46.52%) for all station-pairs. Considering the PS values, all stations were required in the network for AT measurements. Such low similarity in the network would be due to the widely spaced distribution of the stations (see Fig 1) and the significant altitude differences among these stations (i.e., 256 to 626 m msl).
For RH, all station-pairs showed reasonably strong relations, with r values from 0.73 to 0.90 and AAE values in the range of 5.92 to 11.06% (see Table 4). Considering the PS values, we identified that at least two stations (i.e., JP104 and JP201) would be required for representing RH in this network. The other stations (i.e., JP107, JP213, JP311, and JP316) were similar to station JP104 (i.e., PS values in the range of 75.81 to 82.08%). Note that strong similarity for RH was observed in this network because RH values would be similar over a region of interest.
Next, in the case of SR, we found that all station-pairs in the network were strongly related, with r values in the range of 0.85 to 0.93 and AAE values ranging from 35.98 to 63.39 W/m2, except the station-pair JP104 vs JP201 with r = 0.18 and AAE = 249.04 W/m2 (see Table 5). However, we noticed that PS values were low (i.e., ≤ 45.83%) for all station-pairs; therefore, all stations (i.e., JP104, JP107, JP201, JP213, JP311, and JP316) were required for the measurements of SR in the network. Such dissimilarity in SR was attributed to factors including varying topography (i.e., large differences in station altitude, from 256 m to 626 m msl) and cloud cover, which might have affected the amount of SR received.
Lastly, for PR, we found that four stations (i.e., JP107, JP213, JP311, and JP316) were recording the data. Here, all station-pairs had weak relations, with r values ≤ 0.31 and AAE values in the range of 0.55 to 0.66 mm (see Table 5). In addition, we observed that the PS values were extremely low (i.e., ≤ 2.73%); therefore, all stations were required for the measurements of PR in the network. Such dissimilarity was likely due to the variable nature of PR, which is observed even over short distances.
4.3.3. WBEA ES network.
We present the relations and similarity analysis for the six stations (i.e., JE306, JE308, JE312, JE316, JE323, and R2) of the WBEA ES network in Table 6. For AT measurements, we found that all station-pairs had very strong relations, with r values ranging from 0.97 to 0.99 and acceptable AAE values in the range of 1.53 to 2.66°C (see Table 6). However, the PS values were low, i.e., in the range of 26.41 to 49.72%. Considering the PS values, all six stations were required in the network. Such dissimilarity was observed probably due to the widely spaced distribution of the stations (see Fig 1) and significant altitude differences, which ranged from 299 m to 520 m msl.
In the case of RH, we identified that all station-pairs had very strong relations, with r values ranging from 0.82 to 0.92 and AAE in the range of 5.77 to 8.66%, except one station-pair (JE312 vs R2) that had a reasonable relation (i.e., r = 0.65 and AAE = 11.67%; see Table 6). Here, the PS values were in the range of 61.42 to 84.26%. Considering the PS values, we identified that stations JE306, JE308 and JE323 were required for the measurements of RH in the network. Overall, the similarity of the stations in this network was likely related to their similar landcover.
While comparing station-pairs for the measurements of SR, we observed that all showed strong relations, with r values ranging from 0.76 to 0.90 and AAE values in the range of 53.04 to 92.06 W/m2 (see Table 6). However, in all cases, we noticed that PS values were considerably low (i.e., ≤ 42.54%). Considering the PS values, we identified that all stations were necessary for the measurements of SR in the network. Such low similarity would be due to several factors, including varying topography and cloud cover, that could affect the amount of SR received at a place.
In the case of BP, we noticed that the relations were weak to moderate (i.e., r values ranging from 0.11 to 0.55 and AAE values in the range of 0.12 to 2.56 kPa) for all station-pairs in the network (see Table 6). Here, the PS values covered a wide range, i.e., 0 to 99.99%. Considering the PS values, we found that two stations (i.e., JE306 and JE312) were required for measuring BP in the network. Here, strong similarities were likely due to the stations' locations in the valley region with similar altitudes.
4.4. Suggested optimization
We synthesized the network-specific required weather stations for each of the meteorological parameters (see Table 7) based on the similarity analysis detailed in Sections 3.4 and 4.3. We observed that all the stations were required for some of the parameters, i.e., (i) PR for OSM WQP; (ii) AT, SR, and PR for WBEA MT; and (iii) AT and SR for WBEA ES networks. Although each network might be optimized for some parameters, none could be optimized for all; thus, all the existing stations would need to be kept. Given this, we would not recommend removing the parameter-specific sensors, even those that showed redundancy, because they could serve as backups if similar sensors at other stations fail.
In this study, we demonstrated that the PS analysis could quantify similarity between weather stations. It determines the fewest stations needed to fully represent the spatial variability in climate measurements required in a network of interest. Moreover, such similarity analysis is a better measure than relational measures like r and AAE for quantifying the similarity between two meteorological datasets. It also helps to identify the best representative station for a particular parameter in a network that could represent the area. In most instances, we noticed that the station-pairs were related. However, in terms of similarity, we identified that measurements from five, three, two, and one station/s would be the minimum required for AT, BP, SD, and RH, respectively, in the OSM WQP network. For the same network, we also found that five and seven stations were the minimum required for the measurements of SR and PR, respectively. For the WBEA MT network, we identified that all six stations were required for the measurements of AT and SR, and all four PR-recording stations for PR, whereas two would be sufficient for RH. Moreover, in the case of the WBEA ES network, we found that measurements from all stations were required for the AT and SR parameters. However, only three stations would be adequate for RH, and two for BP, in this network. Note that we could not perform similarity analysis on a few station-pairs, because some stations did not record some of the meteorological parameters of interest. Nevertheless, we showed that similarity analysis using the PS value had potential applications to rationalize/optimize weather station networks (stations and parameters) in the study area. It would help to minimize the associated operational costs without sacrificing the scientific credibility of the monitoring programs.
However, we recommend evaluating these methods thoroughly before applying them to other weather networks in Canada and elsewhere during any decision-making process. Further, beyond meteorological studies, PS could be a tool for finding the similarity between two datasets of the same parameter in other fields of research.
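To illustrate how the relational measures and the station selection discussed above might be computed, we include a minimal Python sketch. It is not the procedure used in this study. The functions `pearson_r` and `absolute_average_error` follow the standard definitions of r and AAE. The function `select_representatives` is a hypothetical greedy heuristic that assumes a symmetric matrix of pairwise PS values is already available and treats two stations as similar when their PS is at least 75%; the station names and PS values in the usage example are invented for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient (r) between two paired series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def absolute_average_error(x, y):
    """Absolute average error (AAE): mean of |x_i - y_i| over paired values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def select_representatives(stations, ps, threshold=75.0):
    """Greedy sketch (an assumption, not the authors' method).

    Picks stations until every station is either chosen or has
    PS >= threshold with a chosen station. `ps` is a square,
    symmetric matrix of pairwise PS values in percent.
    A greedy pass does not guarantee the smallest possible set.
    """
    n = len(stations)
    # covers[i]: indices of stations that station i can represent (itself included)
    covers = [
        {j for j in range(n) if i == j or ps[i][j] >= threshold}
        for i in range(n)
    ]
    uncovered = set(range(n))
    chosen = []
    while uncovered:
        # choose the station representing the most still-uncovered stations;
        # ties are broken in favour of the earlier station in the list
        best = max(range(n), key=lambda i: (len(covers[i] & uncovered), -i))
        chosen.append(stations[best])
        uncovered -= covers[best]
    return chosen


# Hypothetical example: four stations with invented PS values (%)
stations = ["A", "B", "C", "D"]
ps = [
    [100, 80, 60, 50],
    [80, 100, 78, 40],
    [60, 78, 100, 55],
    [50, 40, 55, 100],
]
representatives = select_representatives(stations, ps)
```

In this invented example, station B is similar to both A and C at the 75% threshold, whereas D is similar to none, so the sketch retains B and D. Where an exact minimum is needed, the greedy result should be checked against an exhaustive search over the station subsets.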
S1 Table. Regression equations in relation to similarity analysis of all meteorological parameters of interest for OSM WQP stations.
Here, ‘-’ indicates measurements were not available.
S2 Table. Regression equations in relation to similarity analysis of AT and RH at different heights for WBEA MT stations.
Here, ‘-’ indicates measurements were not available.
The authors would like to acknowledge Alberta Environment and Parks (AEP) and the Wood Buffalo Environmental Association (WBEA) for providing data free of charge.
- 1. Schmiedeberg C, Schröder J. Does Weather Really Influence the Measurement of Life Satisfaction? Soc Indic Res. 2014;117: 387–399.
- 2. Wiston M , KM M. Weather Forecasting: From the Early Weather Wizards to Modern-day Weather Predictions. J Climatol Weather Forecast. 2018;06. pmid:31516406
- 3. World Meteorological Organization. Guide to Meteorological Instruments and Methods of Observation. Geneva, Switzerland; 2010.
- 4. Li L, Zha Y. Mapping relative humidity, average and extreme temperature in hot summer over China. Sci Total Environ. 2018;615: 875–881. pmid:29017129
- 5. Dewan A, Kiselev G, Botje D. Diurnal and seasonal trends and associated determinants of surface urban heat islands in large Bangladesh cities. Appl Geogr. 2021;135: 102533.
- 6. Dewan A, Kiselev G, Botje D, Mahmud GI, Bhuian MH, Hassan QK. Surface urban heat island intensity in five major cities of Bangladesh: Patterns, drivers and trends. Sustain Cities Soc. 2021;71: 102926.
- 7. Hassan QK, Rahman KM. Applicability of remote sensing-based surface temperature regimes in determining deciduous phenology over boreal forest. J Plant Ecol. 2013;6: 84–91.
- 8. Wang L, Kisi O, Zounemat-Kermani M, Salazar GA, Zhu Z, Gong W. Solar radiation prediction using different techniques: Model evaluation and comparison. Renew Sustain Energy Rev. 2016;61: 384–397.
- 9. Shawky M, Moussa A, Hassan QK, El-Sheimy N. Performance assessment of sub-daily and daily precipitation estimates derived from GPM and GSMaP products over an arid environment. Remote Sens. 2019;11.
- 10. Abdullah AYM, Bhuian AH, Kiselev G, Dewan A, Hassan QK, Rafiuddin M. Extreme temperature and rainfall events in Bangladesh: A comparison between coastal and inland areas. Int J Climatol. 2020.
- 11. Belvederesi C, Dominic JA, Hassan QK, Gupta A, Achari G. Short-term river flow forecasting framework and its application in cold climatic regions. Water (Switzerland). 2020;12: 1–18.
- 12. Hubbard KG, Hollinger SE. Standard Meteorological Measurements. In: Hatfield J.L. Baker J.M., editor. Micrometeorology in Agricultural Systems, Volume 47. The American Society of Agronomy, Inc. Crop Science Society of America, Inc. Soil Science Society of America, Inc.; 2015. pp. 1–30. https://doi.org/10.2134/agronmonogr47.c1
- 13. Colston JM, Ahmed T, Mahopo C, Kang G, Kosek M, de Sousa Junior F, et al. Evaluating meteorological data from weather stations, and from satellites and global models for a multi-site epidemiological study. Environ Res. 2018;165: 91–109. pmid:29684739
- 14. World Meteorological Organization. Guidelines on Climate Observation Networks and Systems. Geneva, Switzerland; 2003.
- 15. World Meteorological Organization. Manual on the Global Observing System. WMO. Geneva, Switzerland; 2017.
- 16. World Meteorological Organization. Manual on the Global Data-processing and Forecasting System. Geneva, Switzerland; 2012.
- 17. Anderson J, Ash G, Wright H. A Statistical Comparison of Weather Stations in Carberry, Manitoba Canada. 92nd American Meteorological Society Annual Meeting (January 22–26, 2012). American Meteorological Society; 2012. pp. 1–25.
- 18. Mishra AK, Coulibaly P. Developments in hydrometric network design: A review. Rev Geophys. 2009;47: 1–24.
- 19. Alberta Envirvonment and Parks. Oil Sands Monitoring Program: Annual Report for 2017–2018. Environment and Climate Change Canada, Government of Alberta, Edmonton, Alberta; 2018.
- 20. Janis MJ, Hubbard KG, Redmond KT. Station density strategy for monitoring long-term climatic change in the contiguous United States. J Clim. 2004;17: 151–162.
- 21. Vose RS, Menne MJ. A method to determine station density requirements for climate observing networks. J Clim. 2004;17: 2961–2971.
- 22. Burn DH, Goulter IC. An approach to the rationalization of streamflow data collection networks. J Hydrol. 1991;122: 71–91. (91)90173-F.
- 23. Ouarda T, Rasmussen P, Bob´ee B, Morin J. Ontario Hydrometric Network Rationalization, Statistical Considerations, Research Report No. R-470,. Quebec, Canada; 1996.
- 24. Hubbard KG. Spatial variability of daily weather variables in the high plains of the USA. Agric For Meteorol. 1994;68: 29–41.
- 25. Easterling DR, Karl T., Mason EH, Hughes PY, Bowman DP. United States Historical Climatology Network (U.S. HCN) Monthly Temperature and Precipitation Data. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory ORNL/CDIAC-87 NDP-019/R3. 5285 Port Royal Rd., Springfield, VA 22161; 1996.
- 26. DeGaetano AT. Spatial grouping of United States climate stations using a hybrid clustering approach. Int J Climatol. 2001;21: 791–807.
- 27. St-Hilaire A, Ouarda TBMJ, Lachance M, Bobée B, Gaudet J, Gignac C. Assessment of the impact of meteorological network density on the estimation of basin precipitation and runoff: A case study. Hydrol Process. 2003;17: 3561–3580.
- 28. Amorim AMT, Gonçalves AB, Nunes LM, Sousa AJ. Optimizing the location of weather monitoring stations using estimation uncertainty. Int J Climatol. 2012;32: 941–952.
- 29. Mishra AK. Effect of rain gauge density over the accuracy of rainfall: A case study over Bangalore, India. Springerplus. 2013;2: 1–7. pmid:23419944
- 30. Guttorp P, Sampson P, Newman K. Nonparametric estimation of spatial covariance with application to monitoring network evaluation. In: Walden A, Guttorp P, editors. Environmental and Earth Sciences. London: Edward Arnold; 1992. pp. 39–51.
- 31. Tsintikidis D, Georgakakos KP, Sperfslage JA, Smith DE, Carpenter TM. Precipitation Uncertainty and Raingauge Network Design within Folsom Lake Watershed. J Hydrol Eng. 2002;7: 175–184.
- 32. Berndt C, Haberlandt U. Spatial interpolation of climate variables in Northern Germany—Influence of temporal resolution and network density. J Hydrol Reg Stud. 2018;15: 184–202.
- 33. Smith J, Smith P. Environmental modelling. An introduction. Oxford, UK: Oxford University Press; 2007. https://doi.org/10.1017/S0014479708006893
- 34. World Meteorological Organization. Guide to Climatological Practices. Geneva, Switzerland: WMO; 2018.
- 35. Ratner B. The correlation coefficient: Its values range between 1/1, or do they. J Targeting, Meas Anal Mark. 2009;17: 139–142.
- 36. Willmott CJ, Matsuura K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim Res. 2005;30: 79–82.
- 37. Alberta Environment and Water. Groundwater Flow Model for the Athabasca Oil Sands, North of Fort MacMurray: Phase 1 Conceptual and Numerical Model Development. Burnaby, BC.,Canada; 2012.
- 38. Government of Canada. Canadian Climate Normals. In: 1981–2010 Climate Normals & Averages. [Internet]. 2021 [cited 29 Apr 2021]. Available: https://climate.weather.gc.ca/climate_normals/index_e.html.
- 39. Wood Buffalo Environmental Association. WBEA 2000 Annual Report. Fort McMurray, Alberta; 2000.
- 40. Apogee Instruments . Affordable and accurate barometric pressure sensor. 2020 [cited 21 Apr 2021]. Available: https://www.apogeeinstruments.com/barometric-pressure/.
- 41. Government of Alberta. Standards and Quality Program. In: Alberta’s Environmental Monitoring and Science Program [Internet]. 2020 [cited 15 Dec 2020]. Available: http://environmentalmonitoring.alberta.ca/resources/standards-and-protocols/.
- 42. Hinckley A. Pyranometers: What You Need to Know. 2020 [cited 21 Apr 2021]. Available: https://www.campbellsci.com/blog/pyranometers-need-to-know.
- 43. World Meteorological Organization. Guide to Instruments and Methods of Observation. Geneva, Switzerland; 2018.
- 44. Mooi E, Sarstedt M. A Concise Guide to Market Research: The Process, Data, and Methods Using IBM SPSS Statistic. Third Edition. the process, data, and methods using IBM SPSS statistics. 2019.
- 45. Golmohammadi G, Prasher S, Madani A, Rudra R. Evaluating three hydrological distributed watershed models: MIKE-SHE, APEX, SWAT. Hydrology. 2014;1: 20–39.
- 46. Veiga VB, Hassan QK, He J. Development of flow forecasting models in the bow river at Calgary, Alberta, Canada. Water (Switzerland). 2015;7: 99–115.
- 47. Anonymous. Coefficient of Determination. 2021 [cited 21 Apr 2021]. Available: https://www.creativesafetysupply.com/glossary/coefficient-of-determination/.
- 48. Olabanji MF, Ndarana T, Davis N, Archer E. Climate change impact on water availability in the olifants catchment (South Africa) with potential adaptation strategies. Phys Chem Earth. 2020;120: 102939.
- 49. Šimundić AM. Bias in research. Biochem Medica. 2013;23: 12–15. pmid:23457761
- 50. Montgomery K. Variation in Temperature With Altitude and Latitude. J Geog. 2006;105: 133–135.
- 51. Consultants Hatfield. Regional Aquatics Monitoring in Support of the Joint Oil Sands Monitoring Plan: Final 2015 Program Report. Edmonton; 2016.
- 52. Rathod APS, Mittal P, Kumar B. Analysis of factors affecting the solar radiation received by any region. 2016 International Conference on Emerging Trends in Communication Technologies (ETCT). Dehradun, India: IEEE; 2016. pp. 1–4. https://doi.org/10.1109/ETCT.2016.7882980
- 53. WEST JB. Prediction of barometric pressures at high altitudes with the use of model atmospheres. J Appl Physiol. 1996;81: 850–1854. pmid:8904608
- 54. National Snow & Ice Data Center. Snow and Climate. In: Snow and Climate [Internet]. 2020 [cited 21 Apr 2021]. Available: https://nsidc.org/cryosphere/snow/climate.html.
- 55. Wood Buffalo Environmental Association. WBEA 2019 Annual Report. Fort McMurray, AB, Canada; 2020.
- 56. Ramesh K, Anitha R, Ramalakshmi P. Prediction of lead seven day minimum and maximum surface air temperature using neural network and genetic programming. Sains Malaysiana. 2015;44: 1389–1396.