Abstract
Imagery from fixed, ground-based cameras is rich in qualitative and quantitative information that can improve stream discharge monitoring. For instance, time-lapse imagery may be valuable for filling data gaps when sensors fail and/or during lapses in funding for monitoring programs. In this study, we used a large image archive (>40,000 images from 2012 to 2019) from a fixed, ground-based camera that is part of a documentary watershed imaging project (https://plattebasintimelapse.com/). Scalar image features were extracted from daylight images taken at one-hour intervals. The image features were fused with United States Geological Survey stage and discharge data as response variables from the site. Predictions of stage and discharge for simulated year-long data gaps (2015, 2016, and 2017 water years) were generated from Multi-layer Perceptron, Random Forest Regression, and Support Vector Regression models. A Kalman filter was applied to the predictions to remove noise. Error metrics were calculated, including Nash-Sutcliffe Efficiency (NSE) and an alternative threshold-based performance metric that accounted for seasonal runoff. NSE for the year-long gap predictions ranged from 0.63 to 0.90 for discharge and 0.47 to 0.90 for stage, with greater errors in 2016 when stream discharge during the gap period greatly exceeded discharge during the training periods. Importantly, and in contrast to gap-filling methods that do not use imagery, the high discharge conditions in 2016 could be visually (qualitatively) verified from the image data. Half-year test sets were created for 2016 to include higher discharges in the training sets, thus improving model performance. While additional machine learning algorithms and tuning parameters for selected models should be tested further, this study demonstrates the potential value of ground-based time-lapse images for filling large gaps in hydrologic time series data. 
Cameras dedicated for hydrologic sensing, including nighttime imagery, could further improve results.
Citation: Chapman KW, Gilmore TE, Mehrubeoglu M, Chapman CD, Mittelstet AR, Stranzl JE (2024) Stage and discharge prediction from documentary time-lapse imagery. PLOS Water 3(4): e0000106. https://doi.org/10.1371/journal.pwat.0000106
Editor: Bimlesh Kumar, Indian Institute of Technology Guwahati, INDIA
Received: September 27, 2022; Accepted: March 1, 2024; Published: April 16, 2024
Copyright: © 2024 Chapman et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All image feature datasets and models are available via https://doi.org/10.32873/unl.dr.20210322. Platte Basin Timelapse imagery is available to University of Nebraska personnel and collaborators for research and teaching.
Funding: The authors are grateful to the Platte Basin Timelapse project for providing the imagery used in this research. This research was supported by the U.S. Department of Agriculture—National Institute of Food and Agriculture NEB-21-177 (Hatch Project 1015698 to TG). Additional student support was provided by the University of Nebraska Research Council through a Grant-in-Aid grant funded through a gift from the John C. and Nettie V. David Memorial Trust Fund (to TG). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
Accurate measurement and modeling of stream stage and discharge are important for daily water management, flood forecasting and management, ensuring compliance with water use agreements, and for the design of reservoirs, water supply systems, bridges, and culverts [1]. Continuous time series data from stream gages (gauges) are also critical for calibrating and/or validating groundwater and surface water models. Gaps in stream stage and discharge records increase uncertainty in both the modeling and management of water resources. Our study explores whether documentary time-lapse imagery provides enough information to fill data gaps in important hydrological records.
Stream stage is typically measured with floats, pressure, optical, and acoustic sensors [2] (we will refer to these as hydrological sensors). Stream discharge is typically derived from stream stage using rating curves or equations defined for different hydraulic structures (e.g., weir equations). Gaps in stream stage and discharge records may occur due to improper hydrological sensor installation (e.g., especially during short-term studies, when site characteristics are not well-known), equipment failure, and/or gaps in funding for monitoring programs. Fixed ground-based cameras have the potential to provide redundancy for hydrological sensors, providing both quantitative and qualitative information on hydrologic conditions. However, the use of image features from ground-based time-lapse imagery to fill data gaps has not been extensively tested, standardized or adopted in practice.
Extraction of discharge and stage data from imagery—visible light photos or videos—is becoming increasingly common. Direct measurement of stage from still imagery, without machine learning, has demonstrated accuracy both in the laboratory (±3 mm) [3, 4] and in field applications [5]. Discharge measurements may also be made with large-scale particle image velocimetry [6–9], space-time image velocimetry [10], optical tracking velocimetry [11], and other techniques [12] such as streamflow rank estimation [13, 14]. These studies commonly rely on dedicated camera systems (installed for the purpose of stream monitoring, e.g., U.S. Geological Survey Hydrologic Imagery Visualization and Information System [15]), whereas the imagery used in our study is from an unrelated water documentary project.
Several studies have investigated methods to fill gaps for hydrological and other environmental data [16–20]. Machine learning has also been used to predict stream stage and discharge based on (1) upstream gaging station data from other sites [17, 21–23], (2) other hydrologic data such as precipitation and tide level [21–23], and (3) improved methods for developing rating curves [24]; however, these studies did not combine programmatic labeling of images with stream stage and discharge measurements from existing pressure transducers, image alignment, calculated image features, machine learning, and Kalman filtering (reducing noise, see [25, 26]) to improve prediction accuracy. Furthermore, the hourly data time intervals (during daylight hours) used in this study have higher temporal resolution than the 6-hour to monthly time intervals used in other studies [17, 21–23, 27, 28]. Lastly, previous studies did not allow qualitative assessment of site conditions when data gaps have occurred, whereas time-lapse imagery used here provides critical information for qualitatively (visually) evaluating model performance.
In this paper we focus on the benefits of a documentary time-lapse camera, co-located with a stream gage station, to demonstrate filling six-month to one-year gaps in stage and discharge records. Key objectives of the study are to address (1) whether scalar features extracted from time-lapse imagery collected for a documentary project provide enough information to fill data gaps using machine learning models, and (2) whether the gap-filling (i.e., reconstructions) of stream stage and discharge are, based on performance metrics, of sufficient quality for calibration and/or validation of other hydrologic models [29] used for water and flood management. The study highlights some benefits of passive monitoring at stream gage stations, including a visual record of the scene for 2-D analysis as opposed to only point measurements from the stage sensor, and the ability to visually assess discharge conditions during periods when traditional sensor data is unavailable. We note that this paper is a first step in a larger research project, with subsequent research underway to address in detail the tuning of model parameters and the relative value of the scalar image features used.
2 Study site and data description
Multiple data sets, consisting of stream stage measurements and images, were considered for this study. Options included the GaugeCam project [3], Platte Basin Timelapse [30] Mick’s Slide and North Platte River State Line Weir camera sites, and other sources. Images and measurements from the North Platte River State Line Weir site were selected due to the high image resolution (4288x2848 RGB), large number of images (57,544), and proximity (about 20 m) to a USGS stream gage station [31] for comparison. The high-resolution daytime images represent a river scene that includes flow over a weir (Fig 2). The water surface appearance varies with the discharge rate. The images were captured for a documentary project, without installation of specific reference points that would aid image-based detection of stream stage. Over the study period, the camera moved non-periodically in time and randomly in translation and rotation. The extents of movement were roughly ±4° in rotation and ±0.25 m in translation, thus shifting the scene and skyline in the images. Image alignment resolved these issues for the purposes of this study, but camera movement and the lack of dedicated calibration targets made high-precision, direct image measurements of stage [32] unlikely for this image dataset.
2.1 Study site
The studied USGS stream gage (06674500) is located at the Wyoming-Nebraska state line [31] southeast of Henry, Nebraska (Latitude: 41.9885755; Longitude: -104.0532823). The North Platte River drainage area upstream of the Wyoming-Nebraska State line gage is approximately 58,000 km2 (USGS, 2020). Discharge at the gage is strongly affected by snowmelt in the headwaters, originating in the Rocky Mountains. Other factors that influence discharge include diversion for irrigation and transbasin water transfer, groundwater withdrawals, return flows, and reservoirs. Mean discharge at this site for water years 1929 through 2017 was 22.4 m3/s. For water years 2015, 2016, and 2017, mean discharge was 27.4, 44.9, and 29.0 m3/s, respectively. From 1997 to 2022, 568 (6.2%) of the daily streamflow measurements were estimated due to ice, equipment failure, or other issues, underscoring the need for additional monitoring methods.
2.2 Data description
The USGS was the source for ground-truth stream stage sensor measurements and discharge calculations [33]. Data were available at 15-min intervals for the period 09-June-2012 to 11-Oct-2019. Daylight images, taken hourly, were provided by the Platte Basin Timelapse project for the same period. Exposure, shutter speed, capture times, and other data were extracted from each image’s Exchangeable Image File (EXIF) metadata, written into the images at the time of capture.
3 Methods
Image-based machine learning models were trained and tested to estimate stream stage and discharge. Although our machine learning models are not runoff models or distributed watershed models (they are based on on-site imagery and trained with same-site gage data), we evaluated performance primarily on the error metrics that are ubiquitous in hydrological modeling studies and that allow comparison with other studies. We also evaluated discharge model performance against simple upper and lower benchmarks determined from on-site gage data. The selection of these benchmarks is discussed further in Sections 3.3.3 and 4.6.
The higher-level process employed in developing and testing programmatically calculable scalar image features was divided into two tasks, as follows:
- 1. Task 1: Image feature development
- a. Select a data set that includes images and accompanying stream stage sensor and discharge measurements representative of both low flow and high flow conditions
- b. Identify and extract features for each image representing stage and discharge. This step includes image alignment and other data cleaning processes.
- c. Create machine learning models with features from a training set of images well suited for identification of effective image features.
- d. Measure prediction performance of the models for an independent test set of images to determine whether features diminished mean squared error.
- e. Return to step (b) to create new features and repeat until the prediction performance is acceptable.
- 2. Task 2: Use case modeling
- a. Use the image features to create models to predict measurements for three year-long data gaps.
- b. Calculate and report the performance ratings for the year-long gaps for 2015, 2016, and 2017.
- i. Review imagery and adjust methodology where models demonstrate poor performance
It should be noted that image processing and machine learning are at the core of this study and are discussed in detail throughout. However, the focus of this study is not the inner workings of these tools but their application to determine the accuracy of stream stage and discharge predictions. In-depth study of model selection and tuning is the topic of ongoing research, as described in the Discussion and Conclusions sections.
In designing our study, we also carefully considered trade-offs between the development of a scalar image feature set and the application of other common approaches such as convolutional neural networks or similar algorithms. Although our research group has access to high performance computing, we opted for an approach with relatively light computational requirements so that our study can be easily replicated. The entire study was completed on a laptop computer with an i7 processor, 16 GB of RAM, and no GPU. The calculation of scalar features for about 40,000 high-resolution images took approximately 12 hours. This approach now serves as the basis for an application [34] that facilitates meaningful machine learning approaches using only computational resources that are readily available to most students and scientists. Further discussion of trade-offs and ongoing research appears in the Discussion and Conclusions sections.
3.1 Training and test sets
The data were separated into distinct training and test sets for each task. Task 1 focused on image feature development and the quality of the image features. Image feature development was an iterative process using a large dataset to determine which scalar image features may be useful in models. We used an approach advocated by [35], which identifies this as the “Dev (development) set—which you use to tune parameters, select features, and make other decisions regarding the learning algorithm.” Ng also notes that “Before the modern era of big data, it was a common rule in machine learning to use a random 70%/30% split to form your training and test sets. This practice can work, but it’s a bad idea in more and more applications where the training distribution … is different from the distribution you ultimately care about …” Thus, for Task 1, 13,879 images (30% of the 40,000+ daytime images from 2012 to 2019) were randomly selected as a training set. The remaining 28,181 images (70%) were used as a test set.
Task 2 focused on the actual use cases. Use cases involved training and testing data to simulate data gap-filling for different time periods. The training and test sets for Task 2 were created for the use cases (Table 1). Use cases were created to determine whether we could fill gaps in the data with training data taken from before and after the gap to be filled. We selected test sets that simulated one-year gaps for the USGS years 2015, 2016, and 2017. Those years were selected to limit the scope of the research and still provide a viable test of the models to perform predictions. We then created training sets consisting of sequential leading and trailing data, e.g., 5,000 images before and after the gap to be predicted. It should be noted that additional training sets with different amounts of leading and trailing data (500, 1,000, 2,500, and 8,500) were tested. The test and training data sets and resulting models and predictions are available from [36], but the 5,000 before/5,000 after training sets were found to yield satisfactory results.
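The construction of these leading/trailing training sets can be sketched as follows. This is a minimal illustration with hypothetical names, not the study's code; `records` stands in for the time-ordered table of image features and sensor measurements.

```python
def gap_training_set(records, gap_start, gap_end, n_lead=5000, n_trail=5000):
    """Build a gap-filling training set from n_lead records before a
    simulated gap and n_trail records after it; the records inside the
    gap become the test set. `records` must be sorted by timestamp
    (first tuple element)."""
    before = [r for r in records if r[0] < gap_start]
    after = [r for r in records if r[0] > gap_end]
    gap = [r for r in records if gap_start <= r[0] <= gap_end]
    train = before[-n_lead:] + after[:n_trail]  # leading + trailing data
    return train, gap
```

With the study's 5,000-before/5,000-after configuration, `n_lead` and `n_trail` would both be 5000 and the gap would span one water year.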
Each line in the table represents three training sets made up of leading and trailing data.
3.2 Feature development
The development of effective image features was a collaborative effort between imaging engineers and hydrologists. “Hand-crafted” features [37, 38] were explicitly chosen to take advantage of hydrologist expert knowledge. These features provide a baseline against which follow-on studies using machine learning-selected features can be compared. The identified features successfully predicted stream stage and discharge for the image feature development test set. Additionally, the features were optimized through an iterative process involving refinement of the programmatically calculated features using the workflow described below.
3.2.1 Workflow.
Fig 1 is a flowchart that shows the image feature development workflow. The individual elements of the workflow are described in more detail in S1 Text, but it is helpful to understand the context of each element within the workflow.
3.2.2 Image features.
Machine learning models based on images often require laborious manual annotation or labeling of images. The large number of images in the data set and limited resources made it impracticable to perform such manual annotation; therefore, methods to calculate scalar image features were implemented in C++ and a workflow to develop, annotate, refine, and prune features was created (Fig 1). The features were developed with the help of the OpenCV [39] computer vision library. Two types of scalar features were extracted: whole-image features and domain-informed features. Whole-image features are generic features, such as average pixel intensity and image entropy, that could be extracted from any image. Domain-informed features are those a hydrologist could identify as specific to hydrology, such as the shape and texture of the whitewater and the color of the water above and below the weir. These scalar values were stored as tables in Comma Separated Value (CSV) files. Each table row holds the image capture timestamp, camera settings (shutter speed, f-stop, ISO speed, etc.), sensor measurement timestamp (the measurement closest to the capture time; the image is not used if the offset exceeds 15 minutes), sensor measurements for stage and discharge, and scalar values for each of the calculated image features. These features are available at [36].
Whole-image features represent generic statistics about the image as a whole, irrespective of the domain. These include:
- Intensity mean and sigma—average and standard deviation of the pixel intensities
- Entropy mean and sigma—average and standard deviation of the Shannon entropy and Hartley function calculated from all the pixels within a specified radius of each pixel position
- Hue, Saturation, Value (HSV) mean and sigma—average and standard deviation of each of the HSV planes, which define the color of each pixel by hue, saturation, and value
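As an illustration, whole-image statistics of this kind might be computed as follows. This is a simplified sketch (our code, not the paper's C++/OpenCV implementation): it computes a single global Shannon entropy from the intensity histogram rather than the per-neighborhood entropy described above.

```python
import numpy as np

def whole_image_features(gray):
    """Simplified whole-image scalar features for a 2-D uint8 grayscale
    image: intensity mean/sigma and global Shannon entropy (bits)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is defined
    return {
        "intensity_mean": float(gray.mean()),
        "intensity_sigma": float(gray.std()),
        "entropy": float(-(p * np.log2(p)).sum()),
    }
```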
It was observed that the shape and texture of the turbulent water (whitewater) below the weir varied with stream stage and discharge (e.g., [9]). Additionally, color, pixel intensity, and texture of the water in specific areas above and below the weir also varied with changes in stage and discharge. First, methods to find each weir location in the image were developed. Then, algorithms were created to measure the features. Fig 2A illustrates a typical image from the data set while Fig 2B shows an image annotated with the results from a weir search. Unlike other studies that used edge detection tools to track the water moving up and down a bridge support or riverbank, there were no convenient reference objects to determine stage in the image scene at the North Platte River State Line Weir.
A: Typical Platte River Basin image at North Platte River State Line Weir site. B: Same image with annotations to show the weir search region (green), bank location (yellow), and weir found location (red). (Images provided by [30] under a CC BY license, with permission from Michael Farrell, original copyright 2012–2019.).
The first set of domain informed features (Fig 3) consists of scalar measures of the shape, color and texture of the turbulence below the weir. This process included the following steps: (1) find the line of the weir, (2) segment the whitewater, and (3) calculate measures based on the segmented whitewater. Whitewater measures include color and texture within the whitewater area (Shannon entropy, color (HSV) statistics, and intensity statistics) and shape characteristics (area, perimeter length, and distances of whitewater contour points from the weir (minimum, maximum, mean, and standard deviation of lengths).
ROI = Region of Interest. (Images provided by [30] under a CC BY license, with permission from Michael Farrell, original copyright 2012–2019.).
The second set of domain informed features are based on the observation that color, intensity, and texture of the water in specific areas above and below the weir varied with changes in stage and discharge. The Regions of Interest (ROIs) from which these features were calculated are shown in yellow for the area defined above the weir and blue for below the weir (Fig 3). The objective was to measure the difference in appearance between the upstream and downstream water surfaces relative to stage and discharge.
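A minimal sketch of such upstream/downstream comparison features follows. The ROI coordinates, function name, and returned feature names are hypothetical, chosen only to illustrate the idea of differencing simple statistics between the two water-surface regions.

```python
import numpy as np

def roi_contrast_features(img, roi_above, roi_below):
    """Compare the appearance of water above vs. below the weir.
    Each ROI is a (row0, row1, col0, col1) rectangle in image
    coordinates; img is a 2-D intensity array."""
    def stats(roi):
        r0, r1, c0, c1 = roi
        patch = img[r0:r1, c0:c1]
        return patch.mean(), patch.std()

    m_up, s_up = stats(roi_above)
    m_dn, s_dn = stats(roi_below)
    # downstream-minus-upstream differences as scalar features
    return {"mean_diff": float(m_dn - m_up), "sigma_diff": float(s_dn - s_up)}
```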
3.2.3 Feature calculation challenges.
A typical image is shown in Fig 2A, and all effective features can be extracted from images of this type. However, not every feature could be calculated for every image, and some images had to be eliminated because they varied significantly from this typical appearance. Over time, the position of the weir in the image drifted in translation and rotation as the camera moved. This was accommodated by a weir position search algorithm: edge detection was performed within a specified ROI and a line fit was calculated from qualifying points. A fixed ROI that traverses the weir is divided into 10 segments, roughly orthogonal to the weir. For each segment, the position of the maximum of the summed pixel intensities, traveling from upstream to downstream, is calculated. The zero crossing of the second derivative around each maximum refines the position to sub-pixel precision. A random sample consensus (RANSAC) line fit is then performed on the found segment points to provide a line equation.
If the angle of the fit line was within a specified minimum and maximum, the weir search was considered successful. There were 55,804 total images in the data set. Of those, features were calculated for the 42,059 images that had sufficient light (cf. Fig 4E) and no occlusions such as frost on the lens (Fig 4B). Of the 42,059 processable images, the weir was found in 36,134, for which both weir and whole-image features could be calculated. The remaining 5,925 images had debris, ice, or snow that prevented finding the weir, so only whole-image features were calculated.
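The RANSAC line fit at the heart of the weir search can be illustrated with a minimal implementation (ours, not the paper's C++ code). It tolerates outlier segment points such as those caused by debris or ice on the weir crest; the iteration count and inlier tolerance below are illustrative.

```python
import numpy as np

def ransac_line(points, n_iter=200, tol=1.0, seed=0):
    """Fit y = m*x + b to candidate weir-edge points while ignoring
    outliers. Repeatedly samples two points, counts inliers within
    `tol` of the implied line, then refits on the best inlier set."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    best_inliers = None
    for _ in range(n_iter):
        i, j = rng.choice(len(pts), size=2, replace=False)
        (x1, y1), (x2, y2) = pts[i], pts[j]
        if x1 == x2:
            continue  # vertical candidate; skip for this slope form
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        resid = np.abs(pts[:, 1] - (m * pts[:, 0] + b))
        inliers = pts[resid < tol]
        if best_inliers is None or len(inliers) > len(best_inliers):
            best_inliers = inliers
    # least-squares refinement on the consensus set
    m, b = np.polyfit(best_inliers[:, 0], best_inliers[:, 1], 1)
    return m, b
```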
Examples of Platte River Basin images where weir position is not found: A) Snow, ice, and debris, B) ice or frost on the lens, C) elevated stage, D) debris occludes parts of the weir, E) dawn/dusk image that is too dark, and F) a typical “good” image. (Images provided by [30] under a CC BY license, with permission from Michael Farrell, original copyright 2012–2019.).
At times, the weir could not be found because it was occluded in the images by elevated stage (Fig 4C), debris (Fig 4D), or ice and snow (Fig 4A). This challenge was accommodated by setting the below-weir shape and texture features to an artificially low “not calculable” value. These types of images were included in all the training sets, test sets, and predictions; the remaining features, combined with the knowledge that the weir position was not findable, still allowed the creation of satisfactory models.
The camera system collected images during daylight hours, but some images were still too dark to extract effective features. These dawn/dusk images were eliminated from evaluation if the whole-image mean pixel intensity fell below a threshold (set to 45), indicating the image was too dark. Fig 4E shows a typical dark image. Images with ice on the lens, such as Fig 4B, present a similar problem. These images were eliminated if whole-image contrast was below a minimum threshold. Contrast was assessed by running a Canny filter on the image with settings of 35 and 70 for the first and second hysteresis thresholds and a Sobel aperture of 3. If the count of edge points after the Canny filter was less than 1.5 percent of the pixels in the image, the image was eliminated.
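The screening logic can be sketched as follows. For portability this illustration substitutes a simple gradient-magnitude edge count for the Canny filter described above (the gradient threshold here is illustrative, not one of the paper's parameters); the mean-intensity and edge-fraction thresholds follow the values stated in the text.

```python
import numpy as np

def usable_image(gray, min_mean=45, min_edge_frac=0.015, grad_thresh=70):
    """Reject dawn/dusk images (mean intensity below min_mean) and
    low-contrast images such as frost on the lens (too few strong
    edges). Gradient magnitude stands in for a Canny edge detector."""
    if gray.mean() < min_mean:
        return False  # too dark to extract features
    gy, gx = np.gradient(gray.astype(float))
    edge_frac = (np.hypot(gx, gy) > grad_thresh).mean()
    return bool(edge_frac >= min_edge_frac)
```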
In keeping with the research goal of finding whether there was enough information in the images to build a model of sufficient precision to be interesting, the features were kept as simple as possible. Tables of the features we used to build the models with their CSV file header names and brief descriptions are shown in S2 Text.
3.3 Models, predictions, noise filtering, and error metrics
3.3.1 Models.
Three types of models were created for each training set. The models were selected based on three criteria: 1) Suitability to predict scalar stage and discharge measurements from a scalar feature set, 2) availability in standard machine learning tools such as Weka [40], SciKit Learn [41], and R [42], and 3) use of the tools in similar types of machine learning tasks [17, 22–24, 27, 28, 43, 44].
The models chosen were
- Multi-layer Perceptron (MLP) [45, 46]
- Random Forest Regression (RFR) [45, 47]
- Support Vector Regression (SVR) [45, 48]
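The study built these models in Weka with default parameters; a comparable setup in scikit-learn (our illustration, not the study's configuration) might look like the following. Feature scaling is included for the MLP and SVR, which are sensitive to feature magnitudes.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# One regressor per model family used in the study; X would hold the
# scalar image features and y the stage or discharge measurements.
models = {
    "MLP": make_pipeline(StandardScaler(),
                         MLPRegressor(max_iter=2000, random_state=0)),
    "RFR": RandomForestRegressor(random_state=0),
    "SVR": make_pipeline(StandardScaler(), SVR()),
}
```

Each entry supports the usual `fit(X_train, y_train)` / `predict(X_test)` interface, so the same training and gap-filling loop can be applied to all three.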
While it is possible to identify strengths and weaknesses of the different models, [35, 44] emphasize the iterative nature of machine learning algorithm development. In that spirit, the error metrics shown in Section 3.3.3 (equations in S3 Text) were calculated and minimized through an iterative process to select features and identify a model that best met the objective of the study: creation of predictive models to accurately fill data gaps.
The inputs to the models, predictions, and the models themselves are available for download from [36]. These include training and test feature and sensor data in CSV file format, prediction files that hold observed vs. calculated tabular data, Kalman filter results, error metrics in Open Document Spreadsheet format, and the models themselves in Weka model format. The Weka models are the binary files with the file extension “model” that hold trained SVR, RFR, and MLP machine learning models used to make predictions. The models can be opened in Weka 3.8 or above to view the default model parameters used in the study. It is possible to run the models on the provided test data. The model creation and prediction processes are well described in “The Weka Workbench” [40].
3.3.2 Predictions and noise filtering.
After the models were created, the test case features were transferred to each of the models to produce predictions, which were then filtered for noise. The true stream stage and discharge of a stream is typically highly correlated over short time intervals. This correlation can be used to reduce prediction error from model variance by applying Kalman filtering [49] on a time series of stream stage and discharge predictions. The use of Kalman filters is not uncommon in hydrology [25, 50]. For an introduction and practical perspective on Kalman filters, see [51]. For a simplified example, assume stage and discharge models produce predictions with noise that is additive, Gaussian with zero mean and a fixed variance, and uncorrelated across predictions. Further assume that stage is a Gaussian process that evolves slowly over time. This makes the joint distribution between recent predictions produced by the model and the true stage well-defined. Kalman filtering is a procedure whereby this joint distribution is maintained, and filtered estimates of the true stage are produced as the expectation of stage, conditioned on observing several of the model’s noisy predictions using this modeled prior distribution.
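For the simplified scalar case described above, the Kalman filter reduces to a few lines. The process and measurement variances below are illustrative, not the study's tuning; with these assumptions each filtered estimate is a variance-weighted blend of the previous estimate and the newest model prediction.

```python
def kalman_smooth(z, q=0.01, r=1.0):
    """Scalar Kalman filter for a slowly varying stage or discharge
    signal. z: sequence of noisy model predictions; q: process
    variance (how fast the true signal drifts between predictions);
    r: measurement variance (model prediction noise)."""
    x, p = z[0], 1.0              # initial state estimate and variance
    out = []
    for zi in z:
        p = p + q                 # predict: state persists, uncertainty grows
        k = p / (p + r)           # Kalman gain
        x = x + k * (zi - x)      # update toward the new prediction
        p = (1 - k) * p
        out.append(x)
    return out
```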
3.3.3 Error metrics.
Ground truth is taken to be the stage or discharge values in the USGS data tables for the North Platte River State Line Weir. The predictions are the stage or discharge predicted by the various models and filtered for noise with the Kalman filter. The error metrics are commonly used metrics within hydrology described in [25, 52]. The equations for the error metrics are shown in S3 Text. These include Mean Absolute Error (MAE), Mean Square Error (MSE), Root Mean Square Error (RMSE), Normalized RMSE (NRMSE), Percent Bias (PBIAS), Nash-Sutcliffe Efficiency (NSE), and RMSE-observations Standard Deviation Ratio (RSR). For benchmark-based performance evaluation we took (1) actual USGS measured daily stream discharge as the upper benchmark, and (2) the median of all measurements for each day of the year from 2009 through 2014 as the lower benchmark. The upper benchmark is idealistic and the lower benchmark simplistic. This approach does introduce longer-term seasonal runoff patterns into the evaluation of model performance (as opposed to comparison to the mean observed discharge for the modeled time period, i.e., NSE). We recognize that there is no widespread agreement on exact benchmarks to use, and the approach is not common practice in hydrological modeling [53], but we provide these metrics for future comparison with single-site studies.
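For reference, three of these metrics can be computed as follows (a plain-Python sketch of the standard definitions; see S3 Text for the full set of equations used in the study). Note the PBIAS sign convention common in hydrology: positive values indicate average underprediction, negative values overprediction.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus the ratio of residual
    variance to the variance of observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - num / den

def pbias(obs, sim):
    """Percent bias: positive when the model underpredicts on average."""
    return 100 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

def rsr(obs, sim):
    """RMSE-observations standard deviation ratio."""
    mean_obs = sum(obs) / len(obs)
    rmse = math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))
    sd = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / len(obs))
    return rmse / sd
```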
4 Results
The results are divided into six sections. The first three, (1) image feature development, (2) use cases, and (3) 2016 half-year gap use cases, describe the actual model predictions and their performance ratings. The fourth section is a comparison of our results with those from other stage and discharge machine learning model studies. The fifth section compares the performances of the different machine learning models. The sixth section shows the performance metrics of the MLP discharge model relative to appropriate upper and lower benchmarks.
4.1 Image feature development
Models built with the features identified in the image feature development scenario were investigated. Performance of RFR exceeded that of MLP and SVR on all error metrics (Table 2). The observed vs. predicted graphs for the MLP (Fig 5) show that the predictions closely tracked the observations. The maximum absolute error (86.8 m3/s) occurred when high discharge was underestimated by the MLP model, but the MLP model tended to overpredict low flows (Fig 5). The low-flow errors were relatively small, but cumulatively led to a small, negative PBIAS (-4.71%). That the predictions tended toward the mean could be an artifact of the smoothing inherent in the Kalman filtering.
The test set consisted of the remaining 70% of the images in the data set. Values are shown only for measurements where usable daytime images were available (nighttime data are excluded).
Error metric equations are in S3 Text.
We concluded from the image feature development process that the identified scalar image features held adequate information to build machine learning models to predict stage and discharge. We anticipated that they would likely yield good performance ratings for the use cases as well.
4.2 Use cases
We created and tested models for the 2015, 2016, and 2017 gap years. Each of the models performed well for 2015 and 2017, but not as well for 2016. The NSE for 2015 and 2017 ranged from 0.83 to 0.90. NSE for the 2016 predictions, however, ranged only from 0.45 to 0.82. Performance metrics relative to upper and lower benchmarks are described and shown in Section 4.6. Table 3 shows all the error metrics for the year-long gaps. S3 Text shows the error metric equations.
Error metric equations are in S3 Text.
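For readers without access to S3 Text, NSE and PBIAS can be computed as follows (the PBIAS sign convention follows Moriasi et al. [52], under which overprediction yields a negative value, consistent with the -4.71% reported above):

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the
    observed mean, and negative values are worse than the mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def pbias(obs, sim):
    """Percent bias, 100 * sum(obs - sim) / sum(obs): negative values
    indicate overprediction on average."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 100.0 * np.sum(obs - sim) / np.sum(obs)

obs = np.array([10.0, 20.0, 30.0, 40.0])
nse_perfect = nse(obs, obs)        # exactly 1.0
bias = pbias(obs, obs * 1.1)       # ~ -10: uniform 10% overprediction
```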
The poorer performance of the 2016 model can also be seen in the observed vs. predicted graphs for the MLP model for each year (Fig 6). These results were not due to the feature set, because the models for the image feature development scenario accurately predicted both stage and discharge for 2016 (Fig 5). The predicted curves track the observed values closely for 2015 and 2017, but at times of elevated discharge in 2016, the predicted curve substantially underpredicts the observed discharge. In addition, the highest stage and discharge for 2016 (≈240 m3/s) were much higher than the peak levels for 2015 (≈175 m3/s) and 2017 (≈125 m3/s).
Values are shown only for measurements where usable images were available (i.e., nighttime data are excluded).
4.3 2016 half-year gap use cases
Additional MLP models were built to understand the poor performance of the 2016 full-year gap models. The observed vs. predicted graphs (Fig 6) illustrated that the 2016 predictions were not satisfactory for stage or discharge at times of elevated discharge. The stage and discharge in the 2015 and 2017 training sets (used to train the models to predict the 2016 gap year) were much lower than the highest stage and discharge in the 2016 data. We suspected that the lack of high discharge in the training data was the source of poor performance in the 2016 gap year model. To test this, we trained half-year gap models for the first and second halves of 2016 with surrounding data that included the higher stage and discharge measurements from that same year. NSE improved to a range from 0.87 to 0.95, and PBIAS was reduced substantially compared to the 2016 full-year gap (<18% versus 32%, Table 4). The predicted curve also tracks the observed curve more closely (Fig 7).
Values are shown only for measurements where usable daytime images were available (nighttime data are excluded).
Error metric equations are in S3 Text.
4.4 Model comparisons
For this study, we used machine learning models (SVR, RFR, and MLP) readily available within common machine learning tools such as Weka, scikit-learn, and R. These models produced satisfactory results. In general, the MLP (Fig 8; blue bars) demonstrated the highest performance, with the RFR model (orange bars) a close second. The SVR models performed reasonably well, but not as well as MLP and RFR. As expected, there is a noticeable drop in performance for most of the metrics for the 2016 gap year due to the training and test set discrepancies for that year.
We conclude that the models, particularly MLP and RFR, performed well and can be used for stage and discharge predictions. There remains an opportunity for research into model tuning and optimization that could improve the reported models' performance, particularly for challenging scenes representing extreme events. We expect that the SVR could be improved through adjustment of the various parameters within Weka that control model building, such as the complexity parameter, the choice of kernel, and the tolerance used for the stopping criterion. Deeper exploration of model tuning is the subject of an additional study.
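A sketch of the kind of tuning described above, using scikit-learn's SVR rather than Weka's (`C`, `kernel`, and `tol` are scikit-learn's analogs of the complexity, kernel, and tolerance parameters; the data and grid values are illustrative):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)
# Synthetic stand-in features and a mildly nonlinear response
X = rng.normal(size=(200, 5))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(scale=0.1, size=200)

# Grid over the parameters mentioned above: complexity (C),
# kernel choice, and stopping tolerance (tol).
grid = GridSearchCV(
    make_pipeline(StandardScaler(), SVR()),
    param_grid={
        "svr__C": [1.0, 10.0, 100.0],
        "svr__kernel": ["rbf", "poly"],
        "svr__tol": [1e-3, 1e-4],
    },
    cv=3,
    scoring="r2",
)
grid.fit(X, y)
best = grid.best_params_  # cross-validated best parameter combination
```

Cross-validation within the training period, as sketched here, avoids tuning against the simulated gap itself.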
4.5 Comparative studies
Several studies have used machine learning models to predict stream stage and discharge. The reported statistical metrics were RMSE and NSE for all but one study. As can be seen in Table 5, each study reported good results in terms of both RMSE and NSE for the domains within which they were working and the goals they set. Three major differences set our study apart from the others:
First, our models included 42 features, while the other studies used one to four variables. The large image feature set produced strong models, but we recognize that the optimal feature set for gap-filling models and/or a generalized modeling approach across other study sites may contain an intermediate number of features (between the number used in our study and the number used in previous studies). Image features are programmatically calculable and therefore easy to merge with ground truth data. They are also easily merged with non-image variables like those used in the comparison studies. Additional variables (e.g., precipitation, gage data from upstream or downstream) can potentially increase the accuracy of image-based models.
Second, the time interval between feature measurements was hourly for this study, while the other studies ranged from six hours to one month. We focused on hourly data in part because images were collected hourly, but also to reconstruct (predict) the maximum amount of gage data possible. The shorter hourly interval therefore yielded finer-grained, time-correlated predictions than the other studies, preserving more information. A drawback of the image features in this study, however, is the lack of nighttime imagery and the occasional artifacts that prevented feature extraction.
Finally, this study and two of the compared studies [27, 28] included only on-site data, while the other studies used a combination of on-site and off-site data. An example of using off-site data can be found in [17], where "The idea is to model flow at one gage as a function of flow at another gage or gages." There are trade-offs between these approaches. In some cases, off-site data such as precipitation elsewhere in the watershed can add information to models, but those data may not be well correlated with stream stage or discharge at the study site. For instance, spatial variability in precipitation could in some cases add noise to the model inputs. Similarly, groundwater and tributary inflows and managed reservoir releases may confound relationships between upstream discharge data and the given study site. In other cases, however, upstream and downstream discharge can be strongly, if not linearly, correlated. These issues could be explored further by incorporating image features from one or more sites and/or other off-site data in future studies.
We conclude from this comparative analysis that we were able to build accurate models from images alone, with a shorter time interval between predictions, more features than were available in previous studies, and only on-site data sources. We also believe the models in this study could benefit from the variables identified in the comparative studies, just as those studies could benefit from image features where imagery is available.
4.6 Model performance based on daily data
Having established the feasibility of using daytime image features from high-quality images to predict available hourly discharge values over year-long data gaps, we further evaluated model performance on a daily time-step, which is common for hydrologic model calibration (Fig 9). Model performance was evaluated with the same error metrics as the hourly models. We also relied on simple benchmarks (Fig 9) to evaluate model performance relative to longer-term seasonal runoff patterns at the North Platte River State Line Weir. Collectively, this analysis also tested whether hourly daytime-only imagery is representative of daily discharge.
The upper benchmarks are the blue lines, which show the daily means of the USGS streamflow measurements, with light blue shaded areas showing the 5% and 10% error bands discussed in [1]. The lower benchmarks are the green lines, which show the six-year daily median (2009–2014) of USGS streamflow prior to the gap being predicted. The prediction curves created from the MLP machine learning models are the orange lines.
Based on daily average predictions, MLP model performance (Table 6) was similar to that of the hourly predictions (Table 3). PBIAS was greater for daily values but remained below 20% in the worst case (2016 gap year, -19.4%).
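Aggregating hourly daytime-only predictions to a daily time-step might look like the following sketch. The hour range (07:00–18:00) and the minimum-count threshold are assumptions for illustration, not values from the study.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly predictions; nighttime hours are absent, as in
# this study's daytime-only imagery.
idx = pd.date_range("2016-06-01 07:00", "2016-06-03 18:00", freq="h")
idx = idx[(idx.hour >= 7) & (idx.hour <= 18)]
pred = pd.Series(np.linspace(80.0, 95.0, len(idx)), index=idx)

# Daily means from daytime-only hours; require a minimum count so a day
# with only a few usable images stays a gap (NaN) rather than a biased mean.
counts = pred.resample("D").count()
daily = pred.resample("D").mean().where(counts >= 6)
```

Comparing `daily` against published daily mean discharge is one way to check whether daytime-only sampling is representative of the full day.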
Establishing a lower benchmark required a method to fill year-long gaps of daily discharge measurements for each prediction year. Existing methods capable of filling gaps of that length depend on off-site data and/or are better suited to much shorter gaps [54–57]. We therefore used the six-year daily median of USGS-measured stream discharge from 2009 to 2014 as the lower benchmark, which reflects long-term seasonal runoff characteristics for the site. We recognize that the North Platte River is a highly managed system, in which upstream reservoirs regulate stream discharge; however, that regulation is driven in part by seasonal demands of downstream users.
For the upper benchmark we used the daily mean discharge for 2016–2017 [31], the same data used to train the machine learning models. It should be noted that uncertainty in discharge measurements commonly ranges from 5% to 10%, as suggested in Tables 6 and 7 of [1] and illustrated in Fig 9. There are discontinuities in both the USGS daily mean discharge and the prediction curves. These discontinuities and the selected benchmarks are addressed further in Section 5.
To incorporate the longer-term seasonal patterns of discharge into model evaluation, results in Table 6 were used in Equation 1 of [53]
R_scaled = (R_x − R_lower) / (R_upper − R_lower)  (1)

where R_scaled is the benchmark-scaled performance measure, R_x is the prediction performance measure, R_lower is the lower benchmark performance measure, and R_upper is the upper benchmark performance measure.
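Equation 1 is straightforward to apply to any of the error metrics in Table 6; the NSE values below are hypothetical:

```python
def benchmark_scaled(r_x, r_lower, r_upper):
    """Scale a performance measure between the lower and upper benchmarks:
    0 means no better than the lower benchmark, 1 means matching the
    upper benchmark."""
    return (r_x - r_lower) / (r_upper - r_lower)

# Hypothetical NSE values: prediction 0.80, lower benchmark (seasonal
# daily median) 0.55, upper benchmark (gaged data) 0.98.
scaled = benchmark_scaled(0.80, 0.55, 0.98)
```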
The benchmark-based performance metrics (Table 7) show that the MLP models for all three years performed better than the lower benchmark, except for PBIAS in the 2016 gap year. They also show that the 2015 and 2017 models performed better than the 2016 model, as expected.
5 Discussion
This study focused on the feasibility of an image-based approach to time series gap-filling of stage and discharge using on-site data. Based on an imagery dataset that is large (over 40,000 images) relative to other image-based hydrological studies (e.g., [10, 58–61]), we extracted scalar image features to train and test machine learning models that fill simulated months- to year-long data gaps in stage and discharge records. The approach relied on substantial image processing and the expertise to extract specific (hand-crafted) image features from aligned, high-resolution imagery. Some of the extracted image features (e.g., turbulent area features downstream of a weir) are specific to the chosen research site and may not be transferable to other sites. An advantage of the approach was its low computational requirement, as evidenced by the use of an average laptop computer, which is available to most researchers. Alternatively, deep learning approaches [62, 63] and high-performance computing resources could offset the expertise required to build hand-crafted image feature datasets.
Discontinuities in the model predictions depicted in Fig 9 resulted from images that could not be processed due to ice and debris in the river and due to fog (Fig 4). Additional research could address ways to ameliorate the problem of unprocessable images; the existence of poor-quality images is a drawback of any image-based hydrological modeling approach. On the other hand, the availability of imagery for gap filling allows quick review of image scenes and confirmation of site conditions, which is not possible with purely statistical gap-filling methods. For instance, imagery for the 2016 gap year clearly shows high-flow conditions, highlighting both the qualitative and quantitative value of image-based approaches. For sites without upstream and/or downstream stream gages with well-correlated streamflow (a condition simulated by the design of this study), imagery can provide invaluable information to support data gap-filling.
A challenge associated with the available imagery in this study was the lack of nighttime imagery, which eliminated the possibility of nighttime stage and discharge predictions. Comparison of hourly and daily time-step predictions with actual stage and discharge records suggests the lack of nighttime imagery was not problematic in this study. For flashier hydrologic systems, where major changes in stream discharge can occur overnight, it could be. Because we have successfully measured stage with cameras using infrared light-emitting diodes for nighttime images in previous studies [3, 32], we are optimistic about using nighttime imagery where an illumination source is available. However, separate models may need to be developed for daytime (color imagery) and nighttime conditions (grayscale infrared images or other artifacts of the available illumination).
In addition to standard calculations of model error (e.g., RMSE, NRMSE, or NSE), we calculated error against simple upper and lower benchmarks based only on on-site data. These benchmarks were selected based on the goal of this project: to determine whether documentary time-lapse imagery (i.e., captured for reasons other than image processing for hydrological studies) could be used to reconstruct stream stage and discharge records with accuracy suitable for calibration and/or validation of other hydrological models. In part because benchmarks are not as standardized as, e.g., RMSE, there are many potential ways to select them. For example, benchmarks could be established from other gap-filling methods that rely on upstream and downstream gage data (e.g., [64, 65]), though we opted not to use this approach because our study location was not hydrologically similar to the stations upstream and downstream due to the complex system of canals. Suitable benchmarks could also be explored by altering characteristics of the images used to develop predictive models, for example by introducing reasonable random variation into those inputs: Gaussian noise, random resizing of the images, and random spatial variation.
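Two of the input-variation strategies mentioned above (Gaussian noise and random resizing) can be sketched with NumPy alone; the noise level and scale range are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
# Stand-in grayscale frame in place of an actual camera image
image = rng.integers(0, 256, size=(120, 160), dtype=np.uint8)

def add_gaussian_noise(img, sigma=5.0, rng=rng):
    """Add zero-mean Gaussian noise (sigma in intensity units)."""
    noisy = img.astype(float) + rng.normal(scale=sigma, size=img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def random_rescale(img, low=0.9, high=1.1, rng=rng):
    """Nearest-neighbour resize by a random factor (no OpenCV dependency)."""
    f = rng.uniform(low, high)
    n_rows = int(round(img.shape[0] * f))
    n_cols = int(round(img.shape[1] * f))
    rows = np.clip((np.arange(n_rows) / f).astype(int), 0, img.shape[0] - 1)
    cols = np.clip((np.arange(n_cols) / f).astype(int), 0, img.shape[1] - 1)
    return img[np.ix_(rows, cols)]

noisy = add_gaussian_noise(image)
rescaled = random_rescale(image)
```

Feeding such perturbed inputs back through the feature extraction and models would indicate how sensitive a benchmark is to camera and scene variation.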
For benchmarks, we used the USGS sensor data as the upper benchmark when evaluating model results on a daily time-step. These were the same data used as "ground truth" to build the machine learning models, but there is measurement error associated with them, likely in the 5% to 10% range [1]. For the lower benchmark, we used the daily median discharge from 2009 to 2014. By using median daily values from the years immediately preceding the gap years for which the models were built, the lower benchmark reflected long-term seasonal patterns of discharge. In the North Platte River watershed, numerous reservoirs regulate flow, but seasonal patterns are also driven by the demands of downstream users (including environmental flows) and the regulatory requirements to provide for those demands. While simple, these benchmarks could be easily replicated in comparative studies where simulated data gaps are used to evaluate new gap-filling methods.
The studied river scene included a weir that induced turbulence on its downstream side. We were surprised to learn that the non-weir features were a strong positive contributor to model quality, possibly on par in some cases with the turbulent water (whitewater) feature. Even more surprising was that the non-weir features improved predictions even when the weir features were not available. We believe that adding features based on well-known image processing algorithms, such as edge direction and magnitude in and around the water line, region-based Fourier response, region-based moments and gradients of convolutions, and statistical change detection, along with a variety of other well-understood image features available in OpenCV [39] and textbooks such as [66, 67], could improve model quality. These and other feature extraction algorithms need not be applied only to 2D images; they are equally applicable to 3D (from stereoscopy, laser triangulation, LIDAR, etc.), hyperspectral, ultrasonic, and other images. We believe that when image features are combined with non-image features such as precipitation, cloud cover, temperature, humidity, chemical signature, irrigation, and other measures of local activity, model performance could improve substantially. All of these items are opportunities for future research, which could build on the image feature datasets and model files available in the DOI [36].
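As an example of the edge-based scalar features suggested above, the sketch below computes edge magnitude and direction statistics for a region of interest using NumPy gradients (in practice OpenCV's Sobel operators would likely be used); the frame and region are synthetic stand-ins:

```python
import numpy as np

def edge_features(img, region):
    """Scalar edge-magnitude/direction features for a region of interest,
    a rough stand-in for Sobel-based features (cv2.Sobel in OpenCV)."""
    r0, r1, c0, c1 = region
    patch = img[r0:r1, c0:c1].astype(float)
    gy, gx = np.gradient(patch)        # intensity gradients (rows, cols)
    mag = np.hypot(gx, gy)             # edge magnitude
    ang = np.arctan2(gy, gx)           # edge direction (radians)
    return {
        "edge_mean_mag": float(mag.mean()),
        "edge_max_mag": float(mag.max()),
        "edge_dir_mean": float(ang.mean()),
        "edge_dir_std": float(ang.std()),
    }

rng = np.random.default_rng(7)
frame = rng.integers(0, 256, size=(480, 640)).astype(np.uint8)
# Hypothetical region of interest around the waterline
feats = edge_features(frame, region=(200, 260, 100, 300))
```

Each returned value is a single scalar, so features like these can be appended directly to the per-image feature table used for model training.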
6 Conclusion
This study was carefully limited to the question of whether documentary time-lapse imagery from fixed, ground-based cameras could yield stream stage and discharge predictions of sufficient quality for the calibration and/or validation of other hydrologic models used for water and flood management. Next steps include evaluation of the image features most useful for stage and discharge predictions [68] and of the transferability of hand-crafted features to other sites. Future research could include testing the efficacy of image-based machine learning models at different sites (both hand-crafted and deep learning approaches), improvements and standardization in the selection of lower and upper performance benchmarks, refinement of machine learning model parameters, and computational improvements to the speed and quality of image feature calculation and machine learning model building. Further research could also address similar approaches for real-time data validation and/or qualitative assessment of site conditions, including the presence of ice or other flow obstructions.
In addition to identifying features that can predict stage and discharge accurately in straightforward scenarios for the selected study site, it is important to ensure that the training set used to build a model spans the range of values represented in the test set. Otherwise, uncertainty increases when the model must predict stage and discharge values outside the range of the training data. Although algorithms can be used to ensure that the maximum and minimum observations are included in the training dataset [69], when there is an actual gap in discharge records (unlike the simulated gaps in [69] and other studies), there is no guarantee that stage or discharge has not exceeded the bounds of existing observations. In this sense, image-based gap filling has an advantage over purely numerical gap-filling methods: images from the test set (the gap) can be compared to images from the training set to judge whether the training set is adequate to build a good model to fill the gap. In addition, performance relative to the upper and lower benchmarks can help identify inadequacies in the training set.
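A simple screen for the training-range problem described above might compare candidate gap predictions (or proxy estimates) against the span of the training data; the margin and the values below are illustrative:

```python
import numpy as np

def coverage_report(train, gap_pred, margin=0.05):
    """Flag gap predictions outside the (slightly widened) training range,
    a simple screen for the extrapolation problem seen in the 2016 gap."""
    lo, hi = float(np.min(train)), float(np.max(train))
    span = hi - lo
    lo, hi = lo - margin * span, hi + margin * span
    outside = (np.asarray(gap_pred) < lo) | (np.asarray(gap_pred) > hi)
    return {"train_range": (lo, hi),
            "frac_outside": float(np.mean(outside))}

# Hypothetical discharges: training flows resemble 2015/2017, while the
# gap includes a 2016-style peak that exceeds the training maximum.
train_q = np.array([20.0, 45.0, 90.0, 175.0])
gap_q = np.array([60.0, 150.0, 240.0])
report = coverage_report(train_q, gap_q)
```

A non-trivial `frac_outside` would prompt the kind of qualitative image review described above before trusting the model's gap-filled values.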
The process defined in this paper for identifying image features that produce precise machine learning models was an outcome of this research. Using that process to investigate different image scenes and applications was beyond the scope of this paper, but it is integral to our ongoing research and development of the GaugeCam Remote Image Manager Educational—Artificial Intelligence (GRIME-AI) software [34], designed to facilitate similar studies. Developing precise models for these different scenarios will almost certainly require additional image features, and it may also require additional cameras with varying views of the scene, leading and lagging time series from upstream and downstream sensors and cameras, weather data, and other data. Further development of these libraries should therefore allow them to consume more than just images and stage and discharge measurements, calculate image features, and fuse the data for machine learning tasks. Such functionality could include the use of other scalar and categorical data from a wide variety of sources, including weather stations, satellites, and community/public observations (citizen scientists).
Acknowledgments
The authors gratefully acknowledge the Platte Basin Timelapse project for providing the images and the United States Geological Survey, Wyoming office for supplying the stage and discharge data. We gratefully acknowledge anonymous reviewers who greatly improved the paper.
Disclaimer: Christian D. Chapman is currently an MIT Lincoln Laboratory employee. No laboratory funding or resources were used to produce the results/findings reported in this publication.
References
- 1. Boiten W. Hydrometry. CRC Press; 2008.
- 2. Turnipseed DP, Sauer VB. Stage Measurements at Gaging Stations: U.S. Geological Survey Techniques and Methods book 3. Reston, Virginia: U.S. Geological Survey; 2010.
- 3. Gilmore TE, Birgand F, Chapman KW. Source and magnitude of error in an inexpensive image-based water level measurement system. Journal of Hydrology. 2013;496:178–186.
- 4. Zhang Z, Zhou Y, Liu H, Gao H. In-situ water level measurement using NIR-imaging video camera. Flow Measurement and Instrumentation. 2019;67:95–106.
- 5. Etheridge JR, Birgand F, Burchell MR II. Quantifying nutrient and suspended solids fluxes in a constructed tidal marsh following rainfall: The value of capturing the rapid changes in flow and concentrations. Ecological Engineering. 2015;78:41–52.
- 6. Muste M, Xiong Z, Schöne J, Li Z. Validation and Extension of Image Velocimetry Capabilities for Flow Diagnostics in Hydraulic Modeling. Journal of Hydraulic Engineering. 2004;130(3):175–185.
- 7. Muste M, Fujita I, Hauet A. Large-scale Particle Image Velocimetry for Measurements in Riverine Environments. Water Resources Research. 2008;44(4).
- 8. Muste M, Ho HC, Kim D. Considerations on direct stream flow measurements using video imagery: Outlook and research needs. Journal of Hydro-environment Research. 2011;5:289–300.
- 9. Muste M, Hauet A, Fujita I, Legout C, Ho HC. Capabilities of Large-Scale Particle Image Velocimetry to Characterize Shallow Free-Surface Flows. Advances in Water Resources. 2014;70:160–171.
- 10. Zhen Z, Huabao L, Yang Z, Jian H. Design and evaluation of an FFT-based space-time image velocimetry (STIV) for time-averaged velocity measurement. In: 2019 14th IEEE International Conference on Electronic Measurement and Instruments (ICEMI). IEEE; 2019. p. 503–514.
- 11. Tauro F, Tosi F, Mattoccia S, Toth E, Piscopia R, Grimaldi S. Optical Tracking Velocimetry (OTV): Leveraging Optical Flow and Trajectory-Based Filtering for Surface Streamflow Observations. Remote Sensing. 2018;10.
- 12. Young DS, Hart JK, Martinez K. Image analysis techniques to estimate river discharge using time-lapse cameras in remote locations. Computers and Geosciences. 2015;76:1–10.
- 13. Gupta A, Chang T, Walker J, Letcher B. Towards Continuous Streamflow Monitoring with Time-Lapse Cameras and Deep Learning; 2022. p. 353–363.
- 14. USGS. USGS Flow Photo Explorer; n.d. Available from: https://www.usgs.gov/apps/ecosheds/fpe/#/.
- 15. USGS. HIVIS (Hydrologic Imagery Visualization and Information System); n.d. Available from: https://apps.usgs.gov/hivis/.
- 16. Ren H, Cromwell E, Kravitz B, Chen X. Using Deep Learning to Fill Spatio-Temporal Data Gaps in Hydrological Monitoring Networks. Hydrology and Earth System Sciences. 2019;196.
- 17. Tfwala SS, Wang YM, Lin YC. Prediction of Missing Flow Records Using Multilayer Perceptron and Coactive Neurofuzzy Inference System. The Scientific World Journal. 2013;2013:584516. pmid:24453876
- 18. Thurstan RH, McClenachan L, Crowder LB, Drew JA, Kittinger JN, Levin PS, Roberts CM. Filling historical data gaps to foster solutions in marine conservation. Ocean and Coastal Management. 2015;115.
- 19. Kim Y, Johnson MS, Knox SH, Black TA, Dalmargo HJ, Kang M, et al. Gap-filling approaches for eddy covariance methane fluxes: A comparison of three machine learning algorithms and a traditional method with principal component analysis. Global Change Biology. 2019;26(3):1499–1518. pmid:31553826
- 20. Dastorani MT, Moghadamnia A, Piri J, Rico-Ramirez M. Application of ANN and ANFIS models for reconstructing missing flow data. Environmental Monitoring and Assessment. 2009;166:421–434. pmid:19543999
- 21. Chen C, He W, Zhou H, Xue Y, Zhu M. A comparative study among machine learning and numerical models for simulating groundwater dynamics in the Heihe River Basin, Northwestern China. Scientific Reports. 2020;10(1):1–13. pmid:32127583
- 22. Gong Y, Zhang Y, Lan S, Wang H. A Comparative Study of Artificial Neural Networks, Support Vector Machines and Adaptive Neuro Fuzzy Inference System for Forecasting Groundwater Levels Near Lake Okeechobee, Florida. Water Resources Management. 2016;30:375–391.
- 23. Yoon H, Jun SC, Hyun Y, Bae GO, Lee KK. A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in a coastal aquifer. Journal of Hydrology. 2011;396:128–138.
- 24. Jiang Z, Wang HY, Song WW. Discharge estimation based on machine learning. Water Science and Engineering. 2013;6(2):145–152.
- 25. Ritter A, Muñoz-Carpena R. Performance evaluation of hydrological models: Statistical significance for reducing subjectivity in goodness-of-fit assessments. Journal of Hydrology. 2013;480:33–55.
- 26. Jamal A, Linker R. Inflation method based on confidence intervals for data assimilation in soil hydrology using the ensemble Kalman filter. Vadose Zone Journal. 2020;19(1):e20000.
- 27. Seo Y, Kim S, Singh VP. Comparison of different heuristic and decomposition techniques for river stage modeling. Environmental Monitoring and Assessment. 2018;190:392. pmid:29892912
- 28. Jain SK. Modeling river stage–discharge–sediment rating relation using support vector regression. Hydrology Research. 2012;43(6):851–861.
- 29. Arnold JG, Moriasi DN, Gassman PW, Abbaspour KC, White MJ, Srinivasan R, et al. SWAT: Model use, calibration, and validation. Transactions of the American Society of Agricultural and Biological Engineers. 2012;55(4):1491–1508.
- 30. Forsberg M, Farrell M. Used with permission from: Platte Basin Timelapse; 2011–2020. Available from: http://plattebasintimelapse.com/.
- 31. USGS. 06674500 NORTH PLATTE RIVER AT WYOMING-NEBRASKA STATE LINE Station; 2020. Available from: https://waterdata.usgs.gov/nwis/inventory/?site_no=06674500&agency_cd=USGS.
- 32. Chapman KW, Gilmore TE, Chapman CD, Birgand F, Mittelstet AR, Harner MJ, et al. Technical Note: Open-Source Software for Water-Level Measurement in Images With a Calibration Target. Water Resources Research. 2022;58(8):e2022WR033203.
- 33. USGS. 06674500 NORTH PLATTE RIVER AT WYOMING-NEBRASKA STATE LINE Stage and Discharge data; 2020. Available from: https://nwis.waterdata.usgs.gov/usa/nwis/uv/?cb_00060=on&cb_00065=on&format=rdb&site_no=06674500&period=&begin_date=2012-01-01&end_date=2020-09-10.
- 34. Gilmore TE. GRIME-AI; n.d. Available from: https://gaugecam.org/grime-ai-details/.
- 35. Ng A. Machine Learning Yearning (Draft). Andrew Ng; 2018. Available from: https://www.deeplearning.ai/machine-learning-yearning/.
- 36. Chapman KW, Gilmore TE. Camera-based Water Stage and Discharge Prediction with Machine Learning [Data set]. University of Nebraska-Lincoln Data Repository. University of Nebraska Consortium of Libraries—UNCL; 2021. Available from: https://doi.org/10.32873/unl.dr.20210322.
- 37. Côté-Allard U, Campbell E, Phinyomark A, Laviolette F, Gosselin B, Scheme E. Interpreting Deep Learning Features for Myoelectric Control: A Comparison With Handcrafted Features. Frontiers in Bioengineering and Biotechnology. 2020;8:158. pmid:32195238
- 38. Lin W, Hasenstab K, Cunha GM, Schwartzman A. Comparison of handcrafted features and convolutional neural networks for liver MR image adequacy assessment. Scientific Reports. 2020;10:20336. pmid:33230152
- 39. OpenCV. Open Source Computer Vision Library; 2019.
- 40. Frank E, Hall MA, Witten IH. The WEKA Workbench. Online Appendix for "Data Mining: Practical Machine Learning Tools and Techniques". 4th ed. Morgan Kaufmann; 2016.
- 41. Buitinck L, Louppe G, Blondel M, Pedregosa F, Mueller A, Grisel O, et al. API design for machine learning software: experiences from the scikit-learn project. In: ECML PKDD Workshop: Languages for Data Mining and Machine Learning; 2013. p. 108–122.
- 42. R Core Team. R: A Language and Environment for Statistical Computing; 2016. Available from: https://www.R-project.org/.
- 43. Araujo P, Astray G, Ferrerio-Lage JA, Mejuto JC, Rodriguez-Suarez JA, Soto B. Multilayer perceptron neural network for flow prediction. Journal of Environmental Monitoring. 2011;13:35–41. pmid:21088795
- 44. Fukami K, Fukagata K, Taira K. Assessment of supervised machine learning methods for fluid flows. Theoretical and Computational Fluid Dynamics. 2020;34:497–519.
- 45. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Springer; 2009.
- 46. Rosenblatt F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review. 1958;65(6):386–408. pmid:13602029
- 47. Breiman L. Random Forests. Machine Learning. 2001;45.
- 48. Cortes C, Vapnik V. Support-Vector Networks. Machine Learning. 1995;20:273–297.
- 49. Kalman RE, Bucy RS. New Results in Linear Filtering and Prediction Theory. Journal of Fluids Engineering. 1961;83(1):95–108.
- 50. Sun Y, Bao W, Valk K, Brauer CC, Sumihar J, Weerts AH. Improving forecast skill of lowland hydrological models using ensemble Kalman filter and unscented Kalman filter. Water Resources Research. 2020;e2020WR027468.
- 51. Kim Y, Bang H. Introduction to Kalman Filter and Its Applications. IntechOpen; 2018.
- 52. Moriasi DN, Arnold JG, Liew MWV, Bingner RL, Harmel RD, Veith TL. Model Evaluation Guidelines for Systematic Quantification of Accuracy in Watershed Simulations. Transactions of the ASABE. 2007;50(3):885–900.
- 53. Seibert J, Vis MJP, Lewis E, van Meerveld HJ. Upper and lower benchmarks in hydrological modelling. Hydrological Processes. 2018;32:1120–1125.
- 54. Dery SJ, Stieglitz M, McKenna EC, Wood EF. Characteristics and Trends of River Discharge into Hudson, James, and Ungava Bays, 1964–2000. Journal of Climate. 2005;18:2540–2557.
- 55. Yannick R. Filling gaps in time series in urban hydrology. Laboratoire de Génie Civil et d'Ingénierie Environnementale; 2014.
- 56. Zhang Y, Post D. How good are hydrological models for gap-filling streamflow data? Hydrology and Earth System Sciences. 2018;22:4593–4604.
- 57. Dembélé M, Oriani F, Tumbulto J, Mariéthoz G, Schaefli B. Gap-filling of daily streamflow time series using Direct Sampling in various hydroclimatic settings. Journal of Hydrology. 2018;569:573–586.
- 58. Tosi F, Rocca M, Aleotti F, Poggi M, Mattoccia S, Tauro F, et al. Enabling Image-Based Streamflow Monitoring at the Edge. Remote Sensing. 2020;12:2047.
- 59. Zhang Z, Zhou Y, Liu H, Zhang L, Wang H. Visual Measurement of Water Level under Complex Illumination Conditions. Sensors. 2019;19(19). pmid:31554301
- 60. Muhadi NA, Abdullah AF, Bejo SK, Mahadi MR, Mijic A. Deep Learning Semantic Segmentation for Water Level Estimation Using Surveillance Camera. Applied Sciences. 2021;11(20).
- 61. Vandaele R, Dance SL, Ojha V. Deep learning for automated river-level monitoring through river-camera images: an approach based on water segmentation and transfer learning. Hydrology and Earth System Sciences. 2021;25(8):4435–4453.
- 62. Sit M, Demiray BZ, Xiang Z, Ewing GJ, Sermet Y, Demir I. A comprehensive review of deep learning applications in hydrology and water resources. Water Science and Technology. 2020;82(12):2635–2670. pmid:33341760
- 63. Shen C, Lawson K. Applications of Deep Learning in Hydrology. John Wiley and Sons, Ltd; 2021. p. 283–297. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/9781119646181.ch19.
- 64. Hirsch RM. A Comparison of Four Streamflow Record Extension Techniques. Water Resources Research. 1982;18:1081–1088.
- 65. USGS. move.1: Maintenance of Variance Extension, Type 1 in USGS-R/smwrStats: R functions to support statistical methods in water resources; n.d. Available from: https://rdrr.io/github/USGS-R/smwrStats/man/move.1.html.
- 66. Ballard DH, Brown CM. Computer Vision. Prentice Hall, Inc.; 1982.
- 67. Gonzalez RC, Woods RE. Digital Image Processing. 4th ed. Pearson; 2018.
- 68. Chavez-Jimenez CM, Salazar-Lopez LA, Chapman K, Gilmore T, Sanchez-Ante G. Feature Analysis and Selection for Water Stream Modeling. In: Rodriguez-Gonzalez AY, Perez-Espinoza H, Martinez-Trinidad JF, Carrasco-Ochoa JA, Olvera-Lopez JA, editors. Pattern Recognition. Springer Nature Switzerland; 2023. p. 3–12.
- 69. Nelson NG, Muñoz-Carpena R, Phlips EJ, Kaplan D, Sucsy P, Hendrickson J. Revealing Biotic and Abiotic Controls of Harmful Algal Blooms in a Shallow Subtropical Lake through Statistical Machine Learning. Environmental Science and Technology. 2018;52(6):3527–3535. pmid:29478313
- 69. Nelson NG, Muñoz-Carpena R, Phlips EJ, Kaplan D, Sucsy P, Hendrickson J. Revealing Biotic and Abiotic Controls of Harmful Algal Blooms in a Shallow Subtropical Lake through Statistical Machine Learning. Environmental Science and Technology. 2018;52(6):3527–3535. pmid:29478313