Trail bridges can improve access to critical services such as health care, schools, and markets. To evaluate the impact of trail bridges in rural Rwanda, it is helpful to know objectively how and when they are being used. In this study, we deployed motion-activated digital cameras across several trail bridges installed by the non-profit Bridges to Prosperity. We conducted and validated manual counting of bridge use to establish a ground truth. We adapted an open source computer vision algorithm to identify and count bridge use reflected in the digital images. At the sites where the cameras logged short video clips, we found a reliable correlation between manual counting and computer vision counting, with less than 3% error bias in bridge crossings per hour. We applied this algorithm across 186 total days of observation at four sites in fall 2019 and observed a total of 33,800 bridge crossings, with daily totals ranging from about 20 to over 1,100 individual uses. There was no apparent correlation between daily or total weekly rainfall and bridge use, potentially indicating that transportation behaviors, after a bridge is installed, are no longer impacted by rainfall conditions. Higher bridge use was observed in the late afternoons and on market and church days, and use of the bridge crossings was roughly equal in each direction. These trends are consistent with the design intent of these bridges.
Citation: Thomas E, Gerster S, Mugabo L, Jean H, Oates T (2020) Computer vision supported pedestrian tracking: A demonstration on trail bridges in rural Rwanda. PLoS ONE 15(10): e0241379. https://doi.org/10.1371/journal.pone.0241379
Editor: Yan Chai Hum, University Tunku Abdul Rahman, MALAYSIA
Received: July 24, 2020; Accepted: October 13, 2020; Published: October 26, 2020
Copyright: © 2020 Thomas et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data used in this paper is derived from digital imagery of individuals. The tabulated data is included as Supporting Information. Consistent with our ethical statement, the original images cannot be shared publicly because of privacy concerns. Data are available from the Rwanda National Ethics Committee or the University of Colorado Boulder Institutional Review Board for researchers who meet the criteria for access to confidential data. Contact: email@example.com.
Funding: Funding for this work was provided by the Autodesk Foundation. The funder provided support in the form of contracted services through the University of Colorado Boulder for authors ET and SG. Synaptiq provided support for the study in the form of salaries for HJ and TO. Amazi Yego Ltd. provided support for the study in the form of a salary for LM. The funders did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.
Competing interests: HJ and TO are employed by Synaptiq Inc. which is contracted to provide analysis services similar to those presented in this paper. Author LM is employed by Amazi Yego Inc., which is contracted by the University of Colorado Boulder to conduct research data collection within the study described. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
Isolation caused by lack of transportation infrastructure makes access to basic social and economic activities unreliable for rural communities. This uncertain access to markets, income-generating opportunities, and health and education facilities contributes to persistent rural poverty. The World Bank estimates that one billion people worldwide lack access to an all-weather road, illustrating the scope of the problem and the challenge of addressing it at scale.
Bridges to Prosperity (B2P) is a non-profit organization that builds trail bridges to connect isolated rural communities to road networks and critical destinations and services including markets, hospitals and schools. Fig 1 illustrates an example bridge location in Rwanda. B2P has constructed 339 trail bridges in 21 countries. A study in Nicaragua established economic and livelihood benefits attributable to these bridges.
In an effort to establish any economic, health or educational impacts of these trail bridges in Rwanda, we conducted a matched-cohort study over a 12 month period in 2018-2019. As part of this study, we installed motion-activated digital cameras at several of the bridge crossings. The images collected are intended to support characterizing bridge use. Objective measurement technologies and techniques are self-evidently important to develop and refine when supporting claims of the effectiveness of environmental interventions.
A variety of technologies and analysis methods have been deployed and validated to count pedestrians, bicycles and vehicles crossing bridges and other transportation infrastructure. These methods include in-person observational counting [4, 5], motion-activated counters, magnetic inductive loops, infrared light beams, pressure pads, thermal cameras, digital video [8, 9], and imagery analysis.
These methods have been almost entirely deployed in high income urban settings, and often use technologies that would be cost-prohibitive or infeasible in a rural, low income setting. Bridges to Prosperity’s standard bridge use monitoring methods typically rely on manual, in-person data collection, which is time-consuming, produces temporally limited data, and is labor intensive. In order to identify and track trends and magnitude of bridge use, a continuous and automated method would be useful.
In this paper we describe the development, implementation, validation and findings of a novel method using low-cost, readily available motion-activated digital cameras in combination with open-source computer vision algorithms for measuring the use of these bridges. We describe the technology deployed, the computer vision supported detection algorithm applied, a human-validated error estimate, and early findings of bridge use patterns.
The following section details the technologies and methods deployed in this study. In brief, human manual counting was collected at several bridge sites and cross-validated between two manual counters. Digital cameras were installed at 12 total bridges, recording short video clips or digital still images. Manual counting was then compared to (a) the timestamps of the digital files from the cameras and (b) computer-vision supported counting of both the video clips and the digital stills. Following this validation, analysis of bridge use trends was conducted. Fig 2 presents a flowchart of the data collection technologies and analysis methods applied.
Digital cameras were installed at 12 total bridges, recording short video clips or digital still images. Manual counts aggregated at 60-minute intervals were then compared to the timestamps of the digital files from the cameras, and to computer-vision supported counts of both the video clips and the digital stills. Following this validation, analysis of bridge use trends, including satellite-detected rainfall as a covariate, was conducted for 4 bridges over 2-14 weeks in fall 2019. Green blocks indicate manual counting, blue blocks indicate digital data collection, and orange blocks indicate statistical analyses. Human manual counting was collected at several bridge sites in rural Rwanda, and cross-validated between two manual counters.
2.1 Camera selection, installation and image collection
In this study, we examined if a motion-activated digital camera system could support bridge use data collection. We reviewed several commercially available motion-activated digital cameras marketed for outdoor long term monitoring of wildlife. After comparing cost, complexity, battery lifetime and mechanical interfaces, we selected the Browning Spec Ops Advantage (Browning Trail Cameras, www.browningtrailcameras.com), available retail for about $150. Fig 3 shows one of these cameras installed in a protective housing next to a trail bridge entry ramp.
These cameras were installed at 12 bridge crossing sites over varying periods in 2019. Two modes of image data collection were employed—motion activated digital still images, and motion-activated short (3 second) video clips. The cameras automatically employed infrared LED lighting to support after-dark observations. Images were recorded on local data cards.
2.2 Manual counting
To support subsequent image analysis validation, at five sites we conducted daytime manual counting over 9 days, for about 8 hours per day. Counting was recorded with time-stamp electronic clickers. Each count represented one observed crossing of an individual, in either direction. Four of these day-long observations included two staff members independently and concurrently observing and recording bridge crossings in order to cross-validate this method.
2.3 Computer vision analysis
After varying periods of camera installation, the imagery files were recovered. For several sites, the still and video images were then “stitched” together to create continuous data files, then used to apply and refine computer vision algorithms for detecting and counting bridge users.
Computer-vision supported counting involves observing people in the video, tracking them as they move, and determining their direction of motion based on their tracks. We used modern deep neural networks and other machine vision tools provided by the open source OpenCV machine vision toolkit (www.opencv.org) to accomplish each of these steps automatically. The first step, finding people in an image, is an example of object detection, in which the goal is to find instances of specific types of objects and put bounding boxes around them. In this case, we used the open source Darknet implementation of the YOLO (You Only Look Once) object detection deep neural network, which is pretrained to, among other things, detect people at frame rate.
Object detection has many applications, and thus has received significant attention in the machine vision community. Popular approaches tend to yield algorithm “families”, such as R-CNN, Fast R-CNN, Faster R-CNN, and Mask R-CNN. The latter learns to identify whether individual pixels belong to an object, a level of detail not needed for our application, while the first three place bounding boxes around detected objects. Later members of the R-CNN family tend to be faster, but not necessarily more accurate. The YOLO family includes YOLO, YOLO v2, and YOLO v3. We chose YOLO due to its superior runtime performance over the R-CNN family and the availability of pretrained models specifically for the person detection use case.
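The person-detection step described above can be sketched as a small post-processing routine. This is a minimal illustration rather than the study's implementation. It assumes the raw network output is already available as rows of the form [center_x, center_y, width, height, objectness, class scores...] with coordinates normalized to the frame size, and that class index 0 is "person" as in the COCO label list. The function name `person_boxes` and the 0.5 confidence threshold are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: COCO class ordering, where index 0 is "person".
PERSON_CLASS_ID = 0

# (left, top, width, height, confidence), all in pixels except confidence.
Box = Tuple[int, int, int, int, float]


def person_boxes(rows: Sequence[Sequence[float]],
                 frame_w: int,
                 frame_h: int,
                 min_conf: float = 0.5) -> List[Box]:
    """Keep confident 'person' detections and convert them to pixel boxes."""
    boxes: List[Box] = []
    for row in rows:
        cx, cy, w, h = row[0:4]
        class_scores = list(row[5:])
        # The predicted class is the one with the highest score.
        best_class = max(range(len(class_scores)), key=class_scores.__getitem__)
        conf = class_scores[best_class]
        if best_class != PERSON_CLASS_ID or conf < min_conf:
            continue
        # Convert normalized center/size to a top-left pixel box.
        width = int(w * frame_w)
        height = int(h * frame_h)
        left = int(cx * frame_w - width / 2)
        top = int(cy * frame_h - height / 2)
        boxes.append((left, top, width, height, conf))
    return boxes
```

In practice, overlapping boxes for the same person would usually also be merged with non-maximum suppression before tracking.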
For all deep object detection methods, people are detected independently across frames. That is, the fact that a person is detected in frame i does not inform detections in frame i − 1 or frame i + 1. Tracks must be built from consecutive frames for each person. That is accomplished by some form of an appearance model that characterizes the visual appearance in each bounding box so that the matching bounding box, the one with the most similar appearance, can be found in subsequent frames. Common choices for tracking with appearance models are the DLIB correlation algorithm and the Simple Online and Realtime Tracking with a Deep Association Metric (DeepSort) algorithm. We used the latter as it integrated more easily with the rest of our system.
Note that errors can occur anywhere in the pipeline. While false positives are rare in the object detection stage, false negatives can and do occur, where people are missed in one or more frames due to lighting conditions, occlusion, debris on the camera lens, etc. Appearance models are often based on simple image descriptors, like color histograms, and can thus also lead to false or missed matches. That said, the empirical results, described below, suggest that the overall system is robust and accurate.
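The idea of matching bounding boxes across frames by appearance can be illustrated with a simple color-histogram descriptor. This is a sketch of the general technique only, not of DeepSort, which uses a learned deep association metric. The function names, the histogram-intersection similarity, the greedy assignment, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b), each in 0..255


def color_histogram(pixels: Sequence[Pixel], bins: int = 4) -> List[float]:
    """Normalized joint RGB histogram of the pixels inside one bounding box."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins  # assumes bins divides 256 evenly
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    total = float(len(pixels)) or 1.0
    return [v / total for v in hist]


def similarity(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def match_tracks(tracks: Dict[int, List[float]],
                 detections: List[List[float]],
                 min_similarity: float = 0.5) -> Dict[int, int]:
    """Greedily assign each track ID to its most similar unclaimed detection."""
    pairs = sorted(
        ((similarity(hist, det), tid, j)
         for tid, hist in tracks.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    assigned: Dict[int, int] = {}
    used = set()
    for score, tid, j in pairs:
        if score < min_similarity:
            break  # remaining pairs are even less similar
        if tid in assigned or j in used:
            continue
        assigned[tid] = j
        used.add(j)
    return assigned
```

A track left without a match corresponds to the missed-match case discussed above, for example when a person is occluded for a few frames.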
The geometric application of this algorithm is illustrated in Fig 4, wherein an object centroid is tracked moving from one side of the frame to the other (i.e., left to right).
The angle that a tracked object makes with the line between two anchor points at the center of the bridge is tracked over time. The behavior of that value for a given detection ID indicates whether the traversal is left-to-right or right-to-left.
We start by defining a distance threshold between each object and two anchors placed at (xt, yt) and (x1, y1). With these anchors, we can create unique lines and track the rotation they make with the trackable centroids at (x3, y3) and (x4, y4). When a subject enters from the right, the counter begins to record the angle defined by the points (xt, yt), (x1, y1) and (x2, y2). When a subject enters from the left, the counter begins to record the angle defined by the points (xt, yt), (x1, y1) and (x3, y3). Maintaining a unique ID for each tracked object is essential in this step. Each angle is added to the queue of its corresponding trackable object. Once an object exits the trackable line region, the sum of the differences between contiguous angles in its queue will be positive or negative. A positive sum indicates the left direction, while a negative sum indicates the right.
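The angle-summation rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's code. It measures the signed angle at the first anchor between the anchor line and the tracked centroid, sums the differences between consecutive angles, and maps a positive sum to "left" and a negative sum to "right" as in the text. Which physical direction each sign corresponds to depends on the anchor order and on the image coordinate convention; here the first anchor is assumed to be at the top of a vertical center line, with y increasing downward. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def signed_angle(anchor_a: Point, anchor_b: Point, centroid: Point) -> float:
    """Signed angle (radians) at anchor_a from the anchor line to the centroid."""
    abx, aby = anchor_b[0] - anchor_a[0], anchor_b[1] - anchor_a[1]
    acx, acy = centroid[0] - anchor_a[0], centroid[1] - anchor_a[1]
    cross = abx * acy - aby * acx
    dot = abx * acx + aby * acy
    return math.atan2(cross, dot)


def crossing_direction(anchor_a: Point, anchor_b: Point,
                       track: Sequence[Point]) -> str:
    """Classify one tracked object's traversal from its queue of centroids."""
    angles: List[float] = [signed_angle(anchor_a, anchor_b, c) for c in track]
    total = 0.0
    for prev, curr in zip(angles, angles[1:]):
        # Wrap each step into [-pi, pi) so a jump across the +/-pi seam
        # is not mistaken for a large rotation.
        total += (curr - prev + math.pi) % (2 * math.pi) - math.pi
    if total > 0:
        return "left"
    if total < 0:
        return "right"
    return "undetermined"
```

Each detection ID would keep its own centroid queue, so that simultaneous crossings in opposite directions are classified independently.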
2.4 Ethics—Human research subjects
At the sites where cameras were installed, it was not practical to secure informed consent from every person using the bridge. Instead, the cameras were installed in public locations, are highly visible, and include a placard in Kinyarwanda stating, “This camera was installed in (month, year) for research approved by the Rwanda National Ethics Committee. It is recording people crossing the bridge. This will help us to understand the impact the bridge is having on surrounding communities. Please do not damage it or try to steal it.” Further, our research protocol includes blurring the faces of people in any images or videos published. This statement and approach were approved by the Rwanda National Ethics Committee on January 28, 2019. The individual pictured in Fig 3 (co-author Gerster) has provided written informed consent (as outlined in PLOS consent form) to publish their image alongside the manuscript.
3.1 Manual counting validation
Manual counting by two separate staff members was conducted across 4 days and 3 sites. A total of 1,713 separate crossings were observed by the manual counters. These two independent manual counts are compared to establish confidence in this method. Fig 5 illustrates a linear regression of each counter at each site, aggregated at 15-minute intervals. These results indicate nearly total agreement between the two manual counts (R² > 0.98). Therefore, in subsequent analysis these manual counts are considered the ground truth.
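The comparison described above, aggregating time-stamped counts into fixed intervals and then regressing one counter against the other, can be sketched as follows. This is an illustrative sketch with hypothetical function names; it uses the fact that for a simple least-squares line the coefficient of determination equals the squared Pearson correlation.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Sequence


def bin_counts(timestamps: Iterable[datetime], start: datetime,
               n_bins: int, minutes: int = 15) -> List[int]:
    """Count events per fixed-width interval, beginning at `start`."""
    counts: Counter = Counter()
    width = minutes * 60
    for ts in timestamps:
        idx = int((ts - start).total_seconds() // width)
        if 0 <= idx < n_bins:  # ignore events outside the observation window
            counts[idx] += 1
    return [counts[i] for i in range(n_bins)]


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """R^2 of the least-squares line relating y to x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either series is constant")
    return (sxy * sxy) / (sxx * syy)
```

The same two helpers apply to the 60-minute comparisons in the following sections by passing `minutes=60`.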
3.2 Motion activated event counting
In previous work, we and others have relied on motion-activated event counting applied to the use of sanitation infrastructure [22–24]. The digital cameras used in this study create separate files for each motion-activated image or video. We used these time-stamped image files to establish if simple event detection similar to the latrine monitors deployed in other studies (without image analysis) could be a sufficient measure of bridge use. Fig 6 shows manual counting compared to motion-activated timestamp events (digital files) aggregated at 60 minute intervals for 9 day-long observation periods at 6 bridge sites. As illustrated, there are poor correlations between these counting methods across all tested sites. This poor correlation suggests that motion-detector based counting would not be a reliable indicator of the total number of bridge crossing events.
Manual and digital file timestamps are aggregated at 60-minute intervals. Poor correlations (R² range 0–0.4) between these counting methods are observed across all tested sites, suggesting that motion-detector based counting is not a reliable indicator of bridge crossings.
3.3 Computer vision supported counting
Adapting the OpenCV computer vision people counter described above, we analyzed videos and photos collected at these 6 sites over 9 day-long observation periods. Fig 7 presents six example screen shots from this analysis, representing a range of camera installation positions, bridge crossing behaviors, weather conditions, and lighting conditions. In each case, individuals are identified and counted when they cross the center of the frame.
Faces have been blurred to protect identities consistent with research protocols. In (a), four people are identified crossing the centerline to the left. In (b), the same site is shown at night, where the infrared illumination is sufficient to capture individuals, including distinguishing those socializing from those crossing the bridge. In (c), the same site is shown in the morning while condensation is apparent on the camera lens; the algorithm is still able to identify and count individuals. In (d), at another site, people are identified and counted while animals are not. Panels (e) and (f) show two other example bridges with varying camera locations, angles, and subjects.
Fig 8 shows manual counting compared to computer vision supported counting aggregated at 60 minute intervals for 9 day-long observation periods at 6 bridge sites, while Fig 9 reflects this same data aggregated across sites.
Manual and computer counts are aggregated at 60-minute intervals. Strong correlations between manual counting and computer-vision counting are observed (R² range 0.82–0.99).
Manual and computer counts are aggregated at 60-minute intervals and show strong overall correlation (R² = 0.89).
As illustrated, there is some variability between sites in the correlation between the computer vision counts and the manual counts. We found that the motion-activated video-clip files provided greater support for the computer vision algorithm than the stills. Table 1 presents error estimates disaggregated by the video and photo-still data types. The overall error bias of the video-clip data type was 2.63% per hour of counting.
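An hourly error bias of this kind can be computed in more than one way, and the exact formula is not stated above. The sketch below assumes one plausible definition: the mean, over hours with at least one manual count, of the signed percentage difference between the computer vision count and the manual count. The function name and the handling of zero-count hours are assumptions.

```python
from typing import Sequence


def mean_percent_bias(manual: Sequence[float],
                      computed: Sequence[float]) -> float:
    """Mean signed percent error per hour, relative to the manual count.

    Hours with a manual count of zero are skipped, since a percentage
    error is undefined for them (an assumption of this sketch).
    """
    errors = [100.0 * (c - m) / m
              for m, c in zip(manual, computed) if m > 0]
    if not errors:
        raise ValueError("no hours with a non-zero manual count")
    return sum(errors) / len(errors)
```

A positive value under this definition would mean the algorithm over-counts on average, and a negative value that it under-counts.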
3.4 Bridge use trends
Based on the finding that our computer vision algorithm, supported by the motion-activated video clips, has a low error bias per hour of counting, we then conducted computer vision people counting for the 4 sites for which we had video-clip data, across longer observational periods ranging between 17 and 51 days of continuous observation during August to November 2019. Fig 10 illustrates these estimated total daily crossings, along with daily rainfall at these sites. Rainfall estimates are provided using the remote-sensing based Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS). Table 2 presents statistics on the observed crossings at these four sites.
Non-zero rainfall is observed only for the Ruharazi and Kabere bridge sites, for which observations were available during the rainfall season September-November. Site-level and aggregated linear regressions (not shown) of daily bridge crossings compared to daily rainfall or 7-day mean rainfall did not indicate any correlation, indicating that bridge use during these observation periods at these sites was not dependent on rainfall.
Average daily bridge crossings range from 85 to 478; daily standard deviations range from 45 to 249.
Site-level and aggregated linear regressions of daily bridge crossings compared to daily rainfall or 7-day mean rainfall did not indicate any correlation, indicating that bridge use during these observation periods at these sites was not dependent on rainfall. As these observation periods captured only part of the year, we sought to establish if the rainfall variability and extremes observed during this period were representative of likely rainfall patterns throughout the year. Table 3 presents the observed rainfall mean, standard deviation, minimums and maximums recorded for each site during the observation periods, and for July 2017 to June 2018. An unpaired t-test comparing, across the total sample for these four sites, rainfall during the observation period with rainfall over a 3-year period indicated no significant difference, suggesting that the observational period may be sufficient to capture typical rainfall variability and to support any subsequent attribution of rainfall to bridge use.
Unpaired t-test comparing 3-year rainfall variability to rainfall observed during camera-observation period in 2019 indicated no significant difference, suggesting the observational period captures typical rainfall variability and any subsequent attribution of rainfall to bridge use.
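The two rainfall computations referred to above, a 7-day mean rainfall series and an unpaired comparison of two rainfall samples, can be sketched as follows. This is an illustration under assumptions: the text does not say whether the 7-day mean is trailing or centered, nor whether equal variances were assumed in the t-test. The sketch uses a trailing window (shorter at the start of the series) and Welch's unequal-variance form, and the function names are hypothetical.

```python
import math
from statistics import mean, variance
from typing import List, Sequence, Tuple


def trailing_mean(values: Sequence[float], window: int = 7) -> List[float]:
    """Mean of each day and the preceding (window - 1) days."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Welch's t statistic and its approximate degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

The p-value would then be read from a t distribution with `df` degrees of freedom, for instance with a statistics package; that step is omitted here to keep the sketch dependency-free.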
We then examined site-level and aggregate bridge use trends. Fig 11 shows the percentage of bridge crossings per hour of day for each site over the observation period. The trends indicated in the plot suggest high late afternoon use at all sites. Fig 12 shows the percentage of bridge crossings for each day of the week at each site. These trends indicate higher use on Sundays, which are market and church days in these communities. Finally, Fig 13 shows the percentage of bridge crossings in each direction for each hour of the day. “Towards Village” indicates individuals crossing the bridge in the direction of the community identified as most-impacted by the bridge, while “Away from Village” indicates individuals traveling out of the community. These trends indicate roughly equal use of the bridge crossings in each direction.
Trends indicate high late afternoon use.
Hourly use trends indicate higher use on Sundays, which are market and church days in these communities.
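The hour-of-day and day-of-week shares plotted in these figures can be derived from the crossing timestamps alone. The sketch below is a minimal illustration with a hypothetical helper, `percent_by`, that groups timestamps by any key and reports each group's share of all crossings.

```python
from collections import Counter
from datetime import datetime
from typing import Callable, Dict, Hashable, Iterable


def percent_by(timestamps: Iterable[datetime],
               key: Callable[[datetime], Hashable]) -> Dict[Hashable, float]:
    """Percentage of all crossings that fall into each group given by `key`."""
    counts = Counter(key(ts) for ts in timestamps)
    total = sum(counts.values())
    return {k: 100.0 * v / total for k, v in sorted(counts.items())}


def by_hour(timestamps: Iterable[datetime]) -> Dict[Hashable, float]:
    """Share of crossings per hour of day (0-23)."""
    return percent_by(timestamps, lambda ts: ts.hour)


def by_weekday(timestamps: Iterable[datetime]) -> Dict[Hashable, float]:
    """Share of crossings per day of week (0 = Monday ... 6 = Sunday)."""
    return percent_by(timestamps, lambda ts: ts.weekday())
```

Applying the same helper separately to the "Towards Village" and "Away from Village" crossings would give the directional split by hour.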
4 Discussion and forward work
This study developed and validated an accurate and useful method for counting and characterizing the use of trail bridges in rural Rwanda. In this study, we deployed motion-activated digital cameras across several trail bridge sites in Rwanda. We conducted and validated manual counting of bridge use to establish a ground truth. We adapted an open source computer vision algorithm to identify and count bridge use reflected in the digital images. We found a reliable correlation with low mean error of bridge crossings per hour between manual counting and those sites at which the digital cameras collected short video clips when triggered.
We then applied this algorithm across 186 total days of observation at four sites in fall 2019 and observed a total of 33,800 bridge crossings, with daily totals ranging from about 20 to over 1,100 individual uses and no apparent correlation between daily or weekly rainfall and bridge use. Bridge use trends were consistent with the design intent of these bridges, indicating higher use on market and church days and roughly equal use of the bridge crossings in each direction.
Bridges to Prosperity’s theory of change posits that rural communities are periodically and dangerously isolated by flooding events, and that trail bridges eliminate this isolation and risk. The analysis presented in this paper suggests that bridge use is not dependent on rainfall, potentially indicating that communities prefer the trail bridges to alternative or baseline river crossings. However, while no dependence of bridge use on rainfall was observed, further investigation is required to establish whether bridge use is affected by seasonal factors (such as harvest) or extreme weather events (such as flooding).
The work presented in this paper was conducted in support of a large scale (approximately 200 site) randomized controlled trial currently being conducted and scheduled for completion in 2024. As part of the large-scale study, we plan to deploy about 50 of these camera systems at bridge sites. The findings presented in this paper suggest that between-site variability in bridge use may be more significant than within-site variability. This may motivate moving camera systems between sites. Further, this large scale study will provide an opportunity to compare bridge use patterns to community level economic, health and educational outcomes.
The computer vision algorithm we deployed also detected the direction of movement (left-to-right or right-to-left) of the subjects. Additionally, as illustrated in the example images provided above, the nature of bridge use can be in part deduced through review of the collected images. The cameras also record local ambient temperature and barometric pressure. These additional data and capabilities may support further opportunities for bridge use characterization and modeling.
The authors thank Bridges to Prosperity, Wyatt Brooks, Kevin Donovan, Laura MacDonald, Marie-Claire Nikuze, Laurien Ngwinondebe, Christian Ituze, Jean D’Amour Kwizera, Pie Nkubito, Denyse Niragire, and Jean De Dieu Bineza.
- 1. Gollin D, Lagakos D, Waugh ME. The agricultural productivity gap. Quarterly Journal of Economics. 2014.
- 2. The World Bank. Transport. Available from: https://www.worldbank.org/en/topic/transport/overview
- 3. Brooks W, Donovan K. Eliminating Uncertainty in Market Access: Evidence from New Bridges in Rural Nicaragua. Econometrica (in revision). 2019.
- 4. Schneider RJ, Arnold LS, Ragland DR. Methodology for Counting Pedestrians at Intersections. Transportation Research Record: Journal of the Transportation Research Board. 2009.
- 5. Schasberger MG, Raczkowski J, Newman L, Polgar MF. Using a bicycle-pedestrian count to assess active living in downtown Wilkes-Barre. American Journal of Preventive Medicine. 2012. pmid:23079274
- 6. Cherrett T, Bell H, McDonald M. Traffic management parameters from single inductive loop detectors. Transportation Research Record. 2000.
- 7. Greene-Roesel R, Diogenes MC, Ragland DR, Lindau La. Effectiveness of a Commercially Available Automated Pedestrian Counting Device in Urban Environments: Comparison with Manual Counts. TRB 2008 Annual Meeting. 2008.
- 8. Ma Z, Chan AB. Counting people crossing a line using integer programming and local features. IEEE Transactions on Circuits and Systems for Video Technology. 2016.
- 9. Li J, Shao C, Xu W, Li J. Real-time system for tracking and classification of pedestrians and bicycles. Transportation Research Record. 2010.
- 10. Crouzil A, Khoudour L, Valiere P, Truong Cong DN. Automatic vehicle counting system for traffic monitoring. Journal of Electronic Imaging. 2016.
- 11. Liu L, Ouyang W, Wang X, Fieguth P, Chen J, Liu X, et al. Deep Learning for Generic Object Detection: A Survey. International Journal of Computer Vision. 2020.
- 12. Redmon J. Darknet: Open Source Neural Networks in C; 2016. Available from: https://pjreddie.com/darknet/.
- 13. Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified, real-time object detection. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; 2016.
- 14. Girshick R, Donahue J, Darrell T, Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; 2014.
- 15. Girshick R. Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision; 2015.
- 16. Ren S, He K, Girshick R, Sun J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2017.
- 17. He K, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2020.
- 18. Redmon J, Farhadi A. YOLO9000: Better, faster, stronger. In: Proceedings—30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017; 2017.
- 19. Redmon J, Farhadi A. YOLOv3. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2017.
- 20. King DE. Dlib-ml: A machine learning toolkit. Journal of Machine Learning Research. 2009.
- 21. Wojke N, Bewley A, Paulus D. Simple online and realtime tracking with a deep association metric. In: Proceedings—International Conference on Image Processing, ICIP; 2018.
- 22. Sinha A, Nagel CL, Thomas E, Schmidt WP, Torondel B, Boisson S, et al. Assessing latrine use in rural India: A cross-sectional study comparing reported use and passive latrine use monitors. American Journal of Tropical Medicine and Hygiene. 2016.
- 23. O’Reilly K, Louis E, Thomas E, Sinha A. Combining sensor monitoring and ethnography to evaluate household latrine usage in rural India. Journal of Water, Sanitation and Hygiene for Development. 2015;5(3):426–438.
- 24. Turman-Bryant N, Clasen TF, Fankhauser K, Thomas EA. Measuring progress towards sanitation and hygiene targets: A critical review of monitoring methodologies and technologies; 2018.
- 25. Funk C, Peterson P, Landsfeld M, Pedreros D, Verdin J, Shukla S, et al. The climate hazards infrared precipitation with stations—A new environmental record for monitoring extremes. Scientific Data. 2015;