Electronic Data Capture Tools for Global Health Programs: Evolution of LINKS, an Android-, Web-Based System

Introduction
The rapid expansion of mobile networks globally, coupled with the decreasing cost of mobile equipment [1], is allowing global health programs increasingly to utilize mobile- and cloud-based technology in their efforts to target important challenges to public health. Our initial electronic data collection system employed personal digital assistants (PDAs) [2,3], but these proved to have significant cost and scalability limitations. The present report describes a second-generation, more efficient, cloud-based, smartphone-based system and the key elements that lead to its greater efficiency.

The LINKS System
While there are a number of tools available for data collection (EpiCollect, FormHub, EpiInfo, and others), these tools were not ideal for our purposes because of either license restrictions or other challenges. The starting point for the new mobile application, called the LINKS system (Figure 1), was the open source project Open Data Kit (ODK) [4,5]. ODK allows the collection of a wide range of data using only the internal components of smartphone devices, such as the built-in GPS and the camera, which can be used as a barcode scanner.
A server-based application (app) processes incoming data and writes those data to a database. A dynamic web interface was developed to present the collected data to the user in the form of tables, graphs, maps, and downloadable datasets. The system was deployed on Ubuntu Linux, running on Amazon's Elastic Compute Cloud (AWS EC2, http://aws.amazon.com/ec2/) infrastructure. Geotrust secure certificates were installed to encrypt the data both during transmission from the devices and between the user's browser and the server. Data are managed through a web interface or downloaded for offline use outside of the system.
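The server-side step described above (receive a submission, write it to a database) can be sketched as follows. This is a minimal illustration, not the LINKS implementation: it assumes submissions arrive as small XML documents, and the element names, the `submissions` table, and the in-memory SQLite database are all hypothetical stand-ins.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical submission; element names are illustrative, not the LINKS schema.
SAMPLE_SUBMISSION = """<data id="household_survey">
  <district>Example District</district>
  <age>34</age>
  <gps>9.0300 38.7400</gps>
</data>"""


def parse_submission(xml_text):
    """Flatten one XML form submission into (form_id, {field: value})."""
    root = ET.fromstring(xml_text)
    form_id = root.get("id", "unknown")
    fields = {child.tag: (child.text or "").strip() for child in root}
    return form_id, fields


def store_submission(conn, xml_text):
    """Write each field of a submission as one row; return the rows written."""
    form_id, fields = parse_submission(xml_text)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS submissions "
        "(form_id TEXT, field TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO submissions VALUES (?, ?, ?)",
        [(form_id, name, value) for name, value in fields.items()],
    )
    conn.commit()
    return len(fields)


# An in-memory database stands in for the production database (assumption).
conn = sqlite3.connect(":memory:")
rows_written = store_submission(conn, SAMPLE_SUBMISSION)
```

Storing one row per field keeps the sketch independent of any particular survey; a production system would more likely map each form to its own table so that the web interface can build tables, graphs, and maps directly from typed columns.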
The LINKS system was initially developed to address shortcomings of the earlier PDA-based data-capture systems and to support the interests of the Neglected Tropical Diseases (NTDs) community in employing an integrated approach to the NTDs using shared technical platforms. The LINKS system can
• support mobile technology running on a wide range of locally accessible hardware
• be used in both highly connected (internet) and connection-poor settings
• have a mechanism to deploy additional surveys to equipment already in the field
• be built entirely with industry-standard open source software to avoid costly licensing fees
• be cloud-based to allow for centralized management and increase scalability for large, highly dispersed projects
Since its launch in June of 2011, the LINKS system has been deployed to over 20 countries by multiple partner organizations (Table 1). Upon the completion of each project, both data collectors and project managers assessed its usability and the perceived benefits as well as the challenges to using the LINKS system (Table 2). This survey was administered to 30 individuals across academic and government organizations. Feedback was received from staff in Ethiopia, Tanzania, Kenya, Mozambique, Nigeria, Dominican Republic, and Indonesia. Additional feedback was received from implementing partners in the United States and United Kingdom.
Cost savings were an immediately recognized benefit to deploying an app-based system that could run on any Android device (Table 3). The system was developed using open source applications and deployed on cost-effective cloud-based hosting. Acquisition costs of individual data collection devices were cut in half (and have continued to decrease), and shipping costs were also reduced by approximately half as there was not only less equipment (weight) being used but also no longer a need to ship equipment back to the central office for reprogramming before the next deployment. Training savings, too, were realized, because the system mirrors the existing paper forms and, compared to PDA systems, does not require extensive practice navigating multiple steps either to enter data or to send data to the server. Training costs are anticipated to be further reduced in the future as video-based training is introduced through websites such as YouTube (www.youtube.com).
Data quality is another important domain for evaluation. Routinely, the quality of the data has been assessed by reviewing the number of errors detected in the submitted data during the course of the project. The LINKS system automatically enforces point-of-source data checks (range checks, required variables, logical rules, etc.) to keep data errors to a minimum; however, detectable errors still exist. The largest LINKS project, the Global Trachoma Mapping Project (GTMP), is currently mapping the prevalence of trachoma in areas where trachoma infection is suspected. Over the next two years this project will evaluate over 4 million individuals in over 1,200 districts. Over the past 11 months, during which 1 million individuals were surveyed, this multinational project has seen a daily error rate of 0.14% (two errors/1,430 submissions). A future comparison of error rates between paper and other electronic data collection systems would be highly beneficial to validate the LINKS system further.
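The three kinds of point-of-source checks named above (required variables, range checks, logical rules) can be illustrated with a short sketch. The field names, limits, and the consent rule below are hypothetical examples chosen for illustration, not the checks used in LINKS or the GTMP; the last function simply reproduces the error-rate arithmetic quoted in the text.

```python
def check_record(record):
    """Return a list of error messages for one record (empty means it passed)."""
    errors = []

    # Required variables: these fields must be present and non-empty.
    for field in ("household_id", "age"):
        if record.get(field) in (None, ""):
            errors.append(f"{field} is required")

    # Range check: age must fall within a plausible interval.
    age = record.get("age")
    if isinstance(age, int) and not 0 <= age <= 120:
        errors.append("age must be between 0 and 120")

    # Logical rule: an exam result makes no sense without consent.
    if record.get("consent") is False and record.get("exam_result") is not None:
        errors.append("exam_result must be empty when consent is refused")

    return errors


def error_rate_percent(errors, submissions):
    """Errors as a percentage of submissions."""
    return 100.0 * errors / submissions


good = {"household_id": "H1", "age": 34, "consent": True, "exam_result": "normal"}
bad = {"household_id": "", "age": 150, "consent": False, "exam_result": "normal"}

good_errors = check_record(good)
bad_errors = check_record(bad)
gtmp_rate = round(error_rate_percent(2, 1430), 2)  # figures quoted in the text
```

Running the checks on the device, before a record can be saved, is what distinguishes point-of-source validation from cleaning the data after submission: the collector can correct the value while still with the respondent.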
Finally, we evaluated the time between the end of data collection and implementation of results. A non-cloud-based system requires manual synchronization of data using local laptop computers, adding time and equipment. In contrast, a cloud-based system automatically synchronizes the data directly from the smartphones whenever they are connected to the internet, allowing data managers to identify issues and communicate them to the field team while data collection is still under way. This allows data from all project sites to flow from collection to implementation more rapidly. In the case of the GTMP, results take, on average, three days from the end of data collection to be included in public programmatic tools and for program implementation planning.
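The automatic synchronization described above amounts to a store-and-forward pattern: records are held on the device and uploaded as soon as a connection exists. The sketch below is a hedged illustration of that pattern only; the `SubmissionQueue` class, the `send` and `is_connected` callables, and the fake uploader are hypothetical and do not describe how LINKS or ODK implement it.

```python
from collections import deque


class SubmissionQueue:
    """Hold records locally and upload them in order when a connection exists."""

    def __init__(self, send, is_connected):
        self._pending = deque()
        self._send = send                  # callable(record) -> True on success
        self._is_connected = is_connected  # callable() -> bool

    def add(self, record):
        """Save a record locally, then try to upload whatever is pending."""
        self._pending.append(record)
        return self.flush()

    def flush(self):
        """Upload pending records oldest-first; return how many were sent."""
        sent = 0
        while self._pending and self._is_connected():
            if not self._send(self._pending[0]):
                break  # keep the record queued and retry on a later flush
            self._pending.popleft()
            sent += 1
        return sent

    def pending_count(self):
        return len(self._pending)


# Demonstration with a fake uploader and a toggled connection state.
received = []
state = {"online": False}


def fake_send(record):
    received.append(record)
    return True


queue = SubmissionQueue(fake_send, lambda: state["online"])
queue.add({"id": 1})          # offline: stays on the device
queue.add({"id": 2})
held_offline = queue.pending_count()
state["online"] = True        # connection restored
sent_now = queue.flush()
```

A record leaves the queue only after the upload reports success, so a dropped connection mid-upload delays data rather than losing it.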
In addition to all that has been learned with this system over the past two years, it is important to note that mobile connectivity in remote areas (while sometimes still a challenge) is not an impediment to the implementation of these technologies.

Challenges Remaining
Experiences with the LINKS system have also identified a number of persistent challenges to implementing a mobile-based system. Most notably, the essential network requirement of such a system can prove demanding in certain environments. While workarounds such as storing data locally until a connection is available are now feasible, many of these connectivity challenges will also likely solve themselves as the marketplace becomes increasingly dependent on widespread internet and cellular connectivity. A further challenge with the cloud-based service model is the concern over data ownership, specifically, the acknowledgment that data are the sole property of the principal organization (e.g., national Ministries of Health) regardless of where the data are being stored. Similar efficiencies have been achieved from using a central cloud-based system in other technical services, such as e-mail and file storage. These services are accepted as the norm in personal use, and although it can be anticipated that national and global health data systems will move in this direction as well, this transition may be met with initial resistance.
The positive user feedback, combined with the cost-effective results from early deployment of the LINKS system, reinforces the drive to make continued development of electronic data collection systems and their rapid diffusion into daily use a priority for global health programs everywhere.