Can Unmanned Aerial Systems (Drones) Be Used for the Routine Transport of Chemistry, Hematology, and Coagulation Laboratory Specimens?

Background
Unmanned Aerial Systems (UAS or drones) could potentially be used for the routine transport of small goods such as diagnostic clinical laboratory specimens. To the best of our knowledge, there is no published study of the impact of UAS transportation on laboratory tests.
Methods
Three paired samples were obtained from each one of 56 adult volunteers in a single phlebotomy event (336 samples total): two tubes each for chemistry, hematology, and coagulation testing respectively. 168 samples were driven to the flight field and held stationary. The other 168 samples were flown in the UAS for a range of times, from 6 to 38 minutes. After the flight, 33 of the most common chemistry, hematology, and coagulation tests were performed. Statistical methods as well as performance criteria from four distinct clinical, academic, and regulatory bodies were used to evaluate the results.
Results
Results from flown and stationary sample pairs were similar for all 33 analytes. Bias and intercepts were <10% and <13% respectively for all analytes. Bland-Altman comparisons showed a mean difference of 3.2% for Glucose and <1% for other analytes. Only bicarbonate did not meet the strictest (Royal College of Pathologists of Australasia Quality Assurance Program) performance criteria. This was due to poor precision rather than bias. There were no systematic differences between laboratory-derived (analytic) CV's and the CV's of our flown versus terrestrial sample pairs; however, CV's from the sample pairs tended to be slightly higher than analytic CV's. The overall concordance, based on clinical stratification (normal versus abnormal), was 97%. Length of flight had no impact on the results.
Conclusions
Transportation of laboratory specimens via small UASs does not affect the accuracy of routine chemistry, hematology, and coagulation test results from selfsame samples. However, it results in slightly poorer precision for some analytes.


Introduction
Unmanned Aerial Systems (UAS), colloquially known as drones, are aircraft without an onboard human pilot. On December 1st, 2013, Amazon.com introduced the world to the idea of civilian drones when its CEO unveiled Prime Air, a delivery drone, on live TV. However, UAS are not new. They have been in use since the early 1900's [1] but were primarily developed and flown by military organizations due to their enormous cost. Recent advances in technology have provided high quality sensors at low price-points, greatly expanding the availability and potential utility of UAS. One of these potential new uses is the routine transport of small goods such as diagnostic clinical laboratory specimens.
Transport of biological specimens, whether by planes, trains, or cars, is ubiquitous in both high- and low-resourced environments [2][3][4]. The majority of specimens are obtained in physician offices or clinics that tend to have small laboratories with limited testing menus [5,6]. Thus samples must be transported to larger, more complex laboratories to provide the testing required for clinical care. To illustrate, there are approximately 244,000 laboratories in the United States [7]. In 2006, physician office and other small non-hospital clinical laboratories accounted for ~75% of the total number of laboratories [6,8], but they accounted for only 13% of the test volume. In addition, 63% of their testing was in a point-of-care format, which offers a limited range of tests relative to core laboratory testing [6,8]. A 2011 survey of clinical laboratories in Kampala, Uganda showed the same pattern. Physician office laboratories (POL's) accounted for 94% of clinical laboratories [9], but only 52% of the test volume [5], and > 80% of these POL's performed only simple kit tests (point of care tests) or light microscope exams.
In addition to being a potentially new mode of transporting biological samples, UAS have unique advantages such as no traffic delays, low overhead costs, and the ability to go where there is no passable road. The impact of poor or difficult road access on healthcare is well documented in both high- [10] and low-resourced [11,12] countries. UAS are a potential way around this barrier, but are only useful if they do not adversely affect the test results of transported samples [2][13][14][15][16].
Our first challenge in addressing the impact of UAS transport on laboratory results was the high and expanding number of tests used in clinical care [17]. Fortunately, less than 0.5% (40/2000) of these tests account for 80% of the test volume. Thus we began by focusing, in these first experiments, on the impact of UAS transport on the 33 most common tests performed in hospital laboratories [17,18]. A second challenge was determining what quality criteria to use for evaluating any differences we might see. There is no single worldwide consensus on acceptable performance for laboratory tests. The most widely used performance criteria are intended for interpretation of External Quality Assessment reports. They are largely measures of accuracy, and vary by jurisdiction [19][20][21][22]. To account for these limitations, we evaluated our results in three ways: 1) we used four performance acceptability criteria, including two from groups outside the United States [19][20][21][22]; 2) we examined changes in reference range-based clinical classification; and 3) we examined differences between laboratory-derived (analytic) CV's and those from our paired samples.
To the best of our knowledge, there has been no published research on the impact of UAS transportation on the stability of biological specimens or on the laboratory test results obtained from those specimens. Obtaining these data, which are needed to determine the feasibility of UAS transportation of biological samples, is the objective of this study.

Study Design
All participants gave oral consent using an identical script in English. Oral consent was used to guarantee anonymity for the volunteers. The samples were identified using a study ID and there was no key linking the participants to the samples or results. The consent procedure and the study were approved by the Johns Hopkins Medicine Human Subjects Institutional Review Board (Baltimore). 56 volunteers were recruited for the study: 36 females and 23 males. The mean (SD) age was 38.1 (11.6) years. Three paired samples (6 total) were obtained from each of the 56 adult volunteers: two 3.5 mL serum separator tubes, two 3 mL Potassium EDTA whole blood tubes, and two 2.7 mL citrated plasma tubes (BD Vacutainer). All six samples were collected in a single event using standard phlebotomy technique.
One set of the paired tubes was driven to the flight site and flown in the UAS. The second sample set was driven to the flight site but not flown (Fig 1). Flight times were staggered, from a minimum of 6 to a maximum of 37.5 minutes. All samples were kept at ambient temperatures. The maximum temperatures in the transport vehicles and in the shade at the flight site on the two flight days were 76 and 79°F respectively.
For flight, the samples were packed in a sample payload module which served to control the in-flight environment as well as to contain the samples in the unlikely event of a leak or breakage (Fig 2). The flights were conducted in compliance with Advisory Circular (AC) 91-57 [23], Model Aircraft Operating Standards, as well as the International Air Transport Association's (IATA) Guidelines for the packaging of potentially infectious liquid biological materials (REF 6.1) (Fig 2) [24]. Briefly, each sample was enclosed by three layers of packaging and enough STP absorbent material (SAF-T-PAK, Hanover, MD 21076; http://www.saftpak.com/STPPack/ ) to absorb twice the full volume of all the samples in the payload. The primary receptacles were the original sample tubes, separated from each other by a custom-cut foam block. The secondary receptacles were two sealed biohazard bags wrapped in opposite orientations around all the primary receptacles. The tertiary receptacle was the rigid aircraft fuselage, made of impact-absorbent EPS foam. Finally, the module carried an IATA label designating the contents as a class 6.2 infectious substance.
After flight operations were completed, all the samples (flown and stationary) were transported back to the Johns Hopkins Hospital Core laboratory. The time from the first drawn sample to the last result was less than 8 hours for all 336 samples in this experiment. The time from phlebotomy to arrival at the laboratory was uniform for sample sets from each individual but was not uniform across individuals. Serum and citrated plasma samples were centrifuged at 1900 × g for 7 minutes at 18.5°C and analyzed. Chemistry testing was performed on the Roche Hitachi c701 analyzer (Roche Diagnostics, Indianapolis, IN) and Hematology (CBC) testing was performed on the Sysmex XN-9000 hematology analyzer (Sysmex America, Inc., Lincolnshire, IL). PT and aPTT measurements were made on BCS XP analyzers (Siemens Medical Solutions USA, Inc., Malvern, PA 19355).

Flight Protocol
Samples were flown in a small fixed-wing aircraft ("Aero" from 3D Robotics, Berkeley, CA 94710; http://3drobotics.com) at an Altitude above Ground Level (AGL) of 100 meters. The aircraft was controlled using a conventional hobbyist 2.4 GHz radio control link. A fixed-wing aircraft was selected over other aircraft types, such as helicopter or multi-rotor, because it has the best range capability for a given take-off weight, is least expensive, and is least mechanically complex. The aircraft was launched with a hand toss, and flown up to the test altitude of 100 meters. It orbited the flight field, within the pilot's visual range, for the duration of the test. At the end of the flight, it was brought down to land on the belly skid. Among other precautions, the test was conducted away from populated areas, the aircraft was under the control of a ground-based pilot, and the aircraft's altitude was less than 100 meters.

Statistical Analysis
Deming regression was used to compare flown with stationary results for Sodium, Potassium, Chloride, CO2, and the other analytes tested. Linear regression was also used to investigate differences between flown and stationary sample pairs as a function of flight time. To determine if our results met clinical and regulatory quality criteria, we compared the 95% limits of agreement of our results to the intervals describing performance acceptability requirements for individual analytes [19][20][21][22]. To examine the repeatability of our results, we compared analytic CV's, based on repeat measurements of control material, to CV's from our flown and terrestrial specimens. To determine the effect of UAS flight on the clinical classification of patients, we stratified results according to their reference ranges (normal and abnormal) and compared agreement of the flown and stationary sample pairs (concordance). Analyse-it Software for Microsoft Excel Version 3.90.1 (Analyse-it Software, Ltd., Leeds, UK) and Excel (Microsoft, Redmond, WA) were used to do the analysis.
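As an illustration of the method-comparison step described above, the following is a minimal sketch of a simple Deming fit. It is not the Analyse-it implementation used in the study. The function name `deming_regression`, the default error-variance ratio `delta = 1.0` (orthogonal regression), and the sodium values are all assumptions made for the example.

```python
import math


def deming_regression(x, y, delta=1.0):
    """Return (slope, intercept) of a simple Deming fit of y on x.

    delta is the assumed ratio of the error variance in y to the error
    variance in x; delta = 1.0 corresponds to orthogonal regression.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample variances and covariance of the paired results.
    sxx = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    if sxy == 0:
        raise ValueError("slope is undefined when the covariance is zero")
    root = math.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)
    slope = (syy - delta * sxx + root) / (2 * sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sodium results (mmol/L); these are NOT study data.
stationary = [138.0, 140.0, 141.0, 139.0, 142.0, 137.0]
flown = [138.0, 141.0, 141.0, 138.0, 142.0, 137.0]
slope, intercept = deming_regression(stationary, flown)
print(f"slope = {slope:.3f}, intercept = {intercept:.2f} mmol/L")
```

Unlike ordinary least squares, this fit allows for measurement error in both the flown and the stationary result, which is why it is commonly preferred when two sets of laboratory measurements are compared.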

Correlations
Tables 1, 2, and 3 show data describing the linear relationship between the flown and stationary sample chemistry, hematology, and coagulation results. The slopes of the regression equations were between 0.93 and 1.10 for all 33 tests and between 0.95 and 1.05 for 26 of the 33 tests. In addition, the intercept was close to zero: < 5% of the mean value for 31 of the 33 analytes, and <13% of the mean value for all analytes. Thus for these 26 tests, the results obtained from the flown and stationary sample pairs were within 5% of each other, and within 10% of each other for all 33 tests. Twenty-one of the 33 tests had coefficients of determination (r²) above 0.9, and six of the 33 tests had coefficients of determination (r²) below 0.7, between the results from the flown and stationary sample pairs. Tables 1, 2, and 3 also show the between-run Coefficients of Variation (CV) of a normal control material for each of the 33 measured analytes compared to the population CV of stationary versus flown sample pairs [25]. Eight chemistry analytes were directly measured tests with controls. For six of these eight analytes (Table 1), the analytic CV was lower than the population CV; however, the differences were small. The absolute differences between the analytic CV and the population CV ranged from 0.0 to 3.3 across the eight chemistry analytes.
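The comparison between an analytic CV and a CV derived from sample pairs can be sketched as follows. The text does not state the exact formula used for the population CV, so the within-pair (duplicate) standard deviation used in `paired_cv_percent` is an assumption, as are the function names and the example values.

```python
import math
import statistics


def analytic_cv_percent(control_results):
    """CV (%) from repeat measurements of one control material."""
    mean = statistics.mean(control_results)
    return 100.0 * statistics.stdev(control_results) / mean


def paired_cv_percent(flown, stationary):
    """CV (%) from paired results, using the duplicate-measurement SD.

    Assumption: SD = sqrt(sum(d^2) / (2n)), where d is the within-pair
    difference, divided by the grand mean of all results.
    """
    n = len(flown)
    if n != len(stationary) or n == 0:
        raise ValueError("need equally sized, non-empty result lists")
    sum_sq = sum((f - s) ** 2 for f, s in zip(flown, stationary))
    within_pair_sd = math.sqrt(sum_sq / (2 * n))
    grand_mean = (sum(flown) + sum(stationary)) / (2 * n)
    return 100.0 * within_pair_sd / grand_mean


# Hypothetical potassium values (mmol/L); these are NOT study data.
controls = [4.0, 4.1, 3.9, 4.0, 4.1]
flown = [4.2, 3.9, 4.5, 4.1]
stationary = [4.1, 4.0, 4.4, 4.2]
print(f"analytic CV = {analytic_cv_percent(controls):.1f}%")
print(f"paired CV   = {paired_cv_percent(flown, stationary):.1f}%")
```

A paired CV that is only slightly above the analytic CV would be consistent with the small differences reported in Tables 1, 2, and 3.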
Nineteen hematology analytes were directly measured or calculated tests with controls. For 11 of these 19 analytes (Table 2), the population CV was lower than the analytic CV. The absolute differences between the analytic CV and the population CV ranged from 0.0 to 1.6 for 17 of these 19 analytes. The other two analytes, percent Basophil and percent Eosinophil, had differences of 21.4 and 6.0 respectively. The differences between the means of the laboratory controls and the population means were large for these two analytes: 9-fold (4.8/0.7) and 5-fold (0.71/0.15) respectively. Four coagulation analytes were tests with controls. For all of these four analytes (Table 3), the population CV was higher than the analytic CV. The magnitude of difference between the analytic CV and the population CV ranged from 0.91 to 2.64 across the four analytes.
Eos: y = 1.07x - 0.01; 0.99; 0.7 × 10^9/L; 7.9; 0.2 × 10^9/L; 13.9
Table 2. Summary of the hematology results from flown and stationary samples, as well as analytic CV's based on controls versus sample pairs. *These are population CV's. doi:10.1371/journal.pone.0134020.t002

Bland-Altman Comparisons
The Bland-Altman plots show the percent differences in the results obtained between individual flown and stationary sample pairs. The dashed lines delineate the 95% limits of agreement. Only Glucose had a mean difference > 1.0%. Its mean difference was 3.2%. The blue lines show the mean difference for analytes where this was > 0.2% of the mean value.
Two of the eight chemistry analytes that correspond to measured tests (CO2 and Glucose) had a 95% limit of agreement greater than 10%. Two of the 14 hematology analytes that correspond to measured or calculated, non-transformed tests (Eosinophil and Basophil) had a 95% limit of agreement greater than 10%. Of note, these are also the two analytes with the lowest mean levels. Three of the four coagulation analytes that correspond to measured or calculated, non-transformed tests (aPTT ratio) had a 95% limit of agreement greater than 10%.
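The percent-difference form of a Bland-Altman comparison can be sketched as below. This is a hedged illustration rather than the study's own calculation: expressing each difference relative to the pair mean, the mean ± 1.96 SD limits, the function name `bland_altman_percent`, and the glucose values are all assumptions.

```python
import statistics


def bland_altman_percent(flown, stationary):
    """Return (mean % difference, lower limit, upper limit).

    Each difference is flown minus stationary, expressed as a percentage
    of the pair mean. The limits are mean +/- 1.96 sample SDs.
    """
    if len(flown) != len(stationary) or len(flown) < 2:
        raise ValueError("need at least two paired observations")
    diffs = [
        100.0 * (f - s) / ((f + s) / 2.0)
        for f, s in zip(flown, stationary)
    ]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    return mean_diff, mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


# Hypothetical glucose results (mg/dL); these are NOT study data.
flown = [92.0, 101.0, 88.0, 110.0, 95.0]
stationary = [90.0, 99.0, 87.0, 104.0, 93.0]
mean_diff, lower, upper = bland_altman_percent(flown, stationary)
print(f"mean difference = {mean_diff:.1f}%")
print(f"95% limits of agreement = {lower:.1f}% to {upper:.1f}%")
```

An analyte would count as having a 95% limit of agreement greater than 10% when either limit lies more than 10% from zero; how that span was measured in the study is not stated, so this reading is also an assumption.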

Allowable Performance Limits
With the exception of CO2 (bicarbonate), the 95% intervals for sample pair differences seen in this study were less than the performance criteria we used for comparisons [19][20][21][22]. The strictest bicarbonate 'Allowable Limits' criterion, that of the Royal College of Pathologists of Australasia Quality Assurance Programs, was 10%. The 95% interval for bicarbonate in our study was 13.6%.
Table 4 shows the concordance of the results from the stationary versus flown sample pairs using reference-range defined normal and abnormal cohorts. The concordance between results from the stationary and flown sample sets was 97% overall, 98.4% for normal results, and 86% for abnormal results.
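The concordance calculation described above can be sketched as a simple agreement rate between the clinical classes of each pair. The function names and the sodium reference range of 135 to 145 mmol/L are assumptions for illustration, not values taken from the study.

```python
def classify(value, low, high):
    """Label a result as 'normal' if it lies within [low, high]."""
    return "normal" if low <= value <= high else "abnormal"


def concordance_percent(flown, stationary, low, high):
    """Percentage of pairs whose flown and stationary results share a class."""
    if len(flown) != len(stationary) or not flown:
        raise ValueError("need equally sized, non-empty result lists")
    agree = sum(
        classify(f, low, high) == classify(s, low, high)
        for f, s in zip(flown, stationary)
    )
    return 100.0 * agree / len(flown)


# Hypothetical sodium results (mmol/L) and reference range; NOT study data.
flown = [140.0, 146.0, 134.0, 139.0]
stationary = [141.0, 144.0, 133.0, 139.0]
print(concordance_percent(flown, stationary, low=135.0, high=145.0))
```

In this made-up example the second pair differs by only 2 mmol/L but straddles the upper reference limit, so it is counted as discordant. Small differences near a cut-off can therefore lower concordance without implying a large analytic change.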

Flight Time
The length of flight had no measurable impact on the differences between results from flown and stationary samples. S1 Fig illustrates this finding for Sodium. The pattern was the same for all analytes.
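A flight-time check of this kind can be sketched by regressing the flown-minus-stationary difference on flight duration with ordinary least squares. The function name `ols_fit` and all values below are hypothetical and are not the study data.

```python
def ols_fit(x, y):
    """Return (slope, intercept) of an ordinary least-squares line."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical flight times (minutes) and sodium differences (mmol/L,
# flown minus stationary); these are NOT study data.
flight_minutes = [6.0, 12.0, 20.0, 28.0, 37.5]
sodium_diff = [0.0, 1.0, -1.0, 0.0, 0.0]
slope, intercept = ols_fit(flight_minutes, sodium_diff)
print(f"change in difference per minute of flight = {slope:.4f} mmol/L")
```

A slope close to zero, relative to the scatter of the differences, would match the finding that flight length had no measurable impact.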

Discussion
This report examines the effect of small UAS transport on the 33 most common chemistry, hematology, and coagulation clinical laboratory tests. The results from flown versus stationary sample pairs were compared using several statistical approaches to determine the presence and magnitude of any differences between them. In particular, we examined three kinds of error: systematic bias, random error, and changes in clinical classification. The laboratory results from stationary and flown sample pairs were similar and did not show any systematic biases. A few analytes, namely Chloride, CO2, MCV, MCH, Basophil %, Eosinophil, and activated partial thromboplastin time (aPTT), had a 95% limit of agreement >10%. However, these analytes had low mean levels (Eosinophil and Basophil), high variability (CO2 and Monocytes), or were based on transformed data (aPTT ratio). Of the 21 directly measured (as opposed to calculated) analytes in this study, only CO2 (bicarbonate) failed to meet the 'Allowable Limits' criteria. However, this was due to poorer precision rather than a systematic bias. It is likely that this reflects the intrinsic variability of the assay as well as prolonged time-to-analysis, rather than changes in atmospheric carbon dioxide levels with altitude, as these are small relative to the variances in our results [26,27].
Random errors resulted in slightly poorer precision in the experimental sample pairs compared to analytic CV's. Presently we are unable to conclusively determine if this is due to UAS transport or to the protracted time from initial phlebotomy to analyte measurement. This is because the regulatory environment in which our experiment was performed, which limited UAS flights to specific unpopulated areas and resulted in protracted time-to-analysis, still exists. However, it is clear that the impact on precision is small.
The overall agreement when samples were stratified clinically (normal vs. abnormal) was 97%. The agreement for normal samples was 99%. The agreement for abnormal samples alone was lower (86%). However, this reflects two limitations of our test cohort rather than a discrepancy in the clinical classification of abnormal samples due to UAS transport. There were very few abnormal samples, and the abnormal values tended to be just outside the reference range, so variation in results between sample pairs would lead to a re-classification of the result even though the magnitude of the change was small.
The coefficient of determination (r²) for 21 of the 33 tests in this report met or exceeded 0.9. The other 12 tests had r² values < 0.9, but for reasons that were unrelated to agreement between the two result sets. The r² value is also affected by a low mean value of a cohort, a narrow range of values (highest-lowest) in a cohort, or only a few possible values within a cohort (e.g. a dichotomous variable). In our case, all 12 of the tests with r² < 0.9 had either low mean normal values, a narrow normal range, or relatively few possible values within that range (S1 Table).
At the inception of this work, there was no precedent for packaging samples for UAS transport. To address this, we considered environmental variables that might be relevant for this mode of transportation, including temperature, atmospheric pressure, and acceleration. Temperature change with altitude, or Adiabatic Lapse Rate, is small (0.6°C/100 m) at elevations from 0-12,000 m [28]. Atmospheric pressure change with altitude is also small for small changes in altitude, about 1.2 kPa (0.012 atm) for every 100 meters [29][30][31]. Our test flights called for changes in altitude that were less than 100 m; therefore we reasoned that no specific measures would be needed to stabilize temperature or pressure when ambient conditions were not extreme. However, we anticipated that acceleration might be a significant environmental factor because the UAS was launched by a hand toss, and landed by sliding to a stop on its belly (Fig 1 and https://vimeo.com/123492106). To mitigate these effects, we packed the sample vials individually in custom-cut soft foam (Fig 1), sealed this in two flexible biohazard bags (zip-loc) with absorbent material, and placed this package inside the fuselage, which is constructed of impact-absorbing EPS foam (e.g. Styrofoam). Under the conditions of our experiment, there was no impact of the flight on hemolysis rates. This was determined by comparing hemolysis indices of the flown and stationary sample sets as measured on the Roche Hitachi c701 analyzer. It is likely that the custom-cut soft foam scaffold used to hold the tubes helped to stabilize them in transit. Thus any adoption of UAS transport of clinical diagnostic specimens will need to follow similar practices.
This study's most significant limitation is that the volunteers were mostly healthy individuals and so their results were in the relatively narrow normal range, rather than spread across the full assay range (low to high) for each test. Thus we do not know the impact of UAS transport on results that are outside of the normal reference range. Subsequent experiments will be required to flesh out changes across the full reporting range (low to high) of each test-type. Nevertheless this paper is an important first step in determining if laboratory tests for the most common analytes used in healthcare are reliable when those samples are transported by UAS.

Conclusions
Our findings demonstrate that, for the 33 test-types in this study, laboratory results from UAS-transported samples agree with those transported terrestrially: there were no systematic differences in results from flown versus terrestrial specimens. However, there was slightly worse precision in the flown samples. Full adoption of UAS transport of diagnostic specimens will require similar studies for other types of laboratory tests, specimens, and environmental conditions.

Supporting Information
S1 Data. Database of the raw data generated in this experiment.
S1 Table. Showing the size of normal range, the mean analyte level and the R² for all analytes. (DOCX)