Image-Based Medical Expert Teleconsultation in Acute Care of Injuries. A Systematic Review of Effects on Information Accuracy, Diagnostic Validity, Clinical Outcome, and User Satisfaction

Objective To systematically review the literature on image-based telemedicine for medical expert consultation in acute care of injuries, considering system, user, and clinical aspects. Design Systematic review of peer-reviewed journal articles. Data sources Searches of five databases and in eligible articles, relevant reviews, and specialized peer-reviewed journals. Eligibility criteria Studies were included that covered teleconsultation systems based on image capture and transfer with the objective of seeking medical expertise for the diagnosis and treatment of acute injuries and that presented the evaluation of one or several aspects of the system based on empirical data. Studies of systems not under routine practice or including real-time interactive video conferencing were excluded. Method The procedures used in this review followed the PRISMA Statement. Predefined criteria were used for the assessment of the risk of bias. The DeLone and McLean Information System Success Model was used as a framework to synthesise the results according to system quality, user satisfaction, information quality and net benefits. All data extractions were done by at least two reviewers independently. Results Out of 331 articles, 24 were found eligible. Diagnostic validity and management outcomes were often studied; fewer studies focused on system quality and user satisfaction. Most systems were evaluated at a feasibility stage or during small-scale pilot testing. Although the results of the evaluations were generally positive, the evaluation methodology raised concerns about selection, performance and exclusion biases. Gold standards and statistical tests were not always used when assessing diagnostic validity and patient management. Conclusions Image-based telemedicine systems for injury emergency care tend to support valid diagnosis and influence patient management. The evidence relates to a few clinical fields, and has substantial methodological shortcomings. As in the case of telemedicine in general, user and system quality aspects are poorly documented, both of which affect scale up of such programs.


Introduction
Rapid advances in telecommunication and information technology have sparked the development of a variety of systems that allow for new forms and domains of medical consultation. During the past two decades, many broad reviews of telemedicine have been published, describing the state of knowledge and assessing, to some extent, the quality of the evidence at hand. Some reviews are wide ranging both in scope and geography [1,2], some are broad in scope but restricted to some countries [3], some deal with specific perspectives of application (like diagnostic and management decisions) [4,5], and rare ones look at costs [6]. Two recent systematic reviews added to the literature in this area: one assessed the effect of telemedicine on professional practice and on patient health care outcome [7] and the other was a systematic review of reviews about the effectiveness of telemedicine [8]. A consistent finding across reviews is that radiology, mental health, and dermatology are three domains of application with positive clinical outcomes [4,5]. Yet, there are serious concerns that the evaluations conducted thus far are of rather poor methodological quality (e.g., design, methods, size/dimension) [9], with weak theoretical foundations, and limited to assessments of clinical management rather than patient recovery (and health). Also, little is known regarding their sustainability or the manner in which they can be implemented in other settings [4,5].
In the particular case of expert advice in the acute care of injured patients, expert consultation by telemedicine could be expected to significantly improve care access, quality and outcome by decentralising knowledge, speeding up and improving decision making and limiting patient transfer or expert displacement. This is encouraging as injury is an increasing cause of concern worldwide and it affects people from resource-poor areas, where prognosis is not so good, to a far greater extent [10,11]. Reviews are available in this domain, but many are descriptive [12][13][14] or context specific [4,15]. A 2006 review focusing on accident and emergency telemedicine for primary care concluded that most studies conducted until then demonstrated technical feasibility and improved triage with an increasing range of local management, but few cost-effectiveness assessments were available [16]. Those reviews briefly introduced the role of telemedicine in the emergency department [14], current trends in the development and adoption of tele-medical adjuncts for injury control [17], potential applications/functions of telemedicine for trauma and disaster management, and a review of systems from a US perspective [15]. Successful domains of application identified thus far are the transmission of computed tomography scans for urgent neurosurgical opinion and the transmission and interpretation of radiographs (usually peripheral limb films) for ongoing support of minor injury units [12,13]. In the case of burn injuries, studies are consistent on technical and clinical feasibility whereas less is known as regards clinical outcomes [18]. Systems have been evaluated in the main for their clinical accuracy, health care provider satisfaction, and follow-up of wound care [15].
Consulting those reviews helps us understand where the knowledge stands and what ethical and legal challenges are posed by the use of telemedicine in acute care. The knowledge at hand informs about various aspects of telemedicine, including those where experts are consulted and/or involved remotely in patient care. Yet the reviews provide limited assessments of the quality of the evidence thus far and they mix various types of telemedicine without specifying whether their conclusions actually apply to all of them. Although the field changes rapidly and new forms of teleconsultation enter the field of trauma care, those newer forms have not been reviewed in their own right.
Against this background, this systematic review was undertaken to 1. revisit and update the literature specifically on image-based telemedicine for medical expert consultation in acute care of injuries; and 2. systematically review the evidence at hand regarding system, user and clinical perspectives. Four main research questions are addressed: What is the system quality? What is the diagnostic validity? What is the effect on the management and clinical outcomes? What is the level of user satisfaction?

Methods
The procedures used in this review followed the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) Statement [19]. There is no published protocol for the systematic review, but the procedure is described in detail below.
The DeLone and McLean Information System Success Model [20] was used as a framework to synthesise the results according to system quality, user satisfaction, information quality (diagnostic validity) and net benefits (management and clinical outcomes).

Data sources and searches
This systematic review includes studies that were published in articles from peer-reviewed journals. A systematic search identified potentially relevant articles in five electronic databases commonly used in this research area: MEDLINE, EMBASE, CINAHL, Cochrane Library and PsycINFO. The databases were searched without a time limitation in June 2012, with the following terms (in title and abstracts or as MeSH terms): "telemedicine", "mHealth", "m-health", "eHealth", "e-health", "mobile health", "emergency", "emergencies", "injury", "injuries", "trauma", "acute burn", "acute burns". Relevant articles were also sought from the list of references of the reviews identified in the search, from the archive of all online issues of the "Journal of Telemedicine and Telecare" and the "Telemedicine and e-Health" journal, starting from 2005, and from the list of references of the articles considered as eligible (see Figure 1 below).

Study selection
Articles on the subject of telemedicine for medical expert consultation in emergency care were included if they met the following criteria: evaluated the acute stage of injury/trauma care, in emergency or pre-hospital settings; telemedicine intended to be used from point of care to specialist, and including image transfer; the system was assessed using human subjects; articles written in the English, French, Spanish, German or Nordic languages.
Studies were excluded if they met any of the following criteria: reviews, case studies or purely descriptive studies; studies done under extreme conditions (disaster situations, war zones, space, etc.); studies in which the images transferred were not trauma images; and studies in which image transfer was done in conjunction with real-time interactive videoconferencing.
First, two reviewers independently evaluated the abstracts of all records identified in the initial databases search. If either reviewer could not exclude the article, the full-text was obtained and evaluated by the reviewers to assess eligibility. Secondly, additional references from reviews, journals and eligible articles were screened by title by one of the reviewers. If the title indicated relevance, the abstract and then the full-text were screened by two reviewers.
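The first screening stage described above amounts to a simple decision rule: an article moves on to full-text review unless both reviewers exclude it on the abstract. A minimal sketch follows; the function name and the boolean encoding of each reviewer's decision are illustrative assumptions, not part of the review's method description.

```python
def needs_full_text(reviewer_a_excludes: bool, reviewer_b_excludes: bool) -> bool:
    """Return True when the full text should be obtained.

    Hypothetical encoding: each argument is True when that reviewer
    judged, from the abstract alone, that the article can be excluded.
    The article is dropped only when both reviewers exclude it.
    """
    return not (reviewer_a_excludes and reviewer_b_excludes)


# Disagreement between reviewers sends the article to full-text review.
print(needs_full_text(True, False))
```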

Data extraction and quality assessment
The articles were reviewed by all authors, after which a decision was made regarding the items that could be measured across most studies. Data were extracted on country, type of image, the clinical focus (including the medical discipline and the required expert), and the technology used for image treatment. Of the four perspectives investigated, i.e. system quality, user satisfaction, diagnostic validity and clinical management, system quality and user satisfaction were seldom evaluated, and the data gathered on those aspects were limited to the following. System quality covered above all image quality and the time to complete different steps in the telemedicine process; user satisfaction included the perceived ease of use and usefulness of the system. In the case of diagnostic validity and management outcomes, a wider range of data was compiled on the methodology of the studies (e.g., sample size and statistics used) and on the results obtained.
Attention was also paid to the methodological rigor of the studies by considering how various potential sources of bias were dealt with, based on "The Cochrane Collaboration's tool for assessing risk of bias" [21]: selection, performance, detection and attrition. The use of a "gold standard" was an additional criterion used for the studies dealing with diagnostic validity and management outcomes.
Data extraction for each article involved at least two reviewers who independently reviewed each article. After the individual assessments, the reviewers met two by two to discuss and agree on the data extraction and quality assessment of each article.

Results
Of the 331 articles identified in the database search, 16 were eligible for review, an additional 3 were obtained from screening of relevant references from reviews, and 5 were obtained from screening the references of the eligible articles (Figure 1). The 24 articles describe 22 telemedicine systems, with two articles by Hsieh [22,23] describing one system, and two articles by Wallace [24,25] describing another.
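The article flow reported above can be checked with a few lines of arithmetic. The sketch below only restates the counts given in the text; the variable names are illustrative.

```python
# Counts as reported in the text (variable names are illustrative).
records_identified = 331
eligible_from_databases = 16
eligible_from_review_references = 3
eligible_from_article_references = 5

total_eligible = (
    eligible_from_databases
    + eligible_from_review_references
    + eligible_from_article_references
)

# Two systems were each described in two articles, so each pair
# contributes one article more than it contributes systems.
articles_per_shared_system = {"Hsieh [22,23]": 2, "Wallace [24,25]": 2}
distinct_systems = total_eligible - sum(
    n - 1 for n in articles_per_shared_system.values()
)

print(total_eligible, distinct_systems)  # prints: 24 22
```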
The articles were published between 1992 and 2011, and were mostly carried out in high-income countries. They appeared in 18 different journals, two of which were from the telemedicine field and the others from medical journals (Table 1). Table 2 describes some general characteristics of the systems investigated, ordered according to their stage of development: feasibility studies; pilot or small-scale roll-out studies; and post-implementation studies. These characteristics include the conditions assessed, which belonged to different medical disciplines (mostly general traumas, followed by orthopaedic and hand injuries) and most often required the expertise of plastic surgeons, followed by radiologists and orthopaedic surgeons. Images were captured, transmitted and displayed through various technologies, and there were two types of images transmitted: radiological images and clinical photographs of the injury. As also shown in Table 2, most articles reported on management outcomes and diagnostic validity (perspective), while others assessed user satisfaction and system quality. Figure 2 represents the number of articles that report on different perspectives, within three different time periods. There is a trend whereby more recent articles seem to be focusing more on management outcomes and less on diagnostic validity.

System quality
Twelve articles assessed whether image transfer provides adequate support for injury acute care assisted by telemedicine. These articles evaluated the quality of the images [22,[26][27][28][29][30][31][32] and how long it takes to complete different steps in the telemedicine process [22,23,[32][33][34][35]. In some cases assessment of image quality was done using scales, and in others it was not clear how the assessments were made. Image quality was considered lower for telemedicine compared with original radiographs in a few of the studies [26][27][28]30]. Users in other studies expressed satisfaction with the telemedicine image quality [22,[29][30][31][32]. In one of the studies, the quality of the telemedicine images was rated lower than that of the original radiographs, although the users were still satisfied with those images [30]. Operation time (from taking the image to reception of the image) was 3 to nearly 15 minutes [22,23,32] and the time for creating a file was 3 to 7 minutes [34]. One study indicated that telemedicine radiographs took longer to read than originals [33], and another that telemedicine increased the time spent at the emergency department [35].

User satisfaction
Five articles report user data [24,25,31,32,36]. They address the ease of use [24,25,31] and the perceived usefulness of the system for clinical decision making [24,25,32,36]. Three indicate that the data were gathered by questionnaire [24,31,36], but none specifies how the questions read, what the alternative answers were, or whether the questions or answers were standardized or validated. Overall, the studies report high levels of satisfaction and perceived ease of use.

Diagnostic validity
Table 3 presents the 17 articles that assessed diagnostic validity. Eight assessed systems at the feasibility stage, and nine at the pilot and post-implementation stages. The former all used radiological images, and had different designs whereby the assessors would either assess the images by one modality (i.e. either the original or digitized radiograph) [30,37], digitized images before the original ones [27][28][29], original images before the digitized ones [26], or mixed modalities where some assessors started with original and others started with digitized radiographs [33,38]. In one article [29] the description of the radiograph was also assessed and compared to the digitized and original radiographs. Assessments by both modalities were done one directly after the other [28,29], two weeks apart [26], at least four weeks apart [33,38], or six months apart [27]. All studies used a gold standard, and employed accuracy (sensitivity and specificity) measurements, Receiver Operating Characteristic (ROC), kappa or McNemar's test.
In five of the nine other studies that assessed systems at the pilot or post-implementation stages, the assessors considered cases through the telemedicine modality only [22,23,32,35,39]; in two, both telemedicine and on-site interpretations were done one after the other [40,41]. In two articles a telephone description was compared to the telemedicine modality [31,42]. Three evaluations used a gold standard [22,23,39], and statistical analysis included kappa, correlation coefficient and descriptive statistics, and accuracy analysis (sensitivity and specificity). Results showed generally good diagnostic accuracy, except in one study [39].
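For readers less familiar with the accuracy and agreement measures named above, the following sketch shows how sensitivity, specificity and Cohen's kappa are computed from a 2x2 table. The counts are hypothetical and chosen only for illustration; they are not taken from any of the reviewed studies, and the function names are our own.

```python
def sensitivity_specificity(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Accuracy of a test reading against a gold standard.

    tp/fn: gold-standard positives read as positive/negative.
    tn/fp: gold-standard negatives read as negative/positive.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def cohens_kappa(both_pos: int, only_first_pos: int, only_second_pos: int, both_neg: int) -> float:
    """Chance-corrected agreement between two readings of the same cases."""
    n = both_pos + only_first_pos + only_second_pos + both_neg
    observed = (both_pos + both_neg) / n
    # Agreement expected by chance, from each reading's marginal totals.
    first_pos = both_pos + only_first_pos
    second_pos = both_pos + only_second_pos
    expected = (first_pos * second_pos + (n - first_pos) * (n - second_pos)) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical example: 100 cases, digitized reading vs. gold standard.
sens, spec = sensitivity_specificity(tp=45, fp=5, fn=5, tn=45)
kappa = cohens_kappa(both_pos=45, only_first_pos=5, only_second_pos=5, both_neg=45)
print(round(sens, 2), round(spec, 2), round(kappa, 2))
```

Sensitivity and specificity require a gold standard, whereas kappa only measures agreement between two readings; this is one reason why the absence of a gold standard limits what a study can conclude about diagnostic validity.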
The main limitation of the evidence at hand was that seven of the studies did not specify a gold standard [29,31,32,35,[40][41][42] and that four studies did not use statistical tests to validate the diagnosis [31,32,35,40]. Convenience sampling was often used, in some studies clearly described [23,27,28,30,32,33,38], but in others not [22,26,29,37,40,41]. Even if this limits the general representativeness of the studies, it may reflect specific or complicated diagnoses. Some of the articles did not clearly describe how the studies were performed [22,23,26,28,40,41], which made it difficult to review the rigor of those studies.

Management outcomes
Table 4 presents the 16 articles that assess the effect of image-based telemedicine on the clinical management of patients. In these articles, management plans after viewing digitized images were compared with written or oral descriptions [24,25,29,31,36,[42][43][44], original radiographs or on-site examination [29,40,41,43], and video [45]. Others estimated the consequence of misdiagnosis [39], or compared to the management suggested by the referring doctor [32].
Clinical outcome was assessed only in one of the recent studies [42]. In this article, mortality and Glasgow Outcome Score (GOS) at 6 months were compared between the patients who were transferred following telephone consultation and those transferred following telemedicine consultation (including images). The proportion of poor outcome (dead, vegetative or severely disabled) was higher in the group without telemedicine (32.1% vs 25.8%), but this difference was not significant. Overall mortality in both groups was the same (14.3%).
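To illustrate why a difference of this size can fail to reach significance, the sketch below runs a pooled two-proportion z-test. The group sizes are hypothetical (the text above does not report them), and the test actually used in the cited study is not stated here, so this is an assumption-laden illustration rather than a re-analysis.

```python
import math


def two_proportion_z_test(events_a: int, n_a: int, events_b: int, n_b: int) -> tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical groups of 100 patients each, with proportions of poor
# outcome close to those reported (about 32% vs 26%).
z, p_value = two_proportion_z_test(events_a=32, n_a=100, events_b=26, n_b=100)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

With groups of this assumed size, a gap of about six percentage points is well within what chance alone could produce; detecting it reliably would require considerably larger samples.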

Main findings
This review dealt specifically with systems based on the transfer of images as a means of consultation on acute injuries of various kinds. To date, by and large, those systems use above all radiologic images, they are evaluated at a feasibility stage or during small-scale pilot testing, and are put in place in a limited number of countries, all of which are high income. None of them are prehospital.
The impact of the systems on diagnostic validity and management outcomes is often studied (see below). As is the case in other fields of telemedicine, however, the data at hand are less informative regarding both system quality and user satisfaction, and we found only anecdotal economic evaluations [22][23][24]31] although methodological examples are available in the literature [46]. This may be due in part to the short life of some of the systems evaluated, but these knowledge gaps are now regarded as research areas to receive priority so as to allow policymakers and health care planners to make informed decisions (not least in low- and middle-income countries) [47][48][49]. Partly as a consequence, no standard methods of measurement emerge as regards system quality or user satisfaction. From the reports available on user satisfaction for instance [24,25,31,32,36], ease of use and usefulness of telemedicine are the two aspects studied and the studies are difficult to reproduce.
As observed in previous reviews concerned with telemedicine for the support of medical care in general [1][2][3][4][5][6][7][8][9] or in some reviews for emergency care of injuries in particular [15,16], the quality of the evaluations performed to date is somewhat poor, sometimes by using inadequate or no gold standard or an imprecise reference to validate the system, sometimes by their limited sample size and inappropriate statistical methods, and sometimes even by their poor reproducibility.

Diagnostic validity
Whether image-based telemedicine in acute care yields accurate clinical diagnosis was investigated for 16 systems and the majority relied solely on radiological images, some for injuries in general and others for specific body parts (e.g., hand or head). All but two feasibility studies [29,38] involved accuracy assessments. They were of varying size in terms of number of cases and a gold standard was used in all instances. Evaluations of systems at the small-scale phase [22,23,31,32,35,[39][40][41] were more inclined to use a gold standard over time and in particular when they were of larger size (number of cases/images). The implemented system evaluated [42] did not use a gold standard.
Not surprisingly, the general impression is that transmitted images, above all radiological ones and of a variety of body parts, can be accurately interpreted by specialists and that this has become more evident over time, i.e. while the technology itself allowed for better pictures, transmission and reading conditions. This finding can be interpreted as if consulting a radiologist or specialist is (has become) as accurate when using transmitted images as when using original ones. It is also of note that factors like age and experience of the teleexpert may impact on the level of accuracy just as do some characteristics of the injury.

Management outcomes
Whether telemedicine affected patient management was investigated for 15 systems and all but one of them involved radiological images, some for injuries in general and others for specific body parts (e.g., hand or head). At the feasibility stage, three systems [29,36,45] out of 10 were assessed for their potential influence in that respect. All small-scale system implementations assessed patient management (9 systems; [22,31,32,34,35,[39][40][41]43]) and so did the three systems evaluated once rolled out (4 articles) [24,25,42,44]. The general impression is that consultation by telemedicine contributes to a change in management plan, including accuracy of triage/referral or a given treatment plan/procedure. This can be interpreted as meaning that consulting a radiologist or specialist influences the management decisions made in acute care regarding injured patients. This in turn is conditional on transmitted images being interpreted as accurately as original ones (in the case of radiology) or as when seeing the patient at the bedside. Some of the data supporting this finding are perceptual (point of care or expert) and others are factual, in that a change was reported or observed.
The data at hand, however, are of relatively poor quality, with limitations in the use of a gold standard, in study size (8 evaluations being based on fewer than 50 cases, 2 having between 80 and about 100, and the remaining 4 having over 150 cases), and in statistical methods.
The review was limited in that most studies came from high-income countries and may not be representative of the conditions prevailing in low- and middle-income countries, where this kind of research is much needed [47,50]. The search is based on well-established databases commonly used in similar types of reviews. We may have missed some studies captured in other types of databases but we assume that this loss is most likely to be small given the broad scope of the search itself. Furthermore, the review is restricted to articles written in the English, French, Spanish, German or Nordic languages. We also acknowledge the high likelihood of publication bias in favour of studies showing a positive effect of telemedicine systems, which affects the state of knowledge [47]. Unfortunately, we were not able to describe and compare the technical features of the systems as much as we had expected at the beginning of the review process. The studies were published in different types of journals, but mainly medical ones, and the level of detail was very uneven. It goes without saying that it would be a great contribution to this field of research, and of practice, if there were clear criteria to be met for the description of the systems evaluated.

Way forward
As availability of telecommunication and information technology expands, and penetration into low- and middle-income countries increases, image-based telemedicine can play a key role in increasing access to expert advice in the acute care of injured patients. However, current evidence is generally of low methodological quality and is limited in focus. At a time of increasing burden of injury in many parts of the world, the literature is still too incomplete to facilitate the scale up of injury-based acute care telemedicine systems.
- Studies are needed to inform program development and implementation in general (to better understand barriers to large-scale implementation) and in resource-poor settings in particular (where such systems are most urgently needed) [48,50,51].
- Research in this field needs to pay greater attention to user perspective (both healthcare professionals and patients) [48,51]. Failure to do so is a major threat to sustainability, as user acceptance is a prerequisite to implementation.
For scaling up telemedicine programs, several authors emphasize the importance of a common architectural design and interoperability of initiatives into existing health services [47,[50][51][52][53]. National policies to ensure patient security and liability [51,53] and liaison of public and private partnerships [50][51][52][53][54] are other important elements for a broadening of initiatives. Policies could also ensure that strategies for monitoring and evaluation are included in the planning [52]. The creation of standard methods, instruments and measures would greatly assist interoperability and reproducibility of the myriad programs in use and being developed.

Conclusions
The present systematic review shows that image-based telemedicine systems for injury emergency care tend to support valid diagnosis and influence patient management. However, the current evidence is generally of low methodological quality and is limited in focus. User and system quality aspects are poorly documented, both of which affect scale up of such programs. Further work is required on quality, interoperability, and scalability.