Imaging Non-Specific Wrist Pain: Interobserver Agreement and Diagnostic Accuracy of SPECT/CT, MRI, CT, Bone Scan and Plain Radiographs

Purpose: Chronic hand and wrist pain is a common clinical issue for orthopaedic surgeons and rheumatologists. The purpose of this study was (1) to analyze the interobserver agreement of SPECT/CT, MRI, CT, bone scan and plain radiographs in patients with non-specific pain of the hand and wrist, and (2) to assess the diagnostic accuracy of these imaging methods in this selected patient population.
Materials and Methods: Thirty-two consecutive patients with non-specific pain of the hand or wrist were evaluated retrospectively. All patients had been imaged by plain radiographs, planar early-phase imaging (bone scan), late-phase imaging (SPECT/CT including bone scan and CT), and MRI. Two experienced and two inexperienced readers analyzed the images with a standardized read-out protocol. Reading criteria were lesion detection and localisation, type and etiology of the underlying pathology. Diagnostic accuracy and interobserver agreement were determined for all readers and imaging modalities.
Results: The most accurate modality for experienced readers was SPECT/CT (accuracy 77%), followed by MRI (56%). For inexperienced readers, SPECT/CT was also the best performing modality, although its accuracy was low (44%), followed by MRI and bone scan (38% each). The interobserver agreement of experienced readers was generally high in SPECT/CT concerning lesion detection (kappa 0.93, MRI 0.72), localisation (kappa 0.91, MRI 0.75) and etiology (kappa 0.85, MRI 0.74), while MRI yielded better results on typification of lesions (kappa 0.75, SPECT/CT 0.69). There was poor agreement between experienced and inexperienced readers in SPECT/CT and MRI.
Conclusions: SPECT/CT proved to be the most helpful imaging modality in patients with non-specific wrist pain. The method was found reliable, providing high interobserver agreement, and was outperformed by MRI only concerning the typification of lesions. We believe it is beneficial to integrate SPECT/CT into the diagnostic imaging algorithm of chronic wrist pain.


Introduction
Chronic hand and wrist pain is a common clinical issue for hand surgeons, orthopaedic surgeons and rheumatologists. Physicians are often challenged by symptoms that are hard to localize precisely and may change over time. After the clinical examination, the imaging work-up usually starts with plain radiographs. Cross-sectional modalities such as MRI (magnetic resonance imaging) are frequently performed subsequently. MRI has the major advantage of showing early damage to intra-articular and extra-articular soft tissue such as cartilage, ligaments and tendons, which is a common reason for wrist pain. Therefore, MRI is recommended by expert consensus opinion if radiographs are negative [1]. However, some patients still lack an appropriate diagnosis, and hence adequate therapy, after MRI. Identifying the responsible pathology is especially difficult in patients with clinically non-specific wrist pain and multiple lesions.
Hybrid SPECT/CT (single photon emission computed tomography / computed tomography), which has emerged in recent years, provides information about both the morphological structure and the metabolic activity of lesions, and allows for an exact anatomical localisation of pathological lesions [2,3].
The accuracy of observers and the interobserver agreement in MRI depend on the type of lesion in the hand and wrist, and may vary even among experienced radiologists [4,5]. In a recent pilot study, it was demonstrated that SPECT/CT is more specific than MRI concerning the detection of clinically relevant lesions in patients with non-specific wrist pain [6]. It was also shown that SPECT/CT has a higher interobserver and intraobserver reliability than CT, bone scan, or a combination of both in patients with non-specific pain of the foot and ankle [7]. To date, the experience with SPECT/CT of the hand and wrist is limited, and there are no major studies focusing on the interobserver agreement of SPECT/CT compared with other established imaging modalities.
Thus, the purpose of this study was (1) to analyze the interobserver agreement of SPECT/CT, MRI, CT, bone scan and plain radiographs in patients with non-specific pain of the hand and wrist, and (2) to assess the diagnostic accuracy of these imaging methods in this selected patient population.

Patients
Thirty-two consecutive patients (median age: 38 years, range 18 to 73 years, 19 females, 13 males) with non-specific pain of the hand or wrist were included. Ethical approval was waived by the approving IRB (Cantonal Ethics Committee) due to the retrospective nature of the study. For the same reason, written consent was not obtained from the subjects; this requirement was likewise waived by the approving IRB. The diagnosis of non-specific wrist pain was made by the referring hand surgeon, based on patient history, clinical examination, plain radiographs and clinical guidelines [8]. Conservative therapy failed to improve the symptoms in all patients. Clinical symptoms persisted in all patients at the time of every imaging procedure.
SPECT/CT imaging was performed on a hybrid SPECT/CT system with a built-in flat-panel CT component (BrightView XCT, Philips Healthcare) after injection of a mean activity of 650 MBq 99mTc-DPD (technetium-99m-3,3-diphosphono-1,2-propanedicarboxylic acid, Teceos, Behringwerke, Marburg, Germany). Early-phase planar images were acquired directly after injection for 5 minutes (matrix 256x256mm, FOV 40cm). Late-phase images (matrix 256x256mm, FOV 40cm), SPECT (matrix 512x512mm, FOV 40cm) and CT images (matrix 512x512mm, FOV 40cm, 120 kV, 85 mAs, automated dose modulation, 0.5s rotation time, slice thickness 0.33mm) were acquired after 3 hours. SPECT and CT images were reconstructed by iterative reconstruction, CT images in all three planes as 1 mm slices. SPECT and CT images were fused by an automated software algorithm on a dedicated workstation (Extended Brilliance Workspace, Philips Healthcare). Both planar bone scans and CT for read-out were derived from the examination performed on the SPECT/CT machine.

Image evaluation
A region-based evaluation of all five modalities was carried out by four radiologists and/or nuclear medicine physicians using the local PACS (Merlin PACS, Phönix-PACS, Freiburg, Germany). Reader 1 (L.W.) had one year of experience in radiology, including 4 months of CT. Reader 2 (M.P.) had four years of experience in radiology, including two years in CT, and nine years in nuclear medicine. Reader 3 (K.S.) had 12 years of experience in radiology and 15 years in nuclear medicine. Reader 4 (A.B.) had nine years of experience in radiology. All images were analyzed independently, and in a blinded and randomized fashion. The readers were provided with a brief clinical history, including information about prior trauma, the time of onset and the location of symptoms, whether the dominant or the non-dominant wrist was affected, and whether symptoms occurred during exercise or also at rest. Readers were advised to focus on the relevant lesion, i.e. the lesion responsible for the patient's symptoms and requiring treatment or a change of treatment.
Several categories of pathology were assessed:
• detection of the clinically relevant lesion(s) (yes/no),
• its exact location or joint site (designation of the respective location on a standardized read-out form with 74 possible sites per hand/wrist),
• the predominant type (bone, cartilage, synovia, tendon, ligaments, capsule, or any combination), and
• the assumed etiology of pathology (posttraumatic, degenerative, primary constitutional, inflammatory, stress, vascular, or any combination).
The standard of reference consisted of complete clinical examination and all imaging procedures performed, with a mean clinical follow-up period of 20 months in 21 patients. During this period of time, every patient had at least one further clinical examination by a hand surgeon. The remaining eleven patients additionally underwent arthroscopy during their mean clinical follow-up period of 16 months. If complete clinical and diagnostic work-up failed to detect the cause of the pain in a patient, the standard of reference was rated as negative. Hence, if the readers rated an imaging method as negative in this patient, the result was defined as true negative. Results were expressed as mean detection rate, meaning that, for example, a lesion detected correctly by one of two readers was assigned a mean detection rate of 50%.
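The mean detection rate described above can be illustrated with a minimal sketch. The function names and the example readings are hypothetical and only mirror the definition given in the text (one of two readers correct corresponds to 50%); they are not taken from the study data.

```python
def mean_detection_rate(reader_hits):
    """Fraction of readers in a group who detected the relevant lesion correctly.

    reader_hits: list of booleans, one entry per reader (hypothetical input format).
    """
    if not reader_hits:
        raise ValueError("at least one reader is required")
    return sum(reader_hits) / len(reader_hits)


def group_detection_rate(per_lesion_hits):
    """Average of the per-lesion mean detection rates for one reader group."""
    rates = [mean_detection_rate(hits) for hits in per_lesion_hits]
    return sum(rates) / len(rates)


# Hypothetical readings of three lesions by two readers of the same group.
example = [[True, False], [True, True], [False, False]]
print(mean_detection_rate(example[0]))  # 0.5 (one of two readers correct)
print(group_detection_rate(example))    # 0.5 (mean of 0.5, 1.0 and 0.0)
```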

Statistical analysis
The interobserver agreement was determined by calculating the kappa for denominating the relevant lesion, and location, type and etiology of pathology (see evaluation above). Results of each diagnostic method were subdivided according to the level of training of the readers (readers with expert / basic knowledge in radiology: 3, 4 / 1, 2; expert / basic knowledge in nuclear medicine: 2, 3 / 1, 4). Kappa values for all analyzed imaging criteria are presented either as the kappa for two readers (agreement between two particular readers) or as the mean of the four kappa values for each experienced versus each inexperienced reader (agreement between experienced and inexperienced readers). Kappa values including 95% bias-corrected bootstrap confidence intervals (CI) were computed using the procedure kapci in Stata 11.2 (StataCorp, College Station, Texas, USA). Agreement was defined as moderate for kappa values of 0.41-0.60, substantial for 0.61-0.80, and almost perfect for >0.80 [9].
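As an illustration of the agreement statistic, the following sketch computes Cohen's kappa for two readers and maps it to the agreement categories named above. It is not the Stata kapci procedure used in the study: the confidence interval here is a plain percentile bootstrap rather than a bias-corrected one, and the function names and example ratings are assumptions made for this example.

```python
import random
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two readers rating the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from the marginal category frequencies of both readers.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Categories as defined in the text (0.41-0.60, 0.61-0.80, >0.80)."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    return "below moderate"


def bootstrap_kappa_ci(ratings_a, ratings_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumption: simpler than the bias-corrected CI of kapci)."""
    rng = random.Random(seed)
    n = len(ratings_a)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        stats.append(cohen_kappa([ratings_a[i] for i in idx],
                                 [ratings_b[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical lesion-detection ratings (1 = lesion detected, 0 = not detected).
reader_1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
reader_2 = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 2), agreement_label(kappa))  # 0.58 moderate
print(bootstrap_kappa_ci(reader_1, reader_2))
```

With only ten hypothetical cases the bootstrap interval is very wide, which is one reason small reader studies report confidence intervals alongside kappa.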
The diagnostic accuracy and the area under the receiver operating characteristics (ROC) curve were calculated for detecting the relevant lesion according to the standard of reference mentioned above. A score was collected for experienced and inexperienced readers concerning lesion detection and localization, broken down by lesion type and geographic distribution, respectively. Differences between experienced and inexperienced readers in lesion detection were analyzed by the Wilcoxon signed-rank test, based on a mean detection rate calculated for each reader group. A p-value of <0.05 was considered statistically significant.
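The accuracy measures named above can be sketched from a 2x2 table of readings against the standard of reference. The function name and the counts are hypothetical. The text does not state how the area under the ROC curve was obtained; the sketch assumes a binary (yes/no) rating, for which the trapezoidal area reduces to the mean of sensitivity and specificity.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table (reading vs. standard of reference)."""
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("need at least one positive case, one negative case and one positive reading")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        # ROC curve through (0,0), (1-specificity, sensitivity), (1,1):
        # its trapezoidal area equals (sensitivity + specificity) / 2.
        "auc_binary": (sensitivity + specificity) / 2,
    }


# Hypothetical counts, not taken from the study.
metrics = diagnostic_metrics(tp=8, fp=1, fn=2, tn=9)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```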

Clinical follow-up
Twenty-seven of the 32 patients had a lesion that was responsible for their symptoms, whereas in 5 patients no causative lesion could be found. An overview of all lesions and mean detection rates is given in table 1.

Diagnostic accuracy
The diagnostic accuracy of plain radiographs was insufficient in all readers (range 25% to 31%). Bone scans also performed poorly, with the highest diagnostic accuracy achieved being 53% (experienced reader). In CT, the highest diagnostic accuracy was 41% (experienced reader). In contrast, SPECT/CT performed well, yielding diagnostic accuracies of 75% to 78% for experienced readers, and still 41% to 47% for inexperienced readers. MRI, however, yielded insufficient diagnostic accuracy (experienced readers: range 53% to 59%; inexperienced readers: 25% to 50%). SPECT/CT was significantly more accurate for experienced than for inexperienced readers (p < 0.001), while the difference was of borderline significance for MRI (p = 0.06). The most specific modality for the evaluation of patients with clinically non-specific wrist pain by experienced readers was SPECT/CT (mean specificity 90%), whereas MRI yielded only poor specificity (10%). These two modalities were the only ones with reasonable sensitivity (SPECT/CT 74%, MRI 65%). Detailed results are listed in table 2.
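The reported figures for experienced readers can be cross-checked against each other, since accuracy is the prevalence-weighted mean of sensitivity and specificity. The check below assumes that the percentages are pooled over both experienced readers, who read the same 32 patients, of whom 27 had a causative lesion; the inputs are the rounded values from the text, so small deviations are expected.

```latex
\begin{align*}
p &= \tfrac{27}{32} \approx 0.844 \quad \text{(proportion of patients with a causative lesion)} \\
\text{accuracy} &= \text{sensitivity} \cdot p + \text{specificity} \cdot (1 - p) \\
\text{SPECT/CT:} \quad & 0.74 \cdot 0.844 + 0.90 \cdot 0.156 \approx 0.765 \\
\text{MRI:} \quad & 0.65 \cdot 0.844 + 0.10 \cdot 0.156 \approx 0.564
\end{align*}
```

Both values are consistent with the mean accuracies of 77% for SPECT/CT and 56% for MRI reported for experienced readers.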
The correct localization of lesions (spatial accuracy) was highest in SPECT/CT, slightly outperforming MRI. In particular, lesions located in the ulnocarpal compartment (Figures 1 and 2) and DRUJ were detected correctly by experienced readers to a greater extent than by inexperienced readers, both with SPECT/CT (mean value 75% vs. 44%) and MRI (mean value 69% vs. 44%). The situation was different concerning the carpal bones: lesions in the proximal row were properly assessed only by experienced readers on MRI (mean value 70%; SPECT/CT: 40%), those in the distal row only by SPECT/CT (mean value 80%; MRI: 60%). Details are given in table 3.
Concerning the evaluation of the type of the lesions, there was no significant difference found between SPECT/CT and MRI in general. SPECT/CT performed better in the evaluation of predominantly bony lesions, while MRI showed its known strength in soft tissue pathologies.
MRI was slightly superior to SPECT/CT concerning the evaluation of the assumed etiology of lesions when read by experienced individuals (correct etiology in 39 of 2 x 27 lesion readings (72%) with MRI, and in 37 of 2 x 27 lesion readings (69%) with SPECT/CT). Generally, posttraumatic lesions were evaluated with a higher accuracy on SPECT/CT than on MRI, while degenerative lesions were evaluated more accurately on MRI than on SPECT/CT.

Interobserver agreement
Detailed results are given in table 4. Kappa values for interobserver agreement in lesion detection are depicted in Figure 3.

Table 1. Relevant lesions according to the standard of reference; mean detection rate by all modalities for experienced and inexperienced readers.

Plain radiographs. There was similar substantial agreement on detection of the relevant lesion and its location between experienced observers (kappa 0.69, 0.73, respectively), and between experienced and inexperienced observers (mean kappa 0.64, 0.62, respectively). The agreement on type and etiology of lesions was generally poor between experienced readers, as well as between experienced and inexperienced readers.
Bone scan. Bone scan yielded substantial agreement between the experienced readers on location (0.63), and moderate agreement on type (0.51) and on etiology (0.50). The agreement between both observer groups was moderate on all criteria.
CT. While the diagnostic accuracy was insufficient, there was almost perfect agreement between the experienced readers on lesion detection and location (0.87), substantial agreement on lesion type (0.74), and moderate agreement on lesion etiology (0.54). Between both reader groups, there was substantial agreement on all criteria except lesion type.
SPECT/CT. In contrast to the aforementioned modalities, the accuracy of experienced readers was relatively high in SPECT/CT, and the agreement between them was almost perfect (kappa: lesion detection 0.93, localisation 0.91, etiology 0.85). Interestingly, no agreement on any of the criteria was found between the two reader groups.
MRI. While MRI was the modality with the second best, though merely low, accuracy of experienced readers, there was substantial agreement between the experienced readers on all criteria (0.71 to 0.75). In contrast, there was no valuable agreement between experienced and inexperienced readers.

Diagnostic accuracy
The crucial point in the imaging of wrist pain is to allocate one or more causative or active lesions to a patient's symptoms. Establishing the diagnosis in morphological imaging can be difficult, especially in patients with chronic pain, because morphological changes are known to lag behind metabolic activity. In patients with low back pain due to neoplastic lesions, SPECT has been shown to have similar results as MRI [10,11].
Bone scans using Tc-99m-DPD are ideal for detecting active bone turnover. CT alone is limited in musculoskeletal conditions due to its low soft tissue contrast. Thus, isolated cartilage and ligament lesions are not easily recognized on CT without intra-articular contrast medium. In patients with acute wrist trauma and persisting pain, studies have shown that CT is superior to bone scan in diagnosing occult fractures, and slightly inferior to MRI [12,13]. However, CT is considerably less sensitive than MRI in detecting cancellous bone lesions such as trabecular fractures [14,15]. This limitation may be overcome by hybrid imaging obtained with a SPECT/CT system, combining metabolic information with precise anatomical location of tracer uptake.
As shown in another recent study [6], SPECT/CT was the most specific modality in such a selected patient population (specificity 90% in the present study), with a very high PPV of 0.98, whereas MRI was highly non-specific (specificity 10%) for experienced readers. In contrast, the specificity of MRI for inexperienced readers (80%) was slightly higher than that of SPECT/CT (60%). We conclude that if an inexperienced reader discovers a pathology on MRI, it will likely correspond to the clinically leading pathology. This especially holds true when considering that the overall mean detection rates for experienced and inexperienced readers were not significantly different (44% vs. 38%). However, the low specificity of MRI in the experienced group reveals one dilemma of this modality: because it shows many different pathologies in different compartments, it becomes hard to grade the severity of all those pathologies and then pinpoint the clinically relevant one in patients with several co-existing lesions. The human body provides many "false positives" secondary to the aging process and remote trauma. CT scans and bone scans inherently provide a more binary result, given their more limited ability to detect pathology compared with MRI. This allows for a more limited range of interpretations, contrary to MRI. SPECT/CT and MRI were the only modalities with reasonable sensitivity, with SPECT/CT slightly outperforming MRI (sensitivity: 74% vs. 65%, respectively; accuracy: 77% vs. 56%, respectively). The registration technique in hybrid SPECT/CT has already proved to be more accurate than methods used previously [16-19]. Utsunomiya and co-workers found fused SPECT and CT images especially useful for the differentiation of osteoarthritis from malignant conditions [20]. A recent study by Linke et al. also demonstrated that SPECT/CT increases the diagnostic accuracy in orthopaedic disorders of the extremities [21]. The authors reported a revision of the diagnostic category in one third of patients. These findings parallel our results. False negatives in MRI in our study were particularly lesions with bone remodelling and many similarly appearing degenerative or posttraumatic changes in one or more compartments, but with distinct tracer uptake of one of these lesions in SPECT/CT. Consequently, SPECT/CT was superior to MRI in bone-only lesions and in bone lesions combined with other types of pathology. The overall etiology of lesions was correctly identified by SPECT/CT and MRI to a similar extent, but with advantages for MRI. MRI may depict a multitude of soft tissue pathologies not detectable by CT and/or bone scan. Thus, MRI provides a much broader view than only bone changes, and can give pertinent clinical information not detectable by a CT scan or a bone scan. This can however be foiled by several co-existing lesions in the same patient, as

Interobserver agreement
The lack of exact anatomic localization of a lesion has been a major disadvantage of bone scans concerning the interobserver agreement [18,20,22]. This is due to the fact that a hot spot at one place, e.g. the lunate bone, could result from different pathologies (e.g. Kienböck's disease, osteoarthritis, fracture etc.). This limits the interpretation in the absence of sufficient anatomical and morphological information [23]. We focussed on the agreement between experienced readers, as well as on the agreement between experienced and inexperienced readers to determine if the lack of training may be overcome by combined metabolic and morphological information. Fused SPECT and CT images were already shown to increase the interobserver agreement and observer confidence in suspected neoplastic bone lesions, compared to separate sets of scintigraphic and CT images [20]. SPECT/CT was also demonstrated to have a higher interobserver agreement and intraobserver agreement than bone scan, CT and a combination of both in disorders of the foot and ankle [7]. However, MRI was not assessed in that study.
In the present study we also found that SPECT/CT is the modality with the highest agreement between experienced readers. The second highest agreement was found in CT, however, with lower accuracy. A higher interobserver agreement with values of 1.00 for TFCC and cartilage lesions and 0.89 for ligament lesions was found in the study by De Filippo and co-workers analyzing CT arthrography in patients with degenerative or posttraumatic arthropathy of the wrist [24]. These differences are probably based on the use of contrast medium. The inherent limitations of native (non-contrast) CT may be overcome by the addition of functional information in SPECT/CT, which in turn increases both interobserver agreement and accuracy. This obviously holds true only for readers with a certain level of experience, since we could not demonstrate a fair agreement on any of the analyzed criteria between readers with different levels of experience, both in SPECT/CT and MRI. However, this may not be the case when assessing only patients with degenerative lesions [7].
Table 3. Geographic distribution of relevant lesions according to the standard of reference, mean detection rate by all modalities for experienced and inexperienced readers.
Only for the typification of lesions did MRI show a marginally higher interobserver agreement for experienced readers (0.75) than SPECT/CT (0.69). However, the agreement between the experienced and inexperienced readers was again higher in SPECT/CT (for all evaluated criteria), suggesting that SPECT/CT might be at least slightly easier to understand and to read than MRI. In another study assessing cartilage lesions of the wrist, MRI had a high specificity, but low interobserver agreement [25]. A fair interobserver agreement was found for synovial lesions of the hand [26]. Finding an "active" osseous lesion in MRI is predominantly based on the presence of "bone marrow edema", a pattern of ill-defined hyperintense signal on T2-weighted images. However, this feature is considered non-specific [27-31]. Bone marrow edema and radiotracer uptake do not necessarily occur concurrently [6]. Presumably this partially explains the observed relatively high interobserver agreement but low specificity of MRI. However, it is well established that bone scan and MRI findings may have little clinical relevance, particularly in the absence of pertinent history.

Limitations
Besides the relatively small number of patients, the study was performed retrospectively. The results of all imaging modalities were partly incorporated into the standard of reference of this study, i.e. the clinical course, which introduces an inherent bias. However, this is a general limitation of retrospective diagnostic imaging studies. Another limitation may be the time gap between the examinations, with median intervals of 32 to 37 days. Lesions may certainly change during this period. However, all patients had chronic, clinically relevant symptoms at every time point of imaging. Furthermore, the scarcity of inflammatory lesions in our cohort tends to bias the results away from MRI as a useful modality.

Conclusion
Overall, SPECT/CT yielded the highest diagnostic accuracy in both experienced and inexperienced readers. The method was found reliable, providing higher interobserver agreement than all other imaging modalities, being outperformed only by MRI concerning the typification of lesions. However, training and experience are mandatory for the correct reading of SPECT/CT. SPECT/CT may be integrated into the diagnostic imaging algorithm of patients with chronic wrist pain, especially if MRI results are equivocal.