Clinical Importance of the Heel Drop Test and a New Clinical Score for Adult Appendicitis

Objective: We aimed to evaluate the accuracy of the heel drop test in patients with suspected appendicitis and to develop a new clinical score, incorporating the heel drop test and other parameters, for the diagnosis of this condition. Methods: We performed a prospective observational study of adult patients with suspected appendicitis at two academic urban emergency departments between January and August 2015. The predictive characteristics of each parameter, including the heel drop test, were calculated. A composite score was generated by logistic regression analysis, and its performance was compared to that of the Alvarado score. Results: Of the 292 enrolled patients, 165 (56.5%) had acute appendicitis. The heel drop test had a higher predictive value than rebound tenderness. The variables included in the new (MESH) score, with their points, were pain migration (2), elevated white blood cell (WBC) count >10,000/μL (3), shift to left (2), and positive heel drop test (3). The MESH score had a higher area under the receiver operating characteristic curve (AUC) than the Alvarado score (0.805 vs. 0.701). Scores of 5 and 8 were chosen as cut-off values; a MESH score ≥5 performed better than an Alvarado score ≥5, and a MESH score ≥8 performed better than an Alvarado score ≥7, in diagnosing appendicitis. Conclusion: MESH (migration, elevated WBC, shift to left, and heel drop test) is a simple clinical scoring system for assessing patients with suspected appendicitis and is more accurate than the Alvarado score. Further validation studies are needed.


Introduction
Acute appendicitis is one of the most common abdominal surgical emergencies presenting at the emergency department (ED) [1][2][3]. Despite the increasing availability of ultrasonography and computed tomography (CT), clinical examination remains the cornerstone of the diagnostic process when patients present with right lower quadrant pain. Recent guidelines recommend the establishment of local pathways for the diagnosis of acute appendicitis and note that the combination of pain characteristics, tenderness, and laboratory evidence of inflammation identifies most patients with suspected appendicitis [4]. Physical examination may reveal signs of peritoneal irritation in the right lower quadrant or diffusely. In addition, other signs such as the obturator sign, psoas sign, or Rovsing's sign may be associated with appendicitis depending on the location of the inflamed appendix. However, these signs are only weakly predictive of appendicitis [5]. The heel drop test has been shown to be superior to the older rebound test for detecting intraperitoneal inflammation since it is more objective and less subject to misinterpretation [6]. However, only one study, conducted in Turkey, has examined the usefulness of the heel drop test as a clinical indicator of acute appendicitis [7]. Other diagnostic strategies include the use of scoring systems, of which the Alvarado score, derived from retrospectively collected data from 305 adult patients in the mid-1980s, is the best-known clinical prediction rule for estimating the risk of appendicitis [8][9][10][11][12]. This score is calculated from symptoms, physical examination, and basic laboratory data and ranges from 0 to 10. The original study of this system reported a sensitivity of 81% and a specificity of 74% in identifying patients who needed an appendectomy, and subsequent validation studies have shown variable performance of this score [13][14][15].
The modified Alvarado score uses the same value categories without the shift to left of leukocytosis, ranging from a score of 0 to 9 [16]. Patients with an Alvarado score <5 or a modified Alvarado score <4 are considered to be at low risk for appendicitis.
The primary aim of our present study was to evaluate the accuracy of the heel drop test as a clinical factor in acute appendicitis. We compared its performance with that of other well-known physical examination findings of appendicitis. We also aimed to develop a new clinical score for adult appendicitis that includes the heel drop test as a variable, and to compare the reliability of the new score to that of the Alvarado score.

Materials and Methods

Patients
This study was approved by the Institutional Review Boards of each participating hospital (Asan Medical Center and Ulsan University Hospital), and written informed consent was obtained from the enrolled patients or from guardians on their behalf. We conducted a prospective observational study of consecutive patients who visited the ED of two large, urban, tertiary referral hospitals with symptoms suggestive of acute appendicitis from January 1st to August 31st, 2015. All patients who presented to the ED with abdominal pain and right lower quadrant direct tenderness, and who underwent contrast-enhanced abdominal CT, were enrolled. Patients younger than 17 years, those who were pregnant, and those with renal insufficiency or other contraindications for contrast-enhanced CT scans were excluded from the analysis.
Standard data including demographic, clinical, and laboratory information were collected. The Alvarado score was retrospectively calculated after the end of data collection, and was not used to help predict the likelihood of acute appendicitis. After completing the data entry, abdominal CT with intravenous non-ionic contrast material was performed in all patients and reviewed by board-certified attending radiologists. The decision to operate was made by the surgeon on duty on the basis of clinical impression and abdominal CT scan results. The main outcome was the presence or absence of acute appendicitis based on surgical findings. Appendicitis was considered complicated when perforation or a periappendiceal abscess was present. Diagnoses of patients who did not undergo surgical exploration were made by CT findings. Final diagnoses were classified into three groups: normal appendix, uncomplicated appendicitis, and complicated appendicitis.
Physical examinations including rebound tenderness, muscular defense, psoas sign, Rovsing's sign, obturator sign, and the heel drop test were performed by the emergency physicians on duty, who had completed education and training in standardized physical examination for the study. Patients were asked to look at the face of the physician running the test, stand on their toes on a smooth surface, and then come down with all their weight on their heels [6,7]. Findings indicative of perceived pain during this exercise were recorded as a positive heel drop test.

Derivation of a new clinical score
The diagnostic score was constructed by backward logistic regression analysis including variables in the Alvarado score and the heel drop test, along with other clinical data with statistical significance. Points for the score were weighted by the odds ratio (OR) rounded to the nearest integer. A new score was calculated for each patient by summing the weighted scores when variables were present.
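The weighting rule described above can be illustrated with a minimal Python sketch. The odds ratios are those reported in this paper for the four retained predictors; the dictionary keys, the function name `mesh_score`, and the example patient are hypothetical names introduced only for illustration, not part of the study's analysis code.

```python
# Odds ratios reported in this paper for the four retained predictors.
# The key names are hypothetical labels chosen for this sketch.
ODDS_RATIOS = {
    "pain_migration": 2.44,
    "wbc_over_10000": 3.38,
    "shift_to_left": 2.35,
    "heel_drop_positive": 3.43,
}

# Points are the odds ratios rounded to the nearest integer.
POINTS = {name: round(odds_ratio) for name, odds_ratio in ODDS_RATIOS.items()}


def mesh_score(findings):
    """Sum the points of every predictor marked present (True) in `findings`."""
    return sum(points for name, points in POINTS.items() if findings.get(name, False))


# Hypothetical patient: pain migration and a positive heel drop test only.
example = {"pain_migration": True, "heel_drop_positive": True}
print(POINTS)               # {'pain_migration': 2, 'wbc_over_10000': 3, 'shift_to_left': 2, 'heel_drop_positive': 3}
print(mesh_score(example))  # 5
```

With all four predictors present the sum is 10, which matches the reported score range of 0 to 10.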

Statistical analysis
Frequency tables were calculated for categorical variables, and the mean ± standard deviation or median (interquartile range) for continuous variables. The results of the logistic regression analysis are presented as ORs and corresponding 95% confidence intervals (CIs). Model calibration (goodness-of-fit) was assessed using the Hosmer-Lemeshow test. The scores were compared by receiver operating characteristic (ROC) analysis, and the area under the ROC curve (AUC) was determined. Two different cutoff values were determined using ROC analysis: one focusing on high sensitivity and another focusing on high specificity. Sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and positive and negative likelihood ratios (PLR, NLR) were calculated for each parameter, the new score, and the Alvarado score. All statistical analyses were performed using SPSS version 21.0 (SPSS, Chicago, IL).
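For readers less familiar with these measures, the following Python sketch shows how they follow from a 2x2 table of test result against final diagnosis. It uses the standard definitions; the function name `diagnostic_metrics` and the counts in the example are hypothetical and are not data from this study. The sketch assumes no denominator is zero.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return predictive characteristics from 2x2 counts.

    tp: test positive, disease present    fp: test positive, disease absent
    fn: test negative, disease present    tn: test negative, disease absent
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "plr": sensitivity / (1 - specificity),   # positive likelihood ratio
        "nlr": (1 - sensitivity) / specificity,   # negative likelihood ratio
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=80, fp=30, fn=20, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike PPV and NPV, the likelihood ratios are built only from sensitivity and specificity, so they do not depend on how common appendicitis is in the sample.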

Results
Patients with acute appendicitis were older than those with other diagnoses. The proportion of patients with acute appendicitis was higher in men than in women, and in patients with pain migration than in those without. White blood cell (WBC) count, proportion of neutrophils, and C-reactive protein concentration were higher, and rebound tenderness, positive psoas sign, Rovsing's sign, and positive heel drop test were significantly more common, in patients with acute appendicitis than in those with other diagnoses. Meanwhile, no significant difference was found in the risk of appendicitis between cases with and without anorexia, nausea or vomiting, muscular defense, or obturator sign. Body temperature also did not differ significantly between the groups (Table 1). Features in the Alvarado score and the heel drop test were compared between patients with complicated and uncomplicated appendicitis; however, none of these features showed a significant difference (Table 2).
Variables for the new score and the respective points are presented in Table 3. The Hosmer-Lemeshow test showed that the model fitted the data well (P = 0.949). The total points of the score ranged from 0 to 10. The new score was compared to the Alvarado score using AUC analysis. In this comparison, the MESH score had a higher AUC than the Alvarado score [AUC 0.805 (95% CI 0.754-0.855) vs. AUC 0.701 (95% CI 0.642-0.761)] (Fig 1). Based on ROC analysis, scores of 5 and 8 were chosen as cut-off values. A box plot showed that a score of 5 accurately differentiated patients with and without acute appendicitis, regardless of the presence of complications (Fig 2).
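As an aid to interpreting the AUC values, the sketch below computes an AUC directly from its rank-based definition: the probability that a randomly chosen patient with appendicitis has a higher score than a randomly chosen patient without, with ties counted as one half. This is a generic illustration in Python, not the SPSS procedure used in the study; the function name `auc` and both score lists are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case scores higher, counting ties as half a pair."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical scores, for illustration only (not study data).
with_appendicitis = [10, 8, 8, 5, 3]
without_appendicitis = [5, 3, 2, 0]
print(auc(with_appendicitis, without_appendicitis))  # 0.9
```

An AUC of 0.5 corresponds to a score that discriminates no better than chance, and 1.0 to perfect separation of the two groups.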
To determine the predictive characteristics of each parameter, the MESH score, and the Alvarado score, we calculated sensitivity, specificity, PPV, NPV, PLR, and NLR. The heel drop test had higher sensitivity and specificity than rebound tenderness. A MESH score ≥5 showed higher sensitivity, specificity, PPV, NPV, and PLR, and lower NLR, compared to an Alvarado score ≥5, and similar results were seen when a MESH score ≥8 was compared to an Alvarado score ≥7. In male patients with the lower cutoff (MESH score ≥5 and Alvarado score ≥5), the MESH score showed higher sensitivity than the Alvarado score (0.90 vs. 0.71). However, in female patients with the lower cutoff, the difference in sensitivities was small (0.82 vs. 0.78). In both male and female patients with the higher cutoffs (MESH score ≥8 and Alvarado score ≥7), the differences in specificities were small (male: 0.88 vs. 0.83; female: 0.92 vs. 0.89) (Table 5).

Discussion
The MESH score was found in our current analysis to be more accurate than the Alvarado score in classifying patients with appendicitis. It also performed better in subgroup analysis for male and female patients. The superior performance of the MESH score may be attributable to the inclusion of the heel drop test, which is more objective and less subject to misinterpretation than the older rebound test [6]. In addition, excluding variables with lower sensitivity such as anorexia and elevated temperature (≥37.3°C) may have contributed to its accuracy. The strength of the MESH score may also be partially due to the use of prospectively collected data from all patients with abdominal pain and right lower quadrant direct tenderness, rather than only those with acute appendicitis confirmed at surgery. In cases that present with abdominal pain without tenderness, referred pain or other causes of abdominal pain should be considered first. We therefore confined patient enrollment to those in whom acute appendicitis was suspected as the first impression, i.e., those with right lower quadrant direct tenderness.
Numerous studies have examined the value of the Alvarado score and the modified Alvarado score in the prediction of acute appendicitis [16][17][18]. A systematic review of published data showed that the score is most useful in ruling out appendicitis, and a score below 5 has a sensitivity of 94-99% for appendicitis not being present [15]. However, a recent study performed at two academic urban EDs in the United States criticized the low sensitivity of 72% for the low-risk Alvarado score as insufficient to safely discharge patients without additional diagnostic testing [19]. In our current study, the results were similar, with a sensitivity of 75% and a specificity of 50% in the low-risk group according to the Alvarado score. Several attempts have been made to refine the variables in the Alvarado score. One study from Hungary tried to modify the score for easier utilization by adding ultrasound investigation as a score variable [20]. The authors reported an AUC increase from 0.749 to 0.899 after addition of the ultrasound variable. However, routinely adding imaging results is not always feasible or practical. Another study from Turkey tried to improve the accuracy of the modified Alvarado score by adding 'tenesmus' as a variable [21]. However, that study classified patients into two groups (score ≥7 vs. <7), which differs from the current trend of using three risk groups and limits its application.
In our study, the heel drop test was included as a variable and improved the score's performance compared to the Alvarado score. Few publications have reported an association between the heel drop test and acute appendicitis [22,23]. To the best of our knowledge, only one study from Turkey has shown improvement in diagnostic accuracy when classical examination methods are accompanied by a positive heel drop test [7]. In that study, a positive heel drop test had an OR of 2.51 for appendicitis. The combination of right lower quadrant pain, WBC ≥11,950/μL, and heel drop test positivity led to an increase in the diagnosis of appendicitis by almost 8.14- and 22.12-fold in men and women, respectively. In this study, we determined that pain migration, WBC >10,000/μL, shift to left, and positive heel drop test increased the acute appendicitis risk by 2.44-, 3.38-, 2.35-, and 3.43-fold, respectively. A positive heel drop test showed the highest OR among the parameters included in the model. However, compared to WBC and shift to left, the heel drop test showed lower sensitivity, meaning it is not useful as a single rule-out parameter for acute appendicitis. While the heel drop test was originally introduced to evoke peritoneal irritation by moving intraperitoneal contents up and down and to detect the presence or absence of peritonitis, especially in acute appendicitis, it is also considered the most sensitive test for meningitis [24]. Although it is similar to rebound tenderness, the heel drop test may elicit tenderness more easily when the patient has firm abdominal wall muscles. A modified heel drop test can be performed by hitting the bottom of the patient's heel with the examiner's hand while the patient remains in the supine position [25]. This procedure transmits a vibration to the patient's inflamed peritoneum and elicits right lower quadrant pain.
Our study has several limitations. Although this was a dual-center study with different patient populations, it is possible that our results may not be generalizable to other settings. Since we only enrolled patients in whom the first impression was acute appendicitis (positive right lower quadrant direct tenderness), this score may not be applicable to appendicitis patients without right lower quadrant tenderness, whose diagnoses are usually made using advanced imaging modalities such as ultrasound or computed tomography. The duration of abdominal pain before ED presentation was not recorded for our patients, and this could certainly have affected the patients' clinical presentation as well as their laboratory results, such as WBC count and its differential [26]. In addition, the interrater reliability for the different variables was not verified in our present study, and variation in the interpretation of physical examination findings could have existed. However, we attempted to mitigate this effect through education and training in standardized physical examination before conducting this study. Finally, appendicitis could have gone undetected in the group of non-operated patients, and therefore the reliability of the MESH score in these patients cannot be evaluated. However, on discharge from the ED, patients are always encouraged to revisit if their symptoms get worse, and during the study period, no patients had two or more visits to our ED.

Conclusions
In summary, we aimed to determine the impact of the heel drop test on the diagnosis of acute appendicitis. The heel drop test showed better predictive characteristics than rebound tenderness. By combining various significant parameters, the MESH (migration, elevated WBC, shift to left, and heel drop test) score was found to predict acute appendicitis more accurately than the Alvarado score. Our observations warrant future studies to refine the variables and their cutoffs, and validation studies to improve the clinical diagnosis of acute appendicitis.