It’s how you say it: Systematic A/B testing of digital messaging cut hospital no-show rates

Failure to attend hospital appointments has a detrimental impact on care quality. Documented efforts to address this challenge have only modestly decreased no-show rates. Behavioral economics theory has suggested that more effective messages may lead to increased responsiveness. In complex, real-world settings, it has proven difficult to predict the optimal message composition. In this study, we aimed to systematically compare the effects of several pre-appointment message formats on no-show rates. We randomly assigned members from Clalit Health Services (CHS), the largest payer-provider healthcare organization in Israel, who had scheduled outpatient clinic appointments in 14 CHS hospitals, to one of nine groups. Each individual received a pre-appointment SMS text reminder five days before the appointment, which differed by group. No-show and advanced cancellation rates were compared between the eight alternative messages, with the previously used generic message serving as the control. A total of 161,587 CHS members who received pre-appointment reminder messages were included in this study. Five message frames significantly differed from the control group. Members who received a reminder designed to evoke emotional guilt had a no-show rate of 14.2%, compared with 21.1% in the control group (odds ratio [OR]: 0.69, 95% confidence interval [CI]: 0.67, 0.76), and an advanced cancellation rate of 26.3% compared with 17.2% in the control group (OR: 1.2, 95% CI: 1.19, 1.21). Four additional reminder formats demonstrated significantly improved impact on no-show rates compared to the control, though they were not as effective as the best-performing message format. Carefully selecting the narrative of pre-appointment SMS reminders can lead to a marked decrease in no-show rates.
The process of A/B testing, selecting, and adopting optimal messages is a practical example of implementing the learning healthcare system paradigm, which could prevent up to one-third of the 352,000 annually unattended appointments in Israel.

Introduction
Clalit Health Services (CHS) is the largest payer-provider healthcare organization in Israel; it provides primary, specialty, and inpatient care to over 52% of the Israeli population and has 4.4 million members. CHS's comprehensive healthcare data warehouse combines hospital and community medical records. All Israeli citizens are covered by one of four healthcare organizations, and while it is possible to switch between the organizations, membership turnover within CHS is less than 1% annually [30], allowing for consistent longitudinal follow-up. CHS's electronic health records (EHR) contain administrative and clinical data, socio-demographic information, diagnoses from community and hospital settings, recorded chronic diseases, clinical markers, and appointment-related details. All CHS members' information was extracted from CHS's EHR, as of the index date (appointment date) and from current demographics.
CHS operates an appointment reminder system that automatically sends a text message five days prior to a scheduled appointment, with a link to an internet-based system that allows the member to confirm or cancel the appointment in advance. Data from the CHS SMS appointment reminder system were retrieved and appended to the abovementioned data points.

Study population and design
The population eligible for this study included all CHS members who were 18 years old and older, with scheduled appointments between December 1, 2018 and March 31, 2019, at one of the 596 outpatient clinics within CHS's 14 hospitals. All participants had a valid cell phone number in the CHS EHR and consented to receive phone-based appointment reminders. The index date was defined as the date of the scheduled appointment (Fig 1). Randomization was performed by a randomization program, and participants were assigned to one of nine messages that were issued five days in advance of the appointment. Members with multiple appointments during the study period could receive the same or differently framed messages for each appointment.
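The per-appointment assignment described above can be illustrated with a minimal Python sketch. This is not the study's randomization program, which is not described in the text. Equal allocation probabilities, the seed, and the function names are assumptions made for illustration; the frame labels follow the names used in this article.

```python
import random
from collections import Counter

# Nine frame labels as named in the text (control plus eight alternatives).
MESSAGE_FRAMES = [
    "control",
    "standard",
    "social norm",
    "social identity",
    "emotional relatives",
    "emotional guilt",
    "appointment cost",
    "professional figure",
    "personal request",
]


def assign_frame(rng: random.Random) -> str:
    """Draw one frame uniformly at random (equal allocation is an assumption)."""
    return rng.choice(MESSAGE_FRAMES)


def assign_appointments(appointment_ids, seed=0):
    """Assign a frame to each appointment independently.

    Randomizing per appointment (not per member) means a member with several
    appointments may receive the same or different frames, as in the study.
    """
    rng = random.Random(seed)
    return {appt_id: assign_frame(rng) for appt_id in appointment_ids}


# Hypothetical appointment identifiers, used only to show the group balance.
assignments = assign_appointments(range(90_000), seed=42)
counts = Counter(assignments.values())
for frame in MESSAGE_FRAMES:
    print(f"{frame:20s} {counts[frame]}")
```

With simple randomization of this kind, group sizes are close to equal but not identical; a blocked scheme would be needed to force exact balance.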

Variable definitions
Appointment reminders. CHS members were randomly assigned to one of nine possible message frames that reminded them of their upcoming appointment (see Table 1). Eight variations were designed based on the following principles: the 'social norm' versions highlighted the idea that social identity and descriptive norms potentially motivate individuals to perform certain actions [31-34]. The 'emotional' versions aimed to provoke an emotional reaction in order to prompt members to take action by mentioning people they care about [35] or to evoke feelings of sympathy or empathy [36]. The 'appointment cost' version was based on the opportunity cost effect [37]. Although members do not directly pay the healthcare organization for missed appointments, this narrative highlights the amount of money they cause the organization to lose by not showing up [38]. Both 'professional figure' and 'personal' versions relied on the messenger effect, which suggests that people's compliance with a message is affected by the figure who delivers it, for example, an actual name or authority figure rather than an automated device [39]. The ninth frame was the routinely used reminder message, in use by CHS in recent years, and was retained as the control group.
Outcomes. The primary outcome was a no-show event, defined as a scheduled appointment that a CHS member failed to attend. The no-show rates were calculated as the number of no-show appointments out of the total number of appointments scheduled. The secondary outcome was advanced cancellation, defined as a scheduled appointment that the member canceled in advance of the appointment date and time. The advanced cancellation rates were calculated as the number of cancellations out of the total number of appointment reminders sent.
Baseline measurements. Sociodemographic variables were measured at index date and included biological sex, age (years), socioeconomic status (SES; low, medium, high; based on clinic-level data), population sector (Jewish, non-Jewish), and immigrant status (immigrated to Israel, born in Israel). Clinical characteristics included smoking status (current, former, or non-smoker, as reported in the EHR), body mass index (computed from documented weight and height measurements), and Charlson Comorbidity Index (computed from risk factors to evaluate an age-comorbidity score [40]). Comorbidity variables were evaluated as of the index date, and included cardiovascular diseases (yes/no; defined as any of the following: acute myocardial infarction, unstable angina pectoris, angina pectoris, acute coronary syndrome, percutaneous transluminal coronary angioplasty, coronary artery bypass graft, ischemic heart disease, ischemic stroke), diabetes (yes/no), chronic kidney disease (yes/no; defined as the last eGFR value prior to index date less than 60 ml/min/1.73 m²), celiac disease (yes/no), and inflammatory bowel disease (yes/no) (see S1 Table). We extracted these diagnoses from community and hospital records, as well as from the CHS chronic disease registry.
The appointment characteristics included past non-attendance, i.e., 'chronic no-show' (yes/no; defined as members who missed appointments at least two times in a row within one year prior to index date), time to an appointment (calculated as the difference in days between the date of scheduling the appointment and the date of the appointment), and clicking on the link (yes/no; defined as whether the member clicked on the link in the SMS reminder message).
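The two derived appointment characteristics can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than the study's extraction code: "one year" is taken as 365 days, "in a row" is taken to mean consecutive appointments in date order, and the function names and the `(date, attended)` history format are hypothetical.

```python
from datetime import date, timedelta


def time_to_appointment(scheduled_on: date, appointment_date: date) -> int:
    """Days between the date of scheduling and the date of the appointment."""
    return (appointment_date - scheduled_on).days


def is_chronic_no_show(history, index_date: date, window_days: int = 365) -> bool:
    """Flag members with at least two consecutive missed appointments
    within the window before the index date.

    history: iterable of (appointment_date, attended) pairs; the 365-day
    window and the date-ordered notion of "in a row" are assumptions.
    """
    window_start = index_date - timedelta(days=window_days)
    prior = sorted(
        (d, attended) for d, attended in history if window_start <= d < index_date
    )
    streak = 0
    for _, attended in prior:
        # An attended visit resets the run of misses; a miss extends it.
        streak = 0 if attended else streak + 1
        if streak >= 2:
            return True
    return False


# Hypothetical member history, for illustration only.
index = date(2019, 1, 15)
history = [
    (date(2018, 3, 1), False),
    (date(2018, 6, 1), False),
    (date(2018, 9, 1), True),
]
print(is_chronic_no_show(history, index))
print(time_to_appointment(date(2018, 12, 1), index))
```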

Statistical analysis
Socio-demographic characteristics, clinical, and appointment-related variables were calculated within the nine different SMS groups. Summaries of continuous variables are presented as means and standard deviations unless skewed, and in that case, are presented as medians and interquartile ranges. Categorical variables are presented as absolute numbers and percentages, as appropriate.
In order to assess whether any of the message frames caused a lower risk for no-shows compared to the currently existing message frame, multinomial testing was performed. Univariate and multivariate analyses were performed using binary logistic regression models that accounted for the features (i.e., sociodemographic, clinical, and appointment-related variables) determined via an automated generic framework to be most predictive of no-show behavior.
Message frames were considered as treatment variables, with the existing message serving as the reference group and the record of attendance as the binary outcome variable. Secondary analyses were conducted to assess the effect of the different message frames on canceling appointments in advance.
Statistical analyses were conducted using the R language (version 3.5.3, R Foundation for Statistical Computing, Vienna, Austria). All statistical tests were 2-tailed, and a 5% significance threshold was maintained.
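For readers who want to see how an unadjusted odds ratio of the kind reported here is obtained, the following Python sketch computes an OR with a Wald 95% confidence interval from a 2x2 table. The study's analyses were run in R, so this is not the authors' code. For a single binary treatment indicator, this OR equals the exponentiated coefficient of a univariate logistic regression. The counts are hypothetical and are not taken from Table 3; the function name is an assumption.

```python
import math


def odds_ratio_wald_ci(events_treated, n_treated, events_control, n_control, z=1.96):
    """Unadjusted odds ratio of an event (e.g., a no-show) for one message
    frame versus the control, with a Wald confidence interval."""
    a = events_treated               # events in the alternative-frame group
    b = n_treated - events_treated   # non-events in the alternative-frame group
    c = events_control               # events in the control group
    d = n_control - events_control   # non-events in the control group
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 150 no-shows out of 1,000 reminders in an alternative
# frame versus 200 out of 1,000 in the control frame.
or_, lo, hi = odds_ratio_wald_ci(150, 1000, 200, 1000)
print(f"OR = {or_:.2f}, 95% CI: {lo:.2f}, {hi:.2f}")
```

An interval that excludes 1 corresponds to a significant difference at the two-tailed 5% threshold used in this study. The adjusted estimates additionally require fitting the multivariate model with the covariates, which this sketch does not do.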

Ethics
This study was reviewed by the IRB of the CHS organization, which determined that it was not a clinical trial, but rather an organizational initiative to optimize internal policy. It received an exemption from the need for individual informed consent since it was determined that the various intervention arms posed no harm to members. It was not registered as a clinical trial for these reasons. Obtaining consent would introduce a burden to the members (larger than the intervention itself); obtaining informed consent would cause serious practical problems that would undermine the trial results (particularly for the control group), and the risk of harm was low since the intervention merely consisted of small modifications to existing routine processes.

Results
During the study period, between December 2018 and March 2019, there were 218,066 scheduled appointments in CHS's hospital outpatient clinics, of which 161,587 had a valid associated mobile telephone number with approval for receiving phone-based appointment reminders (Fig 1). Among those who received one of the nine SMS appointment reminders (Table 1), in 104,469 (64.6%) cases, the reminder's accompanying link was opened within 48 hours of receiving the message.
Socio-demographic, clinical, and appointment-related characteristics by type of appointment reminder can be seen in Table 2. Slightly more than half of the eligible population was female (55.4%), and the average age of the population was 59.3 years. The distribution of all members' characteristics and appointment information was similar between the nine treatment groups, and no significant differences were found (i.e., all p values > 0.05) (Table 2).
Table 3 presents no-show and advanced cancellation rates in the groups receiving one of the eight alternative message frames compared with the generic control. Five out of the eight alternative message reminders presented in Table 1 ('appointment cost', 'emotional relatives', 'emotional guilt', 'social norm', and 'social identity') had significantly lower rates of no-shows and higher rates of canceling in advance compared with the routinely used message reminder. The 'social norm' and 'social identity' framed messages were associated with no-show rates of 17.8% (OR: 0.73, 95% CI: 0.61, 0.79) and 17.7% (OR: 0.83, 95% CI: 0.76, 0.87) and advanced cancellation rates of 21.8% (OR: 1.08, 95% CI: 1.07, 1.08) and 24.6% (OR: 1.19, 95% CI: 1.11, 1.24), respectively. The 'standard', 'personal request', and 'professional figure' messages did not produce significantly different results as compared with the control message (Fig 2, Table 3).
The multivariate analysis showed that the relative reduction in the risk of no-show and advanced cancellation remained largely unchanged after adjusting for socio-demographic variables, clinical and appointment-related characteristics, and past non-attendance behavior (Table 3).

Discussion
We have shown that careful design of SMS narratives based on behavioral economic principles can reduce hospital outpatient clinic no-show rates by over 30 percent. Out of nine differently framed reminders, five produced significantly lower no-show rates and higher advanced cancellation rates. The emotional guilt and specific cost message frames showed the greatest nominal differences in no-show rates and advanced cancellation rates compared with the control group (14.2% and 15.3% compared to 21.1% in no-show rates and 26.3% and 27.4% compared to 17.2% in advanced cancellation rates, respectively). While many health interventions approach behavioral challenges by emphasizing the need to support and prompt the individual through reminders, these results indicate that different messages can influence no-show and cancellation rates [5,6,12,41-43]. These results are aligned with behavioral economic theories. However, the current data do not offer adequate support that the varied effects have resulted from those specific psychological mechanisms. Future research is needed to explore this topic.
These results highlight the potential of introducing behavioral economics principles into multiple avenues of healthcare delivery in order to improve member adherence and reduce waste in care provision.
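The "over 30 percent" figure is a relative reduction in the crude no-show rate, which can be checked directly from the percentages reported in the text. The Python sketch below does only that arithmetic; the function name is an assumption, and the calculation uses unadjusted rates rather than the odds ratios from the regression models.

```python
def relative_reduction(control_rate: float, frame_rate: float) -> float:
    """Relative reduction of a rate versus the control rate."""
    return (control_rate - frame_rate) / control_rate


CONTROL_NO_SHOW = 21.1  # control no-show rate (%), as reported in the text

# No-show rates (%) for the message frames whose rates are quoted in the text.
reported_no_show = {
    "emotional guilt": 14.2,
    "specific cost": 15.3,
    "social identity": 17.7,
    "social norm": 17.8,
}

reductions = {
    frame: relative_reduction(CONTROL_NO_SHOW, rate)
    for frame, rate in reported_no_show.items()
}
for frame, value in reductions.items():
    print(f"{frame:16s} {value:.1%}")
```

For the emotional guilt frame this gives (21.1 - 14.2) / 21.1, or about 32.7%, which is the only frame among these four that exceeds a 30 percent relative reduction.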
Our study ventures beyond the published literature by including a standard alternative message alongside the affect-based alternatives, possibly indicating that the reduction in no-show rates was in fact due to a change in context and not merely in response to a simple change in wording.
This study had several limitations. First, all reminder messages were sent five days prior to the appointment. This five-day period might be considered relatively long, as previous studies have reported shorter periods of one to three days between sending the reminder and the appointment date [6]. Sending the reminder SMS five days before the appointment may have allowed the psychological effect of the reminder to decay over time, meaning that even members who confirmed attendance may not have attended the appointment. However, if the members had forgotten about the appointment by the time of the reminder, the five-day period may have enabled them to rearrange their schedules in order to attend.
Another possible limitation was the inability to distinguish between members who read the SMS reminders and those who did not. However, we retain our confidence in the overall conclusions since 64.6% of the recipients clicked on the link within 48 hours, indicating that the majority of people read the message. Also, the assignment to the SMS frame for each appointment was properly randomized, and there is no reason to suspect additional potential confounders. Furthermore, all members within the study period had no more than three appointments. It is possible that receiving the same message when scheduling new appointments diminished the effect over time. The multiple and varying number of appointments per patient in the sample led to correlated error variance in the study dataset, which may have produced biased error estimates. However, due to the short period of the study (December 2018 to March 2019), not many patients had repeated appointments (approximately 9.4% of the total population).
It is important to note that the expected effect of rephrasing reminder messages may be limited, as the messages were designed with the 'average' person in mind, rather than customized on an individual level. While we found that the effect of sending alternative messages was maintained after adjustment for various covariates, it is possible that interactions between individual characteristics modify the impact of specific message frames on no-shows. Future research should focus on customizing the content per person to even further reduce no-show rates. The major strength of our findings was that all 14 CHS hospitals, located throughout the country, were included in this study. This means that the effect of different SMS versions on no-shows was assessed among participants who came from diverse backgrounds, and thus can be generalized to a greater scale for policy implications.
The number of unattended appointments across all outpatient clinics in CHS' 14 hospitals is approximately 600,000 annually (18.7% of all outpatient clinic appointments). Our results indicate that replacing the current reminder message with a carefully designed message can potentially save 187,000 appointments annually. Nationally, this change can potentially result in saving approximately 350,000 unattended appointments, thereby improving the quality of care across the country.
In May 2019, CHS changed its policy and adopted the 'emotional guilt' narrative in all outpatient clinics for all messages used in daily practice (more than 3 million appointments a year), and it is monitoring the scale of the real-world impact of this change. This can serve as an example of how research knowledge gained by a learning healthcare system can be implemented into routine clinical practice and effect changes in the organization's policy [29,44]. It is worth noting that such a change may have additional unintended consequences other than improving visit attendance, such as a reduction in clinic or physician satisfaction ratings.
The era of digital health enables healthcare providers to systematically customize their interaction with members in order to increase the effectiveness of healthier behavior [45]. This simple example of how strategic use of traditional messaging substantially impacts members' behaviors shows the untapped potential of smart messaging in health care. Improvement in member engagement depends on the utilization of technical add-ons, but even more so, on the nuances and characterization of the way the messaging is constructed.
Supporting information S1