In Vivo versus Augmented Reality Exposure in the Treatment of Small Animal Phobia: A Randomized Controlled Trial

Although in vivo exposure is the treatment of choice for specific phobias, some acceptability problems have been associated with it. Virtual Reality exposure has been shown to be as effective as in vivo exposure, and it is widely accepted for the treatment of specific phobias, but only preliminary data are available in the literature about the efficacy of Augmented Reality. The purpose of the present study was to examine the efficacy and acceptance of two treatment conditions for specific phobias in which the exposure component was applied in different ways: in vivo exposure (N = 31) versus an Augmented Reality system (N = 32) in a randomized controlled trial. “One-session treatment” guidelines were followed. Participants in the Augmented Reality condition significantly improved on all the outcome measures at post-treatment and follow-ups. When the two treatment conditions were compared, some differences were found at post-treatment, favoring the participants who received in vivo exposure. However, these differences disappeared at the 3- and 6-month follow-ups. Regarding participants’ expectations and satisfaction with the treatment, very positive ratings were reported in both conditions. In addition, participants in the in vivo exposure condition considered the treatment more useful for their problem, whereas participants in the Augmented Reality exposure condition considered the treatment less aversive. Results obtained in this study indicate that Augmented Reality exposure is an effective treatment for specific phobias and is well accepted by the participants.


Introduction
Specific phobia (SP) is the most prevalent 12-month DSM-IV disorder (8.7%) [1] among the USA population, whereas a lower prevalence was found in some European countries (3.5%) [2]. In the case of the animal phobia subtype [3], the prevalence ranges between 3.3% and 7% [3,4] and it is one of the most prevalent subtypes of SP [5,6]. Around 50-80% of people [...] case study [43] showed a decrease in scores on fear and avoidance and on the variables related to the BAT. Specifically, after the AR exposure session, the participant was capable of interacting with real cockroaches, and the improvements were maintained at the 1-month follow-up. Regarding the participant's opinion of the AR exposure treatment, she reported a high level of satisfaction, considering AR to be slightly aversive, but less aversive than in vivo exposure. This same AR system has been shown to be useful for inducing anxiety in participants [49], and it has also been tested using a multiple baseline design with 6 individuals who suffered from cockroach phobia [50]. Results showed that AR exposure was effective, as all the participants improved significantly on all the outcome measures. Furthermore, the improvements were maintained over time (6- and 12-month follow-ups).
These preliminary studies show that exposure through AR can be useful for the treatment of SP (cockroaches and spiders). However, despite these promising results, we have found no randomized controlled trials on AR efficacy compared to the current treatment of choice for SP, in vivo exposure, nor any study focused on analyzing the opinion and acceptance by the participants of AR exposure.
Thus, the present study aims to examine the efficacy and acceptance (expectations and satisfaction) of two treatment conditions in which the exposure component was applied in different ways: in vivo (IVE) versus the AR system (ARE), in a randomized controlled trial. Taking into account the meta-analysis data in the literature on IVE, we expect this treatment to be efficacious. Data from previous AR pilot tests lead us to predict that this procedure will also be effective. As for the participants' acceptance, we expect both treatments to be well accepted, with AR being considered less aversive than IVE.

Materials and Methods

Design
The study was registered in the National Institute of Health Registration System (http://www.clinicaltrials.gov) with Clinical Trials Registration Number: NCT01361074. The authors confirm that all ongoing and related trials for this intervention are registered. The RCT was conducted in accordance with the CONSORT 2010 Statement [51] and the CONSORT-EHEALTH guidelines [52,53] to study the efficacy of two experimental conditions: 1) In Vivo Exposure and 2) Augmented Reality Exposure (see S1 CONSORT Checklist). The participants were randomly assigned to the two experimental conditions. Repeated measurements at pre-treatment, post-treatment, three-month follow-up, and six-month follow-up were included. Regarding the sample size, power calculations were based on data from two meta-analyses [28,29]. Both of them studied the efficacy of VR for the treatment of anxiety disorders, and both included the BAT as an outcome measure. Calculations indicated that a sample size of 15 participants in each group (30 in total) would be sufficient to detect an effect size of d = 1.11 or d = 1.12 with a power of 80%; however, we were more conservative and decided to double this estimation and include at least 30 participants in each treatment condition (at least 60 in all). This decision was made for two main reasons: first, although data were available from meta-analyses on VR efficacy for the treatment of SP using several VR sessions, there were no data available from RCTs about the efficacy of VR or AR applied in a single session that could serve as a reference. Therefore, we decided to be more conservative and increase the sample size in order to have more power and try to avoid a Type II error.
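As a rough check on the order of magnitude of the power calculation, the per-group sample size can be approximated with the standard normal formula for comparing two means. This is a minimal sketch, not the authors' calculation: the two-sided alpha of 0.05 and the function name `n_per_group` are assumptions, since the text reports only the effect sizes and the 80% power.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * (z_{1 - alpha/2} + z_{power})^2 / d^2.
    alpha = 0.05 (two-sided) is an assumption; the paper does not state it.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


for d in (1.11, 1.12):
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

The normal approximation yields values slightly below the 15 per group reported above; a calculation based on the t distribution gives somewhat larger numbers, so the two are broadly consistent.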
Recruitment and data collection took place from January 2011 to January 2013. Participants were recruited through advertisements sent by mail to university community members and announcements placed around the campus and in the local media.
Inclusion criteria were: (a) meeting DSM-IV-TR [5] criteria for the diagnosis of SP (animal subtype) to cockroaches or spiders; (b) being at least 18 years old and having a minimum 1-year duration of the phobia; (c) being willing to follow the study conditions and sign the consent form; and (d) presenting a score of at least 4 on the fear and avoidance scales of the diagnostic interview applied. Exclusion criteria were: (a) having another psychological problem that requires immediate attention; (b) having current alcohol or drug dependence or abuse, psychosis or severe organic illness; (c) currently being treated in a similar treatment program; (d) being capable of inserting their hands in a plastic container with a cockroach or a spider (during the behavioral test); and (e) taking anxiolytics during the study (or in the case of taking them, changing the drug or dose during the study).
A total of 63 participants who met the eligibility criteria were included in the study. The randomization of the participants took place after assessing the eligibility criteria. The person responsible for the randomization was an independent researcher with no clinical involvement in the trial and no access to the study data. She assigned participants to either the IVE (N = 31) or ARE (N = 32) condition, based on a computer-generated randomization list created by the "Random Allocation Software", version 1.0. Therapists and participants involved in the trial were blind to treatment allocation during the assessment.
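For illustration only, a computer-generated allocation list with the final group sizes can be sketched as a seeded shuffle. This is not the algorithm used by the Random Allocation Software, whose settings (for example, block size) are not described here; the function name `allocation_list` and the seed value are hypothetical.

```python
import random
from typing import List


def allocation_list(n_ive: int, n_are: int, seed: int) -> List[str]:
    """Return a reproducible, randomly ordered list of condition labels.

    Simple (unblocked) randomization with fixed group sizes is an assumption;
    the source does not describe the scheme that was actually used.
    """
    labels = ["IVE"] * n_ive + ["ARE"] * n_are
    rng = random.Random(seed)  # dedicated generator so the list can be regenerated
    rng.shuffle(labels)
    return labels


# Hypothetical seed; participant k receives schedule[k - 1] in order of enrolment.
schedule = allocation_list(31, 32, seed=2011)
print(schedule[:5], len(schedule))
```

Keeping the list with someone who has no clinical involvement, as described above, is what conceals the upcoming allocation from the assessing therapists.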

Measures
The assessment protocol included diagnostic, main outcome, and secondary measures to assess the main features of the spider and cockroach phobias, interference and severity measures as well as expectations and satisfaction regarding the exposure treatment. In this paper, the most relevant measures are presented.
Primary outcome measures. Behavioral Avoidance Test (BAT; adapted from Öst, Salkovskis, and Hellström [54]). The BAT is an observational measure used to assess the features of the phobia in a context of exposure to the feared object, in order to obtain objective data about the person's fear. For this study, a container containing a live cockroach or spider was placed 5 meters from the entrance to a room. Then, participants were asked to enter the room and come as close to the animal as possible. Before the test, the therapist asked them about their level of anxiety, avoidance and belief related to the fear on a scale from 0 to 10, where 0 is "no fear", "no avoidance" and "does not believe the content of the thought at all", and 10 is "extreme fear", "total avoidance" and "believes the thought is totally true". Their performances on the test were scored, transforming the distance into a score rated on a scale from 0 to 12, where 0 = "refuses to enter the room" and 12 = "the participant interacts by holding the animal on a post card for more than 20 seconds". The maximum anxiety experienced by the participant during the BAT performance and the severity of the fear assessed by the therapist were also measured on a scale from 0 to 10, where 0 = "no fear" and 10 = "severe fear". This measure was used in a previous study where a more detailed description was provided [50].
Secondary Outcome Measures. Fear of Spiders Questionnaire (FSQ; adapted from Szymanski and O'Donohue [55]). This is a self-report questionnaire containing 18 items about spiders and designed to assess the severity of the phobia. Each item is answered on a Likert scale ranging from 0 ("I strongly disagree") to 7 ("I strongly agree"). Scores can range from 0 to 126. Muris and Merckelbach [56] found that the mean score in a group of people before treatment was 89.1 (SD = 19.6) and after treatment 39.9 (SD = 25.4). In the same study, the mean score of control subjects without spider phobia was 3.0 (SD = 7.8). The FSQ has excellent psychometric properties. To evaluate cockroach phobia, we used an adaptation in the Spanish population [57].
Spider Phobia Beliefs Questionnaire (SBQ; adapted from Arntz, Lavy, van der Berg and van Rijssoort [58]). This is a self-report questionnaire composed of 78 items divided into two scales: fear beliefs related to spiders (SBQ-1) and the person's reaction to their presence (SBQ-2). All items are scored on a scale from 0 to 100, where 0 means "I do not believe it at all" and 100 means "I totally believe it." Regarding the psychometric properties reported by Arntz et al. [58], the average scores on subscale SBQ-1 in clinical samples were 48.76 (SD = 17.74) before treatment and 10.15 (SD = 13.69) after treatment, and the mean scores on subscale SBQ-2 were 49.79 (SD = 18.72) before treatment and 8.00 (SD = 13.15) after treatment. As for reliability, good results are reported for internal consistency and test-retest reliability. The SBQ was also adapted to evaluate cockroach phobia in the Spanish population [59].
Fear and Avoidance Scales (Adapted from Marks and Mathews [60]). Along with their therapists, the participants established the situations related to cockroaches and spiders that caused them the most fear and distress (e.g. taking out the trash at night), as well as their negative thoughts associated with them (e.g. If I see cockroaches I'll go crazy; The cockroaches are going to jump on me). They then used scales from 0 to 10 to assess their degree of fear (0 = "no fear"; 10 = "extreme fear") and avoidance (0 = "never avoid"; 10 = "always avoid") for each feared situation. In addition, the degree of belief in the negative thoughts related to the target behaviors was also assessed on a scale ranging from 0 ("I do not believe the content of the thought at all") to 10 ("I believe the thought is totally true"). For this study, the most significant target behavior chosen by each participant was used (Main Target Behaviour, MTB).
Diagnostic Status. The Anxiety Disorders Interview Schedule IV (ADIS-IV-L [61]) is a semi-structured interview used to determine the diagnostic status and quantify the different features related to the phobia, such as the fear and avoidance rates, on a scale from 0 to 8 (0 = "no fear, no avoidance"; 8 = "extreme fear, extreme avoidance"), and the interference and distress perceived by the participant (on a scale from 0 to 8, where 0 = "not at all" and 8 = "very severe"). The ADIS-IV has shown adequate psychometric properties according to Antony, Orsillo, and Roemer [62], with an inter-rater reliability ranging from satisfactory to excellent when used by expert clinicians familiar with the DSM diagnostic criteria [63].
Clinician Severity Scale (CSS; [64]). At the end of the individual interviews, the clinician rated the severity of the patient's phobias on a scale from 0 to 8, where 0 = "symptom free" and 8 = "extremely severe and disabling, all aspects of life are affected". This scale was used in previous studies to assess spider phobia [33].
Expectations and satisfaction regarding the exposure treatment (adapted from Borkovec and Nau [65]). This questionnaire measures the participants' expectations for the exposure component before treatment and their satisfaction with it after treatment. It includes six items rated from 0 ("not at all") to 10 ("very much") that address how logical, satisfactory, recommendable, useful for other problems, useful for the patient's problem, and aversive the treatment is. The adaptation of these scales has been used in previous studies [66,67].

Augmented Reality System and Hardware
In the present work, two devices were used to display Mixed Reality images: 1) an AR 5DT HMD (head-mounted display) with an 800 x 600 resolution and a high (40-degree) field of view, to which a USB Creative NX-Ultra camera is attached to capture the video stream; and 2) VR Goggles (Vuzix) that include two LCD devices with a 640 x 480 resolution and a 30-degree field of view and an embedded camera. The system includes 3D spiders and cockroaches and enables real-time interactivity, so that participants can see the real place where they are through the display device, and the feared stimuli (spiders and cockroaches) in the same place (Fig 1).
Bodies and movements of the spiders and cockroaches were modelled using 3DStudio and exported in VRML format; their movements and texture were similar to those of real spiders and cockroaches. Modulating variables that can be manipulated in the AR System are the following: number of animals, movements of the animals, their size (small, medium and large), type of spider (Fig 2), and, finally, the possibility of displaying the animal on various surfaces (e.g., on the table, on the floor, on the person, etc.).
All of these combined options enable the therapist to apply the treatment progressively.
In Fig 3, we can see a person interacting with the cockroaches with her hands. A full description of the system can be found in previous studies [43,50].

Treatment
We used exposure therapy according to the "one-session treatment" guidelines developed by Öst et al. [44]. The treatment program included several therapeutic components applied in only one individual session lasting up to 3 hours. The components, based on Öst [45], were exposure to the feared object (cockroaches or spiders), modelling, reinforced practice, and cognitive challenge. The focus of the treatment was for participants to face phobic situations in a controlled, graduated and planned way in order to tolerate the fear experienced, which would allow them to disconfirm the negative thoughts related to the presence of the feared object and its consequences, and prevent cognitive and behavioral avoidance in a safe context. The therapist's instructions to the participants followed the recommendations of Öst, Salkovskis, and Hellström [54]: his/her role is to encourage the participant and collaborate with him/her to advance in the treatment objectives. Moreover, once the treatment was over, participants were advised to continue their exposure to the feared animals in order to generalize the results to other situations, although they were not given guided instructions for this exposure.
This treatment was applied in two different ways:
• IVE condition: Participants in the IVE condition were exposed to real small animals, that is, live cockroaches or spiders (see Fig 4).
• ARE condition: Participants were exposed to virtual animals (cockroaches or spiders) using the AR system (see Figs 1 and 3).

Therapists
Five therapists were involved in the study, all of whom had a PhD or a Master's degree in Psychology. All of them were trained in CBT and had extensive experience in the treatment of anxiety disorders and in the exposure technique (either in vivo or VR/AR). In addition, they received training in this treatment protocol from senior clinicians. Depending on the availability of therapists, in certain cases the clinician who performed the initial assessment was different from the one who conducted the treatment, while in other cases the clinician was the same; the same thing occurred in follow-up assessments. In addition, it is important to point out that all the therapists conducted both the IVE and ARE treatments. All the therapists were supervised by senior clinicians with PhDs in weekly sessions. Moreover, all of the assessment and treatment sessions were video recorded in order to supervise each therapist's performance.

Procedure
The study took place at the Emotional Disorders Clinic at Jaume I University. The assessment protocol was conducted in one session lasting one-and-a-half hours. During this session, after the eligibility criteria had been confirmed, participants were informed and asked to participate in the study. If they agreed to participate, they signed the consent form. Then, they were randomly assigned to one of the two experimental treatment conditions (IVE or ARE) and continued to complete the rest of the assessment. The treatment was carried out in one intensive IVE or ARE session (up to three hours). Finally, after completing the treatment session, all the participants were again assessed at post-treatment (on the same day) and at 3- and 6-month follow-ups.
To control procedural fidelity, the assessment and treatment protocol were available. In fact, specific instructions for conducting both conditions (IVE or ARE) were presented, based on the OST protocol [44,45,54]. In addition, as mentioned above, all the therapists were supervised by senior clinicians and video recorded in order to control compliance with the procedure.

Statistics and data analysis
The Chi-square test was conducted to evaluate group differences in demographic variables (gender, sex, level of studies and marital status). In addition, Student's t-test was used to test differences between the two treatment conditions at pre-treatment on all of the efficacy measures and treatment acceptance, as well as other variables, such as participants' age and duration of the phobia. All post-treatment and follow-up analyses involve a conservative intention-to-treat (ITT; [68]) design, where missing data were addressed by carrying forward the last available data (last observation carried forward model; LOCF). Furthermore, data considering the completer participants were also analyzed. In order to test the treatment efficacy in both conditions, repeated-measures ANOVAs were used to compare the time effect on the measures at four assessment times (pre, post, 3- and 6-month follow-ups) and the time interaction among the treatment conditions at pre, post, 3- and 6-month follow-ups. Effect sizes (Cohen's d) were calculated for within- and between-group changes, from pre-treatment to post-treatment, from pre-treatment to 3-month follow-up, and from pre-treatment to 6-month follow-up. Cohen's d [69] is based on the pooled standard deviation, and effect sizes were categorized as no effect (< 0.2), small effect (0.2-0.5), medium effect (0.5-0.80) or large effect (> 0.80). Finally, Chi-squared tests were performed in order to examine the differential clinically significant improvement rates, based on the Jacobson and Truax indexes [70] for FSQ and SBQ scores at post-treatment and 3- and 6-month follow-ups in the two treatment conditions.
In this study, the classification proposed by Iraurgi [71] and Kupfer [72] was used, where "Recovered" means the participant is situated in the normal or functional distribution; "Improved" indicates a significant improvement, but not situated in the normal or functional distribution; "No change" means no significant improvement and not situated in the normal or functioning distribution; and finally, "Deteriorated" indicates a significant deterioration.
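The four categories can be illustrated with the usual Jacobson and Truax quantities: a reliable change index (RCI) and a cutoff between the dysfunctional and functional distributions. The sketch below is an assumption-laden illustration, not the authors' procedure: it uses the FSQ reference values quoted in the Measures section, cutoff criterion c, and a hypothetical reliability of 0.90, none of which are confirmed as the values used in this study.

```python
from math import sqrt

# FSQ reference values quoted in the Measures section (Muris and Merckelbach [56]).
CLINICAL_MEAN, CLINICAL_SD = 89.1, 19.6
CONTROL_MEAN, CONTROL_SD = 3.0, 7.8
RELIABILITY = 0.90  # hypothetical test-retest reliability, for illustration only


def cutoff_c() -> float:
    """Criterion c: SD-weighted point between the clinical and control means."""
    return (CLINICAL_SD * CONTROL_MEAN + CONTROL_SD * CLINICAL_MEAN) / (
        CLINICAL_SD + CONTROL_SD
    )


def reliable_change_index(pre: float, post: float) -> float:
    """Change score divided by the standard error of the difference."""
    se_measurement = CLINICAL_SD * sqrt(1 - RELIABILITY)
    s_diff = sqrt(2 * se_measurement ** 2)
    return (post - pre) / s_diff


def classify(pre: float, post: float) -> str:
    """Lower FSQ scores mean less fear, so negative change is improvement."""
    rci = reliable_change_index(pre, post)
    if rci >= 1.96:
        return "Deteriorated"
    if rci > -1.96:
        return "No change"
    # Reliable improvement: "Recovered" only if the score crosses the cutoff.
    return "Recovered" if post < cutoff_c() else "Improved"


for pre, post in [(90, 20), (90, 60), (90, 85), (60, 90)]:
    print(pre, post, classify(pre, post))
```

Under these assumptions the cutoff falls at roughly 27.5 FSQ points, so a participant needs both a reliable drop and a post-treatment score below that value to count as "Recovered".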
All statistical analyses were performed using SPSS 20.0.
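Although the analyses were run in SPSS, the two simplest steps described in this section, last-observation-carried-forward imputation and Cohen's d with a pooled standard deviation, can be sketched in a few lines. The function names are hypothetical, the example scores are invented, and the handling of values that fall exactly on a category boundary (0.2, 0.5, 0.8) is an assumption, since the ranges given in the text touch at their edges.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List, Optional, Sequence


def locf(scores: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Replace missing (None) assessments with the last observed value."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for score in scores:
        if score is not None:
            last = score
        filled.append(last)
    return filled


def cohens_d(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean difference divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def effect_label(d: float) -> str:
    """Categories from the text; boundary assignment is an assumption."""
    size = abs(d)
    if size < 0.2:
        return "no effect"
    if size < 0.5:
        return "small"
    if size <= 0.8:
        return "medium"
    return "large"


# Invented example: a participant who missed both follow-ups (pre, post, 3m, 6m).
print(locf([8, 3, None, None]))
# Invented example scores for two small groups.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(round(d, 2), effect_label(d))
```

Because LOCF copies the post-treatment score into the missed follow-ups, it assumes no further change after dropout, which is why the text also reports a completer analysis alongside the ITT results.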

Participant flow and attrition
A complete description of the participants' attrition is shown in Fig 5. Initially, 103 people contacted the Emotional Disorders Clinic at Jaume I University to show their interest in taking part in the study. Out of these 103 people, 75 were assessed for eligibility criteria. Twelve participants were excluded from the study before treatment. The reasons for their exclusion were: they did not meet the inclusion criteria (N = 2), they refused to attend (N = 7), or we were unable to contact them again (N = 3). Finally, based on the DSM-IV-TR criteria [7], 63 participants were included in the study.
As the flowchart reveals, none of the participants withdrew from either treatment condition after randomization or during the treatment session. In addition, all of them completed the post-treatment assessment. At the 3-month follow-up, a total of 60 participants attended the assessment session, 30 participants in the IVE condition (96.77%) and 30 in the ARE condition (93.75%). At the 6-month follow-up, a total of 53 participants attended the assessment session, 27 participants in the IVE condition (87.10%) and 26 participants in the ARE condition (81.25%). No significant differences in attrition rates were found between the treatment conditions.
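The retention percentages above follow directly from the attendance counts, and the comparison of attrition between conditions can be illustrated with a Pearson chi-square test on the 2 x 2 attendance table. The text does not state which test was applied to attrition, so the choice of an uncorrected chi-square here is an assumption; with so few dropouts, an exact test might be preferred in practice.

```python
from math import erfc, sqrt
from typing import Tuple


def retention(attended: int, randomized: int) -> float:
    """Percentage of randomized participants who attended an assessment."""
    return round(100 * attended / randomized, 2)


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and its p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


# 6-month follow-up: IVE 27 attended / 4 missed, ARE 26 attended / 6 missed.
print(retention(27, 31), retention(26, 32))
chi2, p = chi_square_2x2(27, 4, 26, 6)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

With these counts the p-value is well above .05, in line with the reported absence of a difference in attrition between conditions.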

Baseline data and pre-treatment differences
Demographic and clinical characteristics. Descriptive data collected about demographic variables, diagnosis, phobia's duration, and medication are shown in Table 1.
As the table shows, most of the sample were women (93.7%) and had a diagnosis of cockroach phobia (85.7%). The mean age was 31.73 (SD = 10.74), ranging from 20 to 70 years. The sample was practically divided between married and single, and most of them had a university degree (84.1%). The average length of the phobia was 18.21 years, and only 2 participants were taking anxiolytic drugs. One participant took medication only occasionally due to a previous diagnosis of agoraphobia, but he/she did not take it during the treatment. The other person remained at the same dose throughout the study without any changes in the type of drug. In the case of this participant, she was taking Alprazolam in the framework of a previous eating disorder (in order to control binge-eating episodes).
Pre-treatment differences. Statistical analyses showed no differences between the two groups at pre-treatment on any demographic variables, phobia duration, medication or diagnostic variables (cockroach or spider phobia).
Regarding primary and secondary clinical variables, no differences were found between the two experimental conditions on any of these measures at pre-treatment. Means and SDs are shown in Table 2.
Effectiveness: change in primary and secondary outcomes at pre-post and 3- and 6-month follow-ups in both treatment conditions
Means, standard deviations, and within-group and between-group effect sizes are shown in Table 2 for the ITT analysis of all the outcome measures at pre-treatment, post-treatment and 6-month follow-up. Data from the 3-month follow-up are available as S1 Table. [...] and Severity-BAT assessed by the clinician (F(3, 60) = 89.14, p < .001). A significant time X treatment group interaction effect was found for the variable "Avoidance-BAT" (F(3, 61) = 65.59, p < .020), with this interaction favoring the participants in the IVE condition. Pairwise comparisons revealed that these differences between the two conditions on this variable took place at post-treatment, while both conditions showed similar scores at the 3- and 6-month follow-ups. Fig 6 shows the evolution of the primary outcome measures at post-treatment and follow-ups as well as the effect sizes reached in each treatment condition.
Completer analysis also revealed a significant time effect on all of the variables related to the BAT, similar to the ITT analysis. However, regarding differences between the two groups, a significant time X treatment group interaction was found for the variable "Fear-BAT" (F(3, 45) = 3.42, p < .029), with participants from the IVE group showing more improvement on this variable. In the pairwise comparisons, results revealed that this difference was present only at post-treatment, disappearing at the 3- and 6-month follow-ups, as described in the ITT analysis.
Secondary outcome measures. Regarding the fear related to spiders and cockroaches measured by the FSQ, the ITT analysis showed a significant time effect (F(3, 61) = 158.93, p < .001), whereas no statistical differences were found between the two treatment conditions. Both groups improved in a similar way. In the case of beliefs related to spiders and cockroaches measured by the first scale of the SBQ (SBQ-1), a significant time effect was observed (F(3, 61) = 207.70, p < .001), with no differences between the two conditions in the time X treatment group interaction. The same pattern was observed on the other scale of the SBQ (SBQ-2), corresponding to the participants' beliefs about their reactions in the presence of a spider or cockroach. In this case, a significant time effect was also detected (F(3, 61) = 136.13, p < .001), and no differences were found between the two conditions at any of the assessment moments. ITT analysis also showed a significant time effect for all of the measures related to the MTB: Fear-MTB (F(3, 61) = 143.29, p < .001); Avoidance-MTB (F(3, 61) = 163.31, p < .001); and degree of Belief-MTB related to the catastrophic thought (F(3, 61) = 135.94, p < .001). A significant time X treatment group interaction effect was found for the variable "Avoidance-MTB" (F(3, 61) = 3.20, p < .044), with participants from the IVE condition showing more improvement on this measure. Pairwise comparisons revealed that this difference was observed only at post-treatment, as both conditions had similar scores at the 3- and 6-month follow-ups. Regarding the last measure, the severity assessed by the clinician (CSS), statistical analysis showed a significant time effect in both conditions (F(3, 59) = 143.29, p < .001), and no differences were found in terms of an interaction effect between the two conditions. Completer analysis showed the same outcome pattern as the ITT analysis related to the time effect on these measures.
Both conditions showed a significant time effect on the FSQ, SBQ, MTB measures and CSS. Regarding the time X treatment group interaction, in this analytical procedure, no differences were found on any of these measured variables, and so both conditions were equally efficacious at post-treatment and at the follow-ups. Table 2 shows within-group and between-group effect sizes measured by Cohen's d [69] for all measures obtained from the ITT analysis. As Table 2 reveals, all the measures included in the assessment reached d values above 0.8 in both conditions at post-treatment and follow-ups. According to Cohen, values equal to 0.8 or higher are considered large effect sizes. As for the effect sizes of the time X treatment group interaction, medium-large effect sizes, according to Cohen, were found at post-treatment for the variables "previous avoidance experienced before the BAT performance" (d = -0.80) and "avoidance of the MTB" (d = -0.65). However, all of these variables obtained values around 0.30 or less in the time X treatment group interaction at follow-ups. Taking into account the effect sizes reported in the completer analysis, scores obtained in the time effect analysis showed a similar pattern to the ITT analysis. As for the effect sizes of the time X treatment group interaction at post-treatment, a large effect size, according to Cohen, was found for the variable "previous fear experienced before the BAT performance" (d = -1.06). As in the ITT analysis, the effect size of this variable at follow-ups was d < 0.30.

Duration of Exposure, Diagnostic Status, Meaningful Clinical Improvement
The analysis carried out on the Duration of the Exposure showed that the mean time for the IVE condition was 137 minutes (ranging from 62 to 180), and 141.83 minutes for the ARE condition (ranging from 70 to 180). t-tests revealed no differences between the two treatment conditions in this variable.
Regarding the Diagnostic Status, the ADIS-IV interview at post-treatment revealed a high percentage of participants with SP (cockroaches or spiders) who were diagnosis-free (IVE = 22, 71.0%; ARE = 18, 56.3%). The same thing occurred at the 3-month (IVE = 24, 77.4%; ARE = 22, 68.8%) and 6-month follow-ups (IVE = 20, 64.5%; ARE = 20, 62.5%). Regarding the statistically significant differences in these percentages, the Chi-Square test revealed no differences between the two treatment conditions at any of the assessment times.
Finally, clinically meaningful improvement was calculated for FSQ and SBQ scores using Jacobson and Truax's indexes [70]. Percentages of participants from each condition who were recovered, improved, had no change or were impaired, according to these measures, are available as supporting information; see S2 Table on the FSQ and SBQ Scores at Post-treatment, 3- and 6-month follow-up. As the data reveal, regarding the scores for fear assessed by the FSQ and the scores for beliefs related to the feared animal (SBQ-1) and the self-reaction (SBQ-2), the percentages of participants classified as "Recovered" and "Improved" were very high in both treatment conditions at post-treatment, and also at the 3- and 6-month follow-ups. No statistically significant differences were found between the two treatment conditions in these percentages on the FSQ or the SBQ.

Acceptance of the exposure component
Participants' expectations and satisfaction with the treatment revealed that participants in both treatment conditions evaluated the exposure component quite positively. Means and SDs obtained for expectations and satisfaction, as well as the results derived from the comparisons of the two treatment conditions, are shown as supporting information; see S3 Table. Furthermore, the results obtained showed statistical differences between the two treatment conditions regarding expectations for the treatment: before treatment, participants in the ARE condition considered that the exposure component would be less "aversive" than participants in the IVE condition did.
Regarding the satisfaction reported by participants after the treatment, there were statistically significant differences between the two treatment conditions: participants in the IVE condition considered the treatment received more "useful for their problem" than participants in the ARE condition; in contrast, participants in the ARE condition found the treatment received less "aversive" than participants in the IVE condition.

Discussion
The main objective of this study was to provide empirical evidence about the efficacy and acceptance of the implementation of an AR exposure component in the treatment of small animal phobia (cockroaches and spiders) using an RCT with two conditions, IVE and ARE. The interventions in both conditions were based on the OST protocol [44]. Data obtained showed that both conditions resulted in statistically significant improvements on both primary (BAT) and secondary (FSQ, SBQ, MTB and CSS) outcome measures using both analytical procedures (ITT and completer). These results were found at post-treatment, and they were maintained at the follow-ups, with large within-group effect sizes of more than 0.8 on all of the variables, based on Cohen [69].
We compared the ARE and IVE treatment conditions, and significant differences between the two groups were found on the primary outcome variable "Avoidance-BAT" as well as on the secondary outcome variable "Avoidance-MTB", according to the ITT analysis, whereas the completers analysis revealed differences only on the primary outcome variable "Fear-BAT". These differences favored the IVE condition only at post-treatment, with medium-large effect sizes; at the 3- and 6-month follow-ups, they were no longer observed. In fact, the effect sizes obtained at the follow-ups were small, lower than those reported in previous meta-analyses [28,29]. It is also important that no differences were found in the duration of the exposure session between the two treatment conditions.
Regarding the percentages of diagnostic status and clinically significant change estimations, results indicated significant improvements in both groups, with no significant differences between the two treatment conditions. Moreover, in both conditions, scores obtained on both the FSQ and SBQ in the last follow-up (6-month) were similar to those reported by Muris and Merckelbach [56] in a normal population and by Arntz et al. [58] in a clinical population after treatment.
In sum, both conditions (ARE and IVE) resulted in statistically significant changes in both primary and secondary outcomes, and these results were maintained at the follow-ups; additionally, no significant differences were found between the two conditions in the long term. As for the dropouts, none of the participants dropped out during the treatment, and so there were no differences between the two groups in this sense.
The results agree with those obtained in previous studies by our research group, where this AR system was used to apply the exposure component in cockroach phobia [43,50] and similar scores were obtained on both the primary outcome measure (BAT) and the secondary measures (FSQ, SBQ, MTB and CSS). In recent years, other AR systems have been developed for the treatment of SP, specifically for butterfly phobia [73] and spider phobia [74]. However, data about the efficacy of these AR systems have not been reported yet. We hope that these systems will be tested in the near future and can provide additional data.
Similarly, data from the present study also coincide with findings from studies where VR was used to treat SP (animal subtype) [31,33-35,75,76], where VR was effective in reducing the fear and avoidance experienced by the participants. In addition, the data obtained support the meta-analysis carried out by [77], where in vivo exposure was superior to other forms of SP treatment (including the use of VR) at post-treatment, but not at the follow-ups. Thus, the advantage of IVE also disappeared in the long term, as occurred in the present study. We would like to have had enough statistical power to analyze whether both conditions were equally effective using an equivalence testing procedure; unfortunately, our sample size did not allow us to use this methodology. We can state that both treatment conditions were effective for the treatment of SP, with large effect sizes, but we cannot conclude that both conditions were equally efficacious.
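
To give a rough sense of why the sample size constrains equivalence testing, the sketch below approximates the number of participants per group needed for a two one-sided tests (TOST) procedure. It is an illustration only: it assumes a true difference of zero, equal group sizes, a normal approximation, and an equivalence margin of 0.5 standard deviations, none of which are specified in the paper.

```python
import math
from statistics import NormalDist

def n_per_group_for_equivalence(margin_d, alpha=0.05, power=0.80):
    """Approximate n per group for TOST, assuming a true standardized difference of zero."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha)               # each one-sided test is run at level alpha
    z_beta = z.inv_cdf(1.0 - (1.0 - power) / 2.0)  # type II error is split across the two tests
    return math.ceil(2.0 * (z_alpha + z_beta) ** 2 / margin_d ** 2)

# Hypothetical margin of 0.5 SD: about 69 participants per group
print(n_per_group_for_equivalence(0.5))
```

Under these assumptions the required group size is roughly twice the 31 and 32 participants available here, and a stricter margin would raise it further.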
Finally, the results obtained also agree with those from studies where the OST was applied in a traditional format, both individually (e.g. [78,46,54]) and in a group (e.g. [47,79]). The Öst one-session treatment is designed to expose the patient to the phobic situation in a planned, graded and controlled way. Moreover, it is presented to the patient as a team situation because the patient and therapist work together. One important strength of this procedure, based on the habituation rationale, is that it uses a long period of time (up to 3 hours), and this enables the patient to confront a large number of situations and test several irrational beliefs. Another advantage, according to Öst [44] and Zlomke and Davis [80], is the intensive exposure. It provides a context where "overlearning" can occur, maximizing the exposure through the generation of exposure situations beyond those often experienced in the natural environment, in order to increase exposure efficacy. However, this would be much less important in the inhibitory learning paradigm recently defended by Craske et al. [81]. From this point of view, it would be much more important to appeal to strategies such as expectancy violation or variability, rather than to long exposure periods. Moreover, this intensive exposure can be quite tiring for patients and the therapist. It requires tolerating moderate distress during a prolonged session, and so patient motivation is crucial [44]. On the other hand, given what was mentioned earlier about people with phobias who do not seek any therapy, the one-session format would be a real advantage, increasing the likelihood of receiving treatment precisely because it requires only a single session. Finally, it is also necessary to state that the one-session treatment should be a starting point, and that the patient must continue the exposure in his/her life after therapy [82].
In short, these results support the use of AR as an effective tool for the treatment of specific phobias.
Regarding the acceptance of the treatment, results obtained showed that participants from both conditions reported high expectations about the treatment before receiving it, as well as high satisfaction with it when it was applied. Significant differences were found between the two conditions: participants in IVE considered the treatment more useful for their problem at post-treatment than participants in the ARE condition; but participants in ARE considered the treatment less aversive than participants in the IVE condition, both regarding their expectations before receiving it and the satisfaction expressed after it.
This study also provides data about the applicability of the treatment, as none of the participants dropped out during it. This dropout rate is lower than those reported by other authors [82,83]. It is important to point out that, although none of the participants refused to participate in the study, participants in the IVE condition were observed to be more reluctant to receive the treatment than participants in the ARE condition. The refusal rate was even lower than the one reported by Öst [82], where only 0.8% of the 500 participants examined refused to participate in the different OST studies.
The acceptance data obtained in this study are in line with previous studies in the literature about the use of AR for the treatment of specific phobia in adults [43] and children [84], and with studies where VR was used to treat this problem [24,37,85,86]. In addition, as Botella et al. [50] reported, AR has some advantages, such as greater control of the exposure by the therapist (e.g. number and size of the animals, animals' behavior, etc.), not having to keep animals in the clinic, and the possibility of interacting with a virtual element in the real world by using one's body (e.g. placing cockroaches/spiders on hands, feet, etc.). Therefore, it is important to highlight that AR is an alternative treatment for patients and therapists, depending on their preferences.
Moreover, in recent studies, VR environments were used in the treatment of spider phobia to examine whether exposure to the phobic stimulus in different contexts and/or with different stimuli reduces the recurrence of fear [75,76,87]. Results showed that exposure to multiple virtual contexts and multiple virtual phobic stimuli reduced the recurrence of fear to a greater extent than exposure to only one scenario and the same stimuli, making it possible to generalize the results. These studies are in line with the work carried out by Craske et al. [81], focused on optimizing exposure therapy based on the inhibitory learning approach, because VR/AR can maximize some of the strategies these authors describe. Some examples would be "deepened extinction", where multiple fear stimuli (e.g. different types of spiders) are first extinguished separately before being combined during extinction (e.g. exposure to these types of spiders at the same time); "variability" (varying stimuli, durations, levels of intensity, or the order of the hierarchy items), because VR/AR allows greater control by therapists; and finally, "exposure to multiple contexts". Regarding this last strategy, as mentioned above, VR/AR facilitates exposure to different contexts, producing a positive effect in terms of fear renewal and generalization of results [75,76,87]. This effect can often be difficult to achieve in the "real world", but it can be made easier by using AR or VR, which provide the therapist with much more control. For example, AR can facilitate exposure to a range of stimuli (e.g. different kinds of spiders, different sizes or numbers of animals, etc.), and it can be used in different contexts, such as different rooms in the clinician's office, using different lights in these rooms, at the patient's house, or on the street (by using mobile devices), etc.
For instance, in our own work, where an AR serious game on a mobile phone was used as the application device [88], the patient was able to practice exposure to the feared stimuli in different contexts, both before the one-session VR treatment with the therapist and after the session, in order to reinforce what had been learned.
This study has some limitations. The first is the lack of a waiting list control group. However, based on previous studies on the efficacy of exposure techniques [28,29], we decided to compare AR with IVE, the most powerful procedure for the treatment of SP at the present time. Another limitation was the sample size, which did not provide enough statistical power to analyze whether both conditions were equally effective using an equivalence testing procedure. In addition, this study was conducted in a research context. Therefore, no data about clinical settings or clinicians' acceptance were obtained. Moreover, no diagnostic interrater reliability was assessed, as the assessors were not always independent from the therapists. Finally, there were some adverse effects in some cases due to the duration of the exposure session (ranging from 62 to 180 minutes). Some participants in both conditions reported feeling tired, and some of the first participants in the ARE condition suffered from dizziness and back pain due to the AR 5DT HMD. Therefore, the AR HMD was immediately replaced by a pair of AR glasses (VR goggles, Vuzix). It should be noted that these symptoms were not severe and disappeared a few hours after exposure. In any case, the administration of the Simulator Sickness Questionnaire (SSQ) would be desirable. For future studies, we recommend taking breaks during the session if it is prolonged.
Regarding future lines of research, a promising line involves improving AR systems. In fact, AR systems that do not require the use of any visual device have been developed, such as the "Therapeutic Lamp" system [89], a technology based on interactive AR projection. The recent emergence of less invasive and more comfortable visual devices on the market at very affordable prices (e.g. "Google Glass" or "Oculus Rift") can facilitate the use of AR by mental healthcare professionals. Similarly, attempts have also been made to advance this field by combining the AR system with an AR serious game running on a mobile phone [88], or by using an AR system with children as a first step before in vivo exposure [84].
The results obtained in this study provide empirical evidence about the efficacy and participants' acceptance of AR for the treatment of specific phobias. The use of AR provides an additional option in administering exposure treatment for specific phobias and a new alternative for both patients and therapists, depending on their preferences. Finally, new research lines can be opened up, in order to define the best strategies to enhance the exposure treatment, reduce the recurrence of fear, and improve the acceptability of exposure-based treatments. According to Kazdin [13], it is necessary to consider new therapeutic models.