The accuracy of the frontal extent in stereoscopic environments: A comparison of direct selection and virtual cursor techniques

This experiment investigated the accuracy of distance judgment and perception of the frontal extent in a stereoscopic environment. Eight virtual targets were projected in a circular arrangement with two center-to-center target distances (18 cm and 36 cm) and three target sizes (0.6 cm, 1.5 cm, and 3.7 cm). Fourteen participants judged the positions of virtual targets presented at a distance of 90 cm from them by employing two different interaction techniques: the direct selection technique and the virtual cursor technique. The results showed overall higher accuracy with the virtual cursor technique than with the direct selection technique. It was also found that the target size significantly affected the frontal extent accuracy. In addition, significant interactions between technique and center-to-center target distance were observed. The direct selection technique was more accurate at the 18 cm center-to-center target distance along the horizontal (x) and vertical (y) axes, while the virtual cursor technique was more accurate for the 36 cm center-to-center target distance along the y axis. During the direct selection, estimations tended to converge to the center of the virtual space; however, this convergence was not observed in the virtual cursor condition. The accuracy of pointing estimations suffered on the left side of participants. These findings could provide direction for virtual reality developers in selecting proper interaction techniques and appropriately positioning virtual targets in stereoscopic environments.


Introduction
In recent years, the development of virtual reality (VR) has added a new dimension to human-computer interactions. In many applications of VR, a user interacts with virtual objects in a virtual environment (VE), such as for training in maintenance operations [1], manual assembly operations [2], and surgical simulations [3]. These applications mostly require observers to accurately interact with virtual objects at a particular distance. Among the various factors, interaction performance is evaluated on how accurately observers can judge the positions of the virtual objects in the VE. The observer can make distance estimates with either exocentric or egocentric distance references. Exocentric distance is the distance between two objects or points, and egocentric distance is the distance between the observer and the object. By using a metric of relative distance from the observer to the object, Cutting [4] classified space into three types: personal space, where the distance is within arm's reach; the action space, where the distance is about 1.5-30 m; and the vista space, which is beyond approximately 30 m. Spatial information such as distance, size, space, and relations is important in both the real world and VEs. Many recent studies have considered distance estimations and compared the accuracy of the estimates in the real world and VEs as their foundation of study [5,6]. Most of them have reported the accuracy of distance perception in depth planes, and only a few studies have been conducted in the frontal extent [7,8]. Generally, the perception of distance may vary; it can be accurate, underestimated, or overestimated. Whenever the distances are systematically reported to be over- or underestimated, the overall space within all three dimensions may be perceived as compressed or expanded. In many of the studies on depth perception, underestimation of depth in VE is often reported.
The insufficiency of the rendered VE [9], differences of judgment methods [10], and different viewing conditions [11] have been shown to contribute to distance compression. A holistic review by Renner, Velichkovsky [12] summarized the probable factors causing underestimation: technical factors, measurement methods, compositional factors, and human factors. Nevertheless, it is not completely understood why spaces are perceived as smaller in VEs, in contrast to the accurate perception of space in the real world.
One of the key issues in the perception of three-dimensional space is depth cues (i.e., sources of information about the spatial relations of the objects within the environment). A number of studies have shown the effects of depth cues on the perception of egocentric and exocentric distances (further review and analysis are available in [12,13]). Investigations of the effects of environmental contexts on distance perception [14,15] have reported that a continuous and homogeneously-textured ground surface was helpful for veridical distance perception. Another good example is research on the so-called action-specific effects on perception [16]. The results of that research indicated that the perception of various environmental and object properties, such as distance and object size, varies as a function of observers' ability and intent to act in the environment. For example, observers wearing a heavy backpack might perceive an object as farther away because they would have to expend greater energy to walk to the object. Generally, depth or distance cues are classified as monocular or binocular [17]. Some cues are monocular; optic inputs in one eye are sufficient to extract distance information (e.g., occlusion, relative size, and relative density). Other cues are binocular; information from both eyes is combined to perceive the distance (e.g., binocular disparity, convergence, and accommodation).
The positioning of objects in the frontal extent is also a vital focus of study because of its importance for many applications in VEs [18]. Moreover, compared to studies in depth planes, studies on the distance estimation of objects in the frontal extent show a large variety of results. Geuss, Stefanucci [19] reported accurate judgments of approximately 100% of inter-object distances in VE. However, a more recent study by Kelly, Hammel [20] found under-perception of distance in the frontal extent in a grass-scene VE condition, but confirmed accurate estimation in the room VE condition. Another study, focusing on inter-object distances in VE, observed overestimation [21]. The relationship between objects in the frontal extent of VEs is an important issue for various applications, and the related information should be provided accurately [22]. Therefore, the present study focused on evaluating the accuracy of exocentric distances and the resulting perception of space in the frontal extent of a VE.
In general, two types of methods are employed to judge distance in VEs: action-based and non-action-based judgment. Action-based judgment involves walking, pointing, reaching, or combinations of these tasks and requires an observer to view a target and then perform the action. This judgment method is perceived to have greater validity because the actions are usually related to actions performed in both the real world and VEs, such as walking through or interacting within spaces [23]. A verbal report, an example of a non-action-based judgment, requires observers to view a target and then verbally report the perceived distance. The verbal report can also be described as a conscious report or magnitude estimation of the distance to a target. It is well accepted that a verbal report is a simple way of measuring perceived distance; however, this type of report in the real world tends to be more variable and less accurate than action-based judgment [24].
Most previous studies have tried to estimate distance perception in VEs by using reporting methods such as perceptual matching techniques [25][26][27][28], verbal judgments [29,30], blind walking [31,32], triangulated blind walking [33], and timed imagined walking [34]. Recently, VR applications have become more interactive, allowing a user not only to visualize a 3D image but also to interact with the 3D objects in the VEs. For such interactions, researchers have investigated a 3D mid-air input technique [35][36][37] as an alternative to conventional 2D input devices (e.g., trackball, mouse, etc.) and touch screens [38]. Generally, the technique employs a 3D target and pointing hand movement as an input function for various systems requiring freehand or touchless computer interactions. The challenge is to determine whether such direct-hand pointing interactions have comparable performance to those of traditional 2D input devices.
A recent work by Lin and Woldegiorgis [39] investigated the performance of direct-pointing, wherein participants directly moved their hands to reach for real/virtual targets projected in front of (negative parallax) the VE projection display. The results revealed that participants tended to overestimate the depth by approximately 10 cm, and that the overestimation decreased as the depth increased. Moreover, the direct-pointing method was claimed to reduce the underestimation problem that is commonly reported in VEs. A similar method (direct-pointing) was employed to estimate the distance of virtual targets presented in the frontal extent at three depth levels [40]. The study concluded that compression occurred in the frontal plane and that observers tended to underestimate the depth at 100 cm and 150 cm. Bruder, Steinicke [41] and Swan, Singh [28] employed direct-reaching with the hand as a reporting method to reach virtual targets. In Bruder, Steinicke [41], a comparison of interaction techniques between 3D mid-air and 2D touch screen controls indicated no significant differences in the error rates of target selection in a stereoscopic environment. In Swan, Singh [28], a comparative study between direct-matching and blind reaching in estimating the positions of mixed real and virtual targets in a stereoscopic environment indicated that direct-matching was more accurate than blind reaching. In our study, these techniques (direct-pointing, matching, and direct-reaching) are considered direct interaction techniques.
In contrast to a direct interaction technique, which requires direct involvement with the object rather than communicating with an intermediary [42], an indirect interaction technique allows a user to use a physical control (e.g., sliders, joysticks, etc.) to control an icon (e.g., a virtual cursor, a virtual hand cursor) to perform a specific task [43]. Previous studies have applied the indirect interaction technique in VEs. Bruder, Steinicke [44] compared three different approaches for selecting virtual targets: direct input, distant input with a virtual offset cursor (white marker), and distant input with a virtual hand cursor (hand cursor). In the direct input approach, participants positioned the tips of their index fingers on the target, and in the virtual offset and hand cursor conditions, they moved a white marker or hand cursor to the virtual target. The results revealed that subjects made fewer errors in selecting a virtual target at different heights above a 3D tabletop setup when using an input with an offset cursor than when they used direct input and the offset hand. Poupyrev and Ichikawa [45] compared an interaction metaphor of a virtual pointer (the ray casting technique) and a virtual cursor (direct input with a virtual offset cursor) for object selection and repositioning tasks. The comparison showed that the virtual pointer (considered an indirect interaction technique) exhibited more accuracy in the selection of objects within reaching distance than did the virtual cursor (considered a direct interaction technique). A recent study by Deng, Geng [46] asked participants to position a ball-shaped object in a spherical area in a virtual space using head tracking or handheld controllers. These techniques allowed a virtual light ray emitted from the controllers to move the object. 
Studies on interaction techniques (direct and indirect interactions) and the extent to which this factor influences the accuracy of distance estimation are summarized in the following section (see Table 1 for an overview of studies on interaction techniques). Although interaction performance (distance accuracy) is as important as perception, it has not been studied as much as visual perception, and this is especially true of the effects of interaction techniques in VEs. Therefore, the present study investigated the effects of interaction techniques on the accuracy of distance estimation in a projection stereoscopic display. Moreover, even though some previous studies (Table 1) have analyzed the performance of interaction techniques in VE, they considered the techniques separately, focusing on either the direct or the indirect interaction technique. This study therefore attempted to compare direct interaction and indirect interaction techniques in terms of their effects on the accuracy of distance estimation. To provide comprehensive knowledge of the spatial information, distance estimation in the frontal plane needs to be evaluated.
In the present study, the accuracy of distance estimation in the frontal plane of a stereoscopic environment using two interaction techniques (direct-selection and virtual cursor) was studied. We evaluated these two interaction techniques in selecting a 3D stereoscopic object in front of the projection screen display. We used a standard selection task (ISO 9241-9) to determine differences in 3D object selection performance for varied sizes of the targets and varied center-to-center distances between targets. The results of this study should provide direction for the choice of interaction techniques, as well as the optimal sizes and distances between targets in stereoscopic environments.
Based on the results of previous related studies [6,8,40,41,44,49], the present study tested the following hypotheses:
H1: Interaction technique affects the accuracy of distance estimation in the VE, with the virtual cursor technique being more accurate than the direct-selection technique.
H2: Space is compressed (underestimated) in the frontal extent of the VE.
H3: The center-to-center distance accuracy is higher for both narrower inter-object distances and larger target sizes.

Methods
The purpose of the experiment was to investigate the accuracy of distance estimation and perception of the frontal extent in a stereoscopic environment, where two interaction techniques, namely, the direct-selection and virtual cursor techniques, were considered. The two interactions of the direct-selection and virtual cursor techniques were developed based on interaction terms between users and targets in VE by Mine [43].

Direct selection of virtual objects
3D interaction in VEs has been the focus of many research groups over the last few decades [50]. Direct interaction provides the most natural type of interaction with virtual objects [37]; however, this technique leads to confusion because the participant is touching an intangible object, or touching a void [51]. In addition to distance estimation being less accurate in VE than in the real world, direct interaction can also lead to double vision and vergence-accommodation conflicts [52]. Although this technique introduces visual conflicts, most results from similar studies agree that direct interaction could significantly improve the performance of object manipulation [53] because optimal performance may be achieved when visual and motor spaces are coupled closely [54,55].
Direct interaction is an impression of direct involvement with an object, rather than of communication with an intermediary [42]. In our study, in the direct interaction condition, participants estimated the distance by pointing to the outer surface of a spherical virtual target (Fig 1A). This reporting method is similar to one of the two methods employed in Napieralski, Altenhoff [10], wherein a physical arm was employed to estimate a target depth in VE. Since the distances considered in their experiment were within arm's reach, pointing by hand was sufficient. However, in the present study, we used a physical object (i.e., a pointing stick) with a marker attached to the tip to be detected by the tracking system.

Indirect selection of virtual objects
Indirect interaction requires conversion between input and output [42], such as a slider on a panel that controls the intensity of light; in this case, a user directly controls the intermediary slider and indirectly controls the light. In our study, for indirect interaction, participants estimated the distance by moving a virtual cursor controlled with a gamepad (Fig 1B). They needed to place the virtual cursor (represented by a hand cursor) at the center of the surface of the virtual target. In this type of interaction, both the virtual cursor and the virtual targets were displayed stereoscopically, eliminating the mismatch between the real and virtual objects. This approach may reduce the visual conflicts which commonly occur when a user is trying to select a virtual object with a physical object (e.g., the user's real finger). However, it is not clear whether the reduction of visual conflicts can improve overall selection task performance.

Participants
Fourteen participants, eleven males and three females aged between 23 and 31 years old (M = 24.64, SD = 2.06), were recruited. All participants self-reported normal or corrected-to-normal vision and right-hand dominance. Regarding experience with electronic systems, most of the participants were familiar with the input devices. Three of them had never held a gamepad, and the rest had operated gamepads before. None rated themselves as experts in their use. Among all the participants, only three had previous experience in virtual reality. Prior to the experiment, all participants had to pass a stereo vision test by viewing a virtual target projected 90 cm from them. All participants who qualified were invited to participate in the experiment. Written informed consent was provided to each participant prior to the experiment. The participants were given explanations and were aware of the aims of the experiment, and they volunteered to experience a virtual environment. The participants received neither any form of payment nor compensation with academic credit. The experiment was approved by the research ethics committee of National Taiwan University (NTU-REC No: 201209HS002).

Experimental variables and design
The experiment considered two interaction techniques, two center-to-center (c2c) distances between targets, and three target sizes. Therefore, a combination of three independent variables, namely, interaction technique (direct pointing and distant cursor), c2c distance (18 cm and 36 cm), and target size (0.6 cm, 1.5 cm, and 3.7 cm), were used and tested on all the participants (within-subject design). After being divided into two equal groups, the participants were assigned to start with one of the two techniques (direct pointing or distant cursor). Once the technique was chosen, the c2c distance and target size were varied randomly. In both the direct selection and virtual cursor conditions, the participants were presented with two levels of c2c target distance and three levels of target size. After completing all the trials in the first round of the experiment, the participants returned at least two days later to complete the other condition. The dependent variables were the accuracies of the x-position, y-position, and exocentric distance. Based on the results of the three accuracy measures, a perceived frontal extent in VE could be evaluated. In addition, the task completion time, defined as the time for the participant to complete the pointing task with eight targets, was measured.
The participants estimated the target positions by pointing to the center of the target surface with a pointing stick (direct selection technique) or by controlling the movement of a virtual cursor with a gamepad (virtual cursor technique). In the direct selection technique, the positions of the reflective marker (attached to the tip of the pointing stick) were tracked by six infrared cameras at a rate of 120 frames per second. In the virtual cursor condition, the positions of the virtual cursor and reference targets within the VE were recorded by a Unity 3D system. The data collected were then organized to analyze how close the estimates were to the corresponding reference targets. First, the positional estimate of each object along the x- and y-axes was collected, and the accuracies of the x- and y-positions were evaluated. Second, the positions of two consecutive (with respect to pointing order) targets were used to calculate the accuracy of the exocentric (c2c) distance. After that, the overall positional perception of the targets was used to evaluate whether space was compressed (underestimated) or expanded (overestimated) in the frontal plane. The accuracy was calculated following [39,56] from De, the participant's estimated position/distance obtained from the data recorded under the two techniques, and Da, the corresponding actual or reference position/distance. A value closer to one indicates a more accurate estimation.
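The accuracy measure described above can be sketched in a few lines. The sketch assumes the relative-error form 1 - |De - Da| / |Da|, which agrees with the statement that values closer to one are more accurate but is an assumption rather than a formula given in the text. The function names and the sample coordinates are hypothetical.

```python
import math


def accuracy(estimated: float, actual: float) -> float:
    """Assumed accuracy form: 1 - |De - Da| / |Da| (1.0 means a perfect estimate)."""
    if actual == 0:
        raise ValueError("reference value Da must be non-zero")
    return 1.0 - abs(estimated - actual) / abs(actual)


def exocentric_distance(p, q) -> float:
    """Euclidean distance between two (x, y) points in the frontal plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


# Hypothetical reference and estimated positions (cm) of two consecutive targets.
ref_a, ref_b = (-9.0, 0.0), (9.0, 0.0)
est_a, est_b = (-8.0, 0.0), (8.0, 0.0)

da = exocentric_distance(ref_a, ref_b)  # reference c2c distance
de = exocentric_distance(est_a, est_b)  # estimated c2c distance
print(accuracy(de, da))
```

Under this assumed form, an estimate that is shorter than the reference and one that is longer by the same amount receive the same accuracy, so the direction of the error (compression or expansion) has to be read from the signed difference De - Da.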

Experimental setup and stimuli
The experimental space was 4.6 m x 3.2 m x 2.5 m in size and partitioned by black curtains to create an excellent stereoscopic environment. The VE and stereoscopic targets were developed in Unity 3D. The targets were displayed in the order specified by ISO [57]. The eight targets appeared one at a time in sequence until all were displayed (Fig 2A). The VE was projected by a ViewSonic 3D projector onto a projection screen that was 130 cm wide x 100 cm high. The VE space was a uniform dark blue (Fig 2). The participants wore NVIDIA 3D glasses integrated with a 3D emitter to perceive stereoscopic vision. The stereoscopic targets were displayed at a distance of 90 cm from the participant. The participant and the projector were placed at a fixed position of 210 cm from the projection screen (as shown in Fig 3). The origins for reference measurement in both the direct selection and the virtual cursor conditions were set at the center of the projector, which was placed under the table (75 cm above the floor) and perpendicular to the participant's eyes.
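As an illustration of the circular target arrangement, the following minimal sketch places eight target centers on a circle and generates a presentation order. It assumes that the c2c distance equals the circle diameter (the distance between opposite targets) and that the order alternates across the circle, as is common in multi-directional tapping tasks; neither detail is stated in the text, and the function names are hypothetical.

```python
import math


def circle_targets(c2c_cm: float, n: int = 8):
    """Centers of n targets evenly spaced on a circle of diameter c2c_cm (assumed)."""
    r = c2c_cm / 2.0
    return [
        (r * math.cos(2 * math.pi * i / n), r * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


def alternating_order(n: int = 8):
    """Assumed order: jump to the opposite target, then advance by one position."""
    half = n // 2
    order = []
    for i in range(half):
        order.extend([i, i + half])
    return order


targets = circle_targets(18.0)
order = alternating_order()
print(order)
```

With this layout, only the jumps between opposite targets span the full c2c distance; the return jumps are slightly shorter chords of the circle.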
The pointing task in the direct selection technique was performed using a light wooden stick of length 80 cm. The material was carefully selected so that the weight of the stick would not affect participants' pointing posture or performance. A 0.6 cm reflective marker was attached to the tip of the stick. A wireless remote control for confirming pointing judgment was also attached to the lower end of the stick. In the virtual cursor technique, a dual analog gamepad was used to control a virtual cursor along the x-, y-, and z-axes within the VE. The gamepad's left analog stick controlled a virtual cursor in two-degrees-of-freedom (DoF) translation (up and down, left and right), including diagonal movements. The gamepad's right analog stick was used to control the depth (forward and backward) of the virtual cursor. The size of the virtual hand cursor was scaled approximately to the size of each target to provide good visualization when touching the target. In both the direct selection and virtual cursor conditions, the initial positions of the tip and virtual cursor were kept fixed. Using the center of the projector as the reference of measurement, the relative positions of the initial points of the tip and the virtual cursor were about 82 cm diagonally to the left with respect to the origin. The position of the cursor was reset to the initial point for each trial. Applying more force on the gamepad resulted in higher speed. Prior to conducting the experiment, a preliminary experiment was conducted to obtain the optimum sensitivity value of the gamepad. The sensitivity was set to the value that enabled the user to control the virtual cursor for precise and fast movement. We optimized the gamepad control with respect to the gain between exerted force and virtual cursor speed. The resulting average sensitivity value was set to approximately 2 m/s.
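The gamepad mapping described above can be sketched as a per-frame update. The sketch assumes a linear gain between stick deflection and cursor speed and treats the reported sensitivity of about 2 m/s as the speed at full deflection; both are assumptions, and the names `Cursor`, `step`, and `MAX_SPEED_CM_S` are hypothetical rather than taken from the experiment's Unity implementation.

```python
from dataclasses import dataclass

MAX_SPEED_CM_S = 200.0  # about 2 m/s, assumed to be the speed at full stick deflection


@dataclass
class Cursor:
    x: float  # horizontal position (cm)
    y: float  # vertical position (cm)
    z: float  # depth position (cm)


def clamp(value: float, low: float = -1.0, high: float = 1.0) -> float:
    """Limit a stick deflection to the range [-1, 1]."""
    return max(low, min(high, value))


def step(cursor: Cursor, left_x: float, left_y: float, right_y: float, dt: float) -> Cursor:
    """Advance the cursor by one frame of duration dt seconds.

    The left stick drives x and y (diagonal movement when both are deflected);
    the right stick drives depth. Speed grows linearly with deflection (assumed).
    """
    return Cursor(
        cursor.x + MAX_SPEED_CM_S * clamp(left_x) * dt,
        cursor.y + MAX_SPEED_CM_S * clamp(left_y) * dt,
        cursor.z + MAX_SPEED_CM_S * clamp(right_y) * dt,
    )


moved = step(Cursor(0.0, 0.0, 0.0), left_x=1.0, left_y=0.0, right_y=0.0, dt=0.5)
print(moved)
```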
Participants pressed the "X" button on the gamepad grip to confirm the pointing judgment. For both conditions, no visual feedback (e.g., change in color or shape) was given when the target was chosen, other than the appearance of the next target.

Procedure
The participants completed all the trials of the experiment in two sessions, one for the direct selection technique and the other for the virtual cursor technique, separated by at least two days to minimize the effects of learning and fatigue. Prior to the experiment, the participants completed a consent form detailing the purposes and procedures of the task. An equivalent verbal explanation was also given while their interpupillary distances (IPDs) were measured.
A fixed chinrest was placed on the tabletop for the participants to rest their chins to minimize the possibility of differences in viewing perception due to head movements. The participants were asked to familiarize themselves with the experimental setup; i.e., they were instructed to view the VE and to obtain a clear image of the virtual targets. The experimenter introduced the procedure and showed them how to employ the interaction technique to be used in the first session. Then the participants practiced two or three trials until they completed a trial without procedural errors. The first experimental session for the selected technique followed the practice session. Each participant completed a total of 96 trials (2 techniques x 3 target sizes x 2 c2c distances x 8 targets) in two sessions of about 30 minutes each.
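The trial count follows directly from crossing the factor levels. A minimal sketch that enumerates the within-subject design (the variable names are illustrative):

```python
from itertools import product

TECHNIQUES = ("direct selection", "virtual cursor")
TARGET_SIZES_CM = (0.6, 1.5, 3.7)
C2C_DISTANCES_CM = (18, 36)
TARGETS_PER_CIRCLE = range(8)

# Every combination of technique, target size, c2c distance, and target index.
trials = list(product(TECHNIQUES, TARGET_SIZES_CM, C2C_DISTANCES_CM, TARGETS_PER_CIRCLE))
per_session = [t for t in trials if t[0] == "direct selection"]

print(len(trials), len(per_session))  # 96 48
```

Each session therefore covers 48 of the 96 trials, one technique at a time.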

Results
Repeated-measures ANOVA was used to evaluate the accuracies of the x-position, y-position, and exocentric distance for the three independent variables. Corresponding to the accuracies of the x-position, y-position, and exocentric distance, the perception of virtual space in the frontal extent could also be evaluated.

Accuracy of x-position, y-position and exocentric distance
The accuracy of the x-position was significantly higher in the virtual cursor condition [...] Fig 5A, the direct selection technique was more accurate at the 18 cm distance than at the 36 cm distance between targets. For the virtual cursor technique, the opposite trend was found: judgments were more accurate for the c2c distance of 36 cm than for that of 18 cm. With respect to the accuracy of the x- and y-positions, the direct selection technique was more accurate for the narrower inter-object distance, while the virtual cursor technique was more accurate for the wider inter-object distance in the y-position (vertical axis). However, for the x-position (horizontal axis), there were no significant differences in accuracy between the 36 cm and 18 cm inter-object distances when the virtual cursor technique was used. Regarding target sizes, accuracy was marginally higher for the largest target size (3.7 cm) with both techniques than for the other two sizes.

Perception of virtual space in the frontal extent
The pointing estimations with the two interaction techniques with respect to targets with c2c distances of 36 cm and 18 cm are shown in Fig 6. It can be observed that, with direct selection, the x- and y-positions were underestimated at the wider c2c distance (36 cm): the points were concentrated close to the center (Fig 6A-Fig 6C). The figures also show underestimation of the exocentric distance in the direct selection condition. The overall direct selection estimations for the narrower c2c distance, 18 cm, were systematically shifted slightly to the right with respect to the reference targets (Fig 6D-Fig 6F). Therefore, the plotted pointing estimations indicate that, when the distance between objects was wider and the direct selection technique was employed, the frontal view distance estimations were compressed.
Considering the target sizes shown in Fig 6D-Fig 6F, the estimations of positions with both techniques for the smallest target size (0.6 cm) were more dispersed than those for the other two sizes (1.5 cm and 3.7 cm). Consequently, smaller target sizes and narrower c2c distances led to less accuracy with both techniques.
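One simple way to quantify the convergence toward the center seen in such plots is to compare how far the estimates and the reference targets lie from the center of the display. The sketch below is an illustrative index under that assumption, not the analysis used in this study; the function names and sample coordinates are hypothetical.

```python
import math


def mean_radius(points, center=(0.0, 0.0)) -> float:
    """Mean distance of (x, y) points from the given center."""
    return sum(math.hypot(x - center[0], y - center[1]) for x, y in points) / len(points)


def extent_ratio(estimates, references, center=(0.0, 0.0)) -> float:
    """Ratio below 1 suggests compression toward the center; above 1, expansion."""
    return mean_radius(estimates, center) / mean_radius(references, center)


# Hypothetical reference targets on a 36 cm circle and estimates drawn inward (cm).
references = [(18.0, 0.0), (0.0, 18.0), (-18.0, 0.0), (0.0, -18.0)]
estimates = [(15.0, 0.0), (0.0, 15.0), (-15.0, 0.0), (0.0, -15.0)]

ratio = extent_ratio(estimates, references)
print(ratio < 1.0)  # True: the estimates are closer to the center than the targets
```

A uniform sideways shift of all estimates, such as the rightward shift reported for the 18 cm condition, would change this ratio only slightly, so it would need to be assessed separately from the mean signed x-offset.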

Task completion time
Task completion time was measured from the moment the participants judged the first target until they completed the task and all eight targets had appeared. The c2c distance exhibited [...] The pairwise contrast was significant (p < .005) for the smallest target size. The results revealed that the participants took significantly longer with the virtual cursor technique than with the direct selection technique when targets were displayed with the smallest target size (0.6 cm).

Discussion
This study evaluated the accuracies along the x- and y-directions and also the exocentric distance of the frontal extent of the stereoscopic environment with two interaction techniques, two c2c target distance levels, and three target sizes. The positions of pointing estimations were then used to analyze the observers' perceptions of the frontal extent.

Accuracy of x-position, y-position and exocentric distance
The accuracy results for the x- and y-positions showed that the interaction technique played an important role: the virtual cursor technique was more accurate than the direct selection technique. This result is in line with the finding of a previous study [44], which revealed a smaller error rate for a virtual cursor than for direct pointing in target selection under a stereoscopic environment. Although different techniques and different 3D displays were employed in Bruder, Steinicke [44], one of the main findings was that the direct selection technique was less accurate than the virtual cursor technique. A plausible explanation for this difference is the visual conflicts or misperceptions apparent in the direct pointing technique, where the virtual target appears blurred, in contrast to the pointing stick, which appears sharp [58]. However, in the present study, the participants reported no visual conflict issues when the direct selection technique was employed.
In the virtual cursor condition, a virtual hand cursor was placed on the outer surface of the spherical virtual target. However, occlusion would occur if the virtual cursor passed through the virtual target, and the cursor always occluded the spherical virtual target; thus, the participant might realize that the cursor had passed behind the target. Although no visual feedback was given (e.g., color or shape change) when the cursor touched the target, the appearance of the hand cursor could serve as a visual cue that contributed to the higher accuracy of the x- and y-positions as compared to the direct selection technique. Another visual cue that might have affected the accurate judgment of distance in the virtual cursor condition was the relative size cue. Since the size of the virtual hand cursor was scaled approximately to the size of each target, as the hand cursor moved towards the target, the user could compare the sizes of the cursor and target until they were similar, which might have indicated that they were at the same depth. In other words, when the hand cursor was smaller than the target, the cursor was farther from the target. Thus, this relative size cue might have resulted in more accurate selection.
Contrary to expectations, the present study did not find a significant difference between narrower (18 cm) and wider (36 cm) c2c target distances in the accuracies of the x-position, y-position and exocentric distance. This result is inconsistent with those of previous studies, such as Lin, Woldegiorgis [6], which reported a marginally significant difference at the shortest distance of 10-20 cm between targets. This difference might be accounted for by the differences in the judgment methods and the experimental tasks. In the experiment of Lin, Woldegiorgis [6], perceptual matching by sketching the distance between two real and virtual targets was used. In contrast, the present experiment used a reciprocal tapping task with the more direct interaction of reaching/pointing (direct selection and virtual cursor) to the target referent, so the nonsignificant differences in the accuracies obtained in the present experiment could have been partly due to the consistency of accuracy obtained by the judgment methods. We also speculate that using only two fixed c2c distances between targets contributed to the nonsignificant differences in accuracy. A future study with a greater focus on c2c target distance, a factor which may have contributed to the difference in distance accuracy, is therefore suggested.
Another factor that might have influenced the accuracy of the x-position, y-position, and exocentric distance was target size. We found that estimations for the largest target (3.7 cm) were marginally better than those for the medium (1.5 cm) and small (0.6 cm) targets. These results confirm that target size affected the estimation, with accuracy increasing with the target size [59]. Moreover, these findings provide further support to the idea that accurate distance estimation with respect to target size in the real world can also be generalized to VEs. The overall estimation accuracies of the x-position, y-position, and exocentric distance were approximately 93% for the virtual cursor technique and 85% for the direct selection technique.
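Percentages such as these can be read against a simple relative-error definition of accuracy. The sketch below assumes accuracy = 1 - |estimate - actual| / actual; this definition, the function names, and the sample values are assumptions made for illustration and are not taken from this section.

```python
from typing import Sequence

def relative_accuracy(estimated: float, actual: float) -> float:
    """Assumed accuracy metric: 1 - |estimated - actual| / |actual|."""
    if actual == 0:
        raise ValueError("actual must be non-zero")
    return 1.0 - abs(estimated - actual) / abs(actual)

def mean_accuracy(estimates: Sequence[float], actual: float) -> float:
    """Average of the assumed accuracy metric over repeated estimates."""
    if not estimates:
        raise ValueError("estimates must not be empty")
    return sum(relative_accuracy(e, actual) for e in estimates) / len(estimates)

# Hypothetical exocentric distance estimates (cm) for a 36 cm c2c distance.
sample_estimates = [33.0, 34.5, 37.0, 32.5]
print(f"mean accuracy: {mean_accuracy(sample_estimates, 36.0):.1%}")
```

With this definition, over- and underestimates of the same magnitude are penalized equally, so the metric alone does not reveal whether the space was compressed or expanded.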
This study is distinctive because it has combined x-position, y-position, exocentric distance, and interaction technique, unlike previous VR studies. Despite differences in the distance reporting techniques and measurement accuracy of the previous studies [39,44,59] and the present study, the results confirm that direct selection of virtual objects can be used as a potential response method [60,61] in stereoscopic environments. In addition, the virtual cursor technique has the potential for high accuracy of virtual object estimation [44].

User perception of the frontal extent
In this study, the perception of the frontal extent was also analyzed based on the pointing estimations. From Fig 6, it can be observed that pointing estimations were concentrated at the center, particularly with the direct selection technique at the 36 cm c2c distance. This finding is in substantial agreement with recent studies [20,40,62] showing that the user's frontal extent perception is smaller than the actual space in a stereoscopic environment. However, the compression in the frontal extent was less visible for the virtual cursor technique. This difference may be explained by the fact that in this study, both the cursor and the targets were displayed stereoscopically, so the virtual space compression, which could have been induced by the variation of the references in the stereoscopic environment, may have been reduced [44]. Another possible reason is the familiar size of the objects and a higher sense of presence, particularly in the virtual cursor condition. If the size of an object (i.e., a hand cursor) is familiar to the observer, absolute distance information is available. A study by Interrante, Ries [63] not only added familiar objects to an unfamiliar environment but also used a virtual replica of a room that the participants had seen before as the virtual environment. The results revealed that the participants did not underestimate the virtual distances. In a follow-up study, the participants underestimated distances in both enlarged replica and shrunken replica conditions [64]. The authors concluded that the good estimates might have been due to a higher sense of presence. These findings are also supported by Sun, Li [65], who found that increased involvement of the users due to visual cues (i.e., avatars) can improve user performance in virtual environments. The described studies [63-65] have consistently shown that adding familiar objects to enhance a sense of presence can improve distance estimations.
In the present study, although the virtual environment was presented in a sparse environment (i.e., spheres in front of a dark blue background) for both conditions, only the virtual cursor condition contained a familiar size cue (i.e., the hand cursor), and its appearance might have enhanced the sense of presence during the pointing to or selection of the target compared to the direct selection condition. Thus, these findings suggest that a low sense of presence in the direct selection condition might be a potential cause of underestimation. However, at this stage, it might be difficult to conclude that direct interaction by reaching or pointing to the target can cause underestimation of the frontal extent, for few comparative studies have applied this direct interaction technique. Further experimental investigations are needed to answer the question of whether the underestimation of the frontal extent is due to the low sense of presence in stereoscopic environments.
Nevertheless, the compression of space perception was not so apparent for the narrower c2c (18 cm) distance with the two techniques. Therefore, further experimentation with the user's virtual space perception in the frontal extent for various ranges of c2c distance might be needed to fully understand the effect of the interaction of technique and c2c distance. In the present study, the effect of depth (z-axis) on x- and y-position accuracy was not specifically studied. Although the participants were instructed to select the target as accurately as possible within the same frontal plane, the pointing inaccuracy on the z-axis could have played a part in the variation of estimations. Further studies taking this variable into account will be needed to clarify the possible effect of depth on frontal plane estimations.
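The compression of the frontal extent discussed above could be quantified, for instance, as the ratio between the mean distance of the pointing estimations from the center of the target layout and the mean distance of the actual targets from that center. The following sketch is an assumption-laden illustration, not the analysis used in this study: the ratio itself, the function names, the circle radius, and the simulated estimations are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position in the frontal plane, in cm

def centroid(points: Sequence[Point]) -> Point:
    """Mean x and mean y of a set of points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def mean_radius(points: Sequence[Point], center: Point) -> float:
    """Mean Euclidean distance of the points from a given center."""
    return sum(math.hypot(p[0] - center[0], p[1] - center[1]) for p in points) / len(points)

def compression_ratio(targets: Sequence[Point], estimates: Sequence[Point]) -> float:
    """Hypothetical index: values below 1 indicate estimations pulled towards
    the center of the layout, values above 1 indicate expansion."""
    center = centroid(targets)
    return mean_radius(estimates, center) / mean_radius(targets, center)

def circular_targets(n: int, radius_cm: float) -> List[Point]:
    """n targets evenly spaced on a circle centered at the origin."""
    return [(radius_cm * math.cos(2 * math.pi * k / n),
             radius_cm * math.sin(2 * math.pi * k / n)) for k in range(n)]

# Eight targets as in the study; the 18 cm radius is illustrative only.
targets = circular_targets(8, 18.0)
# Simulated estimations, each pulled 10% of the way towards the center.
estimates = [(0.9 * x, 0.9 * y) for x, y in targets]
print(f"compression ratio: {compression_ratio(targets, estimates):.2f}")
```

An index of this kind separates a systematic pull towards the center from unsystematic scatter, which a mean accuracy value alone cannot do.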
The estimations of positions with both techniques for the 18 cm c2c distance (Fig 6D to Fig 6F) appeared to be more dispersed than those for the 36 cm c2c distance (Fig 6A to Fig 6C). As shown in Fig 6E and Fig 6F, estimations with the virtual cursor technique at the narrower (18 cm) c2c distance shifted slightly to the left side of the reference target, while the estimations with the direct selection technique moved towards the right side of the reference target (Fig 6D to Fig 6F). This finding seems to be consistent with a recent study [40] showing that direct pointing estimation accuracy suffers more on the left side of the participants. A plausible explanation for this is the orientation of the interaction technique (direct selection by reaching/pointing to a target), since all the participants were right-handed [40]. The pointing direction of the left-side targets was from the right side of the participant, which would lead to a movement towards the center of the participant's body. Comparing the corresponding accuracies with those of left-handed participants would provide more information that could help explain the difference. In addition, the fact that the experiment had only two c2c distance levels and that the shifting estimations only occurred in the 18 cm condition may have contributed to the inaccuracy of the user's perception. A future study should include a greater range of c2c distances to determine whether the present finding of compression of perception in the frontal space can be generalized.
The direct selection technique estimations of x-position (horizontal) and y-position (vertical) of the later projected targets were more accurate than those of the first target (Fig 6). This result may be related to the result from Richardson and Waller [66], who reported that the exocentric distance judgment of participants was improved by the learning effects from training. Since all participants started by pointing at the first target, their judgment could be expected to improve for the later targets.

Conclusion
In this study, we attempted to compare selection task performance and perception of the frontal extent in a stereoscopic display environment using two interaction techniques: direct selection and a virtual cursor. Selection behavior and performance using direct or indirect interaction techniques have been the concern of several previous works. In addition, this issue is important in areas such as human-machine interaction research. Indeed, virtual reality is now becoming more established and promising, and recent developments have led to applications allowing users to directly perceive and interact with 3D objects rather than just watching a 3D model. The underlying belief of current virtual reality research is that this will lead to more effective human-machine interfaces. A comprehensive understanding of users' interactions in virtual environments is therefore needed. To develop this understanding, we performed a comparative study of two different interaction techniques under exactly the same experimental conditions.
In particular, this study was conducted to investigate accuracy in the frontal extent and the corresponding perception of the frontal extent in a stereoscopic environment, where virtual targets of three different sizes were presented at two center-to-center distances and two interaction techniques (direct selection and a virtual cursor) were employed. The results revealed that the accuracies of the x-position, y-position, and exocentric distance in the frontal extent were all affected by the interaction technique. The accuracies were significantly higher for the virtual cursor technique than for the direct selection technique.
Similarly, the results showed that the target size affected the accuracies of the x-position, y-position, and exocentric distance. Accuracy was relatively higher in the large (3.7 cm) target size condition than in the medium (1.5 cm) and small (0.6 cm) target size conditions. There was also an interaction effect between center-to-center distance and technique in the accuracies of the x-position and y-position. In the direct selection condition, it was observed that the accuracies of the x-position and y-position improved as the separation of targets decreased. On the other hand, in the virtual cursor condition, the accuracy of the y-position improved as the separation between targets increased. However, no significant differences in the accuracy of the x-position were found between the two center-to-center distances.
Generally, space perception in the frontal extent of the stereoscopic environment was found to be compressed (concentrated towards the center) with respect to the target references, as observed from the plots of the pointing estimations. The findings of this study are important for virtual reality application designers and consumers. Developers of manufacturing simulations could potentially improve user accuracy by considering the observed systematic underestimations in the frontal extent. Moreover, the frontal extent was found to be underestimated with the direct selection method in the wider (36 cm) center-to-center distance condition, whereas the virtual cursor might have contributed to a reduction of the underestimation. These results thus provide critical information for deciding which interaction techniques can yield better estimation, how virtual targets can be appropriately positioned to enhance users' performance, and what the relevant target sizes in the frontal extent are so as to reduce judgment errors in stereoscopic environments.
Since accuracy is important in both real-world and virtual environments, the findings of this study imply the necessity of maintaining appropriate levels of both the distance between targets and the size of the virtual targets. Generally, if a direct selection technique is used, the separation between targets should be kept as small as possible, while the target size should be as large as possible. If, on the other hand, a virtual cursor is used, a wider range of size and distance between targets can be accommodated, and the associated accuracy may be relatively higher. Therefore, the virtual cursor technique had an overall relative advantage over the direct selection technique with respect to accuracy under the given experimental conditions. Furthermore, to assist distance perception as well as possible, it might be important to provide sufficient depth cues by displaying a rich virtual environment containing a texture or gradient background, and by adding objects of familiar sizes to the virtual scene to provide a good sense of presence. The findings of our study should be useful for virtual reality developers in making various decisions, especially in the design of applications such as 3D medical surgery training and manufacturing, where the surgeons and engineers are primarily concerned about the accuracy and precision of interactions.
Finally, it is important to note that this experiment only addressed task performance in a projection-based stereoscopic display environment, using a particular tracking system and input device, and over a particular range of distances between targets (exocentric distance). Understanding the generalizability of the results will require replication of the methodology across a range of virtual environments, displays, and spatial (egocentric or exocentric) configurations.