Training Enhances the Interference of Numerosity on Duration Judgement

The interference of magnitudes in different dimensions has been demonstrated previously, but the effect of training in one dimension on judgment of another has yet to be examined. The present study aimed to investigate the effect of training in numerosity judgment on judgment of duration. 32 participants took part in two sessions, 12 days apart, and had to judge which of two successive sets of items was presented longer. Half of the participants (training group) were additionally trained in 11 sessions to judge which one of the two successive sets of items was more numerous. It was found that the participants in the training group became more prone to the interference of numerosity on judging duration after training, when compared to the control group. Thus, being trained to more easily perceive the difference in number of items in the two sets affected the perception of duration. On the 3-month follow up session, no effect was found with 20 participants (n = 10 for each group). These findings indicate that the interference of magnitudes in different dimensions can be modulated by training. We discuss that this modulatory effect might be due to neural changes in shared brain regions between interfering magnitudes and/or is mediated by higher levels of perception.


Introduction
Perception of magnitudes in different dimensions, e.g. quantity, length, duration, speed, brightness, weight etc., is ubiquitous across most animal species and all phases of human life [1][2][3][4][5]. The ability to process magnitudes in one dimension or another is developed early in life. Even pre-linguistic infants, being only 4.5 to 8 months of age, show some numerical competence [6,7]. They are able to discriminate between stimuli consisting of 1, 2 and 3 items and they can even perform basic arithmetic, such as 1+1 = 2 as well as 3-1 = 2 [8,9]. These processing abilities continue to develop throughout adolescence and adulthood due to training and environmental changes [10].
Numerous studies have shown interference and association of one dimension with another, suggesting possible commonality(s) on the neural and/or perceptual level. The ability to record numbers and time has been demonstrated in animals by Meck & Church [11]. They showed that the same mechanism is involved for counting and timing. Additionally, electrophysiological and neuroimaging studies have identified a common network of brain regions processing numbers and time [12]. This evidence has been interpreted as indicative of a common neuronal structure for magnitudes. Based on previous findings of interactions between magnitudes, Walsh [13,14] proposed that magnitudes are interconnected within shared brain areas, which he referred to as a generalised magnitude system, proposing 'A theory of magnitude' (ATOM). According to this theory, the shared properties of dimensions such as space, time and quantity and their close cooperation in interaction with spatial and temporal structure of the external world, are suggestive of a common brain area, namely the parietal cortex. Therefore, magnitudes can sometimes interfere with each other leading to misperceptions of one dimension or another, as investigated by this study.
An intuitive 'more A-more B' mapping between different dimensions has been proposed by Stavy & Tirosh [15], suggesting, for example, that the bigger a train is, the faster it is perceived to be. Horne and Turnbull, as well as Lechelt and Nelson showed that an increasing or decreasing number of lights [16] or a set of serially presented lights [17] is perceived more numerous if presented for a longer duration. Furthermore, Xuan, Zhang, He, & Chen [18] found that irrelevant magnitude information, such as size, luminance, and numerosity, can affect temporal judgements. Using Stroop-like paradigms, they found that stimuli with larger magnitudes in nontemporal dimensions were perceived as being presented longer. Another study by Oliveri, Vicario, & Salerno [19], who used a time estimation task, found that high digits lead to an overestimation, whereas low digits lead to an underestimation of perceived duration. Thus, a temporal duration judgement can be biased by a number's magnitude. Dormal et al [20] had subjects compare two successive series of flashing dots, and found that numerical cues interfered with the duration processing, but temporal cues did not interfere with numerosity processing. A recent study by Javadi and Aichelburg [21], in contrast, identified a reciprocal relation between judgement of duration and numerosity. Their results showed that a set of items was perceived as being more numerous when it was presented for a longer duration, and vice versa, i.e. a set was perceived as being presented longer, when it contained more items.
Considering the inconsistency in the literature, the relationship between these two dimensions necessitated further investigation using a method that allows modulating the perception of one dimension to examine whether this also modulates the perception of another dimension.
To the best of our knowledge, the effect of training on the interference of two magnitudes has not yet been examined that may provide valuable insights into the relation of two dimensions. All of the mentioned studies were performed on participants with no prior training on judgement of the certain dimension of interest. The present study aimed to investigate the effect of training in one dimension (numerosity) on judgement of another dimension (duration). All participants took part in two sessions. They had to judge which of two successive sets of items was presented longer. In between these two sessions, half of the participants (training group) were additionally trained to judge which one of the two successive sets of items was more numerous. We anticipated that training would facilitate participants to perceive the changes of the interfering dimension more readily, therefore leading to increased interference, when compared to the untrained, control condition. Additionally, we ran a follow-up session after 3 months to investigate the lasting effects of training on the interference of numerosity on judgement of duration.

Participants
Thirty-two (17 females, 18-20 years old) subjects took part divided into two experimental groups: training (n = 16) and control (n = 16). All the participants were healthy with no history of neurological or psychiatric disorder, right handed, with normal or corrected-to-normal vision, and were naive to the purpose of the study. One participant in the training group dropped out due to illness during the training period. After 3 months, all 31 participants were invited to a follow-up testing session to study the persisting effects of the training. Twenty of them (10 in each group) took part in this testing session. Figure 1(a) shows the design of the experiment and the order of the three sessions. All participants gave a written, informed consent in accordance with the Declaration of Helsinki. The study protocol was approved by the ethics committee of University College London (UCL).

Apparatus
Experiments were run on desktop computers with a 17-inch CRT monitor and 75 Hz refresh rate with a resolution of 10246768 pixels. The monitor was placed 53cm from the participants' eyes. Stimuli presentation and the recording of response time were accomplished using MATLAB (v7.5; Math-Works Company) and the Psychtoolbox v3 [22,23]. Data analyses were performed using Palamedes toolbox for MATLAB [24] and SPSS (v17.0; LEAD Technologies, Inc.). Responses were made on a conventional computer keyboard using index and middle fingers of the participants' right hand.

Stimuli
Stimuli were sets of items consisting of the image of a synthetic ball placed in random locations within a 25.32619.12 visual degrees virtual rectangle at the centre of the monitor on a black background. The items were a solid yellow sphere with a mild shading created by 3DS Max (Autodesk) and 1.6161.61 visual degrees. In order to avoid overlap and to assure distance between items, they were set apart by at least 5/2 of their radius (measured from their centres).

Design
The study adopted a mixed-factor design with three testing sessions (1 st , 2 nd and follow-up testing sessions). The retention interval between the 1 st and 2 nd testing sessions was 12 days and 3 months between the 2 nd and the follow-up testing session. Participants were randomly assigned to either group: training or control. All participants took part in two sessions, on Saturday and Friday in the time span of two weeks (12 days apart). During these 12 days, half of the participants (training group) were additionally trained for 11 sessions. Training sessions began on the Sunday after the first testing session with only one training session on the following weekend. Figure 1(a) shows the timing of the 3 testing and 11 training sessions.
In the testing sessions, participants had to compare the duration of presentation in two consecutively shown sets of items and select the set that was presented longer (duration judgement), whereas during the training sessions they had to compare the number of items in the two consecutive sets and select the more numerous one (numerosity judgement). Two independent variables, namely the duration of presentation of each set (t 1 and t 2 ) and the number of items in each set (n 1 and n 2 ), were modified. Trials in the testing sessions were either 'veridical' or 'phantom'. Veridical trials were the trials in which the number of items in the two sets were the same, but the durations of presentation were different (n 1 = n 2 = 28, t 1 ? t 2 and t M {53 ms, 66 ms, 80 ms, 93 ms, 106 ms}). Phantom trials, on the other hand, were the trials in which, the number of items in the two sets were different, while the durations of the presentation were identical (n 1 ? n 2 , t 1 = t 2 = 80ms and n M {28, 31, 34, 37, 40}). The values for n and t were selected based on Javadi and Aichelburg [21].
Training sessions composed only of trials in which the number of items in the two sets were different while keeping the durations of presentation constant (n 1 ? n 2 , t 1 = t 2 = 80ms and n M {28, 31, 34, 37, 40}).
Javadi and Aichelburg [21] showed that total occupied area, size, and density do not affect the judgement of the participants in this task, considering the current presentation magnitudes of duration and numerosity. Therefore, we did not control for the total occupancy of the items to keep the stimuli as simple as possible not to distract the participant with complexity of the stimuli.

Procedure
Testing sessions were composed of eight blocks of 80 trials (8 repetitions per absolute value of difference level |n 2 -n 1 | and |t 2t 1 |, see below), resulting in 320 veridical and 320 phantom trials in total. Although the trials in which both n 2 -n 1 = 0 and t 2 -t 1 = 0 were the same in between veridical and phantom trials, we included separate trials for the two types of trials to keep the number of samples in all conditions equal. Training sessions did not include any veridical trial, therefore the total number of trials were half of the trials in testing sessions. The procedure of one trial is shown in Figure 1 After each block, feedback was given based on the participant's performance on the veridical trials. Participants were instructed to respond as accurately and as quickly as possible, within the response period. Participants were also asked to keep their gaze point at the centre of the monitor at all times.

Statistical Analysis
Performance and response times were recorded. Performance refers to the percentage of selecting the first set. A logistic psychometric function, P(k) = 100/(1+ exp(-b (k -a))), was fitted to performance and mean response time over k = t 1 2 t 2 (9 levels) for veridical trials and k = n 1 2 n 2 (9 levels) for phantom trials for each participant and for each testing session. Two free parameters were used for curve fitting: a is the point of maximum growth, indicating the point of subjective equality (PSE) and b is the growth rate, indicating the sensitivity to different magnitudes. The logistic function has been widely used to describe psychometric functions. One of its main advantages is independency of the two parameters of PSE and sensitivity.
Four separate 262 mixed-factor analysis of variances (ANOVA) with testing session number (1 st /2 nd session) as within subject factor and group (training/control) as between subject factor, were conducted on the two dependent variables (a and b) for phantom and veridical conditions. Post-hoc Bonferroni corrected two-tailed independent sample t-tests were run to compare the performance of the two groups in the two testing sessions.
Additionally the performance and response time of the participants in the follow-up session in the two groups were compared using two-tailed independent sample t-test. As about one-third of the whole sample was unable to participate in the follow-up session, we did not run a large 362 mixed-factor ANOVA with sessions (1 st /2 nd /follow-up session) as within subject factor and group as between subject factor.
The performance of the participants throughout the training sessions was also analysed. Two two-tailed paired-samples t-tests were conducted on the performance accuracy and response time of the participants in the first and the last training session. Data was tested for normality.

Results
A 262 mixed-factor ANOVA with group and session number as independent factors and b for veridical trials as dependent factor showed no significant difference in any of the comparisons (F ,1). Figure 2 shows the performance of the participants for veridical trials.
Post-hoc Bonferroni corrected independent-sample t-tests on the b values of the two groups in the two sessions showed no significant difference for the first session (t(29) = 0.41, p = 0.68) but a significant difference for the second session (t(29) = 2.68, p = 0.01).
a values were subjected to similar 262 mixed-factor ANOVAs for veridical and phantom conditions. These analyses showed no significant difference in any of the comparisons (F ,1). Table 1 shows the mean and standard deviation (SD) of a and b for different conditions and groups over the two testing sessions.
Similarly the response times were analysed using 262 mixedfactor ANOVAs. None of the comparisons were significant (F ,1). Table 2 shows the mean and standard deviation (SD) of response times for different conditions and groups over the two testing sessions.
Performance of the participants in the follow-up session were also subjected to two independent sample t-tests for a and b. This analysis showed no significant difference for a (t(18) = 0.58, p = 0.57) and no significant difference for b (t(18) = 1.63, p = 0.12). Similarly the response times were analysed. None of the comparisons were significant (t ,1).
Percentage performance accuracy of the participants in the training group during the training sessions was also analysed. A paired t-test comparing the accuracy of the participants in the first training session and the last training session (11 th session) showed a highly significant difference (t(14) = 4.32, p,0.001), Figure 4.

Discussion
Judgement of numerosity and duration has been shown to interfere. The effect of training in one dimension (numerosity) on judgement of another dimension (duration) was investigated in the present study. The main question under investigation was if perceiving differences in numerosity was enhanced via training, would this affect the interference of numerosity on judgement of duration? The results showed that participants in the training group became more prone to the interference of numerosity on judgement of duration after training when compared to participants in the control group, as assessed by sensitivity parameter ('b'). However, there was no significant effect in the follow-up session.
No significant difference in point of subjective equality (PSE) ('a') assessed by two separate 262 ANOVA on veridical and phantom trials shows that participants in the two groups and over the two testing sessions were not biased differently towards either the first and second sets. PSE and sensitivity ('b') parameters of the fitted curves over response times were also subjected to two similar ANOVAs. No effect was significant, although one might expect to have faster response times on the second session of the training group as they underwent 11 sessions of training with similar response procedure. We speculate that it is because the tasks were orthogonal, i.e. judgement of duration on testing sessions and numerosity on training sessions. A closer look at response times over blocks in each testing session showed that participants' response time dropped abruptly from the 1 st block to the 2 nd and remained fairly stable thereafter. Table 3 summarises reaction times for the first two and last blocks (1 st , 2 nd and 8 th ). This shows that participants in both groups achieved the shortest response time over the 1 st block of each session.
Behavioural studies aimed to investigate the association of numerosity and duration perception. Xuan et al. [18] found that stimuli with larger magnitudes in nontemporal dimensions were perceived as being presented longer. Furthermore, Oliveri et al. [19] showed a temporal duration judgement can be biased by a number's magnitude. Dormal et al. [20] used flashing dots in a series and had subjects compare two successive series, and found that numerical cues interfered with the duration processing. Droit-Volet, Clément and Fayol [25], in a study on children aged 5 and 8 years old, showed that in a temporal bisection task, number interfered with temporal performance (more strongly for 5-yearold children). Recently, a reciprocal relation between duration and numerosity has been identified [21]. In accordance with these findings, Stavy and Tirosh [15] have suggested an intuitive 'more A-more B' mapping between different dimensions, e.g. the bigger the trains are, the faster they are perceived. This relation, however, is not always true for both directions, i.e. one dimension can interfere with the other dimension, but not vice versa. Dormal et al. [20] did not find any interference effect of temporal cues on numerosity processing. Droit-Volet et al. [25] also showed no interference of duration with numerical discrimination in a numerical bisection task.
There is an ongoing debate on the brain areas involved in perception of numerosity and duration (time as a more general term). Imaging and brain stimulation, as well as lesion studies have tried to find the neural substrates involved in the perception of these two dimensions. The majority of studies have reported parietal regions presupposed (for a review see [26] and metaanalysis see [27]). Walsh [13,14] proposed in a theory of magnitude (ATOM) the parietal cortex as the common brain area, involved in perception of time, space, number, size, speed and other magnitudes. Subsequently, Bueti and Walsh [28] revised this theory. ATOM revolves primarily around the role of the parietal cortex (as the major area for sensory integration and object manipulations), needed for active interactions with the environment in order to acquire knowledge. This theorem, however, does not fully explain how this area contributes to the cognition of magnitude in different dimensions. Contrary to Walsh [13,14], Dormal et al [29] showed a contribution of frontal areas in decision-making in numerosity and duration processing.
Although there are many studies on time and numerosity perception, there are only a few studies looking at how perception of these two dimensions link to each other. Cappalletti et al. [30], in a lesion study, showed the dissociation between duration, numerosity and space processing. Using TMS, Dormal et al. [31] showed a similar effect. They demonstrated that the stimulation of left IPS impaired performance in a numerosity comparison task, whereas duration comparison was not affected. Only recently Dormal et al. [29], in a functional magnetic resonance imaging (fMRI) study, showed activation of the IPS and areas in the precentral, middle and superior frontal gyri for both numerosity and duration processing. Moreover, based on psychophysiological interaction (PPI) analysis, they proposed that the right IPS contributes in both numerosity and duration processing.
To the best of our knowledge, while there are many studies on interference of magnitudes in different dimensions (see above), none address the possible modulatory effect(s) of training on this interference. We trained participants on numerosity judgement and investigated whether their sharper perception of numerosity interferes more strongly with their perception of duration. The reported interference effect might result from two different mechanisms or a combination of the two. One possible mechanism stems from the commonality of numerosity and time perception in some brain areas (see above). As a result of training these brain areas may have been fine-tuned for more precise perception of numerosity, which consequently increased the interference of numerosity with perception of duration, see Figure 5(a). Another possible explanation is that the interfering effect of sharper perception of numerosity is mediated through higher levels of perception and/or other brain areas responsible for magnitude perception, i.e. the encoding of duration is intact but the integration of this coding with that of numerosity leads to higher interference, see Figure 5(b). Considering the vast evidence on brain areas involved in numerosity and duration perception, we propose that the combination of these two mechanisms underlies the increase in interference of numerosity in duration judgement. Further research needs to be done to differentiate between these two possibilities.
The ability of the brain to respond to training in a specific task by altering and adapting structurally as well as functionally has been investigated greatly in the literature (for a review see [32,33]). These studies have led to an extensive body of evidence revealing lifelong plasticity. Nonetheless, the evidence on these changes is ambiguous. The follow-up study revealed that the effects of training fairly decreased over the 3 months post-training. Therefore, we speculate that the functional and possible structural effects of training were not long lasting. To our knowledge, no prior study has investigated the structural or functional changes of training in numerosity or duration judgement. Further research using imaging techniques could reveal not only whether training was sufficient to lead to changes within the brain, but would also allow more generalisable insights into the underlying neural structures of magnitude perception.  The control group did not undergo a training procedure as the training group did. Therefore, one might argue that the reported effect, i.e. training in numerosity judgement enhanced the interference of numerosity in duration judgement, is purely a result of difference in procedure. We, however, argue that the modulatory effect of training cannot be due to lack of training in the control group, as the only difference that we observed between control and training groups lied in their performance in phantom trials, reflected as steeper slope (increased sensitivity), and not in veridical trials, and neither in response times.
By increasing the number of items in each set and shortening the duration of presentation of each set, we aimed to match the difficulty of judgement in the two dimensions [21]. It has to be mentioned that the reported interference could be dependent on the magnitudes used in this study, e.g. decreasing the numerosity to a range that is easier to count (for example less than 10) might abolish the interference effect. Therefore, extending these findings to other magnitudes and dimensions (such as space or numeric symbols) needs careful considerations.
Another interpretation of results could be as follows: switching the task from numerosity judgement to duration judgement (for the training group) could indeed disturb the participants, due to the distribution of attention to two dimensions, leading to higher interference when compared to the control group. Results, however, showed that this is not the case, as the performance of the participants in the training group was comparable with that of control group for the veridical condition. In a more exaggerated way, it could be the case that participants based their decision on numerosity of the two sets in trials with equal presentation length. Based on the results, this possibility can be dismissed as well, as participants in the training group, especially, after such an intensive training did not achieve high performance percentage for phantom trials. This shows that participants kept their decision based on the duration of the two sets and they were not aware of the interference of numerosity of the two sets. Additionally, post study interviews revealed that even if participants detected the difference in numerosity of the two sets, they ignored the variation in that dimension.
On another note, the effect of training could turn out to be facilitatory, rather than interference. Based on this study, it cannot be determined whether this effect is found through reversed intervention, i.e. training in duration judgement and testing on numerosity judgement, and even if, whether it will go in the same direction, i.e. training duration perception may lead to facilitation, instead of interference in the judgments of numerosity. Further  Table 3. The mean and SD (in parentheses) of response times (s) for veridical and phantom trials split over the groups for the 2 nd session.  research needs to determine whether dimensions are interchangeable, so that training in one specific dimension leads to interference while another may lead to facilitation, or whether a hierarchy of dimensions exists, where one dimension takes a superior role over others. We focused on investigating the dimensions duration and numerosity, while disregarding space, with the aim of dissociating these two dimensions. Further research should investigate the relationship between all three dimensions, which may allow insights into which dimension plays the predominant role in our judgments.
In conclusion, our results showed that training participants to more clearly perceive the difference in the number of items in the two sets affected their perception of duration presentation. Research on magnitude perception remains in its infancy and many more questions are yet to be answered. Future research needs to address the paramount question concerning specific operations underlying magnitude representation [28] and their interconnections.