The authors have declared that no competing interests exist.
Conceived and designed the experiments: NN SB LA DA BE. Performed the experiments: NN SB. Analyzed the data: NN DA. Contributed reagents/materials/analysis tools: LA. Wrote the paper: NN DA.
Recently, there has been a growing emphasis on basic number processing competencies (such as the ability to judge which of two numbers is larger) and their role in predicting individual differences in school-relevant math achievement. Children’s ability to compare both symbolic (e.g., Arabic numerals) and nonsymbolic (e.g., dot arrays) magnitudes has been found to correlate with their math achievement. The available evidence, however, has focused on computerized paradigms, which may not always be suitable for universal, quick application in the classroom. Furthermore, it is currently unclear whether both symbolic and nonsymbolic magnitude comparison are related to children’s performance on tests of arithmetic competence and whether either of these factors relates to arithmetic achievement over and above other factors such as working memory and reading ability. In order to address these outstanding issues, we designed a quick (2-minute) paper-and-pencil tool to assess children’s ability to compare symbolic and nonsymbolic numerical magnitudes and assessed the degree to which performance on this measure explains individual differences in achievement. Children were required to cross out the larger of two single-digit numerical magnitudes under time constraints. Results from a group of 160 children from grades 1–3 revealed that both symbolic and nonsymbolic number comparison accuracy were related to individual differences in arithmetic achievement. However, only symbolic number comparison performance accounted for unique variance in arithmetic achievement. The theoretical and practical implications of these findings are discussed, including the use of this measure as a possible tool for identifying students at risk for future difficulties in mathematics.
There is growing evidence to suggest that math skills are just as important as reading skills when predicting a child’s academic success, and that competence in mathematics is crucial to success in school and the workplace.
Against this background, early identification of students at risk for developing poor math achievement should be a key priority of education systems and their teachers in the classroom. In the domain of reading, much progress in early diagnosis of at-risk children has been made by focusing on processing competencies that are foundational to reading, such as phonological awareness.
So what might be the foundational competencies that serve as a scaffold for children’s early mathematical learning? In order to process numbers, it is necessary to have an understanding of the magnitudes they represent (e.g., knowing that the Arabic digit 3 stands for three items). Without an understanding of numerical magnitude and its association with numerical symbols, the learning of mental arithmetic cannot get off the ground. Therefore, tests aiming to characterize the foundational skills of children’s numerical abilities should include measures of numerical magnitude processing. Research has shed light on how numerical magnitudes are represented by adult humans
To measure numerical magnitude processing in older children and adults, researchers have frequently employed number comparison paradigms in which participants are asked to choose which of two numbers is larger in numerical magnitude. When individuals compare numerical magnitudes, an inverse relationship between the numerical distance of two magnitudes and the reaction time required to make a correct comparison is obtained
To explain the numerical distance effect (NDE), one popular account posits that numerically close magnitudes have more representational features in common than those that are farther apart. Because of this, discriminating between a pair of numerical magnitudes is more challenging for quantities that are numerically closer together, which results in the NDE during comparison tasks.
Another effect that is observed in numerical magnitude comparison studies is the numerical ratio effect (NRE
The finding that the numerical ratio between two numbers influences the speed with which they can be accurately compared is consistent with Weber’s Law, which states that the just noticeable difference between two stimuli is directly proportional to the magnitude of the stimulus with which the comparison is being made. This is reflected in the NRE: for a given difference between two magnitudes, response times are faster the smaller the absolute values of the magnitudes being compared.
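To make the distinction between distance and ratio concrete, the following minimal sketch computes both quantities for three single-digit pairs that share the same numerical distance. It is an illustration only, not part of the study's materials; the function names `distance` and `ratio` are hypothetical, and the ratio is assumed to be smaller/larger, as in the ratio values reported later in the paper.

```python
def distance(a: int, b: int) -> int:
    """Absolute numerical distance between two magnitudes."""
    return abs(a - b)


def ratio(a: int, b: int) -> float:
    """Smaller magnitude divided by larger; values nearer 1 are harder to compare."""
    return min(a, b) / max(a, b)


# Three pairs with the same distance (1) but increasing ratio.
pairs = [(1, 2), (4, 5), (8, 9)]
for a, b in pairs:
    print(f"{a} vs {b}: distance={distance(a, b)}, ratio={ratio(a, b):.2f}")
```

Although all three pairs differ by exactly one, the ratio rises as the magnitudes grow, which is the pattern the NRE describes: the same absolute difference becomes harder to discriminate at larger magnitudes.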
Against the background of the literature reviewed above, it is clear that much has been uncovered about the representation and processing of both symbolic and nonsymbolic numerical magnitudes across development and species. A question resulting from this research, which has been a growing focus in recent years, is whether individual differences in basic number processing are related to between-subjects variability in mathematical achievement. In other words, are metrics of numerical magnitude processing, such as the numerical distance and numerical ratio effects, meaningful predictors of individual differences in children’s level of mathematical competence? And if so, can such measures be used to detect children at risk of developing mathematical learning difficulties, such as developmental dyscalculia?
In recent years, a growing number of studies have begun to answer this question. In one of the pioneering studies in this area, Durand, Hulme, Larkin and Snowling
More recently, Holloway and Ansari
The work of Durand et al.
Contrary to the findings by Holloway and Ansari
While Halberda, Mazzocco and Feigenson
In sum, while some studies suggest that symbolic but not nonsymbolic numerical magnitude comparison performance is related to children’s arithmetic skills, other studies have shown that nonsymbolic numerical magnitude processing skills are not only correlated with children’s math performance but also predict arithmetic achievement over the course of developmental time. Few studies have used within-subject designs that include both symbolic and nonsymbolic numerical magnitude processing; thus, it is unclear which of these might be a stronger, unique predictor of children’s arithmetic achievement scores.
Empirical findings such as those discussed above raise the question of whether a quick, efficient and classroom-friendly assessment tool could be designed to formally measure basic magnitude processing in children. To partially address this question, Chard and colleagues
However, it is important to note that, similar to the aforementioned Durand et al.
Taken together, previous research strongly suggests a relationship between, on the one hand, both symbolic and nonsymbolic number comparison and, on the other hand, individual differences in math achievement. Preliminary research has also demonstrated that an assessment of children’s symbolic magnitude processing is related to math performance, particularly arithmetic achievement
A basic paper-and-pencil assessment would be a valuable tool for several reasons. To begin, it would be very economical due to its low cost in comparison to computerized versions of the test that require specialized equipment and software. A test of this kind could also be quickly and easily administered and scored by the teacher in a large group setting. This would allow teachers to test the individual differences in basic numerical magnitude processing competence among their students. As this test would not require specialized software, it could be used by educators in any setting, such as schools with few resources or classrooms in developing countries, and could be easily integrated into large-scale studies that may be run by school boards, agencies or local governments.
The studies discussed above demonstrate that individual differences in basic magnitude processing are related to children’s math scores. In this context it is important to acknowledge that magnitude processing is not the only (or strongest) predictor of individual differences in math achievement. There is a large body of evidence demonstrating that math performance is related to cognitive abilities such as working memory. For example, working memory has been shown to play an important role in math skills such as solving both simple and complex arithmetic problems
In light of these findings, the objectives of the current study were threefold. First, we wanted to investigate whether a basic pencil-and-paper measure of symbolic and nonsymbolic number processing could characterize developmental changes in basic numerical magnitude processing, such as age-related improvement in the accuracy of numerical comparisons. Second, we wanted to explore whether performance on such a basic assessment tool of magnitude processing is capable of explaining variability in children’s math achievement scores. Third, we wanted to determine whether it explains significant variance over and above other factors such as working memory and reading skills.
A total of 197 students in Grades 1–3 participated in the current study. Eleven students were removed due to incorrect completion of the digit comparison task, such as skipping pages of items or marking their responses in an unclear manner. Another four were removed from analysis due to performing at ceiling on the task (that is, they completed all trials correctly within the time limit allotted). Twelve more children were removed due to their inability to reach a basal score on the Math Fluency and Calculation subtests of the Woodcock-Johnson III Subtests of Achievement (WJ III; see below). For the Math Fluency test, any participant who had three or fewer items correct after one minute did not reach basal. For the Calculation test, if a child did not respond correctly to at least one of two practice items, the child did not reach basal and testing was discontinued. Five children were not able to reach basal on the Reading Fluency test of the WJ III; that is, they had fewer than three items correct on the four practice exercises. Three children did not reach basal on the Vocabulary subtest of the Wechsler Abbreviated Scale of Intelligence
Permission was granted from a local school board and school principals to recruit students from elementary schools in a region of Southwestern Ontario. Letters of information and consent forms approved by the University of Western Ontario’s Research Ethics Board were received and completed by parents of the participants before the study began. Interested parents representing 36 schools in both urban and rural areas consented to having their child(ren) participate in the current study. Participants were from various socioeconomic and ethnic groups.
During the magnitude comparison task, participants were required to compare pairs of magnitudes ranging from 1 to 9. Stimuli were given in both symbolic (56 digit pairs) and nonsymbolic (56 pairs of dot arrays) formats. In both formats of presentation, each numerical magnitude was counterbalanced for the side of presentation (e.g., 2–7 and 7–2). Furthermore, in the nonsymbolic form, dot stimuli were controlled for area and density.
To control for area and density, half of the dot arrays used were matched for total area and half were matched for total perimeter. In other words, half of the trials had equal area while the other half had equal perimeter. The array with the most dots had a greater perimeter when cumulative surface area was matched, and more cumulative surface area when perimeter was matched. To avoid having the participant rely on the relative size of the dot arrays, perimeter-matched and area-matched trials were presented in random order. To ensure that the test items became increasingly difficult, the numerical ratio between the numerical magnitudes presented was manipulated. Easier items (with smaller ratios) were presented first, followed by more difficult items (with increasingly larger ratios). Starting with the easier items ensured that children remained motivated to complete the task. The order of trials in our assessment was similar to the order of ratios presented in
Number pair  Ratio 
1–9  .11 
1–8  .13 
1–7  .14 
1–6  .17 
1–5  .20 
2–9  .22 
2–8  .25 
2–7  .29 
3–9  .33 
3–8  .38 
2–5  .40 
3–7  .43 
4–9  .44 
3–6  .50 
4–8  .50 
5–9  .56 
4–7  .57 
3–5  .60 
5–8  .63 
2–3  .67 
5–7  .71 
6–8  .75 
7–9  .78 
4–5  .80 
5–6  .83 
6–7  .86 
7–8  .88 
8–9  .89 
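The ratios in the table above can be reproduced with a short script. This is a minimal sketch, assuming the ratio is the smaller magnitude divided by the larger and that the published values were rounded half-up to two decimals; the names `PAIRS`, `rounded_ratio` and `trials` are hypothetical and introduced only for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# (smaller, larger) pairs in the order listed in the table above.
PAIRS = [
    (1, 9), (1, 8), (1, 7), (1, 6), (1, 5), (2, 9), (2, 8),
    (2, 7), (3, 9), (3, 8), (2, 5), (3, 7), (4, 9), (3, 6),
    (4, 8), (5, 9), (4, 7), (3, 5), (5, 8), (2, 3), (5, 7),
    (6, 8), (7, 9), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9),
]


def rounded_ratio(small: int, large: int) -> Decimal:
    """Smaller/larger ratio, rounded half-up to two decimals (assumed rounding rule)."""
    return (Decimal(small) / Decimal(large)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )


ratios = [small / large for small, large in PAIRS]
# True when the listed order never decreases in ratio (easier items first).
is_easy_to_hard = all(a <= b for a, b in zip(ratios, ratios[1:]))

# Counterbalancing the side of presentation doubles the number of items.
trials = PAIRS + [(large, small) for small, large in PAIRS]

for small, large in PAIRS:
    print(f"{small} vs {large}: {rounded_ratio(small, large)}")
print(len(trials), is_easy_to_hard)
```

Half-up rounding is used here because Python's built-in `round` rounds exact halves to the nearest even digit, so a ratio such as 1/8 = .125 would come out as .12 rather than the .13 shown in the table. Counterbalancing the 28 listed pairs gives 56 items, which agrees with the 56 pairs per format described in the text.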
During the test, participants were told to cross out the larger of the two magnitudes and were given one minute to complete the symbolic condition and one minute to complete the nonsymbolic condition. To ensure that participants understood the task, each child completed three sample items with the examiner and then nine practice items on their own before beginning the assessment (see
Figures A, B, and C are examples of symbolic items. Figures D, E and F are examples of nonsymbolic items.
In order to determine the subjects’ competence in mathematics, the Woodcock-Johnson III Subtests of Achievement (WJ III
In order to assess the reading ability of each participant, children were given the Reading Fluency subtest of the WJ III
Cognitive performance was measured using two subtests of the Wechsler Abbreviated Scale of Intelligence (WASI
The Automated Working Memory Assessment (AWMA
The current study was part of a large-scale study wherein children’s reading, math and language skills were tested. All participants were assessed at their respective elementary school in three one-hour sessions over a period of three weeks at the end of the school year. Each participant was tested individually by trained examiners in a quiet area outside of the classroom.
The means and standard deviations for the tests used are shown in
Bar graph representing overall performance of participants in each grade for symbolic and nonsymbolic items. Grade 1 participants were significantly better at nonsymbolic items compared to symbolic items. Participants in grades 2 and 3 did not demonstrate any differences between conditions. Standard errors are represented by the error bars attached to each column.
Test  N  Mean raw score (S.D.)  Raw score range (min–max)  Mean standard score (S.D.)  Standard score range (min–max)
Age (months)  160  97.54 (9.38)  77–115  N/A  N/A
Symbolic  160  36.65 (7.82)  16–55  N/A  N/A
Nonsymbolic  160  36.40 (6.01)  21–54  N/A  N/A
Math Fluency  160  31.23 (13.05)  4–75  92.60 (13.60)  65–136
Calculation  160  10.26 (3.09)  1–17  95.05 (15.36)  29–135
Listening Recall  160  10.00 (3.04)  4–20  103.29 (11.45)  78–135
Counting Recall  160  15.56 (4.35)  5–31  103.31 (13.74)  71–133
Odd-One-Out  160  17.50 (4.14)  3–29  110.76 (13.24)  71–133
Spatial Recall  160  14.35 (4.68)  1–26  104.84 (13.61)  69–137
Vocabulary  160  28.04 (5.86)  13–43  49.73 (8.49)  29–69
Block Design  160  16.51 (10.11)  3–48  53.65 (10.14)  34–80
Reading Fluency  160  28.66 (11.37)  2–57  101.90 (10.51)  75–142
The WASI uses a population mean of 50 and standard deviation of 10.
Correlations were calculated for the following variables across all three grades (see
Variable      1     2     3     4     5     6     7     8     9     10    11    12
1. MF         –     .64   .40   .45   .38   .28   .34   .30   .17   .43   .33   .43
2. MC               –     .31   .35   .28   .29   .43   .41   .35   .35   .26   .34
3. RF                     –     .32   .13   .39   .19   .33   .05   .31   .27   .33
4. OOO                          –     .51   .31   .40   .22   .27   .31   .15   .26
5. SR                                 –     .22   .26   .15   .30   .21   .12   .19
6. LR                                       –     .44   .32   .05   .18   .12   .18
7. CR                                             –     .33   .23   .15   .03   .11
8. Vocab                                                –     .25   .16   .11   .16
9. BD                                                         –     .20   .34   .30
10. Sym                                                             –     .59   .92
11. Nonsym                                                                –     .87
12. Overall                                                                     –
As seen from
Scatterplot showing a significant correlation between standard scores on the Math Fluency subtest of the Woodcock-Johnson III battery and overall mean score of the magnitude comparison task (symbolic and nonsymbolic combined) for all participants. The solid line represents the linear regression line for this relationship.
Scatterplot showing a significant correlation between standard scores on the Calculation subtest of the Woodcock-Johnson III battery and overall mean score of the magnitude comparison task (symbolic and nonsymbolic combined) for all participants. The solid line represents the linear regression line for this relationship.
Further analyses were conducted on the significant association between magnitude comparison and arithmetic achievement to examine the relationship between performance on the paper-and-pencil assessment and test scores for each grade level. As can be seen in
Variable      1     2     3     4     5
1. MF         –     .73   .34   .25   .34
2. MC               –     .52   .25   .44
3. Sym                    –     .56   .88
4. Nonsym                       –     .87
5. Overall                            –
Variable      1     2     3     4     5
1. MF         –     .59   .42   .33   .41
2. MC               –     .31   .15   .27
3. Sym                    –     .68   .94
4. Nonsym                       –     .88
5. Overall                            –
Variable      1     2     3     4     5
1. MF         –     .62   .45   .33   .45
2. MC               –     .30   .35   .37
3. Sym                    –     .56   .90
4. Nonsym                       –     .86
5. Overall                            –
We then examined whether this grade-related difference in the strength of the correlations between, on the one hand, symbolic and nonsymbolic performance and, on the other hand, Math Fluency and Calculation scores was statistically significant. In other words, we tested whether the nonsignificant correlations in Grade 1 differed significantly from the significant correlations in the other grades. To do this we transformed correlation coefficients into Fisher’s
Thus, while the correlations in Grade 1 between math scores and symbolic and nonsymbolic performance on the paper-and-pencil test do not pass the threshold for statistical significance (likely due to the comparatively small sample size), these correlations do not significantly differ from the ones in Grades 2 and 3. Therefore, a true developmental change in the relationships between arithmetic performance and the present measure of symbolic and nonsymbolic numerical magnitude processing cannot be supported by the present data. Instead, the difference in the correlational strengths is likely due to differential sample sizes and, importantly, the correlations are significant when all three samples are collapsed into one group.
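A comparison of two correlations from independent groups is commonly carried out with Fisher's r-to-z transformation. The sketch below shows that standard procedure under stated assumptions; it is not necessarily the exact computation used in the study. The function names are hypothetical, and the sample sizes in the example are placeholders, since the per-grade sample sizes are not given in this text.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher r-to-z transformation (inverse hyperbolic tangent)."""
    return math.atanh(r)


def compare_independent_correlations(r1: float, n1: int, r2: float, n2: int):
    """Return (z statistic, two-tailed p) for H0: the two population correlations are equal."""
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z_stat = (fisher_z(r1) - fisher_z(r2)) / standard_error
    # Two-tailed p-value from the standard normal distribution.
    p_value = math.erfc(abs(z_stat) / math.sqrt(2.0))
    return z_stat, p_value


# Correlations taken from the grade-level tables above (Math Fluency with Symbolic);
# the sample sizes are HYPOTHETICAL placeholders, not the study's values.
z_stat, p_value = compare_independent_correlations(r1=0.34, n1=30, r2=0.42, n2=60)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```

Because the standard error depends on both sample sizes, a small group widens the interval around its correlation, which is why a nonsignificant correlation in a small sample need not differ reliably from a significant one in a larger sample.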
Since Reading Fluency, verbal working memory, visual-spatial working memory and IQ each correlated with children’s scores on Math Fluency and Calculation, the specificity of the key relationship between number comparison and arithmetic skills needed to be further investigated. To do so, two linear regressions were performed: one to examine the relationship between Math Fluency (dependent variable), symbolic and nonsymbolic total score while controlling for age, verbal working memory, visual-spatial working memory, IQ and Reading Fluency; and the other, to examine the relationship between Calculation (dependent variable), symbolic and nonsymbolic total score while controlling for age, verbal working memory, visual-spatial working memory, IQ and Reading Fluency. Since no hypotheses were made about the order of predictors and, in an effort to investigate which variables accounted for significant unique variance, all predictor variables were entered as one step (see
Math Fluency
Predictor  β  t
Age  .014  .187
Reading  .208  2.49
Odd-One-Out  .148  1.91
Spatial Recall  .183  2.51
Listening Recall  −.029  −.375
Counting Recall  .159  2.14
Vocabulary  .088  1.24
Block Design  −.066  −.912
Symbolic  .197  2.35
Nonsymbolic  .128  1.56
Calculation
Predictor  β  t
Age  .126  1.72
Reading  .126  1.53
Odd-One-Out  .027  .355
Spatial Recall  .049  .693
Listening Recall  .020  .268
Counting Recall  .226  3.11
Vocabulary  .157  2.26
Block Design  .186  2.61
Symbolic  .170  2.07
Nonsymbolic  .013  .164
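The logic of entering all predictors in a single step can be sketched as follows. Standardized coefficients (β) are obtained by solving the system formed by the predictor intercorrelations and the predictor-outcome correlations, so each β reflects a predictor's contribution with the others held constant. This is a minimal illustration in plain Python, not the software or data used in the study: the function names are hypothetical, the scores are made up, and t-values are not computed.

```python
import math


def standardize(values):
    """Convert raw scores to z-scores using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def standardized_betas(predictors, outcome):
    """Simultaneous-entry regression: betas = inverse(Rxx) * rxy, all predictors at once."""
    names = list(predictors)
    zx = [standardize(predictors[name]) for name in names]
    zy = standardize(outcome)
    n = len(zy)
    rxx = [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in zx] for zi in zx]
    rxy = [sum(a * b for a, b in zip(zi, zy)) / (n - 1) for zi in zx]
    return dict(zip(names, solve(rxx, rxy)))


# Made-up scores for six hypothetical children (NOT data from the study).
scores = {
    "symbolic": [1, 2, 3, 4, 5, 6],
    "nonsymbolic": [2, 1, 4, 3, 6, 5],
}
math_fluency = [3, 5, 4, 8, 9, 11]
print(standardized_betas(scores, math_fluency))
```

Two predictors can each correlate with the outcome and still differ in their β, because β reflects only the part of a predictor that is not shared with the other predictors. That is the sense in which a predictor "accounts for unique variance" in the analyses reported here.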
Results demonstrated that our first linear regression using Math Fluency as a dependent variable was significant (
The second regression analysis using Calculation as a dependent variable was also significant (
The purpose of this study was to extend previous research in three principal ways: 1) to investigate whether a basic paper-and-pencil measure of symbolic and nonsymbolic numerical magnitude processing could be used to measure age-related changes in basic numerical magnitude processing skills, 2) to explore whether performance on this basic assessment tool is related to individual differences in children’s performance on measures of arithmetic achievement, and 3) to determine whether it explains significant variance over and above other factors such as age, working memory, reading skills and IQ.
With regard to the first aim of our study, we found age-related differences in the performance of children on the paper-and-pencil measure. Specifically, analyses demonstrated a main effect of grade, which indicates that children improved in the magnitude comparison task as they became older, replicating previous findings and suggesting that this test, like computerized measures, can be used to characterize developmental changes in numerical magnitude processing. Furthermore, a format by grade interaction was also found whereby Grade 1 students were the only age group that performed significantly better on the nonsymbolic than on the symbolic items. This finding demonstrates that younger children were more accurate at nonsymbolic number processing than symbolic processing, whereas older children did not show this difference. These results indicate that typically developing children become more proficient with symbolic number processing as they progress in school and acquire more familiarity and automaticity with numerical symbols. Moreover, they suggest that young children may have strong preexisting representations of nonsymbolic numerical magnitude (that can even be found in infancy) and only gradually map these onto symbolic representations.
The results from the current study also demonstrated that participants’ scores on this basic assessment tool significantly correlated with their scores on standardized tests of arithmetic achievement. More specifically, a significant positive relationship was found between Math Fluency, Calculation and the accuracy with which participants completed the symbolic items, nonsymbolic items and overall total scores on the magnitude comparison task. This finding indicates that children who scored highly on Calculation and Math Fluency also tended to receive high scores on our test. This association of numerical magnitude comparison skills and individual differences in arithmetic skills replicates findings in earlier work. For instance, the positive correlation found in the current study between performance on a timed numerical comparison task and individual differences in arithmetic performance replicates the work of Durand, Hulme, Larkin and Snowling
Finally, a key finding from our study indicated that performance on the symbolic items accounts for unique variance in arithmetic skills. Interestingly, this same result was not found for performance on the nonsymbolic items as demonstrated in previous research
Specifically, we found that while simple correlations show that both are related to arithmetic achievement, when we examined which of them accounts for unique variance, using multiple regression analyses, only symbolic magnitude comparison was found to account for unique, significant variance in children’s performance on the standardized tests of arithmetic achievement. Since the simple correlations revealed that accuracy on both the symbolic and nonsymbolic tasks correlated with math achievement, it is possible that they share variance related to core magnitude processing, but that nonsymbolic performance does not contribute any additional, unique variance to math performance while symbolic performance does. We speculate that the unique variance accounted for by symbolic processing is related to recognizing numerals and mapping numerals to magnitudes, a skill that is important in the mental manipulation of digits during calculation. While it is possible that the symbolic and nonsymbolic tasks share variance related to numerical magnitude processing, it is equally plausible that their shared variance (and the absence of unique variance accounted for by the nonsymbolic task) is explained by nonnumerical factors that are tapped by both tasks, such as speed of processing, attention, working memory or a complex combination of these factors and numerical magnitude processing. It is impossible to arbitrate between these different explanations given the current data. However, what the current data show is that symbolic number comparison explains unique variance while nonsymbolic number comparison does not, strengthening the notion that the mapping of symbols to numerical magnitudes is a critical correlate of individual differences in children’s arithmetic achievement
While children’s performance on the symbolic items of our test accounts for unique variance in arithmetic performance, it is not the strongest predictor of arithmetic achievement. For example, the counting recall task of the AWMA accounted for variance in Calculation performance over and above symbolic number comparison scores. This demonstrates that while our test does account for some unique variability in children’s arithmetic skills, other number-related abilities as well as measures of working memory, such as the counting recall task, also play an important role in children’s arithmetic skills. This should be considered and investigated further in future research of this kind.
Finally, the results from the multiple regression reveal, as previous studies have demonstrated
The age range of our sample and measures of math achievement used in the current study are very similar to the work done by Holloway and Ansari
Our findings also suggested a developmental trend whereby the relationship between symbolic performance and math achievement became stronger and more significant the older the participants, which may be construed to be contrary to the findings reported by Holloway and Ansari
As seen in
Unfortunately, there were a greater number of parents of children in grades two and three who agreed to have their children participate in the study than parents of children in Grade 1. These practical constraints of the study led to considerable differences in sample size between grade levels. Future investigations of this kind should therefore be conducted using equal sample sizes.
In sum, the current results demonstrate that a relationship exists between performance on a basic magnitude comparison task and individual differences in math achievement (as measured by arithmetic skills). Furthermore, it was found that symbolic processing accounts for unique variance in arithmetic skills while nonsymbolic processing does not. Finally, results indicate that a measure of this kind can characterize developmental changes in basic numerical magnitude processing.
As mentioned, previous research has shown that children who have strong skills in higher order mathematics, such as arithmetic, also demonstrate strong magnitude processing skills. The measurement tool investigated in the current study will allow educators the opportunity to quickly and easily assess these foundational competencies. A test of this kind will also help educators to focus on these essential skills during math instruction in the classroom. By focusing on these basic, yet foundational abilities educators can directly foster the numerical magnitude processing abilities of their students.
In addition, previous research has shown that not all measures of basic number processing correlate with individual differences in math achievement
In the current study, we found that children’s performance on nonsymbolic items correlated with their arithmetic skills. This may suggest that the nonsymbolic portion of our assessment could be used by itself with preschool children and children who do not yet have a semantic representation of number symbols, further demonstrating the utility of this simple assessment. Future studies are needed to investigate this possibility. In addition, future research should examine the test-retest reliability of this assessment tool. Using a longitudinal design, forthcoming research should also investigate the ability of this assessment tool to identify children who are at risk for developing difficulties in mathematics. Such research is critical, as the current findings are merely correlational: they may indicate that basic magnitude processing facilitates math development, but performance on the test may equally well reflect the fact that greater practice with arithmetic leads to improved performance in numerical magnitude comparison. A test that has the potential to truly predict individual differences in arithmetic ability would be a significant contribution to scores of classrooms and could have a great impact on the future of many students. By identifying at-risk children earlier and more reliably, findings from this and future studies will put us one step closer to improving the numeracy skills of students with difficulties in math and possibly enhance the teaching strategies currently used to instruct this specific group of children.