The authors have declared that no competing interests exist.
Conceived and designed the experiments: DCG. Performed the experiments: MKH LN DHB. Analyzed the data: DCG DHB. Wrote the paper: DCG MKH LN.
One in five adults in the United States is functionally innumerate; they do not possess the mathematical competencies needed for many modern jobs. We administered functional numeracy measures used in studies of young adults’ employability and wages to 180 thirteen-year-olds. The adolescents began the study in kindergarten and participated in multiple assessments of intelligence, working memory, mathematical cognition, achievement, and in-class attentive behavior. Their number system knowledge at the beginning of first grade was defined by measures that assessed knowledge of the systematic relations among Arabic numerals and skill at using this knowledge to solve arithmetic problems. Early number system knowledge predicted functional numeracy more than six years later (β = 0.195, p = .0014) controlling for intelligence, working memory, in-class attentive behavior, mathematical achievement, demographic and other factors, but skill at using counting procedures to solve arithmetic problems did not. In all, we identified specific beginning of schooling numerical knowledge that contributes to individual differences in adolescents’ functional numeracy and demonstrated that performance on mathematical achievement tests underestimates the importance of this early knowledge.
A substantial number of adults have not mastered the mathematics expected of an eighth grader (22% in the U.S.)
Early identification and remediation of knowledge deficits that predict long-term risk of innumeracy thus have the potential to yield substantial social and personal benefits
As part of a kindergarten to ninth grade longitudinal study of children’s mathematical development, in seventh grade we administered tests that are similar
The study was reviewed and approved by the Institutional Review Board of the University of Missouri. Written consent was obtained from all parents, and all participants provided verbal assent for all assessments.
The data are from a prospective longitudinal study of children’s mathematical development and risk of learning disability
At the end of first grade, the intelligence of the sample was average (M = 102, SD = 15), based on the Wechsler Abbreviated Scale of Intelligence (WASI)
The intelligence of the 109 children who did not participate in the seventh grade assessment was average (M = 94, SD = 15), but lower than that of the final sample (p<.0001). Their kindergarten mathematics achievement was average (M = 99, SD = 14) but slightly (d = .22) lower than that of the final sample (p = .01). Their reading achievement was high average (M = 110, SD = 16) and did not differ from that of the final sample (p = .16). The group differences, favoring the final sample, in intelligence and kindergarten mathematics achievement suggest that the results obtained in these analyses may be an underestimate of the actual relation between beginning of first grade early quantitative competencies assessed by mathematical cognition tasks (below) and seventh grade functional numeracy.
The mean age was 6.8 years (SD = 4 months) at the time of the first grade mathematical cognition assessment and 13.0 years (SD = 4 months) at the time of the seventh grade numeracy assessment. The racial composition was white (77%), Asian (5%), black (5%), and mixed race (8%), with the parents of the remaining children identifying them as Native American, Pacific Islander, or unknown. Across racial categories, 4% of the sample identified as ethnically Hispanic. Thirty-four percent of the children attending the schools from which the sample was drawn were eligible for free or reduced price lunches.
The tests were the Colored Progressive Matrices
Mathematics and reading achievement were assessed using the Numerical Operations and Word Reading subtests from the Wechsler Individual Achievement Test-II-Abbreviated
Fourteen simple addition problems and six more complex problems were horizontally presented, one at a time, on flash cards in first grade and on the screen of a laptop computer thereafter. The simple problems consisted of the integers 2 through 9, with the constraint that the same two integers (e.g., 2+2) were never used in the same problem; ½ of the problems summed to 10 or less and the smaller valued addend appeared in the first position for ½ of the problems. The complex items were 16+7, 3+18, 9+15, 17+4, 6+19, and 14+8.
The child was asked to solve each problem (without pencil and paper) as quickly as possible without making too many mistakes. It was emphasized that the child could use whatever strategy was easiest to get the answer, and to speak the answer; from second grade forward, the answer was spoken into a voice activated microphone that recorded reaction time (RT) from problem onset. After solving each problem the child was asked to describe how they got the answer. Based on the child’s description and the experimenter’s observations, the trial was classified based on problem solving strategy. The four most common strategies were counting fingers, verbal counting, retrieval (quickly stating an answer and describing they “just remembered”), and decomposition (describing that they solved the problem by decomposing one addend and successively adding these smaller sets to the other addend; e.g., 17+8 = 17+3+5). Counting trials were further classified as min (stating the larger valued addend and counting the smaller one), sum (counting both addends starting from one), or max (stating the value of the smaller addend and then counting the larger one). The combination of experimenter observation and child reports immediately after each problem is solved has proven to be a useful measure of children’s strategy choices
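The three counting procedures described above differ in how many counts they require, which is why min counting is treated as the more mature procedure. The following is a minimal illustrative sketch, not part of the study's materials; the function names `count_on` and `solve_by_counting` are hypothetical, and the example uses the complex item 3+18.

```python
def count_on(start, steps):
    """Number words spoken when counting `steps` times upward from `start`."""
    return [start + i for i in range(1, steps + 1)]


def solve_by_counting(a, b, strategy):
    """Return (answer, number_of_counts) for one addition problem.

    min: state the larger addend, then count on the smaller one.
    max: state the smaller addend, then count on the larger one.
    sum: count both addends, starting from one.
    """
    smaller, larger = min(a, b), max(a, b)
    if strategy == "min":
        spoken = count_on(larger, smaller)
    elif strategy == "max":
        spoken = count_on(smaller, larger)
    elif strategy == "sum":
        spoken = count_on(0, a + b)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return spoken[-1], len(spoken)


for strategy in ("min", "max", "sum"):
    answer, counts = solve_by_counting(3, 18, strategy)
    print(f"{strategy}: answer={answer}, counts={counts}")
```

For 3+18, all three procedures arrive at 21, but min counting needs 3 counts, max counting 18, and sum counting 21.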
The variables used here were the frequency with which min counting was correctly used to solve the simple problems and the more complex problems. The frequency of correctly retrieving the answers was also used for simple problems, and the frequency with which decomposition was correctly used for complex problems (
Variable  Factor  Operationalization 
Simple addition counting  Counting Competence  Frequency and accuracy of use of mature procedure 
Complex addition counting  Counting Competence  Frequency and accuracy of use of mature procedure 
Simple addition retrieval  Number System Knowledge  Correct retrieval of answers to number combinations 
Complex addition decomposition  Number System Knowledge  Frequency of correct use of decomposition 
Number line accuracy  Number System Knowledge  Accuracy in placement of numerals on a number line 
Number sets fluency  Number System Knowledge  Signal detection measure based on hits and misses 
Two types of stimuli were used: objects (e.g., stars) in a 1/2′′ square and an Arabic numeral (18 pt font) in a 1/2′′ square. Stimuli are joined in domino-like rectangles with different combinations of objects and numerals (
The task is to circle rectangles that contain collections of objects, Arabic numerals, or a combination that match a target number. For the actual task, children had 120 sec to identify which of 72 items matched a target of 5, and 180 sec for a target of 9.
The tester began by explaining two items that matched a target sum of 4 and then used a target sum of 3 for practice. The measure was then administered. The child was told to move across each line of the page from left to right without skipping any; to “circle any groups that can be put together to make the top number, 5 (9)”; and to “work as fast as you can without making many mistakes.” The child had 60 sec per page for the target 5 and 90 sec per page for the target 9. Time limits were chosen to avoid ceiling effects and to assess fluent recognition and manipulation of quantities associated with collections of objects and Arabic numerals. Performance was consistent across target number and item content (e.g., whether the rectangle included Arabic numerals or objects), and thus these were combined to create an overall frequency of hits (α = .88), correct rejections (α = .85), misses (α = .70), and false alarms (α = .90)
After first grade, some of the children completed all items in less than the maximum times (120 and 180 sec for targets of 5 and 9, respectively) and thus their scores were adjusted upwards; specifically, (hits – false alarms)×(maximum RT/actual RT). The adjustment enabled us to maintain the sensitivity of the test, despite faster processing times across grades.
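The time adjustment described above can be written out directly. The sketch below is illustrative only; the function name and the example values (30 hits, 2 false alarms, 100 sec used of 120 sec allowed) are hypothetical and are not data from the study.

```python
def adjusted_number_sets_score(hits, false_alarms, max_time_sec, actual_time_sec):
    """Apply (hits - false alarms) x (maximum RT / actual RT).

    A child who uses the full time allowance keeps the unadjusted score;
    a child who finishes early has the score scaled upward.
    """
    if not 0 < actual_time_sec <= max_time_sec:
        raise ValueError("actual time must be positive and no more than the maximum")
    return (hits - false_alarms) * (max_time_sec / actual_time_sec)


# Hypothetical child who finished the target-5 pages in 100 of the 120 sec allowed.
early_finisher = adjusted_number_sets_score(30, 2, max_time_sec=120, actual_time_sec=100)

# Hypothetical child with the same raw performance who used all 120 sec.
full_time = adjusted_number_sets_score(30, 2, max_time_sec=120, actual_time_sec=120)

print(round(early_finisher, 1), round(full_time, 1))
```

With these values the raw difference score of 28 becomes 33.6 for the child who finished early and stays at 28 for the child who used the full time.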
A series of twenty-four 25 cm number lines, each a blank line with two endpoints (0 and 100), was presented one at a time to the child, with a target number (e.g., 45) printed in a large font above the line. The child’s task was to mark where the target number should lie on the line (using pencil and paper in first grade and a laptop and mouse thereafter)
The mechanisms that support children’s learning of the mathematical number line are debated
The six mathematical cognition variables listed in
The Working Memory Test Battery for Children (WMTB-C)
The Strengths and Weaknesses of ADHD-Symptoms and Normal-Behavior (SWAN) was used as the measure of in-class attentive behavior
The six control variables were sex, race, first grade school site, beginning of first grade speed of Arabic numeral encoding and articulation, and raw kindergarten Numerical Operations and Word Reading scores. The race variable provided separate contrasts of White children with Black children, White children with Asian children, and White children with all remaining children. The estimates for the race contrasts need to be interpreted with caution, given the small sample size for some of the contrasts (see Control Variables in SI). The inclusion of race as a control variable is nonetheless important.
The measures were selected based on labor economic studies of employability, wages, and related outcomes in adulthood
Competence in solving multistep word problems was assessed using the first form (15 items) of the Arithmetic Aptitude Test from the Educational Testing Service (ETS) kit of factor-referenced tests
The first form of three tests from the ETS kit
Based on Hecht
The mean score for the division test was 1.4 (SD = 2.5) problems solved correctly and the median was 0 (75% of the participants did not solve a single problem correctly), a pattern of very low performance that was also found by Siegler et al.
The test is composed of 16 pairs of fractions and was developed based on children’s common problem solving errors or the strategies they use when solving fractions problems
Answers were scored as hits (coded 1) or misses (coded −1). Hits were significantly correlated across the four problem types (rs = .39 to .74, ps<.0001) and thus summed to create a total hits variable (α = .81). Misses were also significantly correlated (rs = .36 to .74, ps<.0001) and summed (α = .79). The fractions comparison score was hits minus misses. The validity of the measure was demonstrated by showing that scores predict one year gains in mathematics achievement, controlling for previous mathematics achievement, intelligence, and working memory
The four word problem, computational arithmetic, and fractions measures were submitted to a principal components factor analysis, which yielded a single factor (Eigenvalue = 2.6) that explained 66% of the covariation among the variables (factor loadings >.76). A Functional Numeracy composite was created by taking the standardized mean of the four variables (
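One way to read "the standardized mean of the four variables" is to z-score each measure, average the four z-scores for each participant, and then re-standardize that average. The sketch below follows that reading, which is an assumption rather than a description of the authors' exact procedure; the scores and the labels `measure_1` to `measure_4` are hypothetical, and the sample standard deviation (n − 1) is likewise an assumed choice.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize a list of scores to M = 0, SD = 1 (sample SD, an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def functional_numeracy_composite(measures):
    """measures: dict mapping a measure name to a list of scores,
    with participants in the same order in every list."""
    z_by_measure = [zscores(scores) for scores in measures.values()]
    # Average the z-scores within each participant, then re-standardize.
    per_participant_mean = [mean(zs) for zs in zip(*z_by_measure)]
    return zscores(per_participant_mean)


# Hypothetical scores for five participants on four measures.
hypothetical_scores = {
    "measure_1": [3, 7, 5, 10, 8],
    "measure_2": [12, 20, 15, 30, 22],
    "measure_3": [0, 2, 0, 5, 1],
    "measure_4": [-4, 6, 0, 12, 8],
}

composite = functional_numeracy_composite(hypothetical_scores)
print([round(score, 2) for score in composite])
```

Standardizing each measure before averaging keeps a measure with a wide raw range (here `measure_2`) from dominating the composite.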
The CPM and WASI were administered in the spring of kindergarten and first grade, respectively, and the achievement tests were administered every spring beginning in kindergarten. The mathematical cognition tasks were administered once a year, beginning in the fall of first grade. The WMTB-C was administered in first (M = 84 months, SD = 6) and fifth (M = 128 months, SD = 5) grades (Table S2 in
Adolescents’ scores on the functional numeracy measure were significantly correlated with their beginning of first grade counting competence (r = .31, p<.0001) and number system knowledge (r = .69, p<.0001) scores (Table S3 in
Prediction of Functional Numeracy  
Effect  Estimates  t  p 
Intercept  0.248±0.115  2.15  0.0332 


Girls contrasted with boys  −0.096±0.118  −0.81  0.4183 
Mixed race contrasted with White  −0.096±0.119  −0.81  0.4212 
Black contrasted with White  0.020±0.216  0.09  0.9280 
Asian contrasted with White  0.508±0.190  2.67  0.0084 
Kindergarten mathematics achievement  0.108±0.056  1.94  0.0540 
Kindergarten reading achievement  0.002±0.063  0.03  0.9755 
Number processing speed  −0.003±0.051  −0.06  0.9526 


Intelligence  0.105±0.065  1.62  0.1081 
First grade phonological loop  −0.047±0.071  −0.66  0.5086 
First grade visuospatial sketch pad  −0.077±0.055  −1.40  0.1645 
First grade central executive  0.023±0.064  0.37  0.7097 
Fifth grade phonological loop  0.000±0.060  0.01  0.9936 
Fifth grade visuospatial sketch pad  0.043±0.054  0.80  0.4272 
Fifth grade central executive  0.130±0.060  2.18  0.0307 
In-class attentive behavior  0.167±0.057  2.93  0.0039 


Counting Competence  0.044±0.051  0.85  0.3984 
Number System Knowledge  0.287±0.070  4.00  0.0001 
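The reported t values in the table are consistent with each estimate divided by the value following the ± sign, which suggests that the ± values are standard errors. The table itself does not say so, so this is an assumption. A minimal check on four rows copied from the table (the function name `t_ratio` is hypothetical):

```python
# (estimate, value after the ± sign, reported t), copied from the table above.
# Treating the ± value as a standard error is an assumption.
rows = {
    "Intercept": (0.248, 0.115, 2.15),
    "In-class attentive behavior": (0.167, 0.057, 2.93),
    "Counting Competence": (0.044, 0.051, 0.85),
    "Number System Knowledge": (0.287, 0.070, 4.00),
}


def t_ratio(estimate, standard_error):
    """t statistic for a single coefficient: estimate / standard error."""
    return estimate / standard_error


for name, (estimate, se, reported_t) in rows.items():
    print(f"{name}: estimate/SE = {t_ratio(estimate, se):.2f}, reported t = {reported_t:.2f}")
```

Small discrepancies are expected because the estimates and ± values are themselves rounded to three decimals.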
The same analyses were conducted for each of the four tests that composed the functional numeracy composite (Table S4 in
Seventh grade mathematics achievement and functional numeracy scores were significantly correlated, r = .79, p<.0001, but less so once fifth grade working memory (the assessment closest to seventh grade), intelligence, and in-class attentive behavior were controlled, pr = .50, p<.0001. As noted earlier, the functional numeracy measures have been shown to be predictive of important life outcomes in adults
Number system knowledge remained predictive of functional numeracy, after controlling for seventh grade mathematics achievement (β = 0.195, p = .0014; Table S5 in
Logistic regression revealed a 1
The analyses thus far indicate that children who begin first grade with low number system knowledge are at heightened risk for low functional numeracy scores in seventh grade. As a follow up, we sought to determine whether first-to-fifth grade growth in number system knowledge is also related to functional numeracy in seventh grade.
The measures that defined the Number System Knowledge factor were administered in first through fifth grade, inclusive. A principal components factor analysis with promax rotation confirmed that the four variables defined the same Number System Knowledge factor in second through fifth grade that was identified for first grade (Eigenvalues >1.76, factor loadings >.54).
To make each measure comparable to the others and across grades, the associated scores were defined as the percentage of maximum possible performance; specifically, for simple addition (number of problems correctly retrieved/14), for complex addition (number of problems correctly solved with decomposition/6), for Number Sets (RT adjusted d-prime score/maximum score achieved in fifth grade across all children), and for number line [1 − (mean error/50)]. Fifty was chosen for the latter because random placements would, on average, result in mean errors of 50 on the 0-to-100 number line. A child making random placements would thus have a score of 1 − 1, or 0 percent. The most accurate child in our study had a mean error of 1.75 in fifth grade, resulting in a score of 0.965.
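The four proportion-of-maximum scores follow directly from the definitions above. The sketch below is illustrative; the function names are hypothetical, and the only values taken from the text are the divisors (14, 6, and 50) and the mean error of 1.75.

```python
def simple_addition_score(n_correctly_retrieved):
    """Proportion of the 14 simple problems answered by correct retrieval."""
    return n_correctly_retrieved / 14


def complex_addition_score(n_correct_decomposition):
    """Proportion of the 6 complex problems solved correctly with decomposition."""
    return n_correct_decomposition / 6


def number_sets_score(adjusted_dprime, max_fifth_grade_dprime):
    """RT-adjusted d-prime relative to the highest fifth grade score in the sample."""
    return adjusted_dprime / max_fifth_grade_dprime


def number_line_score(mean_error):
    """1 - (mean error / 50); a mean error of 50 is the level the text
    attributes to random placement, which maps to a score of 0."""
    return 1 - mean_error / 50


# Values stated in the text: random placement, and the most accurate child.
print(round(number_line_score(50), 3))
print(round(number_line_score(1.75), 3))
```

The two printed values are 0.0 and 0.965, matching the random-placement floor and the most accurate child's score given in the text.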
Children who scored in the bottom quartile on the functional numeracy measure had a lower number system knowledge start point and slower first to fifth grade growth than children in the top and middle quartiles (ps<.0803;
The score is the percentage of the maximum possible score across the four tasks that composed the Number System Knowledge factor.
The results provide three key insights into children’s mathematical development. The first is that some aspects of their school entry quantitative knowledge, as measured by the mathematical cognition tasks, contribute to long-term functional numeracy, controlling for other factors that affect learning, whereas other aspects of their knowledge do not. Of particular importance were the competencies common to the measures that defined the Number System Knowledge factor. All of these measures require explicit processing of Arabic numerals and operating on them in ways consistent with the logical, systematic relations among numerals. At school entry, this emerging knowledge of the number system includes an understanding of the relative magnitude of numerals, their ordering, and the ability to combine and decompose them into smaller and larger numerals and to use this knowledge to solve arithmetic problems. Whether or not this explicit number system knowledge is dependent on a potentially inherent sense of magnitude for its initial development
At the same time, children’s skill at using counting procedures to solve addition problems at the beginning of first grade was not predictive of their later functional numeracy scores, holding other factors constant. One potential reason is that children who begin school behind their peers in the use of these counting procedures tend to catch up with other children within one or two years
The second key finding is the previously noted relation between mathematics achievement in kindergarten and mathematics achievement throughout schooling
The third key finding is that growth in number system knowledge is less important for predicting functional numeracy than is school entry number system knowledge. Children scoring in the bottom quartile on the numeracy measure in seventh grade started school behind their peers in number system knowledge and showed less rapid growth from first to second grade, but typical growth thereafter. Future studies are needed to determine how this early number system knowledge influences the learning of more complex aspects of the number system (e.g., the base-10 organization), and how this influences emerging functional numeracy. For now, the implication is that interventions to improve children’s early understanding of the relations among numerals need to be implemented before the start of schooling or in first grade, and fortunately such interventions are being developed
This file contains: Method and Materials–provides detailed description of the working memory and functional numeracy measures; Control Variables–provides detailed description of the control variables; Table S1–Standardized Factor Loadings for the Mathematical Cognition Measures in First Grade; Table S2–Overall Design of the Missouri Study; Table S3–Means and Correlations Among Variables. All variables were standardized (M = 0, SD = 1) and analyzed in PROC GLM
(DOCX)
We thank Linda Coutts, Chip Sharp, Jennifer Byrd-Craven, Chatty Numtee, Amanda Shocklee, Sara Ensenberger, Kendra Andersen Cerveny, Rebecca Hale, Patrick Maloney, Ashley Stickney, Nick Geary, Mary Lemp, Cy Nadler, Mike Coutts, Katherine Waller, Rehab Mojid, Jasmine Tilghman, Caitlin Cole, Leah Thomas, Erin Twellman, Patricia Hoard, Jonathan Thacker, Alex Wilkerson, Stacey Jones, James Dent, Erin Willoughby, Kelly Regan, Kristy Kuntz, Rachel Christensen, Jenni Hoffman, and Stephen Cobb for help on various aspects of the project.